Published: 28 Oct 2025 | Reading Time: 6 min
Database Management Systems (DBMS) form the backbone of modern computing, powering everything from social media platforms to online payments. For computer science and IT students, DBMS is not just another subject; it's the foundation of logical thinking, data structuring, and efficient storage management.
DBMS viva questions are not just about memorizing definitions; they test how well you understand how data actually works behind every app, website, and system you use daily. This comprehensive guide provides a curated collection of level-wise commonly asked DBMS Viva Questions and Answers, designed to help you revise smarter, understand deeper, and answer confidently in your next viva or exam.
Focus on understanding, not memorizing. Examiners usually test how well you can connect real-world problems with database concepts.
| Topic | Why It's Important |
|---|---|
| ER Model and Relationships | Understand entities, attributes, and types of relationships (1:1, 1:N, M:N). Helps you explain how data is structured in databases. |
| Normalization & Normal Forms (1NF–BCNF) | Core concept to avoid redundancy and ensure data consistency; often a favourite viva question. |
| SQL Commands | Be confident in writing and explaining SELECT, JOIN, GROUP BY, UPDATE, and DELETE queries. |
| Keys (Primary, Foreign, Candidate, Composite) | Keys are the backbone of relational design; every viva has at least one key-related question. |
| ACID Properties & Transactions | Essential for understanding how data remains consistent during updates or failures. |
| Joins & Subqueries | Be ready to explain types of joins and how subqueries help in complex data retrieval. |
| Views, Triggers & Stored Procedures | Show understanding of advanced SQL and automation in databases. |
| DBMS Architecture | Know about 3-tier architecture, external, conceptual, and internal levels. |
| Indexes & Query Optimization | Often asked in advanced viva rounds to test your efficiency and understanding. |
| Difference Between DBMS and RDBMS | A classic basic question, but answering with clarity sets a good first impression. |
Preparation Tip: Don't just read definitions; practice small query examples. Being able to write and explain a query on the spot earns extra points in viva sessions.
These questions focus on core fundamentals like data models, keys, and normalization. Understanding these helps you confidently explain how a Database Management System stores, organizes, and retrieves data.
A Database Management System (DBMS) is software designed to manage databases, providing a systematic and efficient way of storing, retrieving, and manipulating data. A DBMS provides data integrity, security, and consistency while allowing multiple users to interact with the database simultaneously. It also centralizes data management in a single system.
There are four primary types of DBMS: hierarchical, network, relational, and object-oriented.
SQL stands for Structured Query Language, a standard language used for managing and querying data in a relational database. It lets users interact with the database through commands such as SELECT, INSERT, DELETE, and UPDATE.
SQL is used for various purposes, including defining a database's structure (DDL), managing data (DML), and controlling user access (DCL).
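As a quick, hedged illustration of these three categories (the table, column, and user names here are hypothetical, and GRANT syntax varies slightly between database systems):

```sql
-- DDL: define the structure of the database
CREATE TABLE students (
    student_id INT PRIMARY KEY,
    name       VARCHAR(100)
);

-- DML: manage the data inside that structure
INSERT INTO students (student_id, name) VALUES (1, 'Asha');
UPDATE students SET name = 'Asha R' WHERE student_id = 1;
SELECT * FROM students;

-- DCL: control who may access the data
GRANT SELECT ON students TO exam_user;
```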
In a DBMS, relationships define how tables are connected to one another. There are three primary types of relationships: one-to-one (1:1), one-to-many (1:N), and many-to-many (M:N).
A database is a structured collection of data stored and accessed electronically. It provides a systematic way to manage, store, and retrieve data for different applications.
Normalization is the process of dividing large tables into smaller ones to reduce redundancy and dependency in a database. It ensures that each piece of information is stored in the right place, eliminating update, insertion, and deletion anomalies. The goal is for each table to contain only related data, which improves data integrity and efficiency.
Normalization is carried out in stages called "normal forms," each removing a specific kind of redundancy or dependency: First Normal Form (1NF), Second Normal Form (2NF), Third Normal Form (3NF), and Boyce-Codd Normal Form (BCNF).
A primary key is a column (or combination of columns) that uniquely identifies every row in a table. It ensures that no two rows can have the same value in the primary key column(s). A primary key cannot contain NULL values, is crucial for maintaining data integrity, and is used to establish relationships with other tables through foreign keys.
A foreign key is a column in one table that refers to the primary key of another table (or of the same table). It is used to create and maintain a relationship between the data in two tables: the DBMS checks that every foreign key value matches an existing primary key value, which keeps the link between the tables valid and consistent.
An index is a database object that speeds up data retrieval, i.e., the performance of queries on a table. It works much like the index of a book, so the database does not have to scan the entire table to find a row. An index on one or more columns improves read performance but may slow down writes (INSERT, UPDATE, DELETE), depending on how many indexes exist and how frequently the data changes.
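A minimal sketch of how the two keys relate (hypothetical customers and orders tables):

```sql
CREATE TABLE customers (
    customer_id INT PRIMARY KEY,          -- primary key: uniquely identifies each customer
    name        VARCHAR(100) NOT NULL
);

CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT,                      -- foreign key: must match an existing customer
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
);
```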
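For example, assuming a hypothetical customers table with an email column that is searched often:

```sql
-- Speeds up lookups on email; each write to customers now also updates this index
CREATE INDEX idx_customers_email ON customers(email);
```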
A JOIN combines rows from two or more tables based on a common column they share, allowing a user to access data that spans multiple tables in a relational database. The most commonly used types are INNER, LEFT, RIGHT, and FULL OUTER JOIN; the JOIN type determines how unmatched rows are treated. An example follows the comparison table below.
| Aspect | INNER JOIN | OUTER JOIN |
|---|---|---|
| Definition | Returns only the matching values in both tables. | All rows from one table are returned, together with any matched rows from the other; rows that don't match will have NULLs. |
| Use Case | Used when you need data that exists in both tables, e.g., matching customer orders. | Used when you need all data from one or both tables, even if no matching data exists. |
| Join Variations | It has no further variations. | It has three types: LEFT OUTER JOIN, RIGHT OUTER JOIN, and FULL OUTER JOIN. |
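A short illustration of the difference, reusing the hypothetical customers and orders tables from earlier:

```sql
-- INNER JOIN: only customers that have at least one matching order
SELECT c.name, o.order_id
FROM customers c
INNER JOIN orders o ON o.customer_id = c.customer_id;

-- LEFT OUTER JOIN: every customer; order columns are NULL where no match exists
SELECT c.name, o.order_id
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id;
```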
DBMS offers numerous advantages, including reduced data redundancy, improved data integrity and consistency, controlled concurrent access, data security, and backup and recovery facilities.
Data redundancy is the unnecessary repetition of the same data within a database. Redundant data wastes storage, leads to inconsistency, and makes updates error-prone, producing update, insert, and delete anomalies. A DBMS reduces redundancy mainly through normalization, which segments data into multiple tables that reference each other via keys.
Data independence is the capacity to change the database schema at one level without affecting the application programs or the schema at the level above. It is generally classified in two ways: logical data independence (changing the conceptual schema without altering applications) and physical data independence (changing the storage structure without altering the conceptual schema).
The SQL GROUP BY clause is utilized to group records with the same values in specified fields in summarized rows. It is typically used with aggregate functions including COUNT, SUM, AVG, MIN and MAX to return a single value for each group of records. You can apply the GROUP BY clause to retrieve total sales for each region or to count the number of employees by department. This clause aids in organizing data and compiling reports.
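For instance, assuming a hypothetical sales table with region and amount columns:

```sql
-- Total sales and order count per region, keeping only larger regions
SELECT region,
       SUM(amount) AS total_sales,
       COUNT(*)    AS num_orders
FROM sales
GROUP BY region
HAVING SUM(amount) > 100000;
```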
In a DBMS, a schema is the organizational framework for a database that prescribes the tables, relationships between tables, views, and constraints. It can be thought of as a blueprint or diagram of entities (tables) in the database and how they are related to one another.
A relational database is a database that uses the relational model, organizing data into tables made up of rows and columns. Keys are used to connect related tables.
Database normalization is the organization of a database that minimizes redundancy and dependency by dividing large tables into smaller, manageable tables. It is desirable to ensure that the data in the table pertains to one entity or concept. Normalization rules that are applied by a DBMS help avoid anomalies in the insertion, update, and deletion of normalized relational tables, in turn improving data integrity and efficiency.
It usually involves transforming data into a set of well-structured tables by applying several normal forms (1NF, 2NF, 3NF, etc.) to ensure the database schema is optimized.
A composite key is made up of two or more columns that together uniquely identify a record. Individually, each column may not uniquely identify a row, but their combination does. Composite keys are useful when no single column can serve as a primary key.
For example, in a class enrollment table, the combination of student_id and course_id can serve as a composite key: a student can register for several courses, and each course can have multiple students, but each student-course pair appears only once.
A surrogate key is an artificial identifier assigned to each record in a table that is not derived from any business-related data. Surrogate keys are typically system-generated and are used when natural keys (such as email addresses or product codes) are too long, too complex, or prone to change.
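A minimal sketch of that enrollment table (column names are illustrative):

```sql
CREATE TABLE enrollment (
    student_id  INT,
    course_id   INT,
    enrolled_on DATE,
    PRIMARY KEY (student_id, course_id)   -- composite key: the pair must be unique
);
```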
A cascade delete is a form of referential action that deletes records in related tables when a record is deleted from the primary table. A cascade delete action ensures that there are no orphan records or records in the child table that refer to a parent record that no longer exists in the database.
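A hedged example of the idea, redefining the hypothetical orders table with a cascading foreign key (standard SQL, though support details vary by DBMS):

```sql
CREATE TABLE orders (
    order_id    INT PRIMARY KEY,
    customer_id INT,
    FOREIGN KEY (customer_id) REFERENCES customers(customer_id)
        ON DELETE CASCADE                 -- child rows are removed with their parent
);

-- Deleting a customer now automatically deletes that customer's orders
DELETE FROM customers WHERE customer_id = 10;
```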
At this level, the database management system Viva questions assess conceptual clarity and practical application. Students can expect queries about transactions, indexing, integrity constraints, and relational algebra.
The ACID properties maintain the consistency and reliability of transactions in a database: Atomicity (a transaction completes fully or not at all), Consistency (a transaction moves the database from one valid state to another), Isolation (concurrent transactions do not interfere with each other), and Durability (once committed, changes survive failures).
A transaction in a DBMS is a sequence of one or more operations treated as a single logical unit of work. The operations typically insert, update, or delete records. A transaction keeps the database in a consistent state by following the ACID properties, and is useful whenever a group of related changes must either all succeed or all be undone together.
A deadlock is a situation that occurs when two or more transactions are blocked forever, each waiting for the other to release resources. In a deadlock, no transaction can proceed because each one holds a resource that the other needs. The DBMS must detect and resolve deadlocks by aborting one of the transactions, allowing others to proceed. This is typically handled by deadlock prevention and detection algorithms.
A view in a database management system (DBMS) is a virtual table defined by a query, which may join multiple tables or other views. Unlike a physical table, a view does not store data itself; it presents data drawn from the underlying tables. Views can hide the complexity of multi-table queries, provide a consistent interface, and restrict access to sensitive columns or rows.
A trigger is a set of SQL statements that executes automatically whenever a specified event occurs on a particular table or view. The event can be an INSERT, UPDATE, or DELETE operation, and the trigger can fire before or after it. Triggers are often used to enforce business rules, maintain data integrity, or populate generated values such as timestamps.
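A small sketch (hypothetical table and column names):

```sql
-- The view stores no data; it is just a saved query over the base tables
CREATE VIEW active_customer_orders AS
SELECT c.customer_id, c.name, o.order_id, o.order_date
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
WHERE o.status = 'ACTIVE';

-- Queried exactly like a table
SELECT * FROM active_customer_orders;
```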
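A minimal sketch, using MySQL trigger syntax and a hypothetical orders table with a last_modified column:

```sql
-- Automatically stamps the modification time before every update
CREATE TRIGGER trg_orders_touch
BEFORE UPDATE ON orders
FOR EACH ROW
SET NEW.last_modified = NOW();
```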
| Aspect | Clustered Index | Non-Clustered Index |
|---|---|---|
| Storage | The data rows are stored according to the order of the clustered index key. There is only one clustered index per table. | The data rows are stored independently of the index. Several non-clustered indexes can be created on a table. |
| Structure | The index itself is the actual table or a sorted version of it. | The index is a separate structure with pointers to the data rows. |
| Performance | Faster for range queries (e.g., between values) as data is physically stored in sequence. | Faster for specific queries, especially when only a subset of columns is needed, but not as efficient for range queries. |
There are two main types of indexes in DBMS:
Primary (Single-Level) Index:
This type of index has only one level that directly contains pointers to the data records. It has a simple and direct structure, making it faster for smaller datasets since fewer comparisons are needed to locate data.
Multi-Level Index:
This index uses multiple levels of indexing to organize data hierarchically. It is more complex but highly efficient for large datasets because it reduces the number of comparisons required to find a specific record.
In SQL, the following join types are used to retrieve data from multiple tables: INNER JOIN, LEFT (OUTER) JOIN, RIGHT (OUTER) JOIN, FULL OUTER JOIN, CROSS JOIN, and SELF JOIN.
An Entity-Relationship (ER) model is a high-level conceptual data model used to visually represent the structure of a database. It shows entities (real-world objects), their attributes (properties), and the relationships between those entities. The ER model is essential in the initial stages of database design as it helps in organizing and structuring data requirements clearly before actual implementation.
A transaction is a sequence of operations treated as a single logical unit of work. It is important because it ensures that the database remains consistent and reliable, even in the event of errors or system failures.
A COMMIT is used to save all the work as a transaction in the database permanently, while ROLLBACK undoes all changes made in the current transaction, restoring the database to its previous state.
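A short, hedged illustration of a money-transfer transaction (hypothetical accounts table; START TRANSACTION is MySQL syntax, BEGIN in PostgreSQL):

```sql
START TRANSACTION;

UPDATE accounts SET balance = balance - 500 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 500 WHERE account_id = 2;

COMMIT;      -- makes both updates permanent
-- ROLLBACK; -- would instead undo both updates and restore the previous state
```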
A transaction log is a record of all transactions and database modifications. It is used for recovery purposes, allowing the database to restore committed transactions and undo uncommitted ones in case of a failure.
The two-phase commit protocol ensures that all participants in a distributed transaction either commit or roll back together. It is used in distributed databases to maintain consistency across multiple systems.
A transaction manager is in charge of managing transactions, ensuring ACID properties, managing commits and rollbacks, and handling concurrency and recovery in the database.
Transaction isolation levels specify when and how modifications to one transaction become visible to other transactions. They are important for balancing data consistency and system performance.
Timestamp ordering is a concurrency control method that assigns a unique timestamp to each transaction and ensures transactions are executed in timestamp order, helping to prevent conflicts and maintain consistency.
A distributed transaction is a transaction that involves multiple databases or networked systems, ensuring all involved databases are updated consistently.
Denormalization is the process of deliberately reintroducing redundancy into a database, typically by merging tables that were separated during normalization. It is a performance optimization used to speed up read-heavy queries and is common in data warehousing and OLAP systems.
A hash join is an efficient algorithm for joining large tables, especially when the join columns are not sorted. It first builds a hash table from one of the inputs (usually the smaller table), using the join key as the hash key. It then scans the other table and, for each row, probes the hash table with the same key to find matching entries.
Hash joins are useful when dealing with large, unsorted datasets and help reduce the time complexity of join operations. The hash join algorithm is commonly used within a DBMS to improve query performance.
| Aspect | File System | DBMS |
|---|---|---|
| Data Integrity | No enforced consistency or integrity checks. | Ensures data integrity with ACID properties. |
| Data Redundancy | High redundancy and duplication. | Minimizes redundancy through normalization. |
| Data Access | Sequential access, limited querying. | Structured querying with SQL. |
| Concurrency | Limited or no support for concurrent access. | Supports multiple concurrent users. |
| Security | Basic file-level permissions. | Advanced access control and user management. |
A database query lets users retrieve specific data from a database. Using SQL (Structured Query Language), users can perform operations such as SELECT, INSERT, UPDATE, and DELETE to view and manipulate data without dealing with the underlying storage details.
The aggregate functions in SQL are COUNT, SUM, AVG, MIN, and MAX.
These functions are frequently paired with the GROUP BY clause to summarize data.
A constraint is a rule enforced on the columns of a table to keep the data accurate and consistent. Common system-defined constraints are NOT NULL, UNIQUE, PRIMARY KEY, FOREIGN KEY, CHECK, and DEFAULT.
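A compact, hypothetical example showing several constraints in one table definition:

```sql
CREATE TABLE employees (
    emp_id INT PRIMARY KEY,                      -- unique, non-NULL identifier
    email  VARCHAR(100) UNIQUE NOT NULL,         -- must be present and distinct
    salary DECIMAL(10,2) CHECK (salary > 0),     -- rejects non-positive salaries
    dept   VARCHAR(50) DEFAULT 'General'         -- filled in when no value is given
);
```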
A non-relational database (NoSQL) is a database that is designed to store unstructured or semi-structured data, providing significant flexibility for data modelling. Unlike relational databases, non-relational databases do not utilize tables and rows.
Examples include MongoDB and Cassandra.
| Feature | UNION | UNION ALL |
|---|---|---|
| Description | Combines the results of two queries, eliminating duplicate rows. | Combines results of two queries, including duplicates. |
| Performance | Slower due to duplicate elimination. | Faster since no duplicate removal. |
| Use Case | When you need distinct results. | When duplicates are allowed. |
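For example, with hypothetical customers and suppliers tables that both have a city column:

```sql
-- UNION: each city appears once, even if it occurs in both tables
SELECT city FROM customers
UNION
SELECT city FROM suppliers;

-- UNION ALL: duplicates are kept, which is faster but may repeat cities
SELECT city FROM customers
UNION ALL
SELECT city FROM suppliers;
```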
The DELETE statement in SQL is used to delete rows from a table. The DELETE statement can also include a WHERE clause to specify which records should be deleted. If the WHERE clause is omitted from the DELETE statement, then it will delete all records from the table.
A B-tree index is a balanced tree data structure that speeds up read operations. It organizes keys in a tree where each node contains multiple keys and pointers, allowing efficient searching, insertion, and deletion.
A data warehouse is a consolidated storage of data with the express purpose of supporting the decision-making process. Data is often extracted from different transactional databases, transformed, and loaded (ETL process) into the warehouse. It enables complex queries and analytics for reporting and business intelligence.
| Aspect | OLAP (Online Analytical Processing) | OLTP (Online Transaction Processing) |
|---|---|---|
| Purpose | Designed for complex queries and data analysis. It involves large amounts of historical data and is used for decision-making processes (e.g., data warehouses). | Designed for managing transactional data in real-time. It involves frequent insert, update, and delete operations, and is optimized for speed and accuracy (e.g., banking systems). |
A query execution plan is a strategy that the DBMS uses to execute a SQL query. It outlines the steps the system will take to retrieve or manipulate the required data. The plan may include operations like table scans, index scans, joins, sorts, etc. Query execution plans can be analyzed to optimize query performance.
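For example, both MySQL and PostgreSQL expose the plan through EXPLAIN (output format differs between systems; the tables here are hypothetical):

```sql
EXPLAIN
SELECT c.name, SUM(o.amount) AS total_spent
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.name;
```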
Indexing is an optimization mechanism for data retrieval: the DBMS builds a data structure (the index) so it does not have to scan the entire table. Indexes are typically created on columns that are queried frequently, and can take several forms, such as the clustered, non-clustered, B-tree, and bitmap indexes discussed elsewhere in this guide.
Replication is the process of copying data from one database (master) to one or more databases (slaves) to ensure data availability and fault tolerance. Replication can be synchronous (data is copied in real-time) or asynchronous (data is copied after a delay). It is important for high availability, load balancing, disaster recovery, and fault tolerance.
Transaction isolation levels define how visible a transaction's intermediate state is to other transactions. From least to most strict, the standard levels are Read Uncommitted, Read Committed, Repeatable Read, and Serializable; stricter levels give stronger consistency at the cost of concurrency.
Bitmap indexing is an indexing technique that represents each possible value of a column with a bitmap (a string of bits). Each bit in the bitmap corresponds to a row in the table, turning "on" (1) if the value exists for that row and "off" (0) if the value does not exist for that row. Bitmap indexing works best for columns with low cardinality (i.e., few distinct values, like gender or boolean fields) and is often used in data warehousing systems for fast querying and reporting.
A distributed database is a database in which data is stored across multiple physical locations, often on different networked computers. The main advantages include improved scalability, higher availability, fault tolerance, and the ability to store data closer to users for reduced latency.
Database partitioning divides a large database into smaller, more manageable pieces called partitions. The main types are horizontal partitioning (splitting rows across partitions), vertical partitioning (splitting columns), and schemes such as range, list, and hash partitioning that decide which partition a row belongs to.
Partitioning helps to improve performance, manageability, and scalability.
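A hedged sketch of range partitioning, using MySQL syntax and a hypothetical sales table:

```sql
CREATE TABLE sales (
    sale_id   INT,
    sale_date DATE,
    amount    DECIMAL(10,2)
)
PARTITION BY RANGE (YEAR(sale_date)) (
    PARTITION p2023 VALUES LESS THAN (2024),
    PARTITION p2024 VALUES LESS THAN (2025),
    PARTITION pmax  VALUES LESS THAN MAXVALUE   -- catch-all for later years
);
```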
NoSQL is a term used to refer to a category of databases that are specifically designed to store large amounts of unstructured or semi-structured data. Unlike relational databases, NoSQL databases do not require a fixed schema, support flexible data models (key-value, document, column-family, graph), and are optimized for horizontal scaling. They are particularly useful for big data and real-time web applications.
A data warehouse is a centralized location to store, integrate, and analyze large amounts of historical data from various sources over time. Unlike traditional databases optimized for online transactional processing (OLTP), a data warehouse is optimized for online analytical processing (OLAP) and is typically loaded in batches. Data is often stored in a denormalized schema to speed up analysis and reporting.
These advanced DBMS Viva questions and answers concentrate on high-level topics like concurrency control, query optimization, triggers, and distributed databases. They assess both your understanding of the theory and how you can apply that understanding to real-life systems.
The CAP Theorem states that a distributed database system cannot simultaneously guarantee all three of the following: Consistency (every read sees the most recent write), Availability (every request receives a response), and Partition Tolerance (the system keeps operating despite network partitions).
According to CAP, a distributed system can only guarantee two of the three properties, but not all three simultaneously. For example, systems like Cassandra prioritize Availability and Partition Tolerance.
Database replication is the action of copying data from one database server (the master) to one or more other servers (the replicas). Replication improves data availability, enables load balancing, and provides fault tolerance and disaster recovery by ensuring that data is accessible even if one server fails.
Big data platforms such as Hadoop and Spark are frameworks developed to process and analyze volumes of data that traditional databases cannot handle efficiently. They provide distributed storage and parallel processing, which makes them key for big data analytics, machine learning, and real-time data processing.
A cloud DBMS is a database management system that is provided and accessed as a service over the internet. Benefits include automatic scaling, reduced infrastructure management, high availability, disaster recovery, and the ability to access databases from anywhere with an internet connection.
Edge database systems are database systems that are deployed to the edge of a network, closer in proximity to the generation of data (e.g., to IoT devices). They are used to process and store data locally for faster response times, reduced latency, and lower bandwidth usage, especially in applications requiring real-time analytics.
Database replication keeps copies of the data in other locations, in real time or near real time, so that a standby database is ready to take over if the primary database becomes unusable after a disaster or service outage. This is particularly valuable for reducing data loss and downtime.
Vertical scaling (scaling up) means increasing the resources of a single server (CPU, RAM, storage), while horizontal scaling (scaling out) involves adding more servers or nodes to distribute the load. Horizontal scaling is more suitable for distributed and big data systems.
Using the outcome of a hash function applied to a partition key, hash partitioning divides data among partitions. It is most effective when data needs to be evenly distributed to avoid hotspots and ensure balanced load across all partitions.
When handling big data or distributed systems, indexing techniques must account for how the data is spread across multiple nodes, network latency, and frequent data updates. Indexes may themselves be partitioned or replicated, and in eventually consistent systems an index may lag behind the data, so a lookup may not reflect the most recent version.
Referential integrity is a characteristic described in the relationships between tables, typically by requiring that foreign key values in one table either match primary key values in another table or are NULL. It is enforced using foreign key constraints in relational databases.
SQL injection attacks happen when an attacker inserts malicious SQL code through a user input field, gaining unauthorized access to data or manipulating it. Prevention includes validating and sanitising user input and, most importantly, using parameterized queries (prepared statements).
Encryption converts data into an unreadable format so that unauthorized users cannot access the information. Sensitive data, such as passwords, bank account numbers, and personal details, should be encrypted both "at rest" (when stored) and "in transit" (when transmitted).
Authentication verifies a user's identity before allowing access to the database, preventing unauthorized access. Common methods include passwords, multi-factor authentication (MFA), biometrics, and digital certificates.
Integrity constraints prevent invalid data from being inserted or updated. Examples include NOT NULL, UNIQUE, PRIMARY KEY, and CHECK constraints, each of which helps keep the data accurate and consistent.
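A hedged illustration using MySQL's server-side prepared statements and a hypothetical users table (application code would normally bind parameters through its database driver instead):

```sql
-- Unsafe pattern (conceptual): building the query by concatenating raw user input
-- Safe pattern: the value is bound separately and never spliced into the SQL text
PREPARE find_user FROM 'SELECT id, username FROM users WHERE username = ?';
SET @name = 'alice';
EXECUTE find_user USING @name;
DEALLOCATE PREPARE find_user;
```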
Examiners are testing your ability to be clear in your explanations, not to regurgitate definitions from memory. Therefore, explain the concepts to yourself, and try to use your own plain words.
Practice small SQL problems and explain them with "hands-on + logic" answers that impress examiners.
Visualise how keys, tables, queries, and transactions are connected. This helps you recall flow easily.
When asked "What is normalization?", explain using a quick example, which shows true understanding.
If you've built a project, relate DBMS questions to your schema or queries, adding authenticity.
After every concept, ask yourself, "Why?" or "What if?" This builds confidence for follow-ups.
Don't use textbook jargon; instead, give clean, conversational explanations that sound natural.
Record yourself answering 10 random DBMS questions. It helps with clarity, fluency, and builds confidence right before the actual viva.
Cracking your DBMS Viva Questions is not just about remembering definitions; it is about understanding how databases really work. This blog is your full roadmap to do just that - helping you explain the concepts with confidence instead of merely reciting theory.
As you have gone through this DBMS Viva Questions blog carefully, you already have what most students miss: conceptual clarity and confidence. Don't just memorize; revise smartly, connect the logic behind each question, and practice explaining answers aloud.
With these insights and strategies, you're not just preparing for your viva; you're preparing to think like a database professional.
Common DBMS SQL Viva Questions include queries related to JOIN, GROUP BY, HAVING, and WHERE clauses.
Questions in the DBMS lab viva typically include writing SQL queries, designing tables, or explaining database design concepts.
Important DBMS questions often cover core topics like keys (primary, foreign, candidate), normalization, ER diagrams, SQL operations (JOIN, GROUP BY, aggregate functions), ACID properties, transactions, and indexing. These areas are frequently asked in viva exams to test both theoretical and practical understanding.
Start by revising key concepts through short notes, practice writing SQL queries by hand, and understand the "why" behind each concept instead of just memorizing definitions. Review real-world examples of databases and rehearse answering aloud to improve confidence and clarity.
Topics like Normalization forms (1NF–3NF), ER model relationships, SQL CRUD operations, Keys, and Transactions with ACID properties are repeated almost every year in most universities and technical interviews.
Yes. Many placement rounds for software and data-related roles include DBMS questions. Interviewers test your understanding of SQL queries, relationships, and normalization to check your problem-solving and logical skills.
Utilize visualization: draw ER diagrams, normalization trees, or transaction flowcharts. Make flashcards for key definitions, and solve SQL problems daily on LeetCode, HackerRank, or DB Fiddle. Revising with examples reinforces the concepts and helps you remember them faster and longer.
NxtWave is an educational technology platform providing comprehensive learning resources for computer science and IT students. This content is part of NxtWave's educational blog series designed to help students excel in their academic and professional pursuits.