DBMS

1. Difference between Relational Calculus and Relational Algebra?

Relational Algebra:
• Procedural language that defines how to retrieve data.
• Uses operations like SELECT, PROJECT, UNION, etc.
• Follows a step-by-step approach.
• Easier to implement and optimize in DBMS.
Relational Calculus:
• Non-procedural language that defines what data to retrieve.
• Based on logic and mathematical predicates.
• Includes Tuple and Domain relational calculus.
• Focuses on data requirements, not the process.
2. Explain the following SQL constructs with examples:
(1) ORDER BY
The ORDER BY clause in SQL is used to sort the result of a query based on one or more columns. By
default, it sorts in ascending order, but it can be changed to descending using the DESC keyword.
Example: SELECT name, marks FROM students ORDER BY marks DESC;
This query retrieves all student names and marks, sorting the result in descending order of marks.
(2) GROUP BY
The GROUP BY clause groups rows that have the same values in specified columns into summary
rows. It is commonly used with aggregate functions like COUNT(), SUM(), AVG(), etc.
Example: SELECT department, COUNT(*) FROM employees GROUP BY department;
This query returns the number of employees in each department by grouping the data based on the
department column.
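Both clauses can be tried end to end with Python's built-in sqlite3 module; the table follows the examples above, and the rows are invented for illustration.

```python
import sqlite3

# In-memory database with a small, made-up students table.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE students (name TEXT, marks INTEGER, department TEXT)")
cur.executemany("INSERT INTO students VALUES (?, ?, ?)",
                [("Asha", 82, "CS"), ("Ravi", 91, "CS"), ("Meena", 75, "EE")])

# ORDER BY: highest marks first.
cur.execute("SELECT name, marks FROM students ORDER BY marks DESC")
ranked = cur.fetchall()          # [('Ravi', 91), ('Asha', 82), ('Meena', 75)]

# GROUP BY: number of students per department.
cur.execute("SELECT department, COUNT(*) FROM students GROUP BY department")
counts = dict(cur.fetchall())    # {'CS': 2, 'EE': 1}
print(ranked, counts)
```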
3. Discuss in detail the operators SELECT, PROJECT, UNION with suitable examples:
SELECT (σ)
The SELECT operator is used to filter rows from a relation that satisfy a given condition. It performs
horizontal filtering. Example: σ_age > 18(Student)
This returns all students whose age is greater than 18.
PROJECT (π)
The PROJECT operator is used to select specific columns from a relation. It performs vertical filtering.
Example: π_name, age (Student)
This returns only the name and age columns from the student relation.
UNION (∪)
The UNION operator combines the results of two relations and removes duplicates. Both relations must
be union-compatible (i.e., have the same number and type of attributes).
Example: π_name(Student) ∪ π_name(Teacher). This returns a combined list of unique names from
both Student and Teacher relations.
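The three operators can be mimicked in plain Python by treating a relation as a list of attribute dictionaries; the sample tuples below are invented.

```python
# Relations as lists of rows (illustrative data).
Student = [{"name": "Asha", "age": 20}, {"name": "Ravi", "age": 17}]
Teacher = [{"name": "Meena", "age": 41}, {"name": "Asha", "age": 35}]

def select(rel, pred):          # σ: horizontal filtering by a condition
    return [t for t in rel if pred(t)]

def project(rel, attrs):        # π: vertical filtering, duplicates removed
    seen = []
    for t in rel:
        row = {a: t[a] for a in attrs}
        if row not in seen:
            seen.append(row)
    return seen

def union(r, s):                # ∪: union-compatible relations, no duplicates
    return r + [t for t in s if t not in r]

adults = select(Student, lambda t: t["age"] > 18)   # σ_age > 18(Student)
names = union(project(Student, ["name"]),           # π_name(Student) ∪ π_name(Teacher)
              project(Teacher, ["name"]))
print(adults, names)
```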
4. Difference between Logical Data Independence and Physical Data Independence
Logical Data Independence:
• Ability to change the logical schema without affecting applications.
• Deals with changes like adding/removing tables or columns.
• More difficult to achieve.
• Example: Adding a new column in a table without changing existing queries.
Physical Data Independence:
• Ability to change the physical storage without affecting logical schema.
• Deals with file formats, indexing, or storage devices.
• Easier to achieve.
• Example: Changing data storage method from heap file to indexed file.
5. Discuss the terms Super Key, Candidate Key, and Primary Key with proper examples:
Super Key
A super key is a set of one or more attributes that can uniquely identify a tuple in a relation. It may
contain redundant attributes.
Example: In a Student table, the sets {roll_no}, {roll_no, name}, and {roll_no, email} are all super keys
if roll_no is unique for every student.
Candidate Key
A candidate key is a minimal super key, meaning it has no unnecessary attributes. There can be
multiple candidate keys in a table.
Example: If both roll_no and email uniquely identify a student, then both are candidate keys.
Primary Key
A primary key is a candidate key chosen by the database designer to uniquely identify tuples in a table.
It must be unique and cannot contain NULL values.
Example: If roll_no is chosen as the primary key in a Student table, it means each student must have
a unique, non-null roll_no.
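The minimality distinction between super keys and candidate keys can be checked over a sample of rows. Note this only illustrates the definitions: a real key is a property of the schema, not of one table snapshot, and the rows below are invented.

```python
rows = [
    {"roll_no": 1, "name": "Asha", "email": "a@x.in"},
    {"roll_no": 2, "name": "Asha", "email": "b@x.in"},
]

def is_super_key(rows, attrs):
    # attrs uniquely identify every row in this snapshot
    return len({tuple(r[a] for a in attrs) for r in rows}) == len(rows)

def is_candidate_key(rows, attrs):
    # minimal super key: removing any one attribute breaks uniqueness
    if not is_super_key(rows, attrs):
        return False
    return all(not is_super_key(rows, [b for b in attrs if b != a]) for a in attrs)

print(is_super_key(rows, ["roll_no", "name"]))      # True: unique, but not minimal
print(is_candidate_key(rows, ["roll_no", "name"]))  # False: roll_no alone suffices
print(is_candidate_key(rows, ["roll_no"]))          # True
```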
6. Discuss the difference between 'Strong Entity Set' and 'Weak Entity Set'.
Strong Entity Set:
• Can exist independently without any other entity.
• Has a primary key that uniquely identifies each entity.
• Represented by a single rectangle in ER diagrams.
• Example: Employee(emp_id, name) — emp_id is a primary key.
Weak Entity Set:
• Depends on a strong entity for its existence.
• Has no primary key; identified using a partial key and a foreign key from a strong entity.
• Represented by a double rectangle in ER diagrams.
• Example: Dependent(name, emp_id) — depends on Employee, where emp_id is a foreign key.

7. Relational Algebra for: Identify the commissions of all the salespersons who receive at least
one order of amount greater than Rs. 5,000
Given Tables:
• SALESPEOPLE(snum, sname, city, commission)
• ORDERS(onum, amt, odate, cnum, snum)
Expression:
π_commission (σ_amt > 5000 (ORDERS) ⨝ SALESPEOPLE)
This selects orders with amount > 5000, joins them with SALESPEOPLE on snum, and projects the commission.
Q8. Relational Algebra for: Identify all customers located in cities where salesperson ‘Amit’ has
customers
Given Tables:
SALESPEOPLE(snum, sname, city, commission)
CUSTOMERS(cnum, cname, city, rating, snum)
Step 1: Get the cities where salesperson 'Amit' has customers
π_city (CUSTOMERS ⨝ π_snum (σ_sname = 'Amit' (SALESPEOPLE)))
→ Projecting snum before the join makes the natural join match on snum only (both tables also
contain a city attribute); the result is the set of cities where Amit has customers.
Step 2: Get all customers in those cities
Final Expression:
π_cname (CUSTOMERS ⨝ π_city (CUSTOMERS ⨝ π_snum (σ_sname = 'Amit' (SALESPEOPLE))))
→ The outer natural join matches on city, keeping every customer located in one of the Step 1
cities.
Q9. Explain the concept of aggregation with a suitable example
Aggregation is used in the ER model when a relationship itself needs to participate in another
relationship. It allows us to treat a relationship as an abstract entity, enabling more complex
modeling.
Example:
• Suppose Employee works on a Project → (Works_On relationship).
• A Manager supervises the work of an employee on a project, not just the employee or project
individually.
• So, we aggregate the Works_On relationship and relate it to Manager through another
relationship called Supervises.
Thus, aggregation is needed when we want to relate an entity (like Manager) to a relationship (like
Works_On).
It helps in representing situations where relationships depend on other relationships.
Q10. Recall the definition of functional dependency to determine that each of Armstrong's
axioms is sound
A functional dependency X → Y means if two tuples agree on X, they must agree on Y.
Armstrong's Axioms (all are sound, i.e., every dependency they derive truly holds):
• Reflexivity: If Y ⊆ X, then X → Y
• Augmentation: If X → Y, then XZ → YZ
• Transitivity: If X → Y and Y → Z, then X → Z
Derived rules (provable from the three axioms, hence also sound):
• Union: If X → Y and X → Z, then X → YZ
• Decomposition: If X → YZ, then X → Y and X → Z
• Pseudo-Transitivity: If X → Y and YW → Z, then XW → Z
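The soundness of these rules is what makes the standard attribute-closure algorithm correct; a minimal sketch, with the FDs and attribute names invented:

```python
# Compute the closure X+ of an attribute set X under a list of FDs.
# Each FD is a (lhs, rhs) pair of sets; repeatedly apply any FD whose
# left side is already contained in the closure so far.
def closure(X, fds):
    result = set(X)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Transitivity in action: from A -> B and B -> C we derive C in A+.
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(sorted(closure({"A"}, fds)))  # ['A', 'B', 'C']
```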

Q12. Insertion Anomalies with Example and Solution


Insertion anomaly occurs when certain data cannot be inserted into a table without the presence of other, unrelated data.
Example: If we have a table:
Student(CourseID, CourseName, StudentName)
We cannot insert a new course until at least one student enrolls in it.
Solution:
Normalize the table to separate Course(CourseID, CourseName) and Enrollment(CourseID,
StudentName).
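The normalized design above can be verified quickly with sqlite3: the course row goes in with no enrollment at all (the IDs and names below are invented).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Normalized tables from the solution: courses live independently of enrollments.
cur.execute("CREATE TABLE Course (CourseID INTEGER PRIMARY KEY, CourseName TEXT)")
cur.execute("CREATE TABLE Enrollment (CourseID INTEGER REFERENCES Course, StudentName TEXT)")

# A new course can now be inserted before any student enrolls in it.
cur.execute("INSERT INTO Course VALUES (101, 'DBMS')")
row = cur.execute("SELECT CourseName FROM Course WHERE CourseID = 101").fetchone()
print(row)  # ('DBMS',)
```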
Q13. Determine Partial and Transitive Dependency with Examples
• Partial Dependency: When a non-prime attribute is functionally dependent on part of a
composite primary key.
Example:
In Student(CourseID, StudentID, Name), if Name depends only on StudentID, it's a partial
dependency.
• Transitive Dependency: When a non-prime attribute depends on another non-prime attribute
rather than directly on the key.
Example:
In Employee(EmpID, DeptID, DeptName), EmpID → DeptID and DeptID → DeptName, so DeptName
depends on EmpID only transitively, through DeptID.

Q14. Illustrate BCNF and Why It's Stronger than 3NF


• BCNF (Boyce-Codd Normal Form): A relation is in BCNF if for every functional dependency X
→ Y, X is a superkey.
Example:
R(Student, Course, Teacher)
F = {{Student, Course} → Teacher, Teacher → Course}
Here, Teacher is not a superkey, so Teacher → Course violates BCNF, and the relation must be
decomposed. (The relation is still in 3NF, because Course is a prime attribute.)
BCNF is stronger than 3NF because it removes the anomalies that 3NF still allows when a
non-superkey determinant exists.
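A BCNF check is mechanical once attribute closures are available. The sketch below uses a standard 3NF-but-not-BCNF example, R(Student, Course, Teacher) with {Student, Course} → Teacher and Teacher → Course, where Teacher determines Course without being a superkey:

```python
# Attribute closure: smallest set containing X and closed under the FDs.
def closure(X, fds):
    result = set(X)
    while True:
        new = set(result)
        for lhs, rhs in fds:
            if lhs <= new:
                new |= rhs
        if new == result:
            return result
        result = new

R = {"Student", "Course", "Teacher"}
fds = [({"Student", "Course"}, {"Teacher"}), ({"Teacher"}, {"Course"})]

# An FD X -> Y violates BCNF exactly when X is not a superkey (X+ != R).
violations = [(sorted(l), sorted(r)) for l, r in fds if closure(l, fds) != R]
print(violations)  # [(['Teacher'], ['Course'])]
```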
Q15. Illustrate the Fully Functional Dependency with Example
Definition:
A fully functional dependency occurs when a non-prime attribute is functionally dependent on the entire
composite key and not on any part of it.
Example:
In a Course_Registration(StudentID, CourseID, Grade) table:
• {StudentID, CourseID} → Grade is a fully functional dependency
• But StudentID → Grade and CourseID → Grade do not hold.
Reason: Grade depends on both StudentID and CourseID together — not individually. Hence, it's a
fully functional dependency.

Q16. Explain the Fundamental Operations in Relational Algebra


1. Selection (σ): Filters rows based on a condition.
σ_age > 18(Student)
2. Projection (π): Extracts specific columns.
π_name, age(Student)
3. Union (⋃): Combines rows from two relations.
A⋃B
4. Set Difference (−): Rows in one relation but not in another.
A−B
5. Cartesian Product (×): Combines every row of A with every row of B.
A×B
6. Rename (ρ): Renames a relation or its attributes.
ρ_S(Student)
Q17. Estimate the multi-version two-phase locking with the lock conversion technique.
• Multi-version 2PL (MV2PL) maintains multiple versions of data items to improve concurrency
while preserving consistency.
• Read operations access the most recent committed version, allowing them to proceed without
blocking.
• Write operations use exclusive locks and create a new version after the transaction commits.
• Lock conversion allows a transaction to upgrade (from shared to exclusive) or downgrade (from
exclusive to shared) a lock when needed.
Q18. Illustrate the ACID properties of a transaction.
• Atomicity: Ensures that all operations of a transaction are completed; if not, the transaction is
aborted.
• Consistency: Database remains in a valid state before and after the transaction.
• Isolation: Concurrent transactions do not interfere with each other.
• Durability: Once a transaction commits, its changes are permanent, even in case of failure.

Q19. Differentiate between Wait-Die and Wound-Wait Protocols for deadlock prevention.
Wait-Die Protocol
• Uses timestamps to decide which transaction waits or aborts.
• Older transaction waits; younger transaction dies (is aborted).
• Non-preemptive protocol.
• Deadlock is prevented by forcing younger transactions to restart.
Wound-Wait Protocol
• Also timestamp-based.
• Older transaction wounds (preempts) younger one, forcing it to abort.
• Preemptive protocol.
• Reduces waiting time for older transactions and avoids deadlock.
Q20. Analyze Concurrency Control Protocol and deduce the problems of concurrency.
• Concurrency Control Protocol ensures that database transactions are executed safely in a multi-
user environment.
• Problems without proper concurrency control:
o Lost Update: Two transactions update the same data, and one update is lost.
o Dirty Read: A transaction reads data written by an uncommitted transaction.
o Non-repeatable Read: Data changes between two reads in the same transaction.
o Phantom Read: New rows appear in a repeated query due to other transactions.

Q21. Analyze the concept of cascading rollback with proper examples.


Cascading rollback happens when one transaction's failure causes others to roll back due to
dependency.
Example: T1 writes to X → T2 reads X → T1 fails → T2 must roll back.
Prevention: Use strict 2PL, which holds all exclusive locks until commit, so no transaction can read or overwrite uncommitted data.
Q22. Compose the benefit of strict two-phase locking provided.
• Strict 2PL ensures serializability by holding all exclusive locks until commit or abort.
• Prevents cascading rollbacks.
• Makes recovery easier as only committed data is read or written.
• Simplifies undo/redo logging during failures.

Q23. Report Multi-valued Dependency with a suitable example


A Multivalued Dependency (MVD) occurs when one attribute in a relation determines a set of values of
another attribute, independently of other attributes.
Example: In a relation STUDENT (StudentID, Course, Hobby):
A student can have multiple hobbies and multiple courses.
The hobbies and courses are independent of each other.
This creates a multivalued dependency:
StudentID →→ Course and StudentID →→ Hobby
To avoid redundancy, such dependencies are handled using 4NF (Fourth Normal Form).
Q24. Justify the terms ‘Fully Functional Dependency’ and ‘Multivalued Dependency’ with
examples
Fully Functional Dependency (FFD): An attribute Y is fully functionally dependent on attribute X if it
depends on the whole of X, not just a part.
Example: In STUDENT (RollNo, Subject, Marks),
Marks depends on both RollNo and Subject, not just one.
So, (RollNo, Subject) → Marks is a fully functional dependency.
Multivalued Dependency (MVD): As explained above, MVDs occur when a single attribute determines
multiple independent values of another.
Example: StudentID →→ Hobby means a student can have multiple hobbies independent of their
course.
Q25. Express the concept of Primary Indexing with a suitable example
Primary Indexing is created on the primary key of a table. It is an ordered file with entries for each block
in the data file.
Example: Consider a table EMPLOYEE(EmpID, Name, Dept).
If EmpID is the primary key, then an index is created on it.
The index stores values like:
EmpID: 101 → Block 1
EmpID: 108 → Block 2
This helps in faster retrieval of records using the primary key.
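A sparse primary index like the one above can be sketched with a sorted list and binary search; the block layout and EmpID values are invented.

```python
from bisect import bisect_right

# One index entry per block: (first EmpID stored in the block, block number).
index = [(101, 1), (108, 2)]

def find_block(emp_id):
    # Binary-search for the last block whose first key is <= emp_id.
    keys = [k for k, _ in index]
    pos = bisect_right(keys, emp_id) - 1
    return index[pos][1] if pos >= 0 else None

print(find_block(101), find_block(110), find_block(50))  # 1 2 None
```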
Q26. Express the concept of indexing and identify the different types of indexing
Indexing is a data structure technique to quickly retrieve records from a database table. It works like
an index in a book.
Types of Indexing:
Primary Indexing – Based on the primary key.
Secondary Indexing – Based on non-primary attributes.
Clustered Indexing – Records are sorted as per the index.
Multilevel Indexing – Indexes are built on top of other indexes for large datasets.
Indexing improves query performance significantly.
Q27. Identify the difference between Single-valued vs Multi-valued and Stored vs Derived
attributes.
• Single-valued Attribute
o Holds only one value per entity instance.
o Simple to represent in a table as a single column.
• Multi-valued Attribute
o Can hold multiple values for a single entity.
o Often represented using a separate table or set.
Example: An employee has one employee ID (single-valued) but may have multiple contact numbers
(multi-valued).
• Stored Attribute
o The value is physically stored in the database.
o Entered and maintained manually.
• Derived Attribute
o The value is calculated from other stored attributes.
o Not stored directly but derived when needed.
Example: Date of Birth is stored, while Age is derived from it.
Q28. Compare Data Abstraction and Data Independence in a Database System
Data Abstraction:
• It is the process of hiding low-level details and showing only essential information to users.
• It is achieved through three levels: physical, logical, and view.
• Example: A user views student names without knowing how they are stored in memory.
Data Independence:
• It refers to the capacity to change the schema at one level without affecting the next higher level.
• Two types: physical (change storage without changing structure) and logical (change structure
without affecting apps).
• Example: Changing file storage method without altering SQL queries.

Q29. Choose between an entity-relationship model and a relational data model for representing
complex relationships
For representing complex relationships, the Entity-Relationship (ER) model is more suitable during the
design phase because:
• It visually maps entities, relationships, and constraints.
• It handles one-to-many, many-to-many, and recursive relationships clearly.
• It supports abstraction concepts like generalization and aggregation.
• After design, the ER model can be converted into a Relational Data Model for implementation.
Q30. Determine the key differences between a data definition language (DDL) and a data
manipulation language (DML) in a database system
Purpose:
DDL is used to define and modify database schema.
DML is used to access and manipulate data in the database.
Execution:
DDL statements change the schema; in many systems (e.g., Oracle and MySQL) they commit implicitly.
DML statements change data and can be rolled back before commit.
Examples:
DDL: CREATE, ALTER, DROP
DML: SELECT, INSERT, UPDATE, DELETE
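The rollback behaviour of DML can be seen with sqlite3. (Whether DDL auto-commits is system-specific: Oracle commits DDL implicitly, while SQLite and PostgreSQL can roll DDL back. The table and rows here are invented.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")  # DDL

cur.execute("INSERT INTO emp VALUES (1, 'Asha')")   # DML
conn.commit()
cur.execute("INSERT INTO emp VALUES (2, 'Ravi')")   # DML, not yet committed
conn.rollback()                                     # undoes the second insert

count = cur.execute("SELECT COUNT(*) FROM emp").fetchone()[0]
print(count)  # 1 -- only the committed row remains
```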
Q31. Discover the main integrity constraints used in a relational database and explain their
significance
The main integrity constraints are:
• Primary Key Constraint – Ensures each record is unique and identifiable.
• Foreign Key Constraint – Maintains referential integrity between related tables.
• Unique Constraint – Prevents duplicate values in a column.
• Not Null Constraint – Ensures a column cannot have NULL values.
• Check Constraint – Restricts values based on a condition.
• These constraints maintain accuracy, consistency, and reliability of data.
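Most of these constraints can be exercised in a few lines of sqlite3; each bad insert below raises an IntegrityError. The schema and values are invented, and SQLite only enforces foreign keys once the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite-specific: enable FK checks
cur = conn.cursor()
cur.execute("""CREATE TABLE dept (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE)""")
cur.execute("""CREATE TABLE emp (
    emp_id  INTEGER PRIMARY KEY,
    dept_id INTEGER NOT NULL REFERENCES dept(dept_id),
    salary  INTEGER CHECK (salary > 0))""")

cur.execute("INSERT INTO dept VALUES (10, 'Sales')")
errors = []
for stmt in [
    "INSERT INTO emp VALUES (1, 99, 5000)",   # FK: dept 99 does not exist
    "INSERT INTO emp VALUES (2, 10, -1)",     # CHECK: salary must be > 0
    "INSERT INTO dept VALUES (11, NULL)",     # NOT NULL on name
]:
    try:
        cur.execute(stmt)
    except sqlite3.IntegrityError as e:
        errors.append(type(e).__name__)
print(errors)  # ['IntegrityError', 'IntegrityError', 'IntegrityError']
```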
Q32. Compare the entity-relationship model and the relational data model in terms of their
representation of data relationships
Entity-Relationship (ER) Model:
1. Represents data using entities, attributes, and relationships using diagrams.
2. Relationships are shown as diamonds connecting entity sets.
3. Used mainly during database design to visualize structure.
Relational Data Model:
1. Represents data using tables (relations) with rows and columns.
2. Relationships are shown using foreign keys across different tables.
3. Used during database implementation and querying with SQL.
Q33. Choose between MySQL and Oracle as a database management system for a small
business website
MySQL is more suitable for a small business website because:
• It is open-source and cost-effective.
• Offers sufficient scalability and performance.
• Supported by many web hosting services.
• Has a large community and documentation.
• Oracle is more powerful but better suited for enterprise-level applications due to higher cost and
complexity.
Q34. Determine the differences between relational algebra and SQL3 as query languages
• Nature of Language:
o Relational Algebra is a procedural query language — it tells how to retrieve data.
o SQL3 is a declarative query language — it tells what data to retrieve.
• Purpose:
o Relational Algebra is used mainly for theoretical foundation of query processing.
o SQL3 is used in practical applications within database systems.
• Operations:
o Relational Algebra provides basic operations like select, project, join, union, etc.
o SQL3 supports complex operations, including aggregation, subqueries, triggers, and
more.
• Result Format:
o Relational Algebra returns sets without duplicate handling by default.
o SQL3 returns multisets (bags) and can handle duplicates unless specified with
DISTINCT.
Q35. Discover the role of Armstrong's axioms in relational database design and normalization.
Armstrong's axioms are a set of inference rules used to derive all functional dependencies on a
relational database schema. These include reflexivity, augmentation, and transitivity. They help in:
• Deriving closure of attribute sets.
• Identifying candidate keys.
• Performing normalization like 2NF, 3NF, and BCNF.
Thus, they ensure the correctness and completeness of dependency analysis in database design.

Q36. Compare the benefits of using a B-tree index with a hash index in a database system.
• B-tree Index:
o Supports range queries (e.g., >, <, BETWEEN).
o Keeps data sorted, enabling ordered traversal.
o Performs well for insertions, deletions, and lookups in balanced time.
• Hash Index:
o Offers faster exact-match queries (e.g., =, IN).
o Not suitable for range queries or sorting.
o Better for workloads with frequent equality searches.
Use B-tree for flexible and ordered queries; use hash for fast, fixed-value lookups.
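The trade-off can be sketched with a sorted list standing in for a B-tree's leaf level and a dict standing in for a hash index; the keys are invented.

```python
from bisect import bisect_left, bisect_right

keys = [3, 8, 8, 15, 21, 42]     # sorted keys: a stand-in for B-tree leaves
hash_index = {}                   # key -> positions: a stand-in for a hash index
for pos, k in enumerate(keys):
    hash_index.setdefault(k, []).append(pos)

# Range query (BETWEEN 5 AND 20): natural on the ordered structure.
range_hits = keys[bisect_left(keys, 5):bisect_right(keys, 20)]
print(range_hits)                 # [8, 8, 15]

# Exact match: O(1) average on the hash index, but no ordering to exploit.
print(hash_index.get(8))          # [1, 2]
```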
Q37. Choose between a locking-based concurrency control mechanism and a timestamp-based
concurrency control mechanism for a database system.
Locking-based concurrency control is preferred when managing high data contention with
frequent updates.
Reasons:
• Ensures serializability using Two-Phase Locking (2PL).
• Efficient in write-heavy environments with proper deadlock handling.
• Allows concurrent reads using shared locks, improving performance.

Q38. Determine the importance of the ACID properties in transaction processing.


• Atomicity ensures all operations in a transaction complete or none do.
• Consistency guarantees that the database remains valid before and after the transaction.
• Isolation prevents interference between concurrent transactions.
• Durability ensures changes persist after a successful transaction, even during failures.
Q39. Discover the purpose of a database recovery mechanism in a database system.
• A recovery mechanism restores the database to a consistent state after a failure.
• It handles failures like system crashes, disk failures, or transaction errors.
• Uses techniques like log-based recovery, checkpoints, and shadow paging.
• Ensures data durability and consistency, upholding ACID properties.
Q40. Compare the query optimization process with the evaluation of relational algebra
expressions in a database system.
Query Optimization:
1. Focuses on finding the most efficient execution plan for a given query.
2. Uses cost-based or rule-based strategies to choose optimal plans.
3. Happens before execution, during query compilation.
Evaluation of Relational Algebra Expressions:
1. Refers to the actual execution of the relational algebra operations.
2. Involves operations like selection, projection, join, etc.
3. Happens after optimization, using the selected execution plan.
Q41. Choose between 3NF and BCNF as the most appropriate normal form for a given
database schema.
Choice: Boyce-Codd Normal Form (BCNF) is preferred when eliminating all redundancy due to
functional dependencies is critical.
Reasons:
• BCNF is stricter than 3NF, ensuring no anomalies due to any functional dependency.
• Use 3NF when BCNF is not achievable without losing dependency preservation.
• Choose BCNF when data integrity and minimal redundancy are more important than ease of
dependency enforcement.
Q42. Determine the purpose of dependency preservation in relational database design
Purpose of Dependency Preservation:
1. Maintains functional dependencies after decomposing a relation into smaller relations.
2. Ensures data integrity constraints can be enforced without needing to join the decomposed
relations.
3. Helps in efficient query processing by checking constraints on individual relations instead of
recombining them.
Q43. Discover the concept of query equivalence and why it is important in database systems.
Query equivalence refers to the concept where two different queries yield the same result for every
possible database state. It plays a vital role in query optimization, allowing the database system to
transform a given query into an equivalent one that is more efficient to execute without changing its
meaning. This helps improve performance, reduce resource usage, and ensure correctness of query
results during optimization.
Q44. Explain what relational algebra is.
Relational algebra is a formal query language used to retrieve data from a relational database. It
consists of operations like selection, projection, union, set difference, Cartesian product, and join to
manipulate and access data in a structured way.
Q45. Distinguish between Relational Algebra and SQL
Relational Algebra: • It is a procedural query language that uses a set of operations to retrieve data.
• Requires specifying how to obtain the result step by step.
• Mainly used in theoretical foundations and query optimization.
SQL: • It is a non-procedural (declarative) language used in real-world databases.
• Specifies what data is required, not how to get it.
• Widely used for defining, manipulating, and querying relational databases.

Q46. Classify the normal forms used in relational database design.


• 1NF (First Normal Form): No repeating groups or arrays; atomic values.
• 2NF: 1NF + no partial dependencies on the primary key.
• 3NF: 2NF + no transitive dependencies.
• BCNF: Every determinant is a candidate key (stricter than 3NF).
• 4NF & 5NF: Handle multi-valued and join dependencies, respectively.
Q47. Explain the concept of dependency preservation in relational database design
Dependency preservation is a key goal in relational database design when decomposing a relation into
multiple smaller relations during normalization. It ensures that all functional dependencies from the
original relation are still represented in the resulting set of relations. This allows the database to enforce
data integrity constraints without requiring expensive join operations. Preserving dependencies
simplifies constraint checking and maintains the correctness of the database schema after
decomposition.
Q48. Distinguish between a commercial and an open-source DBMS.
• Commercial DBMS:
o Developed by private companies and requires a paid license.
o Offers dedicated support, advanced features, and high security.
o Example: Oracle, Microsoft SQL Server.
• Open-source DBMS:
o Free to use and modify, maintained by community or non-profit.
o Customizable and suitable for learning or small-scale applications.
o Example: MySQL, PostgreSQL.
Choose commercial for enterprise-grade reliability; open-source for cost-effective and flexible use.
Q49. Explain the ACID properties in the context of transaction processing.
• Atomicity: Ensures the entire transaction is completed or none of it.
• Consistency: Ensures database rules are maintained after a transaction.
• Isolation: Concurrent transactions don’t interfere with each other.
• Durability: Once committed, changes remain even after a system crash.
Q50. Distinguish between locking and timestamp-based schedulers in concurrency control.
• Locking-based scheduler:
o Controls transaction access using locks (shared/exclusive).
o Can cause deadlocks, requiring prevention or detection techniques.
o Ensures serializability via protocols like Two-Phase Locking (2PL).
• Timestamp-based scheduler:
o Assigns timestamps to transactions to order operations.
o Prevents deadlocks by aborting conflicting transactions.
o Ensures serializability based on timestamps, not locks.
Locking suits systems with fewer conflicts; timestamps are better for deadlock-free execution.

Q51. State the reason why the use of DBMS is recommended? Describe by listing some of its
major advantages.
A Database Management System (DBMS) is recommended because it provides a systematic and
efficient way of storing, managing, retrieving, and manipulating data. It eliminates many limitations of
traditional file systems.
Major Advantages of DBMS:
1. Data Redundancy Control: DBMS reduces data duplication by storing data centrally and allowing
shared access.
2. Data Consistency: Because redundancy is minimized, updates to data are reflected across the
system, maintaining consistency.
3. Improved Data Security: DBMS allows access control and authorization to protect sensitive data.
4. Data Integrity and Accuracy: Constraints and rules can be enforced to ensure the correctness of
stored data.
5. Concurrent Access and Recovery: Multiple users can access data simultaneously, and DBMS
supports crash recovery and backup systems.
Q53. Describe the following terms:
• Data Redundancy and Consistency: Data redundancy occurs when the same piece of data is
stored in multiple places. This can lead to inconsistencies if all copies are not updated
together. Consistency ensures that data remains accurate and uniform across the database.
• Referential Integrity: This ensures that relationships between tables remain valid. For example,
if a foreign key in one table references a primary key in another, referential integrity ensures
that the referenced key actually exists.
• Data Atomicity: Atomicity means that a transaction must be all-or-nothing. Either every
operation within the transaction is executed successfully, or none are applied, preserving data
correctness in case of failures.
• Domain Constraints: These constraints define the permissible values for a given attribute. For
instance, a "birthdate" field must contain only valid date values, and an "age" field must be a
positive number.
• Data Models: Data models are abstract frameworks used to define how data is organized,
stored, and manipulated. Common models include the relational model (based on tables),
hierarchical model (tree-like structure), and the object-oriented model.
Q54. Define candidate key, primary key, super key, composite key, and alternate key with
suitable examples.
Candidate Key: A candidate key is a minimal set of attributes that can uniquely identify each tuple in
a relation. It must contain unique values and should not have any unnecessary attributes.
Example: In the relation STUDENT(roll_no, email, name), both roll_no and email can individually
identify a student. Hence, both are candidate keys.
Primary Key: A primary key is one of the candidate keys selected to uniquely identify the records in a
table. It cannot contain NULL values and must be unique for each record.
Example: If roll_no is chosen as the unique identifier in the STUDENT table, it becomes the primary
key. → PRIMARY KEY (roll_no)
Super Key: A super key is any set of attributes that can uniquely identify a tuple. It may contain extra
attributes that are not necessary for uniqueness. All candidate keys are super keys, but not all super
keys are candidate keys.
Example: {roll_no}, {roll_no, name}, and {email, name} are super keys in STUDENT, but only
{roll_no} and {email} are candidate keys.
Composite Key: A composite key is a key that consists of two or more attributes that together
uniquely identify a tuple, especially when no single attribute can do so alone.
Example: In the relation ENROLLMENT(student_id, course_id, semester), the combination of
student_id and course_id is used to uniquely identify each enrollment.
→ COMPOSITE KEY (student_id, course_id)
Alternate Key: An alternate key is any candidate key that is not selected as the primary key. It can
still uniquely identify tuples but is not used as the main reference key.
Example: In the STUDENT table, if roll_no is the primary key, then email becomes the alternate key.

Q55. Define Three-Schema Architecture of Database Management System.


The Three-Schema Architecture provides a framework to separate the user applications from the
physical database.
1. Internal Level (Physical Schema):
• Describes how data is actually stored on storage devices.
• Deals with data compression, indexing, and file structure.
2. Conceptual Level (Logical Schema):
• Represents the logical view of the entire database.
• Describes what data is stored and relationships between them.
3. External Level (View Schema):
• Describes different views for different users.
• Allows security and access control.
Purpose:
• Enhances data abstraction.
• Supports data independence (logical and physical).
• Simplifies database maintenance.
Q56. Recognize five main advantages of database management systems over traditional file
management systems.
1. Data Redundancy Control: DBMS minimizes data duplication by storing data centrally and
enabling shared access.
2. Data Consistency: As data is stored in a single database, updates are automatically visible to
all users, ensuring consistency.
3. Data Security: DBMS provides access control mechanisms to restrict unauthorized access to
sensitive data.
4. Data Integrity: Built-in constraints in DBMS ensure accuracy and validity of stored data.
5. Efficient Data Access and Querying: SQL and indexing allow fast and flexible access to large
volumes of data, which is difficult in file systems.

Q57. Relational Algebra statements


Tables:
• SALESPEOPLE (snum, sname, city, commission)
• CUSTOMERS (cnum, cname, city, rating, snum)
• ORDERS (onum, amt, odate, cnum, snum)
(a) Show the commissions of all the salespersons who receive at least one order of amount greater
than Rs. 5,000.
π commission (SALESPEOPLE ⨝ σ amt > 5000 (ORDERS))
• Select orders with amt > 5000.
• Join with SALESPEOPLE using snum.
• Project commission.
(b) Find all customers located in cities where salesperson ‘Amit’ has customers.
CITIES ← π city (σ sname = 'Amit' (SALESPEOPLE) ⨝ SALESPEOPLE.snum = CUSTOMERS.snum CUSTOMERS)
RESULT ← π cname (CUSTOMERS ⨝ CUSTOMERS.city = CITIES.city CITIES)
(The snum join condition is written explicitly because SALESPEOPLE and CUSTOMERS share both
snum and city as attribute names; a plain natural join would also match on city.)
• Find cities where 'Amit' has customers.
• Retrieve customers from those cities.
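Both algebra expressions can be cross-checked in SQL. Below is a small SQLite sketch; the table and column names follow the schema above, but the rows themselves are invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salespeople (snum INT, sname TEXT, city TEXT, commission REAL);
CREATE TABLE customers   (cnum INT, cname TEXT, city TEXT, rating INT, snum INT);
CREATE TABLE orders      (onum INT, amt REAL, odate TEXT, cnum INT, snum INT);
INSERT INTO salespeople VALUES (1, 'Amit', 'Delhi', 0.12), (2, 'Rita', 'Pune', 0.10);
INSERT INTO customers   VALUES (10, 'Ravi', 'Delhi', 100, 1),
                               (11, 'Sana', 'Pune',  200, 2),
                               (12, 'Vik',  'Delhi', 300, 2);
INSERT INTO orders      VALUES (100, 7000, '2024-01-01', 10, 1),
                               (101, 3000, '2024-01-02', 11, 2);
""")

# (a) commissions of salespersons with at least one order of amt > 5000
a = conn.execute("""
    SELECT DISTINCT s.commission
    FROM salespeople s JOIN orders o ON s.snum = o.snum
    WHERE o.amt > 5000
""").fetchall()
print(a)   # [(0.12,)] -- only Amit has an order above 5000

# (b) customers located in cities where 'Amit' has customers
b = conn.execute("""
    SELECT c.cname FROM customers c
    WHERE c.city IN (SELECT c2.city
                     FROM customers c2 JOIN salespeople s ON c2.snum = s.snum
                     WHERE s.sname = 'Amit')
""").fetchall()
print(b)   # Amit's customer Ravi is in Delhi, so Ravi and Vik qualify
```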

Q58. State the different data types supported by SQL with suitable examples.
SQL supports various data types categorized into numeric, character/string, date/time, and
miscellaneous types. Common types include:
1. INT / INTEGER
o Stores whole numbers.
o Example: age INT;
2. VARCHAR(n)
o Stores variable-length character strings (up to n characters).
o Example: name VARCHAR(50);
3. CHAR(n)
o Stores fixed-length character strings.
o Example: gender CHAR(1);
4. DATE / TIME / DATETIME
o Stores date/time values.
o Example: birth_date DATE;, login_time TIME;
5. DECIMAL(p, s) / NUMERIC(p, s)
o Stores exact numeric values with p digits total and s digits after the decimal.
o Example: price DECIMAL(10, 2);
6. BOOLEAN
o Stores TRUE or FALSE values.
o Example: is_active BOOLEAN;
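The declarations above can be tried out directly. A quick demonstration using SQLite (note: SQLite treats declared types as affinities rather than strict types, so BOOLEAN is stored as 0/1 and DATE as text, but the declarations mirror standard SQL):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE students (
        age        INTEGER,
        name       VARCHAR(50),
        gender     CHAR(1),
        birth_date DATE,
        price      DECIMAL(10, 2),
        is_active  BOOLEAN
    )
""")
conn.execute(
    "INSERT INTO students VALUES (?, ?, ?, ?, ?, ?)",
    (21, "Asha", "F", "2004-05-01", 499.99, True),
)
row = conn.execute("SELECT * FROM students").fetchone()
print(row)   # (21, 'Asha', 'F', '2004-05-01', 499.99, 1)
```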
Q61. Design a generalization-specialization hierarchy for a motor-vehicle sales company selling
motorcycles, passenger cars, vans, and buses. Justify attribute placement.
A generalization-specialization hierarchy groups entities that share common features while allowing
special entities to define their own specific attributes.

Hierarchy:
Vehicle (superclass)
├── Motorcycle
├── Passenger Car
├── Van
└── Bus
Justification of Attribute Placement:
• Vehicle (superclass): Common to all—vehicle_id, manufacturer, model, price
• Motorcycle: Specific—engine_cc, type
• Passenger Car: Attributes like no_of_seats, boot_space
• Van/Bus: passenger_capacity, route_type, cargo_volume
This design improves modularity, avoids redundancy, and supports future extensions like adding trucks
or EVs.

Q62. Demonstrate bulk loading of B+ tree of order 3 with the following keys:
56, 32, 18, 72, 45, 16, 98, 83, 81, 27, 39, 51, 66, 44, 33, 22
Order 3 ⇒ Max 3 children per internal node, hence max 2 keys.
Leaf nodes: 2–3 keys allowed.
Step 1: Sort the keys
16, 18, 22, 27, 32, 33, 39, 44, 45, 51, 56, 66, 72, 81, 83, 98
Step 2: Create Leaf Nodes (max 3 keys each; the last two leaves are rebalanced so no leaf is left
with a single key)
• L1: 16, 18, 22
• L2: 27, 32, 33
• L3: 39, 44, 45
• L4: 51, 56, 66
• L5: 72, 81
• L6: 83, 98
Step 3: First-Level Internal Nodes (each separator key is the smallest key in its right child)
• I1 → Keys: 27, 39 → Children: L1, L2, L3
• I2 → Keys: 72, 83 → Children: L4, L5, L6
Step 4: Root Node (separator 51 is the smallest key in I2's subtree, so searches for 51–66
correctly go right)
• Root → Key: 51 → Children: I1, I2
Final Tree Structure:
            [51]
           /    \
   [27, 39]    [72, 83]
   /  |  \     /  |  \
  L1  L2  L3  L4  L5  L6
• Leaves: (16,18,22), (27,32,33), (39,44,45), (51,56,66), (72,81), (83,98)
• Leaf nodes are linked for range queries
• Internal keys guide the search; actual data is in leaves
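The steps above can be sketched as a bottom-up bulk loader. This is an illustrative sketch, not a full B+ tree implementation, under the conventions used in this answer: up to 3 keys per leaf, each internal separator is the smallest key in its right child's subtree, and trailing nodes are rebalanced so none holds a single entry.

```python
def chunk(items, size):
    """Split items into groups of at most `size`; if the last group would be
    left with a single element, borrow from its left sibling."""
    groups = [items[i:i + size] for i in range(0, len(items), size)]
    if len(groups) > 1 and len(groups[-1]) < 2:
        need = 2 - len(groups[-1])
        groups[-1] = groups[-2][-need:] + groups[-1]
        groups[-2] = groups[-2][:-need]
    return groups

def smallest(node):
    """Smallest key reachable under a node (leaf = list, internal = dict)."""
    while isinstance(node, dict):
        node = node["children"][0]
    return node[0]

def bulk_load(keys, fanout=3):
    """Sort once, fill leaves left to right, then build index levels bottom-up.
    Returns (levels, leaves); levels[-1] holds the single root node."""
    leaves = chunk(sorted(keys), fanout)
    levels = [leaves]
    level = leaves
    while len(level) > 1:
        parents = []
        for group in chunk(level, fanout):
            # separator keys: smallest key under each child except the first
            seps = [smallest(child) for child in group[1:]]
            parents.append({"keys": seps, "children": group})
        levels.append(parents)
        level = parents
    return levels, leaves

levels, leaves = bulk_load([56, 32, 18, 72, 45, 16, 98, 83, 81,
                            27, 39, 51, 66, 44, 33, 22])
print([n["keys"] for n in levels[1]])   # [[27, 39], [72, 83]]
print(levels[-1][0]["keys"])            # [51]
```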
Q63. Explain the following:
a) Ternary Relationship: A ternary relationship is a relationship that connects three different entity sets
at the same time. It is required when the meaning of the relationship depends on all three entities
together, not just pairs.
Example: Supplies(Supplier, Part, Project)
→ A supplier supplies a part to a specific project.
We need all three: supplier, part, and project to describe this.
b) Weak Entity Set: A weak entity is an entity that cannot be uniquely identified by its own attributes. It
depends on a strong entity for identification and has a partial key that, combined with the strong entity's
key, forms a unique identifier.
Example: Dependent (weak) belongs to Employee (strong).
To identify a dependent, we need both employee_id and dependent_name.
c) Grouping: Grouping is used in relational algebra or SQL to organize data into groups based on
shared attribute values. It is usually combined with aggregate functions like SUM(), COUNT(), AVG().
Example: To find average salary by department:
SELECT department, AVG(salary)
FROM Employee
GROUP BY department;
d) Aggregation: Aggregation is a modeling concept where a relationship itself is treated as a higher-
level entity. This allows it to participate in another relationship, which is useful when a relationship has
attributes or needs to be linked further.
Example: If Instructor teaches a Course, and this relationship needs to be associated with a Semester,
we aggregate Teaches and then relate it to Semester.

Q65. Illustrate indexed sequential files with advantages and disadvantages.


Indexed Sequential File (ISF) combines sequential and random access using an index.
Structure: Index Table (Sparse):
---------------------
Key | Pointer
--------|---------
1001 | → Block 1
1050 | → Block 2
1100 | → Block 3
Data Blocks (Sorted Sequentially):
----------------------------------
Block 1 → 1001, 1005, 1010
Block 2 → 1050, 1055, 1060
Block 3 → 1100, 1110, 1120
The index stores only some keys, pointing to data blocks.
Data blocks are kept in sorted order for sequential access.
A pointer chain may be used to traverse blocks.
Advantages:
• Faster access (than purely sequential) using index
• Efficient for range queries and sequential scans
• Easy to add new records if space is reserved
Disadvantages:
• Index must be maintained when records change
• Insertion/deletion may cause overflow and require reorganization
• Extra space needed for index
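Lookup in an indexed sequential file can be sketched as a binary search over the sparse index followed by a sequential scan of one block; the toy data below mirrors the figure above.

```python
import bisect

index_keys = [1001, 1050, 1100]          # first key of each data block
blocks     = [[1001, 1005, 1010],
              [1050, 1055, 1060],
              [1100, 1110, 1120]]

def lookup(key):
    # find the last index entry <= key, then scan inside that block
    pos = bisect.bisect_right(index_keys, key) - 1
    if pos < 0:
        return False                     # key precedes the first block
    return key in blocks[pos]

print(lookup(1055))   # True  (index -> Block 2, then sequential scan)
print(lookup(1056))   # False (right block, but key absent)
```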
66. Illustrate whether B+ tree follows a multi-level indexing. How does it differ from B-tree?
Yes, B+ tree follows multi-level indexing. It uses a hierarchical structure where internal nodes act as
indexes that point to either lower-level index nodes or leaf nodes. This layered structure allows faster
search operations.
In a B+ tree:
• Actual data is stored only in the leaf nodes.
• Internal nodes contain only keys and pointers, serving as multi-level indexes.
• Leaf nodes are linked together, enabling fast range queries.
Difference from B-tree:
• In a B-tree, data can be stored in both internal and leaf nodes.
• In a B+ tree, only leaf nodes contain actual data; internal nodes serve purely as routing indexes.
• B+ tree allows efficient range-based searches due to linked leaf nodes, while B-tree does not
support this as efficiently.

67. Explain the two-phase locking protocol with a proper example.


Two-Phase Locking (2PL) is a concurrency control protocol that guarantees serializability. It divides a
transaction’s execution into two phases:
• Growing phase: The transaction can acquire locks but cannot release any.
• Shrinking phase: After it starts releasing any lock, it cannot acquire any more.
This ensures a clear boundary between lock acquisition and release.
Example:
Transaction T1:
lock(X)
read(X)
lock(Y)
write(Y)
unlock(X)
unlock(Y)
This follows 2PL as it first acquires all needed locks and then releases them, preventing conflicts like
dirty reads or lost updates.
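The growing/shrinking discipline can be enforced mechanically. A minimal sketch (the Transaction2PL class is hypothetical, not a real lock manager; lock conflicts between transactions are ignored here, only the two-phase rule itself is checked):

```python
class TwoPhaseViolation(Exception):
    pass

class Transaction2PL:
    def __init__(self, name):
        self.name = name
        self.held = set()
        self.shrinking = False   # flips to True at the first unlock

    def lock(self, item):
        # growing phase only: no lock may be acquired after any release
        if self.shrinking:
            raise TwoPhaseViolation(
                f"{self.name} cannot lock {item} after releasing a lock")
        self.held.add(item)

    def unlock(self, item):
        self.held.discard(item)
        self.shrinking = True    # shrinking phase begins

t1 = Transaction2PL("T1")
t1.lock("X")      # growing phase
t1.lock("Y")
t1.unlock("X")    # shrinking phase begins
t1.unlock("Y")    # valid 2PL schedule, like the example above
```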

68. Deadlock cannot occur in timestamp-based protocol. Criticize this statement.


The statement is generally true—deadlocks do not occur in timestamp-based protocols.
In timestamp ordering:
• Every transaction is given a unique timestamp.
• Operations are scheduled based on these timestamps.
• If a conflict arises, a violating transaction is rolled back and restarted instead of waiting.
• This avoids the circular wait condition required for deadlocks.
Criticism:
• Frequent aborts may happen in high-contention systems, leading to performance issues.
• Long or older transactions may suffer starvation due to repeated restarts.
• Although deadlocks are avoided, fairness and efficiency may be compromised.
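The rollback-instead-of-wait behaviour can be sketched with basic timestamp ordering. This is a simplified illustration (single data item, Thomas's write rule omitted): a transaction that arrives "too late" is rolled back rather than blocked, so no wait-for cycle can ever form.

```python
class Rollback(Exception):
    pass

class Item:
    def __init__(self):
        self.read_ts = 0    # largest timestamp that has read this item
        self.write_ts = 0   # largest timestamp that has written this item

def read(item, ts):
    if ts < item.write_ts:              # a younger txn already wrote it
        raise Rollback
    item.read_ts = max(item.read_ts, ts)

def write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        raise Rollback                  # abort and restart, never wait
    item.write_ts = ts

x = Item()
read(x, 2)            # T2 (younger) reads x first
try:
    write(x, 1)       # T1 (older) arrives late -> rolled back, not blocked
except Rollback:
    print("T1 rolled back")
write(x, 3)           # T3 proceeds normally
```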

69. Distinguish between locking and timestamp protocols for concurrency control.
1. Control Mechanism: Locking protocols use locks (shared/exclusive) to control access to data
items. Timestamp protocols use unique timestamps to order transactions logically.
2. Conflict Handling: Locking may cause transactions to wait, potentially leading to deadlocks.
Timestamp protocols never wait; conflicting transactions are aborted.
3. Deadlocks and Starvation:
o Locking protocols can lead to deadlocks, but usually avoid starvation.
o Timestamp protocols avoid deadlocks but may cause starvation, especially for older
transactions.
4. Execution Order: Locking does not guarantee a specific order unless explicitly managed.
Timestamp protocols strictly follow the order defined by transaction timestamps.
5. Overhead and Efficiency: Locking introduces overhead in acquiring and managing locks.
Timestamp protocols may increase overhead due to frequent rollbacks.
70. Estimate the desirable properties of transactions.
The desirable properties of transactions are known as ACID properties, along with an additional
practical consideration:
1. Atomicity: A transaction should be all-or-nothing—either all operations are completed or none
are.
Example: If money is debited from one account but not credited to another, the whole
transaction must roll back.
2. Consistency: A transaction must maintain the integrity constraints of the database.
Example: After a transaction, total bank balances should remain valid.
3. Isolation: Transactions should not interfere with each other. Intermediate states must not be
visible.
Example: If two people book the same seat, only one should succeed.
4. Durability: Once committed, changes must persist even after a system failure.
Example: After transfer confirmation, the balance should remain updated even after a crash.
5. Recoverability (practical addition): The system should be able to recover from failures using
logs or backups, ensuring no data is lost.

71. Estimate various issues while transactions are running concurrently in DBMS.
When multiple transactions run concurrently, several issues can arise due to improper synchronization:
Lost Update
Two transactions overwrite each other’s updates unintentionally.
Example: T1 and T2 read X, then both write different values back to X.
Dirty Read
A transaction reads uncommitted changes made by another.
Example: T1 updates X but hasn’t committed. T2 reads X and uses the value.
Non-repeatable Read
A value read twice by a transaction changes due to another transaction’s update.
Example: T1 reads X, T2 updates X, and T1 reads X again—it sees a different value.
Phantom Read
A transaction sees different results for the same query due to insert/delete by others.
Example: T1 reads rows where salary > 5000. T2 inserts such a row, and T1 sees it next time.
Deadlock
Transactions wait for each other’s resources, causing a cycle with no progress.
Example: T1 holds lock on A and waits for B; T2 holds B and waits for A.
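The lost update above is easy to reproduce deterministically. In this toy simulation the "database" is just a dict and the interleaving is forced by hand:

```python
db = {"X": 100}

# T1 and T2 both read X, then both write back read_value + their own delta.
t1_read = db["X"]          # T1 reads 100
t2_read = db["X"]          # T2 reads 100 (before T1 writes)
db["X"] = t1_read + 50     # T1 writes 150
db["X"] = t2_read - 30     # T2 writes 70, overwriting T1's update

# T1's +50 is lost: a serial execution would end at 120, not 70.
print(db["X"])             # 70
```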

72. Illustrate database recovery and explain shadow paging in detail.


Database recovery ensures data consistency after a crash or failure. One recovery technique is
shadow paging.
Shadow Paging Concept:
• Maintains two page tables: a current page table and a shadow page table.
• The shadow page table points to the most recent committed state.
• Any update is written to a fresh copy of the page; the committed pages referenced by the
shadow table are never modified in place.
Commit Process:
• After all updates complete, the current page table becomes the new shadow page table and the
old shadow table is discarded.
• Committed data is thus available immediately, without log replay.
Benefits:
• No need for UNDO/REDO logs.
• Simple and crash-resilient.
Drawbacks:
• Increased storage usage due to page copying.
• Hard to manage in large databases.
Example: If page 3 is updated, a new version of page 3 is created. The old one is kept unchanged. If
system crashes mid-way, original data remains intact.
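The copy-on-write idea can be sketched in a few lines. The ShadowPagingDB class below is hypothetical (a real system would swap the page-table pointer atomically on disk); it only illustrates that a crash before commit leaves the shadow state untouched.

```python
class ShadowPagingDB:
    def __init__(self, pages):
        self.store = dict(enumerate(pages))              # page_id -> contents
        self.next_id = len(pages)
        self.shadow = {n: n for n in range(len(pages))}  # committed table
        self.current = dict(self.shadow)                 # in-flight table

    def write(self, page_no, data):
        # copy-on-write: the update goes to a fresh page, never the original
        self.store[self.next_id] = data
        self.current[page_no] = self.next_id
        self.next_id += 1

    def read_committed(self, page_no):
        return self.store[self.shadow[page_no]]

    def commit(self):
        # atomic swap: the current table becomes the new shadow table
        self.shadow = dict(self.current)

    def crash(self):
        # losing the in-flight table leaves the committed state intact
        self.current = dict(self.shadow)

db = ShadowPagingDB(["a", "b", "c"])
db.write(1, "B")                # uncommitted update to page 1
print(db.read_committed(1))     # 'b': shadow table still sees the old page
db.crash()                      # failure before commit
print(db.read_committed(1))     # 'b': original data remains intact
db.write(1, "B2"); db.commit()
print(db.read_committed(1))     # 'B2': committed state now visible
```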
73. Evaluate the need of lock in DBMS. Explain shared lock and exclusive lock with examples.
Need for Lock in DBMS: Locks are essential in a DBMS to maintain data consistency and isolation
when multiple transactions access the same data concurrently. They help prevent problems like dirty
reads, lost updates, and uncommitted data being read, by ensuring that only valid operations occur on
the database.
Types of Locks:
1. Shared Lock (S-Lock):
o Allows multiple transactions to read the same data item but prevents any from writing to
it.
o Example:
If Transaction T1 places a shared lock on a row to read a student’s record, Transaction
T2 can also read it but cannot modify it until T1 releases the lock.
2. Exclusive Lock (X-Lock):
o Allows only one transaction to read and write the data item. Other transactions are
blocked from reading or writing until the lock is released.
o Example:
If Transaction T1 places an exclusive lock on an employee's salary row to update it, no
other transaction can read or write that row until T1 completes.
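The S/X compatibility rules can be sketched as a tiny lock table for a single data item. The class is hypothetical (a real lock manager would also queue waiting transactions); here a request simply returns whether the lock could be granted.

```python
class LockTable:
    def __init__(self):
        self.shared = set()      # transactions holding S-locks
        self.exclusive = None    # transaction holding the X-lock, if any

    def acquire_shared(self, txn):
        # S-lock granted unless some other transaction holds the X-lock
        if self.exclusive is not None and self.exclusive != txn:
            return False
        self.shared.add(txn)
        return True

    def acquire_exclusive(self, txn):
        # X-lock granted only if no other transaction holds any lock
        others = self.shared - {txn}
        if others or (self.exclusive not in (None, txn)):
            return False
        self.exclusive = txn
        return True

lt = LockTable()
print(lt.acquire_shared("T1"))     # True
print(lt.acquire_shared("T2"))     # True: S-locks are compatible
print(lt.acquire_exclusive("T2"))  # False: blocked by T1's S-lock
```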

74. Test the functions of data warehouse tools and utilities.


Data warehouse tools and utilities assist in extracting, storing, and analyzing large datasets efficiently.
• Data Extraction and Loading (ETL): Tools extract data from multiple sources, transform it into a
consistent format, and load it into the warehouse.
Example: Informatica, Talend.
• Data Cleaning: Removes inconsistencies, duplicates, and null values from datasets.
Example: Replacing 'N/A' with NULL, fixing typos.
• Metadata Management: Stores information about data (schema, source, transformation rules)
to aid interpretation.
Example: Showing column mappings and data lineage.
• Query and Analysis Tools: Help users run complex queries and visualize data.
Example: OLAP tools, Power BI, Tableau.
• Backup and Recovery Utilities: Ensure data is not lost and can be restored after failure.
Example: Periodic snapshots of warehouse data.
These tools help maintain the reliability, integrity, and usability of the data warehouse for effective
decision-making.

75. Summarize deadlock and explain deadlock prevention and deadlock detection techniques.
Deadlock in DBMS occurs when two or more transactions are waiting for each other to release
resources, and none can proceed. It leads to a situation where the system gets stuck, and transactions
cannot complete.
Deadlock Prevention: Deadlock is prevented by ensuring that at least one of the necessary
conditions for deadlock (mutual exclusion, hold and wait, no preemption, circular wait) cannot
occur. Common techniques include:
1. Wait-Die Scheme: Older transactions wait; younger ones requesting locked items are rolled
back.
2. Wound-Wait Scheme: Older transactions preempt younger ones by rolling them back if they
hold the lock.
3. Resource Ordering: Enforces a fixed order in acquiring locks to avoid circular wait.
Deadlock Detection
In this approach, the system allows deadlocks to occur but checks for them periodically:
1. Wait-for Graph (WFG):
A directed graph is used where nodes represent transactions and edges represent waiting. A
cycle in the graph indicates a deadlock.
2. Recovery:
Once a deadlock is detected, the system breaks it by aborting one or more transactions involved.
76. Assess the universal relation R = {A,B,C,D,E,F,G,H,I,J} with FDs = {(A,B)→C, A→D,E, B→F,
F→G,H, D→I,J}
a) Candidate Key:
• Compute closure of (A,B):
(A,B)+ → C (by AB→C), D,E (A→D,E), F (B→F), G,H (F→G,H), I,J (D→I,J)
→ (A,B)+ = all attributes ⇒ Candidate key = (A,B)
b) Decompose to 2NF:
Check for partial dependencies (i.e., from part of candidate key):
• A → D,E ⇒ partial
• B → F ⇒ partial
Decompose:
• R1(A, B, C)
• R2(A, D, E)
• R3(B, F)
• R4(F, G, H)
• R5(D, I, J)
No partial dependency remains → 2NF achieved
c) Decompose to 3NF:
Check for transitive dependencies:
• D → I,J and F → G,H are already separated
All transitive dependencies resolved
→ Final relations R1–R5 are in 3NF
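The closure computation in part (a) is mechanical. Below is a short sketch of the standard attribute-closure algorithm, checked against Q76's FDs (J is included in R since D→I,J produces it):

```python
def closure(attrs, fds):
    """Attribute closure: fds is a list of (lhs, rhs) pairs, each a set.
    Repeatedly apply any FD whose left side is already in the result."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

R = set("ABCDEFGHIJ")
fds = [({"A", "B"}, {"C"}), ({"A"}, {"D", "E"}), ({"B"}, {"F"}),
       ({"F"}, {"G", "H"}), ({"D"}, {"I", "J"})]

print(closure({"A", "B"}, fds) == R)   # True: (A,B) is a candidate key
print(sorted(closure({"A"}, fds)))     # ['A', 'D', 'E', 'I', 'J']
```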

77. Given relation R = {A, B, C, D, E} with FDs = {CE→D, D → B, C → A}


a) Candidate Key:
• Start with CE:
CE → D → B (D→B), C → A ⇒ CE+ = {C, E, D, B, A} = all attributes
Candidate key = CE
b) Highest Normal Form:
• 1NF: Yes, assuming all attribute values are atomic.
• 2NF: No. C → A is a partial dependency: C is a proper part of the only candidate key CE, and A
is a non-prime attribute.
• (D → B is additionally a transitive dependency: D is not a super key and B is non-prime.)
Relation is in 1NF but not 2NF.
c) BCNF Decomposition:
Step 1: Decompose on D → B (D is not a super key):
• R1(D, B)
• R2(A, C, D, E)
Step 2: In R2, C → A violates BCNF (C is not a super key of R2):
• Decompose R2 into R3(C, A) and R4(C, E, D)
Final BCNF relations:
• R1(D, B), R3(C, A), R4(C, E, D)
78. Justify the need of OLTP and OLAP and state their differences.
Need:
• OLTP (Online Transaction Processing): Manages real-time operations like insert, update,
delete in databases. Needed for day-to-day business operations.
• OLAP (Online Analytical Processing): Helps analyze large volumes of data for decision-
making. Useful in reporting, forecasting, and trend analysis.
Differences:
1. Purpose:
OLTP handles real-time transactions; OLAP performs analysis on historical data.
2. Data Volume:
OLTP uses small transactions; OLAP deals with large data sets.
3. Queries:
OLTP uses simple, fast queries; OLAP uses complex analytical queries.
4. Database Design:
OLTP is normalized; OLAP is often denormalized (star/snowflake schema).
5. Users:
OLTP is used by clerks, DBAs; OLAP by data analysts, managers.

79. Define DBMS.


A Database Management System (DBMS) is a software system that enables users to create, store,
manage, and manipulate data in a structured way. It provides an interface between the users and the
database, allowing efficient access and management of large amounts of data.
A DBMS ensures various key functions:
• Data Storage and Retrieval: Users can easily insert, update, delete, and fetch data using query
languages like SQL.
• Data Integrity and Security: It enforces rules to ensure data is accurate and only accessible to
authorized users.
• Concurrency Control: Allows multiple users to access the data simultaneously without conflicts.
• Backup and Recovery: Protects data from loss due to failures by providing recovery
mechanisms.
• Data Independence: Changes in data structure do not affect how data is accessed by users.
Examples of commonly used DBMS include MySQL, Oracle, PostgreSQL, and Microsoft SQL Server.

80. Compare the concepts of Data Abstraction and Data Independence in a database system.
1. Definition:
o Data Abstraction refers to hiding low-level details of how data is stored and maintained.
o Data Independence is the ability to change schema at one level without affecting others.
2. Levels Involved:
o Data Abstraction includes physical, logical, and view levels.
o Data Independence includes logical and physical independence.
3. Goal:
o Data Abstraction simplifies user interaction.
o Data Independence improves flexibility and maintenance.
4. User View:
o In abstraction, users see only necessary data.
o In independence, users are unaffected by internal changes.
5. Example:
o Abstraction: User sees "Student table", not file structures.
o Independence: Change file structure (physical) → no impact on user view.
81. Distinguish between Data Definition Language (DDL) and Data Manipulation Language
(DML) in a database system.
Data Definition Language (DDL) and Data Manipulation Language (DML) are two core subsets of SQL,
each serving different purposes in a relational database:
1. Purpose: DDL defines the structure/schema of a database (e.g., tables, indexes), while DML
deals with the data within those structures.
2. Operations: DDL includes commands like CREATE, ALTER, DROP.
DML includes SELECT, INSERT, UPDATE, DELETE.
3. Effect on Data: DDL changes the schema or structure; DML changes actual data.
4. Auto-commit:
DDL commands are auto-committed (permanent), DML changes can be rolled back.
5. Execution Time: DDL is executed less frequently (design time), DML is used frequently
(runtime).

82. Evaluate the significance of integrity constraints in a relational database


• Definition:
Integrity constraints are rules that ensure the correctness and validity of data in a relational
database.
• Types:
o Entity Integrity: Primary key must be unique and not null.
o Referential Integrity: Foreign keys must refer to valid primary keys.
o Domain Integrity: Ensures values fall within a valid range or format.
o User-defined constraints: Custom rules (e.g., age > 18).
• Significance:
o Prevents invalid data entry.
o Maintains consistency across relations.
o Supports enforcement of business rules.
o Essential for reliable query results.
Example: Preventing deletion of a parent row when child rows still reference it via foreign key.

83. Analyze the key characteristics of the Entity-Relationship (ER) Model and the Relational Data
Model
• Entity-Relationship Model:
o Conceptual model used for database design.
o Represents data using entities, attributes, and relationships.
o Helps visualize structure using ER diagrams.
o Good for understanding real-world objects.
• Relational Data Model:
o Logical model based on tables (relations).
o Data is stored as rows (tuples) and columns (attributes).
o Supported by relational algebra and SQL.
o Enforces integrity and normalization.
Both models help in structured and meaningful data representation but operate at different levels—
conceptual (ER) vs logical (Relational).

84. Compare Relational Algebra and SQL3 as relational query languages


1. Language Type: Relational Algebra is procedural (specifies how), SQL3 is declarative
(specifies what).
2. Usage: Relational Algebra is used for internal query optimization; SQL3 is used by end users
and applications.
3. Syntax Style: Algebra uses mathematical notation; SQL3 uses readable English-like syntax.
4. Expressiveness: SQL3 supports aggregation, recursion, triggers, and object-relational
features, which algebra lacks.
5. Output: Both return relations as output, but SQL3 can also return scalar values or nested
results.
85. Distinguish between Tuple Relational Calculus (TRC) and Domain Relational Calculus (DRC)
1. Focus:
TRC focuses on tuples (entire rows), while DRC focuses on domains (individual column
values).
2. Syntax:
TRC uses variables for tuples: {t | P(t)}
DRC uses variables for fields: {<x, y> | P(x, y)}
3. Clarity:
TRC is closer to how we think about rows; DRC is more abstract, field-oriented.
4. Complexity:
DRC can be more compact; TRC can be more intuitive for beginners.
5. Equivalence:
Both are logically equivalent in expressive power (can define same queries).

86. Evaluate the role of indices in database storage strategies


• Definition:
An index is a data structure (like B+ tree or hash) that improves the speed of data retrieval.
• Importance:
o Faster Querying: Speeds up SELECT and WHERE clauses significantly.
o Sorting Support: Helps with ORDER BY, GROUP BY.
o Join Performance: Reduces time in joins by indexing key columns.
o Reduced I/O: Avoids full table scans for common queries.
o Types: Primary index, secondary index, clustered, unclustered.
Trade-off: Indexes increase read speed but may slow down insert/update/delete and use extra space.
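The effect of an index on the access path can be observed directly in SQLite via EXPLAIN QUERY PLAN (the exact wording of the plan output varies slightly between SQLite versions, so only the key phrases are checked):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [(i, "d%d" % (i % 10), i * 100) for i in range(1000)])

def plan(sql):
    # last column of each EXPLAIN QUERY PLAN row is the human-readable detail
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(r[-1] for r in rows)

query = "SELECT * FROM emp WHERE dept = 'd3'"
plan_before = plan(query)     # full table scan: "SCAN emp"

conn.execute("CREATE INDEX idx_dept ON emp(dept)")
plan_after = plan(query)      # "SEARCH ... USING INDEX idx_dept (dept=?)"

print(plan_before)
print(plan_after)
```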

Q87. Analyze the concept of Concurrency Control in transaction processing


Concurrency control ensures correct and consistent execution of simultaneous transactions in a
multi-user database environment.
Purpose: To maintain data consistency, isolation, and serializability during concurrent transaction
execution.
Problems It Prevents:
• Lost Updates: Two transactions overwrite each other.
• Dirty Reads: Reading uncommitted changes.
• Unrepeatable Reads: Data changes between two reads in the same transaction.
Common Techniques:
• Two-Phase Locking (2PL): Locks data items in two phases (growing and shrinking).
• Timestamp Ordering: Orders transactions based on timestamps.
• Optimistic Control: Validates transactions before commit.
Significance:
• Enables safe parallelism.
• Preserves data integrity under high concurrency.

88. Distinguish between ACID properties and Serializability of scheduling in transaction
processing
• ACID properties ensure the reliability of individual transactions, while Serializability ensures
correctness of schedules involving multiple transactions.
• Atomicity (A): A transaction executes completely or not at all, while Serializability ensures the
result is equivalent to some serial order.
• Consistency (C) maintains database validity, while Serializability doesn’t guarantee that a
consistent state is maintained at all times.
• Isolation (I) ensures that transactions do not interfere, closely related to Serializability, which
formalizes this idea for multiple transactions.
• Durability (D) ensures committed changes persist after failures, while Serializability does not
handle failures, only logical correctness of schedules.
89. Compare the advantages and disadvantages of Locking and Timestamp-based schedulers
in concurrency control
Advantages:
• Locking-based schedulers allow transactions to wait for locks, ensuring controlled access,
while timestamp-based schedulers avoid waiting by aborting conflicting transactions
immediately.
• Locking-based schedulers prevent conflicts using explicit locks,
whereas timestamp-based schedulers use timestamps to order transactions and prevent
deadlocks.
• Locking works well in low-contention environments,
while timestamp scheduling performs better in read-heavy workloads.
Disadvantages:
• Locking-based schedulers may cause deadlocks requiring detection or prevention,
whereas timestamp-based schedulers avoid deadlocks but may cause frequent transaction
restarts.
• Locking can lead to waiting and blocking,
while timestamp-based schedulers never block but can cause starvation of long transactions.
• Locking requires complex deadlock management,
whereas timestamp schedulers add overhead for maintaining and comparing timestamps.

Q90. Evaluate the importance of Armstrong’s axioms in relational database design


Armstrong’s axioms are a set of inference rules used to derive all possible functional dependencies
from a given set, playing a crucial role in relational schema design and normalization. The three
fundamental rules—reflexivity, augmentation, and transitivity—form a sound and complete basis for
reasoning about functional dependencies. These axioms help compute the closure of functional
dependencies, identify candidate keys, and ensure the correctness of schema decomposition. They
are especially vital for validating properties such as lossless join and dependency preservation. Thus,
Armstrong’s axioms provide a formal mathematical foundation to analyze, refine, and verify relational
database designs.

Q91. Analyze the concept of Lossless Design in relational database design


A lossless design ensures that when a relation is decomposed into two or more sub-relations, the
original relation can be accurately reconstructed by joining them without introducing spurious or
incorrect data. This property is essential for preserving the integrity and completeness of information
after decomposition during normalization. Lossless decomposition is achieved if the common
attribute(s) among the decomposed relations form a super key in at least one of them. It prevents
anomalies and inconsistencies that may arise during data retrieval or updates. Hence, ensuring a
lossless join is critical to maintaining accurate and reliable database systems.
Q92. Distinguish between 1NF, 2NF, and 3NF in relational database design
Normalization is a step-by-step process to reduce data redundancy and improve integrity in a relational
database. The first three normal forms—1NF, 2NF, and 3NF—each address specific types of
anomalies:
• First Normal Form (1NF):
o Ensures that each attribute contains only atomic (indivisible) values.
o Removes repeating groups or arrays.
o Example: A table with multiple phone numbers in one field violates 1NF.
• Second Normal Form (2NF):
o Achieved after 1NF.
o Removes partial dependencies, where a non-key attribute depends only on part of a
composite key.
o Applicable when the primary key is composite.
o Example: In a table with (student_id, course_id) as key, storing student_name (which
depends only on student_id) violates 2NF.
• Third Normal Form (3NF):
o Achieved after 2NF.
o Removes transitive dependencies, where a non-key attribute depends on another non-
key attribute.
o All non-key attributes must depend only on the primary key.
o Example: If department_name depends on department_id, which in turn depends on
employee_id, it violates 3NF.

Q93. Compare the Network Model and Relational Data Model in terms of data representation
Network Data Model:
• Represents data as records connected by set relationships, forming a complex graph.
• Access is navigational, requiring users to follow paths explicitly.
• Relationships are maintained using physical pointers between records.
• Complex to design and modify but efficient for many-to-many relationships.
• Used in legacy systems with high-performance requirements.
Relational Data Model:
• Represents data in tables (relations) consisting of rows and columns.
• Access is declarative using SQL—users specify what to retrieve, not how.
• Relationships are represented through foreign keys, not physical links.
• Easier to design, understand, and maintain.
• Preferred in modern systems for its simplicity and flexibility.

Q94. Evaluate the significance of query optimization in relational databases


Query optimization is a critical process in relational databases aimed at improving query performance
by selecting the most efficient execution plan from various alternatives. It reduces the time, memory,
and I/O cost required to retrieve data, especially in large datasets. The optimizer uses techniques such
as indexing, join order rearrangement, and predicate pushdown to minimize resource usage. Effective
query optimization ensures faster response times, better user experience, and system scalability. As
data volumes grow, the role of optimization becomes even more crucial in maintaining consistent
performance in enterprise-grade database systems.
