
Ques. Describe the architecture of DBMS and the concept of data independence?

Ans.: The DBMS architecture is designed so that the storage details of data in a database are hidden from users. Users accessing a database work with the data itself, without any concern for how the data is physically stored. The architecture provides different views of the same data: each user gets a customized view of the database, each user's view is independent, and a change to one view does not affect the other views. The internal structure of the database likewise remains unaffected by changes made to the physical storage of data, so the database administrator can change the storage structure of the database without changing the user views.

The architecture of DBMS is also called the American National Standards Institute / Standards Planning and Requirements Committee (ANSI/SPARC) model. The ANSI/SPARC model divides the DBMS architecture into three levels:

Internal level: specifies the way in which data is physically stored in a database. It also provides a description of the relationships that exist between the stored data.

External level: specifies the way in which the data stored in a database is viewed by individual users.

Conceptual level: describes the logical structure of the whole database (the community view) and mediates between the internal and external levels.




[Figure: the three levels of the ANSI/SPARC architecture. External level: the individual user views of the database. Conceptual level: the community view of the database. Internal level: the physical representation of the database.]

To understand the concept of levels in the DBMS architecture, consider an employee database that contains details of an employee such as employee number and department number. The internal level of the architecture for the employee database can be represented by the stored record layout:

Stored_emp (BYTES=20): Prefix TYPE=BYTE(5), OFFSET=0 | Emp | Dept | Pay

Database Management System





In the preceding example of the internal level, employees are represented by a stored record type, Stored_emp, which is twenty bytes long. Stored_emp consists of four stored fields: prefix, emp, dept and pay. The prefix contains control information such as flags or pointers, while the other three data fields represent properties of an employee; the records stored in Stored_emp are also indexed. At the conceptual level, the database contains information about an entity. For example, for an employee database, the conceptual level includes information about the employee entity such as employee_number, dept_number and salary. The conceptual level of the architecture for the employee database can be represented as:

Employee
    Employee_number    CHARACTER (6)
    Department_number  CHARACTER (4)
    Salary             NUMERIC (5)

At the external level, a view of the database shows only the fields that a user needs to see. For example, for the employee database an external view consisting of the two fields emp# and salary can be represented (in a PL/I-style declaration) as:

DCL 1 empp,
      2 emp# CHAR (6),
      2 sal  FIXED BIN (30);

The various data fields can have different names in the various views of a database. For example, the employee number is represented by emp at the internal level and by employee_number at the conceptual level.

Data Independence

The DBMS architecture can be used to explain the concept of data independence, which is the ability to change the representation of data at one level of a database system without having to change the representation at the next higher level. Two types of data independence can be defined: Logical data independence: the ability to change the representation of data at the conceptual level without having to change the representation of data at the external level. For example, if you expand the database by adding a record type or data item, the conceptual level changes accordingly, but external views that refer only to the remaining data need not be changed.


Physical data independence: the ability to change the representation of data at the internal level without having to change the representation of data at the conceptual or external level. Changes to the internal level may be needed if some physical files have to be reorganized. For example, to improve the performance of retrieval or update operations you may need to create additional access structures, which may result in file reorganization. As long as the data stored in the database does not change, the conceptual level does not have to change.
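As a concrete illustration (not from the original text), the following Python sketch uses the built-in sqlite3 module to mimic data independence: an external view defined over the conceptual schema keeps working after that schema gains a new column. All table and column names here are hypothetical.

```python
import sqlite3

# In-memory database standing in for the three-level architecture.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE employee (emp_no TEXT, dept_no TEXT, salary INTEGER)")
con.execute("INSERT INTO employee VALUES ('E1', 'D10', 50000)")

# External level: a view exposing only the fields a user needs.
con.execute("CREATE VIEW emp_pay AS SELECT emp_no, salary FROM employee")

# Change the conceptual level: add a column the view never mentions.
con.execute("ALTER TABLE employee ADD COLUMN phone TEXT")

# The external view is unaffected by the change below it.
row = con.execute("SELECT emp_no, salary FROM emp_pay").fetchone()
print(row)  # ('E1', 50000)
```

The view plays the role of an external schema: because it names only the fields it needs, the change at the conceptual level does not propagate upward.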


Ques. Explain Hierarchical Data Model. Ans.: Data models can be defined as a collection of concepts used to describe the structure of a database. Implementing a data model includes specifying data types, relationships among data types and constraints on the data. In the hierarchical model, also called a hierarchical schema, data is organized in the form of a tree structure. The hierarchical model supports the concept of data independence: the ability to change the representation of data at one level of a database system without having to change the representation at the next higher level. The hierarchical model uses two types of data structures: records and parent-child relationships among records. A record is a set of field values that provide information about an entity. An entity is a collection of objects in a database that can be described by a set of attributes. Records of the same type are grouped together to form a record type, which is assigned a name; the structure of a record type is defined by a collection of named fields or data items, each of a certain data type such as character, float or integer. A Parent-Child Relationship (PCR) type is a 1:N relationship between two record types: the record type on the 1-side is called the parent record type and the record type on the N-side is called the child record type. An occurrence (or instance) of a PCR type consists of one record of the parent record type and a number of records of the child record type. For example, one occurrence of a (Department, Employee) PCR type relates the Finance department record to its employee records Employee1, Employee2 and Employee3.


A hierarchical schema consists of a number of record types and PCR types. In a hierarchical schema diagram, record types are represented by rectangular boxes and PCR types by the lines connecting a parent record type to a child record type. For example, consider a hierarchical schema with three record types (Department, Employee and Project) and two PCR types.



Each record type can have a set of data items or fields. For example, the record type Department can have department name, department number and department code as its fields. A PCR type can be represented by listing the pair in parentheses. In the schema above there are two PCR types, written (Department, Employee) and (Department, Project). Each occurrence of the (Department, Employee) PCR type relates one department record to the records of the many employees who work in that department. Each occurrence of the (Department, Project) PCR type relates a department record to the records of the projects controlled by that department. The same schema can also be drawn as a tree.




In the tree-like structure, a record type is represented by a node of the tree and a PCR type by an arc. A hierarchical schema with a number of record types and PCR types has the following properties: One record type, called the root of the hierarchical schema, does not participate as a child record type in any PCR type. Each record can have only one parent record but many child records. Every record type except the root participates as a child record type in exactly one PCR type. A record type can participate as a parent record type in any number of PCR types. A record type that does not participate as a parent record type in any PCR type is called a leaf of the hierarchical schema. If a record type participates as a parent in more than one PCR type, its child record types are kept in a left-to-right ordered sequence.
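The occurrence tree described above can be sketched as a nested data structure. This is only an illustrative sketch with hypothetical record names (the project names are made up), not a real hierarchical DBMS:

```python
# Hypothetical occurrence tree: the root record type is Department; each
# PCR type groups its child records in left-to-right order.
occurrence = {
    "record": ("Department", "Finance"),
    "employees": ["Employee1", "Employee2", "Employee3"],  # (Department, Employee)
    "projects": ["ProjectA", "ProjectB"],                  # (Department, Project)
}

def child_records(node, pcr):
    """Follow one parent-child relationship type from a parent record."""
    return node[pcr]

print(child_records(occurrence, "employees"))
# ['Employee1', 'Employee2', 'Employee3']
```

Note how navigation is procedural: to reach a record you follow a path of PCR types from the root, which is exactly the navigational processing listed below as a disadvantage of the model.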

The advantages of the hierarchical data model are: It is simple to construct and operate on. It suits naturally hierarchical domains such as product information in manufacturing and employee information in an organization.

The disadvantages of the hierarchical data model are: It requires navigational, procedural processing of data. It provides less scope for query optimization.


Ques. What is normalization? List the various normal forms involved in normalization process.

What is functional dependency?

Ans.: Normalization is the process of eliminating redundancy of data in a database. A relation (table) in a database is said to be in a normal form if it satisfies certain constraints. The normal forms involved in normalization are:
1. First Normal Form (1NF)
2. Second Normal Form (2NF)
3. Third Normal Form (3NF)
4. Fourth Normal Form (4NF)
5. Fifth Normal Form (5NF)

The goals of normalization are:
1. Removing redundant data.
2. Ensuring that only related data is stored in a table.

Normalization therefore helps you remove data redundancy and update inconsistency when data is inserted, deleted or modified in a database.

Functional dependency

A functional dependency is a constraint between two sets of attributes of a database. It is written X -> Y for two attributes X and Y of a table, and means that Y is functionally dependent on X. Consider the following EMPLOYEE table:

Employee_id    Employee_name    Employee_dept
K067263        John             Sales
K067264        Chris            Accounts
K067265        Ken              Sales

In the above table, the attributes of EMPLOYEE are Employee_id, Employee_name and Employee_dept. You can state that Employee_id -> Employee_name. This says that Employee_name is functionally dependent on Employee_id: the name of an employee can be uniquely identified from the id of the employee. You cannot, however, uniquely identify Employee_id from Employee_name, because more than one employee can have the same name, while each employee has a distinct Employee_id. Functional dependencies are a type of constraint based on keys, such as a primary key or foreign key. For a relation table R, a column Y is said to be functionally dependent on a column X of the same table if each value of column X is associated with only one value of column Y at a given time. If column X is a primary key, all the columns of R must be functionally dependent on X.


If columns X and Y are functionally dependent, the functional dependency can be written R.X -> R.Y. For example, consider the functional dependency Employee_id -> Salary: the column Employee_id functionally determines the Salary column because the salary of each employee is unique and remains the same each time that employee appears in the table. A functional dependency X -> Y, between two sets of attributes X and Y that are subsets of R, is called a trivial functional dependency if Y is a subset of X. For example, {Employee_id, Project} -> Project is a trivial functional dependency. A functional dependency X -> Y is called a non-trivial functional dependency if at least one of the attributes of Y is not among the attributes of X. For example, Employee_id -> Salary is a non-trivial functional dependency.
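The definition above can be checked mechanically: X -> Y holds in a table exactly when no two rows agree on X but differ on Y. A minimal Python sketch over the EMPLOYEE example (one duplicate name is added to make the failing case visible; the data is hypothetical):

```python
def holds(rows, x, y):
    """True if the functional dependency x -> y holds in rows:
    every value of column x maps to exactly one value of column y."""
    seen = {}
    for row in rows:
        if row[x] in seen and seen[row[x]] != row[y]:
            return False  # same x-value, two different y-values
        seen[row[x]] = row[y]
    return True

employees = [
    {"id": "K067263", "name": "John", "dept": "Sales"},
    {"id": "K067264", "name": "Chris", "dept": "Accounts"},
    {"id": "K067265", "name": "John", "dept": "Sales"},  # duplicate name
]

print(holds(employees, "id", "name"))   # True: id determines name
print(holds(employees, "name", "id"))   # False: two employees named John
```

This mirrors the argument in the text: ids are unique, so Employee_id -> Employee_name holds, while the reverse dependency fails as soon as two employees share a name.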


Ques. Explain the set of properties that guarantee database transactions are processed reliably. Ans.: ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction.

Characteristics

Atomicity

Atomicity requires that database modifications must follow an "all or nothing" rule. Each transaction is said to be atomic. If one part of the transaction fails, the entire transaction fails and the database state is left unchanged. It is critical that the database management system maintain the atomic nature of transactions in spite of any application, DBMS (Database Management System), operating system or hardware failure. An atomic transfer cannot be subdivided and must be processed in its entirety or not at all. Atomicity means that users do not have to worry about the effect of incomplete transactions. Transactions can fail for several kinds of reasons:

Hardware failure: a disk drive fails, preventing some of the transaction's database changes from taking effect.
System failure: the user loses their connection to the application before providing all necessary information.
Database failure: e.g., the database runs out of room to hold additional data.
Application failure: the application attempts to post data that violates a rule that the database itself enforces, such as attempting to insert a duplicate value in a column.

Consistency

Consistency ensures the truthfulness of the database: any transaction the database performs will take it from one consistent state to another. The consistency property does not say how the DBMS should handle an inconsistency other than to ensure the database is clean at the end of the transaction. If, for some reason, a transaction is executed that violates the database's consistency rules, the entire transaction could be rolled back to the pre-transactional state, or the DBMS could equally validly take some patch-up action to get the database into a consistent state. Thus, if the database schema says that a particular field is for holding integer numbers, the DBMS could decide to reject attempts to put fractional values there, or it could round the supplied values to the nearest whole number: both options maintain consistency. The consistency rule applies only to integrity rules that are within its scope. Thus, if a DBMS allows fields of a record to act as references to another record, then consistency implies the DBMS must enforce referential integrity: by the time any transaction ends, each and every reference in the database must be valid. If a transaction consisted of an attempt to delete a record referenced by another, each of the following mechanisms would maintain consistency:

abort the transaction, rolling back to the consistent, prior state;
delete all records that reference the deleted record (this is known as a cascade delete); or
nullify the relevant fields in all records that point to the deleted record.

These are examples of propagation constraints; some database systems allow the database designer to specify which option to choose when setting up the schema for a database. Application developers are responsible for ensuring application-level consistency, over and above that offered by the DBMS. Thus, if a user withdraws funds from an account and the new balance is lower than the account's minimum balance threshold, as far as the DBMS is concerned the database is in a consistent state even though this rule (unknown to the DBMS) has been violated.

Isolation

Isolation refers to the requirement that other operations cannot access data that has been modified during a transaction that has not yet completed. The question of isolation arises with concurrent transactions (multiple transactions occurring at the same time). Each transaction must remain unaware of other concurrently executing transactions, except that one transaction may be forced to wait for the completion of another transaction that has modified data the waiting transaction requires. If no isolation mechanism existed, the data could be put into an inconsistent state: one transaction could be in the process of modifying data when a second transaction reads and modifies that uncommitted data. If the first transaction then fails and the second one succeeds, that violation of transactional isolation causes data inconsistency. Due to performance and deadlocking concerns with multiple competing transactions, some modern databases allow dirty reads, which bypass some of the restrictions of the isolation system: a dirty read means that a transaction is allowed to read, but not to modify, the uncommitted data of another transaction. Another way to provide isolation for read transactions is multiversion concurrency control (MVCC), which avoids the problem of reads blocking writes: the read is done on a prior version of the data, not on the data being locked for modification, thus providing the necessary isolation between transactions.

Durability

Durability is the ability of the DBMS to recover the committed transaction updates against any kind of system failure (hardware or software). Durability is the DBMS's guarantee that once the user has been notified of a transaction's success the transaction will not be lost, the transaction's data changes will survive system failure, and that all integrity constraints have been satisfied, so the DBMS won't need to reverse the transaction. Many DBMSs implement durability by writing transactions into a transaction log that can be reprocessed to recreate the system state right before any later failure. A transaction is deemed committed only after it is entered in the log. Durability does not imply a permanent state of the database. A subsequent transaction may modify data changed by a prior transaction without violating the durability principle.
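The ACID properties, particularly atomicity, can be observed with any transactional database. A minimal sketch using Python's built-in sqlite3 module; the account names, balances, and the simulated failure are all made up for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
con.commit()

def transfer(src, dst, amount, fail=False):
    """Move money between accounts as one atomic transaction."""
    try:
        con.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                    (amount, src))
        if fail:
            raise RuntimeError("simulated crash mid-transfer")
        con.execute("UPDATE account SET balance = balance + ? WHERE id = ?",
                    (amount, dst))
        con.commit()
    except RuntimeError:
        con.rollback()  # all or nothing: the partial debit is undone

transfer("A", "B", 50, fail=True)  # fails, so the database state is unchanged
print(dict(con.execute("SELECT id, balance FROM account")))  # {'A': 100, 'B': 0}
```

The rollback is what makes the transfer atomic from the user's point of view: the debit that had already been applied never becomes visible.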


Ques. What are the rules of RDBMS? Ans.: Following are Codd's rules of RDBMS:

Rule 0: The system must qualify as relational, as a database, and as a management system: For a system to qualify as a relational database management system (RDBMS), it must use its relational facilities (exclusively) to manage the database.

Rule 1: The information rule: All information in the database is to be represented in one and only one way, namely by values in column positions within rows of tables.

Rule 2: The guaranteed access rule: All data must be accessible. This rule is essentially a restatement of the fundamental requirement for primary keys: every individual scalar value in the database must be logically addressable by specifying the name of the containing table, the name of the containing column and the primary key value of the containing row.

Rule 3: Systematic treatment of null values: The DBMS must allow each field to remain null (or empty). Specifically, it must support a representation of "missing information and inapplicable information" that is systematic, distinct from all regular values (for example, "distinct from zero or any other number" in the case of numeric values), and independent of data type. It is also implied that such representations must be manipulated by the DBMS in a systematic way.

Rule 4: Active online catalog based on the relational model: The system must support an online, inline, relational catalog that is accessible to authorized users by means of their regular query language. That is, users must be able to access the database's structure (catalog) using the same query language that they use to access the database's data.
Rule 5: The comprehensive data sublanguage rule: The system must support at least one relational language that:
(a) has a linear syntax,
(b) can be used both interactively and within application programs, and
(c) supports data definition operations (including view definitions), data manipulation operations (update as well as retrieval), security and integrity constraints, and transaction management operations (begin, commit, and rollback).

Rule 6: The view updating rule: All views that are theoretically updatable must be updatable by the system.

Rule 7: High-level insert, update, and delete:

The system must support set-at-a-time insert, update, and delete operators. This means that data can be retrieved from a relational database in sets constructed of data from multiple rows and/or multiple tables. This rule states that insert, update, and delete operations should be supported for any retrievable set, rather than just for a single row in a single table.

Rule 8: Physical data independence: Changes to the physical level (how the data is stored, whether in arrays or linked lists, etc.) must not require a change to an application based on the structure.

Rule 9: Logical data independence: Changes to the logical level (tables, columns, rows, and so on) must not require a change to an application based on the structure. Logical data independence is more difficult to achieve than physical data independence.

Rule 10: Integrity independence: Integrity constraints must be specified separately from application programs and stored in the catalog. It must be possible to change such constraints as and when appropriate without unnecessarily affecting existing applications.

Rule 11: Distribution independence: The distribution of portions of the database to various locations should be invisible to users of the database. Existing applications should continue to operate successfully when a distributed version of the DBMS is first introduced, and when existing distributed data are redistributed around the system.

Rule 12: The non subversion rule: If the system provides a low-level (record-at-a-time) interface, then that interface cannot be used to subvert the system, for example, bypassing a relational security or integrity constraint.


Ques. Explain selection and projection operations of relational algebra? Ans.: Selection - A generalized selection is a unary operation written as σ_φ(R), where φ is a propositional formula that consists of atoms as allowed in the normal selection and the logical operators ∧ (and), ∨ (or) and ¬ (negation). This selection selects all those tuples in R for which φ holds. Rules about selection operators play the most important role in query optimization. Selection is an operator that very effectively decreases the number of rows in its operand, so if we manage to move the selections in an expression tree towards the leaves, the internal relations (yielded by subexpressions) will likely shrink.

Basic selection properties

Selection is idempotent (multiple applications of the same selection have no additional effect beyond the first one) and commutative (the order in which selections are applied has no effect on the eventual result): σ_A(σ_A(R)) = σ_A(R), and σ_A(σ_B(R)) = σ_B(σ_A(R)).

Breaking up selections with complex conditions

A selection whose condition is a conjunction of simpler conditions is equivalent to a sequence of selections with those same individual conditions: σ_{A∧B}(R) = σ_A(σ_B(R)). A selection whose condition is a disjunction is equivalent to a union of selections: σ_{A∨B}(R) = σ_A(R) ∪ σ_B(R). These identities can be used to merge selections so that fewer selections need to be evaluated, or to split them so that the component selections may be moved or optimized separately.
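These identities can be demonstrated directly. A minimal Python sketch modeling relations as lists of dictionaries; the conditions A and B and the sample data are hypothetical:

```python
def select(rows, pred):
    """Relational selection: keep the tuples of rows for which pred holds."""
    return [r for r in rows if pred(r)]

R = [{"dept": "Sales", "pay": 40},
     {"dept": "Sales", "pay": 60},
     {"dept": "IT",    "pay": 60}]

a = lambda r: r["dept"] == "Sales"   # condition A
b = lambda r: r["pay"] > 50          # condition B

# A conjunctive selection equals a sequence of simple selections, in either order.
combined = select(R, lambda r: a(r) and b(r))
seq_ba = select(select(R, b), a)
seq_ab = select(select(R, a), b)
print(combined == seq_ba == seq_ab)  # True

# Idempotence: applying the same selection twice adds nothing.
print(select(select(R, a), a) == select(R, a))  # True
```

An optimizer exploits exactly this freedom: since the order of the simple selections does not matter, it can evaluate the cheapest or most selective condition first.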

Selection and cross product

Cross product is the costliest operator to evaluate. If the input relations have N and M rows, the result will contain N·M rows. Therefore it is very important to do our best to decrease the size of both operands before applying the cross product operator. This can be done effectively if the cross product is followed by a selection operator, e.g. σ_A(R × P). Considering the definition of join, this is the most likely case. If the cross product is not followed by a selection operator, we can try to push down a selection from higher levels of the expression tree using the other selection rules. In the above case we break up the condition A into conditions B, C and D using the split rules for complex selection conditions, so that A = B ∧ C ∧ D, where B contains only attributes of R, C contains only attributes of P, and D contains the part of A that involves attributes of both R and P. Note that B, C or D may be empty. Then the following holds:

σ_A(R × P) = σ_D(σ_B(R) × σ_C(P))

Selection and set operators

Selection is distributive over the set difference, intersection and union operators:

σ_A(R ∖ P) = σ_A(R) ∖ σ_A(P) = σ_A(R) ∖ P
σ_A(R ∪ P) = σ_A(R) ∪ σ_A(P)
σ_A(R ∩ P) = σ_A(R) ∩ σ_A(P) = σ_A(R) ∩ P

These rules are used to push selection below the set operations in the expression tree. Note that in the set difference and intersection cases it is possible to apply the selection operator to only one of the operands after the transformation. This can make sense in cases where one of the operands is small and the overhead of evaluating the selection operator outweighs the benefits of using a smaller relation as an operand.

Selection and projection

Selection commutes with projection if and only if the fields referenced in the selection condition are a subset of the fields in the projection. Performing selection before projection may be useful if the operand is a cross product or join. In other cases, if the selection condition is relatively expensive to compute, moving the selection outside the projection may reduce the number of tuples that must be tested (since projection may produce fewer tuples due to the elimination of duplicates resulting from elided fields).

Projection - A projection is a unary operation written as π_{a1,...,an}(R), where {a1,...,an} is a set of attribute names. The result of such a projection is the set obtained when all tuples in R are restricted to the attributes {a1,...,an}.

Basic projection properties

Projection is idempotent, so that a series of (valid) projections is equivalent to the outermost projection: π_a(π_{a,b}(R)) = π_a(R).

Projection and set operators


Projection is distributive over set union: π_{a1,...,an}(R ∪ S) = π_{a1,...,an}(R) ∪ π_{a1,...,an}(S).

Projection does not distribute over intersection and set difference. Counterexamples are given by R = {(a, b)} and S = {(a, b′)}, where b is assumed to be distinct from b′. Then

π_a(R ∩ S) = π_a(∅) = ∅, while π_a(R) ∩ π_a(S) = {(a)} ∩ {(a)} = {(a)};
π_a(R ∖ S) = π_a(R) = {(a)}, while π_a(R) ∖ π_a(S) = {(a)} ∖ {(a)} = ∅.
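The union identity and the set-difference counterexample can be replayed in code. A minimal Python sketch modeling relations as lists of dictionaries and projection results as sets of tuples (the attribute values are hypothetical stand-ins for a, b and b′):

```python
def project(rows, attrs):
    """Relational projection: restrict each tuple to attrs; returning a set
    eliminates the duplicates introduced by dropping columns."""
    return {tuple(r[a] for a in attrs) for r in rows}

R = [{"a": 1, "b": 2}]
S = [{"a": 1, "b": 3}]   # same a-value, distinct b-value

# Projection distributes over union:
print(project(R + S, ["a"]) == project(R, ["a"]) | project(S, ["a"]))  # True

# It does not distribute over difference: R - S = R here (no shared tuple),
# so pi_a(R - S) = {(1,)}, yet pi_a(R) - pi_a(S) is empty.
print(project(R, ["a"]))                       # pi_a(R - S)
print(project(R, ["a"]) - project(S, ["a"]))   # empty set
```

The failure stems from duplicate elimination: projecting away the b column makes the tuples of R and S collide, even though the full tuples were distinct.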


Ques. What are the advantages of DBMS? Ans.: Advantages of DBMS are as follows:

(1) DBMS provides potential for enforcing standards. In a large organization, the database approach allows a database administrator to define standards and enforce them among database users. In a centralized database environment it is easier for a DBA to enforce standards than in environments where each user has control over his own files.
(2) DBMS provides reduced application development time. It takes more time to design a database than to build a single application, but once a database has been created it becomes very easy to build new applications using the facilities provided by the DBMS.
(3) DBMS provides flexibility. When requirements change, we can change the structure of the database, and such changes need not affect the existing applications.
(4) When a single user modifies the database, all the other users of the database can see this modification (update).
(5) DBMS provides multiple user interfaces.
(6) DBMS provides storage structures for efficient query processing.
(7) DBMS provides a mechanism for controlling redundancy.
(8) DBMS provides backup and recovery.
(9) DBMS enforces integrity constraints.
(10) DBMS provides persistent storage for program objects.


Ques. Explain network data model? Ans.: The popularity of the network data model coincided with the popularity of the hierarchical data model. Some data were more naturally modeled with more than one parent per child, so the network model permitted the modeling of many-to-many relationships in data. In 1971, the Conference on Data Systems Languages (CODASYL) formally defined the network model. The basic data modeling construct in the network model is the set construct. A set consists of an owner record type, a set name, and a member record type. A member record type can have that role in more than one set, hence the multi-parent concept is supported. An owner record type can also be a member or owner in another set. The data model is a simple network, and link and intersection record types (called junction records by IDMS) may exist, as well as sets between them. Thus, the complete network of relationships is represented by several pairwise sets; in each set one record type is the owner (at the tail of the network arrow) and one or more record types are members (at the head of the relationship arrow). Usually a set defines a 1:M relationship, although 1:1 is permitted. The CODASYL network model is based on mathematical set theory.
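The set construct can be sketched as a mapping from (set type, owner record) to member records; the multi-parent property then falls out naturally. This is an illustrative Python sketch with hypothetical record and set names, not actual CODASYL DML:

```python
# Hypothetical set occurrences: each entry links one owner record to its
# member records. A record may be a member under several set types, which
# is the multi-parent structure the hierarchical model cannot express.
sets = {
    ("WORKS_IN", "Dept:D10"): ["Emp:E1", "Emp:E2"],  # owner Dept, members Emp
    ("ASSIGNED", "Proj:P7"):  ["Emp:E1"],            # owner Proj, members Emp
}

def owners_of(member):
    """Find every owner record that a member record participates under."""
    return sorted(owner for (_, owner), members in sets.items()
                  if member in members)

print(owners_of("Emp:E1"))  # ['Dept:D10', 'Proj:P7']
```

Here Emp:E1 has two owners, one per set type, which a strict hierarchy would forbid.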


Ques. Explain SQL objects? Ans.: SQL objects are as follows:
- Schemas
- Data dictionary
- Journals and journal receivers
- Catalogs
- Tables, rows, and columns
- Aliases
- Views
- Indexes
- Constraints
- Triggers
- Stored procedures
- Sequences

Schemas

A schema provides a logical grouping of SQL objects. A schema consists of a library, a journal, a journal receiver, a catalog and, optionally, a data dictionary. Tables, views, and system objects (such as programs) can be created, moved, or restored into any system library. All system files can be created or moved into an SQL schema if the SQL schema does not contain a data dictionary. If the SQL schema does contain a data dictionary, then source physical files or non-source physical files with one member can be created, moved, or restored into the schema, but logical files cannot be placed in it because they cannot be described in the data dictionary. You can create and own many schemas. The term collection can be used synonymously with schema.

Data Dictionary

A schema contains a data dictionary if it was created before Version 3 Release 1 or if the WITH DATA DICTIONARY clause was specified on the CREATE SCHEMA statement. A data dictionary is a set of tables containing object definitions. If SQL created the dictionary, it is automatically maintained by the system. You can work with data dictionaries by using the interactive data definition utility (IDDU), which is part of the OS/400 program.

Journals and Journal Receivers

A journal and a journal receiver are used to record changes to tables and views in the database. They are then used in processing SQL COMMIT, ROLLBACK, SAVEPOINT, and RELEASE SAVEPOINT statements. The journal and journal receiver can also be used as an audit trail or for forward or backward recovery. For more information about journaling, see the Journaling topic.

Catalogs

An SQL catalog consists of a set of tables and views which describe tables, views, indexes, packages, procedures, functions, files, sequences, triggers, and constraints. This information is contained in a set of cross-reference tables in the libraries QSYS and QSYS2. In each SQL schema there is a set of views built over the catalog tables that contains information about the tables, views, indexes, packages, files, and constraints in the schema. A catalog is automatically created when you create a schema. You cannot drop or explicitly change the catalog.

Tables, Rows, and Columns

A table is a two-dimensional arrangement of data consisting of rows and columns. A row is the horizontal part, containing one or more columns; a column is the vertical part, containing one or more rows of data of one data type. All data for a column must be of the same type. A table in SQL is a keyed or non-keyed physical file. See the Data types topic in the SQL Reference book for a description of data types. A materialized query table is a table that is used to contain materialized data derived from one or more source tables specified by a select-statement.

Aliases An alias is an alternate name for a table or view. You can use an alias to refer to a table or view anywhere the table or view itself can be referred to. Additionally, aliases can be used to join table members.

Views A view appears like a table to an application program; however, a view contains no data of its own. It is created over one or more tables. A view can contain all of the columns of the given tables or a subset of them, and all of the rows or a subset of them. The columns can be arranged differently in a view than they are in the tables from which they are taken. A view in SQL is a special form of a non-keyed logical file.
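The point that a view stores no data of its own can be illustrated with sqlite3 (names are invented for the example): the view below is a saved query over Employee, exposing only a subset of its columns and rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Employee (EmpNo INTEGER, DeptNo INTEGER, Salary INTEGER)")
cur.executemany("INSERT INTO Employee VALUES (?, ?, ?)",
                [(1, 10, 900), (2, 20, 1200), (3, 10, 1100)])

# The view holds no rows itself; it is evaluated against Employee
# each time it is queried.
cur.execute("""
    CREATE VIEW Dept10Emp AS
    SELECT EmpNo, Salary FROM Employee WHERE DeptNo = 10
""")
cur.execute("SELECT * FROM Dept10Emp ORDER BY EmpNo")
view_rows = cur.fetchall()
print(view_rows)  # [(1, 900), (3, 1100)]
```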

Indexes An SQL index is a subset of the data in the columns of a table that are logically arranged in either ascending or descending order. Each index contains a separate arrangement. These
arrangements are used for ordering (ORDER BY clause), grouping (GROUP BY clause), and joining. An SQL index is a keyed logical file. The index is used by the system for faster data retrieval. Creating an index is optional. You can create any number of indexes. You can create or drop an index at any time. The index is automatically maintained by the system. However, because the indexes are maintained by the system, a large number of indexes can adversely affect the performance of applications that change the table.
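A small sketch of creating and using an index, again with sqlite3. The EXPLAIN QUERY PLAN statement is SQLite-specific, and the index and table names are invented; the point is that the index is optional, created separately from the table, and used by the system for faster retrieval.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Orders (O_Id INTEGER, OrderNo INTEGER)")
cur.executemany("INSERT INTO Orders VALUES (?, ?)",
                [(1, 77895), (2, 44678), (3, 22456)])

# An index over OrderNo, maintained automatically from now on.
cur.execute("CREATE INDEX idx_orderno ON Orders (OrderNo)")

# Ask the optimizer how it would run an equality search; the plan
# should mention the index.
cur.execute("EXPLAIN QUERY PLAN SELECT * FROM Orders WHERE OrderNo = 44678")
plan = cur.fetchall()
uses_index = any("idx_orderno" in str(row) for row in plan)
print(uses_index)  # True
```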

Constraints Constraints are rules enforced by the database manager. DB2 UDB for iSeries supports the following constraints:

Unique constraints - a unique constraint is the rule that the values of the key are valid only if they are unique. Unique constraints can be created using the CREATE TABLE and ALTER TABLE statements. Although CREATE INDEX can create a unique index that also guarantees uniqueness, such an index is not a constraint. Unique constraints are enforced during the execution of INSERT and UPDATE statements. A PRIMARY KEY constraint is a form of UNIQUE constraint; the difference is that a PRIMARY KEY cannot contain any nullable columns.

Referential constraints - a referential constraint is the rule that the values of the foreign key are valid only if they appear as values of a parent key, or some component of the foreign key is null. Referential constraints are enforced during the execution of INSERT, UPDATE, and DELETE statements.

Check constraints - a check constraint is a rule that limits the values allowed in a column or group of columns. Check constraints can be added using the CREATE TABLE and ALTER TABLE statements. Check constraints are enforced during the execution of INSERT and UPDATE statements. To satisfy the constraint, each row of data inserted or updated in the table must make the specified condition either TRUE or unknown (due to a null value).
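The three kinds of constraints can be demonstrated with sqlite3 (syntax differs slightly from DB2, the table names are invented, and note that SQLite enforces foreign keys only when the pragma is enabled):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
cur = conn.cursor()

cur.execute("CREATE TABLE Dept (DeptNo INTEGER PRIMARY KEY)")
cur.execute("""
    CREATE TABLE Emp (
        EmpNo  INTEGER PRIMARY KEY,              -- unique, non-null key
        DeptNo INTEGER REFERENCES Dept(DeptNo),  -- referential constraint
        Salary INTEGER CHECK (Salary > 0)        -- check constraint
    )
""")
cur.execute("INSERT INTO Dept VALUES (10)")
cur.execute("INSERT INTO Emp VALUES (1, 10, 900)")  # valid row

# Each of these inserts violates one constraint and is rejected.
errors = []
for stmt in ("INSERT INTO Emp VALUES (1, 10, 900)",  # duplicate primary key
             "INSERT INTO Emp VALUES (2, 99, 900)",  # no parent row in Dept
             "INSERT INTO Emp VALUES (3, 10, -5)"):  # fails the CHECK
    try:
        cur.execute(stmt)
    except sqlite3.IntegrityError:
        errors.append(stmt)
print(len(errors))  # 3
```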

Triggers A trigger is a set of actions that are run automatically whenever a specified event occurs to a specified base table. An event can be an insert, update, delete, or read operation. The trigger
can be run either before or after the event. DB2 UDB for iSeries supports SQL insert, update, and delete triggers and external triggers.
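An update trigger can be sketched with sqlite3 (DB2's CREATE TRIGGER syntax differs in details, and all names here are invented): the trigger below runs automatically after each update to Emp and records the change in an audit table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Emp (EmpNo INTEGER, Salary INTEGER)")
cur.execute("CREATE TABLE Audit (EmpNo INTEGER, OldSalary INTEGER, NewSalary INTEGER)")

# AFTER UPDATE trigger: its actions run automatically whenever the
# specified event (an update to Emp) occurs.
cur.execute("""
    CREATE TRIGGER trg_salary AFTER UPDATE ON Emp
    BEGIN
        INSERT INTO Audit VALUES (OLD.EmpNo, OLD.Salary, NEW.Salary);
    END
""")
cur.execute("INSERT INTO Emp VALUES (1, 900)")
cur.execute("UPDATE Emp SET Salary = 1000 WHERE EmpNo = 1")

cur.execute("SELECT * FROM Audit")
audit = cur.fetchall()
print(audit)  # [(1, 900, 1000)]
```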

Stored Procedures A stored procedure is a program that can be called using the SQL CALL statement. DB2 UDB for iSeries supports external stored procedures and SQL procedures. External stored procedures can be any system program, service program, or REXX procedure. They cannot be System/36 programs or procedures. An SQL procedure is defined entirely in SQL and can contain SQL statements, including SQL control statements.

Sequences A sequence is a data area object that provides a quick and easy way of generating unique numbers. You can use sequences to replace an IDENTITY column or user-generated numeric column. A sequence has similar uses as these alternatives.

Ques. Explain joins in SQL?
Ans.: SQL INNER JOIN Example

The "Persons" table:

P_Id  LastName   FirstName  Address       City
1     Hansen     Ola        Timoteivn 10  Sandnes
2     Svendson   Tove       Borgvn 23     Sandnes
3     Pettersen  Kari       Storgt 20     Stavanger

The "Orders" table:

O_Id  OrderNo  P_Id
1     77895    3
2     44678    3
3     22456    1
4     24562    1
5     34764    15

Now we want to list all the persons with any orders. We use the following SELECT statement:

SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo FROM Persons INNER JOIN Orders ON Persons.P_Id=Orders.P_Id ORDER BY Persons.LastName
The result-set will look like this:

LastName   FirstName  OrderNo
Hansen     Ola        22456
Hansen     Ola        24562
Pettersen  Kari       77895
Pettersen  Kari       44678

The INNER JOIN keyword returns rows when there is at least one match in both tables. If there are rows in "Persons" that do not have matches in "Orders", those rows will NOT be listed.

SQL LEFT JOIN

The LEFT JOIN keyword returns all rows from the left table (table_name1), even if there are no matches in the right table (table_name2).

Syntax

SELECT column_name(s) FROM table_name1 LEFT JOIN table_name2 ON table_name1.column_name=table_name2.column_name

In some databases LEFT JOIN is called LEFT OUTER JOIN.

Example

Now we want to list all the persons and their orders - if any, from the tables above. We use the following SELECT statement:

SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo FROM Persons LEFT JOIN Orders ON Persons.P_Id=Orders.P_Id ORDER BY Persons.LastName

The result-set will look like this:

LastName   FirstName  OrderNo
Hansen     Ola        22456
Hansen     Ola        24562
Pettersen  Kari       77895
Pettersen  Kari       44678
Svendson   Tove

The LEFT JOIN keyword returns all the rows from the left table (Persons), even if there are no matches in the right table (Orders).

SQL RIGHT JOIN

The RIGHT JOIN keyword returns all rows from the right table (table_name2), even if there are no matches in the left table (table_name1).

Syntax

SELECT column_name(s) FROM table_name1 RIGHT JOIN table_name2 ON table_name1.column_name=table_name2.column_name

Now we want to list all the orders and the persons who placed them - if any, from the tables above. We use the following SELECT statement:

SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo FROM Persons RIGHT JOIN Orders ON Persons.P_Id=Orders.P_Id ORDER BY Persons.LastName

The result-set will look like this:

LastName   FirstName  OrderNo
Hansen     Ola        22456
Hansen     Ola        24562
Pettersen  Kari       77895
Pettersen  Kari       44678
                      34764

The RIGHT JOIN keyword returns all the rows from the right table (Orders), even if there are no matches in the left table (Persons).
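All three joins can be checked against the Persons and Orders tables above using Python's sqlite3 module (selecting just LastName and OrderNo for brevity). Because older SQLite releases lack RIGHT JOIN, the sketch emulates it by swapping the tables in a LEFT JOIN, which yields the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE Persons (P_Id INTEGER, LastName TEXT, FirstName TEXT)")
cur.execute("CREATE TABLE Orders (O_Id INTEGER, OrderNo INTEGER, P_Id INTEGER)")
cur.executemany("INSERT INTO Persons VALUES (?, ?, ?)",
                [(1, 'Hansen', 'Ola'), (2, 'Svendson', 'Tove'),
                 (3, 'Pettersen', 'Kari')])
cur.executemany("INSERT INTO Orders VALUES (?, ?, ?)",
                [(1, 77895, 3), (2, 44678, 3), (3, 22456, 1),
                 (4, 24562, 1), (5, 34764, 15)])

# INNER JOIN: only persons with at least one matching order;
# Svendson and order 34764 are dropped.
cur.execute("""SELECT Persons.LastName, Orders.OrderNo
               FROM Persons INNER JOIN Orders ON Persons.P_Id = Orders.P_Id
               ORDER BY Persons.LastName, Orders.OrderNo""")
inner = cur.fetchall()

# LEFT JOIN: all persons; Svendson appears with a NULL order number.
cur.execute("""SELECT Persons.LastName, Orders.OrderNo
               FROM Persons LEFT JOIN Orders ON Persons.P_Id = Orders.P_Id
               ORDER BY Persons.LastName, Orders.OrderNo""")
left = cur.fetchall()

# RIGHT JOIN emulated by swapping the tables: all orders, including
# 34764, whose P_Id (15) matches no person.
cur.execute("""SELECT Persons.LastName, Orders.OrderNo
               FROM Orders LEFT JOIN Persons ON Persons.P_Id = Orders.P_Id
               ORDER BY Orders.OrderNo""")
right = cur.fetchall()

print(inner)  # [('Hansen', 22456), ('Hansen', 24562), ('Pettersen', 44678), ('Pettersen', 77895)]
print(left)   # inner plus ('Svendson', None)
print(right)  # five rows, with (None, 34764) for the unmatched order
```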


Ques. How can you detect and prevent a deadlock?
Ans.: Deadlock Detection

Deadlock detection involves periodically checking whether the system is in a state of deadlock. There are two basic methods of detection: timeouts and wait-for graphs.

Timeouts represent the simplest method of detection. If a transaction waits for a period longer than some constant timeout period, it is aborted. This method has a low overhead, but may end up aborting transactions even in the absence of deadlock.

The other approach is for the system to construct a wait-for graph. Each node in the graph represents an active transaction. A directed edge is drawn from one transaction to another when the first is waiting for a lock on an item held by the second. Deadlock exists (and is detected) when there is a cycle in the graph. At this point the system engages in victim selection, where one of the transactions is chosen to be aborted. The challenge in this technique is deciding when and how often to check the graph for deadlock.

Deadlock Prevention

For deadlock to occur, four conditions must all hold, so breaking any one of them prevents deadlock:

Mutual exclusion - a resource cannot be held by more than one transaction at a time. This condition holds in database systems using two-phase locking (2PL), where an exclusive lock is required on updates. Systems using optimistic concurrency control (OCC) don't hold locks, and so break this condition.

Hold and wait - transactions already holding resources can request further resources. Conservative 2PL breaks this condition, since it requires all locks to be acquired at the outset. However, this isn't always desirable as it limits concurrency.

No pre-emption - a resource cannot be forcibly removed from a transaction. Pre-emption is used in timestamp-based approaches. Two of the most commonly used schemes are wait-die and wound-wait. In wait-die (non-pre-emptive), if a transaction tries to lock an item that is already locked, it waits if the holder of the lock is a younger transaction (based on timestamp); otherwise it aborts ("dies"). In wound-wait (pre-emptive), instead of waiting, an older transaction aborts ("wounds") the younger transaction holding the lock; if the requester is the younger transaction, it waits.

Circular wait - a number of transactions form a circular chain in which each transaction waits for a resource held by the next transaction in the chain. This can be prevented by imposing a total ordering on resources and requiring each transaction to request locks in that order. This may not be possible in some forms of 2PL where locks are not all taken out at a single point in time.

Many of these prevention approaches aren't ideal because they abort transactions at the slightest chance of deadlock. In situations where deadlock will rarely occur (for example, when transactions are mostly short-lived and lightweight), detection is more practical.
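A wait-for-graph detector is essentially cycle detection by depth-first search. The sketch below is a minimal Python version (transaction names are invented; a real lock manager would build the graph from its lock tables and would also apply a victim-selection policy, such as aborting the youngest transaction on the cycle):

```python
def find_cycle(wait_for):
    """Return the set of transactions on a cycle, or an empty set.

    wait_for maps each transaction to the transactions it waits for,
    i.e. an edge T1 -> T2 means T1 waits for a lock held by T2.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited / on stack / done
    colour = {t: WHITE for t in wait_for}
    stack = []                            # current DFS path

    def dfs(t):
        colour[t] = GREY
        stack.append(t)
        for u in wait_for.get(t, ()):
            if colour.get(u, WHITE) == GREY:
                # Back edge: everything from u onward is on the cycle.
                return set(stack[stack.index(u):])
            if colour.get(u, WHITE) == WHITE:
                found = dfs(u)
                if found:
                    return found
        stack.pop()
        colour[t] = BLACK
        return set()

    for t in list(wait_for):
        if colour[t] == WHITE:
            found = dfs(t)
            if found:
                return found
    return set()

# T1 waits for T2, T2 for T3, T3 for T1: a deadlock. T4 merely waits
# on T1 and is not itself part of the cycle.
deadlocked = find_cycle({"T1": ["T2"], "T2": ["T3"],
                         "T3": ["T1"], "T4": ["T1"]})
print(sorted(deadlocked))  # ['T1', 'T2', 'T3']
```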

Database Management System