B.Com (Computers) II Year
DATABASE MANAGEMENT SYSTEM
Unit-II

1. What is a model?
A. A representation of reality that retains only selected details.

2. What is an object set?
A. Objects represent things that are important to users in the portion of reality we want to model. A set of things of the same kind is called an object set. A particular member of an object set is called an object instance.

3. What is an Entity and Attribute?
A. An entity is a thing or object in the real world that is distinguishable from other objects. An entity set has a set of properties called attributes.

4. Explain the different Mapping cardinalities.
A. Mapping cardinalities express the number of entities to which another entity can be associated via a relationship set. For a binary relationship set R between entity sets A and B, the mapping cardinality must be one of the following.
(i). One to one: An entity in A is associated with at most one entity in B, and an entity in B is associated with at most one entity in A.
(ii). One to many: An entity in A is associated with any number of entities in B. An entity in B, however, can be associated with at most one entity in A.
(iii). Many to one: An entity in A is associated with at most one entity in B. An entity in B, however, can be associated with any number of entities in A.
(iv). Many to many: An entity in A is associated with any number of entities in B, and an entity in B is associated with any number of entities in A.

5. What is Generalization and Specialization?
A. Generalization: An object set that is a superset of another object set.
Specialization: An object set that is a subset of another object set.

6. What is Aggregation?
A. Aggregation: It is a relationship set viewed as an object set.

7. Explain Entity-Relationship diagram.
A. E-R Diagram: An entity-relationship diagram (ERD) is a data modeling technique that creates a graphical representation of the entities, and the relationships between entities, within an information system. Any ER diagram has an equivalent relational table, and any relational table has an equivalent ER diagram.
Entity: The entity is a person, object, place or event for which data is collected. It is equivalent to a database table. An entity can be defined by means of its properties, called attributes. For example, the CUSTOMER entity may have attributes for such things as name, address and telephone number.
Relationship: The relationship is the interaction between the entities.
E-R diagram components are:
Rectangles representing entity sets.
Ellipses representing attributes.

IIMC

Prasanth Kumar K

Diamonds representing relationship sets.
Lines linking attributes to entity sets and entity sets to relationship sets.

8. Explain the different Implementation models.
A. There are three types of Implementation models.

1. The Hierarchical Data Model:
A hierarchical database consists of the following:
It contains nodes connected by branches.
The top node is called the root.
If multiple nodes appear at the top level, the nodes are called root segments.
Each node (with the exception of the root) has exactly one parent.
One parent may have many children.

2. The Network Data Model:
Like the Hierarchical Data Model, the Network Data Model also consists of nodes and branches, but a child may have multiple parents within the network structure.

3. The Relational Data Model:

The Relational Data Model has the relation at its heart, together with a series of rules governing keys, relationships, joins, functional dependencies, transitive dependencies, multi-valued dependencies, and modification anomalies.

The Relation
The relation is the basic element in a relational data model.

A relation is subject to the following rules:
Relation (file, table) is a two-dimensional table.
Attribute (i.e. field or data item) is a column in the table.
Each column in the table has a unique name within that table.
Each column is homogeneous. Thus the entries in any column are all of the same type (e.g. age, name, employee-number, etc).
Each column has a domain, the set of possible values that can appear in that column.
A Tuple (i.e. record) is a row in the table.
The order of the rows and columns is not important.
Values of a row all relate to some thing or portion of a thing.
Repeating groups (collections of logically related attributes that occur multiple times within one record occurrence) are not allowed.
Duplicate rows are not allowed (candidate keys are designed to prevent this).
Cells must be single-valued (but can be variable length). Single valued means the following:
  Cannot contain multiple values such as 'A1, B2, and C3'.
  Cannot contain combined values such as 'ABC-XYZ' where 'ABC' means one thing and 'XYZ' another.

A relation may be expressed using the notation R (A, B, C...) where:
R = the name of the relation.
(A, B, C, ...) = the attributes within the relation.
A = the attribute(s) which form the primary key.

9. Explain the different types of keys in Relational Data Model.
A. Keys
1. A simple key contains a single attribute.
2. A composite key is a key that contains more than one attribute.

3. A candidate key is an attribute (or set of attributes) that uniquely identifies a row. A candidate key must possess the following properties:
Unique identification - For every row the value of the key must uniquely identify that row.
Non redundancy - No attribute in the key can be discarded without destroying the property of unique identification.
4. A primary key is the candidate key which is selected as the principal unique identifier. Every relation must contain a primary key. The primary key is usually the key selected to identify a row when the database is physically implemented. For example, empno is selected in the EMP relation.
5. A superkey is any set of attributes that uniquely identifies a row. A superkey differs from a candidate key in that it does not require the non redundancy property.
6. A foreign key is an attribute (or set of attributes) that appears (usually) as a non key attribute in one relation and as a primary key attribute in another relation. I say usually because it is possible for a foreign key to also be the whole or part of a primary key:
A many-to-many relationship can only be implemented by introducing an intersection or link table which then becomes the child in two one-to-many relationships. The intersection table therefore has a foreign key for each of its parents, and its primary key is a composite of both foreign keys.
A one-to-one relationship requires that the child table has no more than one occurrence for each parent, which can be enforced by letting the foreign key also serve as the primary key.

8. What is Determinant and Dependent?
A. The terms determinant and dependent can be described as follows:
The expression X → Y means 'if I know the value of X, then I can obtain the value of Y' (in a table or somewhere).
In the expression X → Y, X is the determinant and Y is the dependent attribute.

The value X determines the value of Y.
The value Y depends on the value of X.

9. What is Functional Dependency?
A. A functional dependency can be described as follows:
1. An attribute is functionally dependent if its value is determined by another attribute.
2. That is, if we know the value of one (or several) data items, then we can find the value of another (or several).
3. Functional dependencies are expressed as X → Y, where X is the determinant and Y is the functionally dependent attribute.
4. If A → (B,C) then A → B and A → C.
5. If (A,B) → C, then it is not necessarily true that A → C and B → C.
6. If A → B and B → A, then A and B are in a 1-1 relationship.
7. If A → B then for A there can only ever be one value for B.
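The rule that a determinant admits only one dependent value can be tested against sample rows. The following is a minimal sketch, assuming rows are held as Python dictionaries; the function name `holds` and the sample rows are invented for illustration and do not come from these notes.

```python
def holds(rows, determinant, dependent):
    """Return True if determinant -> dependent holds in the given rows.

    The dependency fails as soon as one determinant value is seen
    together with two different dependent values.
    """
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in determinant)
        y = tuple(row[a] for a in dependent)
        # setdefault stores y the first time x is seen, then returns the stored value
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample rows, invented for illustration only.
rows = [
    {"A": 1, "B": "x", "C": "p"},
    {"A": 1, "B": "y", "C": "q"},
    {"A": 2, "B": "x", "C": "q"},
]

ab_determines_c = holds(rows, ["A", "B"], ["C"])
a_determines_c = holds(rows, ["A"], ["C"])
b_determines_c = holds(rows, ["B"], ["C"])
```

In this sample (A,B) → C holds while neither A → C nor B → C does, which is the situation described in point 5. Note that sample rows can only disprove a dependency; whether it truly holds is a rule about the data, not about one snapshot of it.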

10. What is Transitive Dependency?
A. A transitive dependency can be described as follows:
1. An attribute is transitively dependent if its value is determined by another attribute which is not a key.
2. If X → Y and X is not a key then this is a transitive dependency.
3. A transitive dependency exists when A → B and B → C, so that C depends on A only through B, and B does not determine A.

12. What is Multi-Valued Dependency?
A. A multi-valued dependency can be described as follows:
1. A table involves a multi-valued dependency if it may contain multiple values for an entity.
2. A multi-valued dependency may arise as a result of enforcing 1st normal form.
3. X →→ Y, ie X multi-determines Y, when for each value of X we can have more than one value of Y.
4. If A →→ B and A →→ C then we have a single attribute A which multi-determines two other independent attributes, B and C.
5. If A →→ (B,C) then we have an attribute A which multi-determines a set of associated attributes, B and C.

13. What is Join Dependency?
A. A join dependency can be described as follows:
If a table can be decomposed into three or more smaller tables, it must be capable of being joined again on common keys to form the original table.

14. Explain different forms of Normalization.
A. Normalization is the process of converting a table into a standard form. It has the following forms:

1st Normal Form
A table is in first normal form if all the key attributes have been defined and it contains no repeating groups. Taking the ORDER entity as an example we could end up with a set of attributes like this:

ORDER
order_id   customer_id   product1   product2   product3
123        456           abc1       def1       ghi1
456        789           abc2       (empty)    (empty)

This structure creates the following problems: Order 123 has no room for more than 3 products. Order 456 has wasted space for product2 and product3.

In order to create a table that is in first normal form we must extract the repeating groups and place them in a separate table, which I shall call ORDER_LINE.

ORDER
order_id   customer_id
123        456
456        789

I have removed 'product1', 'product2' and 'product3', so there are no repeating groups.

ORDER_LINE
order_id   product
123        abc1
123        def1
123        ghi1
456        abc2

Each row contains one product for one order, so this allows an order to contain any number of products. The new relationships can be expressed as follows:
1 instance of an ORDER has 1 to many ORDER LINES
1 instance of a PRODUCT has 0 to many ORDER LINES
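The extraction of the repeating group can be sketched in code. This is an illustrative sketch only: the function name `to_order_lines` is hypothetical, and the unused product columns of order 456 are assumed to be empty (`None`), in line with the ORDER_LINE rows shown above.

```python
def to_order_lines(wide_rows, product_columns):
    """Turn repeating product columns into one ORDER_LINE row per product."""
    lines = []
    for row in wide_rows:
        for column in product_columns:
            product = row.get(column)
            if product:  # skip the unused (empty) product slots
                lines.append({"order_id": row["order_id"], "product": product})
    return lines


# The un-normalized ORDER table; empty slots for order 456 are an assumption.
wide_orders = [
    {"order_id": 123, "customer_id": 456,
     "product1": "abc1", "product2": "def1", "product3": "ghi1"},
    {"order_id": 456, "customer_id": 789,
     "product1": "abc2", "product2": None, "product3": None},
]

order_lines = to_order_lines(wide_orders, ["product1", "product2", "product3"])

# The ORDER table keeps only the non-repeating attributes.
orders = [{"order_id": r["order_id"], "customer_id": r["customer_id"]}
          for r in wide_orders]
```

After the split an order is no longer limited to three products, because a fourth product is simply a fourth ORDER_LINE row.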

2nd Normal Form
A table is in second normal form (2NF) if and only if it is in 1NF and every non key attribute is fully functionally dependent on the whole of the primary key (i.e. there are no partial dependencies).
1. Anomalies can occur when attributes are dependent on only part of a multi-attribute (composite) key.
2. A relation is in second normal form when all non-key attributes are dependent on the whole key. That is, no attribute is dependent on only a part of the key.
3. Any relation in 1NF having a key with a single attribute is in second normal form.

Take the following table structure as an example, with the composite (cust, order_date) taken as the primary key:
order(order_id, cust, cust_address, cust_contact, order_date, order_total)
Here we should realize that cust_address and cust_contact are functionally dependent on cust but not on order_date, therefore they are not dependent on the whole key. To make this table 2NF these attributes must be removed and placed somewhere else.
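A partial dependency can be spotted mechanically once the key and the functional dependencies are written down. The sketch below is only an illustration: the function name `partial_dependencies` is hypothetical, the key (cust, order_date) is the one implied by the example, and the dependency of order_total on the whole key is an assumption added to give a contrasting case.

```python
def partial_dependencies(key, fds):
    """Return the dependencies whose determinant is only part of the key.

    key -- attributes of the (composite) primary key
    fds -- list of (determinant, dependent) pairs, each a tuple of attribute names
    """
    key = frozenset(key)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        # proper subset of the key, determining at least one non-key attribute
        if frozenset(lhs) < key and not set(rhs) <= key
    ]


key = ("cust", "order_date")
fds = [
    (("cust",), ("cust_address", "cust_contact")),  # stated in the example
    (("cust", "order_date"), ("order_total",)),     # assumed for illustration
]

violations = partial_dependencies(key, fds)
```

Here `violations` contains only the dependency on cust, so cust_address and cust_contact are the attributes that have to move to a separate table keyed on cust.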

3rd Normal Form
A table is in third normal form (3NF) if and only if it is in 2NF and every non key attribute is non transitively dependent on the primary key (i.e. there are no transitive dependencies).
1. Anomalies can occur when a relation contains one or more transitive dependencies.
2. A relation is in 3NF when it is in 2NF and has no transitive dependencies.
3. A relation is in 3NF when 'All non-key attributes are dependent on the key, the whole key and nothing but the key'.

Take the following table structure as an example:
order(order_id, cust, cust_address, cust_contact, order_date, order_total)
Here we should realise that cust_address and cust_contact are functionally dependent on cust, which is not a key. To make this table 3NF these attributes must be removed and placed somewhere else.

You must also note the use of calculated or derived fields. Take the example where a table contains PRICE, QUANTITY and EXTENDED_PRICE, where EXTENDED_PRICE is calculated as QUANTITY multiplied by PRICE. As one of these values can be calculated from the other two, it need not be held in the database table. Do not assume that it is safe to drop any one of the three fields, as a difference in the number of decimal places between the various fields could lead to different results due to rounding errors. For example, take the following fields:
AMOUNT - a monetary value in home currency, to 2 decimal places.
EXCH_RATE - exchange rate, to 9 decimal places.
CURRENCY_AMOUNT - amount expressed in foreign currency, calculated as AMOUNT multiplied by EXCH_RATE.
If you were to drop EXCH_RATE, could it be calculated back to its original 9 decimal places?

Reaching 3NF is adequate for most practical needs, but there may be circumstances which would benefit from further normalization.
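The rounding question about EXCH_RATE can be tried out with exact decimal arithmetic. This is a sketch under stated assumptions: the sample values are invented, and CURRENCY_AMOUNT is assumed to be stored to 2 decimal places, which the notes do not state.

```python
from decimal import Decimal, ROUND_HALF_UP

TWO_PLACES = Decimal("0.01")
NINE_PLACES = Decimal("0.000000001")

# Invented sample values for illustration.
amount = Decimal("100.00")          # AMOUNT, 2 decimal places
exch_rate = Decimal("1.234567891")  # EXCH_RATE, 9 decimal places

# CURRENCY_AMOUNT as it would be stored (assumed: rounded to 2 decimal places).
currency_amount = (amount * exch_rate).quantize(TWO_PLACES, rounding=ROUND_HALF_UP)

# Try to recover the dropped EXCH_RATE from the two stored values.
recovered_rate = (currency_amount / amount).quantize(NINE_PLACES, rounding=ROUND_HALF_UP)

rate_was_recovered = recovered_rate == exch_rate
```

With these values the stored CURRENCY_AMOUNT is 123.46 and the recovered rate is 1.234600000, so the digits beyond the fourth decimal place are lost and EXCH_RATE cannot safely be dropped.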

Boyce-Codd Normal Form
A table is in Boyce-Codd normal form (BCNF) if and only if it is in 3NF and every determinant is a candidate key.
1. Anomalies can occur in relations in 3NF if there is a composite key in which part of that key has a determinant which is not a candidate key.
2. This can be expressed as R(A,B,C), C → A where:
The relation contains attributes A, B and C.
A and B form a candidate key.
C is the determinant for A (A is functionally dependent on C).
C is not part of any key.
3. Anomalies can also occur where a relation contains several candidate keys where:

The keys contain more than one attribute (they are composite keys).
An attribute is common to more than one key.

Take the following table structure as an example:
schedule(campus, course, class, time, room/bldg)
Take the following sample data:

campus   course        class   time          room/bldg
East     English 101   1       8:00-9:00     212 AYE
East     English 101   2       10:00-11:00   305 RFK
West     English 101   3       8:00-9:00     102 PPR

Note that no two buildings on any of the university campuses have the same name, thus ROOM/BLDG → CAMPUS. As the determinant is not a candidate key this table is NOT in Boyce-Codd normal form.
This table should be decomposed into the following relations:
R1(course, class, room/bldg, time)
R2(room/bldg, campus)

As another example take the following structure:
enrol(student#, s_name, course#, c_name, date_enrolled)
This table has the following candidate keys:
(student#, course#)
(student#, c_name)
(s_name, course#) - this assumes that s_name is a unique identifier
(s_name, c_name) - this assumes that c_name is a unique identifier
The relation is in 3NF but not in BCNF because of the following dependencies:
student# → s_name
course# → c_name
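The schedule decomposition can be checked against the sample data: projecting the table onto R1 and R2 and joining the projections on room/bldg should give back exactly the original rows. A minimal sketch, using Python sets of tuples as relations (the variable names are chosen for illustration):

```python
# Sample data from the schedule table: (campus, course, class, time, room/bldg)
schedule = {
    ("East", "English 101", 1, "8:00-9:00", "212 AYE"),
    ("East", "English 101", 2, "10:00-11:00", "305 RFK"),
    ("West", "English 101", 3, "8:00-9:00", "102 PPR"),
}

# R1(course, class, room/bldg, time)
r1 = {(course, cls, room, time) for campus, course, cls, time, room in schedule}

# R2(room/bldg, campus)
r2 = {(room, campus) for campus, course, cls, time, room in schedule}

# Join R1 and R2 on the common attribute room/bldg.
rejoined = {
    (campus, course, cls, time, room)
    for course, cls, room, time in r1
    for r2_room, campus in r2
    if room == r2_room
}

is_lossless_for_sample = rejoined == schedule
```

The join returns the three original rows and nothing else, and the campus of each building is now recorded once in R2 instead of once per class.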

4th Normal Form
A table is in fourth normal form (4NF) if and only if it is in BCNF and contains no more than one multi-valued dependency.
1. Anomalies can occur in relations in BCNF if there is more than one multi-valued dependency.
2. If A →→ B and A →→ C but B and C are unrelated, ie A →→ (B,C) is false, then we have more than one multi-valued dependency.
3. A relation is in 4NF when it is in BCNF and has no more than one multi-valued dependency.

Take the following table structure as an example:
info(employee#, skills, hobbies)
Take the following sample data:

employee#   skills        hobbies
1           Programming   Golf
1           Programming   Bowling
1           Analysis      Golf
1           Analysis      Bowling
2           Analysis      Golf
2           Analysis      Gardening
2           Management    Golf
2           Management    Gardening

This table is difficult to maintain since adding a new hobby requires multiple new rows corresponding to each skill. This problem is created by the pair of multi-valued dependencies EMPLOYEE# →→ SKILLS and EMPLOYEE# →→ HOBBIES. A much better alternative would be to decompose INFO into two relations:
skills(employee#, skill)
hobbies(employee#, hobby)
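The maintenance problem can be made concrete by counting rows. The sketch below builds the two decomposed relations from the sample data and shows what one new hobby costs in each design; the hobby "Chess" is an invented value used only for illustration.

```python
# Sample data from the info table: (employee#, skill, hobby)
info = {
    (1, "Programming", "Golf"), (1, "Programming", "Bowling"),
    (1, "Analysis", "Golf"), (1, "Analysis", "Bowling"),
    (2, "Analysis", "Golf"), (2, "Analysis", "Gardening"),
    (2, "Management", "Golf"), (2, "Management", "Gardening"),
}

# Decomposition into skills(employee#, skill) and hobbies(employee#, hobby).
skills = {(emp, skill) for emp, skill, hobby in info}
hobbies = {(emp, hobby) for emp, skill, hobby in info}


def rebuild_info(skills, hobbies):
    """Join the two relations on employee# to get back the combined table."""
    return {
        (emp, skill, hobby)
        for emp, skill in skills
        for hobby_emp, hobby in hobbies
        if emp == hobby_emp
    }


round_trip_ok = rebuild_info(skills, hobbies) == info

# Employee 1 takes up a new hobby: one row in the decomposed design ...
hobbies_after = hobbies | {(1, "Chess")}
# ... but one row per skill of that employee in the combined table.
rows_added_to_info = len(rebuild_info(skills, hobbies_after)) - len(info)
```

One insert into hobbies corresponds to two new rows in info, because employee 1 has two skills.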

5th (Projection-Join) Normal Form
A table is in fifth normal form (5NF) or Projection-Join Normal Form (PJNF) if it is in 4NF and it cannot have a lossless decomposition into any number of smaller tables.
Another way of expressing this is: ... and each join dependency is a consequence of the candidate keys.
Yet another way of expressing this is: ... and there are no pairwise cyclical dependencies in the primary key comprised of three or more attributes.

Anomalies can occur in relations in 4NF if the primary key has three or more fields.
5NF is based on the concept of join dependence - if a relation cannot be decomposed any further then it is in 5NF.
Pairwise cyclical dependency means that:
You always need to know two values (pairwise).
For any one you must know the other two (cyclical).

Take the following table structure as an example:
buying(buyer, vendor, item)

This is used to track buyers, what they buy, and from whom they buy.
Take the following sample data:

buyer   vendor          item
Sally   Liz Claiborne   Blouses
Mary    Liz Claiborne   Blouses
Sally   Jordach         Jeans
Mary    Jordach         Jeans
Sally   Jordach         Sneakers

The question is, what do you do if Claiborne starts to sell Jeans? How many records must you create to record this fact?
The problem is there are pairwise cyclical dependencies in the primary key. That is, in order to determine the item you must know the buyer and vendor, and to determine the vendor you must know the buyer and the item, and finally to know the buyer you must know the vendor and the item.
The solution is to break this one table into three tables: Buyer-Vendor, Buyer-Item, and Vendor-Item.
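The three-table solution can be tried on the sample data. The sketch below assumes the business rule that a buyer buys an item from a vendor whenever the buyer deals with that vendor, the buyer buys that item, and the vendor sells that item; this rule is an assumption that the join dependency relies on, and the variable names are chosen for illustration.

```python
# Sample data from the buying table: (buyer, vendor, item)
buying = {
    ("Sally", "Liz Claiborne", "Blouses"),
    ("Mary", "Liz Claiborne", "Blouses"),
    ("Sally", "Jordach", "Jeans"),
    ("Mary", "Jordach", "Jeans"),
    ("Sally", "Jordach", "Sneakers"),
}

buyer_vendor = {(b, v) for b, v, i in buying}
buyer_item = {(b, i) for b, v, i in buying}
vendor_item = {(v, i) for b, v, i in buying}


def join_three(buyer_vendor, buyer_item, vendor_item):
    """Rebuild buying rows: every (buyer, vendor, item) supported by all three tables."""
    return {
        (b, v, i)
        for b, v in buyer_vendor
        for v2, i in vendor_item
        if v == v2 and (b, i) in buyer_item
    }


round_trip_ok = join_three(buyer_vendor, buyer_item, vendor_item) == buying

# Claiborne starts to sell Jeans: a single new Vendor-Item row.
vendor_item_after = vendor_item | {("Liz Claiborne", "Jeans")}
buying_after = join_three(buyer_vendor, buyer_item, vendor_item_after)
new_rows = buying_after - buying
```

In the single buying table this fact needs one new record for every buyer who deals with Claiborne and buys Jeans (Sally and Mary here), whereas the decomposed design records it once.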

6th (Domain-Key) Normal Form
A table is in sixth normal form (6NF) or Domain-Key normal form (DKNF) if it is in 5NF and if all constraints and dependencies that should hold on the relation can be enforced simply by enforcing the domain constraints and the key constraints specified on the relation.
Another way of expressing this is: ... if every constraint on the table is a logical consequence of the definition of keys and domains.
1. A domain constraint (better called an attribute constraint) is simply a constraint to the effect that a given attribute A of R takes its values from some given domain D.
2. A key constraint is simply a constraint to the effect that a given set A, B, ..., C of R constitutes a key for R.
This standard was proposed by Ron Fagin in 1981, but interestingly enough he made no note of multi-valued dependencies, join dependencies, or functional dependencies in his paper and did not demonstrate how to achieve DKNF. However, he did manage to demonstrate that DKNF is often impossible to achieve.

15. Explain the different operators in Relational Algebra.
A. The eight relational algebra operators are:

1. SELECT To retrieve specific tuples/rows from a relation.


Ord#   OrdDate    Cust#
101    02-08-94   002
104    18-09-94   002

2. PROJECT To retrieve specific attributes/columns from a relation.

Descr          Price
Power Supply   4000
101-Keyboard   2000
Mouse          800
MS-DOS 6.0     5000
MS-Word 6.0    8000

3. PRODUCT To obtain all possible combination of tuples from two relations.


Ord#   OrdDate    O.Cust#   C.Cust#   CustName
101    02-08-94   002       001       Shah
101    02-08-94   002       002       Srinivasan
101    02-08-94   002       003       Gupta
101    02-08-94   002       004       Banerjee
101    02-08-94   002       005       Apte
102    11-08-94   003       001       Shah
102    11-08-94   003       002       Srinivasan
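The first three operators can be sketched in a few lines of code. This is an illustrative sketch, not a definitive implementation: relations are held as lists of dictionaries, the function names `select`, `project` and `product` are hypothetical helpers, and the rows are taken from the ORD_AUG and CUSTOMERS tables used in these notes.

```python
from itertools import product as cartesian_product

ORD_AUG = [
    {"Ord#": 101, "OrdDate": "02-08-94", "Cust#": "002"},
    {"Ord#": 102, "OrdDate": "11-08-94", "Cust#": "003"},
    {"Ord#": 103, "OrdDate": "21-08-94", "Cust#": "003"},
    {"Ord#": 104, "OrdDate": "28-08-94", "Cust#": "002"},
    {"Ord#": 105, "OrdDate": "30-08-94", "Cust#": "005"},
]
CUSTOMERS = [
    {"Cust#": "001", "CustName": "Shah"},
    {"Cust#": "002", "CustName": "Srinivasan"},
    {"Cust#": "003", "CustName": "Gupta"},
    {"Cust#": "004", "CustName": "Banerjee"},
    {"Cust#": "005", "CustName": "Apte"},
]


def select(relation, predicate):
    """SELECT: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """PROJECT: keep only the named columns and drop duplicate rows."""
    result = []
    for row in relation:
        projected = {a: row[a] for a in attributes}
        if projected not in result:
            result.append(projected)
    return result


def product(left, right, left_name, right_name):
    """PRODUCT: pair every left row with every right row, prefixing column names."""
    return [
        {**{f"{left_name}.{k}": v for k, v in l.items()},
         **{f"{right_name}.{k}": v for k, v in r.items()}}
        for l, r in cartesian_product(left, right)
    ]


orders_of_002 = select(ORD_AUG, lambda row: row["Cust#"] == "002")
customer_numbers = project(ORD_AUG, ["Cust#"])
all_pairs = product(ORD_AUG, CUSTOMERS, "O", "C")
```

The product of 5 orders and 5 customers has 25 rows, most of which pair an order with a customer who did not place it; this is why a product is normally followed by a selection, which is what a join does.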

4. UNION To retrieve tuples appearing in either or both the relations participating in the UNION.

Ord#   OrdDate    Cust#
101    03-07-94   001
102    27-07-94   003
101    02-08-94   002
102    11-08-94   003
103    21-08-94   003
104    28-08-94   002
105    30-08-94   005

5. INTERSECT To retrieve tuples appearing in both the relations participating in the INTERSECT.

Eg: To retrieve Cust# of Customers who've placed orders in July and in August
Cust#
003

6. DIFFERENCE To retrieve tuples appearing in the first relation participating in the DIFFERENCE but not the second.

Eg: To retrieve Cust# of Customers who've placed orders in July but not in August
Cust#
001
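UNION, INTERSECT and DIFFERENCE correspond directly to set operations. The sketch below uses Python sets of tuples with the July and August order rows from the UNION example; the name ORD_JUL is an assumed name for the July relation, which these notes do not name.

```python
# Rows are (Ord#, OrdDate, Cust#). ORD_JUL is an assumed name for the July orders.
ORD_JUL = {
    (101, "03-07-94", "001"),
    (102, "27-07-94", "003"),
}
ORD_AUG = {
    (101, "02-08-94", "002"),
    (102, "11-08-94", "003"),
    (103, "21-08-94", "003"),
    (104, "28-08-94", "002"),
    (105, "30-08-94", "005"),
}

# UNION: orders appearing in either month.
all_orders = ORD_JUL | ORD_AUG

# Project each relation onto Cust# before comparing customers.
july_customers = {cust for _, _, cust in ORD_JUL}
august_customers = {cust for _, _, cust in ORD_AUG}

# INTERSECT: customers who ordered in July and in August.
both_months = july_customers & august_customers

# DIFFERENCE: customers who ordered in July but not in August.
july_only = july_customers - august_customers
```

The set operators require both relations to have the same columns, which is why the order rows are first projected onto Cust# before INTERSECT and DIFFERENCE are applied.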

7. JOIN To retrieve combinations of tuples in two relations based on a common field in both the relations.

Eg: ORD_AUG join CUSTOMERS (here, the common column is Cust#)

Ord#   OrdDate    Cust#   CustNames    City
101    02-08-94   002     Srinivasan   Madras
102    11-08-94   003     Gupta        Delhi
103    21-08-94   003     Gupta        Delhi
104    28-08-94   002     Srinivasan   Madras
105    30-08-94   005     Apte         Bombay

Note: The above join operation logically implies retrieval of details of all orders and the details of the corresponding customers who placed the orders. Such a join operation, where only those rows having corresponding rows in both the relations are retrieved, is called the natural join or inner join. This is the most common join operation.

Consider the example of EMPLOYEE and ACCOUNT relations.

EMPLOYEE
Emp#   EmpName   EmpCity   Acc#
X101   Shekhar   Bombay    120001
X102   Raj       Pune      120002
X103   Sharma    Nagpur    Null
X104   Vani      Bhopal    120003

ACCOUNT
Acc#     OpenDate        BalAmt
120001   30. Aug. 1998   5000
120002   29. Oct. 1998   1200
120003   1. Jan. 1999    3000
120004   4. Mar. 1999    500

A join can be formed between the two relations based on the common column Acc#. The result of the (inner) join is:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X104   Vani      Bhopal    120003   1. Jan. 1999    3000

Note that, from each table, only those records which have corresponding records in the other table appear in the result set. This means that the result of the inner join shows the details of those employees who hold an account along with the account details.

The other type of join is the outer join, which has three variations: the left outer join, the right outer join and the full outer join. These three joins are explained as follows:

The left outer join retrieves all rows from the left-side (of the join operator) table. If there are corresponding or related rows in the right-side table, the correspondence will be shown. Otherwise, columns of the right-side table will take null values.

EMPLOYEE left outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X103   Sharma    Nagpur    NULL     NULL            NULL
X104   Vani      Bhopal    120003   1. Jan. 1999    3000

The right outer join retrieves all rows from the right-side (of the join operator) table. If there are corresponding or related rows in the left-side table, the correspondence will be shown. Otherwise, columns of the left-side table will take null values.

EMPLOYEE right outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X104   Vani      Bhopal    120003   1. Jan. 1999    3000
NULL   NULL      NULL      120004   4. Mar. 1999    500

(Assume that Acc# 120004 belongs to someone who is not an employee and hence the details of the account holder are not available here.)

The full outer join retrieves all rows from both the tables. If there is a correspondence or relation between rows from the tables of either side, the correspondence will be shown. Otherwise, related columns will take null values.

EMPLOYEE full outer join ACCOUNT gives:

Emp#   EmpName   EmpCity   Acc#     OpenDate        BalAmt
X101   Shekhar   Bombay    120001   30. Aug. 1998   5000
X102   Raj       Pune      120002   29. Oct. 1998   1200
X103   Sharma    Nagpur    NULL     NULL            NULL
X104   Vani      Bhopal    120003   1. Jan. 1999    3000
NULL   NULL      NULL      120004   4. Mar. 1999    500

The result of a natural join operation between R1 and R2:

a1   b1   c1
a2   b2   c2
a3   b3   c3
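The four join variations on EMPLOYEE and ACCOUNT can be reproduced with one small function. This is a sketch under stated assumptions: the function name `join_on_acc` and its `how` parameter are hypothetical, rows are plain tuples, NULL is represented by Python's `None`, and Acc# is assumed to be unique within ACCOUNT.

```python
# EMPLOYEE rows: (Emp#, EmpName, EmpCity, Acc#); None stands for NULL.
EMPLOYEE = [
    ("X101", "Shekhar", "Bombay", 120001),
    ("X102", "Raj", "Pune", 120002),
    ("X103", "Sharma", "Nagpur", None),
    ("X104", "Vani", "Bhopal", 120003),
]
# ACCOUNT rows: (Acc#, OpenDate, BalAmt)
ACCOUNT = [
    (120001, "30. Aug. 1998", 5000),
    (120002, "29. Oct. 1998", 1200),
    (120003, "1. Jan. 1999", 3000),
    (120004, "4. Mar. 1999", 500),
]


def join_on_acc(employees, accounts, how="inner"):
    """Join on Acc#; how is 'inner', 'left', 'right' or 'full'."""
    by_acc = {acc[0]: acc for acc in accounts}  # assumes Acc# is unique
    matched = set()
    rows = []
    for emp in employees:
        acc = by_acc.get(emp[3]) if emp[3] is not None else None
        if acc is not None:
            matched.add(acc[0])
            rows.append(emp + acc[1:])        # Acc# appears once in the result
        elif how in ("left", "full"):
            rows.append(emp + (None, None))   # no account: NULL account columns
    if how in ("right", "full"):
        for acc in accounts:
            if acc[0] not in matched:
                rows.append((None, None, None) + acc)  # no employee: NULL employee columns
    return rows


inner = join_on_acc(EMPLOYEE, ACCOUNT, "inner")
left = join_on_acc(EMPLOYEE, ACCOUNT, "left")
right = join_on_acc(EMPLOYEE, ACCOUNT, "right")
full = join_on_acc(EMPLOYEE, ACCOUNT, "full")
```

The results have 3, 4, 4 and 5 rows respectively, matching the tables above: employee X103 appears only in the left and full outer joins, and account 120004 only in the right and full outer joins.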

8. DIVIDE Consider the following three relations:

R1 divided by R2 per R3 gives: a
Thus the result contains those values from R1 whose corresponding R2 values in R3 include all R2 values.
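The DIVIDE rule can be sketched with sets. The contents of R1, R2 and R3 below are invented for illustration, chosen only so that the result is the single value a as stated above; the function name `divide` is likewise hypothetical.

```python
def divide(r1, r2, r3):
    """Return the R1 values that are paired in R3 with every R2 value."""
    return {x for x in r1 if all((x, y) in r3 for y in r2)}


# Hypothetical relations, invented for illustration.
r1 = {"a", "b", "c"}                                       # candidate values
r2 = {"x", "y"}                                            # values that must all be matched
r3 = {("a", "x"), ("a", "y"), ("b", "x"), ("c", "y")}      # pairings of R1 and R2 values

result = divide(r1, r2, r3)
```

Only a is paired with both x and y in R3, so it is the only value in the result; b and c are each missing one of the required R2 values.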
