
RDBMS-Day2

Continuation of ER modeling concepts
Logical database design
Normalization

Recap of Day1 session

RDBMS handles data in the form of relations, tuples and fields
Keys identify tuples uniquely
ER modeling is a diagrammatic representation of the conceptual design of a database
ER diagrams consist of entity types, relationship types and attributes

Copyright 2004, Infosys Technologies Ltd

ER/CORP/CRS/DB07/003 Version No: 2.0

Since Day 2 continues the Day 1 content, this recap is included to maintain continuity.

Relationship participation

[Diagram: Employee (partial participation) -- head of -- Department (total participation)]

Not all instances of the entity type Employee participate in the relationship Head-of, because not every employee heads a department. So the Employee entity type is said to participate partially in the relationship. But every department is headed by some employee, so all instances of the entity type Department participate in this relationship. We therefore say that participation is total from the Department side.

Attributes of a Relationship

[Diagram: relationship Prescription among Doctor, Patient and Medicine, with relationship attributes Number of days and Dosage]


These attributes best describe the relationship rather than any individual entity

Weak entity

[Diagram: strong entity Employee -- has -- weak entity Dependant; attributes shown: E#, id, name]

The dependant entity is represented by a double-lined rectangle and the identifying relationship by a double-lined diamond

The identifying relationship is the one that relates the weak entity to the strong entity on which it depends

Extended ER features (EER diagrams)


Supertypes & Subtypes
Sometimes different entity types are actually specializations of a more general entity type
Example: Rose, jasmine, lotus, lily are all flowers

Some attributes are common to all, others are specific to one entity type
Represented by a generalization hierarchy
Subtypes may be disjoint or overlapping

The attributes that are common belong to the supertype and those that are specific belong to the particular subtype

Supertypes & Subtypes


[Diagram: supertype Flower (attribute Colour) linked through a disjoint circle (d) to subtypes Rose, Jasmine and Lotus, each with its own attributes; the U on each link is the subset symbol]

This depicts disjoint subtypes: for example, a particular flower that is a jasmine can belong only to the entity set Jasmine; it cannot also belong to the entity set Rose.

Overlapping Subtypes
[Diagram: supertype Artist (attributes id, Gender, DOB, Address) linked through an overlap circle (o) to subtypes Dancer, Singer and Actor, each with its own attributes]

The diagram depicts an overlapping subtype relationship: for example, a person who is a singer could also be a dancer.

Case study

A banking scenario
Banks have customers. Customers are identified by name, custid, phone number and address.
Customers can have one or more accounts. Accounts are identified by an account number, account type (savings, current) and a balance.
Customers can avail loans. Loans are identified by loan id, loan type (car, home, personal) and an amount.
Banks are identified by a name, code and the address of the main office.
Banks have branches. Branches are identified by a branch number, branch name and an address.
Accounts and loans are related to the bank's branches.
Create an ER diagram for a database to represent this application.

Note: The solution that follows is just a sample and not the only solution

Solution Step 1: Identify the entities


Bank, Branch, Customer, Account, Loan

Solution Step 2: Identify attributes of entities

Bank: Name, Code, Address
Branch: Branch#, Branch name, Address
Customer: Name, Custid, Phone, Address
Account: Account number, Account type, Balance
Loan: Loan id, Loan type, Amount

Solution Step 3: Identify relationships between entities


Bank has Branch
Branch maintains accounts
Branch offers loans
Account is held by customer
Loan is availed by customer

Solution Step 4: Analyze cardinality of relationships


Bank has Branch: a bank has many branches -> 1:N
Branch maintains accounts: one branch maintains many accounts -> 1:N
Branch offers loans: one branch offers many loans -> 1:N
Account is held by customer -> M:N
Loan is availed by customer -> M:N

One account may be held by many customers (joint accounts) and one customer may hold many accounts, so that relationship is M:N. One loan may be availed by many customers (joint holders) and one customer may avail many loans (car, housing, etc.), so that relationship is also M:N.

Solution Step 5: Identify weak entities if any


Branch: Depends on strong entity Bank

Solution Step 6: Identify participation types


Bank has Branch -> both total
Branch maintains accounts -> Branch: partial, Account: total
Branch offers loans -> Branch: partial, Loan: total
Account is held by customer -> both total
Loan is availed by customer -> Loan: total, Customer: partial

Represented diagrammatically

An ER diagram is typically read top to bottom and left to right. The cardinality, the participation type, the attribute type, etc. need to be expressed only to the extent required for the given problem.

Logical database design


Converting ER diagrams to relational schema

Logical database design
Process of converting the conceptual model into an equivalent representation in the implementation model (relational, hierarchical, network, etc.)
We will focus on the relational model

Relational database design
Convert the ER model into a relational schema (a specification of the table definitions and their foreign key links)
There are well-defined rules for this conversion

Converting Strong entity types


Each entity type becomes a table
Each single-valued attribute becomes a column
Derived attributes are ignored
Composite attributes are represented by their components
Multi-valued attributes are represented by a separate table
The key attribute of the entity type becomes the primary key of the table

Entity example
Here Address is a composite attribute
Years of service is a derived attribute (it can be calculated from the date of joining and the current date)
Skill set is a multi-valued attribute

The relational Schema

Employee (E#, Name, Door_No, Street, City, Pincode, Date_Of_Joining)

Emp_Skillset( E#, Skillset)

As per the rules:
Derived attributes are ignored
Composite attributes are represented by their components
Multi-valued attributes are represented by a separate table
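The Employee schema above could be tried out with Python's standard-library sqlite3 module, roughly as follows. This is only a sketch and not part of the course material: the column names (emp_no standing in for E#), the data types, the sample row, the fixed "as of" date and the choice of (emp_no, skillset) as the key of the skill table are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript("""
-- Composite attribute Address is stored as its components.
-- Derived attribute Years_of_service is not stored at all.
CREATE TABLE employee (
    emp_no          INTEGER PRIMARY KEY,  -- stands in for E# (assumed name)
    name            TEXT NOT NULL,
    door_no         TEXT,
    street          TEXT,
    city            TEXT,
    pincode         TEXT,
    date_of_joining TEXT NOT NULL         -- ISO date string
);
-- Multi-valued attribute Skill set gets its own table.
CREATE TABLE emp_skillset (
    emp_no   INTEGER NOT NULL REFERENCES employee(emp_no),
    skillset TEXT NOT NULL,
    PRIMARY KEY (emp_no, skillset)
);
""")

# Hypothetical sample data.
conn.execute(
    "INSERT INTO employee VALUES "
    "(101, 'Asha', '12', 'MG Road', 'Pune', '411001', '2001-06-15')"
)
conn.executemany(
    "INSERT INTO emp_skillset VALUES (?, ?)", [(101, "SQL"), (101, "Java")]
)

skills = [row[0] for row in conn.execute(
    "SELECT skillset FROM emp_skillset WHERE emp_no = 101 ORDER BY skillset"
)]

# The derived attribute is computed when needed, here as of a fixed date.
years_of_service = conn.execute(
    "SELECT CAST((julianday('2004-06-15') - julianday(date_of_joining)) / 365.25 "
    "AS INTEGER) FROM employee WHERE emp_no = 101"
).fetchone()[0]

print(skills, years_of_service)
```

Keeping the derived value out of the table means it can never disagree with the date of joining.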

Converting weak entity types


Weak entity types are converted into a table of their own, with the primary key of the strong entity acting as a foreign key in the table

This foreign key along with the key of the weak entity form the composite primary key of this table

The Relational Schema


Employee (E#, ...)

Dependant (Employee, Dependant_ID, Name, Address)


Here Dependant is a weak entity. A dependant means nothing to the problem unless we know which employee the person is a dependant of.
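A minimal sqlite3 sketch of the weak-entity rule, under assumed column names and sample rows (emp_no stands in for E#; none of the data comes from the course material). The foreign key to the owning employee is part of the composite primary key, so a dependant row cannot exist without its employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE dependant (
    emp_no       INTEGER NOT NULL REFERENCES employee(emp_no),
    dependant_id INTEGER NOT NULL,
    name         TEXT NOT NULL,
    address      TEXT,
    PRIMARY KEY (emp_no, dependant_id)   -- owner key + partial key of the weak entity
);
""")

conn.execute("INSERT INTO employee VALUES (101, 'Asha')")
conn.execute("INSERT INTO employee VALUES (102, 'Kiran')")
# The same dependant_id may be reused under a different employee.
conn.execute("INSERT INTO dependant VALUES (101, 1, 'Ravi', 'Pune')")
conn.execute("INSERT INTO dependant VALUES (102, 1, 'Meena', 'Pune')")

# A dependant of a non-existent employee is rejected by the foreign key.
try:
    conn.execute("INSERT INTO dependant VALUES (999, 1, 'Nobody', NULL)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

dependant_count = conn.execute("SELECT COUNT(*) FROM dependant").fetchone()[0]
print(orphan_rejected, dependant_count)
```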

Converting relationships
The way relationships are represented depends on the cardinality and the degree of the relationship
The possible cardinalities are:
1:1, 1:N, M:N

The degrees are:
Unary, Binary, Ternary

Unary 1:1
Consider employees who are also a couple

The primary key field itself will become foreign key in the same table

Employee (E#, Name, ..., Married_to)

Unary 1:N
The primary key field itself will become a foreign key in the same table
Same as unary 1:1

Employee (E#, Name, ..., Manager)
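The self-referencing foreign key can be sketched with sqlite3 as below. Column names and sample rows are assumptions for illustration only; emp_no stands in for E#.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
CREATE TABLE employee (
    emp_no  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    manager INTEGER REFERENCES employee(emp_no)  -- NULL when there is no manager
)
""")
# Hypothetical sample data: Meera manages Arun and Divya.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Meera", None), (2, "Arun", 1), (3, "Divya", 1)],
)

# A self-join reads the relationship back: each employee paired with the manager row.
reports = conn.execute("""
    SELECT e.name, m.name
    FROM employee AS e JOIN employee AS m ON e.manager = m.emp_no
    ORDER BY e.emp_no
""").fetchall()
print(reports)
```

For the unary 1:1 case (Married_to), the same column shape applies; declaring the column UNIQUE as well would be one way to keep two rows from pointing at the same partner.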

Unary M:N
[Diagram: Employee related to itself through Guarantor_of, cardinality M:N]

There will be two resulting tables: one to represent the entity and another to represent the M:N relationship, as follows:
Employee (E#, Name, ...)
Guaranty (Guarantor, Beneficiary)

Binary 1:1
[Diagram: Employee (partial participation) -- head of (1:1) -- Department (total participation)]

Case 1: Combination of participation types
The primary key of the partial participant will become the foreign key of the total participant
Employee (E#, Name, ...)
Department (Dept#, Name, Head)
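One possible way to express Case 1 in sqlite3 is sketched below. The constraint choices are assumptions, not something the slide prescribes: NOT NULL on Head models the total participation of Department, and UNIQUE keeps the relationship 1:1. Names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE department (
    dept_no INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    head    INTEGER NOT NULL UNIQUE REFERENCES employee(emp_no)
);
""")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Meera"), (2, "Arun")])
conn.execute("INSERT INTO department VALUES (10, 'Sales', 1)")


def try_insert(sql):
    """Return True if the statement is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# An employee cannot head two departments (UNIQUE) ...
same_head_twice = try_insert("INSERT INTO department VALUES (20, 'Support', 1)")
# ... and a department cannot exist without a head (NOT NULL).
no_head = try_insert("INSERT INTO department VALUES (30, 'Audit', NULL)")
print(same_head_twice, no_head)
```

Employee 2 heads nothing, which is allowed: that is the partial participation on the Employee side.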

Binary 1:1
[Diagram: Employee -- Sits_on (1:1) -- Chair]

Case 2: Uniform participation types
The primary key of either of the participants can become a foreign key in the other
Employee (E#, Name, ...)
Chair (Item#, Model, Location, Used_by)
(or)
Employee (E#, Name, ..., Sits_on)
Chair (Item#, ...)

Binary 1:N
[Diagram: Teacher (1) -- teaches -- Subject (N)]

The primary key of the relation on the 1 side of the relationship becomes a foreign key in the relation on the N side
Teacher (ID, Name, Telephone, ...)
Subject (Code, Name, ..., Teacher)

Binary M:N
[Diagram: Book (M) -- Borrowed by -- Employee (N)]

A new table is created to represent the relationship
It contains two foreign keys, one from each of the participants in the relationship
The primary key of the new table is the combination of the two foreign keys
Book (Acc#, Title)
Issue (Book#, Borrower#)
Employee (E#, Name, ...)
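The Book / Issue / Employee schema can be sketched in sqlite3 as follows. Column names, types and sample rows are assumptions for illustration (acc_no for Acc#, emp_no for E#).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE book     (acc_no INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name  TEXT NOT NULL);
-- The relationship table: two foreign keys that together form the primary key.
CREATE TABLE issue (
    book_no     INTEGER NOT NULL REFERENCES book(acc_no),
    borrower_no INTEGER NOT NULL REFERENCES employee(emp_no),
    PRIMARY KEY (book_no, borrower_no)
);
""")
# Hypothetical sample data.
conn.executemany("INSERT INTO book VALUES (?, ?)",
                 [(1, "SQL Basics"), (2, "ER Modeling")])
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [(101, "Asha"), (102, "Kiran")])
conn.executemany("INSERT INTO issue VALUES (?, ?)",
                 [(1, 101), (1, 102), (2, 101)])

# One book, many borrowers ...
borrowers_of_book_1 = [r[0] for r in conn.execute("""
    SELECT e.name FROM issue AS i JOIN employee AS e ON e.emp_no = i.borrower_no
    WHERE i.book_no = 1 ORDER BY e.name
""")]
# ... and one borrower, many books.
books_of_asha = [r[0] for r in conn.execute("""
    SELECT b.title FROM issue AS i JOIN book AS b ON b.acc_no = i.book_no
    WHERE i.borrower_no = 101 ORDER BY b.title
""")]
print(borrowers_of_book_1, books_of_asha)
```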

Ternary relationship
Represented by a new table
The new table contains three foreign keys, one from each of the participating entities
The primary key of the new table is the combination of all three foreign keys
Prescription (Doctor#, Patient#, Medicine_Name)
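A sketch of the ternary case in sqlite3, with the relationship attributes from the earlier Prescription diagram (number of days, dosage) stored in the relationship table. All column names, types and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE doctor   (doctor_no  INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE patient  (patient_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE medicine (medicine_name TEXT PRIMARY KEY);
CREATE TABLE prescription (
    doctor_no      INTEGER NOT NULL REFERENCES doctor(doctor_no),
    patient_no     INTEGER NOT NULL REFERENCES patient(patient_no),
    medicine_name  TEXT    NOT NULL REFERENCES medicine(medicine_name),
    number_of_days INTEGER,   -- attribute of the relationship, not of any entity
    dosage         TEXT,      -- attribute of the relationship, not of any entity
    PRIMARY KEY (doctor_no, patient_no, medicine_name)
);
""")
# Hypothetical sample data.
conn.execute("INSERT INTO doctor VALUES (1, 'Dr. Rao')")
conn.execute("INSERT INTO patient VALUES (7, 'Kiran')")
conn.executemany("INSERT INTO medicine VALUES (?)",
                 [("Paracetamol",), ("Amoxicillin",)])
conn.executemany(
    "INSERT INTO prescription VALUES (?, ?, ?, ?, ?)",
    [(1, 7, "Paracetamol", 3, "500 mg twice a day"),
     (1, 7, "Amoxicillin", 5, "250 mg thrice a day")],
)

total_days = conn.execute(
    "SELECT SUM(number_of_days) FROM prescription "
    "WHERE doctor_no = 1 AND patient_no = 7"
).fetchone()[0]
print(total_days)
```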

Normalization


A well structured table


Well-structured table: contains minimal redundancy and allows users to insert, modify, and delete rows without errors or inconsistencies
The possible anomalies are:
Insertion anomalies
Deletion anomalies
Modification anomalies

An example table:
Project# | Project_name | Member# | Member Name | Member Address

Insertion anomalies: experienced when we attempt to store a value for one field but cannot do so because the value of another field is unknown. For example, information about a new employee cannot be added until he/she is assigned to a project.
Deletion anomalies: experienced when the value of one field is unexpectedly removed when the value of another field is deleted. For example, the project detail of an employee cannot be deleted (the project could be dropped) without also deleting the only copy of the employee information.
Modification anomalies: experienced when changes to multiple records of a table are needed to update a single value of a field. For example, to change an employee's address, it has to be changed in every row corresponding to a project of which he/she is a member.

Normalization
The formal process that can be followed to achieve a good database design
Also used to check that an existing design is of good quality
The different stages of normalization are known as normal forms

To understand this, we need to understand the concept of functional dependency

Functional dependency
An attribute Y of a relation schema R is functionally dependent on another attribute X of R if the value of X uniquely determines the value of Y
X -> Y
We say that X determines Y, or that Y is functionally dependent on X
X -> Y does not imply that Y -> X
If the value of the attribute Marks is known, then the value of the attribute Grade is determined, since Marks -> Grade
Types of functional dependencies:
Full dependency
Partial dependency
Transitive dependency
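The definition can be made concrete with a small check over sample rows. This is an illustrative sketch: the function name holds and the sample data are hypothetical. Note that a sample can only refute a dependency; whether one really holds is a statement about the rules of the application, not about any particular set of rows.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs.

    rows is a list of dicts; lhs and rhs are tuples of column names.
    """
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first rhs value seen for this lhs value.
        if seen.setdefault(key, value) != value:
            return False  # same lhs value paired with two different rhs values
    return True


# Hypothetical sample data: marks and the grade derived from them.
results = [
    {"Student": "S1", "Marks": 91, "Grade": "A"},
    {"Student": "S2", "Marks": 78, "Grade": "B"},
    {"Student": "S3", "Marks": 91, "Grade": "A"},
    {"Student": "S4", "Marks": 85, "Grade": "A"},
]

marks_to_grade = holds(results, ("Marks",), ("Grade",))  # Marks -> Grade
grade_to_marks = holds(results, ("Grade",), ("Marks",))  # Grade -> Marks
print(marks_to_grade, grade_to_marks)
```

Grade A appears with both 91 and 85 marks, so Grade -> Marks fails even though Marks -> Grade is consistent with the sample.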

Full dependencies
An attribute B of a relation R is fully functionally dependent on an attribute (or set of attributes) A of R if it is functionally dependent on A and not functionally dependent on any proper subset of A.
Report (S#, C#, Title, LName, Room#, Marks)
{S#, C#} -> Marks

This implies that for a given pair of (S#, C#) values occurring in the relation Report there is exactly one value of Marks, i.e. Marks is dependent on S# and C# as a composite pair, but not on either individually.

Partial dependencies
An attribute B of a relation R is partially dependent on a set of attributes A of R if it is functionally dependent on some proper subset of A.
Report (S#, C#, Title, LName, Room#, Marks)
C# -> Title
C# -> LName

The attributes Title and LName are said to be partially dependent on the key (S#, C#), since they are dependent only on C# and not on S#.

Transitive dependencies
An attribute B of a relation R is transitively dependent on a set of attributes A of R if it is functionally dependent on an attribute C which in turn is functionally dependent on A or any proper subset of A.
Report (S#, C#, Title, LName, Room#, Marks)
C# -> LName
LName -> Room#

The attribute Room# is said to be transitively dependent on C# (part of the key (S#, C#)), since it is dependent on LName, which in turn is dependent on C#.

First normal form: 1NF


A relation schema is in 1NF if all of its attributes are:
single-valued
restricted to assuming atomic values
functionally dependent on the primary key

1NF implies:
Composite attributes are represented only by their component attributes
Attributes cannot have multiple values
Attributes cannot have complete tuples as values

In relational database design it is not practically possible to have a table which is not in 1NF.

Prime Vs Non-Prime Attributes


An attribute of a relation R that belongs to any key of R is said to be a prime attribute; one that does not is a non-prime attribute
E.g. Report (S#, C#, Title, LName, Room#, Marks)

S# is a prime attribute
C# is a prime attribute
Title is a non-prime attribute

Second normal form: 2NF


A relation schema R is in 2NF if it is in 1NF and every non-prime attribute is fully functionally dependent on every key of R

Consider the relational schema:
Empdetails (E#, Project#, Role, Number_Of_shares, Share_worth)
In this,
{E#, Project#} -> Role
E# -> Number_Of_shares

Number_Of_shares depends only on E#, irrespective of the project the employee is currently working on, i.e. it is a partial dependency.

A typical snapshot may look like

E#    Project#   Role   Num_of_shares   Share_worth
102   Abc        DV     100             5,00,000
119   Ppq        ML     150             7,50,000
198   Abc        DV     100             5,00,000
102   Hjk        ML     100             5,00,000
127   Edf        DV     200             10,00,000
102   Bnm        ML     100             5,00,000

The share worth is fixed for a given number of shares and is unnecessarily repeated.
The stock detail is repeated for every project of an employee, so if more shares are allotted to the employee, multiple rows need to be updated.
If an employee is not yet allotted to any project, the remaining information can't be captured.

After decomposing
Empdetails (E#, Project#, Role, Number_Of_shares, Share_worth)
Becomes
Emp_Project (E#, Project#, Role)
Emp_Stock (E#, Num_of_Shares, Share_worth)
This avoids the anomalies caused by the partial dependency in the original relation
Some redundancy still remains because of the transitive dependency:
E# -> Num_of_shares
Num_of_shares -> Share_worth
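The decomposition can be checked against the snapshot shown earlier. The sketch below, in plain Python with hypothetical variable names, projects the six snapshot rows onto the two smaller schemas and then joins them back on E# to confirm that nothing is lost or invented.

```python
# Snapshot of Empdetails: (E#, Project#, Role, Num_of_shares, Share_worth)
empdetails = [
    (102, "Abc", "DV", 100, "5,00,000"),
    (119, "Ppq", "ML", 150, "7,50,000"),
    (198, "Abc", "DV", 100, "5,00,000"),
    (102, "Hjk", "ML", 100, "5,00,000"),
    (127, "Edf", "DV", 200, "10,00,000"),
    (102, "Bnm", "ML", 100, "5,00,000"),
]

# Project onto the two smaller schemas; sets drop the repeated stock rows.
emp_project = {(e, p, role) for e, p, role, _, _ in empdetails}
emp_stock = {(e, shares, worth) for e, _, _, shares, worth in empdetails}

# Natural join of the two projections on E#.
rejoined = {
    (e, p, role, shares, worth)
    for e, p, role in emp_project
    for e2, shares, worth in emp_stock
    if e == e2
}

print(len(emp_project), len(emp_stock))
print(rejoined == set(empdetails))
```

Employee 102's stock detail is now stored once instead of three times, which is exactly the redundancy the partial dependency was causing.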

Third normal form: 3NF


A relation schema R is in 3NF if it is in 2NF and every non-prime attribute is non-transitively dependent on every key of R

Applying this, the relation
Emp_Stock (E#, Num_of_Shares, Share_worth)
will be decomposed into
Emp_Stock (E#, Num_of_Shares)
StockWorth (Num_of_Shares, Share_worth)

Boyce-Codd normal form: BCNF


A relation R is in BCNF if, for every non-trivial functional dependency A -> B in it, A is a superkey of R
In other words, every determinant is a candidate key

BCNF is a stronger form of 3NF
3NF states that every non-prime attribute must be non-transitively dependent on every key
BCNF states that every attribute (prime or non-prime) must be non-transitively dependent on every key

An example

Consider the relation:
Courses (Dept#, Course#, Lecturer#, Num_Students)

Assumptions:
Each department offers many courses
Course# is unique within a department only*
Each lecturer belongs to one department only
Each lecturer may handle several courses within the department
A particular course offered by a department may be handled by a single lecturer

* The same course id may refer to a different course offered by a different department

The functional dependencies


{Dept#, Course#} -> Lecturer#
{Dept#, Course#} -> Num_Students
{Lecturer#, Course#} -> Num_Students
Lecturer# -> Dept#

The candidate keys are:
{Dept#, Course#}
{Lecturer#, Course#}

A sample table
Dept#   Course#   Lecturer#   Num_Students
D1      C1        L1          20
D1      C2        L1          15
D1      C3        L2          42
D2      C5        L3          12
D2      C6        L4          19

Observations
In the table, the only non-prime attribute is Num_Students
It depends on every key of the table non-transitively, so the relation is in 3NF
But the fact that lecturer L1 belongs to department D1 is repeated: redundancy
Lecturer# -> Dept#: here the attribute Dept# depends on only part of the key {Lecturer#, Course#}, and the determinant Lecturer# is not a candidate key

The solution
Course_Offering (Lecturer#, Course#, Num_Students)
Lecturer (Lecturer#, Dept#)

Every determinant is a candidate key. So, BCNF
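The observation can be checked mechanically with an attribute-closure computation. The sketch below is illustrative only: the function names are hypothetical, and the test is simplified in that it looks only at the listed dependencies whose attributes all fall inside a relation, which is sufficient for this small example but is not a general BCNF test for decomposed relations.

```python
def closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the listed non-trivial FDs inside schema whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if lhs <= schema and rhs <= schema           # FD lies within this relation
        and not rhs <= lhs                           # non-trivial
        and (closure(lhs, fds) & schema) != schema   # lhs does not determine every attribute
    ]


D, C, L, N = "Dept#", "Course#", "Lecturer#", "Num_Students"
fds = [
    (frozenset({D, C}), frozenset({L})),
    (frozenset({D, C}), frozenset({N})),
    (frozenset({L, C}), frozenset({N})),
    (frozenset({L}), frozenset({D})),
]

courses = frozenset({D, C, L, N})
course_offering = frozenset({L, C, N})
lecturer = frozenset({L, D})

print(bcnf_violations(courses, fds))          # only Lecturer# -> Dept# is reported
print(bcnf_violations(course_offering, fds))  # empty list
print(bcnf_violations(lecturer, fds))         # empty list
```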

Denormalization
Denormalization can be described as a process for reducing the degree of normalization with the aim of improving query processing performance
However, reducing the degree of normalization of a table may lead to inconsistencies, so this option should be chosen only after careful thought
The usefulness of denormalization is debatable

[Figure: normalised tables, and the same tables after denormalization]

Denormalization may not always give an optimal solution

Exercise- Recruitment
The HR department of an organization is planning a big recruitment drive. They wish to organize the data required for the process in a database. The data that needs to be captured is as follows:
Applicant: Enrollment id, Name, Address, Date of birth, Gender, Telephone no
Interviewer: Employee id, Name, Extension number
Previous employment: Employer name, Address, Telephone, Designation, Reason for leaving, Date joined, Date left, Last drawn salary
Qualifications: Qualification (HSC, SSLC, etc.), Year of passing, University/board, Class

Functional dependencies
Applicant
Enroll# -> Name
Enroll# -> Address
Enroll# -> DOB
Enroll# -> Gender
Enroll# -> Phone
Enroll# -> Interviewer
Interviewer -> Int_Name (transitive dependency)
Interviewer -> Extension (transitive dependency)

Qualifications
{Enroll#, qualification, year_of_passing} -> awarded_by
{Enroll#, qualification, year_of_passing} -> class
Assumptions:
A person may acquire the same qualification several times from the same university (e.g. M.A. in English, M.A. in History)
Only one degree can be obtained in a year

Functional dependencies
Employment
{Enroll#, Employername, date_joined} -> designation
{Enroll#, Employername, date_joined} -> reason_for_Leaving
{Enroll#, Employername, date_joined} -> date_left
{Enroll#, Employername, date_joined} -> last_salary
Employername -> address (partial dependency)
Employername -> telephone (partial dependency)

1NF
Applicant( Enroll#, Name, Address, DOB, Gender, Phone, interviewer, Int_Name, Extension)

Qualifications( Enroll#, qualification, year_of_passing, awarded_by, class)

Employment( Enroll#, Employername, date_joined, address, telephone, designation, reason_for_Leaving, date_left, last_salary)

2NF
Applicant( Enroll#, Name, Address, DOB, Gender, Phone, interviewer, Int_Name, Extension)

Qualifications( Enroll#, qualification, year_of_passing, awarded_by, class)

Employment( Enroll#, Employername, date_joined, designation, reason_for_Leaving, date_left, last_salary)

Employer( Employername, Address, Phone)

Removal of partial dependencies

3NF
Applicant( Enroll#, Name, Address, DOB, Gender, Phone, interviewer)

Panel( interviewer, Name, Extn)

Removal of transitive dependencies

Class work

Caterers association
A caterers' association in a city wants to build a database of all the caterers who are members of the association. Every type of item has an item code, e.g. idly: item code bf1, dosa: bf2, aloo paratha: bf3, south Indian meals: ln1, etc. Each caterer may have outlets spread over the city. It is not necessary that all caterers, or all branches of a single caterer, provide all item types. Design the relational schema for the above requirement. Normalize the relations.

The data
Unnormalized data items for the Caterers Association:
Caterer name
Membership id
Main location
Branches 1...n
Item ID 1...n
Item name 1...n
Location where available 1...n
Caterer grade

Summary
Entity types can participate in relationships totally or partially
Extended ER features represent generalization-specialization hierarchies, etc.
There are guidelines for converting entities, relationships and attributes into their equivalents in the relational model
Functional dependencies can be full, partial, or transitive
If a table contains only atomic values, then it is in 1NF
If a table is in 1NF and is also free of partial dependencies, it is in 2NF
If a table is in 2NF and is free of transitive dependencies, then it is in 3NF
If a table is in 3NF and every determinant is a candidate key, then it is in BCNF

Thank You!