Professional Documents
Culture Documents
RDBMS handles data in the form of relations, tuples and fields Keys identify tuples uniquely ER modeling is a diagrammatic representation of the conceptual design of a database ER diagrams consist of entity types, relationship types and attributes
Entity example
Here address is a composite attribute
Years of service is a derived attribute (can be calculated from date of joining and current date)
Skill set is a multi-valued attribute The relational Schema
Employee Table
EmpCode EmpName DateofJoining SkillSet PK
This foreign key along with the key of the weak entity form the composite primary key of this table The Relational Schema Employee (E# ,.)
Employee Table
EmpCode EmpName DateofJoining SkillSet PK
Dependent EmpCode
PK /FK PK
Dependent_ID
Name Address
Converting relationships
The way relationships are represented depends on the cardinality and the degree of the relationship The possible cardinalities are: 1:1, 1:M, N:M The degrees are:
Unary Binary
Ternary
Binary 1:1
Employee 1
head of
department
Case 1: Combination of participation types The primary key of the partial participant will become the foreign key of the total participant Employee( E#, Name,)
Binary 1 : 1
Employee Table
EmpCode EmpName DateofJoining SkillSet PK
PK
Location Head
FK
Binary 1:1
Employee Sits_on CHAIR
The primary key of either of the participants can become a foreign key in the other
Employee (E#,name)
Chair( item#, model, location, used_by) (or) Employee ( E#, Name.Sits_on) Chair (item#,.)
PK
SkillSet
Employee Table EmpCode PK
FK
EmpName
DateofJoining SkillSet Sits_On FK
Copyright 2004, # Infosys Technologies Ltd
Chair ItemNo
PK
Model
Location
ER/CORP/CRS/DB07/003 Version No: 2.0
Binary 1:N
1 Teacher teaches N Subject
The primary key of the relation on the 1 side of the relationship becomes a foreign key in the relation on the N side Teacher (ID, Name, Telephone, ...) Subject (Code, Name, ..., Teacher)
Binary 1 : N
Teacher
TeacherID Name Telephone Cabin PK
PK
Duration TeacherID
FK
Binary M:N
M Student
N
Enrolls Course
Contains two foreign keys - one from each of the participants in the relationship
The primary key of the new table is the combination of the two foreign keys Student (Sid#,Title) Course(C#,CName,)
Binary M : N
Student StudentID
CourseID
PK
PK / FK
StudentName
DOIssue Status
DOB Address
Copyright 2004, # Infosys Technologies Ltd
ER/CORP/CRS/DB07/003 Version No: 2.0
Unary 1:1
Consider employees who are also a couple
Unary 1 : 1
Spouse
FK
Unary 1:N
The primary key field itself will become foreign key in the same table
Unary 1 : N
Manager
FK
Unary M:N
Guarantor_of
M Employee
There will be two resulting tables. One to represent the entity and another to represent the M:N relationship as follows
Employee( E#, Name,)
Unary M : N
Employee Table
EmpCode EmpName DateofJoining SkillSet PK
PK/FK
PK /FK
Ternary relationship
Represented by a new table The new table contains three foreign keys - one from each of the participating Entities The primary key of the new table is the combination of all three foreign keys Prescription (Doctor#, Patient #, Medicine_Name)
NextVisit
Patient PatCode PatName PK Medicine MedName ExpDate
Copyright 2004, # Infosys Technologies Ltd
ER/CORP/CRS/DB07/003 Version No: 2.0
DOB Address
PK
CUSTOMER
(Customer#,Name,Telephone#,Address)
Example: Similarly a Bank table can be created with Bank Bankcode, Name and Address columns
BANK
(BankCode, Name, Address)
Weak Entity types are converted into a table of their own, with the primary key of the strong Entity acting as a foreign key in the table. This Foreign key along with the key of the Weak Entity form the composite primary key of this table. Example: As per this guideline, a Branch table can be created with the following structure.
BRANCH
(BankCode, Branch#, Name, Address)
LOAN_DETAILS
(Loan#, Customer#, DateofSanction, InterestRate, LoanAmount, Duration)
Normalization
What is Normalization?
Database designed based on the E-R model may have some amount of Inconsistency Uncertainty Redundancy
To eliminate these draw backs some refinement has to be done on the database. Refinement process is called Normalization Defined as a step-by-step process of decomposing a complex relation into a simple and stable data structure. The formal process that can be followed to achieve a good database design Also used to check that an existing design is of good quality The different stages of normalization are known as normal forms To accomplish normalization we need to understand the concept of Functional Dependencies.
Student_Details
101 102 101 103 Davis Daniel Davis Sandra 11/4/1986 11/6/1987 11/4/1986 10/2/1988 M4 M4 H6 C3
Course_Details
Applied Mathematics Applied Mathematics American History Bio Chemistry Basic Chemistry Basic Mathematics Basic Mathematics 7 7 4 11
Result_Details
11/11/2004 11/11/2004 11/22/2004 11/16/2004 82 62 79 65 A C B B
B3 P3 P3
8 13 13
77 68 89
B B A
B4 H6 M4
5 4 7
54 87 65
D A B
Functional dependency
In a given relation R, X and Y are attributes. Attribute Y is functionally dependent on attribute X if each value of X determines EXACTLY ONE value of Y, which is represented as X -> Y (X can be composite in nature). We say here x determines y or y is functionally dependent on x XY does not imply YX If the value of an attribute Marks is known then the value of an attribute Grade is determined since MarksGrade Types of functional dependencies: Full Functional dependency Partial Functional dependency Transitive dependency
Functional Dependencies
Consider the following Relation REPORT (STUDENT#,COURSE#, CourseName, IName, Room#, Marks, Grade) STUDENT# - Student Number COURSE# - Course Number CourseName - Course Name IName - Name of the Instructor who delivered the course Room# - Room number which is assigned to respective Instructor Marks - Scored in Course COURSE# by Student STUDENT# Grade - obtained by Student STUDENT# in Course COURSE#
Dependency diagram
Report( S#,C#,SName,CTitle,LName,Room#,Marks,Grade)
S# SName C# CTitle, C# LName LName Room# C# Room# S# C# Marks Marks Grade S# C# Grade
S#
C#
CTitle
LName Room#
Assumptions: Each course has only one lecturer and each lecturer has a room. Grade is determined from Marks.
Copyright 2004, # Infosys Technologies Ltd
ER/CORP/CRS/DB07/003 Version No: 2.0
Full dependencies
X and Y are attributes. X Functionally determines Y Note: Subset of X should not functionally determine Y
Partial dependencies
X and Y are attributes. Attribute Y is partially dependent on the attribute X only if it is dependent on a sub-set of attribute X.
Student#
Course#
CourseName
IName
Room#
Transitive dependencies
X Y and Z are three attributes. X -> Y Y-> Z => X -> Z
Course#
IName
Room#
Student_Course_Result Table
Student_Details
101 102 101 103 Davis Daniel Davis Sandra 11/4/1986 11/6/1987 11/4/1986 10/2/1988 M4 M4 H6 C3
Course_Details
Applied Mathematics Applied Mathematics American History Bio Chemistry Basic Chemistry Basic Mathematics Basic Mathematics 7 7 4 11
Results
11/11/2004 11/11/2004 11/22/2004 11/16/2004 82 62 79 65 A C B B
B3 P3 P3
8 13 13
77 68 89
B B A
103
Sandra
10/2/1988
B4
Zoology
11/27/2004
54
105
104
Susan
Evelyn
8/31/1985
2/22/1986
H6
M4
American History
Applied Mathematics Basic Mathematics
4
7
11/22/2004
11/11/2004
87
65
A
B
Table in 1NF
Student# Student Name Dateof Birth
Student_Course_Result Table
Cour s e # CourseName Pre Requisite Dura t i o n InDa y s DateOf Exam Marks Grade
101
Davis
04-Nov-1986
M4
Applied Mathematics Applied Mathematics American History Bio Chemistry Botany Nuclear Physics Nuclear Physics
Basic Mathematics
11-Nov-2004
82
M4 H6 C3 B3 P3 P3
Basic Mathematics
7 4
62 79 65 77 68 89
C B B B B A
Basic Chemistry
11 8
13 13
103
105
Sandra
Susan
02-Oct-1988
31-Aug-1985
B4
H6
Zoology
American History Applied Mathematics
5
4
27-Nov-2004
22-Nov-2004
54
87
D
A
104
Evelyn
22-Feb-1986
M4
Basic Mathematics
11-Nov-2004
65
An attribute of a relation R that belongs to any key of R is said to be a prime attribute and that which doesnt is a non-prime attribute
To make a table 2NF compliant, we have to remove all the partial dependencies
Student # Course #
Student Name
Is a PRIME Attribute
Date of Birth
CourseName Prerequisite Marks Grade DurationInDays DateOfExam
Is NON-PRIME Attribute
To make this table 2NF compliant, we have to remove all the partial dependencies.
Student #, Course# -> Marks, Grade Student# -> StudentName, DOB, Course# -> CourseName, Prerequiste, DurationInDays Course# -> Date of Exam
S#,C#
S#
StudentName
Partial Dependency
S#
DOB
C# C# C#
Partial Dependency
C#
DateOfExam
Partial Dependency
ER/CORP/CRS/DB07/003 Version No: 2.0
COURSE TABLE
Course Name Pre Requisite Duration InDays
04-Nov-1986 06-Nov-1987
M1 M4
Basic Mathematics Applied Mathematics American History Basic Chemistry Bio Chemistry Botany Basic Physics Nuclear Physics Zoology
ER/CORP/CRS/DB07/003 Version No: 2.0
11 M1 7 4 5 C1 11 8 8 P1 13 5
101 M4 102 M4 101 H6 103 C3 104 B3 102 P3 105 P3 103 B4 105 H6 104 M4
S#,C#
Marks
S#,C#
Grade
3NF Tables
Student# Course# Marks
101 M4 102 M4 101 H6 103 C3 104 B3 102 P3 105 P3 103 B4 105 H6 104 M4
Copyright 2004, # Infosys Technologies Ltd
82 62 79 65 77 68 89 54 87 65
ER/CORP/CRS/DB07/003 Version No: 2.0
UpperBound
LowerBound
Grade
100
95 A+
94
84 69 64 54 44
85 A
70 B 65 B55 C 45 D 0 E
ER/CORP/CRS/DB07/003 Version No: 2.0
101 Davis@myuni.edu 102 Daniel@myuni.edu 101 Davis@myuni.edu 103 Sandra@myuni.edu 104 Evelyn@myuni.edu 102 Daniel@myuni.edu 105 Susan@myuni.edu 103 Sandra@myuni.edu 105 Susan@myuni.edu 104 Evelyn@myuni.edu
BCNF
Candidate Keys for the relation are
S# C#
and
C#
EmailID
S#
C# C# C#
EmailID
BCNF
STUDENT TABLE Student# EmailID
101 Davis@myuni.edu
102 Daniel@myuni.edu
103 Sandra@myuni.edu
104 Evelyn@myuni.edu 105 Susan@myuni.edu
BCNF Tables
Student# Course# Marks 82 62 79 65 77 68 89 54 87 65
101 M4 102 M4 101 H6 103 C3 104 B3 102 P3 105 P3 103 B4 105 H6 104 M4
Merits of Normalization
Normalization is based on a mathematical foundation. Removes the redundancy to a greater extent. After 3NF, data redundancy is minimized to the extent of foreign keys. Removes the anomalies present in INSERTs, UPDATEs and DELETEs.
Demerits of Normalization
Data retrieval or SELECT operation performance will be severely affected. Normalization might not always represent real world scenarios.
Operation
Create separate rows or columns for every combination of multivalued columns
Output
Table in 1 NF
Table in 1 NF
Tables in 2NF
Tables in 2 NF
Tables in 3 NF
Tables in 3 NF
Tables in BCNF
Points to Remember:
Normal Form 1NF Test Relation should have atomic attributes. The domain of an attribute must include only atomic (simple, indivisible) values. For relations where primary key contains multiple attributes (composite primary key), nonkey attribute should not be functionally dependent on a part of the primary key. Relation should not have a nonkey attribute functionally determined by another non-key attribute (or by a set of non-key attributes). In other words there should be no transitive dependency of a non-key attribute on the primary key. Remedy (Normalization) Form new relations for each non-atomic attribute
2NF
Decompose and form a new relation for each partial key with its dependent attribute(s). Retain the relation with the original primary key and any attributes that are fully functionally dependent on it. Decompose and form a relation that includes the non-key attribute(s) that functionally determine(s) other non-key attribute(s).
3NF
Summary
Normalization is a refinement process. It helps in removing anomalies present in INSERTs/UPDATEs/DELETEs Normalization is also called Bottom-up approach, because this technique requires very minute details like every participating attribute and how it is dependant on the key attributes, is crucial. If you add new attributes after normalization, it may change the normal form itself. There are four normal forms that were defined being commonly used. 1NF makes sure that all the attributes are atomic in nature. 2NF removes the partial dependency.
Summary contd.
3NF removes the transitive dependency. BCNF removes dependency among key attributes. Too much of normalization adversely affects SELECT or RETRIEVAL operations. It is always better to normalize to 3NF for INSERT, UPDATE and DELETE intensive (On-line transaction) systems. It is always better to restrict to 2NF for SELECT intensive (Reporting) systems.
While normalizing, use common sense and dont use the normal forms as absolute measures.
Thank You!