
UNIT - II

DATABASE DESIGN
SYLLABUS

� DATABASE DESIGN
Relational DBMS – ER model - Extended ER- Functional
Dependencies, Non-loss Decomposition, Anomaly - 1NF to
5NF
ER MODEL
� ER model stands for an Entity-Relationship model. It is a high-
level data model.
� This model is used to define the data elements and relationship
for a specified system.
• It produces a conceptual design for the database and gives a
simple, easy-to-design view of the data.
� It works around real-world entities and the associations among
them.
� At view level, the ER model is considered a good option for
designing databases.
COMPONENT OF ER DIAGRAM
NOTATION FOR ER DIAGRAMS
ENTITY
� An Entity may be an object with a physical existence – a
particular person, car, house, or employee – or it may be an
object with a conceptual existence – a company, a job, or a
university course.
� All these entities have some attributes or properties that give
them their identity.
• An entity set is a collection of entities of the same type.
Entities in a set share the same attributes and may share
similar attribute values.
WEAK ENTITY
• An entity that depends on another entity is called a weak entity.
• A weak entity doesn't contain any key attribute of its own.
� The weak entity is represented by a double rectangle.
ATTRIBUTE
� The attribute is used to describe the property of an entity. Ellipse
is used to represent an attribute.
� For example, id, age, contact number, name, etc. can be
attributes of a student.
� All attributes have values. For example, a student entity may
have name, class, and age as attributes.
SIMPLE ATTRIBUTE
� Simple attributes are atomic values, which cannot be divided
further. For example, a student's phone number is an atomic
value of 10 digits.
� Example: The roll number of a student, the id number of an
employee.
COMPOSITE ATTRIBUTES
• An attribute composed of several other attributes is called a
composite attribute. For example, the Address attribute of the Student
entity type consists of Street, City, State, and Country.
SINGLE-VALUED ATTRIBUTE
� Single-valued attribute is an attribute that can have only a
single value.
� For example, a person can have only one Social Security
number, and a manufactured part can have only one serial
number.
� Keep in mind that a single-valued attribute is not
necessarily a simple attribute.
MULTI-VALUED ATTRIBUTE
• Multi-valued attributes may contain more than one value. For
example, a person can have more than one phone number,
email_address, etc.
DERIVED ATTRIBUTE
� An attribute which can be derived from other attributes
of the entity type is known as derived attribute. e.g.; Age
(can be derived from DOB).
KEY ATTRIBUTE
� The attribute which uniquely identifies each entity in the
entity set is called key attribute.
• For example, RollNo will be unique for each student.
EXAMPLE
RELATIONSHIP
� A relationship type represents the association between
entity types.
� For example, ‘Enrolled in’ is a relationship type that exists
between entity type Student and Course.
• In an ER diagram, a relationship type is represented by a
diamond connected to the participating entities by lines.
RELATIONSHIP SET
� A set of relationships of similar type is called a relationship
set.
� Like entities, a relationship too can have attributes. These
attributes are called descriptive attributes.
DEGREE OF A RELATIONSHIP SET
• The number of different entity sets participating in a
relationship set is called the degree of the relationship set.
• Unary Relationship:
⚫ When there is only ONE entity set participating in a relationship, the
relationship is called a unary relationship. For example, one person
is married to only one person.
DEGREE OF A RELATIONSHIP SET
CONT..
• Binary Relationship
⚫ When there are TWO entity sets participating in a relationship,
the relationship is called a binary relationship.
DEGREE OF A RELATIONSHIP SET
CONT..
• n-ary Relationship:
⚫ When there are n entity sets participating in a relationship, the
relationship is called an n-ary relationship.
CARDINALITY
� The number of times an entity of an entity set participates
in a relationship set is known as cardinality.
� One-to-one: One entity from entity set A can be associated
with at most one entity of entity set B and vice versa.
CARDINALITY CONT….
• One-to-many: One entity from entity set A can be associated
with more than one entity of entity set B; however, an entity
from entity set B can be associated with at most one entity
of entity set A.
CARDINALITY CONT….
• Many-to-one: More than one entity from entity set A
can be associated with at most one entity of entity set B;
however, an entity from entity set B can be associated
with more than one entity from entity set A.
CARDINALITY CONT….
• Many-to-many: One entity from entity set A can be associated
with more than one entity of entity set B, and one entity from
entity set B can be associated with more than one entity of entity set A.
UNARY OR RECURSIVE RELATIONSHIP
� When there is a relationship between two entities of the same
type, it is known as a recursive relationship. This means that
the relationship is between different instances of the same
entity type.
ENHANCED ENTITY RELATIONSHIP
MODEL (EER)
� EER is a high-level data model that incorporates the
extensions to the original ER model.
� It is a diagrammatic technique for displaying the
following concepts
⚫ Sub Class and Super Class
⚫ Specialization and Generalization
⚫ Union or Category
⚫ Aggregation
FEATURES OF EER MODEL
� EER creates a design more accurate to database schemas.
� It reflects the data properties and constraints more precisely.
� It includes all modeling concepts of the ER model.
� Diagrammatic technique helps for displaying the EER
schema.
� It includes the concept of specialization and generalization.
• It is used to represent a collection of objects that is the union of
objects of different entity types.
A. SUB CLASS AND SUPER CLASS
• The sub class and super class relationship leads to the concept of
Inheritance.
� The relationship between sub class and super class is denoted
with symbol.
1. Super Class
⚫ Super class is an entity type that has a relationship with one or more
subtypes.
⚫ An entity cannot exist in the database merely by being a member of a
sub class; it must also be a member of the super class.
⚫ For example: the Shape super class has the sub groups Square,
Circle, Triangle.
2. Sub Class
⚫ Sub class is a group of entities with unique attributes.
⚫ Sub class inherits properties and attributes from its super class.
⚫ For example: Square, Circle, Triangle are the sub class of Shape super
class.
SUB CLASS AND SUPER CLASS CONT…
B. SPECIALIZATION AND GENERALIZATION
� Generalization
• Generalization is the process of extracting the common properties of
several entities to form a generalized entity.
• It is a bottom-up approach, in which two lower level entities
combine to form a higher level entity.
� Generalization is the reverse process of Specialization.
� It defines a general entity type from a set of specialized entity
type.
� It minimizes the difference between the entities by identifying
the common features.
GENERALIZATION
SPECIALIZATION
• Specialization is a process in which a group of entities
is divided into sub groups based on their characteristics.
• It is a top-down approach, in which one higher level entity can be
broken down into two lower level entities.
� It maximizes the difference between the members of an entity
by identifying the unique characteristic or attributes of each
member.
� It defines one or more sub class for the super class and also
forms the superclass/subclass relationship.
SPECIALIZATION
C. CATEGORY OR UNION
• Category represents a single super class/sub class relationship
with more than one super class.
� It can be a total or partial participation.
� For example Car booking, Car owner can be a person, a bank (holds
a possession on a Car) or a company.
� Category (sub class) → Owner is a subset of the union of the three
super classes → Company, Bank, and Person.
� A Category member must exist in at least one of its super classes.
CATEGORY OR UNION CONT…
D. AGGREGATION
• Aggregation is a process that represents a relationship between
a whole object and its component parts.
• It abstracts a relationship between objects and views the
relationship as an object.
• It is a process in which two entities are treated as a single entity.
FUNCTIONAL
DEPENDENCY
WHAT IS FUNCTIONAL DEPENDENCY?
� Functional Dependency is a relationship that exists between
multiple attributes of a relation.
� This concept is given by E. F. Codd.
� Functional dependency represents a formalism on the
infrastructure of relation.
� It is a type of constraint existing between various attributes of
a relation.
� It is used to define various normal forms.
� These dependencies are restrictions imposed on the data in
database.
FUNCTIONAL DEPENDENCY CONT…
• If R is a relation with attributes A and B, a functional
dependency between these two attributes is written {A → B}.
In this notation:

A          is the determinant set.
B          is the dependent attribute.
{A → B}    A functionally determines B;
           B is functionally dependent on A.

• Each value of A is associated with precisely one value of B. A functional
dependency is trivial if B is a subset of A.
• 'A' functionally determines 'B' {A → B}: the left-hand-side attributes
determine the values of the right-hand-side attributes.
FUNCTIONAL DEPENDENCY CONT…
Employee table
EmpId EmpName
101 Ramesh R
102 Franklin J
103 Ramesh R

• In the above Employee table, EmpName (employee name) is
functionally dependent on EmpId (employee id) because each
EmpId is associated with exactly one EmpName.
• The EmpId identifies the employee uniquely, but EmpName
cannot determine the EmpId because more than one employee
could have the same name.
• Identifying the functional dependencies between attributes helps
eliminate the repetition of information.
� It is related to a candidate key, which uniquely identifies a tuple
and determines the value of all other attributes in the relation.
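The check described for the Employee table can be sketched in Python. This is a minimal illustration, not part of the original slides; the helper name fd_holds is hypothetical, and the rows are the three shown in the Employee table.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # A second, different value for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


employee = [
    {"EmpId": 101, "EmpName": "Ramesh R"},
    {"EmpId": 102, "EmpName": "Franklin J"},
    {"EmpId": 103, "EmpName": "Ramesh R"},
]

print(fd_holds(employee, ["EmpId"], ["EmpName"]))  # True
print(fd_holds(employee, ["EmpName"], ["EmpId"]))  # False
```

A check on sample rows can only refute a dependency; whether EmpId → EmpName holds in general is a design decision about the data.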
ADVANTAGES OF FUNCTIONAL DEPENDENCY
• Functional dependency helps avoid data
redundancy, where the same data is
repeated at multiple locations in the same
database.
�It maintains the quality of data in database.
�It allows clearly defined meanings and
constraints of databases.
�It helps in identifying bad designs.
�It expresses the facts about the database
design.
ARMSTRONG'S AXIOMS
�Axioms Rules
⚫Armstrong's Axioms is a set of rules.
⚫It provides a simple technique for reasoning
about functional dependencies.
⚫It was developed by William W. Armstrong in
1974.
⚫It is used to infer all the functional
dependencies on a relational database.
VARIOUS AXIOMS RULES
Primary Rules
Rule 1 Reflexivity
    If A is a set of attributes and B is a subset of A, then A holds B.
{A→B}

Rule 2 Augmentation
If A holds B and C is a set of attributes, then AC holds BC.
If {A → B}, then {AC → BC}
It means that adding the same attributes to both sides of a dependency
does not change the basic dependency.

Rule 3 Transitivity
    If A holds B and B holds C, then A holds C.
    If {A → B} and {B → C}, then {A → C}
    A holds B {A → B} means that A functionally determines B.
VARIOUS AXIOMS RULES
Secondary Rules
Rule 1 Union
    If A holds B and A holds C, then A holds BC.
    If{A → B} and {A → C}, then {A → BC}
Rule 2 Decomposition
    If A holds BC, then A holds B and A holds C.
    If {A → BC}, then {A → B} and {A → C}
Rule 3 Pseudo Transitivity
    If A holds B and BC holds D, then AC holds D.
    If {A → B} and {BC → D}, then {AC → D}
Rule 4 Composition
    If X holds Y and A holds B, then XA holds YB.
    If {X → Y} and {A → B}, then {XA → YB}
DECOMPOSITION
� Decomposition is the process of breaking down in parts or
elements.
� It replaces a relation with a collection of smaller relations.
� It breaks the table into multiple tables in a database.
� It should always be lossless, because it confirms that the
information in the original relation can be accurately
reconstructed based on the decomposed relations.
� If there is no proper decomposition of the relation, then it
may lead to problems like loss of information.
PROPERTIES OF
DECOMPOSITION

1. Lossless Decomposition
2. Dependency Preservation
3. Lack of Data Redundancy
1. LOSSLESS DECOMPOSITION
� Decomposition must be lossless. It means that the information
should not get lost from the relation that is decomposed.
� It gives a guarantee that the join will result in the same relation
as it was decomposed.
• Example:
Let 'E' be a relational schema with instance 'e', decomposed
into E1, E2, E3, . . . , En with instances e1, e2, e3, . . . , en.
If e1 ⋈ e2 ⋈ e3 ⋈ . . . ⋈ en = e, then the decomposition is
called a 'Lossless Join Decomposition'.

• In other words, if the natural join of all the decomposed
relations gives back exactly the original relation, the
decomposition is said to be a lossless join decomposition.
EXAMPLE: EMPLOYEE_DEPARTMENT TABLE
Eid Ename Age City Salary Deptid DeptName
E001 ABC 29 Pune 20000 D001 Finance
E002 PQR 30 Pune 30000 D002 Production
E003 LMN 25 Mumbai 5000 D003 Sales
E004 XYZ 24 Mumbai 4000 D004 Marketing
E005 STU 32 Bangalore 25000 D005 Human Resource
Employee Table
Eid Ename Age City Salary
E001 ABC 29 Pune 20000
E002 PQR 30 Pune 30000
E003 LMN 25 Mumbai 5000
E004 XYZ 24 Mumbai 4000
E005 STU 32 Bangalore 25000

Department Table
Deptid Eid DeptName
D001 E001 Finance
D002 E002 Production
D003 E003 Sales
D004 E004 Marketing
D005 E005 Human Resource
EMPLOYEE ⋈ DEPARTMENT

Eid Ename Age City Salary Deptid DeptName


E001 ABC 29 Pune 20000 D001 Finance
E002 PQR 30 Pune 30000 D002 Production
E003 LMN 25 Mumbai 5000 D003 Sales
E004 XYZ 24 Mumbai 4000 D004 Marketing
E005 STU 32 Bangalore 25000 D005 Human Resource
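The lossless property of this example can be checked with a short Python sketch. The helpers project, natural_join and as_set are hypothetical names introduced for illustration; the data is the EMPLOYEE_DEPARTMENT table above.

```python
def project(rows, attrs):
    """Project rows (a list of dicts) onto attrs, dropping duplicate tuples."""
    out = []
    for row in rows:
        t = {a: row[a] for a in attrs}
        if t not in out:
            out.append(t)
    return out


def natural_join(r, s):
    """Natural join: combine rows that agree on every shared attribute."""
    out = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in set(a) & set(b)):
                merged = {**a, **b}
                if merged not in out:
                    out.append(merged)
    return out


def as_set(rows):
    """Order-independent view of a relation, used only for comparison."""
    return {frozenset(row.items()) for row in rows}


columns = ["Eid", "Ename", "Age", "City", "Salary", "Deptid", "DeptName"]
data = [
    ("E001", "ABC", 29, "Pune", 20000, "D001", "Finance"),
    ("E002", "PQR", 30, "Pune", 30000, "D002", "Production"),
    ("E003", "LMN", 25, "Mumbai", 5000, "D003", "Sales"),
    ("E004", "XYZ", 24, "Mumbai", 4000, "D004", "Marketing"),
    ("E005", "STU", 32, "Bangalore", 25000, "D005", "Human Resource"),
]
employee_department = [dict(zip(columns, values)) for values in data]

# Decompose exactly as in the Employee and Department tables above.
employee = project(employee_department, ["Eid", "Ename", "Age", "City", "Salary"])
department = project(employee_department, ["Deptid", "Eid", "DeptName"])

rejoined = natural_join(employee, department)
lossless = as_set(rejoined) == as_set(employee_department)
print(lossless)  # True
```

The join is lossless here because the shared attribute Eid is a key of the Employee table.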
2. DEPENDENCY PRESERVATION
� Dependency is an important constraint on the
database.
� Every dependency must be satisfied by at least one
decomposed table.
• If {A → B} holds, then B is functionally dependent on A. The
dependency is easiest to check when both attribute sets are in
the same relation.
• A decomposition has this property only if the functional
dependencies are maintained by the decomposed relations.
• This property allows updates to be checked without
computing the natural join of the decomposed relations.
3. LACK OF DATA REDUNDANCY
• Lack of data redundancy means avoiding the problem
known as repetition of information.
�The proper decomposition should not suffer
from any data redundancy.
�The careless decomposition may cause a
problem with the data.
�The lack of data redundancy property may be
achieved by Normalization process.
CLOSURE OF ATTRIBUTE
• The closure of a set of attributes A under a set of FDs is the
set of all attributes that are “eventually” determined by A.
• Let A → B;
B → C, D;
BD → E;
Then, Closure(A) = {A, B, C, D, E}
• The closure of a set of attributes A is denoted by A+.
If A+ is the set of all attributes of R, then A is a super
key of R.
EXAMPLE
� R ( A , B , C , D , E , F , G ) 
� A → BC
� BC → DE
� D→F
� CF → G
� A+   = {A , B , C , D , E , F , G }
� D+  = {D, F}
� A is Super key

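The closure computation in the example above can be sketched in Python. This is a minimal illustration; the function name closure and the encoding of each FD as a pair of strings (one character per attribute) are assumptions made here, not notation from the slides.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already derived, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# FDs of R(A, B, C, D, E, F, G) from the example.
fds = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", fds)))  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
print(sorted(closure("D", fds)))  # ['D', 'F']
```

Because the closure of A contains every attribute of R, A is a super key, as the example states.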
PRIME AND NON PRIME ATTRIBUTE
• Prime Attribute: an attribute present in any of the candidate
keys.
• Non Prime Attribute: an attribute which is NOT present in
any of the candidate keys.

EXAMPLE
R(A, B, C, D, E, F)
F(R) = { C → F, E → A, EC → D, A → B }
To find candidate keys:
a) Start with the attributes which are not present in any RHS
=> CE
b) Find the closure of those attributes
=> (CE)+ = CEFADB
Candidate key => CE
Prime attributes: C, E
Non prime attributes: A, B, D, F
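The search for candidate keys in this example can be sketched as a brute-force Python routine. It is only an illustration: candidate_keys and closure are hypothetical helper names, and trying every attribute combination is practical only for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(attributes):
                keys.append(combo)
    return keys


attributes = "ABCDEF"
fds = [("C", "F"), ("E", "A"), ("EC", "D"), ("A", "B")]

keys = candidate_keys(attributes, fds)
prime = set().union(*keys)
non_prime = set(attributes) - prime

print(keys)               # [('C', 'E')]
print(sorted(prime))      # ['C', 'E']
print(sorted(non_prime))  # ['A', 'B', 'D', 'F']
```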
NORMALIZATION
• A process that “improves” or “optimizes” the database design by
reducing data redundancies.
⚫ Problems Caused by Redundancy
    Wasting storage
    Insert anomalies
    Update anomalies
    Deletion anomalies
• Decide whether a particular relation R is in “good” form.
• In the case that a relation R is not in “good” form, decompose it
into a set of relations {R1, R2, ..., Rn} such that
⚫ each relation is in good form
⚫ the decomposition is a lossless-join decomposition
PURPOSE OF NORMALISATION

• To avoid redundancy by storing each ‘fact’ within the
database only once.
• To put data into a form that conforms to relational principles
(e.g., single valued attributes, each relation represents one
entity), with no repeating groups.
• To put the data into a form that is more able to accurately
accommodate change.
• To avoid certain updating ‘anomalies’.
• To facilitate the enforcement of data constraints.


REDUNDANCY AND DATA ANOMALIES

Redundant data is where we have stored the same ‘information’ more
than once, i.e., the redundant data could be removed without the loss of
information.

staffNo  job       dept  dname       city
SL10     Salesman  10    Sales       Madurai
SA51     Manager   20    Accounts    CBE
DS40     Clerk     20    Accounts    CBE
OS45     Clerk     30    Operations  CBE

Such ‘redundancy’ could lead to the following ‘anomalies’:

Insert Anomaly: We can’t insert a dept without inserting a member of
staff that works in that department.
Update Anomaly: We could change the name of the dept that SA51 works
in without simultaneously changing the name of the dept that DS40 works in,
leaving dept 20 with two different names.
Deletion Anomaly: By removing employee SL10 we have removed all
information pertaining to the Sales dept.
REPEATING GROUPS
A repeating group is an attribute (or set of attributes) that can have
more than one value for a primary key value.

Example: We have the following relation that contains staff and department details and a list of telephone
contact numbers for each member of staff.

staffNo  job       dept  dname       city     contact number
SL10     Salesman  10    Sales       Madurai  018111777, 018111888, 079311122
SA51     Manager   20    Accounts    CBE      017111777
DS40     Clerk     20    Accounts    CBE
OS45     Clerk     30    Operations  CBE      079311555

Repeating Groups are not allowed in a relational design, since all attributes
have to be ‘atomic’ - i.e., there can only be one value per cell in a table!
FULLY, PARTIAL FUNCTIONAL DEPENDENCIES

• Fully Functional Dependency: Y is fully functionally dependent
on X if X → Y holds and Y does not depend on any proper subset of X.
• Partial Functional Dependency: In a relation, there
exists a Partial Dependency when a non prime attribute is
functionally dependent on a proper subset of a Candidate
Key.
• For example: Let there be a relation
• R (Course, Sid, Sname, fid, schedule, room, marks)
• Full Functional Dependencies:
⚫ {Course, Sid} -> Marks
• Partial Functional Dependencies:
⚫ Course -> Schedule, Course -> Room
⚫ {Course, Sid} -> Sname is also only partial if Sid alone determines Sname
EXAMPLE
• Student(Regno, name, DOB, CGPA, Address,
Gender, age, Mobileno)
• Functional Dependency
Regno -> name
Regno -> Dob
Dob -> age
Regno, name -> CGPA (Partial Dependency)
Name, Mobileno -> Address (Full Dependency)
Regno, Mobileno -> Mobileno
Levels of Normalization

⚫ First Normal Form (1NF)
⚫ Second Normal Form (2NF)
⚫ Third Normal Form (3NF)
⚫ Boyce-Codd Normal Form (BCNF)
⚫ Fourth Normal Form (4NF)
⚫ Fifth Normal Form (5NF)

(Figure labels beside the list: Number of Tables, Redundancy, Complexity)
FIRST NORMAL FORM (1NF)
• First Normal Form (1NF) is the simplest form of
Normalization.
• It simplifies each attribute in a relation.
• In 1NF, there should not be any repeating group of
data.
• Each column must hold a single value in each row.
• It contains only atomic values, because a cell cannot
hold multiple values.
EXAMPLE
Not in 1NF (COURSES is Not Atomic):

ROLL NO  NAME  COURSES
101      Akil  DBMS, OS, CN, CD
102      Abi   CD, OS
103      Bala  CD, DS, DBMS, CN

In 1NF:

ROLL NO  NAME  COURSE
101      Akil  DBMS
101      Akil  OS
101      Akil  CN
101      Akil  CD
102      Abi   CD
102      Abi   OS
103      Bala  CD
103      Bala  DS
103      Bala  DBMS
103      Bala  CN
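The conversion shown in this example, from a multi-valued COURSES cell to one row per course, can be sketched in Python. The function name to_1nf and the comma-separated encoding of the course list are assumptions made for illustration.

```python
# Unnormalized rows: the third field holds several courses in one cell.
unnormalized = [
    (101, "Akil", "DBMS, OS, CN, CD"),
    (102, "Abi", "CD, OS"),
    (103, "Bala", "CD, DS, DBMS, CN"),
]


def to_1nf(rows):
    """Emit one (roll_no, name, course) row per course so every cell is atomic."""
    flat = []
    for roll_no, name, courses in rows:
        for course in courses.split(","):
            flat.append((roll_no, name, course.strip()))
    return flat


for row in to_1nf(unnormalized):
    print(row)
```

The result has ten rows, matching the 1NF table in the example, with (ROLL NO, COURSE) as the key.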
SECOND NORMAL FORM (2NF)
• Relation R should be in First Normal Form
(1NF).
• All non-prime attributes are fully functionally
dependent on every candidate key of relation R.
• The main rule of 2NF is, 'No non-prime
attribute is dependent on a proper subset
of any candidate key of the table’.
• An attribute which is not part of any candidate key is known as
a non-prime attribute.
2 NF (EXAMPLE )

• Consider the relation R (A, B, C, D, E, F)
• FD is F(R) = { A → BCDEF, BC → ADEF,
B → F, D → E }
• Find non prime attributes
A+ = ABCDEF
BC+ = BCADEF
B+ = BF
D+ = DE
Candidate keys: A, BC
Non Prime Attributes: D, E, F
2 NF (EXAMPLE )
• Check Fully Functional Dependency of non prime attributes

D    A → BCDEF, BC → ADEF
E    D → E, A → BCDEF, BC → ADEF
F    A → BCDEF, BC → ADEF, B → F

B → F is a partial functional dependency (B is a proper subset of the candidate key BC).
D → E is not a partial dependency (D is NOT part of a candidate key).
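The 2NF check for R(A, B, C, D, E, F) can be sketched in Python: for each proper subset of a candidate key, see which non-prime attributes its closure reaches. The helper names closure and partial_dependencies are hypothetical, and the candidate keys A and BC are supplied by hand rather than computed.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(keys, non_prime, fds):
    """List (subset, attribute) pairs where a proper subset of a candidate
    key determines a non-prime attribute, which violates 2NF."""
    found = set()
    for key in keys:
        for size in range(1, len(key)):
            for subset in combinations(key, size):
                for attr in closure(subset, fds) & set(non_prime):
                    found.add(("".join(subset), attr))
    return sorted(found)


fds = [("A", "BCDEF"), ("BC", "ADEF"), ("B", "F"), ("D", "E")]
violations = partial_dependencies(["A", "BC"], "DEF", fds)
print(violations)  # [('B', 'F')]
```

Only B → F is reported, so F is the attribute that has to move to its own relation.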
2 NF (EXAMPLE )

• Relation R is not in 2 NF.
• Split the Relation R into Two Relations R1, R2

R (A, B, C, D, E, F)
F(R) = { A → BCDEF, BC → ADEF, B → F, D → E }

R1 (A, B, C, D, E)
F(R1) = { A → BCDE, BC → ADE, D → E }

R2 (B, F)
F(R2) = { B → F }
EXAMPLE: Student Relation

Roll No  Course No  Course Name
1        cs8492     DBMS
2        cs8492     DBMS
1        cs8493     OS

FD:
{CourseNo, RollNo -> Course Name,
CourseNo -> Course Name}
Closure:
(Rollno, CourseNo)+ = {Rollno, CourseNo, Course Name}
Candidate key : Rollno, CourseNo
Prime Attribute : Rollno, CourseNo
Non Prime Attribute : Course Name
Here, Course Name is partially dependent on CourseNo
DECOMPOSE THE RELATION 

• Student (RollNo, CourseNo, CourseName) is decomposed into:
• Student_course (RollNo, CourseNo)
• Course (CourseNo, CourseName)
THIRD NORMAL FORM (3 NF)
• Relation should be in 2NF.
• No non prime attribute should be transitively
dependent on a candidate key
(OR)
• There should not be a case where a non prime attribute is
determined by another non prime attribute.

EXAMPLE: R (A, B, C, D)
Let the candidate key be BC.
Non prime attributes are A, D.
A → D violates 3NF.
3 NF (EXAMPLE )

R1 (A, B, C, D, E)
F(R1) = { A → BCDE, BC → ADE, D → E }
Candidate key : A, BC
Non Prime Attribute : D, E
D → E: one non prime attribute determines another, so R1 is not in 3NF.

R2 (B, F)
F(R2) = { B → F }
Candidate key : B
Non prime attribute : F
Only one non prime attribute, so R2 is in 3NF.
3 NF (EXAMPLE )
• Split the Relation R1 into Two Relations R11, R12

R1 (A, B, C, D, E)
F(R1) = { A → BCDE, BC → ADE, D → E }

R11 (A, B, C, D)
F(R11) = { A → BCD, BC → AD }

R12 (D, E)
F(R12) = { D → E }
3 NF (EXAMPLE )
Employee
EId Ename DOB City State Zip
001 ABC 10/05/1990 Pune Maharashtra 411038
002 XYZ 11/05/1988 Mumbai Maharashtra 400007

Employee_Table1

EId Ename DOB Zip


001 ABC 10/05/1990 411038
002 XYZ 11/05/1988 400007

Employee_Table2

City State Zip


Pune Maharashtra 411038
Mumbai Maharashtra 400007
BOYCE CODD NORMAL FORM (BCNF)
• Relation should be in 3NF
• A table is in Boyce-Codd normal form (BCNF) if
every determinant in the table is a candidate key.
(OR)
• No partial dependencies and no transitive dependencies: the left-hand
side of every non-trivial functional dependency must be a super key.
• If a table contains only one candidate key, the 3NF
and the BCNF are equivalent.
• Any 2-attribute relation of the form R(A, B) is always
in BCNF.
BCNF (EXAMPLE)

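R11 (A, B, C, D)
F(R11) = { A → BCD, BC → AD }
Candidate key: A, BC
R11 is in BCNF: every determinant is a candidate key.

R (A, B, C, D)
F(R) = { A → BCD, BC → AD, D → B }
Candidate key: A, BC, CD
D → B violates BCNF because D is not a super key.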
BCNF (EXAMPLE)
• Split the Relation R into Two Relations R1, R2

R (A, B, C, D)
F(R) = { A → BCD, BC → AD, D → B }

R1 (A, C, D)
F(R1) = { A → CD, CD → A }

R2 (D, B)
F(R2) = { D → B }
BCNF (EXAMPLE)
EMPLOYEE table:

EMP_ID  EMP_COUNTRY  EMP_DEPT    DEPT_TYPE  EMP_DEPT_NO
264     India        Designing   D394       283
264     India        Testing     D394       300
364     UK           Stores      D283       232
364     UK           Developing  D283       549

Functional dependencies are

1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate key: {EMP_ID, EMP_DEPT}

The table is not in BCNF because neither EMP_DEPT nor EMP_ID alone is a key.
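The BCNF test used for the EMPLOYEE table can be sketched in Python: a non-trivial dependency violates BCNF when the closure of its left side is not the whole relation. The names closure and bcnf_violations are hypothetical helpers, and FDs are encoded here as pairs of Python sets.

```python
def closure(attrs, fds):
    """Closure of a set of attributes under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Return the non-trivial FDs whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and closure(lhs, fds) != attributes
    ]


attributes = {"EMP_ID", "EMP_COUNTRY", "EMP_DEPT", "DEPT_TYPE", "EMP_DEPT_NO"}
fds = [
    ({"EMP_ID"}, {"EMP_COUNTRY"}),
    ({"EMP_DEPT"}, {"DEPT_TYPE", "EMP_DEPT_NO"}),
]

violations = bcnf_violations(attributes, fds)
for lhs, rhs in violations:
    print(sorted(lhs), "->", sorted(rhs))

# The pair {EMP_ID, EMP_DEPT} does determine every attribute.
print(closure({"EMP_ID", "EMP_DEPT"}, fds) == attributes)  # True
```

Both dependencies are reported, which is why each one gets its own table in the decomposition.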
BCNF (EXAMPLE)
EMP_COUNTRY table:

EMP_ID  EMP_COUNTRY
264     India
364     UK

EMP_DEPT table:

EMP_DEPT    DEPT_TYPE  EMP_DEPT_NO
Designing   D394       283
Testing     D394       300
Stores      D283       232
Developing  D283       549

EMP_DEPT_MAPPING table:

EMP_ID  EMP_DEPT
264     Designing
264     Testing
364     Stores
364     Developing

Functional dependencies:

1. EMP_ID → EMP_COUNTRY
2. EMP_DEPT → {DEPT_TYPE, EMP_DEPT_NO}

Candidate keys:
For the first table: EMP_ID
For the second table: EMP_DEPT
For the third table: {EMP_ID, EMP_DEPT}
FOURTH NORMAL FORM (4 NF)
• Relation should be in BCNF
• No multi valued dependency.

ENAME  PNAME  MNAME
Smith  X      John
Smith  Y      Raj
Smith  X      Raj
Smith  Y      John

ENAME : PNAME => 1 : N
ENAME : MNAME => 1 : N
ENAME →→ PNAME
ENAME →→ MNAME

• Trivial Multi valued dependency X →→ Y:
• Y ⊆ X, or
• X ∪ Y = R
Multi valued dependencies NOT satisfying these
conditions are Non Trivial Multi Valued
Dependencies.
4 NF (EXAMPLE)
ENAME  PNAME  MNAME
Smith  X      John
Smith  Y      Raj
Smith  X      Raj
Smith  Y      John

is decomposed into:

ENAME  PNAME
Smith  X
Smith  Y

ENAME  MNAME
Smith  John
Smith  Raj
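Whether a multi valued dependency holds in a table can be tested directly from its definition. The sketch below is an illustration only; mvd_holds is a hypothetical helper, and the rows are the four ENAME, PNAME, MNAME tuples from the example.

```python
def mvd_holds(rows, x, y):
    """Check the multivalued dependency X ->> Y on rows (a list of dicts)."""
    attrs = list(rows[0])
    present = {tuple(row[a] for a in attrs) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # Take the Y values from t1 and everything else from t2;
                # the mixed tuple must also be in the relation.
                mixed = {**t2, **{a: t1[a] for a in y}}
                if tuple(mixed[a] for a in attrs) not in present:
                    return False
    return True


rows = [
    {"ENAME": "Smith", "PNAME": "X", "MNAME": "John"},
    {"ENAME": "Smith", "PNAME": "Y", "MNAME": "Raj"},
    {"ENAME": "Smith", "PNAME": "X", "MNAME": "Raj"},
    {"ENAME": "Smith", "PNAME": "Y", "MNAME": "John"},
]

print(mvd_holds(rows, ["ENAME"], ["PNAME"]))      # True
print(mvd_holds(rows[:3], ["ENAME"], ["PNAME"]))  # False
```

With one of the four combinations missing the dependency no longer holds, which shows why every PNAME must be paired with every MNAME in the undecomposed table.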
4 NF (EXAMPLE)
STUDENT

STU_ID  COURSE     HOBBY
21      Computer   Dancing
21      Math       Singing
34      Chemistry  Dancing
74      Biology    Cricket
59      Physics    Hockey

1. In the STUDENT relation, the student with STU_ID 21 has two
courses, Computer and Math, and two hobbies, Dancing and Singing.
2. So there is a multi-valued dependency on STU_ID, which
leads to unnecessary repetition of data.

STUDENT_COURSE

STU_ID  COURSE
21      Computer
21      Math
34      Chemistry
74      Biology
59      Physics

STUDENT_HOBBY

STU_ID  HOBBY
21      Dancing
21      Singing
34      Dancing
74      Cricket
59      Hockey
FIFTH NORMAL FORM (5 NF)
• R is in 4NF
• If no join dependency exists → 5NF
• Else, if a join dependency exists:
    If it is only a trivial JD → 5NF
    Else, if every Ri is a super key → 5NF

• JOIN DEPENDENCY
• A join dependency (JD) {R1, . . . , Rn} is said to hold
over a relation R if R1, . . . , Rn is a lossless-join
decomposition of R.
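5 NF EXAMPLE

Original relation:

SUPP  PART  PROJ
S1    P1    r1
S1    P2    r2
S2    P1    r1
S2    P1    r2

Decomposed into three projections:

SUPP  PART
S1    P1
S1    P2
S2    P1

SUPP  PROJ
S1    r1
S1    r2
S2    r1
S2    r2

PART  PROJ
P1    r1
P2    r2
P1    r2

Additive join of the three projections (Lossy Decomposition):

SUPP  PART  PROJ
S1    P1    r1
S1    P1    r2
S1    P2    r2
S2    P1    r1
S2    P1    r2

The join contains the extra tuple (S1, P1, r2), which is not in the original relation.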
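The join of the three projections in this example can be reproduced with Python sets. This is a sketch for illustration; the variable names are assumptions, and the four original tuples are taken from the SUPP, PART, PROJ table.

```python
# Original SUPP-PART-PROJ relation from the example.
original = {
    ("S1", "P1", "r1"),
    ("S1", "P2", "r2"),
    ("S2", "P1", "r1"),
    ("S2", "P1", "r2"),
}

# The three binary projections.
supp_part = {(s, p) for s, p, _ in original}
supp_proj = {(s, j) for s, _, j in original}
part_proj = {(p, j) for _, p, j in original}

# Natural join of all three projections.
rejoined = {
    (s, p, j)
    for s, p in supp_part
    for s2, j in supp_proj
    if s == s2 and (p, j) in part_proj
}

spurious = rejoined - original
print(sorted(rejoined))
print(sorted(spurious))  # [('S1', 'P1', 'r2')]
```

The spurious tuple shows that the join dependency over these three projections does not hold for this relation, so this decomposition is lossy.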
FURTHER NORMAL FORMS
• Join dependencies generalize multi valued
dependencies
⚫ lead to project-join normal form (PJNF) (also called fifth
normal form)
• A class of even more general constraints leads to a
normal form called domain-key normal form.
• Problem with these generalized constraints: they are hard to
reason with, and no sound and complete set of
inference rules exists.
• Hence rarely used
Summary
Unnormalised form (UNF)
    Remove repeating groups (ATOMIC)
First normal form (1NF)
    Remove partial dependencies
Second normal form (2NF)
    Remove transitive dependencies
Third normal form (3NF)
    Remove remaining functional dependency anomalies
Boyce-Codd normal form (BCNF)
    Remove multivalued dependencies
Fourth normal form (4NF)
    Remove remaining anomalies
Fifth normal form (5NF)
