
Normalization

Prepared by Harish Patnaik

School of Computer Engineering, KIIT Deemed to be University


Content

1. Functional dependency
2. Key
3. Entity set
4. Mapping Cardinalities
5. Extended E-R model
Functional Dependency
• It is a constraint between two sets of attributes in a
relation.
• If R is a relation, an FD X→Y between two sets of
attributes X and Y of R states that any two tuples t1
and t2 with t1[X] = t2[X] must also have
t1[Y] = t2[Y].
• If X is a candidate key of R, then X→Y for any subset
of attributes Y of R.
• X→Y in R does not imply Y→X.
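The definition above can be tried out on a small relation instance. The following is a minimal sketch; the helper name fd_holds and the sample rows are assumptions made for illustration and are not part of the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in rows (a list of dicts)."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Any two tuples that agree on lhs must also agree on rhs.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample data, chosen only to illustrate the definition.
students = [
    {"name": "Asha", "course": "CS2001", "grade": "A"},
    {"name": "Asha", "course": "IT2001", "grade": "B"},
    {"name": "Ravi", "course": "CS2001", "grade": "A"},
]

print(fd_holds(students, ["name", "course"], ["grade"]))  # True
print(fd_holds(students, ["grade"], ["name", "course"]))  # False
```

The second call illustrates the last bullet: name,course→grade holds in this instance, but grade→name,course does not. An instance can only refute an FD; whether an FD holds on the schema is a design decision.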
Functional Dependency
• Fully Functional Dependency X→Y
Y is fully functionally dependent on X if there is no Z,
where Z is a proper subset of X, such that Z→Y.
• Partial Dependency
If K is a candidate key of a relation R, X is a
proper subset of K, and X→Y, then Y is said to be
partially dependent on K.
Functional Dependency
Name→ contactno, address
course→course_dept
Name,course→ grade
• {Name, course} is a candidate key
• grade is fully functionally dependent on the candidate key
• contactno is partially dependent on the candidate key

Name Course grade contactno address course_dept


Normalization
• It is the process of analyzing a given relation
schema based on its FDs and primary keys to
achieve the desirable properties of
  • minimizing redundancy
  • minimizing insertion, deletion and update
  anomalies
• The normal form of a relation refers to the highest
normal form condition that it meets.
• Relation schemas that do not meet the normal
form tests are decomposed into smaller
relation schemas that meet the tests.
First normal form 1NF
• A relation schema is said to be in 1NF if the
values in the domain of each attribute of the
relation are atomic.
• Only one value is associated with each
attribute in a tuple.
• A database is in 1NF if every relation schema
is in 1NF.
Second normal form 2NF
• A relation schema is said to be in 2NF if it is in
1NF and if all non-prime attributes are fully
functionally dependent on the key of the
relation.
• A database is in 2NF if every relation schema
is in 2NF

rollno name age contactno DOB


2NF
• 2NF doesn’t permit partial dependency.
• It doesn’t rule out the possibility that a non-prime
attribute may be functionally dependent on
another non-prime attribute.
2NF
Q. For the given FDs of the relation R(A,B,C,D,E,F), if it is
in 1NF, check whether it is in 2NF or not.
F = { A → {B,C}, D → {E}, {A,D} → {F} }
A. {A}+ = {A,B,C}
{D}+ = {D,E}
{A,D}+ = {A,D,B,C,E,F}
So {A,D} is the candidate key.
But A → {B,C} and D → {E} are partial dependencies.
Hence R is not in 2NF.
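The closure computation used in this answer can be sketched in code. This is a brute-force illustration, not a prescribed algorithm from the slides; the function names closure, candidate_keys and partial_dependencies are assumptions, and each FD is represented as a (lhs, rhs) pair of sets.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            # Smaller keys are found first, so supersets of a key are skipped.
            if closure(attrs, fds) == schema and not any(k <= attrs for k in keys):
                keys.append(attrs)
    return keys


def partial_dependencies(schema, fds):
    """(key part, attribute) pairs where a non-prime attribute depends on a
    proper subset of a candidate key."""
    keys = candidate_keys(schema, fds)
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):
            for combo in combinations(sorted(key), size):
                part = set(combo)
                for attr in sorted(closure(part, fds) - part - prime):
                    found.append((combo, attr))
    return found


R = set("ABCDEF")
F = [({"A"}, {"B", "C"}), ({"D"}, {"E"}), ({"A", "D"}, {"F"})]

print(sorted(closure({"A"}, F)))                  # ['A', 'B', 'C']
print([sorted(k) for k in candidate_keys(R, F)])  # [['A', 'D']]
print(partial_dependencies(R, F))
# [(('A',), 'B'), (('A',), 'C'), (('D',), 'E')]
```

A non-empty result from partial_dependencies means the relation is not in 2NF, which matches the answer above. The search over all attribute subsets is exponential, so this only suits small classroom examples.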
Transitive Dependency

• Given a relation R, let X and Y be subsets of the
attributes of R and A be an attribute of R. If the FDs
{X→Y, Y→A} hold, then A is said to be transitively
dependent on X.

rollno name age contactno DOB

rollno→name,age,contactno
contactno → city
Here the transitive dependency is
rollno→contactno → city
Third normal form 3NF
• A relation schema is said to be in 3NF if it is in 2NF and
for all non-trivial FDs in F+ of the form X→Y, either X
contains a key or Y is a prime attribute.
• 3NF doesn’t permit partial or transitive dependency
• A database is in 3NF if every relation schema is in 3NF
empid ename desig deptno deptname

F= { empid→ename,desig,deptno,deptname
deptno → deptname }
There exists a transitive dependency
empid→deptno → deptname
Third normal form 3NF
So to convert into 3NF, the above relation is
decomposed into
empid ename desig deptno

F1= { empid→ename,desig,deptno}

deptno deptname

F2={deptno → deptname }
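The 3NF condition (X contains a key, or Y is a prime attribute) can be checked mechanically for this example. The sketch below is one possible way to do it; the helper names are assumptions, and it checks only the given FDs rather than enumerating all of F+.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure is the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if closure(attrs, fds) == schema and not any(k <= attrs for k in keys):
                keys.append(attrs)
    return keys


def violations_3nf(schema, fds):
    """(lhs, attribute) pairs where lhs is not a super key and the
    attribute on the right-hand side is not prime."""
    prime = set().union(*candidate_keys(schema, fds))
    bad = []
    for lhs, rhs in fds:
        is_super_key = closure(lhs, fds) == schema
        for attr in sorted(rhs - lhs):  # rhs - lhs skips the trivial part
            if not is_super_key and attr not in prime:
                bad.append((tuple(sorted(lhs)), attr))
    return bad


EMP = {"empid", "ename", "desig", "deptno", "deptname"}
F = [({"empid"}, {"ename", "desig", "deptno", "deptname"}),
     ({"deptno"}, {"deptname"})]

R1 = {"empid", "ename", "desig", "deptno"}
F1 = [({"empid"}, {"ename", "desig", "deptno"})]
R2 = {"deptno", "deptname"}
F2 = [({"deptno"}, {"deptname"})]

print(violations_3nf(EMP, F))  # [(('deptno',), 'deptname')]
print(violations_3nf(R1, F1))  # []
print(violations_3nf(R2, F2))  # []
```

The only violation reported for the original relation is deptno→deptname, the dependency that the decomposition moves into its own relation.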
3NF
rollno    regnno   course   grade
2350001   1234     CS2001   E
2350001   1234     IT2001   A
2350005   3456     IT2001   B
2350005   3456     CS2004   O

rollno,course→grade
regnno,course→grade
rollno→regnno
regnno→rollno

• rollno and regnno values are repeated.
• If any one of them needs to be changed, a typing mistake
may lead to inconsistency.
Boyce-Codd normal form BCNF
• A relation R with functional dependencies F is said to
be in BCNF if for every non-trivial FD in F+ of the form
X→Y, X is a super key of R.
• A database is in BCNF if every relation schema is in
BCNF
rollno name dept regnno

F= { rollno→name,dept
regnno→name,dept
rollno→regnno
regnno → rollno }
3NF to BCNF
time faculty subject roomno

F= { time,faculty→subject, roomno
subject→faculty }
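Here {time,faculty} is a candidate key, and so is {time,subject}.
In subject→faculty, faculty is a prime attribute, so the relation
is in 3NF; but subject is not a super key, so it is not in BCNF.
So it is decomposed into two relations.

R1: time faculty roomno
F1= {time,faculty→ roomno}

R2: faculty subject
F2= {subject→faculty }

But {time,faculty→ subject} is lost.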
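Both claims in this example, the BCNF violation and the lost dependency, can be checked with attribute closures. The following is a small sketch; the function names are assumptions, and F1 and F2 are taken exactly as listed above.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Non-trivial FDs in fds whose left-hand side is not a super key."""
    return [(sorted(lhs), sorted(rhs)) for lhs, rhs in fds
            if not rhs <= lhs and closure(lhs, fds) != schema]


def is_implied(fd, fds):
    """True if fd follows from fds, i.e. rhs lies in the closure of lhs."""
    lhs, rhs = fd
    return rhs <= closure(lhs, fds)


R = {"time", "faculty", "subject", "roomno"}
F = [({"time", "faculty"}, {"subject", "roomno"}),
     ({"subject"}, {"faculty"})]
F1 = [({"time", "faculty"}, {"roomno"})]
F2 = [({"subject"}, {"faculty"})]

print(bcnf_violations(R, F))  # [(['subject'], ['faculty'])]
print(is_implied(({"time", "faculty"}, {"subject"}), F1 + F2))  # False
print(is_implied(({"time", "faculty"}, {"roomno"}), F1 + F2))   # True
```

Since time,faculty→subject does not follow from F1 ∪ F2, it can no longer be enforced by checking the two relations separately.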
Multivalued Dependency
tuple ename pname dname

t1 James P Bob

t2 James Q Wills

t3 James P Wills

t4 James Q Bob

• An employee may work on several projects.
• An employee may have many dependents.
• Projects and dependents are independent.
MVD
Defn: Given a relation R with subsets of attributes X and Y, the
multivalued dependency X ↠ Y holds in R if, whenever two
tuples t1 and t2 exist in R such that t1[X] = t2[X], two tuples t3
and t4 also exist in R with the following properties:
✓ t1[X] = t2[X] = t3[X] = t4[X]
✓ t1[Y] = t3[Y] and t2[Y] = t4[Y]
✓ t1[Z] = t4[Z] and t2[Z] = t3[Z], where Z = R − (X ∪ Y)
As per the definition, when X ↠ Y holds in R, so does X ↠ Z.
An MVD X ↠ Y is called trivial if
i) Y is a subset of X
or ii) X ∪ Y = R
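The definition can be tested directly on the employee relation from the previous slide. This is a minimal sketch; the function name mvd_holds and the tuple-based row layout are assumptions made for illustration.

```python
def mvd_holds(rows, attrs, x, y):
    """Check the MVD X ->> Y on rows (tuples whose columns follow attrs)."""
    z = [a for a in attrs if a not in x and a not in y]
    index = {a: i for i, a in enumerate(attrs)}

    def proj(row, names):
        return tuple(row[index[a]] for a in names)

    present = {(proj(r, x), proj(r, y), proj(r, z)) for r in rows}
    for x1, y1, _ in present:
        for x2, _, z2 in present:
            # For t1 and t2 agreeing on X, a tuple with t1's Y-values and
            # t2's Z-values must also be in the relation.
            if x1 == x2 and (x1, y1, z2) not in present:
                return False
    return True


attrs = ("ename", "pname", "dname")
emp = [
    ("James", "P", "Bob"),
    ("James", "Q", "Wills"),
    ("James", "P", "Wills"),
    ("James", "Q", "Bob"),
]

print(mvd_holds(emp, attrs, ["ename"], ["pname"]))      # True
print(mvd_holds(emp, attrs, ["ename"], ["dname"]))      # True
print(mvd_holds(emp[:3], attrs, ["ename"], ["pname"]))  # False
```

Dropping the last tuple breaks the MVD because the combination (James, Q, Bob) required by the definition is then missing.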
Fourth normal form 4NF
• A relation schema is said to be in 4NF with respect to a
set of dependencies F (which includes FDs and MVDs)
if it is in 3NF and, for every non-trivial multivalued
dependency X ↠ Y in F+, X is a super key.

R1: ename pname
ename ↠ pname

R2: ename dname
ename ↠ dname

• Both of these subrelations are in 4NF because the
MVDs in R1 and R2 are trivial and no other MVDs or
FDs hold in R1 and R2.
Lossless join & Dependency Preserving
decomposition
sname city pincode

Richard BBSR 751013

Martin CTC 761006

Ayush BBSR 751005

Prerana RKL 712010

sname → {city, pincode}
pincode → {city}
Lossless join decomposition
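R1
sname     city
Richard   BBSR
Martin    CTC
Ayush     BBSR
Prerana   RKL

R2
city   pincode
BBSR   751013
CTC    761006
BBSR   751005
RKL    712010

R1 ⋈ R2
sname     city   pincode
Richard   BBSR   751013
Richard   BBSR   751005
Martin    CTC    761006
Ayush     BBSR   751013
Ayush     BBSR   751005
Prerana   RKL    712010

The join contains two spurious tuples, (Richard, BBSR, 751005) and
(Ayush, BBSR, 751013), that are not in the original relation, so this
decomposition is lossy.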
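The join above can be reproduced with a few lines of code. The sketch below projects the original relation onto (sname, city) and (city, pincode) and joins the projections back; the helper names project and natural_join are assumptions made for illustration.

```python
def project(rows, attrs):
    """Projection of rows (a list of dicts) onto attrs, as a set of tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}


def natural_join(left, left_attrs, right, right_attrs):
    """Natural join of two sets of tuples; returns (attributes, tuples)."""
    common = [a for a in left_attrs if a in right_attrs]
    extra = [a for a in right_attrs if a not in common]
    out_attrs = list(left_attrs) + extra
    joined = set()
    for l in left:
        lrow = dict(zip(left_attrs, l))
        for r in right:
            rrow = dict(zip(right_attrs, r))
            # Tuples combine only if they agree on all common attributes.
            if all(lrow[a] == rrow[a] for a in common):
                merged = {**lrow, **rrow}
                joined.add(tuple(merged[a] for a in out_attrs))
    return out_attrs, joined


students = [
    {"sname": "Richard", "city": "BBSR", "pincode": 751013},
    {"sname": "Martin", "city": "CTC", "pincode": 761006},
    {"sname": "Ayush", "city": "BBSR", "pincode": 751005},
    {"sname": "Prerana", "city": "RKL", "pincode": 712010},
]

r1 = project(students, ["sname", "city"])
r2 = project(students, ["city", "pincode"])
out_attrs, joined = natural_join(r1, ["sname", "city"], r2, ["city", "pincode"])

original = project(students, out_attrs)
print(len(joined))                 # 6
print(sorted(joined - original))
# [('Ayush', 'BBSR', 751013), ('Richard', 'BBSR', 751005)]
```

The two extra tuples arise because BBSR appears with two different pincodes, so the common attribute city cannot tell the original rows apart.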
Lossless join & Dependency Preserving
decomposition
R1          F1 = {A→B}
A  B
1  2
2  2
3  2
4  3

R2          F2 = {A→C}
A  C
1  3
2  3
3  3
4  4

R1 ⋈ R2     F = {A→B, B→C, A→C}
A  B  C
1  2  3
2  2  3
3  2  3
4  3  4
Lossless join & Dependency Preserving
decomposition
Defn: A decomposition of a relation R (with functional
dependencies F) into relations Ri (1<=i<=n, n>=2) is said to be
Lossless if, for every instance of R that satisfies the FDs in F,
the natural join of its projections onto the Ri gives back the
original relation. Otherwise the decomposition is called Lossy.

Defn: A decomposition of a relation R (with functional
dependencies F) into relations Ri (1<=i<=n, n>=2) with FDs F1,
F2, ..., Fn is said to be Dependency Preserving if the closure
of F’, where F’ = F1 U F2 U ... U Fn, is identical to F+, i.e. F’+ = F+.
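Both definitions can be checked from the FDs alone for the R(A,B,C) example with R1(A,B) and R2(A,C). The sketch below uses two commonly taught tests, which are not stated in the slides: a two-way decomposition is lossless if R1 ∩ R2 functionally determines R1 or R2, and an FD is preserved if its right-hand side can be reached by repeatedly taking closures restricted to each Ri. The function names are assumptions.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def is_lossless_binary(r1, r2, fds):
    """Two-way test: the common attributes determine all of r1 or all of r2."""
    common_closure = closure(r1 & r2, fds)
    return r1 <= common_closure or r2 <= common_closure


def preserves(fd, parts, fds):
    """True if fd is implied by the projections of fds onto the parts."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            # Attributes derivable while staying inside this part.
            gained = closure(result & part, fds) & part
            if not gained <= result:
                result |= gained
                changed = True
    return rhs <= result


F = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"A"}, {"C"})]
R1, R2 = {"A", "B"}, {"A", "C"}

print(is_lossless_binary(R1, R2, F))              # True
print(preserves(({"A"}, {"B"}), [R1, R2], F))     # True
print(preserves(({"B"}, {"C"}), [R1, R2], F))     # False
```

So this decomposition is lossless but not dependency preserving: B→C cannot be derived from F1 ∪ F2.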
Join Dependency (JD)

A B C

CS white Navathe

CS black Korth

CS white Korth

MBA white Weston

MBA black Weston


Join Dependency
R1 ← π A,B (R)
A     B
CS    white
CS    black
MBA   white
MBA   black

R2 ← π A,C (R)
A     C
CS    Navathe
CS    Korth
MBA   Weston

R’ ← R1 ⋈ R2
A     B       C
CS    white   Navathe
CS    white   Korth
CS    black   Navathe
CS    black   Korth
MBA   white   Weston
MBA   black   Weston
Join Dependency
R3 ← π B,C (R)
B       C
white   Navathe
black   Korth
white   Korth
white   Weston
black   Weston

R ← R3 ⋈ R’
A     B       C
CS    white   Navathe
CS    black   Korth
CS    white   Korth
MBA   white   Weston
MBA   black   Weston
Join Dependency

Defn: Given a relation R with a set of projections
{R1, R2, ....., Rn}, the relation R satisfies the join
dependency *[R1, R2, ....., Rn] if and only if the join of the
projections Ri, 1<=i<=n, is equal to R.
R = π R1(R) * π R2(R) * ....... * π Rn(R)

Defn: A join dependency *[R1, R2, ....., Rn] of relation R is
said to be trivial if one of the projections (Ri) is equal to R
itself.
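The join dependency definition can be checked on the A, B, C relation used in the preceding slides. This is a minimal sketch; the function names and the representation of each row as a frozenset of (attribute, value) pairs are assumptions made for illustration.

```python
def project(rows, attrs):
    """Projection; each row is a frozenset of (attribute, value) pairs."""
    return {frozenset((a, v) for a, v in row if a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of rows in the frozenset representation."""
    joined = set()
    for l in left:
        values = dict(l)
        for r in right:
            # Rows combine only if they agree on every shared attribute.
            if all(values.get(a, v) == v for a, v in r):
                joined.add(l | r)
    return joined


def satisfies_jd(rows, components):
    """True if joining the projections onto components gives rows back."""
    result = None
    for attrs in components:
        part = project(rows, attrs)
        result = part if result is None else natural_join(result, part)
    return result == rows


data = [
    ("CS", "white", "Navathe"),
    ("CS", "black", "Korth"),
    ("CS", "white", "Korth"),
    ("MBA", "white", "Weston"),
    ("MBA", "black", "Weston"),
]
R = {frozenset(zip("ABC", t)) for t in data}

print(satisfies_jd(R, [{"A", "B"}, {"A", "C"}]))              # False
print(satisfies_jd(R, [{"A", "B"}, {"A", "C"}, {"B", "C"}]))  # True
```

Joining only the (A,B) and (A,C) projections yields the extra tuple (CS, black, Navathe); adding the (B,C) projection removes it, so this instance satisfies *[AB, AC, BC] but not *[AB, AC].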
Fifth normal form 5NF
• A relation R is said to be in 5NF with respect to a set of
dependencies F (which includes FDs, MVDs and JDs) if it is
in 4NF and if each join dependency *[R1, R2, ....., Rn]
satisfies one of the following conditions:
i) the join dependency is trivial
or ii) each of the projections Ri is a super key of R
