
Chapter 9
NORMALIZATION
INTRODUCTION
The normalization process was first proposed by Codd in 1972. Initially
Codd proposed three normal forms, which he called first, second and third
normal form. A stronger definition of third normal form was later proposed
by Boyce and Codd and is known as Boyce-Codd Normal Form (BCNF). All these
forms are based on the functional dependencies among the attributes of a
relation. Later, fourth and fifth normal forms were proposed, based on the
concepts of multivalued dependencies and join dependencies respectively.
Normalization is a design technique that is widely used in designing
relational models. In order to design a relational database, we have to
decide the logical structure of the database. The logical structure is
designed so that basic operations on the database such as Insert, Update,
Delete or Retrieve can be performed without any problems.

Definition
Normalization of data can be defined as a process during which redundant
relation schemas are decomposed by breaking up their attributes into smaller
relation schemas that possess desirable properties.

Objectives of Normalization
The objectives of the normalization process are:
To create a formal framework for analyzing relation schemas based on their
keys and on the functional dependencies among their attributes.
To obtain powerful relational retrieval algorithms based on a collection of
primitive relational operators.
To free relations from undesirable insertion, update and deletion
anomalies.
To reduce the need for restructuring the relations as new types of data
are introduced.
To carry out a series of tests on individual relation schemas so that the
relational database can be normalized to some degree. When a test fails,
the relation violating that test must be decomposed into relations that
individually meet the normalization tests.
The entire normalization process is based upon the analysis of relations,
their schemas, their primary keys and their functional dependencies.
E.F. Codd proposed three normal forms known as first, second and third
normal form.
9.1 FUNDAMENTALS OF NORMALIZATION
Relational database tables, whether they are derived from ER models or
from some other design method, sometimes suffer from serious problems in
terms of performance, integrity and maintainability.
For example, when the entire database is defined as a single large table,
it can result in a large amount of redundant data and lengthy searches for
just a small number of target rows.
It can also result in long and expensive updates, and deletions in
particular can result in the elimination of useful data as an unwanted
side effect.
Consider a relation SALES in which product, order no., customer name,
address and sales person name are all stored in a single table.

SALES
Product Name    Order No.  Cust Name  Cust Address  Credit  Date      Sale Person Name
Vacuum cleaner  1458       Ekem       Patiala               5.8.2007  Baljit
Computer        1492       Srithi     Samana                8.8.2007  Simar
Vacuum cleaner  1492       Hiten      Bathinda      8       8.8.2008  Ekemijot
Computer        1499       Mayank     Patiala       9       9.1.2008  Harsimran
Calculator      1503       Hiten      Nabha         6       9.2.2008  Govind
In this table we see that certain products are repeated, which will lead
to wasted storage space.
A query such as "Which customers ordered the vacuum cleaner in the month
of August?" would require a search of the entire table.
Updating the address of Hiten is also difficult, because every row that
mentions Hiten has to be found and changed.
If we want to delete the item purchased by Hiten, which Order No. is to be
deleted, 1492 or 1499? Such problems cause a lot of confusion.
To handle these types of problems the concept of normalization came into
existence.
Normalization is a process of breaking up a single large table into small
tables which are free from some of these problems. With normalization we
try to achieve the following:
1. Minimum redundancy.
2. Insertion, deletion and updation of an attribute can take place very
easily, without affecting the other attributes.
3. It provides the database designer with a formal framework for analyzing
relation schemas based on their keys and on the functional dependencies
among their attributes.
4. A series of tests can be carried out on individual relation schemas so
that the database can be normalized to any degree. When a test fails, the
relation violating that test must be decomposed into relations that
individually meet the normalization tests.
Normal forms, when considered in isolation from other factors, do not
guarantee a good database design.
Before proceeding further with the normal forms we will discuss some
important concepts which will be used in them.
1. Prime Attribute: The prime attribute is the attribute which acts as a
primary key. With the help of the primary key we can obtain the
information of all other non primary key attributes. Consider the relation
STUDENT, in which Roll No. is the prime attribute.

STUDENT
Name    Roll No. (P.K.)    Address    Number

STUDENT is a relation with the attributes Name, Roll No., Address and
Number. Roll No. is the primary key. By knowing the Roll No., we can get
all other information of the student.
2. Candidate Key: When a relation schema has more than one "minimal" key,
each of them is called a candidate key.
3. Non Prime Attribute: An attribute is called a non prime attribute if it
is not part of the primary key of the table.
4. Super Key: If we add an additional attribute to a primary key and the
resulting key also acts as a primary key, then such a key is called a
super key.
9.2 FUNCTIONAL DEPENDENCIES
Functional dependence plays a very important role in DBMS. Functional
dependence is a relationship that exists between any two fields: one field
is called the determinant and the other is called the determined.
For each value of the determinant there is associated one and only one
value of the determined.
If X is the determinant and Y is the determined, then we say that X
functionally determines Y, which is graphically represented as X -> Y.
Equivalently, Y is functionally determined by X.

X     Y
10    2
15    3
20    4
25    5
30    6

With each value of X there is associated one and only one value of Y.
Example: The following table illustrates a case in which Y is not
functionally dependent on X.

X     Y
5
10    2
5     3
5     4
10    20

This is because for X = 5 there is more than one value of Y, and for
X = 10 there is also more than one value of Y. In a table that satisfies
X -> Y the value of Y may repeat, but the same value of X must never
appear with different values of Y.
Functional dependency can also be defined as follows. An attribute in a
relational database model is said to be functionally dependent on another
attribute in the table if it can take only one value for a given value of
the attribute upon which it is functionally dependent.
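
The definition above can be checked mechanically. The following is a minimal Python sketch, not part of the original text: the helper name `fd_holds` and the dictionary layout of the rows are assumptions made for illustration, and only the fully legible rows of the two X/Y tables are used.

```python
def fd_holds(rows, x, y):
    """Return True if column x functionally determines column y in rows."""
    seen = {}  # value of x -> the single value of y seen with it so far
    for row in rows:
        if row[x] in seen and seen[row[x]] != row[y]:
            return False  # same x appears with two different y values
        seen.setdefault(row[x], row[y])
    return True


# Rows taken from the first table (X determines Y).
table_ok = [{"X": 10, "Y": 2}, {"X": 15, "Y": 3},
            {"X": 25, "Y": 5}, {"X": 30, "Y": 6}]

# Rows taken from the second table (X = 5 and X = 10 each have two Y values).
table_bad = [{"X": 10, "Y": 2}, {"X": 5, "Y": 3},
             {"X": 5, "Y": 4}, {"X": 10, "Y": 20}]

print(fd_holds(table_ok, "X", "Y"))   # True
print(fd_holds(table_bad, "X", "Y"))  # False
```

Note that such a check can only show that a dependency is violated by the current data; whether a dependency is meant to hold in general is a design decision.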
Example: Consider the database STUDENT having the attributes Roll No.,
Name, Sex and Address.
STUDENT [Roll No., Name, Sex, Address]

STUDENT
Roll No.  Name   Sex  Address
901       Ram    M    Khanna
902       Sham   M    Amritsar
903       Sita   F    Morinda
910       David  M    Bathinda
911       Ram    M    Patiala

Here Roll No. is the determinant.
[Name, Sex, Address] is the determined.
If we know the Roll No. we can get all of the other information. The Roll
No. should not repeat; repetition of this attribute will lead to
confusion. Name, Sex and Address are functionally dependent on Roll No.

Dependence Diagram
ROLL NO. -> NAME
ROLL NO. -> SEX
ROLL NO. -> ADDRESS

Given a set of functional dependencies F on a relation schema R, certain
other functional dependencies can be shown to hold on R as well; these are
said to be logically implied by F. Suppose we are given a relation schema
R = (A, B, C, G, H, I) and the set of functional dependencies
A -> B
A -> C
CG -> H
CG -> I
B -> H
The functional dependency A -> H is logically implied. That is, we can
show that whenever our given set of functional dependencies holds on a
relation, A -> H must also hold on the relation. Suppose t1 and t2 are
tuples such that
t1[A] = t2[A]
Since we are given that A -> B, it follows from the definition of
functional dependency that
t1[B] = t2[B]
Then, since we are given that B -> H, it follows from the definition of
functional dependency that
t1[H] = t2[H]
Therefore we have proved that whenever t1 and t2 are tuples such that
t1[A] = t2[A], it must be that t1[H] = t2[H]. That is exactly the
definition of the functional dependency A -> H.
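
The same conclusion can be reached by computing the closure of A, that is, the set of all attributes determined by A under the given dependencies. The sketch below is an illustration only; it uses the standard attribute-closure procedure, which the text itself does not describe, and the function name `closure` and the encoding of each attribute as a single letter are assumptions.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the whole left side is already known, add the right side.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Dependencies given for R = (A, B, C, G, H, I); "CG" means {C, G}.
F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]

print(sorted(closure("A", F)))    # ['A', 'B', 'C', 'H']
print("H" in closure("A", F))     # True, so A -> H is logically implied
```

Since H belongs to the closure of A, the dependency A -> H follows from F, which agrees with the tuple-based argument above.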

9.3 MULTIVALUED DEPENDENCY


A Multivalued Dependency (MVD) occurs when two or more independent
multivalued facts about the same attribute occur within the same table.
It means that if, in a relation R having X, Y and Z as attributes, Y and
Z are multivalued facts about X, this is represented as
X ->> Y and X ->> Z
The MVD exists only if Y and Z are independent of each other and both are
dependent on X.
The following example shows where an MVD does and does not exist in a
table. In R1, the first four rows satisfy all the conditions of the MVDs,
i.e. X ->> Y and X ->> Z.

[Table R1, with columns X, Y and Z; seven rows, the last of which is (3, 3, 3)]
The fifth and sixth rows of R1 (when the X value is 2) satisfy the row
interchange conditions in the preceding definition. In both rows the
value of Y is 2, so interchanging the Y values changes nothing and the
MVD condition is not violated. The seventh row (3, 3, 3) satisfies the
definition trivially.
In table R2
[Table R2, with columns X, Y and Z]
the Y values of the fifth and sixth rows are different, and interchanging
the 1 and 2 values for Y results in a row (2, 2, 2) that does not appear
in the table. Thus in R2 there is no MVD between X and Y or between X and
Z, even though the first four rows satisfy the MVD conditions.
In table R3
[Table R3, with columns X, Y and Z]
the first three rows do not satisfy the criterion for an MVD, since
changing Y from 1 to 2 in the second row results in a row that does not
appear in the table. Similarly, changing Z from 1 to 2 in the third row
results in a row that does not appear. Therefore, R3 does not have any
MVD between X and Y or between X and Z.
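
The interchange condition can be stated compactly: for every value of X, the (Y, Z) pairs that occur with it must be every combination of the Y values and the Z values that occur with it. The following Python sketch checks exactly that. The function name `mvd_holds` and both sample relations are hypothetical; they are not the tables R1, R2 and R3 of the text.

```python
from itertools import product


def mvd_holds(rows):
    """rows are (x, y, z) tuples; check X ->> Y (and hence X ->> Z)."""
    groups = {}  # x value -> set of (y, z) pairs seen with it
    for x, y, z in rows:
        groups.setdefault(x, set()).add((y, z))
    for pairs in groups.values():
        ys = {y for y, _ in pairs}
        zs = {z for _, z in pairs}
        # Y and Z are independent only if every combination is present.
        if pairs != set(product(ys, zs)):
            return False
    return True


# Hypothetical relation in which every Y/Z combination exists for each X.
r_with_mvd = [(1, 1, 1), (1, 1, 2), (1, 2, 1), (1, 2, 2), (3, 3, 3)]

# Hypothetical relation in which the row (1, 2, 2) is missing.
r_without_mvd = [(1, 1, 1), (1, 1, 2), (1, 2, 1)]

print(mvd_holds(r_with_mvd))     # True
print(mvd_holds(r_without_mvd))  # False
```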

9.4 TRANSITIVE DEPENDENCIES


Assume that A, B and C are sets of attributes of a relation R. Further
assume that the following functional dependencies are satisfied
simultaneously: A -> B, B -/-> A, B -> C, A -> C and C -/-> A, where -/->
means "does not functionally determine". Observe that C -> B is neither
prohibited nor required.
If all these conditions are true, we say that attribute C is transitively
dependent on attribute A.
It should be clear that these functional dependencies determine the
conditions for having a transitive dependency of attribute C on attribute
A. If any of these functional dependencies is not satisfied, then
attribute C is not transitively dependent on attribute A.
The diagram shown below summarizes these conditions. In this diagram the
arrows are equivalent to the symbol that we use for denoting functional
dependency. Notice that A -> C may not be explicitly indicated, but it
holds true due to the Transitivity axiom.

A -> B -> C
Figure 9.1

The requirements that B -/-> A (B does not functionally determine A) and
C -/-> A (C does not functionally determine A) are necessary to ensure
that B and C are nonprime attributes.
9.5 FORMS OF NORMALIZATION
9.5.1 First Normal Form (1NF)
A relation is said to be in 1NF if and only if every entry of the
relation has at most a single value. In other words, a relation is said
to be in 1NF if and only if all the underlying domains contain atomic
(single) values only.
Consider a relation Salary.

Salary
Name       Basic Pay  DA         HRA   Total
Ram        8000       5000       200   13200
           8000       3000             11000
Ram, Gita  6000       2000,2000  200   8200

In this relation some of the cells are empty, i.e. there is no value in
them, and in some cells more than one value exists. So it is not in 1NF.
To be in 1NF this relation must look like:

Salary
Name   Basic Pay  DA    HRA   Total
Ram    8000       5000  200   13200
Sham   8000       3000  2000  13000
Ram    6000       2000  200   8200
Gita   6000       2000  200   8200
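
As a small illustration of removing multi-valued cells, the Python sketch below splits the "Ram, Gita" row of the Salary relation into atomic rows. The helper name `to_1nf` is hypothetical, and the sketch assumes that values inside a cell are separated by commas and that every multi-valued cell in a row holds the same number of values.

```python
def to_1nf(row):
    """Split a row whose cells may hold comma-separated values into atomic rows."""
    cells = {col: [v.strip() for v in str(val).split(",")]
             for col, val in row.items()}
    n = max(len(values) for values in cells.values())
    # A single-valued cell is copied into every resulting row.
    return [{col: (values[i] if len(values) > 1 else values[0])
             for col, values in cells.items()}
            for i in range(n)]


unnormalized = {"Name": "Ram, Gita", "Basic Pay": "6000",
                "DA": "2000,2000", "HRA": "200", "Total": "8200"}

atomic_rows = to_1nf(unnormalized)
for r in atomic_rows:
    print(r)
```

Empty cells cannot be repaired this way; the missing values have to be supplied from the real data.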
Anomalies in 1NF
There is redundancy in a 1NF relation, which will lead to a variety of
data anomalies.
(i) Insertion is very difficult.
(ii) Deletion cannot be done cleanly. If we want to delete the record of
Ram, two records will be deleted, so it is very difficult to delete.
(iii) If we want to update the record of Ram, it is very difficult to
decide which record is to be updated.
9.5.2 Second Normal Form (2NF)
A relation is said to be in 2NF if:
(i) It is in 1NF.
(ii) Every non key attribute is fully dependent on the primary key.

[Figure: a relation with the attributes NO., PROJ#, HOURS, P HANDLER,
P NAME and LOCATION and its functional dependencies (fd)]

CONVERTED INTO 2NF
(NO., PROJ#, HOURS)
(NO., P HANDLER)
(PROJ#, P NAME, LOCATION)

Now every non key attribute is fully functionally dependent on the
primary key. This has overcome the problem of updation, but we still have
problems in this normal form.
Problems in 2NF
(i) Insertion. We cannot insert the fact about a particular project
handler for a particular project.
(ii) Deletion. If we delete the tuple which contains the No., it will
delete not only the information for the concerned project but also the
information of the handler in Table I.
(iii) Updation. To change the value is very difficult. To overcome this
problem we move to the next normal form, i.e. 3NF.
9.5.3 Third Normal Form (3NF)
A relation is said to be in third normal form (3NF) if and only if:
(i) It is in 2NF.
(ii) Every non key attribute is non transitively dependent on the
primary key.
A relation in which Status depends on City, and City in turn depends on
X, has the problem of transitivity. It can be split into the following
two relations:

X     City            City       Status
X1    Ambala          Ambala     30
X2    Patiala         Patiala    40
X3    Malerkotla      Bathinda   35
X4    Sangrur         Sangrur    45
X5    Bathinda

Thus, by such relations, we have removed the transitivity.
9.5.3.1 Rule to transform a relation into Third Normal Form
Third normal form requires that every non-prime attribute of a table be
dependent on the primary key. In other words, there should not be a case
in which a non-prime attribute is determined by another non-prime
attribute. So this transitive functional dependency should be removed
from the table, and the table must also be in second normal form. For
example, consider a table with the following fields.

A -> B -> C   Convert to   (A, B) and (B, C)
Figure 9.2.
Student_Detail Table: Student_id, Student_name, DOB, Street, City, State, Zip
In this table Student_id is the primary key, but Street, City and State
depend upon Zip. The dependency between Zip and the other fields is
called a transitive dependency. Hence, to apply 3NF, we need to move
Street, City and State to a new table, with Zip as primary key.
New Student_Detail Table: Student_id, Student_name, DOB, Zip
Address Table: Zip, Street, City, State
The advantages of removing transitive dependency are:
Amount of data duplication is reduced.
Data integrity is achieved.
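
The decomposition of Student_Detail amounts to two projections. The Python sketch below shows the idea; the helper name `project` and the three sample students are hypothetical and serve only to show that the address details of a shared Zip are stored once after the split.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        key = tuple(row[c] for c in columns)
        if key not in seen:
            seen.add(key)
            out.append(dict(zip(columns, key)))
    return out


# Hypothetical data: the first two students share the same Zip.
student_detail = [
    {"Student_id": 1, "Student_name": "Ram", "DOB": "2001-01-05",
     "Street": "Mall Road", "City": "Patiala", "State": "Punjab", "Zip": "147001"},
    {"Student_id": 2, "Student_name": "Sita", "DOB": "2001-07-19",
     "Street": "Mall Road", "City": "Patiala", "State": "Punjab", "Zip": "147001"},
    {"Student_id": 3, "Student_name": "David", "DOB": "2000-11-30",
     "Street": "Lake Road", "City": "Bathinda", "State": "Punjab", "Zip": "151001"},
]

new_student_detail = project(student_detail, ["Student_id", "Student_name", "DOB", "Zip"])
address = project(student_detail, ["Zip", "Street", "City", "State"])

print(len(new_student_detail))  # 3 rows, one per student
print(len(address))             # 2 rows, one per Zip
```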
9.5.3.2 Data Anomalies in 3NF Relations
The 3NF helped us to get rid of the data anomalies caused either by
transitive dependencies on the primary key or by dependencies of a
nonprime attribute on another nonprime attribute.
Relations in 3NF are still susceptible to data anomalies, particularly
when the relations have two overlapping candidate keys or when a nonprime
attribute functionally determines a prime attribute.
The following example will illustrate this.
Example: Consider the Manufacturer relation shown below, where each
manufacturer has a unique ID and name. Manufacturers produce items
(identified by their unique item numbers) in the amounts indicated.
Manufacturers may produce more than one item and different manufacturers
may produce the same items.
Manufacturer (Id_No, Name, Item_No, Quantity)

Manufacturer
Id No  Name              Item No  Quantity
M101   Electronics USA   H3772    1000
M101   Electronics USA   J08732   700
M101   Electronics USA   Y23490   200
M322   Electronics-R-Us  H3772    900
This Manufacturer relation has two candidate keys, (ID, Item_No) and
(Name, Item_No), that overlap on the attribute Item_No. The relation is
in 3NF because there is only one nonprime attribute, and therefore it is
impossible for this attribute to determine another nonprime attribute.
The relation Manufacturer is susceptible to update anomalies. Consider,
for example, the case in which one of the manufacturers changes its name.
If the value of this attribute is not changed in all of the corresponding
tuples, there is the possibility of having an inconsistent database.
9.5.4 BCNF
BCNF is stricter than 3NF. A relation that is in BCNF must be in 3NF, but
the converse is not true: a relation that is in 3NF need not necessarily
be in BCNF.
BCNF states that:
A relation R is in BCNF if and only if every determinant is a candidate
key.
Here a determinant is a simple attribute or composite attribute on which
some other attribute is fully functionally dependent (FFD).
For example, Qty is FFD on (Sno, Pno):
(Sno, Pno) -> Qty
(Sno, Pno) is a composite determinant.
Sno -> Sname
Here Sno is a simple attribute determinant.
In order to show the difference between 3NF and BCNF we consider the
overlapping of candidate keys. Two candidate keys overlap if they involve
two or more attributes each and have an attribute in common.
For example, in relation Invoice:

Invoice
Invoice No.  Name           Item No.  Qty
101          Patiala Steel  3275      100
101          Patiala Steel  3371      20
101          Patiala Steel  7312      500
102          Cotton Mill    1274      600

Here Name is unique for each Invoice No.
FD of the relation Invoice is
(Invoice No., Item No.) -> Qty
(Name, Item No.) -> Qty
Invoice No. -> Name
Name -> Invoice No.
This relation has two overlapping candidate keys, because there are two
composite candidate keys (Invoice No., Item No.) and (Name, Item No.), out
of which Item No. is the common attribute in both the candidate keys, so
this is a case of overlapping of candidate keys.
Here the relation is in 3NF because every non key attribute is
non-transitively fully functionally dependent on the primary key.
In the above relation, there is only one non key attribute, i.e. Qty, and
it is FFD and non transitively dependent on the primary key. Name and
Invoice No. are not non-key attributes because they can participate in
the primary key.
But the Invoice relation is not in BCNF because this relation has four
determinants:
(Invoice No., Item No.)
(Name, Item No.)
Invoice No.
Name
Out of these four determinants, the two determinants (Invoice No., Item
No.) and (Name, Item No.) are unique, but the determinants Invoice No.
and Name are not candidate keys. In order to bring this relation into
BCNF we decompose it into
ID-Name (Invoice No., Name)
and
ID-Qty (Invoice No., Item No., Qty)
ID-Name has two determinants, Invoice No. and Name, and both are unique.
ID-Qty has one determinant, (Invoice No., Item No.), and it is also
unique. Now each relation is in BCNF.
9.5.4.1 Similarities between 3NF and BCNF
The relations which are not in 3NF are also not in BCNF. The relations
achieved after application of 3NF can also be achieved by BCNF.
For example, the relations COURSE_STUDENT and STUDENT_SYSTEM_CHARGE,
which are not in 3NF, are also not in BCNF.
COURSE_STUDENT (Course_Code, Rollno, Name, System_Used, Hourly_Rate, Total_Hours)
Here,
(Course_Code, Rollno) -> Total_Hours
Rollno -> Name | System_Used | Hourly_Rate
Here, Rollno is a determinant but not a candidate key, so relation
COURSE_STUDENT is not in BCNF.
In relation STUDENT_SYSTEM_CHARGE (Rollno, Name, System_Used, Hourly_Rate):
Rollno -> Name | System_Used | Hourly_Rate
System_Used -> Hourly_Rate
Here, System_Used is also a determinant but it is not a candidate key, so
relation STUDENT_SYSTEM_CHARGE is not in BCNF.
Now, consider the COURSE, HOUR_ASSIGNED, STUDENT_SYSTEM and CHARGE
relations discussed earlier, which are in 3NF.
COURSE (Course_Code, Course_Name, Teacher_Name)
HOUR_ASSIGNED (Course_Code, Rollno, Total_Hours)
STUDENT_SYSTEM (Rollno, Name, System_Used)
CHARGE (System_Used, Hourly_Rate)
These relations are also in BCNF, because
Course_Code -> Course_Name | Teacher_Name   (in relation COURSE)
(Course_Code, Rollno) -> Total_Hours        (in relation HOUR_ASSIGNED)
Rollno -> Name | System_Used                (in relation STUDENT_SYSTEM)
System_Used -> Hourly_Rate                  (in relation CHARGE)
Here, each determinant is unique in its corresponding relation.
In conclusion, we can say that relations which have only a single
candidate key can be normalized with both 3NF and BCNF without any problem.
9.5.4.2 Differences in 3NF and BCNF
In order to show the difference between 3NF and BCNF, relations that have
overlapping candidate keys are considered in detail.
Overlapping of Candidate keys
Two candidate keys overlap if they involve two or more attributes each
and have an attribute in common.
For example, in Manufacturer relation:
Manufacturer (Id_No, Name, Item_No, Quantity)

Manufacturer
Id No  Name              Item No  Quantity
M101   Electronics USA   H3772    1000
M101   Electronics USA   J08732   700
M101   Electronics USA   Y23490   200
M322   Electronics-R-Us  H3772    900
Here, Name is considered unique for each Id_No.
FD of above relation is
(Id_No, Item_No) -> Quantity
(Name, Item_No) -> Quantity
Id_No -> Name
Name -> Id_No
This relation has two overlapping candidate keys, because there are two
composite candidate keys (Id_No, Item_No) and (Name, Item_No), out of
which Item_No is the common attribute in both the candidate keys, so this
is a case of overlapping of candidate keys.
Possible FD diagram of this case is:

(Id_No, Item_No) -> Quantity      (Name, Item_No) -> Quantity
Id_No -> Name                     Name -> Id_No
Figure 9.3.

Here, the relation is in 3NF, because every non-key attribute is
non-transitively fully functionally dependent on the primary key.
In the above relation, there is only one non-key attribute, i.e. Quantity,
and it is FFD and non transitively dependent on the primary key.
Name and Id_No are not non-key attributes because they can participate in
the primary key, as shown in the FD diagram.
But the Manufacturer relation is not in BCNF because this relation has
four determinants:
(Id_No, Item_No)
(Name, Item_No)
Id_No
Name
Example
For example, consider a relation
SSP (Sno, Sname, Pno, Qty)
Here, Sname is considered unique for each Sno.
FD of above relation is
(Sno, Pno) -> Qty
(Sname, Pno) -> Qty
Sno -> Sname
Sname -> Sno
This relation has two overlapping candidate keys, because there are two
composite candidate keys (Sno, Pno) and (Sname, Pno), out of which Pno is
the common attribute in both the candidate keys, so this is a case of
overlapping of candidate keys.
Possible FD diagram of this case is:

(Sno, Pno) -> Qty      (Sname, Pno) -> Qty
Sno -> Sname           Sname -> Sno
Figure 9.4

Here, the relation is in 3NF, because every non-key attribute is
non-transitively fully functionally dependent on the primary key.
In the above relation, there is only one non-key attribute, i.e. Qty, and
it is FFD and non transitively dependent on the primary key.
Sname and Sno are not non-key attributes because they can participate in
the primary key, as shown in the FD diagram.
But the SSP relation is not in BCNF because this relation has four
determinants:
(Sno, Pno)
(Sname, Pno)
(Sno)
(Sname)
Out of these four determinants, the two determinants (Sno, Pno) and
(Sname, Pno) are unique, but the determinants Sno and Sname are not
candidate keys.
In order to bring this relation into BCNF we non-loss decompose it into
two projections, SN (Sno, Sname) and SP (Sno, Pno, Qty).
The SN relation has two determinants, Sno and Sname, and both are unique.
SP has one determinant, (Sno, Pno), and it is also unique.
These two relations (SN, SP) remove all anomalies of the SSP relation.
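
The BCNF test for SSP and its projections can be sketched in Python as follows. This is an illustration under stated assumptions: the function names `closure` and `bcnf_violations` are hypothetical, and a determinant is accepted when its closure covers the whole relation (a superkey test), which coincides with "is a candidate key" for the minimal determinants listed above.

```python
def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(schema, fds):
    """Return the determinants that do not determine the whole schema."""
    return [lhs for lhs, _ in fds if not closure(lhs, fds) >= set(schema)]


ssp = {"Sno", "Sname", "Pno", "Qty"}
ssp_fds = [({"Sno", "Pno"}, {"Qty"}), ({"Sname", "Pno"}, {"Qty"}),
           ({"Sno"}, {"Sname"}), ({"Sname"}, {"Sno"})]

sn, sn_fds = {"Sno", "Sname"}, [({"Sno"}, {"Sname"}), ({"Sname"}, {"Sno"})]
sp, sp_fds = {"Sno", "Pno", "Qty"}, [({"Sno", "Pno"}, {"Qty"})]

print(bcnf_violations(ssp, ssp_fds))  # [{'Sno'}, {'Sname'}]
print(bcnf_violations(sn, sn_fds))    # []
print(bcnf_violations(sp, sp_fds))    # []
```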
5NF is of little practical use to the database designer, but it is of
interest from a theoretical point of view and a discussion of it is
included here to complete the picture of the further normal forms.
In all of the further normal forms discussed so far, no loss decomposition
was achieved by the decomposing of a single table into two separate
tables. No loss decomposition is possible because of the availability of
the join operator as part of the relational model. In considering 5NF,
consideration must be given to tables where this non-loss decomposition
can only be achieved by decomposition into three or more separate tables.
Such decomposition is not always possible, as is shown by the following
example.
Consider the table
AGENT_COMPANY_PRODUCT (Agent, Company, Product_Name)
This table lists agents, the companies they work for and the products
they sell for those companies. The agents do not necessarily sell all the
products supplied by the companies they do business with. An example of
this table might be:

Agent  Company  Product_Name
Savy   ABC      Nut
Savy   ABC      Screw
Savy   CDE      Bolt
Vicky  ABC      Bolt

The table is necessary in order to show all the information required.
Savy, for example, sells ABC's Nuts and Screws, but not ABC's Bolts.
Vicky is not an agent for CDE and does not sell ABC's Nuts or Screws. The
table is in 4NF because it contains no multi-valued dependency. It does,
however, contain an element of redundancy in that it records the fact
that Savy is an agent for ABC twice. But there is no way of eliminating
this redundancy without losing information. Suppose that the table is
decomposed into its two projections, P1 and P2.
P1
Agent  Company
Savy   ABC
Savy   CDE
Vicky  ABC

P2
Agent  Product_Name
Savy   Nut
Savy   Screw
Savy   Bolt
Vicky  Bolt

The redundancy has been eliminated, but the information about which
companies make which products and which of these products they supply to
which agents has been lost. The natural join of these projections over
the 'agent' columns is:

Agent  Company  Product_Name
Savy   ABC      Nut
Savy   ABC      Screw
Savy   ABC      Bolt   *
Savy   CDE      Nut    *
Savy   CDE      Screw  *
Savy   CDE      Bolt
Vicky  ABC      Bolt

The table resulting from this join is spurious, since the asterisked rows
of the table contain incorrect information. Now suppose that the original
table were to be decomposed into three tables, the two projections P1 and
P2 which have already been shown, and the final possible projection P3.

P3
Company  Product_Name
ABC      Nut
ABC      Screw
ABC      Bolt
CDE      Bolt

If a join is taken of all three projections, first of P1 and P2 (giving
the spurious result shown above), and then of this result with P3 over
the 'Company' and 'Product_Name' columns, the following table is obtained:

Agent  Company  Product_Name
Savy   ABC      Nut
Savy   ABC      Screw
Savy   ABC      Bolt   *
Savy   CDE      Bolt
Vicky  ABC      Bolt

This still contains a spurious row. The order in which the joins are
performed makes no difference to the final result. It is simply not
possible to decompose the AGENT_COMPANY_PRODUCT table, as populated,
without losing information. Thus, it has to be accepted that it is not
possible to eliminate all redundancies using normalization techniques,
because it cannot be assumed that all decompositions will be non-loss.
But now consider the different case where, if an agent is an agent for a
company and that company makes a product, then he always sells that
product for the company. Under these circumstances, the
AGENT_COMPANY_PRODUCT table is as shown below:

Agent  Company  Product_Name
Savy   ABC      Nut
Vicky  ABC      Bolt
Vicky  ABC      Nut
Savy   CDE      Bolt
Savy   ABC      Bolt

The assumption is that ABC makes both Nuts and Bolts and that CDE makes
Bolts only. This table can be decomposed into its three projections
without loss of information, as demonstrated below:
P1
Agent  Company
Savy   ABC
Savy   CDE
Vicky  ABC

P2
Agent  Product_Name
Savy   Nut
Savy   Bolt
Vicky  Bolt
Vicky  Nut

P3
Company  Product_Name
ABC      Nut
ABC      Bolt
CDE      Bolt

All redundancy has been removed. If the natural join of P1 and P2 is
taken, the result is:

Agent  Company  Product_Name
Savy   ABC      Nut
Savy   ABC      Bolt
Savy   CDE      Nut   *
Savy   CDE      Bolt
Vicky  ABC      Bolt
Vicky  ABC      Nut

The spurious row is asterisked. Now, if this result is joined with P3
over the columns 'company' and 'product_name', the following table is
obtained:

Agent  Company  Product_Name
Savy   ABC      Nut
Savy   ABC      Bolt
Savy   CDE      Bolt
Vicky  ABC      Bolt
Vicky  ABC      Nut

This is a correct recomposition of the original table, and a no loss
decomposition into three projections was achieved. Again, the order in
which the joins are performed does not affect the final result. The
original table violated 5NF simply because it was non-loss decomposable
into its three projections.
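
Both cases can be reproduced with a few lines of Python. The sketch below projects a table onto (Agent, Company), (Agent, Product_Name) and (Company, Product_Name) and joins the three projections again; the function name `join_of_projections` and the representation of rows as tuples are assumptions made for illustration.

```python
def join_of_projections(rows):
    """Project (agent, company, product) rows three ways and re-join them."""
    p1 = {(a, c) for a, c, p in rows}   # Agent, Company
    p2 = {(a, p) for a, c, p in rows}   # Agent, Product_Name
    p3 = {(c, p) for a, c, p in rows}   # Company, Product_Name
    # Join P1 and P2 over agent, then keep rows whose (company, product) is in P3.
    return {(a, c, p) for a, c in p1 for a2, p in p2
            if a == a2 and (c, p) in p3}


first_case = {("Savy", "ABC", "Nut"), ("Savy", "ABC", "Screw"),
              ("Savy", "CDE", "Bolt"), ("Vicky", "ABC", "Bolt")}

second_case = {("Savy", "ABC", "Nut"), ("Vicky", "ABC", "Bolt"),
               ("Vicky", "ABC", "Nut"), ("Savy", "CDE", "Bolt"),
               ("Savy", "ABC", "Bolt")}

# Rows produced by the join that were not in the original table.
print(join_of_projections(first_case) - first_case)    # {('Savy', 'ABC', 'Bolt')}
print(join_of_projections(second_case) == second_case)  # True
```

In the first case the join produces one row that was never in the table, so the decomposition loses information; in the second case the join returns exactly the original rows.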
STUDY OF DATA
Normalization: Process of efficiently organizing data into RELATIONS
(attributes grouped together) that give an accurate representation of
data, relationships and constraints.
Goal:
Eliminate redundant data in a database.
Ensure data dependencies make sense.
Guidelines for ensuring that databases are normalized are the normal
forms: 1NF, 2NF, 3NF, BCNF.
Normalization: Series of tests on a relation to determine whether it
satisfies or violates the requirements of a normal form. The ultimate aim
is to meet practical business requirements.
Normalization: A technique for producing a set of relations with
desirable properties, given the data requirements of an enterprise.
Redundant Relation
Staff Relation
Staff No.  S Name  S Address  Position  Salary  Branch No.
101        Ram     Ludhiana   Clerk     30000   5000
102        Sham    Patiala    Manager   50000   5001
101        Ram     Ludhiana   Clerk     30000   5000

Branch Relation
Branch No.  Branch Address  Telephone No.
5000        Ludhiana        367546
5001        Patiala         234569
5002        Rajpura         456780

Staff Branch Relation
S No.  S Name  S Address  Position  Salary  Branch No.  Branch Address  Telephone No.
       Ram     Ludhiana   Clerk     30000   5000        Ludhiana        367546
       Sham    Patiala    Manager   50000   5001        Patiala         234569
                          Peon      10000   5002        Rajpura         456780
PROCESS OF NORMALIZATION
Un Normalized form (UNF): A table that contains one or more repeating
groups.
Repeating group: An attribute or group of attributes within a table that
occurs with multiple values for a single occurrence of the nominated key
attributes of that table. In the relation below, some of the cells also
have no value.

Staff Relation
Staff No.  S Name  S Address         Position  Salary  Branch
101        Ram     Ludhiana          Clerk             5000
102        Sham                      Manager   50000   5001
103                Rajpura, Patiala  Peon      10000   5002

The relation is not in 1NF because some of the cells have no value and
some have double values.
To convert UNF into 1NF the relation will be:

Staff No.  S Name  S Address  Position  Salary  Branch No.
101        Ram     Ludhiana   Clerk     30000   5000
102        Sham    Patiala    Manager   50000   5001
101        Ram     Ludhiana   Clerk     30000   5000

First normal form (1NF): A relation in which the intersection of each row
and column contains one and only one value.

UNF -> 1NF:
Remove repeating groups:
Entering appropriate data in the empty columns of rows.
Placing repeating data, along with a copy of the original key
attribute(s), in a separate relation.
Identifying a primary key for each of the new relations.

Second normal form (2NF):
A relation that is in 1NF and in which every non-primary key attribute is
fully functionally dependent on the primary key.
Note: Applies to relations with composite keys (primary key composed of
two or more attributes). A relation with a single attribute primary key
is in at least 2NF.
Full functional dependency: If A and B are attributes of a relation, B is
fully functionally dependent on A if B is functionally dependent on A but
not on any proper subset of A.
A -> B is a partial dependency if there is some attribute that can be
removed from A and the dependency still holds.
Customer_Rental Relation
Customer No.  CName          Property No.  PAddress                    Rent Start    Rent Finish   Rent  Owner No.  OName
CR76          John Kay       PG4           6 Lawrence Street, Glasgow  1 July 1993   31 Aug. 1995  350   CO40       Tina Murphy
CR76          John Kay       PG16          5 Novar Drive, Glasgow      1 Sept. 1995  1 Sept. 1996  450   CO93       Tony Shaw
CR56          Aline Stewart  PG4           6 Lawrence Street, Glasgow  1 Sept. 1992  10 June 1993  350   CO40       Tina Murphy
CR56          Aline Stewart  PG36          2 Manor Road, Glasgow       10 Oct. 1993  1 Dec. 1994   375   CO93       Tony Shaw
CR56          Aline Stewart  PG16          5 Novar Drive, Glasgow      1 Jan 1995    10 Aug. 1995  450   CO93       Tony Shaw

Functional Dependencies in Customer_Rental Relation
Customer_No., Property_No. -> RentStart, RentFinish
Customer_No. -> CName
Property_No. -> PAddress, Rent, Owner_No., OName
Owner_No. -> OName

1NF -> 2NF:
Remove partial dependencies:
The functionally dependent attributes are removed from the relation by
placing them in a new relation along with a copy of their determinant.
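
The partial dependencies of the Customer_Rental relation can be found by computing, for each proper subset of the composite key, the non-key attributes that it determines. The following Python sketch does this; the function names and the underscore spelling of the attribute names are assumptions made for illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return the set of attributes functionally determined by attrs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) - set(key)
            if determined:
                found[subset] = determined
    return found


key = {"Customer_No", "Property_No"}
fds = [
    ({"Customer_No", "Property_No"}, {"RentStart", "RentFinish"}),
    ({"Customer_No"}, {"CName"}),
    ({"Property_No"}, {"PAddress", "Rent", "Owner_No", "OName"}),
    ({"Owner_No"}, {"OName"}),
]

partial = partial_dependencies(key, fds)
for determinant, attrs in partial.items():
    # Each entry becomes a new relation: the determinant plus its attributes.
    print(determinant, sorted(attrs))
```

Each entry of the result corresponds to one new relation of the 2NF design, made of the determinant and the attributes it determines; RentStart and RentFinish stay with the full key.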
