
Data Normalization

Normalization

Database normalization is a technique for designing relational
database tables to minimize duplication of information and to
safeguard the database against certain types of logical or
structural problems, namely data anomalies.

Objectives



Data normalization aims to arrive at records that avoid:
Repetition of data
Update anomalies
Insert anomalies
Delete anomalies

Student
Stud_Id  Stud_name   Address          Module_id  Module  Instructor  Dept
S1001    Smith Tom   Main Street      1          DBT     Williams    10
S1001    Smith Tom   Main Street      2          DCN     Wilson      10
S1001    Smith Tom   Main Street      3          SE      Sam         20
S1060    Jones Mary  Mountain Shadow  1          DBT     Williams    10
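
To make the anomalies concrete, here is a minimal Python sketch over the rows of the Student table. The replacement instructor name "Brown" is invented for illustration and does not come from the table.

```python
# Rows of the unnormalised Student table.
COLUMNS = ("Stud_Id", "Stud_name", "Address", "Module_id",
           "Module", "Instructor", "Dept")
student = [
    dict(zip(COLUMNS, row))
    for row in [
        ("S1001", "Smith Tom", "Main Street", 1, "DBT", "Williams", 10),
        ("S1001", "Smith Tom", "Main Street", 2, "DCN", "Wilson", 10),
        ("S1001", "Smith Tom", "Main Street", 3, "SE", "Sam", 20),
        ("S1060", "Jones Mary", "Mountain Shadow", 1, "DBT", "Williams", 10),
    ]
]

# Update anomaly: the DBT instructor is stored in two rows, so changing
# only one of them leaves the table contradicting itself.
student[0]["Instructor"] = "Brown"  # hypothetical new instructor
dbt_instructors = {r["Instructor"] for r in student if r["Module"] == "DBT"}

# Delete anomaly: removing student S1001 also removes every trace of the
# modules that only S1001 was taking.
remaining = [r for r in student if r["Stud_Id"] != "S1001"]
modules_left = {r["Module"] for r in remaining}

print(dbt_instructors)  # two different instructors for the same module
print(modules_left)     # DCN and SE have disappeared
```

The insert anomaly is the mirror image: a new module cannot be recorded in this table until at least one student takes it, because every row needs a Stud_Id.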
The Process of Normalization
Usually three steps (in industry) giving rise to
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
In academia
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Steps in Data Normalization
UNNORMALISED ENTITY
step 1 ... eliminate repeating groups
1st NORMAL FORM
step 2 ... eliminate partial dependencies
2nd NORMAL FORM
step 3 ... eliminate transitive dependencies
3rd NORMAL FORM
step 4 ... make every determinant a candidate key
BOYCE-CODD NORMAL FORM
step 5 ... remove multivalued dependencies
4th NORMAL FORM
Attributes - Repeating Groups
When a group of attributes has multiple values then we say there is a
repeating group of attributes in the entity
COMPANY  NAME     ADDRESS      BRANCH NAME  BRANCH ADDRESS
A123     ABC Ltd  100 High St  ABC1         Manchester
                               ABC2         London
                               ABC3         Glasgow
(BRANCH_NAME, BRANCH_ADDRESS) is a repeating group
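
One way to remove such a repeating group (the step that leads to first normal form) is to move it into rows of its own, each carrying the key of the parent record. The sketch below uses the company data from the table; the function name `remove_repeating_group` and the lower-case field names are illustrative choices, not part of the original material.

```python
# One COMPANY record with a repeating group of branches.
company = {
    "company": "A123",
    "name": "ABC Ltd",
    "address": "100 High St",
    "branches": [("ABC1", "Manchester"), ("ABC2", "London"), ("ABC3", "Glasgow")],
}

def remove_repeating_group(record, group_field, group_columns, key_field):
    """Split a record into a parent row and one child row per group entry."""
    parent = {k: v for k, v in record.items() if k != group_field}
    children = [
        {key_field: record[key_field], **dict(zip(group_columns, entry))}
        for entry in record[group_field]
    ]
    return parent, children

company_row, branch_rows = remove_repeating_group(
    company, "branches", ("branch_name", "branch_address"), "company"
)

print(company_row)
for row in branch_rows:
    print(row)
```

Every value in `company_row` and in each element of `branch_rows` is now single valued, and the `company` field links each branch back to its company.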
Functional Dependency

Consider a relation R that has two attributes A and B. The attribute B
of the relation is functionally dependent on the attribute A if and only
if each value of A is associated with no more than one value of B.
In other words, the value of attribute A uniquely determines the
value of B.

Stud_id -> stud_name
Stud_id -> Date of Birth
Module_id -> Module name
Marks -> grade
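
The definition can be tested against sample data. The following is a minimal sketch: `fd_violations` is a hypothetical helper name, and the rows are taken from the Student table earlier in these notes. Note that sample rows can only refute a functional dependency; whether it really holds is a statement about the business rules, not about one data set.

```python
from collections import defaultdict

def fd_violations(rows, lhs, rhs):
    """Return the lhs values that are associated with more than one rhs value."""
    seen = defaultdict(set)
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        seen[determinant].add(dependent)
    return {key: vals for key, vals in seen.items() if len(vals) > 1}

rows = [
    {"Stud_Id": "S1001", "Stud_name": "Smith Tom", "Module_id": 1, "Module": "DBT"},
    {"Stud_Id": "S1001", "Stud_name": "Smith Tom", "Module_id": 2, "Module": "DCN"},
    {"Stud_Id": "S1001", "Stud_name": "Smith Tom", "Module_id": 3, "Module": "SE"},
    {"Stud_Id": "S1060", "Stud_name": "Jones Mary", "Module_id": 1, "Module": "DBT"},
]

# Stud_Id -> Stud_name is not contradicted by the sample: no violations.
print(fd_violations(rows, ["Stud_Id"], ["Stud_name"]))

# Stud_Id -> Module does not hold: S1001 is associated with three modules.
print(fd_violations(rows, ["Stud_Id"], ["Module"]))
```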

Full Functional Dependency
Let A and B be distinct collections of attributes from a relation R.
B is fully functionally dependent on A if B is functionally dependent
on A but not on any proper subset of A.
(Stud_id, Module_id) -> marks
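
A small sketch of this test in Python follows. The helper names `holds` and `fully_dependent` and the marks in the sample rows are made up for illustration. As with any check on sample data, it can show that a dependency is only partial but cannot prove that a full dependency holds in general.

```python
from itertools import combinations

def holds(rows, lhs, rhs):
    """True if every lhs value in rows maps to exactly one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

def fully_dependent(rows, lhs, rhs):
    """True if lhs -> rhs holds and no proper subset of lhs determines rhs."""
    if not holds(rows, lhs, rhs):
        return False
    for size in range(len(lhs)):  # sizes 0 .. len(lhs) - 1: proper subsets only
        for subset in combinations(lhs, size):
            if holds(rows, subset, rhs):
                return False
    return True

# Hypothetical sample data; the marks are invented for this example.
rows = [
    {"Stud_id": "S1001", "Module_id": 1, "Stud_name": "Smith Tom", "marks": 72},
    {"Stud_id": "S1001", "Module_id": 2, "Stud_name": "Smith Tom", "marks": 65},
    {"Stud_id": "S1060", "Module_id": 1, "Stud_name": "Jones Mary", "marks": 58},
]

key = ("Stud_id", "Module_id")
print(fully_dependent(rows, key, ("marks",)))      # full dependency
print(fully_dependent(rows, key, ("Stud_name",)))  # partial: Stud_id alone suffices
```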

First Normal Form
A relation is in 1NF if and only if every attribute is single valued for
each tuple or row.

Equivalently, a relation is in 1NF if and only if there are no
repeating groups of attribute values.


ORDER NUMBER: 1023        SUPPLIER NUMBER: 500028
ORDER DATE: 09/05/88      DELIVERY DATE: 25/07/88

PART NO.  PART-DESC  QTY-ORD  PRICE
O463      Hook       150      15.00
1492      Bolt       1000     10.00
3164      Spanner    10        5.00
                     TOTAL    30.00
PURCHASE-ORDER (ORDER#, SUPPLIER#, ORDER-DATE,
DELIVERY-DATE, (PART#, PART-DESCRIPTION,
QUANTITY-ORDERED, PRICE), TOTAL-PRICE)
UN-NORMALISED ENTITY TYPE
Example of First Normal Form
Example in 1NF
PURCHASE-ORDER (ORDER#, SUPPLIER#, ORDER-DATE,
DELIVERY-DATE, TOTAL-PRICE)
PURCHASE-ITEM-1 (ORDER#, PART#, PART-DESCRIPTION,
QUANTITY-ORDERED, PRICE)
ENTITY TYPES IN 1NF
Example
REGISTRATION FORM
STUDENT NUMBER: S0843215
STUDENT NAME: P. Smith
STUDENT ADDRESS: 1, South Downs, Hale

COURSE NO  COURSE     TUTOR NAME  TUTOR NO
PM951      Computing  T. Long     037428
S212       Biology    S. Short    096524
STUDENT (Student#, student-name, student-address)

ENROLMENT (Student#, Course#, course-title,
tutor-name, tutor-staff#)
1st Normal Form
Process results in separation of different objects
BUT anomalies may still exist
PURCHASE-ITEM-1 (ORDER#, PART#, PART-DESCRIPTION,
QUANTITY-ORDERED, PRICE)
PART-DESCRIPTION is repeated on every PURCHASE-ITEM-1 occurrence
for the same part.
This may result in anomalies when updating or deleting records.
The problem in the example is that PART-DESCRIPTION is
functionally dependent only on PART#, which is just part of the
identifier (ORDER#, PART#).
Second Normal Form


A relation is in 2NF if and only if it is in 1NF and all the
non-key attributes are fully functionally dependent on the
key.

To transform an entity type in 1NF to 2NF:
Identify functional dependencies
Re-write entity types so that each non-identifying attribute is
functionally dependent on the whole of the identifier
Example
PURCHASE-ORDER (ORDER#, SUPPLIER#, ORDER-DATE,
DELIVERY-DATE, TOTAL-PRICE)
PURCHASE-ITEM-1 (ORDER#, PART#, PART-DESCRIPTION,
QUANTITY-ORDERED, PRICE)
ENTITY TYPES IN 1NF
Functional Dependencies
PURCHASE-ORDER (ORDER#, SUPPLIER#, ORDER-DATE,
DELIVERY-DATE, TOTAL-PRICE)
PURCHASE-ITEM-1 (ORDER#, PART#, PART-DESCRIPTION,
QUANTITY-ORDERED, PRICE)

(ORDER#, PART#) -> QUANTITY-ORDERED, PRICE
PART# -> PART-DESCRIPTION
In 2nd Normal Form
Decompose PURCHASE-ITEM into two entity types
PURCHASE-ITEM (Order#, Part#, Quantity-Ordered, Price)
PART (Part#, Part-Description)
Original entity types are decomposed into three entity types in
2nd normal form
PURCHASE-ORDER (Order#, Supplier#, Order-Date, Delivery-Date,
Total-Price)
PURCHASE-ITEM (Order#, Part#, Quantity-Ordered, Price)
PART (Part#, Part-Description)
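
The decomposition can be sketched as two projections of the 1NF rows. In the Python sketch below, the part data comes from the purchase order form, while the order numbers 1 and 2 and the whole second order are invented so that one part appears on more than one order. The helper name `project` is also illustrative.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples (order preserved)."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result

COLUMNS = ("Order#", "Part#", "Part-Description", "Quantity-Ordered", "Price")
purchase_item_1 = [
    dict(zip(COLUMNS, row))
    for row in [
        (1, "O463", "Hook", 150, 15.00),
        (1, "1492", "Bolt", 1000, 10.00),
        (1, "3164", "Spanner", 10, 5.00),
        (2, "1492", "Bolt", 500, 5.00),  # hypothetical second order
    ]
]

# 2NF: attributes that depend on the whole key stay in PURCHASE-ITEM,
# the attribute that depends on Part# alone moves to PART.
purchase_item = project(purchase_item_1,
                        ["Order#", "Part#", "Quantity-Ordered", "Price"])
part = project(purchase_item_1, ["Part#", "Part-Description"])

# Joining the two relations on Part# reproduces the original rows,
# so no information was lost by the decomposition.
description_of = {p["Part#"]: p["Part-Description"] for p in part}
rejoined = [
    {**item, "Part-Description": description_of[item["Part#"]]}
    for item in purchase_item
]

print(len(part))                    # "Bolt" is now stored once
print(rejoined == purchase_item_1)
```

After the split, a part description is changed in exactly one PART row, however many orders mention that part.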
Example in 2NF
STUDENT (Student#, Student-Name, Student-Address)
ENROLMENT (Student#, Course#, Tutor-Name, Tutor-Staff#)
COURSE (Course#, Course-Title)
ENTITY TYPES IN 2NF
Third Normal Form
A relation is in 3NF if and only if it is in 2NF and no
non-key attribute is transitively dependent on the key.


To transform an entity type in 2NF to 3NF:
Determine functional dependencies between non-identifying
attributes
Decompose the entity into new entities
Example
STUDENT (Student#, Student-Name, Student-Address)
ENROLMENT (Student#, Course#, Tutor-Name, Tutor-Staff#)
COURSE (Course#, Course-Title)
ENTITY TYPES IN 2NF
Functional Dependencies
STUDENT (Student#, Student-Name, Student-Address)
ENROLMENT (Student#, Course#, Tutor-Name, Tutor-Staff#)
COURSE (Course#, Course-Title)
(Student#, Course#) -> Tutor-Staff#
Tutor-Staff# -> Tutor-Name
Example in 3NF
STUDENT (Student#, Student-Name, Student-Address)
ENROLMENT (Student#, Course#, Tutor-Staff#)
COURSE (Course#, Course-Title)
TUTOR (Tutor-Staff#, Tutor-Name)
ENTITY TYPES IN 3NF
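
As a rough illustration, the four 3NF entity types can be declared as tables and the registration form rebuilt with a join. The sketch uses Python's built-in sqlite3 module. The snake_case table and column names are assumptions, as is the choice of (Student#, Course#) as the key of ENROLMENT; the data comes from the registration form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (
    student_no      TEXT PRIMARY KEY,
    student_name    TEXT,
    student_address TEXT
);
CREATE TABLE course (
    course_no    TEXT PRIMARY KEY,
    course_title TEXT
);
CREATE TABLE tutor (
    tutor_staff_no TEXT PRIMARY KEY,
    tutor_name     TEXT
);
CREATE TABLE enrolment (
    student_no     TEXT REFERENCES student(student_no),
    course_no      TEXT REFERENCES course(course_no),
    tutor_staff_no TEXT REFERENCES tutor(tutor_staff_no),
    PRIMARY KEY (student_no, course_no)
);
INSERT INTO student VALUES ('S0843215', 'P. Smith', '1, South Downs, Hale');
INSERT INTO course VALUES ('PM951', 'Computing'), ('S212', 'Biology');
INSERT INTO tutor VALUES ('037428', 'T. Long'), ('096524', 'S. Short');
INSERT INTO enrolment VALUES
    ('S0843215', 'PM951', '037428'),
    ('S0843215', 'S212', '096524');
""")

# Rebuild the lines of the registration form from the four tables.
form_lines = conn.execute("""
    SELECT s.student_name, c.course_no, c.course_title, t.tutor_name
    FROM enrolment AS e
    JOIN student AS s ON s.student_no = e.student_no
    JOIN course  AS c ON c.course_no = e.course_no
    JOIN tutor   AS t ON t.tutor_staff_no = e.tutor_staff_no
    ORDER BY c.course_no
""").fetchall()

for line in form_lines:
    print(line)
```

Each tutor name is stored once in `tutor`, so renaming a tutor is a single-row update regardless of how many enrolments refer to that tutor.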
Example of Normal Forms
Let Work be a relation scheme that stores information about projects
in a large business organization.

Work (PROJNAME, PROJMGR, STARTDATE, EMPID, HOURS, EMPNAME,
BUDGET, SALARY, EMPDEPT, EMPMGR, RATING)
Assumptions:
1. Each project has a unique name, but names of employees are
not unique.
2. Each project has one manager, whose name is stored in
PROJMGR.
3. Many employees may be assigned to work on each project, and an
employee may be assigned to more than one project. HOURS tells the
number of hours per week that a particular employee is assigned to
work on a particular project.

4. BUDGET stores the amount budgeted for a project, and a
STARTDATE gives the starting date for a project.

5. SALARY gives the annual salary of an employee.
6. EMPMGR gives the name of the employee's manager, who is not the
same as the project manager.

7. EMPDEPT gives the employee's department. Department names
are unique. The employee's manager is the manager of the employee's
department.

8. RATING gives the employee's rating for a particular project. The
project manager assigns the rating at the end of the employee's work
on that project.
Functional Dependencies
PROJNAME -> PROJMGR, BUDGET, STARTDATE
EMPID -> EMPNAME, SALARY, EMPMGR, EMPDEPT
PROJNAME, EMPID -> HOURS, RATING
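
These dependencies are enough to confirm that (PROJNAME, EMPID) is a key of Work. The sketch below uses the standard attribute-closure procedure: start from a set of attributes and keep adding everything the dependencies let you derive. The function name `closure` and the set-based representation are illustrative choices.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole determinant, we gain its dependents.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

WORK = {"PROJNAME", "PROJMGR", "STARTDATE", "EMPID", "HOURS", "EMPNAME",
        "BUDGET", "SALARY", "EMPDEPT", "EMPMGR", "RATING"}

FDS = [
    ({"PROJNAME"}, {"PROJMGR", "BUDGET", "STARTDATE"}),
    ({"EMPID"}, {"EMPNAME", "SALARY", "EMPMGR", "EMPDEPT"}),
    ({"PROJNAME", "EMPID"}, {"HOURS", "RATING"}),
]

# The pair determines every attribute, so it is a key of Work.
print(closure({"PROJNAME", "EMPID"}, FDS) == WORK)

# Neither attribute alone is enough, so the key is minimal.
print(sorted(closure({"PROJNAME"}, FDS)))
print(sorted(closure({"EMPID"}, FDS)))
```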
Analysis of the sample FDs
1NF: With PROJNAME and EMPID as the composite key, each cell
contains a single value, so WORK is in 1NF.
The partial dependencies are
PROJNAME -> PROJMGR, BUDGET, STARTDATE
EMPID -> EMPNAME, SALARY, EMPMGR, EMPDEPT

Transform the relation WORK into an equivalent collection of 2NF
relations.
The relation schemes are:
PROJ (PROJNAME, PROJMGR, BUDGET, STARTDATE)
EMP (EMPID, EMPNAME, SALARY, EMPMGR, EMPDEPT)
WORK1 (PROJNAME, EMPID, HOURS, RATING)
The above relation schemes are in 2NF.


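3rd Normal Form
In EMP, EMPMGR depends on EMPID only through EMPDEPT (assumption 7
gives EMPDEPT -> EMPMGR), so this transitive dependency is removed by
splitting EMP into EMP1 and DEPT.
PROJ(PROJNAME, PROJMGR, BUDGET, STARTDATE)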

EMP1(EMPID, EMPNAME, SALARY, EMPDEPT)

DEPT(EMPDEPT, EMPMGR)

WORK1(PROJNAME, EMPID, HOURS, RATING)
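
The step from 2NF to 3NF can be checked mechanically with a simplified test. The sketch below assumes each relation has a single key and that the listed dependencies are the only ones. The dependency EMPDEPT -> EMPMGR is read off assumption 7, and the function name `non_key_dependencies` is hypothetical.

```python
def non_key_dependencies(attrs, key, fds):
    """List FDs inside one relation whose determinant does not contain the
    key and whose dependents include non-key attributes of that relation."""
    found = []
    for lhs, rhs in fds:
        dependents = (rhs & attrs) - key - lhs
        if lhs <= attrs and dependents and not key <= lhs:
            found.append((sorted(lhs), sorted(dependents)))
    return found

FDS = [
    ({"PROJNAME"}, {"PROJMGR", "BUDGET", "STARTDATE"}),
    ({"EMPID"}, {"EMPNAME", "SALARY", "EMPMGR", "EMPDEPT"}),
    ({"PROJNAME", "EMPID"}, {"HOURS", "RATING"}),
    ({"EMPDEPT"}, {"EMPMGR"}),  # assumption 7: a department has one manager
]

# The 2NF relation EMP still contains the transitive dependency.
EMP = ({"EMPID", "EMPNAME", "SALARY", "EMPMGR", "EMPDEPT"}, {"EMPID"})
print(non_key_dependencies(*EMP, FDS))

# Each relation of the 3NF design as (attributes, key).
DESIGN_3NF = {
    "PROJ": ({"PROJNAME", "PROJMGR", "BUDGET", "STARTDATE"}, {"PROJNAME"}),
    "EMP1": ({"EMPID", "EMPNAME", "SALARY", "EMPDEPT"}, {"EMPID"}),
    "DEPT": ({"EMPDEPT", "EMPMGR"}, {"EMPDEPT"}),
    "WORK1": ({"PROJNAME", "EMPID", "HOURS", "RATING"}, {"PROJNAME", "EMPID"}),
}
report = {name: non_key_dependencies(attrs, key, FDS)
          for name, (attrs, key) in DESIGN_3NF.items()}
print(report)  # no offending dependency remains in any relation
```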
