
Normalization

Part 1
What is Data Normalization?
• The process of decomposing relations with anomalies to produce smaller,
well-structured relations
Importance of Normalization
• To reduce anomalies
▫ The goal is a database that is free of update, insertion,
and deletion anomalies
Anomalies
Types of Anomaly
• An anomaly is an inconsistent, incomplete, or
contradictory state of the database
• An undesirable side effect
• Types of anomalies
▫ Insertion anomaly
▫ Deletion anomaly
▫ Update anomaly
Insertion Anomaly
• Inserting a partial record (a prime attribute is unknown at the time of insertion)
▫ A problem with composite PKs
• The user is unable to insert a new record when it should be possible to do so, because not all of the
other information is available.
▫ A new course, CUIT220 Oracle Programming, has been approved. Is it possible to enter this course?
▫ What about a new department that has no courses yet?

CourseCode  CourseName          DeptCode
CUIT201     Database            ICT
CUIT201     Database            Mechatronics
CUIT211     Web Design          ICT
CUIT220     Oracle Programming  ?
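The insertion anomaly above can be reproduced in a few lines. This is a minimal sketch using Python's built-in sqlite3 module. The composite primary key (CourseCode, DeptCode) and the lowercase table name `course` are assumptions made for illustration; the slides do not state the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed composite key (CourseCode, DeptCode). SQLite needs an explicit
# NOT NULL on each key column to reject missing key values.
conn.execute("""
    CREATE TABLE course (
        CourseCode TEXT NOT NULL,
        CourseName TEXT NOT NULL,
        DeptCode   TEXT NOT NULL,
        PRIMARY KEY (CourseCode, DeptCode)
    )
""")
conn.executemany(
    "INSERT INTO course VALUES (?, ?, ?)",
    [
        ("CUIT201", "Database", "ICT"),
        ("CUIT201", "Database", "Mechatronics"),
        ("CUIT211", "Web Design", "ICT"),
    ],
)


def try_insert(row):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        conn.execute("INSERT INTO course VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


# CUIT220 has no department yet, so part of the key is unknown (NULL).
inserted = try_insert(("CUIT220", "Oracle Programming", None))
row_count = conn.execute("SELECT COUNT(*) FROM course").fetchone()[0]
print(inserted, row_count)
```

The approved course cannot be recorded until a department is known, even though the course itself is fully described.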
Deletion Anomaly
• When a record is deleted, other important information
that is tied to it is also deleted.
▫ If the ICT department is deleted from the database, we lose the
information about course CUIT211, which is tied only to ICT.

CourseCode  CourseName  DeptCode
CUIT201     Database    ICT
CUIT201     Database    Mechatronics
CUIT211     Web Design  ICT
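As a rough illustration of the deletion anomaly, the sketch below removes every ICT row from the table and reports which courses disappear as a side effect. The helper name `courses` is hypothetical.

```python
rows = [
    ("CUIT201", "Database", "ICT"),
    ("CUIT201", "Database", "Mechatronics"),
    ("CUIT211", "Web Design", "ICT"),
]


def courses(table):
    """Set of course codes that still appear somewhere in the table."""
    return {code for code, _name, _dept in table}


# Deleting the ICT department means removing every row with DeptCode "ICT".
after_delete = [row for row in rows if row[2] != "ICT"]

# Courses that were known before the delete but are gone afterwards.
lost_courses = courses(rows) - courses(after_delete)
print(sorted(lost_courses))
```

CUIT201 survives because it is also offered by Mechatronics; CUIT211 does not, because its only row belonged to ICT.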
Update Anomaly
• A record is updated, but other occurrences of the same item are
not updated because the item is stored multiple times
▫ Suppose the course name for CUIT201 is changed from Database to
Database Management. How many times would you have to make this
change in the COURSE table in its current form if there were
100 instances of that course?
CourseCode  CourseName  DeptCode
CUIT201     Database    ICT
CUIT201     Database    Mechatronics
CUIT211     Web Design  ICT
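The update anomaly can be sketched the same way. The function `rename_course` and its `limit` parameter are hypothetical; `limit` simulates an update that stops before reaching every copy of the course name.

```python
rows = [
    {"CourseCode": "CUIT201", "CourseName": "Database", "DeptCode": "ICT"},
    {"CourseCode": "CUIT201", "CourseName": "Database", "DeptCode": "Mechatronics"},
    {"CourseCode": "CUIT211", "CourseName": "Web Design", "DeptCode": "ICT"},
]


def rename_course(table, code, new_name, limit=None):
    """Rename a course in place; `limit` caps how many rows are changed."""
    changed = 0
    for row in table:
        if row["CourseCode"] == code and (limit is None or changed < limit):
            row["CourseName"] = new_name
            changed += 1
    return changed


def names_for(table, code):
    """All distinct names currently stored for one course code."""
    return {row["CourseName"] for row in table if row["CourseCode"] == code}


# A partial update touches only one of the two CUIT201 rows.
changed = rename_course(rows, "CUIT201", "Database Management", limit=1)
inconsistent = names_for(rows, "CUIT201")
print(changed, sorted(inconsistent))
```

After the partial update the same course code carries two different names, which is exactly the contradictory state the slide describes.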
Resolving Anomalies
• So, a table (relation) is a stable (‘good’) table only if
it is free from any of these anomalies at any point
in time.
• You have to ensure that each and every table in a
database is always free from these modification
anomalies.
▫ How do you ensure that?
• ‘Normalization’ theory helps.
Exercise: describe the update anomalies in the table

Plant    Manager  Machine  Supplier-Name     Supplier-City
Plant-A  Rungata  Lath     Maromba hardware  Harare
Plant-A  Rungata  Boiler   ABC industry      Harare
Plant-B  Moyo     Cutter   Mweras Machinery  Masvingo
Plant-B  Moyo     Boiler   Mweras Machinery  Masvingo
Plant-B  Moyo     CNC      Maromba hardware  Harare
Normalization
First Normal Form (1NF)
Normal Forms
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF) *
Domain/Key Normal Form (DK/NF)


FIRST NORMAL FORM
• A table is in 1NF if there are no repeating groups in the table.
▫ A repeating group is so named because a group of multiple entries of
the same type can exist for any single key attribute occurrence
▪ Arrays
▪ A form
• A table is in 1NF if all non-key fields are functionally dependent on the primary
key (PK).
▫ That is, for each given value of the PK, we always get only one value of the non-key
field(s).
▫ The values in an atomic domain are indivisible units.
▫ No arrays
▫ Transform the form into a relation
To Eliminate the Repeating Groups
• Present the data in a tabular format, where each
cell has a single value and there are no repeating
groups.
• Eliminate the nulls by making sure that each
repeating group attribute contains an appropriate
data value
• Identify the primary key
Relation not in 1NF vs. relation in 1NF

COURSE-CONTENT(Course, Content)

1. Determine the PK for the COURSE-CONTENT table
2. FDs
Content → Course
Exercise: Normalise the table to 1NF
Plant    Manager  Machine  Supplier-Name     Supplier-City
Plant-A  Rungata  Lath     Maromba hardware  Harare
                  Boiler   ABC industry      Bulawayo
Plant-B  Moyo     Cutter   Mweras Machinery  Masvingo
                  Boiler   Meiky Industries  Mutare
                  CNC      Maromba hardware  Harare
Plant    Manager  Machine  Supplier-Name     Supplier-City
Plant-A  Rungata  Lath     Maromba hardware  Harare
Plant-A  Rungata  Boiler   ABC industry      Bulawayo
Plant-B  Moyo     Cutter   Mweras Machinery  Masvingo
Plant-B  Moyo     Boiler   Meiky Industries  Mutare
Plant-B  Moyo     CNC      Maromba hardware  Harare

FDs
• Plant → Manager
• Supplier-Name → Supplier-City
Plant(Plant, Manager, Machine, Supplier-Name, Supplier-City)
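A functional dependency X → Y says that rows agreeing on X must agree on Y. The sketch below checks this against the 1NF Plant rows. The function name `fd_holds` is hypothetical, and sample data can only refute a dependency, never prove that it holds for all future rows.

```python
COLUMNS = ("Plant", "Manager", "Machine", "Supplier-Name", "Supplier-City")
PLANT_ROWS = [
    ("Plant-A", "Rungata", "Lath", "Maromba hardware", "Harare"),
    ("Plant-A", "Rungata", "Boiler", "ABC industry", "Bulawayo"),
    ("Plant-B", "Moyo", "Cutter", "Mweras Machinery", "Masvingo"),
    ("Plant-B", "Moyo", "Boiler", "Meiky Industries", "Mutare"),
    ("Plant-B", "Moyo", "CNC", "Maromba hardware", "Harare"),
]


def fd_holds(rows, lhs, rhs):
    """True if every combination of lhs values maps to one rhs value in rows."""
    lhs_idx = [COLUMNS.index(c) for c in lhs]
    rhs_idx = [COLUMNS.index(c) for c in rhs]
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in lhs_idx)
        value = tuple(row[i] for i in rhs_idx)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


plant_manager = fd_holds(PLANT_ROWS, ["Plant"], ["Manager"])
supplier_city = fd_holds(PLANT_ROWS, ["Supplier-Name"], ["Supplier-City"])
machine_supplier = fd_holds(PLANT_ROWS, ["Machine"], ["Supplier-Name"])
print(plant_manager, supplier_city, machine_supplier)
```

The two dependencies listed on the slide are consistent with the data, while Machine → Supplier-Name is refuted by the two Boiler rows, which name different suppliers.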
COURSE-REGISTRATION(Student Name, Registration Number, Programme, Level, {CourseNumber, CourseName})
Student_Course Table (1NF)
LastName  FirstName  RegNumber  Programme  Level  Course_Num  Course_Name
Mara      John       C1105      BSIT       2.1    C201        Database Systems
Mara      John       C1105      BSIT       2.1    C202        Numerical Methods
Mara      John       C1105      BSIT       2.1    C203        PC Design
Mara      John       C1105      BSIT       2.1    C113        Visual Programming

FDs
• RegNumber → LastName, FirstName, Programme, Level
• Course_Num → Course_Name
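Flattening the repeating group {CourseNumber, CourseName} of the COURSE-REGISTRATION form into 1NF rows can be sketched as follows. The nested dictionary layout and the function name `to_1nf` are assumptions chosen for illustration.

```python
registration = {
    "LastName": "Mara",
    "FirstName": "John",
    "RegNumber": "C1105",
    "Programme": "BSIT",
    "Level": "2.1",
    # The repeating group: several (CourseNumber, CourseName) entries.
    "Courses": [
        ("C201", "Database Systems"),
        ("C202", "Numerical Methods"),
        ("C203", "PC Design"),
        ("C113", "Visual Programming"),
    ],
}


def to_1nf(record, group_field):
    """Emit one flat row per entry of the repeating group."""
    fixed = {k: v for k, v in record.items() if k != group_field}
    return [
        {**fixed, "Course_Num": num, "Course_Name": name}
        for num, name in record[group_field]
    ]


flat_rows = to_1nf(registration, "Courses")
for row in flat_rows:
    print(row["RegNumber"], row["Course_Num"], row["Course_Name"])
```

Every cell now holds a single value, at the cost of repeating the student details on each row. That repetition is what the later normal forms remove.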
Update anomalies of the Student-Course table
▫ Insertion anomaly
▪ A new student who has not yet registered for any course
▪ A new course with no students assigned to it
▫ Deletion anomaly
▪ What if only one student is registered for a certain course and
decides to drop that course?
▫ Update anomaly
▪ Multiple occurrences of LastName
Transform the Order Form to 1NF
Normalization
2nd Normal Form
Second Normal Form
• A table is in 2NF if it is in 1NF and has no partial dependencies.
▫ Prime (key) attribute: an attribute that is part of the primary
key (more generally, of any candidate key).
▫ Non-prime (non-key) attribute: an attribute that is not part of the
primary key.
• Every non-key attribute should be fully functionally dependent on
the primary key.
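One way to spot partial dependencies is to compute the attribute closure of every proper subset of the key and see whether it reaches any non-key attribute. The sketch below does this for the Student_Course table, assuming (RegNumber, Course_Num) is its only candidate key. The function names are hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure: everything determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds):
    """Map each proper subset of the key to the non-key attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(key, size):
            determined = closure(subset, fds) - set(key)
            if determined:
                found[subset] = determined
    return found


STUDENT_COURSE_KEY = ("RegNumber", "Course_Num")
STUDENT_COURSE_FDS = [
    (("RegNumber",), ("LastName", "FirstName", "Programme", "Level")),
    (("Course_Num",), ("Course_Name",)),
]

partials = partial_dependencies(STUDENT_COURSE_KEY, STUDENT_COURSE_FDS)
for determinant, dependents in partials.items():
    print(determinant, "->", sorted(dependents))
```

Both key components determine non-key attributes on their own, so the table is in 1NF but not in 2NF.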
Resolving Partial Dependency
• Make new relations to eliminate partial
dependencies
▫ For each component of the primary key that acts as a
determinant in a partial dependency, create a new table
with a copy of that component as the primary key.
• Decompositions must be lossless
▫ One of the relations must contain a super key of the original relation
Decompose R into 2NF
R(A, B, C, D); F = {A → C, AB → D}
• R is not in 2NF because C is not fully
dependent on the primary key AB (partial
dependency)
• A → C: R1(A, C) and
• AB → D: R2(A, B, D)
▪ R2 contains the super key AB
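The decomposition of R can be reproduced mechanically. This is a simplified sketch that assumes each partial-dependency determinant appears on the left-hand side of exactly one FD. The lossless check is the standard test for a two-way split: the shared attributes must determine all of R1 or all of R2. Function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure: everything determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def decompose_2nf(attributes, key, fds):
    """Split off one relation per proper key subset that determines other attributes."""
    relations = []
    moved = set()
    for lhs, _rhs in fds:
        if set(lhs) < set(key):  # determinant is only part of the key
            dependents = closure(lhs, fds) - set(lhs)
            relations.append(sorted(set(lhs) | dependents))
            moved |= dependents
    # Whatever is left keeps the full key.
    relations.append(sorted(set(attributes) - moved))
    return relations


def is_lossless(r1, r2, fds):
    """Binary lossless-join test: common attributes determine r1 or r2."""
    reach = closure(set(r1) & set(r2), fds)
    return set(r1) <= reach or set(r2) <= reach


R = ("A", "B", "C", "D")
FDS = [(("A",), ("C",)), (("A", "B"), ("D",))]

r1, r2 = decompose_2nf(R, ("A", "B"), FDS)
lossless = is_lossless(r1, r2, FDS)
print(r1, r2, lossless)
```

The common attribute A determines everything in R1, so joining R1 and R2 on A gives back exactly the rows of R.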
Normalise COURSE-CONTENT to 2NF
FDs F ={Content Course}
COURSE-CONTENT(Course, Content)
• The key is single attribute key
• Non key attribute Course is fully dependent
on PK Content
• The relation is already in 2NF
Plant    Manager  Machine  Supplier-Name     Supplier-City
Plant-A  Rungata  Lath     Maromba hardware  Harare
Plant-A  Rungata  Boiler   ABC industry      Bulawayo
Plant-B  Moyo     Cutter   Mweras Machinery  Masvingo
Plant-B  Moyo     Boiler   Meiky Industries  Mutare
Plant-B  Moyo     CNC      Maromba hardware  Harare

Plant(Plant, Manager, Machine, Supplier-Name, Supplier-City)

FDs
• Plant → Manager
• Supplier-Name → Supplier-City
• Plant, Machine, Supplier-Name → Plant, Machine, Supplier-Name (trivial)
• Non-key attributes are partially dependent on the PK
FDs for Student_Course table
• RegNumber → LastName, FirstName, Programme, Level
• Course_Num → Course_Name
• Non-key attributes are partially dependent on the PK
LastName  FirstName  RegNumber  Programme  Level  Course_Num  Course_Name
Mara      John       C1105      BSIT       2.1    C201        Database Systems
Mara      John       C1105      BSIT       2.1    C202        Numerical Methods
Mara      John       C1105      BSIT       2.1    C203        PC Design
Mara      John       C1105      BSIT       2.1    C113        Visual Programming
Student_Course Table to 2NF
Student(RegNumber, LastName, FirstName, Programme, Level)
Course(Course_Num, Course_Name)
Student_Course(RegNumber, Course_Num)

Student
LastName  FirstName  RegNumber  Programme  Level
Mara      John       C1105      BSIT       2.1

Course
Course_Num  Course_Name
C201        Database Systems
C202        Numerical Methods
C203        PC Design
C113        Visual Programming

Student_Course
RegNumber  Course_Num
C1105      C201
C1105      C202
C1105      C203
C1105      C113
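The three 2NF relations can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the column types, the REFERENCES clauses, and the extra course CUIT220 (borrowed from the insertion anomaly example) are illustrative choices, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (
        RegNumber TEXT PRIMARY KEY,
        LastName  TEXT, FirstName TEXT, Programme TEXT, Level TEXT
    );
    CREATE TABLE Course (
        Course_Num  TEXT PRIMARY KEY,
        Course_Name TEXT
    );
    CREATE TABLE Student_Course (
        RegNumber  TEXT NOT NULL REFERENCES Student(RegNumber),
        Course_Num TEXT NOT NULL REFERENCES Course(Course_Num),
        PRIMARY KEY (RegNumber, Course_Num)
    );
""")
conn.execute("INSERT INTO Student VALUES ('C1105', 'Mara', 'John', 'BSIT', '2.1')")
conn.executemany("INSERT INTO Course VALUES (?, ?)", [
    ("C201", "Database Systems"),
    ("C202", "Numerical Methods"),
    ("C203", "PC Design"),
    ("C113", "Visual Programming"),
])
conn.executemany("INSERT INTO Student_Course VALUES (?, ?)", [
    ("C1105", "C201"), ("C1105", "C202"), ("C1105", "C203"), ("C1105", "C113"),
])

# A course with no students can now be stored on its own.
conn.execute("INSERT INTO Course VALUES ('CUIT220', 'Oracle Programming')")
course_count = conn.execute("SELECT COUNT(*) FROM Course").fetchone()[0]

# Joining the three tables rebuilds the original 1NF rows.
joined = conn.execute("""
    SELECT s.LastName, s.FirstName, s.RegNumber, s.Programme, s.Level,
           c.Course_Num, c.Course_Name
    FROM Student_Course sc
    JOIN Student s ON s.RegNumber = sc.RegNumber
    JOIN Course  c ON c.Course_Num = sc.Course_Num
    ORDER BY c.Course_Num
""").fetchall()
print(course_count, len(joined))
```

The join returns the four original rows, so nothing was lost by the decomposition, while the student's details and each course name are now stored once.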
Normalise ORDER table to 2NF
ORDER
OrderNum  CustNum  CustName  CustAddress        City       Country  Date      ProductNo  Des        QTY  UnitPrice
1234      9876     Billy     456 High Tower St  Hong Kong  China    14/04/98  A123       Pencil     100  $3.00
1234      9876     Billy     456 High Tower St  Hong Kong  China    14/04/98  B234       Eraser     200  $1.50
1234      9876     Billy     456 High Tower St  Hong Kong  China    14/04/98  C345       Sharpener  5    $8.00

FDs
• OrderNum → CustNum, CustName, CustAddress, City, Country, Date
• CustNum → CustName, CustAddress, City, Country
• ProductNo → Des, UnitPrice
• OrderNum, ProductNo → QTY
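One possible way to approach the exercise is to project the sample rows onto one relation per partial dependency and keep the composite key with QTY. The sketch below assumes the key is (OrderNum, ProductNo); the relation names `order_header`, `product`, and `order_line` and the helper `project` are hypothetical.

```python
COLUMNS = ("OrderNum", "CustNum", "CustName", "CustAddress", "City", "Country",
           "Date", "ProductNo", "Des", "QTY", "UnitPrice")
ORDER_INFO = ("1234", "9876", "Billy", "456 High Tower St", "Hong Kong",
              "China", "14/04/98")
ORDER_ROWS = [
    ORDER_INFO + ("A123", "Pencil", 100, "$3.00"),
    ORDER_INFO + ("B234", "Eraser", 200, "$1.50"),
    ORDER_INFO + ("C345", "Sharpener", 5, "$8.00"),
]


def project(rows, wanted):
    """Relational projection: keep the wanted columns and drop duplicate rows."""
    idx = [COLUMNS.index(c) for c in wanted]
    result = []
    for row in rows:
        projected = tuple(row[i] for i in idx)
        if projected not in result:
            result.append(projected)
    return result


# OrderNum -> CustNum, CustName, CustAddress, City, Country, Date
order_header = project(ORDER_ROWS, COLUMNS[:7])
# ProductNo -> Des, UnitPrice
product = project(ORDER_ROWS, ("ProductNo", "Des", "UnitPrice"))
# OrderNum, ProductNo -> QTY
order_line = project(ORDER_ROWS, ("OrderNum", "ProductNo", "QTY"))

print(len(order_header), len(product), len(order_line))
```

The order and customer details collapse to a single row instead of being repeated for every product on the order.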
