ITP 250

Normalization
Lecture 4
Modeling Review
☞Consider relationships in pairs. Determine
relationship type and cardinality.
☞Collapse into one larger view. Remove many-to-many
relationships.
☞Assign primary and foreign keys. Determine
relationship strength.
☞Fill in remaining attributes.
Modeling Scenario
Whiteboard discussion on the Movie Business
Create a database to track Movies and the people involved

Business Scenario:
• Movies are released every year and have a specific genre
• Participants are everyone who can work on a Movie in a major
Role.
• Participants may work on many Movies
• Participants may have more than one Role on the same or
different Movies.
What are the entities?
What are the relationships between entities?
Normalization
Anomalies
UPDATE ANOMALY

How do we update the acquisition price of the Drill Press?


DELETION ANOMALY

What other data is removed when we delete a row of student data?


INSERTION ANOMALY

To add Tennis (Fee – 100), what other data must we enter?
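The update anomaly can be made concrete with a minimal Python sketch. The rows, column names, and prices below are hypothetical, invented for illustration because the slide's table is not reproduced here; the helper names are also illustrative.

```python
# Hypothetical un-normalized repair log: the acquisition price of an item
# is repeated on every repair row for that item.
repairs = [
    {"item": "Drill Press", "acquisition_price": 3500, "repair_cost": 375},
    {"item": "Drill Press", "acquisition_price": 3500, "repair_cost": 275},
    {"item": "Lathe", "acquisition_price": 4750, "repair_cost": 122},
]


def update_first_match(rows, item, new_price):
    """Naive update that changes only the first matching row."""
    for row in rows:
        if row["item"] == item:
            row["acquisition_price"] = new_price
            return


def prices_for(rows, item):
    """Return the set of distinct prices recorded for one item."""
    return {row["acquisition_price"] for row in rows if row["item"] == item}


update_first_match(repairs, "Drill Press", 3600)
# Two different prices are now stored for the same item: an update anomaly.
inconsistent = prices_for(repairs, "Drill Press")
print(inconsistent)
```

Because the price is stored on every repair row, a correct update has to find and change all of them; missing one leaves the data inconsistent.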


Normalization: the process of evaluating
and correcting table structures to reduce
redundancies and better separate subjects
WELL FORMED TABLES

EACH TABLE REPRESENTS A SINGLE SUBJECT


COURSE table only contains data about courses

NO DATA IS REDUNDANTLY STORED


Allows updates to occur in only one place

ALL NON-KEY ATTRIBUTES DEPENDENT ON THE PK


Yields simplest tables and data that is uniquely identifiable by the PK
Normalization Stages

☞ 1st normal form (1NF)


☞ 2nd normal form (2NF)
☞ 3rd normal form (3NF)
☞ Boyce-Codd normal form (BCNF)
☞ 4th normal form (4NF)
☞ In this course we will go up to 3NF
Normalization Steps
• Step 0: Select the data source and convert into UNF (un-normalized data/table)
• Step 1: Transform data in UNF into 1NF
• Step 2: Transform data in 1NF into 2NF
• Step 3: Transform data in 2NF into 3NF

It is helpful to know that occasionally, the data may still be subject to anomalies in 3NF. In this
case, we may have to perform further transformations:
• Transform 3NF to BCNF. Transform BCNF to 4NF. Transform 4NF to 5NF

Normalization Process
• Process involves applying a series of tests on a relation to determine whether it satisfies or
violates the requirements of a given normal form.
• When a test fails, the relation is decomposed into simpler relations that individually meet the
normalization tests.
• All these normal forms are based on the functional dependencies among the attributes of a
relation.
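A functional dependency X → Y says that rows agreeing on X must also agree on Y. The sketch below tests that against sample rows. The function name `fd_holds` and the sample values are assumptions made for illustration; the attribute names are borrowed from the project example in this lecture. Note that sample data can only refute a dependency, not prove it holds in general.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every combination of lhs values maps to one rhs value."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows.
rows = [
    {"Proj_Num": 15, "Emp_Num": 103, "Job_Class": "Elect. Engineer", "Chg_Hour": 84.50},
    {"Proj_Num": 18, "Emp_Num": 103, "Job_Class": "Elect. Engineer", "Chg_Hour": 84.50},
    {"Proj_Num": 15, "Emp_Num": 101, "Job_Class": "Database Designer", "Chg_Hour": 105.00},
]

print(fd_holds(rows, ["Job_Class"], ["Chg_Hour"]))  # True
print(fd_holds(rows, ["Proj_Num"], ["Emp_Num"]))    # False: project 15 has two employees
```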
Example of Normalization Process
Source Data

Project Management Report


Step 0: Convert Source Data to an un-normalized (UNF) table
FIRST NORMAL FORM

☞ Identify primary keys


☞ Identify all dependencies
☞ If you can find a PK then you are in 1NF
Step 1: Transform the un-normalized data (UNF) into first normal form (1NF)
- Already in 1NF
- Primary key (PROJ_NUM, EMP_NUM)
IDENTIFY ALL DEPENDENCIES

Dependencies
☞ (Proj_Num, Emp_Num) → (Proj_Name, Emp_Name, Job_Class, Chg_Hour, Hours)
☞ Proj_Num → Proj_Name
☞ Emp_Num → (Emp_Name, Job_Class, Chg_Hour)
☞ Job_Class → Chg_Hour
☞ (Proj_Num, Emp_Num) → Hours

Relational Schema
☞ Project (Proj_Num, Proj_Name)
☞ Employee (Emp_Num, Emp_Name, Job_Class, Chg_Hour)
☞ Job (Job_Class, Chg_Hour)
☞ Assignment (Proj_Num, Emp_Num, Hours)
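One way to check that (Proj_Num, Emp_Num) works as the primary key is to compute its attribute closure under the dependencies listed above. This is a minimal sketch; the names `FDS`, `ALL_ATTRS`, and `closure` are illustrative and not part of the lecture.

```python
# Functional dependencies from the slide, written as (determinant, dependents).
FDS = [
    ({"Proj_Num"}, {"Proj_Name"}),
    ({"Emp_Num"}, {"Emp_Name", "Job_Class", "Chg_Hour"}),
    ({"Job_Class"}, {"Chg_Hour"}),
    ({"Proj_Num", "Emp_Num"}, {"Hours"}),
]
ALL_ATTRS = {
    "Proj_Num", "Proj_Name", "Emp_Num", "Emp_Name",
    "Job_Class", "Chg_Hour", "Hours",
}


def closure(attrs, fds):
    """Return every attribute determined by attrs under fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply a dependency once its whole determinant is available.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


print(closure({"Proj_Num", "Emp_Num"}, FDS) == ALL_ATTRS)  # True
print(closure({"Emp_Num"}, FDS) == ALL_ATTRS)              # False
```

The pair determines every attribute, while Emp_Num alone does not, which is consistent with choosing the composite key.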
SECOND NORMAL FORM

☞ It is in 1NF
☞ It has no partial dependencies

☞ Start at 1NF
☞ Make new tables to eliminate partial dependencies
☞ Assign corresponding dependent attributes
Step 2: Transform data in first normal form (1NF) into second normal form (2NF)
• Remove partial dependencies

• Take each non-key attribute in turn and ask: is this attribute dependent on only one part of the key?
• If yes, move the attribute to a new table together with a copy of the part of the key it depends on. That part
of the key becomes the key of the new table. Underline the key in the new table.
• If no, check against the other part of the key and repeat the process above.
• If still no, i.e., the attribute is not dependent on either part of the key alone, keep it in the current table.
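The procedure above can be sketched in Python for the project example. The mapping `DEPENDS_ON` encodes the dependencies identified earlier in the lecture; the names `to_2nf`, `NON_KEY`, and `DEPENDS_ON` are illustrative assumptions, not the lecture's notation.

```python
PK = frozenset({"Proj_Num", "Emp_Num"})
NON_KEY = ["Proj_Name", "Emp_Name", "Job_Class", "Chg_Hour", "Hours"]

# Which key attributes each non-key attribute needs.
DEPENDS_ON = {
    "Proj_Name": {"Proj_Num"},
    "Emp_Name": {"Emp_Num"},
    "Job_Class": {"Emp_Num"},
    "Chg_Hour": {"Emp_Num"},
    "Hours": {"Proj_Num", "Emp_Num"},
}


def to_2nf(pk, non_key, depends_on):
    """Group non-key attributes by the part of the key they depend on."""
    tables = {}
    for attr in non_key:
        determinant = frozenset(depends_on[attr])
        if not determinant <= pk:
            raise ValueError(f"{attr} does not depend on the key")
        # A determinant smaller than pk is a partial dependency,
        # so the attribute moves to a table keyed by that part.
        tables.setdefault(determinant, []).append(attr)
    return tables


tables = to_2nf(PK, NON_KEY, DEPENDS_ON)
for key, attrs in tables.items():
    print(sorted(key), "->", attrs)
```

Each entry corresponds to one 2NF table: Project keyed by Proj_Num, Employee keyed by Emp_Num, and Assignment keyed by the full composite key.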
THIRD NORMAL FORM

☞ It is in 2NF
☞ It has no transitive dependencies

☞ Start at 2NF
☞ Make new tables to eliminate transitive dependencies
☞ Assign corresponding dependent attributes
TRANSITIVE DEPENDENCY

☞ Attribute is dependent on a non-key attribute


Step 3: Transform data in second normal form (2NF) into third normal form (3NF)
• Remove transitive dependencies
• If a non-key attribute is dependent on another non-key attribute:
• Move the dependent attribute, together with a copy of the non-key attribute it depends on, to a new table.
• Make the non-key attribute it depends on the key of the new table. Underline the key in the new table.
• Leave the non-key attribute it depends on in the original table and mark it as a foreign key (*).
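As a minimal sketch of these steps, the function below splits the Employee table from the project example on the transitive dependency Job_Class → Chg_Hour. The function name `split_transitive` and the dictionary layout of its result are assumptions made for illustration.

```python
def split_transitive(table_attrs, determinant, dependents):
    """Move dependents to a new table keyed by determinant.

    The determinant stays in the original table as a foreign key.
    """
    remaining = [a for a in table_attrs if a not in dependents]
    new_table = [determinant] + [a for a in table_attrs if a in dependents]
    return {
        "original": remaining,
        "new": new_table,
        "new_key": determinant,
        "foreign_key": determinant,
    }


employee_2nf = ["Emp_Num", "Emp_Name", "Job_Class", "Chg_Hour"]
result = split_transitive(employee_2nf, "Job_Class", {"Chg_Hour"})

print(result["original"])  # ['Emp_Num', 'Emp_Name', 'Job_Class']
print(result["new"])       # ['Job_Class', 'Chg_Hour']
```

Job_Class appears in both tables: as the key of the new Job table and as a foreign key in Employee.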
I swear to construct my tables so that all
non-key columns are dependent on the key
(1NF), the whole key (2NF), and nothing but
the key (3NF), so help me Codd.
1NF: has PK
2NF: no partial dependencies, only PKs
3NF: no other dependencies, only PKs
Denormalization: the process of
purposefully producing a table of lower
normal form for better performance
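A small sketch of what denormalizing looks like in practice: the charge rate is copied from a Job table back onto each employee row, so readers avoid a lookup at the cost of redundancy. The data values and variable names are hypothetical.

```python
# Normalized form: the hourly charge is stored once per job class.
job = {"Elect. Engineer": 84.50, "Database Designer": 105.00}
employee = [
    {"Emp_Num": 101, "Job_Class": "Database Designer"},
    {"Emp_Num": 103, "Job_Class": "Elect. Engineer"},
]

# Denormalized form: Chg_Hour is copied onto every employee row.
denormalized = [dict(e, Chg_Hour=job[e["Job_Class"]]) for e in employee]

for row in denormalized:
    print(row)
```

The denormalized rows reintroduce the Job_Class → Chg_Hour transitive dependency, which is the trade-off being made deliberately.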
1NF – Identify PK

A B C D E F G
1NF – Identify Dependencies
1NF – Relational Schema Notation

☞ TABLE1(A, B, C, D, E, F, G)
2NF – Make new tables to Eliminate
Partial Dependencies
☞ TABLE1(A, B, C, E, F)
☞ TABLE2(B, D)
☞ TABLE3(A, G)
3NF - Make new tables to Eliminate
Transitive Dependencies
☞ TABLE1(A, B, C, E)
☞ TABLE2(B, D)
☞ TABLE3(A, G)
☞ TABLE4(E, F)
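The final schema can be sanity-checked with a short sketch. The dependencies below are inferred from the decomposition shown on these slides (the dependency diagram itself is not reproduced), so treat them as assumptions: (A, B) → (C, E), B → D, A → G, E → F. The names `TABLES`, `FDS`, and `home_table` are illustrative.

```python
TABLES = {
    "TABLE1": {"A", "B", "C", "E"},
    "TABLE2": {"B", "D"},
    "TABLE3": {"A", "G"},
    "TABLE4": {"E", "F"},
}

# Assumed dependencies, inferred from the decomposition.
FDS = [
    ({"A", "B"}, {"C", "E"}),
    ({"B"}, {"D"}),
    ({"A"}, {"G"}),
    ({"E"}, {"F"}),
]


def home_table(lhs, rhs, tables):
    """Return the name of a table that contains the whole dependency, or None."""
    for name, attrs in tables.items():
        if lhs | rhs <= attrs:
            return name
    return None


# Every original attribute should still appear somewhere.
covered = set().union(*TABLES.values())
# Every dependency should live entirely inside one table.
homes = [home_table(lhs, rhs, TABLES) for lhs, rhs in FDS]

print(sorted(covered))
print(homes)
```

If any entry in `homes` were None, that dependency would no longer be enforceable by a single table's key.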
TRY THIS IN CLASS

☞ Normalize the following table into 3NF


Normalize the following table into 1NF

Item Number | Type | Acquisition Cost | Repair Number | Repair Date | Repair Cost

RepairItem(RepairNumber, ItemNumber, Type, AcquisitionCost, RepairDate, RepairCost)


Normalize the following table into 2NF
☞ Start at 1NF
☞ Make new tables to eliminate partial
dependencies
☞ Assign corresponding dependent
attributes
1NF => RepairItem(RepairNumber, ItemNumber, Type, AcquisitionCost, RepairDate, RepairCost)

Item Number | Type | Acquisition Cost | Repair Number | Repair Date | Repair Cost

2NF => RepairItem(RepairNumber, ItemNumber, RepairDate, RepairCost)
       Item(ItemNumber, Type, AcquisitionCost)
Normalize the following table into 3NF
☞ Start at 2NF
☞ Make new tables to eliminate transitive
dependencies
☞ Assign corresponding dependent attributes

2NF => RepairItem(RepairNumber, ItemNumber, RepairDate, RepairCost)
       Item(ItemNumber, Type, AcquisitionCost)

Item Number | Type | Acquisition Cost | Repair Number | Repair Date | Repair Cost

No transitive dependencies
2NF => 3NF
RepairItem(RepairNumber, ItemNumber, RepairDate, RepairCost)
Item(ItemNumber, Type, AcquisitionCost)
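The 3NF result can be tried out with Python's built-in sqlite3 module. This is a sketch under assumptions: the column types, the sample rows, and the choice of SQLite are all illustrative and not part of the exercise.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Item (
    ItemNumber      INTEGER PRIMARY KEY,
    Type            TEXT,
    AcquisitionCost REAL
);
CREATE TABLE RepairItem (
    RepairNumber INTEGER,
    ItemNumber   INTEGER REFERENCES Item(ItemNumber),
    RepairDate   TEXT,
    RepairCost   REAL,
    PRIMARY KEY (RepairNumber, ItemNumber)
);
""")

# Hypothetical sample data.
conn.execute("INSERT INTO Item VALUES (100, 'Drill Press', 3500.0)")
conn.executemany(
    "INSERT INTO RepairItem VALUES (?, ?, ?, ?)",
    [(2000, 100, "2024-05-05", 375.0), (2100, 100, "2024-05-07", 255.0)],
)

# The acquisition cost is stored once, so one UPDATE changes it everywhere.
conn.execute("UPDATE Item SET AcquisitionCost = 3600.0 WHERE ItemNumber = 100")

rows = conn.execute("""
    SELECT r.RepairNumber, i.Type, i.AcquisitionCost, r.RepairCost
    FROM RepairItem AS r
    JOIN Item AS i ON i.ItemNumber = r.ItemNumber
    ORDER BY r.RepairNumber
""").fetchall()

for row in rows:
    print(row)
```

Joining the two tables rebuilds the original wide rows, and both repairs show the updated acquisition cost after a single UPDATE, which is the update anomaly from the start of the lecture being avoided.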
