Database Lecture Notes Normalization 2 – How to Normalize

Dr. Meg Murray mcmurray@kennesaw.edu

Normalization: Why?
• Not all relations are equal
• Tables that are not normalized experience issues known as modification problems
– Insertion problems
• Difficulties inserting data into a relation

– Modification problems
• Difficulties modifying data in a relation

– Deletion problems
• Difficulties deleting data from a relation

Deletion Anomaly

• If you delete any row, you delete information about both the machine and the repair

KROENKE and AUER - DATABASE CONCEPTS (3rd Edition) © 2008 Pearson Prentice Hall

Modification Anomalies
• The EQUIPMENT_REPAIR table before and after an incorrect update operation on AcquisitionCost for Type = Drill Press:

Normalization
• Normalization is a process of analyzing a relation to ensure that it is well formed
• More specifically, if a relation is normalized (well formed), rows can be inserted, deleted, or modified without creating update anomalies

Normalization Review: Solving Modification Problems
• Most modification problems are solved by breaking an existing table into two or more tables through a process known as normalization
• So the question…

How Many Tables?
Should we store these two tables as they are, or should we combine them into one table in our new database?

Normal Forms
• Relations are categorized into normal forms based on the modification anomalies or other problems to which they are subject:

Normal Forms
• 1NF – A table that qualifies as a relation is in 1NF
• 2NF – A relation is in 2NF if all of its nonkey attributes are dependent on all of the primary key [focus is on composite primary keys]
• 3NF – A relation is in 3NF if it is in 2NF and has no determinants except the primary key
• Boyce-Codd Normal Form (BCNF) – A relation is in BCNF if every determinant is a candidate key

BCNF
• Boyce-Codd Normal Form (BCNF) – A relation is in BCNF if every determinant is a candidate key
“I swear to construct my tables so that all nonkey columns are dependent on the key, the whole key and nothing but the key, so help me Codd.”

Normalization Review: Definition Review
• Determinant
– The attribute that can be used to find the value of another attribute in the relation
– The left-hand side of a functional dependency
StudentID → (StudentName, DormName, DormRoom)
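A functional dependency can be tested against sample rows: the determinant must never map to two different values of the dependent attributes. The sketch below is illustrative only. The function name `holds` and the student rows are made up for this example. Sample data can refute a dependency but can never prove that it holds for all future data.

```python
def holds(rows, determinant, dependents):
    """Return True if determinant -> dependents is not contradicted by the rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependents)
        # The first value seen for a key is remembered; any later mismatch refutes the FD.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data, not taken from the lecture.
students = [
    {"StudentID": 1, "StudentName": "Ann", "DormName": "East", "DormRoom": 101},
    {"StudentID": 2, "StudentName": "Bob", "DormName": "East", "DormRoom": 102},
    {"StudentID": 3, "StudentName": "Ann", "DormName": "West", "DormRoom": 101},
]

print(holds(students, ["StudentID"], ["StudentName", "DormName", "DormRoom"]))
print(holds(students, ["StudentName"], ["DormName"]))
```

In this sample, StudentID is a determinant of the other attributes, while StudentName is not a determinant of DormName because two students named Ann live in different dorms.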

Normalization Review: Definition Review II
• Candidate key
– The value of a candidate key can be used to find the value of every other attribute in the table
– A simple candidate key consists of only one attribute
– A composite candidate key consists of more than one attribute

The CUSTOMER Table
CUSTOMER (CustomerNumber, CustomerName, StreetAddress, City, State, ZIP, ContactName, Phone)
• What is the primary key?
• What are the candidate keys?
• What are the nonkey attributes?

Simple Examples
• Remember the question:
– Is every determinant a candidate key, or
– Are all nonkey columns dependent on the key, the whole key and nothing but the key?

Normalization Example
StudentID → (StudentName, DormName, DormCost)
• What are the determinants?
• Does StudentID determine StudentName?
• Does StudentID determine DormName?
• Does StudentID determine DormCost?
• Probably not – more likely DormName does
• If so, DormName is a determinant of DormCost
• Is StudentID a candidate key?
• Is DormName a candidate key?
Is every determinant a candidate key? Are all nonkey columns dependent on the key, the whole key and nothing but the key?

Normalization Example
StudentID → (StudentName, DormName, DormCost)
However, if…
DormName → DormCost
Then DormCost should be placed into its own relation, resulting in the relations:
StudentID → (StudentName, DormName)
DormName → DormCost

Normalization Example
(AttorneyID, ClientID) → (ClientName, MeetingDate, Duration)
ATTORNEY (AttorneyID, ClientID, ClientName, MeetingDate, Duration)
However, if…
ClientID → ClientName
Then…

Then ClientName should be placed into its own relation, resulting in the relations:
(AttorneyID, ClientID) → (MeetingDate, Duration)
ClientID → ClientName
SCHEDULE (AttorneyID, ClientID, MeetingDate, Duration)
CLIENT (ClientID, ClientName)
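The split into SCHEDULE and CLIENT can be sketched as two projections of the original table. The helper `project` and the sample rows are assumptions made for illustration. The final step rejoins the two tables on ClientID to show that, for this sample, no information is lost.

```python
def project(rows, columns):
    """Project rows onto the given columns, dropping duplicate tuples."""
    seen, result = set(), []
    for row in rows:
        values = tuple(row[c] for c in columns)
        if values not in seen:
            seen.add(values)
            result.append(dict(zip(columns, values)))
    return result


# Hypothetical rows for the single ATTORNEY table; ClientName repeats per meeting.
attorney = [
    {"AttorneyID": "A1", "ClientID": "C1", "ClientName": "Acme",
     "MeetingDate": "2008-03-01", "Duration": 60},
    {"AttorneyID": "A2", "ClientID": "C1", "ClientName": "Acme",
     "MeetingDate": "2008-03-02", "Duration": 30},
    {"AttorneyID": "A1", "ClientID": "C2", "ClientName": "Bolt",
     "MeetingDate": "2008-03-03", "Duration": 45},
]

schedule = project(attorney, ["AttorneyID", "ClientID", "MeetingDate", "Duration"])
client = project(attorney, ["ClientID", "ClientName"])

# Rejoin on ClientID to confirm the original rows can be rebuilt.
names = {c["ClientID"]: c["ClientName"] for c in client}
rejoined = [{**s, "ClientName": names[s["ClientID"]]} for s in schedule]

print(client)
print(rejoined == attorney)
```

After the split, each client name is stored once, so renaming a client touches a single CLIENT row instead of every meeting row.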

Walking through the forms

1st Normal Form [1NF]
• Eliminate Repeating Groups
– Eliminate duplicative columns from the same table.
• Create separate tables for each group of related data
– Give each table a primary key (unique identifier)
• Putting a Table into 1NF makes it a Relation
– Do you remember the rules of a relation?

Is this in 1NF?

Characteristics of 1NF
• Characteristics
– Table Format
– No repeating groups
– Primary key (PK) identified

Steps to 1NF
1. Eliminate repeating groups.
– Present data in a tabular format, where each cell has a single value and there are no repeating groups.

Repeating Groups
• Do you see the repeating group in this table?
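As a rough sketch of what eliminating a repeating group means, the rows below store several pilot numbers in one cell and are then flattened to one row per value. The data and the function name `eliminate_repeating_group` are hypothetical and are not the table shown in the lecture.

```python
# Hypothetical unnormalized rows: PILOT_NUMBER holds a repeating group (a list).
unnormalized = [
    {"AIRCRAFT_NUMBER": 100, "AIRCRAFT_NAME": "Cessna", "PILOT_NUMBER": [1, 2]},
    {"AIRCRAFT_NUMBER": 200, "AIRCRAFT_NAME": "Piper", "PILOT_NUMBER": [2]},
]


def eliminate_repeating_group(rows, column):
    """Emit one row per value of the repeating column so every cell is single-valued."""
    return [{**row, column: value} for row in rows for value in row[column]]


flat = eliminate_repeating_group(unnormalized, "PILOT_NUMBER")
for row in flat:
    print(row)
```

Once flattened, AIRCRAFT_NUMBER alone no longer identifies a row, which is why a composite key is needed.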

Steps to 1NF
2. Identify the Primary Key (PK)
– At first glance, AIRCRAFT_NUMBER seems a good candidate for a PK, but it would not uniquely identify all of the remaining row attributes.
– The combination of AIRCRAFT_NUMBER and PILOT_NUMBER is a PK candidate that will uniquely identify all row attributes.

Table in 1NF
Table with primary key identified [attributes listed vertically]

Identify Dependencies
• AIRCRAFT_NUMBER, PILOT_NUMBER --> AIRCRAFT_NAME, PILOT_NAME, MISSION_CLASS, FLYING_HOUR, COST_HOUR
– Primary Key (PK) dependency. The PK is also a composite key.
• AIRCRAFT_NUMBER --> AIRCRAFT_NAME
– Partial dependency: aircraft name is only dependent on a part of the composite AIRCRAFT_NUMBER, PILOT_NUMBER key.

Identify Dependencies
• PILOT_NUMBER --> PILOT_NAME
– Partial dependency: pilot name is only dependent on a part of the composite AIRCRAFT_NUMBER, PILOT_NUMBER key.
• PILOT_NUMBER --> PILOT_NAME, MISSION_CLASS, COST_HOUR
– Partial dependencies
• MISSION_CLASS --> COST_HOUR
– Transitive dependency: the non-prime/non-key attribute COST_HOUR is dependent on the non-prime/non-key attribute MISSION_CLASS
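The three kinds of dependency can be told apart by comparing each determinant with the composite primary key. The sketch below is one possible way to do that. The function name `classify` and its labels are assumptions, and the dependency list follows the attribute assignment used in the 2NF tables of these notes (FLYING_HOUR with the full key).

```python
KEY = frozenset({"AIRCRAFT_NUMBER", "PILOT_NUMBER"})

# (determinant, dependent attributes), as assumed for the air pilot example.
DEPENDENCIES = [
    ({"AIRCRAFT_NUMBER", "PILOT_NUMBER"}, {"FLYING_HOUR"}),
    ({"AIRCRAFT_NUMBER"}, {"AIRCRAFT_NAME"}),
    ({"PILOT_NUMBER"}, {"PILOT_NAME", "MISSION_CLASS", "COST_HOUR"}),
    ({"MISSION_CLASS"}, {"COST_HOUR"}),
]


def classify(determinant, key=KEY):
    """Label a dependency by how its determinant relates to the primary key."""
    determinant = frozenset(determinant)
    if determinant == key:
        return "primary key"   # depends on the whole key
    if determinant < key:
        return "partial"       # depends on only part of the composite key
    if not determinant & key:
        return "transitive"    # a nonkey attribute determines another nonkey attribute
    return "other"


for determinant, dependents in DEPENDENCIES:
    print(sorted(determinant), "-->", sorted(dependents), ":", classify(determinant))
```

Partial dependencies are the ones removed on the way to 2NF, and transitive dependencies are the ones removed on the way to 3NF.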

Helpful to Create Dependency Diagram

2nd Normal Form [2NF]
• Characteristics
– For Tables with composite keys
– 1NF
– No partial dependencies
• In other words, a non-key field must provide a fact about the whole key, not just one part of the key

2nd Normal Form [2NF]
• If an attribute depends on only part of a composite key, remove it to a separate table.
– Often map to components [themes, entities…]
• Create relationships between these new tables and their predecessors through the use of foreign keys.
http://databases.about.com/od/specificproducts/a/2nf.htm

Look for Partial Dependencies
• Ask questions such as:
– Are both Aircraft_Number and Pilot_Number needed to determine Pilot_Name?
– Are both Aircraft_Number and Pilot_Number needed to determine Mission_Class?
– …

Look for Partial Dependencies
• Move partial dependencies to their own tables
AIRCRAFT (Aircraft_Number)
PILOT (Pilot_Number)
FLYING HOURS (Aircraft_Number, Pilot_Number)

Steps to 2NF
Notice how moving partial dependencies separates the table into key components
AIRCRAFT
PILOT
FLYING HOURS

Steps to 2NF
• Assign dependent attributes to each key component
– AIRCRAFT ( AIRCRAFT_NUMBER, AIRCRAFT_NAME )
– PILOT ( PILOT_NUMBER, PILOT_NAME, MISSION_CLASS, COST_HOUR )
– FLIGHT ( AIRCRAFT_NUMBER, PILOT_NUMBER, FLYING_HOUR )

Draw New Dependency Diagram

3rd Normal Form [3NF]
• Eliminate Columns Not Dependent On Key
– If attributes do not contribute to a description of the key, remove them to a separate table.
– Third normal form is violated when a non-key field is a fact about another non-key field [transitive dependency]

3rd Normal Form
• Characteristics
– 2NF
– No transitive dependencies

3rd Normal Form

Back to the ERD

Is the Air Pilot Example in BCNF?
• Are all determinants candidate keys?
• Often, a relation in 3NF is also in BCNF

Steps to BCNF

Another Example

Putting a Relation into BCNF: EQUIPMENT_REPAIR

Identify Functional Dependencies
EQUIPMENT_REPAIR (ItemNumber, Type, AcquisitionCost, RepairNumber, RepairDate, RepairAmount)
FD:
ItemNumber → (Type, AcquisitionCost)
RepairNumber → (ItemNumber, Type, AcquisitionCost, RepairDate, RepairAmount)
Is there a determinant that is not a candidate key?

Put into Tables
• ItemNumber is not a candidate key so
– Move it and its attributes to a new table
• ITEM (ItemNumber, Type, AcquisitionCost)
– The determinant becomes the primary key
• ITEM (ItemNumber, Type, AcquisitionCost)
– Leave a foreign key in the original table
• REPAIR (ItemNumber, RepairNumber, RepairDate, RepairAmount)
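The question "is every determinant a candidate key?" can be checked mechanically with an attribute closure: a determinant qualifies only if it determines every attribute of the relation. The sketch below applies that test to the EQUIPMENT_REPAIR dependencies. The function names `closure` and `bcnf_violations` are assumptions, and the test checks only that a determinant reaches all attributes, which is sufficient here because both determinants are single attributes.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


ATTRS = {"ItemNumber", "Type", "AcquisitionCost",
         "RepairNumber", "RepairDate", "RepairAmount"}
FDS = [
    ({"ItemNumber"}, {"Type", "AcquisitionCost"}),
    ({"RepairNumber"}, {"ItemNumber", "Type", "AcquisitionCost",
                        "RepairDate", "RepairAmount"}),
]


def bcnf_violations(attrs, fds):
    """List the determinants that do not determine the whole relation."""
    return [lhs for lhs, _ in fds if closure(lhs, fds) != attrs]


print(bcnf_violations(ATTRS, FDS))
```

Only ItemNumber is reported, which matches the decision to move ItemNumber and its attributes into a separate ITEM table.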

Putting a Relation into BCNF: New Relations

Putting a Relation into BCNF: SKU_DATA

Putting a Relation into BCNF: SKU_DATA
SKU_DATA (SKU, SKU_Description, Department, Buyer)
SKU → (SKU_Description, Department, Buyer)
SKU_Description → (SKU, Department, Buyer)
Buyer → Department
SKU_DATA (SKU, SKU_Description, Buyer)
BUYER (Buyer, Department)
Where SKU_DATA.Buyer must exist in BUYER.Buyer
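The "must exist in" rule is a referential integrity constraint, which a DBMS can enforce with a foreign key. The sketch below uses Python's built-in sqlite3 module as one possible illustration. The column types, the UNIQUE constraint on SKU_Description, the helper `try_insert`, and all sample values are assumptions and are not part of the lecture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

conn.execute("CREATE TABLE BUYER (Buyer TEXT PRIMARY KEY, Department TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE SKU_DATA (
        SKU INTEGER PRIMARY KEY,
        SKU_Description TEXT NOT NULL UNIQUE,         -- second candidate key
        Buyer TEXT NOT NULL REFERENCES BUYER (Buyer)  -- foreign key into BUYER
    )
""")

# Hypothetical sample rows.
conn.execute("INSERT INTO BUYER VALUES ('Pete', 'Water Sports')")
conn.execute("INSERT INTO SKU_DATA VALUES (100100, 'Scuba Tank, Yellow', 'Pete')")


def try_insert(sku, description, buyer):
    """Return True if the SKU row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO SKU_DATA VALUES (?, ?, ?)", (sku, description, buyer))
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert(100200, "Scuba Tank, Magenta", "Pete"))   # known buyer
print(try_insert(100300, "Dive Mask, Small", "Nobody"))    # buyer missing from BUYER
```

Because Department now lives only in BUYER, a buyer's department is recorded once no matter how many SKUs that buyer handles.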

Putting a Relation into BCNF: New Relations

Multivalued Dependencies
• A multivalued dependency occurs when a determinant determines a particular set of values:
Employee →→ Degree
Employee →→ Sibling
PartKit →→ Part
• The determinant of a multivalued dependency can never be a primary key

Multivalued Dependencies

Eliminating Anomalies from Multivalued Dependencies
• Multivalued dependencies are not a problem if they are in a separate relation, so:
– Always put multivalued dependencies into their own relation
– This is known as Fourth Normal Form (4NF)
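A small sketch of why the separate relations help: when degrees and siblings are independent facts about an employee, a single table has to store every degree and sibling combination. The rows and the helper `rejoin` below are hypothetical and serve only to illustrate the split.

```python
# Hypothetical combined table: (Employee, Degree, Sibling).
combined = [
    ("Chau", "BS", "Eileen"),
    ("Chau", "BS", "Jonathan"),
    ("Chau", "MS", "Eileen"),
    ("Chau", "MS", "Jonathan"),
    ("Green", "BS", "Nikki"),
]

# Put each multivalued dependency into its own relation.
employee_degree = sorted({(emp, degree) for emp, degree, _ in combined})
employee_sibling = sorted({(emp, sibling) for emp, _, sibling in combined})


def rejoin(degrees, siblings):
    """Join the two relations on employee to rebuild the combined table."""
    return sorted((e, d, s) for e, d in degrees for e2, s in siblings if e == e2)


print(employee_degree)
print(employee_sibling)
print(rejoin(employee_degree, employee_sibling) == sorted(combined))
```

With the split, recording a new degree for Chau adds one row to the degree relation, instead of one row per sibling in the combined table.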

References
• Example for AirPilot:
– http://dotnet.org.za/willy/archive/2008/04/10/taking-a-step-back-database-normalisation-1nf2nf-3nf-bcnf-and-4nf-part-1.aspx
• Good reference
– http://www.bkent.net/Doc/simple5.htm