
Database Normalization

• A technique for designing relational database tables to minimize duplication of information and, in so doing, to safeguard the database against certain types of logical or structural problems, namely data anomalies.
• For example, when multiple instances of a given piece of information occur in a table, the possibility exists that these instances will not be kept consistent when the data within the table is updated, leading to a loss of data integrity.
• A table that is sufficiently normalized is less vulnerable to problems of this kind, because its structure reflects the basic assumptions for when multiple instances of the same information should be represented by a single instance only.
Why do we need to normalize?
• To avoid anomalies
• Higher degrees of normalization typically involve more tables and create the need for a larger number of joins, which can reduce performance.
• Accordingly, more highly normalized tables are typically used in database applications involving many isolated transactions (e.g. an automated teller machine), while less normalized tables tend to be used in database applications that do not need to map complex relationships between data entities and data attributes (e.g. a reporting application, or a full-text search application).
Normal Forms
• 1NF, 2NF, 3NF, BCNF, 4NF, 5NF
• Database theory describes a table's degree of normalization in terms of normal forms of successively higher degrees of strictness.
• A table in third normal form (3NF), for example, is consequently in second normal form (2NF) as well; but the reverse is not always the case.
Anomalies: Problems Addressed by Normalization
• If the proper normal forms aren't followed, various undesirable side effects can occur in a database.
• These side effects are commonly referred to as anomalies.
• Insertion Anomalies
• Deletion Anomalies
• Update Anomalies
Insertion Anomaly
• This occurs when you cannot add a row to a table because other data that the table requires is not yet available.

Until the new faculty member is assigned to teach at least one course, his details cannot be recorded.
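
The situation in the caption can be reproduced with a small sketch. The slide's original table did not survive extraction, so the table name `faculty_course`, its columns, and the sample names below are all assumptions made for illustration. The course code is part of the primary key, so a faculty member without a course cannot be stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical single-table design: faculty details exist only on course rows.
conn.execute("""
    CREATE TABLE faculty_course (
        faculty_id   INTEGER NOT NULL,
        faculty_name TEXT    NOT NULL,
        course_code  TEXT    NOT NULL,
        PRIMARY KEY (faculty_id, course_code)
    )
""")


def add_faculty(faculty_id, name, course_code):
    """Try to record a faculty member; return True if the row was stored."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO faculty_course VALUES (?, ?, ?)",
                (faculty_id, name, course_code),
            )
        return True
    except sqlite3.IntegrityError:
        return False


with_course = add_faculty(1, "A. Lecturer", "CS1")   # made-up faculty member
without_course = add_faculty(2, "New Hire", None)    # no course assigned yet
```

Under these assumptions the second call is rejected, which is the insertion anomaly: the design offers no place to put a faculty member who is not yet teaching.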
Deletion Anomaly
• This occurs when you cannot delete a row from a table without losing an important piece of information it contains, for instance when the row you delete is the last one in the table that holds that information.

All information about Dr. Giddens is lost when he temporarily ceases to be assigned to any courses.
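
A minimal sketch of the same effect follows. Only the name Dr. Giddens comes from the slide; the table layout, the office values, the course codes, and the second faculty member are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: faculty facts are stored only alongside a course.
conn.execute(
    "CREATE TABLE faculty_course (faculty_name TEXT, office TEXT, course_code TEXT)"
)
conn.executemany(
    "INSERT INTO faculty_course VALUES (?, ?, ?)",
    [
        ("Dr. Giddens", "Room 12", "CS3"),   # office and course are made up
        ("Dr. Example", "Room 7", "CS1"),    # entirely made-up row
    ],
)

# Dr. Giddens stops teaching CS3, so the course assignment is deleted.
conn.execute(
    "DELETE FROM faculty_course WHERE faculty_name = ? AND course_code = ?",
    ("Dr. Giddens", "CS3"),
)

# Count what the database still knows about Dr. Giddens.
remaining = conn.execute(
    "SELECT COUNT(*) FROM faculty_course WHERE faculty_name = ?",
    ("Dr. Giddens",),
).fetchone()[0]
```

Removing the only course row also removes the office, because nothing else in this design records it.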
Update Anomaly
• These occur when there is unnecessary redundancy in the data: the same fact is stored in several rows, and an update may change some copies but not others.

Employee 519 is shown as having different addresses on different records.
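
One way to spot this kind of inconsistency is to group the rows by employee and look for more than one address. This is a sketch only: the slide's table is not available, so the row layout, the addresses, and the skills below are made up, with only employee number 519 taken from the caption.

```python
from collections import defaultdict

# Hypothetical rows: (employee_id, address, skill)
rows = [
    (426, "87 Sycamore Grove", "Typing"),
    (519, "94 Chestnut Street", "Public Speaking"),
    (519, "96 Walnut Avenue", "Carpentry"),
]


def conflicting_addresses(rows):
    """Map each employee ID to its addresses when more than one is stored."""
    seen = defaultdict(set)
    for employee_id, address, _skill in rows:
        seen[employee_id].add(address)
    return {emp: sorted(addrs) for emp, addrs in seen.items() if len(addrs) > 1}


conflicts = conflicting_addresses(rows)
```

Here `conflicts` reports employee 519 with two addresses. In a design that stores each address once, this state could not arise.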
First Normal Form (1NF)
• Any table that has only one value per cell, or row/column intersection, is in 1NF.
  - Columns contain only scalar values, not arrays.
• There can be only one value per column-row position (field) in a table.
Violation of 1NF (not normalized)
First Name  Last Name  Course Code                     Title
John        Dela Cruz  CS1, CS22, IS101, Math3a        Intro to Computer Concepts, Computer Graphics, Mgmt Info System, College Algebra
Peter       Domingo    CS2, CS3, Math3a, Math6, IS102  Programming 2, Data Structures, College Algebra, Statistics, Quality Assurance
Kaye        Abad       CS1, Math3a, Engl2              Intro to Computer Concepts, College Algebra, Comm. Arts 2
A table in 1NF
FirstName  LastName   CourseCode  Title
John       Dela Cruz  CS1         Intro to Computer Concepts
John       Dela Cruz  CS22        Computer Graphics
John       Dela Cruz  IS101       Mgmt Info System
John       Dela Cruz  Math3a      College Algebra
Peter      Domingo    CS2         Programming 2
Peter      Domingo    CS3         Data Structures
Peter      Domingo    Math3a      College Algebra
Peter      Domingo    Math6       Statistics
Peter      Domingo    IS102       Quality Assurance
Kaye       Abad       CS1         Intro to Computer Concepts
Kaye       Abad       Math3a      College Algebra
Kaye       Abad       Engl2       Comm. Arts 2
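
The step from the unnormalized table to the 1NF table can be sketched as a small function. It assumes that each cell holds a comma-separated list, that titles contain no commas themselves, and that codes and titles pair up by position; the function name `to_first_normal_form` is made up for this example.

```python
def to_first_normal_form(rows):
    """Expand rows whose course cells hold comma-separated lists into one row per course."""
    flat = []
    for first, last, codes, titles in rows:
        code_list = [c.strip() for c in codes.split(",")]
        title_list = [t.strip() for t in titles.split(",")]
        if len(code_list) != len(title_list):
            raise ValueError(f"codes and titles do not pair up for {first} {last}")
        for code, title in zip(code_list, title_list):
            flat.append((first, last, code, title))
    return flat


# Two of the students from the unnormalized table.
unnormalized = [
    ("John", "Dela Cruz", "CS1, CS22, IS101, Math3a",
     "Intro to Computer Concepts, Computer Graphics, Mgmt Info System, College Algebra"),
    ("Kaye", "Abad", "CS1,Math3a, Engl2",
     "Intro to Computer Concepts, College Algebra, Comm. Arts 2"),
]

normalized = to_first_normal_form(unnormalized)
```

Each resulting tuple holds exactly one course code and one title, matching the rows of the 1NF table.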
Second Normal Form (2NF)
• It is in 1NF.
• Every column that is not part of the primary key is functionally dependent on the entire primary key.
Functional Dependency
• Attribute B is functionally dependent on attribute A if, for each value of attribute A, there is exactly one value of attribute B.
• For example, Employee Address is functionally dependent on Employee ID, because a particular Employee ID value corresponds to one and only one Employee Address value.
Employee ID → Employee Address
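
The definition can be tested against sample rows with a short sketch. The helper name `fd_holds` and the employee data are made up for illustration. Note that sample data can only refute a functional dependency; whether one truly holds is a design decision about what the data means.

```python
def fd_holds(rows, determinant, dependent):
    """Return True if each determinant value maps to exactly one dependent value in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in determinant)
        value = tuple(row[c] for c in dependent)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical employee rows.
employees = [
    {"employee_id": 519, "address": "94 Chestnut Street", "skill": "Typing"},
    {"employee_id": 519, "address": "94 Chestnut Street", "skill": "Carpentry"},
    {"employee_id": 426, "address": "87 Sycamore Grove", "skill": "Typing"},
]

id_determines_address = fd_holds(employees, ["employee_id"], ["address"])
id_determines_skill = fd_holds(employees, ["employee_id"], ["skill"])
```

In this sample, Employee ID determines the address but not the skill, because employee 519 appears with two different skills.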
Second Normal Form (2NF)
• To test whether a table is in 2NF, we ask:
  - What is the key to this relation? If the key is concatenated or composite (more than one column or attribute), we further ask:
  - Are there any non-key columns that depend on only part of the key?
Violation of the 2NF
PatientID  RelativeID  Relationship  Patient_Telephone
2490974    GGP001      Father        123-1342
2490974    GGP002      Guardian      123-1342
2490974    GGP003      Mother        123-1342
0803484    PDE001      Brother       789-3421
0803484    PDE002      Uncle         789-3421
9857092    AGE001      Sister        421-5986
Patient_Telephone is not functionally dependent on the entire primary key (PatientID + RelativeID). The Patient_Telephone column is fully dependent on PatientID alone.
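
The 2NF test can be sketched as a check over declared functional dependencies. The dependency list below is an assumption read off the explanation above, not something derived from the data, and `partial_dependencies` is a made-up helper name. The sketch also simplifies by treating the table as having a single key.

```python
def partial_dependencies(key, functional_dependencies):
    """Return declared dependencies in which a non-key column depends on only part of the key."""
    key = frozenset(key)
    found = []
    for determinant, dependents in functional_dependencies:
        determinant = frozenset(determinant)
        non_key = sorted(set(dependents) - key)
        # "<" on frozensets tests for a proper subset.
        if determinant < key and non_key:
            found.append((sorted(determinant), non_key))
    return found


key = ("PatientID", "RelativeID")

# Assumed dependencies for the patient table.
fds = [
    (("PatientID", "RelativeID"), ("Relationship",)),
    (("PatientID",), ("Patient_Telephone",)),
]

violations = partial_dependencies(key, fds)
```

The only violation reported is Patient_Telephone depending on PatientID alone, which is why the table is not in 2NF.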
Normalized – 2NF
PatientID  RelativeID  Relationship
2490974    GGP001      Father
2490974    GGP002      Guardian
2490974    GGP003      Mother
0803484    PDE001      Brother
0803484    PDE002      Uncle
9857092    AGE001      Sister

PatientID  Patient_Telephone
2490974    123-1342
0803484    789-3421
9857092    421-5986
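
The split into two tables can be sketched as two projections of the original rows, followed by a join on PatientID to confirm that nothing was lost. The variable names are chosen for this example only.

```python
# Rows of the original table:
# (PatientID, RelativeID, Relationship, Patient_Telephone)
original = [
    ("2490974", "GGP001", "Father", "123-1342"),
    ("2490974", "GGP002", "Guardian", "123-1342"),
    ("2490974", "GGP003", "Mother", "123-1342"),
    ("0803484", "PDE001", "Brother", "789-3421"),
    ("0803484", "PDE002", "Uncle", "789-3421"),
    ("9857092", "AGE001", "Sister", "421-5986"),
]

# Project onto the two 2NF tables; using sets drops the repeated telephone rows.
relatives = {(p, r, rel) for p, r, rel, _tel in original}
telephones = {(p, tel) for p, _r, _rel, tel in original}

# Rejoin on PatientID to check that the decomposition loses no information.
phone_of = dict(telephones)
rejoined = {(p, r, rel, phone_of[p]) for p, r, rel in relatives}
```

Each telephone number is now stored once per patient, and the rejoined rows equal the original ones.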
Third Normal Form (3NF)
• It is in 2NF.
• All of its columns that are not part of the primary key are mutually independent.
• 3NF is concerned with the removal of transitive dependencies in tables.
• A transitive dependency exists when a column depends on another column that is not the primary key.
Transitive Dependency
• A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of X→Y and Y→Z.
• Example:
  - Column1 is directly dependent on the primary key column; another column, say Column2, is indirectly dependent on the primary key column because of its dependency on Column1.
  - The dependency of Column2 on the primary key is by virtue of its dependency on Column1.
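
The chain X→Y, Y→Z can be followed mechanically by computing an attribute closure: start from the key and keep adding every column that the columns collected so far determine. This is a minimal sketch; the helper name `closure` and the column name `PrimaryKey` are assumptions, while Column1 and Column2 follow the example above.

```python
def closure(attributes, functional_dependencies):
    """Return every column determined, directly or indirectly, by attributes."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependents in functional_dependencies:
            if set(determinant) <= result and not set(dependents) <= result:
                result |= set(dependents)
                changed = True
    return result


# Declared dependencies: PrimaryKey -> Column1 and Column1 -> Column2.
fds = [
    (("PrimaryKey",), ("Column1",)),
    (("Column1",), ("Column2",)),
]

reachable = closure({"PrimaryKey"}, fds)

# Columns the key determines directly, as opposed to through another column.
directly = {col for det, deps in fds if set(det) == {"PrimaryKey"} for col in deps}
transitive_only = reachable - directly - {"PrimaryKey"}
```

Column2 is reachable from the key, but only through Column1, which marks it as transitively dependent.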
Violation of 3NF
EmployeeID  First_Name  Last_Name  Department  Dept_Address
8934        Anna        Burgin     Research    2 Ferman St.
3049        Clarence    Dillon     Research    2 Ferman St.
4589        Elliot      Freeman    Research    2 Ferman St.
7623        Gail        Healy      MIS         3 Gauss Lane
6103        Keith       Jordan     MIS         3 Gauss Lane
4503        Michael     Landon     HR          6 Wiles Place
This table violates 3NF because Dept_Address depends on Department, which is not part of the primary key.
Normalized – 3NF
EmployeeID  First_Name  Last_Name  Department
8934        Anna        Burgin     Research
3049        Clarence    Dillon     Research
4589        Elliot      Freeman    Research
7623        Gail        Healy      MIS
6103        Keith       Jordan     MIS
4503        Michael     Landon     HR

Department  Dept_Address
Research    2 Ferman St.
MIS         3 Gauss Lane
HR          6 Wiles Place
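
The two 3NF tables can be sketched in SQLite to show both the benefit and the cost of the split. The lowercase table and column names and the new address "9 Euler Road" are assumptions for illustration. Changing a department's address touches one row, but reading an employee's department address now needs a join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (
        department   TEXT PRIMARY KEY,
        dept_address TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        first_name  TEXT NOT NULL,
        last_name   TEXT NOT NULL,
        department  TEXT NOT NULL REFERENCES department(department)
    );
""")
conn.executemany("INSERT INTO department VALUES (?, ?)", [
    ("Research", "2 Ferman St."),
    ("MIS", "3 Gauss Lane"),
    ("HR", "6 Wiles Place"),
])
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    (8934, "Anna", "Burgin", "Research"),
    (3049, "Clarence", "Dillon", "Research"),
    (4589, "Elliot", "Freeman", "Research"),
    (7623, "Gail", "Healy", "MIS"),
    (6103, "Keith", "Jordan", "MIS"),
    (4503, "Michael", "Landon", "HR"),
])

# Moving a department means changing exactly one row (the new address is made up).
cursor = conn.execute(
    "UPDATE department SET dept_address = ? WHERE department = ?",
    ("9 Euler Road", "Research"),
)
rows_changed = cursor.rowcount

# Every Research employee sees the new address through the join.
research_addresses = conn.execute("""
    SELECT DISTINCT d.dept_address
    FROM employee AS e
    JOIN department AS d ON d.department = e.department
    WHERE e.department = 'Research'
""").fetchall()
```

In the single-table version the same change would have to be applied to three rows, with the risk of missing one.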
Boyce-Codd Normal Form (BCNF)
• An extension of 3NF for the special case where:
  - There are at least two candidate keys in the table.
  - All the candidate keys are composite keys.
  - There is an overlapping column in the candidate keys.
• For a table to be in BCNF:
  - It must be in 3NF.
  - The columns in all its candidate keys must be functionally independent of one another. More generally, every determinant (the left-hand side of a nontrivial functional dependency) must be a superkey.
Violation of the BCNF
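EmployeeID  Extension  Month     Overtime_Hours
8934        9089       January   2
8934        9089       February  1.5
8934        9089       March     2
7623        8607       January   0
7623        8607       February  3
7623        8607       March     3
4503        3869       January   2
4503        3869       February  8
4503        3869       March     0
Assuming, as the rows suggest, that each employee has exactly one extension and each extension belongs to one employee, the candidate keys are (EmployeeID, Month) and (Extension, Month). They overlap on Month, and EmployeeID and Extension determine each other, so the key columns are not functionally independent.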
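
A BCNF check can be sketched by listing the functional dependencies and testing whether each determinant is a superkey, that is, whether its closure covers every column. The dependencies below are assumptions about this table (one extension per employee, one employee per extension), and the helper names are made up for the example.

```python
def closure(attributes, fds):
    """Return every column determined, directly or indirectly, by attributes."""
    result = set(attributes)
    changed = True
    while changed:
        changed = False
        for determinant, dependents in fds:
            if set(determinant) <= result and not set(dependents) <= result:
                result |= set(dependents)
                changed = True
    return result


def bcnf_violations(columns, fds):
    """Return the nontrivial dependencies whose determinant is not a superkey."""
    all_columns = set(columns)
    violations = []
    for determinant, dependents in fds:
        trivial = set(dependents) <= set(determinant)
        if not trivial and closure(determinant, fds) != all_columns:
            violations.append((determinant, dependents))
    return violations


columns = ("EmployeeID", "Extension", "Month", "Overtime_Hours")

# Assumed dependencies for the overtime table.
fds = [
    (("EmployeeID", "Month"), ("Overtime_Hours",)),
    (("EmployeeID",), ("Extension",)),
    (("Extension",), ("EmployeeID",)),
]

violations = bcnf_violations(columns, fds)
```

Under these assumptions the two dependencies between EmployeeID and Extension are reported, since neither column alone identifies a row. Moving the EmployeeID and Extension pairing into its own table would remove them.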
Fourth Normal Form (4NF)
• It is in BCNF, and therefore also in 3NF.
• There is only one multi-valued dependency per table.
  - A multi-valued dependency is a constraint according to which the presence of certain rows in a table implies the presence of certain other rows.
• A multi-valued dependency occurs when there is a many-to-many relationship between two columns in a table.
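
The idea that some rows imply other rows can be sketched for a three-column table (x, y, z) in which x multi-determines y. For each x value, every y stored with it must appear with every z stored with it. The course, teacher, and book data below are entirely hypothetical and are not taken from the slides.

```python
from collections import defaultdict
from itertools import product


def implied_missing_rows(rows):
    """Return the rows that the multi-valued dependency x ->> y implies but that are absent."""
    present = set(rows)
    ys, zs = defaultdict(set), defaultdict(set)
    for x, y, z in present:
        ys[x].add(y)
        zs[x].add(z)
    # For each x, the dependency requires every (y, z) combination.
    implied = {(x, y, z) for x in ys for y, z in product(ys[x], zs[x])}
    return implied - present


# Hypothetical rows: (course, teacher, book)
complete = [
    ("CS1", "Ana", "Text A"),
    ("CS1", "Ana", "Text B"),
    ("CS1", "Ben", "Text A"),
    ("CS1", "Ben", "Text B"),
]
incomplete = complete[:3]

missing_when_complete = implied_missing_rows(complete)
missing_when_incomplete = implied_missing_rows(incomplete)
```

With the last row left out, the remaining three rows imply that ("CS1", "Ben", "Text B") must also be present. Storing teachers and books for a course in separate tables avoids having to maintain such combinations.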
Violation of the 4NF
AuthorID     BookTitle                  Editor
Smith023     Monkeys are from Asteroid  Lisa
Smith023     How to Exercise            Catherine
Smith023     How to Exercise            John
Williams153  Monkeys are from Asteroid  John
Williams153  AP Programming             Elaine
Johnson823   Dogs are from Hale-Bopp    Lisa
Johnson823   Monkeys are from Asteroid  Silvia

There is no functional dependency between the BookTitle and Editor columns.
End of Database Normalization
Any questions?
