Database Normalization
Proposed by Codd (1972)
Introduced three normal forms: first (1NF), second (2NF) and third (3NF)
A stronger definition of 3NF, called Boyce-Codd
normal form (BCNF), was proposed later
Later, 4NF and 5NF were proposed
The minimum, and most common, goal is to achieve 3NF.
Database Normalization
Normalization is the process of analyzing a given
relational schema, based on its functional dependencies
and keys, to achieve the desirable properties of:
Minimizing redundancy
Minimizing the insertion, deletion, and updating
anomalies
Minimizing data storage
Unsatisfactory relation schemas that do not meet a
given normal form test are decomposed into smaller
relation schemas that meet the test and hence
possess the desired properties.
The key concepts in normalization are functional dependencies
and keys
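Since functional dependencies drive every normal form test, it can help to see what "X determines Y" means on actual rows. The following is a minimal sketch, not part of the original slides: the function name `fd_holds` and the sample rows are hypothetical. Note that sample data can only refute a dependency; whether an FD holds for the schema is a design decision.

```python
from typing import Iterable, Mapping, Sequence


def fd_holds(rows: Iterable[Mapping[str, object]],
             lhs: Sequence[str], rhs: Sequence[str]) -> bool:
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample rows, invented for illustration only.
employees = [
    {"EmpNo": 1, "EName": "Ann", "Salary": 3000, "Address": "Leeds"},
    {"EmpNo": 2, "EName": "Bob", "Salary": 3000, "Address": "Kent"},
    {"EmpNo": 3, "EName": "Cid", "Salary": 4000, "Address": "Leeds"},
]

empno_determines_name = fd_holds(employees, ["EmpNo"], ["EName"])
salary_determines_address = fd_holds(employees, ["Salary"], ["Address"])
print(empno_determines_name, salary_determines_address)
```

Here EmpNo determines EName in the sample, while Salary does not determine Address because two rows share a salary but have different addresses.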
Example
Sales(Order#, Date, CustID, Name, Address, City,
State, Zip, {Product#, ProductDesc, Price,
QuantityOrdered}, Subtotal, Tax, S&H, Total)
Deptno  Dname      Location
10      IT         Leeds, Bradford, Kent
20      Research   Hundredfold
30      Marketing  Leeds

Deptno  Dname
10      IT
20      Research
30      Marketing

Deptno  Location
10      Leeds
10      Bradford
10      Kent
20      Hundredfold
30      Leeds
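The department example above moves a multi-valued Location attribute into its own relation, keyed by Deptno. A rough sketch of that split is shown here; the helper name `split_multivalued` and the list-valued representation of Location are assumptions made for illustration, and the spelling "Bradford" is assumed for the misspelled source value.

```python
def split_multivalued(rows, key, multi_attr):
    """Split rows into a base relation and a (key, value) detail relation."""
    # Base relation: every attribute except the multi-valued one.
    base = [{a: v for a, v in row.items() if a != multi_attr} for row in rows]
    # Detail relation: one row per (key, single value) pair.
    detail = [{key: row[key], multi_attr: value}
              for row in rows
              for value in row[multi_attr]]
    return base, detail


# Unnormalized rows: Location holds several values per department.
dept_unf = [
    {"Deptno": 10, "Dname": "IT", "Location": ["Leeds", "Bradford", "Kent"]},
    {"Deptno": 20, "Dname": "Research", "Location": ["Hundredfold"]},
    {"Deptno": 30, "Dname": "Marketing", "Location": ["Leeds"]},
]

dept, dept_location = split_multivalued(dept_unf, "Deptno", "Location")
for row in dept_location:
    print(row["Deptno"], row["Location"])
```

Every value in the resulting relations is atomic, and joining the two relations on Deptno recovers the original department-location pairs.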
Second Normal Form (2NF)
Each non-key attribute must be fully functionally dependent on
the whole primary key.
• If the primary key is a single attribute, then a relation in 1NF is automatically in 2NF
• The test for 2NF involves testing for FDs whose left-hand-side
attributes are part of the primary key
• Disallow partial dependencies, where non-key attributes depend on
only part of a composite primary key
• In short, remove partial dependencies
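The 2NF test described above can be sketched as a small check over a list of functional dependencies. This is a simplified illustration under stated assumptions: it inspects only the FDs given to it (it does not compute closures), it assumes a single candidate key, and the function name `partial_dependencies` is hypothetical. The order-line relation and its FDs are assumptions derived from the attribute names in the Sales example, not something the slides state.

```python
def partial_dependencies(primary_key, fds):
    """Return FDs whose left side is a proper subset of the primary key
    and whose right side contains at least one non-key attribute."""
    key = frozenset(primary_key)
    found = []
    for lhs, rhs in fds:
        lhs, rhs = frozenset(lhs), frozenset(rhs)
        non_key = rhs - key
        # lhs < key is the proper-subset test: only part of the key
        if lhs < key and non_key:
            found.append((sorted(lhs), sorted(non_key)))
    return found


# Assumed relation: OrderLine(Order#, Product#, ProductDesc, Price, QuantityOrdered)
order_line_key = ["Order#", "Product#"]
order_line_fds = [
    (["Order#", "Product#"], ["QuantityOrdered"]),  # depends on the full key
    (["Product#"], ["ProductDesc", "Price"]),       # depends on part of the key
]

violations = partial_dependencies(order_line_key, order_line_fds)
print(violations)
```

Under these assumed FDs, ProductDesc and Price depend on Product# alone, so they would move to a separate product relation keyed by Product#.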
Relation X2(EmpNo, EName, Salary, Address)
Third Normal Form (3NF)
Remove transitive dependencies.
Transitive dependency
A non-prime attribute depends on another non-prime
attribute or attributes
An attribute is the result of a calculation
Examples:
Area code attribute determined by the City attribute of a customer
Total price attribute of an order entry calculated from the quantity
attribute and unit price attribute (calculated value)
Solution:
• Any transitive dependencies are moved into a smaller table.
Transitive Dependence
Given a relation R(EmpNo, EName, Salary, Address)
Assume the following FD holds:
EName → Address
Note: Both EName and Address are non-key attributes in R. Since
Address depends on the non-prime attribute EName, which in turn depends on the primary
key (EmpNo), a transitive dependency exists:
EmpNo → EName, EName → Address, therefore EmpNo → Address
Decomposing R removes the transitive dependency:
R1(EmpNo, EName, Salary)
R2(EName, Address)
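The decomposition of R into R1 and R2 can be checked on sample data by projecting R onto each attribute set and joining the results back on EName. The sketch below uses hypothetical rows and helper names (`project`, `natural_join`) that are not from the slides; the rows are chosen so that the assumed FD from EName to Address holds.

```python
def project(rows, attrs):
    """Project rows onto attrs, dropping duplicate tuples."""
    seen, out = set(), []
    for row in rows:
        t = tuple(row[a] for a in attrs)
        if t not in seen:
            seen.add(t)
            out.append(dict(zip(attrs, t)))
    return out


def natural_join(left, right, on):
    """Join two lists of rows on a single shared attribute."""
    return [{**l, **r} for l in left for r in right if l[on] == r[on]]


# Hypothetical rows for R(EmpNo, EName, Salary, Address).
r = [
    {"EmpNo": 1, "EName": "Ann", "Salary": 3000, "Address": "Leeds"},
    {"EmpNo": 2, "EName": "Bob", "Salary": 3500, "Address": "Kent"},
    {"EmpNo": 3, "EName": "Ann", "Salary": 4000, "Address": "Leeds"},
]

r1 = project(r, ["EmpNo", "EName", "Salary"])
r2 = project(r, ["EName", "Address"])   # each address is now stored once per name
rejoined = natural_join(r1, r2, "EName")
print(len(r2), rejoined == r)
```

In R2 the address for each name appears only once, which removes the redundancy, and the join reproduces the original rows because EName determines Address in this sample.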
Figure 5.7
Figure 5.8: The Decomposition of a Table Structure to Meet BCNF Requirements
Table 5.2: Sample Data for a BCNF Conversion
Figure 5.9: Decomposition into BCNF