# Normalization

Presented by: Faizan Ahmed PGDM 2011-13

Database Normalization

Database normalization is the process of removing redundant data from your tables in order to improve storage efficiency, data integrity, and scalability. In the relational model, the degree to which a database is normalized is classified by normal forms (NF), and there are algorithms for converting a given database from one form to another. Normalization generally involves splitting existing tables into multiple tables, which must be re-joined or linked whenever a query needs the combined data.

History

Edgar F. Codd first proposed the process of normalization, and what came to be known as the first normal form, in his 1970 paper A Relational Model of Data for Large Shared Data Banks. Codd stated: “There is, in fact, a very simple elimination procedure which we shall call normalization. Through decomposition non simple domains are replaced by ‘domains whose elements are atomic (non decomposable) values.’”

Normal Form

Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF. Others are now generally accepted as well, but 3NF is widely considered sufficient for most applications. Most tables that reach 3NF are also in BCNF (Boyce-Codd Normal Form).

First Normal Form (1 NF)

All values in the columns are atomic. That is, each column holds a single value per row, not a list of values. There are no repeating groups: two columns do not store similar information in the same table. Basically: 1 NF eliminates duplicate columns.

1st Normal Form Example
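Un-normalized Students table:

| Student# | AdvID | AdvName | AdvRoom | Class1 | Class2 |
|----------|-------|---------|---------|--------|--------|
| 123      | 123A  | James   | 555     | 102-8  | 104-9  |
| 124      | 123B  | Smith   | 467     | 209-0  | 102-8  |

Normalized Students table:

| Student# | AdvID | AdvName | AdvRoom | Class# |
|----------|-------|---------|---------|--------|
| 123      | 123A  | James   | 555     | 102-8  |
| 123      | 123A  | James   | 555     | 104-9  |
| 124      | 123B  | Smith   | 467     | 209-0  |
| 124      | 123B  | Smith   | 467     | 102-8  |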
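The conversion in this example can be sketched in a few lines of Python. This is only an illustration: the helper name `to_first_normal_form` and its parameters are hypothetical, and the rows are copied from the slide.

```python
# Un-normalized rows from the example: one column per class slot.
unnormalized = [
    {"Student#": 123, "AdvID": "123A", "AdvName": "James", "AdvRoom": 555,
     "Class1": "102-8", "Class2": "104-9"},
    {"Student#": 124, "AdvID": "123B", "AdvName": "Smith", "AdvRoom": 467,
     "Class1": "209-0", "Class2": "102-8"},
]


def to_first_normal_form(rows, repeating=("Class1", "Class2"), target="Class#"):
    """Emit one output row per value found in the repeating group."""
    result = []
    for row in rows:
        # Columns that are not part of the repeating group are copied as-is.
        base = {k: v for k, v in row.items() if k not in repeating}
        for column in repeating:
            if row.get(column) is not None:
                result.append({**base, target: row[column]})
    return result


normalized = to_first_normal_form(unnormalized)
for row in normalized:
    print(row)
```

Each student now appears once per class, so the adviser columns are repeated. That redundancy is what the next normal forms remove.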

Second Normal Form (2 NF)

A relation is in 2 NF if it is in 1 NF and every non-key attribute is fully functionally dependent on the primary key.
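To make "fully functionally dependent" concrete, here is a minimal sketch that tests whether a functional dependency holds in a set of sample rows. It assumes the primary key of the 1NF Students table is (Student#, Class#), which the slides do not state explicitly, and the helper `holds` is hypothetical. Sample data can only contradict a dependency, never prove it for all future rows.

```python
def holds(rows, lhs, rhs):
    """Return True if no two rows agree on lhs but differ on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ("Student#", "AdvID", "AdvName", "AdvRoom", "Class#")
data = [
    (123, "123A", "James", 555, "102-8"),
    (123, "123A", "James", 555, "104-9"),
    (124, "123B", "Smith", 467, "209-0"),
    (124, "123B", "Smith", 467, "102-8"),
]
rows = [dict(zip(columns, r)) for r in data]

# Adviser columns depend on Student# alone, which is only part of the
# assumed key (Student#, Class#): a partial dependency, so not 2 NF.
partial = holds(rows, ["Student#"], ["AdvID", "AdvName", "AdvRoom"])
by_class = holds(rows, ["Class#"], ["AdvID"])
print(partial, by_class)
```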

2nd Normal Form Example
Students table:

| Student# | AdvID |
|----------|-------|
| 123      | 123A  |
| 124      | 123B  |

Registration table:

| Student# | Class# |
|----------|--------|
| 123      | 102-8  |
| 123      | 104-9  |
| 124      | 209-0  |
| 124      | 102-8  |
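As a rough sketch of how the two tables are re-joined at query time, the example below loads them into an in-memory SQLite database through Python's standard `sqlite3` module. The table and column names (`students`, `registration`, `student_no`, `adv_id`, `class_no`) are assumptions chosen for valid SQL identifiers. They do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students (
    student_no INTEGER PRIMARY KEY,
    adv_id     TEXT
);
CREATE TABLE registration (
    student_no INTEGER REFERENCES students(student_no),
    class_no   TEXT,
    PRIMARY KEY (student_no, class_no)
);
INSERT INTO students VALUES (123, '123A'), (124, '123B');
INSERT INTO registration VALUES
    (123, '102-8'), (123, '104-9'), (124, '209-0'), (124, '102-8');
""")

# Joining on the shared key rebuilds one row per (student, class) pair.
joined = conn.execute("""
    SELECT s.student_no, s.adv_id, r.class_no
    FROM students AS s
    JOIN registration AS r ON r.student_no = s.student_no
    ORDER BY s.student_no, r.class_no
""").fetchall()

for row in joined:
    print(row)
```

The adviser for each student is stored once in `students`, yet the join still returns it alongside every class the student takes.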

Third Normal Form (3 NF)

A relation is in 3 NF if it is in 2 NF and no transitive dependencies exist. A transitive dependency is a functional dependency between non-key attributes. Basically: 3 NF eliminates columns that do not depend directly on the primary key.

Transitive Dependency

[Figure: the relation (Cust_ID, Name, Salesperson, Region) contains a transitive dependency and is split into two relations, (Salesperson, Region) and (Cust_ID, Name, Salesperson).]
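The split of the (Cust_ID, Name, Salesperson, Region) relation can be sketched as two projections. The example assumes that Region depends on Salesperson, as the decomposition suggests. The sample rows and the `project` helper are hypothetical and serve only to illustrate the idea.

```python
# Hypothetical sample rows; the slides give no data for this relation.
customers = [
    {"Cust_ID": 1, "Name": "Acme", "Salesperson": "Smith", "Region": "South"},
    {"Cust_ID": 2, "Name": "Bolt", "Salesperson": "Hicks", "Region": "West"},
    {"Cust_ID": 3, "Name": "Core", "Salesperson": "Smith", "Region": "South"},
]


def project(rows, attrs):
    """Keep only attrs and drop duplicate tuples, preserving first-seen order."""
    return list(dict.fromkeys(tuple(row[a] for a in attrs) for row in rows))


# Cust_ID -> Salesperson -> Region is the assumed transitive dependency,
# so Region moves to its own relation keyed by Salesperson.
sales = project(customers, ["Cust_ID", "Name", "Salesperson"])
salespeople = project(customers, ["Salesperson", "Region"])

print(sales)
print(salespeople)
```

After the split, the region of each salesperson is stored once, so changing it means updating a single row.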

Boyce-Codd Normal Form (BCNF)

A relation is in BCNF if it is in 3 NF and every determinant is a candidate key; in other words, each determinant can be used as a primary key.

Determinant: an attribute on which some other attribute is fully functionally dependent.

Ex: A --> B (A is called the determinant)

BCNF Example

Given: R (A, B, C, D)

A --> B, C, D
B --> A, C, D
C --> A, B, D
D --> A, B, C

Determinants: A, B, C, and D

Candidate keys: A, B, C, and D

Since all the determinants are candidate keys, this is BCNF.
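The same conclusion can be checked mechanically with attribute closures. The sketch below is a simplified test, not a full normalization algorithm: it only inspects the dependencies listed, and the function names `closure` and `is_bcnf` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(relation, fds):
    """True if the left side of every non-trivial listed FD is a superkey."""
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if not closure(lhs, fds) >= set(relation):
            return False
    return True


# R(A, B, C, D) with the four dependencies from the example.
relation = "ABCD"
fds = [("A", "BCD"), ("B", "ACD"), ("C", "ABD"), ("D", "ABC")]
example_ok = is_bcnf(relation, fds)

# Counter-example: in R(A, B, C) with A -> B and B -> C, B is not a key.
counter_ok = is_bcnf("ABC", [("A", "B"), ("B", "C")])

print(example_ok, counter_ok)
```

In the example every single attribute determines all the others, so each determinant's closure is the whole relation and the test passes.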
