
Normalization

Normalization is the process of removing redundancy from the data in a database. It comprises a
set of rules called Normal Forms. Tables can be related or linked to each other through key or
index identifiers. A key or index identifier identifies a row of data in a table, much as an index
in a book locates an item of interest without the reader having to read the whole book.

There are six levels of Normalization: 1st, 2nd, 3rd, Boyce-Codd (BCNF), 4th, and 5th Normal
Forms. Each Normal Form is a refinement of the previous one. 4th and 5th Normal Forms are rarely
applied.

First Normal Form:


Rules for 1NF:

• Eliminate duplicate columns from the table.
• Create separate tables for each group of related data and identify each row with a unique
column (the primary key).

Example of 1NF:

Consider a sales order record that contains the order and sales details of a customer.
Initially, all of this information is held in a single SalesOrder table with the following
columns.

Order Number (PK)  Customer Phone  City       State  ZIP      Order Date  Shipping Date
1                  9859658524      Hyderabad  AP     5000082  1998-06-13  1998-06-20

Repeating item columns in the same row: Item quantity, Item desc 1, Item quantity 1, Item price 1,
Item desc 2, Item price 2 (values: 5, 6, 15, 25)

Applying 1NF splits the table into two:

Sales Order

Order Number (PK)  Customer Phone  City       State  ZIP      Order Date  Shipping Date
1                  9859658524      Hyderabad  AP     5000082  1998-06-13  1998-06-20

Sales Order Item

Order Number (FK)  Order Item Number (PK)  Description  Quantity  Price
1                  878                     Jeans        2         2000

Now the table is in 1NF.
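As a rough illustration, the two 1NF tables above could be created with Python's built-in sqlite3 module. The table and column names (sales_order, sales_order_item, and so on) and the column types are assumptions chosen to mirror the example; this is a sketch, not a prescribed schema.

```python
import sqlite3

# In-memory database, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Assumed schema mirroring the two tables in the 1NF example.
conn.executescript("""
CREATE TABLE sales_order (
    order_number   INTEGER PRIMARY KEY,
    customer_phone TEXT,
    city           TEXT,
    state          TEXT,
    zip            TEXT,
    order_date     TEXT,
    shipping_date  TEXT
);
CREATE TABLE sales_order_item (
    order_item_number INTEGER PRIMARY KEY,
    order_number      INTEGER NOT NULL REFERENCES sales_order(order_number),
    description       TEXT,
    quantity          INTEGER,
    price             INTEGER
);
""")

conn.execute(
    "INSERT INTO sales_order VALUES (?, ?, ?, ?, ?, ?, ?)",
    (1, "9859658524", "Hyderabad", "AP", "5000082", "1998-06-13", "1998-06-20"),
)
conn.execute(
    "INSERT INTO sales_order_item VALUES (?, ?, ?, ?, ?)",
    (878, 1, "Jeans", 2, 2000),
)

# Each item is a separate row linked to its order, so an order can hold
# any number of items without adding columns.
rows = conn.execute("""
    SELECT o.order_number, i.description, i.quantity, i.price
    FROM sales_order AS o
    JOIN sales_order_item AS i ON i.order_number = o.order_number
""").fetchall()
print(rows)
```

Adding a second item to order 1 then needs only one more row in sales_order_item, rather than another set of item columns.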

2nd Normal Form

A table is in second Normal Form (2NF) when it is in 1NF and every non-primary-key column depends
on the whole primary key. A column should not depend on only part of the primary key.
When a column depends on part of a composite primary key, this is called a partial dependency.

A table in 1NF is in 2NF if any of the following conditions is met:
1. The primary key is not a composite key; that is, it is made up of only one column.
2. All the non-primary-key columns depend on the whole primary key.
3. The table has no non-primary-key columns.

In the example, the customer information does not depend on the primary key, Order Number, so it
is moved into a separate table.

Sales Order

Order Number (PK)  Customer Number (FK)  Order Date  Shipping Date
1                  256                   1998-06-13  1998-06-20

Sales Order Item

Order Number (FK)  Order Item Number (PK)  Description  Quantity  Price
1                  878                     Jeans        2         2000

Customer

Customer Number (PK)  Customer Phone  City       State  ZIP
256                   9859658524      Hyderabad  AP     5000082

The tables are now in 2NF.
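One way to look for partial dependencies is to test sample rows for columns that are determined by only part of a composite key. The sketch below does this in plain Python. The sample rows, the column names (order_number, item_number, order_date, quantity), and both helper functions are hypothetical and serve only as illustration. A dependency that holds in sample data is only a hint; the real dependencies come from the business rules.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """Return True if, in these rows, equal values in lhs always imply equal values in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def partial_dependencies(rows, primary_key, non_key_columns):
    """List (key_part, column) pairs where a column depends on a proper subset of the key."""
    found = []
    for size in range(1, len(primary_key)):
        for key_part in combinations(primary_key, size):
            for column in non_key_columns:
                if fd_holds(rows, key_part, [column]):
                    found.append((key_part, column))
    return found


# Hypothetical rows with the composite primary key (order_number, item_number).
rows = [
    {"order_number": 1, "item_number": 1, "order_date": "1998-06-13", "quantity": 2},
    {"order_number": 1, "item_number": 2, "order_date": "1998-06-13", "quantity": 5},
    {"order_number": 2, "item_number": 1, "order_date": "1998-06-14", "quantity": 2},
    {"order_number": 2, "item_number": 2, "order_date": "1998-06-14", "quantity": 1},
]

violations = partial_dependencies(
    rows, ["order_number", "item_number"], ["order_date", "quantity"]
)
print(violations)
```

In these rows order_date is determined by order_number alone, so it would belong in a table keyed by order_number only, while quantity needs the whole key.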

3rd Normal Form


A table is in 3NF when its non-primary-key columns depend directly on the primary key and not on
any other non-primary-key column. A table in 2NF from which all transitive dependencies have been
removed is in 3NF.
A transitive dependency is a situation where a non-primary-key column of a table depends on
another non-key column.
Steps to convert a table to 3NF:
1. Identify the non-primary-key columns that depend on other non-primary-key columns.
2. Remove those columns from the base table and create another table containing them, together
with the non-primary-key column they depend on, which becomes the primary key of the new table.
3. Create a foreign key in the base table linking it to the primary key in the new table.
A table in 3NF must not contain any transitive dependency.

In this example, City and State do not depend on Customer Number; they depend on the ZIP code.

Customer Number (PK)  Customer Phone  City       State  ZIP
256                   9859658524      Hyderabad  AP     5000082

To bring this table into 3NF, City and State are moved, together with ZIP, into a separate table.

Customer Number (PK)  ZIP (FK)  Customer Phone
256                   5000082   9859658524

ZIP (PK)  City       State
5000082   Hyderabad  AP
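The three conversion steps can be sketched in plain Python on the Customer row from the example. The function name decompose_3nf and the lowercase column names are assumptions made for this illustration.

```python
def decompose_3nf(rows, determinant, dependents):
    """Split rows into a base table and a lookup table keyed by `determinant`.

    `dependents` are the non-key columns that depend on `determinant`
    rather than directly on the primary key.
    """
    base = []
    lookup = {}
    for row in rows:
        # Step 2: remove the dependent columns from the base table...
        base.append({c: v for c, v in row.items() if c not in dependents})
        # ...and store them once per determinant value in the new table.
        lookup[row[determinant]] = {
            determinant: row[determinant],
            **{c: row[c] for c in dependents},
        }
    # Step 3: the determinant column stays in the base table as the foreign key.
    return base, list(lookup.values())


customer = [
    {
        "customer_number": 256,
        "customer_phone": "9859658524",
        "city": "Hyderabad",
        "state": "AP",
        "zip": "5000082",
    }
]

# Step 1: city and state were identified as depending on zip.
customer_3nf, zip_table = decompose_3nf(customer, "zip", ["city", "state"])
print(customer_3nf)
print(zip_table)
```

If several customers share a ZIP code, the city and state are stored only once in the lookup table.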

Boyce-Codd Normal Form (BCNF):

• When a relation has more than one candidate key, anomalies may result even though the
relation is in 3NF.
• 3NF does not deal satisfactorily with the case of a relation with overlapping candidate
keys, i.e. composite candidate keys with at least one attribute in common.
• BCNF is based on the concept of a determinant.
• A determinant is any attribute (simple or composite) on which some other attribute is
fully functionally dependent.
• A relation is in BCNF if, and only if, every determinant is a candidate key.

Consider a table with the attributes Person, Shop Type, and Nearest Shop. For each Person / Shop
Type combination, the table tells us which shop of this type is geographically nearest to the
person's home. We assume for simplicity that a single shop cannot be of more than one type.

The candidate keys of the table are:

• {Person, Shop Type}
• {Person, Nearest Shop}

Because all three attributes are prime attributes (i.e. belong to candidate keys), the table is in
3NF. The table is not in BCNF, however, as the Shop Type attribute is functionally dependent on
a non-superkey: Nearest Shop.
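The BCNF test can be checked mechanically with the standard attribute-closure computation. The sketch below encodes the two functional dependencies described above ({Person, Shop Type} determines Nearest Shop, and Nearest Shop determines Shop Type). The function names closure and bcnf_violations are assumptions for this illustration.

```python
def closure(attrs, fds):
    """Return every attribute functionally determined by `attrs` under `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Return the nontrivial dependencies whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


attrs = {"Person", "Shop Type", "Nearest Shop"}
fds = [
    (("Person", "Shop Type"), ("Nearest Shop",)),
    (("Nearest Shop",), ("Shop Type",)),
]

# Both candidate keys determine every attribute of the table.
key_1_closure = closure(("Person", "Shop Type"), fds)
key_2_closure = closure(("Person", "Nearest Shop"), fds)

violations = bcnf_violations(attrs, fds)
print(violations)
```

Only the dependency from Nearest Shop to Shop Type is reported, because Nearest Shop on its own does not determine Person.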

The violation of BCNF means that the table is subject to anomalies. For example, Eagle Eye
might have its Shop Type changed to "Optometrist" on its "Fuller" record while retaining the
Shop Type "Optician" on its "Davidson" record. This would imply contradictory answers to the
question: "What is Eagle Eye's Shop Type?" Holding each shop's Shop Type only once would
seem preferable, as doing so would prevent such anomalies from occurring:

Person  Shop Type  Nearest Shop
James   Optician   Chermas
Peter   Book Shop  Sweet shop
Robert  Bakery     Book Shop

Person  Shop Type  Nearest Shop
James   Optician   Chermas
Peter   Book Shop  Sweet shop
Robert  Bakery     Book Shop

Shop       Shop Type
Optician   Optician
Book Shop  Book Shop
Bakery     Bakery

4th Normal Form

Although BCNF removes anomalies due to functional dependencies, another type of dependency,
called a multi-valued dependency (MVD), can also cause data redundancy.
Multi-valued dependencies can arise in a relation because 1NF does not allow an attribute to hold
a set of values, so each combination of values must be stored as a separate row.

Multi-valued Dependency (MVD)

An MVD is a dependency between attributes (for example, A, B, and C) in a relation, such that for
each value of A there is a set of values for B and a set of values for C. However, the sets of
values for B and C are independent of each other.

An MVD between attributes A, B, and C in a relation is represented using the following notation:

A −>> B
A −>> C
A multi-valued dependency can be further defined as being trivial or nontrivial.
An MVD A −>> B in relation R is defined as being trivial if (a) B is a subset of A or
(b) A ∪ B = R.
An MVD is defined as being nontrivial if neither (a) nor (b) is satisfied.
A trivial MVD does not specify a constraint on a relation, while a nontrivial MVD does specify a
constraint.
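To make the definition concrete, the sketch below tests whether A −>> B holds in a set of rows. For each value of A, it checks that every combination of a B value with a value of the remaining attributes is present. The function name mvd_holds and the sample values (b1, b2, c1, c2) are hypothetical.

```python
from collections import defaultdict
from itertools import product


def mvd_holds(rows, a, b):
    """Return True if the MVD a ->> b holds in `rows` (a list of dicts)."""
    rest = [c for c in rows[0] if c not in a and c not in b]
    groups = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in a)
        groups[key].add((tuple(row[c] for c in b), tuple(row[c] for c in rest)))
    for pairs in groups.values():
        b_values = {pair[0] for pair in pairs}
        rest_values = {pair[1] for pair in pairs}
        # B and the remaining attributes are independent only if every
        # combination of their values appears for this value of A.
        if pairs != set(product(b_values, rest_values)):
            return False
    return True


# Hypothetical rows: for A = 1, B takes {b1, b2} and C takes {c1, c2}.
rows_ok = [
    {"A": 1, "B": "b1", "C": "c1"},
    {"A": 1, "B": "b1", "C": "c2"},
    {"A": 1, "B": "b2", "C": "c1"},
    {"A": 1, "B": "b2", "C": "c2"},
]
rows_bad = rows_ok[:3]  # the combination (b2, c2) is missing

holds_full = mvd_holds(rows_ok, ["A"], ["B"])
holds_partial = mvd_holds(rows_bad, ["A"], ["B"])
print(holds_full, holds_partial)
```

The four rows in rows_ok also show the redundancy an MVD causes: two B values and two C values need four rows in a single table.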

4NF is defined as a relation that is in Boyce-Codd Normal Form and contains no nontrivial
multi-valued dependencies.

ID (PK)  Name   Salary  Skills      Certifications
1        Sony   20000   Management  MBA
2        James  18000   Java        SCJP

This is converted into 4NF as follows:

ID (PK)  Name   Salary
1        Sony   20000
2        James  18000

ID (FK)  Skills
1        Management
2        Java

ID (FK)  Certifications
1        MBA
2        SCJP
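The split above amounts to taking three projections of the original table. A minimal sketch in plain Python, using the rows from the example; the helper name project is an assumption.

```python
def project(rows, columns):
    """Keep only `columns` from each row and drop duplicate rows."""
    projected = []
    for row in rows:
        candidate = {c: row[c] for c in columns}
        if candidate not in projected:
            projected.append(candidate)
    return projected


employee_rows = [
    {"ID": 1, "Name": "Sony", "Salary": 20000,
     "Skills": "Management", "Certifications": "MBA"},
    {"ID": 2, "Name": "James", "Salary": 18000,
     "Skills": "Java", "Certifications": "SCJP"},
]

employees = project(employee_rows, ["ID", "Name", "Salary"])
skills = project(employee_rows, ["ID", "Skills"])
certifications = project(employee_rows, ["ID", "Certifications"])

print(employees)
print(skills)
print(certifications)
```

With this layout, recording a second skill for an employee adds one row to the skills table and leaves the certifications untouched.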

5th Normal Form

A table is in 5NF if and only if it is in 4NF and every join dependency in it is implied by the
candidate keys.
Sometimes a table cannot be split into two tables without losing information, yet it can be
split into three or more; the rules of 5NF apply in that case.
A table in 4NF is usually also in 5NF, but sometimes a real-world constraint causes the relation
not to comply with 5NF.

5th Normal Form divides related columns into separate tables based on those relationships. In
the example, Product, Manager, and Employee are all related to each other. Thus, three separate
entities can be created to explicitly define those inter-relationships. The result is information
that can be reconstructed from smaller parts. An additional purpose of 5th Normal Form is to
remove redundancy or duplication not covered by the application of the 1st to 4th Normal Forms.

5NF is defined as a relation that has no join dependency other than those implied by its
candidate keys.


© Pearson

Product  Manager  Employee
CCUI     PM       Sony
WSI      PDM      James

Product  Manager
CCUI     PM
WSI      PDM

Manager  Employee
PM       Sony
PDM      James

Product  Employee
CCUI     Sony
WSI      James
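A split into three tables is only safe if joining them back yields exactly the original rows. The sketch below checks this for the example data by joining the three projections. The helper name project and the variable names are assumptions for this illustration.

```python
def project(rows, columns):
    """Return the set of distinct value tuples for `columns`."""
    return {tuple(row[c] for c in columns) for row in rows}


rows = [
    {"Product": "CCUI", "Manager": "PM", "Employee": "Sony"},
    {"Product": "WSI", "Manager": "PDM", "Employee": "James"},
]

product_manager = project(rows, ["Product", "Manager"])
manager_employee = project(rows, ["Manager", "Employee"])
product_employee = project(rows, ["Product", "Employee"])

# Natural join of the three projections: keep a (product, manager, employee)
# combination only if all three pairings are present.
rejoined = {
    (product, manager, employee)
    for (product, manager) in product_manager
    for (manager_2, employee) in manager_employee
    if manager == manager_2 and (product, employee) in product_employee
}

original = project(rows, ["Product", "Manager", "Employee"])
lossless = rejoined == original
print(lossless)
```

If the join produced rows that are not in the original table, the three-way split would lose information and should not be applied.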
