
NORMALIZATION

Normalization is a technique for producing a set of relations with desirable
properties, given the data requirements of an enterprise.

The process of normalization is a formal method that identifies relations based
on their primary or candidate keys, their foreign keys, and the functional
dependencies among their attributes.
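
To make "functional dependency" concrete, here is a minimal Python sketch. The helper name `holds` is an illustrative assumption, and the sample rows borrow column names and values from the ORDERS example used later in these notes. A dependency X -> Y holds in a table when any two rows that agree on X also agree on Y.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key is remembered; a different value
        # for the same key later on contradicts the dependency.
        if seen.setdefault(key, val) != val:
            return False
    return True


rows = [
    {"ORDNO": 100, "CNO": 10, "CNAME": "ANIL"},
    {"ORDNO": 101, "CNO": 20, "CNAME": "JAGAN"},
    {"ORDNO": 102, "CNO": 10, "CNAME": "ANIL"},
]

print(holds(rows, ["CNO"], ["CNAME"]))  # each CNO has one CNAME in this sample
print(holds(rows, ["CNO"], ["ORDNO"]))  # customer 10 has two different orders
```

Note that sample data can only refute a dependency. Whether a dependency really holds is a statement about the enterprise's rules, not about the rows that happen to be stored.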

The Process of Normalization

•Normalization is often executed as a series of steps. Each step corresponds to a
specific normal form that has known properties.
•As normalization proceeds, the relations become progressively more restricted in
format, and also less vulnerable to update anomalies.
•For the relational data model, it is important to recognize that it is only first
normal form (1NF) that is critical in creating relations. All the subsequent normal
forms are optional.
Six normal forms are commonly described: 1NF, 2NF, 3NF, BCNF, 4NF and 5NF.
Each normal form defines a set of rules that the tables must satisfy. Most
practical database designs stop at 3NF.

Unnormalized form (UNF)


A table that contains one or more repeating groups.

ORDERS

ORDNO  ORDDATE  CNO  CNAME  CADD          ITEMS
100    1/1/99   10   ANIL   Hyderabad     I1,Item1,25,100; I2,Item2,20,120
101    1/2/99   20   JAGAN  Secunderabad  I3,Item3,20,80; I2,Item2,10,60
102    1/2/99   10   ANIL   Hyderabad     I3,Item3,25,100

First Normal Form (1NF):

1. Every table must contain a single value at the intersection of each row and column.
2. The table must not contain repeating groups of data.

Example:
In the ORDERS table above, the customer information is repeated, so we
move that information into a separate table. Also, the ITEMS column holds a set
of values rather than a single value. Each value in the group gets its own
column, and each item gets its own row, numbered within its order by SNO. To
satisfy the rules of 1NF, the ORDERS table is divided into two tables
(* marks the primary key):

1. CUSTOMERS (CNO*, CNAME, CADD)
2. ORDERS (SNO, ORDNO, ORDDATE, CNO, INO, IDESC, QTY, PRICE) (*SNO,ORDNO)

Customers:

CNO  CNAME  CADD
10   ANIL   Hyderabad
20   JAGAN  Secunderabad

ORDERS:

SNO  ORDNO  ORDDATE  CNO  INO  IDESC  QTY  PRICE
1    100    1/1/99   10   I1   Item1  25   100
2    100    1/1/99   10   I2   Item2  20   120
1    101    1/2/99   20   I3   Item3  20   80
2    101    1/2/99   20   I2   Item2  10   60
1    102    1/2/99   10   I3   Item3  25   100
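
The 1NF step can be sketched in Python. This is only an illustration: the function name `to_1nf` and the assumption that the ITEMS column is a string with ";" between items and "," between fields are ours, chosen to match how the unnormalized table is printed in these notes.

```python
# Unnormalized rows: the last field packs a repeating group of items.
unf_orders = [
    (100, "1/1/99", 10, "ANIL", "Hyderabad", "I1,Item1,25,100;I2,Item2,20,120"),
    (101, "1/2/99", 20, "JAGAN", "Secunderabad", "I3,Item3,20,80;I2,Item2,10,60"),
    (102, "1/2/99", 10, "ANIL", "Hyderabad", "I3,Item3,25,100"),
]


def to_1nf(unf_rows):
    """Split customer data out and give every item its own row."""
    customers = {}
    orders = []
    for ordno, orddate, cno, cname, cadd, items in unf_rows:
        # One customer row per CNO, however many orders the customer has.
        customers[cno] = (cno, cname, cadd)
        # SNO numbers the items within one order, starting at 1.
        for sno, group in enumerate(items.split(";"), start=1):
            ino, idesc, qty, price = group.split(",")
            orders.append((sno, ordno, orddate, cno, ino, idesc, int(qty), int(price)))
    return sorted(customers.values()), orders


customers, orders = to_1nf(unf_orders)
for row in orders:
    print(row)
```

Every cell now holds a single value, and the pair (SNO, ORDNO) identifies each row of ORDERS.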

Second Normal Form (2NF):

1. All the tables must be in 1NF.
2. Every non-key column (any column that is not part of the primary key) must
depend on the whole primary key, not on just a part of it.

Example: In the ORDERS table above, the ORDDATE and CNO columns depend only
on ORDNO, which is just one part of the primary key (SNO, ORDNO). So we move
this information into another table as follows:
1. CUSTOMERS (CNO*, CNAME, CADD)
2. ORDERS (SNO, ORDNO, INO, IDESC, QTY, PRICE) (*SNO,ORDNO)
3. ORDER DETAILS (ORDNO*, ORDDATE, CNO)

Customers:
CNO  CNAME  CADD
10   ANIL   Hyderabad
20   JAGAN  Secunderabad

Orders:
SNO  ORDNO  INO  IDESC  QTY  PRICE
1    100    I1   Item1  25   100
2    100    I2   Item2  20   120
1    101    I3   Item3  20   80
2    101    I2   Item2  10   60
1    102    I3   Item3  25   100

Order Details:
ORDNO  ORDDATE  CNO
100    1/1/99   10
101    1/2/99   20
102    1/2/99   10
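
The 2NF split is a pair of projections of the 1NF ORDERS table. The sketch below is illustrative only; the helper `project` and the variable names are assumptions, and the rows are the five 1NF order rows from these notes.

```python
COLS = ("SNO", "ORDNO", "ORDDATE", "CNO", "INO", "IDESC", "QTY", "PRICE")
orders_1nf = [
    (1, 100, "1/1/99", 10, "I1", "Item1", 25, 100),
    (2, 100, "1/1/99", 10, "I2", "Item2", 20, 120),
    (1, 101, "1/2/99", 20, "I3", "Item3", 20, 80),
    (2, 101, "1/2/99", 20, "I2", "Item2", 10, 60),
    (1, 102, "1/2/99", 10, "I3", "Item3", 25, 100),
]


def project(rows, keep):
    """Keep only the named columns and drop duplicate rows (order preserved)."""
    idx = [COLS.index(c) for c in keep]
    projected = [tuple(row[i] for i in idx) for row in rows]
    return list(dict.fromkeys(projected))


# Columns that depend on ORDNO alone move to their own table.
order_details = project(orders_1nf, ("ORDNO", "ORDDATE", "CNO"))
# What remains depends on the whole key (SNO, ORDNO).
orders_2nf = project(orders_1nf, ("SNO", "ORDNO", "INO", "IDESC", "QTY", "PRICE"))

for row in order_details:
    print(row)
```

Because ORDDATE and CNO are the same on every row of one order, the projection collapses to one row per ORDNO, which removes the repeated date and customer number.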

Third Normal Form (3NF):

1. All the tables must be in 2NF.
2. Every non-key column must depend ONLY on the primary key. It must not
depend on other non-key columns (i.e., there should be no TRANSITIVE
dependency).

Example:
In the ORDERS table above, the IDESC column depends on the primary key only
through the INO column, which is itself a non-key column. This is a transitive
dependency. To satisfy 3NF, this information is separated into another table.
1. CUSTOMERS (CNO*, CNAME, CADD)
2. ORDERS (SNO, ORDNO, INO, QTY, PRICE) (*SNO,ORDNO)
3. ORDER DETAILS (ORDNO*, ORDDATE, CNO)
4. ITEMS (INO*, IDESC)

Customers:
CNO  CNAME  CADD
10   ANIL   Hyderabad
20   JAGAN  Secunderabad

Orders:
SNO  ORDNO  INO  QTY  PRICE
1    100    I1   25   100
2    100    I2   20   120
1    101    I3   20   80
2    101    I2   10   60
1    102    I3   25   100

Order Details:
ORDNO  ORDDATE  CNO
100    1/1/99   10
101    1/2/99   20
102    1/2/99   10

Items:
INO  IDESC
I1   Item1
I2   Item2
I3   Item3
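
As a rough check that nothing was lost, the four 3NF tables can be loaded into an in-memory SQLite database and joined back together. This is a sketch under assumptions: the lowercase table and column names (including `order_details` for ORDER DETAILS) and the column types are ours, and the customer numbers are the ones that appear in the order rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The REFERENCES clauses document the foreign keys; SQLite enforces them
# only when PRAGMA foreign_keys is switched on.
conn.executescript("""
CREATE TABLE customers (cno INTEGER PRIMARY KEY, cname TEXT, cadd TEXT);
CREATE TABLE items (ino TEXT PRIMARY KEY, idesc TEXT);
CREATE TABLE order_details (
    ordno INTEGER PRIMARY KEY,
    orddate TEXT,
    cno INTEGER REFERENCES customers(cno)
);
CREATE TABLE orders (
    sno INTEGER,
    ordno INTEGER REFERENCES order_details(ordno),
    ino TEXT REFERENCES items(ino),
    qty INTEGER,
    price INTEGER,
    PRIMARY KEY (sno, ordno)
);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)",
                 [(10, "ANIL", "Hyderabad"), (20, "JAGAN", "Secunderabad")])
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [("I1", "Item1"), ("I2", "Item2"), ("I3", "Item3")])
conn.executemany("INSERT INTO order_details VALUES (?, ?, ?)",
                 [(100, "1/1/99", 10), (101, "1/2/99", 20), (102, "1/2/99", 10)])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
                 [(1, 100, "I1", 25, 100), (2, 100, "I2", 20, 120),
                  (1, 101, "I3", 20, 80), (2, 101, "I2", 10, 60),
                  (1, 102, "I3", 25, 100)])

# Joining on the keys rebuilds the wide 1NF ORDERS rows.
rebuilt = conn.execute("""
    SELECT o.sno, o.ordno, d.orddate, d.cno, o.ino, i.idesc, o.qty, o.price
    FROM orders o
    JOIN order_details d ON d.ordno = o.ordno
    JOIN items i ON i.ino = o.ino
    ORDER BY o.ordno, o.sno
""").fetchall()
for row in rebuilt:
    print(row)
```

The join returns the same five rows as the 1NF ORDERS table, so the decomposition stores each fact once without losing information.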

Therefore, at the end of 3NF the database is normalized far enough for most
practical designs. The remaining normal forms deal with rarer cases.


Boyce-Codd normal form (BCNF)

A relation is in BCNF if, and only if, every determinant is a candidate key.

The difference between 3NF and BCNF is that for a functional dependency A → B,
3NF allows this dependency in a relation if B is a primary-key attribute and A is
not a candidate key, whereas BCNF insists that for this dependency to remain in
a relation, A must be a candidate key.
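
A BCNF check can be sketched with the attribute-closure technique. The function names `closure` and `bcnf_violations` are assumptions, and the dependencies listed are our reading of the ORDERS example in these notes. The sketch tests whether each determinant is a superkey (determines every attribute), which is the usual operational form of the rule.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(relation, fds):
    """Nontrivial dependencies whose determinant does not determine every attribute."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not set(relation) <= closure(lhs, fds)
    ]


# ORDERS before the 3NF step: INO -> IDESC has a determinant that is not a key.
orders_before = ("SNO", "ORDNO", "INO", "IDESC", "QTY", "PRICE")
fds_before = [
    (("SNO", "ORDNO"), ("INO", "IDESC", "QTY", "PRICE")),
    (("INO",), ("IDESC",)),
]
# ORDERS after IDESC has moved to ITEMS.
orders_after = ("SNO", "ORDNO", "INO", "QTY", "PRICE")
fds_after = [(("SNO", "ORDNO"), ("INO", "QTY", "PRICE"))]

print(bcnf_violations(orders_before, fds_before))
print(bcnf_violations(orders_after, fds_after))
```

The first call reports the dependency INO -> IDESC and the second reports nothing, so in this small example the table that satisfies 3NF also satisfies BCNF.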

Fourth normal form (4NF)

A relation that is in Boyce-Codd normal form and contains no nontrivial
multi-valued dependencies.

Multi-valued dependency (MVD)
It represents a dependency between attributes (for example, A, B and C)
in a relation, such that for each value of A there is a set of values for B and a
set of values for C. However, the sets of values for B and C are independent of
each other.
A multi-valued dependency can be further defined as being trivial or nontrivial.
An MVD A →→ B in relation R is defined as being trivial if
• B is a subset of A
or
• A ∪ B = R
An MVD is defined as being nontrivial if neither of the above two
conditions is satisfied.
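
The independence of the B and C value sets can be tested directly. In the sketch below, the function name `mvd_holds` and the COURSE / TEACHER / BOOK sample data are hypothetical and do not come from these notes. For each value of A, the stored (B, C) pairs must be every combination of the B values with the C values.

```python
from itertools import product


def mvd_holds(rows, a, b, c):
    """Check the MVD a ->> b, where c names the remaining attributes."""
    groups = {}
    for r in rows:
        key = tuple(r[x] for x in a)
        pair = (tuple(r[x] for x in b), tuple(r[x] for x in c))
        groups.setdefault(key, set()).add(pair)
    for pairs in groups.values():
        b_values = {p[0] for p in pairs}
        c_values = {p[1] for p in pairs}
        # Independence means every B value appears with every C value.
        if pairs != set(product(b_values, c_values)):
            return False
    return True


# Hypothetical data: teachers and books of a course are independent facts.
full = [
    {"COURSE": "DB", "TEACHER": "Rao", "BOOK": "Book1"},
    {"COURSE": "DB", "TEACHER": "Rao", "BOOK": "Book2"},
    {"COURSE": "DB", "TEACHER": "Devi", "BOOK": "Book1"},
    {"COURSE": "DB", "TEACHER": "Devi", "BOOK": "Book2"},
]
partial = full[:3]  # the (Devi, Book2) combination is missing

print(mvd_holds(full, ["COURSE"], ["TEACHER"], ["BOOK"]))
print(mvd_holds(partial, ["COURSE"], ["TEACHER"], ["BOOK"]))
```

A table like `full` repeats every teacher for every book. Splitting it into (COURSE, TEACHER) and (COURSE, BOOK) removes that redundancy, which is the kind of decomposition 4NF calls for.
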
Fifth normal form (5NF)
A relation that has no join dependency, other than those implied by its
candidate keys.

Lossless-join dependency
A property of decomposition, which ensures that no spurious tuples are generated
when relations are reunited through a natural join operation.

Join dependency
Describes a type of dependency. For example, for a relation R with subsets of
the attributes of R denoted as A, B, …, Z, a relation R satisfies a join
dependency if, and only if, every legal value of R is equal to the join of its
projections on A, B, …, Z.
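
The definition can be tried out on stored rows: project the relation onto each attribute subset, join the projections, and compare with the original. The sketch below is illustrative; the helper names are assumptions, and the sample rows are a cut-down version of the order data in these notes. As with any check on sample rows, it can show that a decomposition produces spurious tuples, but not that it is safe for every legal value of R.

```python
def project(rows, attrs):
    """Projection with duplicates removed; a tuple is a frozenset of (attr, value)."""
    return {frozenset((a, r[a]) for a in attrs) for r in rows}


def natural_join(left, right):
    """Combine tuples that agree on every attribute they share."""
    out = set()
    for l in left:
        for r in right:
            merged = dict(l)
            if all(merged.get(a, v) == v for a, v in r):
                merged.update(r)
                out.add(frozenset(merged.items()))
    return out


def satisfies_jd(rows, components):
    """True if joining the projections gives back exactly the original rows."""
    relation = {frozenset(r.items()) for r in rows}
    joined = project(rows, components[0])
    for attrs in components[1:]:
        joined = natural_join(joined, project(rows, attrs))
    return joined == relation


rows = [
    {"ORDNO": 100, "INO": "I1", "IDESC": "Item1", "QTY": 25},
    {"ORDNO": 100, "INO": "I2", "IDESC": "Item2", "QTY": 20},
    {"ORDNO": 101, "INO": "I3", "IDESC": "Item3", "QTY": 20},
    {"ORDNO": 101, "INO": "I2", "IDESC": "Item2", "QTY": 10},
    {"ORDNO": 102, "INO": "I3", "IDESC": "Item3", "QTY": 25},
]

# Splitting off (INO, IDESC) is lossless because INO determines IDESC.
print(satisfies_jd(rows, [("ORDNO", "INO", "QTY"), ("INO", "IDESC")]))
# Moving QTY next to INO is not: item I2 was ordered in two quantities.
print(satisfies_jd(rows, [("ORDNO", "INO"), ("INO", "IDESC", "QTY")]))
```

In the second decomposition the join pairs order 100 with quantity 10 of item I2, a row that was never stored. That is exactly the kind of spurious tuple a lossless-join decomposition rules out.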
