
DATABASE & DATABASE DESIGN

What is Normalization?

Normalization is the process of efficiently organizing data
in a database. The normalization process has two goals:
eliminating redundant data and ensuring that the database
design does not suffer from update, delete, and insert
anomalies. Both are worthy goals: they reduce the amount of
space a database consumes and ensure that data is stored
logically.

Example of Anomalies:

Consider the following database design:


Student

Enroll No   Name      Section   Mailing Address                Club Membership
06BS1256    Shipra    A         CC-89, Xmas Street, Gurgaon    Finance
06BS1909    Krishna   K         A-4555, Christ Rd, Chennai     IT
06BS1256    Shipra    A         CC-89, Xmas Street, Gurgaon    Marketing
06BS1909    Krishna   K         A-4555, Christ Rd, Chennai     HR
06BS1890    Gokul     J         ABC, Saket, N Delhi            Finance

In the Student table the primary key is the combination of
Enroll No and Club Membership. A student may join several
clubs, and, as the table shows, each additional membership
repeats the student's details such as name, section, and
mailing address (see rows 1 and 3). This causes not only
redundancy but also possible data inconsistency, since any
change has to be made in multiple places. If Shipra's
section changes from Section A to Section C, the change has
to be made in both rows 1 and 3. If row 3 is not updated,
the data becomes inconsistent. This is an Update Anomaly.
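The update anomaly can be reproduced with a small sketch using Python's built-in sqlite3 module. The column names (enroll_no, club, and so on) are illustrative choices, not part of the original table, and only Shipra's two rows are loaded.

```python
import sqlite3

# In-memory copy of the unnormalized Student table (illustrative column names).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        enroll_no TEXT NOT NULL,
        name      TEXT,
        section   TEXT,
        address   TEXT,
        club      TEXT NOT NULL,
        PRIMARY KEY (enroll_no, club)
    )
""")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    [
        ("06BS1256", "Shipra", "A", "CC-89, Xmas Street, Gurgaon", "Finance"),
        ("06BS1256", "Shipra", "A", "CC-89, Xmas Street, Gurgaon", "Marketing"),
    ],
)

# Update only one of Shipra's two rows, as a careless application might.
conn.execute(
    "UPDATE student SET section = 'C' "
    "WHERE enroll_no = '06BS1256' AND club = 'Finance'"
)

# The same student is now recorded with two different sections.
sections = sorted(
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT section FROM student WHERE enroll_no = '06BS1256'"
    )
)
print(sections)  # ['A', 'C']
```

The database accepts the partial update without complaint, so consistency depends entirely on every application remembering to touch every copy.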

Insert Anomaly:
Consider a new student who joins IBS Gurgaon in Section D
but does not have any club memberships. Since Club
Membership is part of the primary key, and a primary key
cannot be left blank or hold a NULL value, we cannot insert
the details of the new student until he becomes a member of
at least one club. This is referred to as an Insert Anomaly.
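A minimal sketch of the insert anomaly, again with sqlite3 and illustrative column names. The enrollment number and name of the new student are made up for the example. NOT NULL is written out explicitly because SQLite, for historical reasons, does not reject NULLs in primary key columns of ordinary tables on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        enroll_no TEXT NOT NULL,
        name      TEXT,
        section   TEXT,
        address   TEXT,
        club      TEXT NOT NULL,  -- part of the key, so it may not be NULL
        PRIMARY KEY (enroll_no, club)
    )
""")

# Hypothetical new student in Section D with no club membership yet.
new_student = ("06BS2001", "New Student", "D", None, None)

try:
    conn.execute("INSERT INTO student VALUES (?, ?, ?, ?, ?)", new_student)
    insert_rejected = False
except sqlite3.IntegrityError:
    # The NOT NULL constraint on club blocks the row.
    insert_rejected = True

print(insert_rejected)  # True
```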

Delete Anomaly:
Consider a case where Gokul (row 5 of the table) is no
longer a member of the Finance club. We have to delete
Gokul's entire record, because Club Membership is part of
the primary key and cannot be NULL or blank. If we delete
Gokul's record, we lose all information about Gokul's
section, mailing address, etc. This is a Delete Anomaly.
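The delete anomaly can be sketched the same way. Column names are illustrative, and only Gokul's row is loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        enroll_no TEXT NOT NULL,
        name      TEXT,
        section   TEXT,
        address   TEXT,
        club      TEXT NOT NULL,
        PRIMARY KEY (enroll_no, club)
    )
""")
conn.execute(
    "INSERT INTO student VALUES (?, ?, ?, ?, ?)",
    ("06BS1890", "Gokul", "J", "ABC, Saket, N Delhi", "Finance"),
)

# Gokul leaves the Finance club. The membership cannot be blanked out,
# so the only option is to delete the whole row.
conn.execute(
    "DELETE FROM student WHERE enroll_no = '06BS1890' AND club = 'Finance'"
)

# Nothing about Gokul is left: section and address went with the row.
remaining = conn.execute(
    "SELECT COUNT(*) FROM student WHERE enroll_no = '06BS1890'"
).fetchone()[0]
print(remaining)  # 0
```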

To prevent instances like these, the database community
has developed a series of guidelines for ensuring that
databases are normalized. These are referred to as normal
forms and are numbered from one (the lowest form of
normalization, referred to as first normal form or 1NF)
through five (fifth normal form or 5NF). In practical
applications, you'll often see 1NF, 2NF, and 3NF along
with the occasional 4NF. Fifth normal form is very rarely
seen.

It's important to point out that they are guidelines and
guidelines only. Occasionally, it becomes necessary to
stray from them to meet practical business requirements.
However, when variations take place, it's extremely
important to evaluate any possible ramifications they
could have on the system and account for possible
inconsistencies.

First Normal Form (1NF)

First normal form (1NF) sets the very basic rules for an
organized database:
• Remove all multivalued attributes. Each field must hold
a single value; no comma-separated lists are allowed in a
single field of the database.

For example:

Customer

Customer ID   First Name   Surname     Telephone Number
123           Robert       Ingram      555-861-2025
456           Jane         Wright      555-403-1659, 555-776-4100
789           Maria        Fernandez   555-808-9633

This table is not in 1NF, since Telephone Number holds
multiple values in one cell/field.

Remedies:

Customer ID   First Name   Surname     Tel. No. 1     Tel. No. 2     Tel. No. 3
123           Robert       Ingram      555-861-2025
456           Jane         Wright      555-403-1659   555-776-4100
789           Maria        Fernandez   555-808-9633

• Have a separate column for each repetition of the field
Telephone No., as in the table above. But this remedy comes
with its own problems, since the maximum number of
telephone numbers for a customer would be difficult to
ascertain. In this table, for instance, most of the fields
for Tel. No. 2 and Tel. No. 3 are blank, which wastes
space.
• Have separate rows for different telephone numbers. This
contributes to redundancy, since every attribute except
Telephone No. gets repeated in each row.

• The best remedy is to divide this table into two. This
solves both the redundancy problem and the multivalued
attribute problem.

Customer

Customer ID   First Name   Surname
123           Robert       Ingram
456           Jane         Wright
789           Maria        Fernandez

Customer Telephone Number

Customer ID   Telephone Number
123           555-861-2025
456           555-403-1659
456           555-776-4100
789           555-808-9633
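The split into two tables can be sketched in plain Python. The variable names are illustrative, and the tuples simply mirror the Customer rows shown earlier.

```python
# Unnormalized rows: (customer_id, first_name, surname, telephone_numbers)
raw_customers = [
    (123, "Robert", "Ingram", "555-861-2025"),
    (456, "Jane", "Wright", "555-403-1659, 555-776-4100"),
    (789, "Maria", "Fernandez", "555-808-9633"),
]

# Table 1: one row per customer, without the multivalued field.
customers = [(cid, first, last) for cid, first, last, _ in raw_customers]

# Table 2: one row per (customer, telephone number) pair.
customer_phones = [
    (cid, number.strip())
    for cid, _, _, numbers in raw_customers
    for number in numbers.split(",")
]

for row in customer_phones:
    print(row)
```

Each field now holds a single value, and a customer can have any number of telephone numbers without blank columns or repeated names.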

Second Normal Form (2NF)

Second normal form (2NF) further addresses the concept
of removing duplicative data:

• Applicable to tables which have a composite primary key.
• 2NF states that all non-key attributes should be fully
functionally dependent on the key attributes.

In the Student table, 2NF requires that all non-key
attributes (Section, Mailing Address, Name) be fully
functionally dependent on the key attributes (Enroll No,
Club Membership). That means no non-key attribute should
get its value from only a part of the primary key. In the
Student table, all three non-key attributes depend on
Enroll No alone and not on Club Membership. For the table
to be in 2NF, they must all derive their values from the
combination of both Enroll No and Club Membership.
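The partial dependency can be checked against the sample rows with a short sketch. The helper determines and the column names are hypothetical, written for this example. Note that sample data can only refute a functional dependency, never prove it in general.

```python
def determines(rows, lhs, rhs):
    """True if rows that agree on the `lhs` columns always agree on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # setdefault stores the first value seen for a key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ("enroll_no", "name", "section", "address", "club")
rows = [
    dict(zip(columns, r))
    for r in [
        ("06BS1256", "Shipra", "A", "CC-89, Xmas Street, Gurgaon", "Finance"),
        ("06BS1909", "Krishna", "K", "A-4555, Christ Rd, Chennai", "IT"),
        ("06BS1256", "Shipra", "A", "CC-89, Xmas Street, Gurgaon", "Marketing"),
        ("06BS1909", "Krishna", "K", "A-4555, Christ Rd, Chennai", "HR"),
        ("06BS1890", "Gokul", "J", "ABC, Saket, N Delhi", "Finance"),
    ]
]

non_key = ["name", "section", "address"]

# Enroll No alone fixes every non-key attribute: a partial dependency.
by_enroll_no = determines(rows, ["enroll_no"], non_key)
# Club Membership alone does not (Finance has both Shipra and Gokul).
by_club = determines(rows, ["club"], non_key)

print(by_enroll_no, by_club)  # True False
```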

To get this table into 2NF, we divide it further by taking
out the non-key attributes that have a partial functional
dependency on part of the primary key. For example, Name (a
non-key attribute) and Enroll No (the part of the primary
key it depends on) will be placed in another table. Section
and Mailing Address move with them, leaving Enroll No and
Club Membership in the original table.
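One possible 2NF layout, sketched with sqlite3. The table names student and club_membership and the column names are assumptions made for illustration; the source only says that the partially dependent attributes go into another table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Student details depend on enroll_no alone.
    CREATE TABLE student (
        enroll_no TEXT PRIMARY KEY NOT NULL,
        name      TEXT,
        section   TEXT,
        address   TEXT
    );
    -- Memberships keep the composite key and nothing else.
    CREATE TABLE club_membership (
        enroll_no TEXT NOT NULL REFERENCES student(enroll_no),
        club      TEXT NOT NULL,
        PRIMARY KEY (enroll_no, club)
    );
    INSERT INTO student VALUES
        ('06BS1256', 'Shipra', 'A', 'CC-89, Xmas Street, Gurgaon'),
        ('06BS1890', 'Gokul',  'J', 'ABC, Saket, N Delhi');
    INSERT INTO club_membership VALUES
        ('06BS1256', 'Finance'),
        ('06BS1256', 'Marketing'),
        ('06BS1890', 'Finance');
""")

# A section change now touches exactly one row.
rows_updated = conn.execute(
    "UPDATE student SET section = 'C' WHERE enroll_no = '06BS1256'"
).rowcount

# Gokul leaves the Finance club; his student record survives.
conn.execute("DELETE FROM club_membership WHERE enroll_no = '06BS1890'")
gokul = conn.execute(
    "SELECT name, section FROM student WHERE enroll_no = '06BS1890'"
).fetchone()

print(rows_updated, gokul)  # 1 ('Gokul', 'J')
```

With this layout a student with no club memberships is simply a row in student with no matching rows in club_membership, so the insert anomaly disappears as well.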

Third Normal Form (3NF)

Third normal form (3NF) goes one large step further:

• Meet all the requirements of the second normal form.
• Remove columns that are not directly dependent upon the
primary key. There should be no transitive dependency.

For example:

Employee

Employee ID   Dept ID   Dept Name

In this table, Employee ID is the primary key. Employee ID
determines the Dept ID (the department for which the
employee works), and Dept ID determines the Dept Name. Thus
the table has a transitive dependency: Dept Name depends on
Employee ID only through Dept ID. 3NF states that all
non-key attributes should be directly dependent on the
primary key.

To get this table into 3NF, we take Dept ID and Dept Name
out into another table, keeping Dept ID in the Employee
table as a reference to it.
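A minimal sketch of the 3NF layout with sqlite3. The source gives no sample data for the Employee table, so the rows below (employee IDs 1 and 2, department 10) are made up, and the table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dept Name depends directly on the key of its own table.
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT NOT NULL
    );
    -- Employee keeps only the reference to the department.
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        dept_id     INTEGER NOT NULL REFERENCES department(dept_id)
    );
    -- Made-up sample rows for illustration.
    INSERT INTO department VALUES (10, 'Finance');
    INSERT INTO employee VALUES (1, 10), (2, 10);
""")

# Renaming the department is a single-row change.
conn.execute(
    "UPDATE department SET dept_name = 'Corporate Finance' WHERE dept_id = 10"
)

# A join rebuilds the original three-column view on demand.
names = [
    row[0]
    for row in conn.execute(
        "SELECT d.dept_name FROM employee e "
        "JOIN department d ON d.dept_id = e.dept_id "
        "ORDER BY e.employee_id"
    )
]
print(names)  # ['Corporate Finance', 'Corporate Finance']
```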
