
Database normalization is the process of removing redundant data from your tables to improve storage efficiency, data integrity, and scalability. It generally involves splitting existing tables into multiple tables with defined relationships between them. The normalization process has two goals: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table). Both are worthy goals, as they reduce the amount of space a database consumes and ensure that data is logically stored.

Edgar F. Codd first proposed the process of normalization, and what came to be known as the 1st normal form, in his paper "A Relational Model of Data for Large Shared Data Banks". Codd stated: "There is, in fact, a very simple elimination procedure which we shall call normalization. Through decomposition nonsimple domains are replaced by domains whose elements are atomic (nondecomposable) values."

Simplifies data maintenance.

Allows retrieval of data at optimal speed.

Reduces storage space by removing the redundant data.

Restructuring of database is reduced.

Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF. There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications. Most tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form). The normal forms (abbrev. NF) of relational database theory provide criteria for determining a table's degree of vulnerability to logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is to inconsistencies and anomalies.

Table 1

Title                      Author1               Author 2        ISBN        Subject           Pages  Publisher
Database System Concepts   Abraham Silberschatz  Henry F. Korth  0072958863  MySQL, Computers  1168   McGraw-Hill
Operating System Concepts  Abraham Silberschatz  Henry F. Korth  0471694665  Computers         944    McGraw-Hill

First, this table is not very efficient with storage.

Second, this design does not protect data integrity.

Third, this table does not scale well.

A table in which all attributes are atomic and no repeating groups are present is said to be in First Normal Form (1NF). Our Table 1 has two violations of First Normal Form: first, it has more than one author field; second, its subject field contains more than one piece of information. With more than one value in a single field, it would be very difficult to search for all books on a given subject.
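The search problem can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name book_flat and the two-column layout are assumptions made for the example, not part of the original design.

```python
import sqlite3

# Hypothetical cut-down version of Table 1: one row per book,
# with the multi-valued Subject field kept as a single string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book_flat (title TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO book_flat VALUES (?, ?)",
    [
        ("Database System Concepts", "MySQL, Computers"),
        ("Operating System Concepts", "Computers"),
    ],
)

# An exact match misses the first book, because its field holds two values.
exact = conn.execute(
    "SELECT title FROM book_flat WHERE subject = 'Computers' ORDER BY title"
).fetchall()

# A substring match finds both books, but it would also match any other
# subject whose text happens to contain the word "Computers".
fuzzy = conn.execute(
    "SELECT title FROM book_flat WHERE subject LIKE '%Computers%' ORDER BY title"
).fetchall()

print(exact)
print(fuzzy)
```

The exact comparison returns only one of the two books on Computers, which is the difficulty the paragraph above describes.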

Table 2

Title                      Author                ISBN        Subject    Pages  Publisher
Database System Concepts   Abraham Silberschatz  0072958863  MySQL      1168   McGraw-Hill
Database System Concepts   Henry F. Korth        0072958863  Computers  1168   McGraw-Hill
Operating System Concepts  Henry F. Korth        0471694665  Computers  944    McGraw-Hill
Operating System Concepts  Abraham Silberschatz  0471694665  Computers  944    McGraw-Hill

In Table 2 each row holds a single author and a single subject, so we now have two rows for a single book. Additionally, we would be violating the Second Normal Form. A better solution to our problem would be to separate the data into separate tables, an Author table and a Subject table, to store our information, removing that information from the Book table:

Subject Table

Subject_ID  Subject
1           MySQL
2           Computers

Author Table

Author_ID  Last Name     First Name
1          Silberschatz  Abraham
2          Korth         Henry

Book Table

ISBN        Title                      Pages  Publisher
0072958863  Database System Concepts   1168   McGraw-Hill
0471694665  Operating System Concepts  944    McGraw-Hill

Each table has a primary key, used for joining tables together when querying the data. A primary key value must be unique within the table (no two books can have the same ISBN number), and a primary key is also an index, which speeds up data retrieval based on the primary key. Now to define relationships between the tables:

Book_Author Table

ISBN        Author_ID
0072958863  1
0072958863  2
0471694665  1
0471694665  2

Book_Subject Table

ISBN        Subject_ID
0072958863  1
0072958863  2
0471694665  2
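As a rough sketch, the five tables above could be created and queried with Python's built-in sqlite3 module as follows. The lowercase table and column names and the helper titles_for_subject are assumptions made for this example. ISBN is stored as text so the leading zero is kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE author (author_id INTEGER PRIMARY KEY, last_name TEXT, first_name TEXT);
CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT, pages INTEGER, publisher TEXT);

-- Relationship tables: one row per (book, author) and (book, subject) pair.
CREATE TABLE book_author (
    isbn TEXT REFERENCES book(isbn),
    author_id INTEGER REFERENCES author(author_id),
    PRIMARY KEY (isbn, author_id)
);
CREATE TABLE book_subject (
    isbn TEXT REFERENCES book(isbn),
    subject_id INTEGER REFERENCES subject(subject_id),
    PRIMARY KEY (isbn, subject_id)
);

INSERT INTO subject VALUES (1, 'MySQL'), (2, 'Computers');
INSERT INTO author VALUES (1, 'Silberschatz', 'Abraham'), (2, 'Korth', 'Henry');
INSERT INTO book VALUES
    ('0072958863', 'Database System Concepts', 1168, 'McGraw-Hill'),
    ('0471694665', 'Operating System Concepts', 944, 'McGraw-Hill');
INSERT INTO book_author VALUES
    ('0072958863', 1), ('0072958863', 2), ('0471694665', 1), ('0471694665', 2);
INSERT INTO book_subject VALUES
    ('0072958863', 1), ('0072958863', 2), ('0471694665', 2);
""")


def titles_for_subject(name):
    """Return the titles of all books filed under the given subject."""
    rows = conn.execute(
        """
        SELECT b.title
        FROM book AS b
        JOIN book_subject AS bs ON bs.isbn = b.isbn
        JOIN subject AS s ON s.subject_id = bs.subject_id
        WHERE s.subject = ?
        ORDER BY b.title
        """,
        (name,),
    ).fetchall()
    return [title for (title,) in rows]


print(titles_for_subject("MySQL"))
print(titles_for_subject("Computers"))
```

Searching by subject is now an exact comparison on one value, joined through the primary keys. Note that SQLite only enforces the REFERENCES clauses when PRAGMA foreign_keys is switched on; here they mainly document the relationships.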

A table that is in first normal form and has no partial dependencies on the primary key fields is said to be in Second Normal Form (2NF).

All non-key fields should individually be completely dependent on the primary key.

Where the First Normal Form deals with redundancy of data across a horizontal row, Second Normal Form (or 2NF) deals with redundancy of data in vertical columns. The Book Table will be used for the 2NF example:

Publisher Table

Publisher_ID  Publisher Name
1             McGraw-Hill

Book Table

ISBN        Title                      Pages  Publisher_ID
0072958863  Database System Concepts   1168   1
0471694665  Operating System Concepts  944    1
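A minimal sketch of this step, again using Python's sqlite3 module, shows the practical benefit of moving the publisher name into its own table. The lowercase names are assumptions for the example, and the new publisher name in the UPDATE is purely illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publisher (publisher_id INTEGER PRIMARY KEY, publisher_name TEXT);
CREATE TABLE book (
    isbn TEXT PRIMARY KEY,
    title TEXT,
    pages INTEGER,
    publisher_id INTEGER REFERENCES publisher(publisher_id)
);
INSERT INTO publisher VALUES (1, 'McGraw-Hill');
INSERT INTO book VALUES
    ('0072958863', 'Database System Concepts', 1168, 1),
    ('0471694665', 'Operating System Concepts', 944, 1);
""")

# The publisher name is stored once, so renaming it changes a single row.
# 'McGraw-Hill Education' is an illustrative value, not taken from the tables above.
cur = conn.execute(
    "UPDATE publisher SET publisher_name = 'McGraw-Hill Education' "
    "WHERE publisher_id = 1"
)
rows_changed = cur.rowcount

# Every book picks up the new name through the join.
names = [
    name
    for (name,) in conn.execute(
        """
        SELECT p.publisher_name
        FROM book AS b
        JOIN publisher AS p ON p.publisher_id = b.publisher_id
        ORDER BY b.isbn
        """
    )
]

print(rows_changed, names)
```

With the publisher name repeated in every Book row, the same change would have to be applied to each row, and a missed row would leave the data inconsistent.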

Third normal form (3NF) requires that there are no functional dependencies of non-key attributes on something other than a candidate key. E.g., in a table with Qty, Price, and Total, the Total depends on Qty and Price.

A table is in 3NF if all of the non-primary key attributes are mutually independent.

There should not be transitive dependencies: A → B, B → C, so A → C. Here A is the primary key.
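One common way to handle the Qty, Price, Total case is to store only Qty and Price and derive Total when it is needed. The sketch below does this with a view in sqlite3; the table order_line, the view order_line_total, and the sample values are all assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Only the independent attributes are stored.
CREATE TABLE order_line (line_id INTEGER PRIMARY KEY, qty INTEGER, price REAL);

-- Total depends on qty and price, so it is computed instead of stored.
CREATE VIEW order_line_total AS
    SELECT line_id, qty, price, qty * price AS total
    FROM order_line;

INSERT INTO order_line VALUES (1, 3, 2.5), (2, 10, 1.0);
""")

# Changing qty cannot leave a stale total behind, because no total is stored.
conn.execute("UPDATE order_line SET qty = 4 WHERE line_id = 1")

totals = dict(conn.execute("SELECT line_id, total FROM order_line_total"))
print(totals)
```

If Total were a stored column, every update to Qty or Price would also have to update Total, which is the kind of update anomaly 3NF is meant to prevent.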

If, after normalizing a table, queries become too complicated, then we may need to omit two or three steps of normalization. This process is known as denormalization.
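As a hedged illustration of denormalization, the sketch below copies the result of a join into a single flat table so that later reads need no join. It reuses the publisher example; the table name book_report is an assumption, and this is only one of several ways to denormalize.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publisher (publisher_id INTEGER PRIMARY KEY, publisher_name TEXT);
CREATE TABLE book (
    isbn TEXT PRIMARY KEY,
    title TEXT,
    publisher_id INTEGER REFERENCES publisher(publisher_id)
);
INSERT INTO publisher VALUES (1, 'McGraw-Hill');
INSERT INTO book VALUES
    ('0072958863', 'Database System Concepts', 1),
    ('0471694665', 'Operating System Concepts', 1);

-- Denormalized copy: the publisher name is stored again on every book row.
CREATE TABLE book_report AS
    SELECT b.isbn, b.title, p.publisher_name
    FROM book AS b
    JOIN publisher AS p ON p.publisher_id = b.publisher_id;
""")

# Reads are now a simple single-table query.
report = conn.execute(
    "SELECT title, publisher_name FROM book_report ORDER BY isbn"
).fetchall()
print(report)
```

The simpler query comes at the cost of the redundancy that normalization removed, so the copy has to be refreshed whenever the underlying tables change.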

Ex-1: Name, Qty, Price, Total, Phone, city1, city2.

Ex-2:

MID  MEMBER  DATABASE              COM-CODE  COMPANY
1    abc     Oracle/ Access        Tcs       Tata Consultancy Services
     def     MySQL/ MS SQLSERVER   Rb        Rishabh
