You are on page 1of 6

Normalization

Normalization = transformation of complex data stores into smaller, simpler, stable data
structures.

Purpose of Normalization:

(1) to make each attribute have equal importance (which allows indexing on any attribute), and

(2) to remove redundancy in the data (which facilitates maintenance of consistency.


Three Steps of Normalization


Step 1: Flatten the relation, i.e., get rid of composite attributes and multi-valued attributes -->
1NF

Step 2: Remove partial key dependencies --> 2NF

Step 3: Remove nonkey dependencies --> 3NF

Step 1: Flatten the relation --> 1NF


For example, the user view might be a table called SALES-REPORT, of the form:

(Salesperson-No, Salesperson-Name, Sales-Area, Customer-record*)

where Customer-record* is a multivalued composite attribute:

(Customer-No, Customer-Name, Warehouse-No, Warehouse-location, Sales)*

A couple of rows of the table might look like:


Sp-No Sp-Name Area C-No C-Name W-No Ware-loc Sales
--------------------------------------------------------
3462 Waters West (18765 D Services 4 Fargo 13,540)
(18830 M. Levy 3 Bisma 10,600)
1234 Johnson East (45455 Compuhou 7 James 9,444)
(54512 Building 7 James 7,545)
(12132 Hydraler 9 Kitik 15,444)


To convert to first-normal form is easy, simply combine each row in the non-simple column with
the items in the other columns to make a row in the 1NF table:


Sp-No Sp-Name Area C-No C-Name W-No Ware-loc Sales
3462 Waters West 18765 D Services 4 Fargo 13,540
3462 Waters West 18830 M. Levy 3 Bisma 10,600
1234 Johnson East 45455 Compuhou 7 James 9,444
1234 Johnson East 54512 Building 7 James 7,545
1234 Johnson East 12132 Hydraler 9 Kitik 15,444


Now each attribute has equal importance and may be indexed for quick access.


Step 2: Remove partial key dependencies --> 2NF


What is the key of the previous table?

--------------- ---------------

Since it is not a single attribute, it is called a compound key.

Do you see any attributes that depend on only part of the key?

Functional dependencies can be written in the form:

attribute1 --> attribute2

"attribute 1 determines attribute 2"

-------------------- --> ----------------------
-------------------- --> ----------------------
-------------------- --> ----------------------
-------------------- --> ----------------------
-------------------- --> ----------------------
-------------------- --> ----------------------

The key of the previous table:

Salesperson-No, Customer-No

Salesperson-No --> Salesperson-Name

Salesperson-No --> Sales-Area

So split the SALES-REPORT table into two tables

SALESPERSON (Salesperson-No, Salesperson-name, Sales-area), and

SALESPERSON-CUSTOMER (Salesperson-No, Customer-No, Customer-name,
Warehouse-No, Warehouse-loc, Sales-amount)

SALESPERSON

Salesperson-No Salesperson-name West Sales-area

3462 Waters West
1234 Johnson East


Is this table in 2NF...how do you know?

SALESPERSON-CUSTOMER table

------------------------------------------------




What is the key?
Is this table in 2NF?

Salesperson-No --> --------------
Customer-No --> ------------------
Customer-No --> ------------------
Customer-No --> ------------------
Customer-No --> ------------------
Salesperson-No, Customer-No --> --------------


So split the Salesperson-Customer table into two:

SALES
(Salesperson-No, Customer-No, Sales-amount), and
Sp-No C-No C-Name W-No Warehouse-loc Sales

3462 18765 Delta Services 4 Fargo 13,540
3462 18830 M. Levy Sons 3 Bismarck 10,600
1234 45455 Compuhouse 7 Jamestown 9,444
1234 54512 Building Sup. 7 Jamestown 7,545
1234 12132 Hydraler Co. 9 Kitikack 15,444


CUSTOMER-WAREHOUSE
(Customer-No, Customer-name, Warehouse-No, Warehouse)

SALES










Is this table in 2NF?


Step 3: Remove non-key dependencies --> 3NF

CUSTOMER-WAREHOUSE table
----------------------------------


Is this table in 2NF?

What is the key?
What are some of the dependencies?

Customer-No --> --------------
Customer-No --> --------------
Customer-No --> --------------
Customer-Name --> ------------
Warehouse-No --> ------------------
Warehouse-loc --> ------------------

Salesperson-No Customer-No Sales-amount

3462 18765 13540
3462 18380 10600
3462 19242 9999
1234 45455 9444
1234 54512 7545
1234 12132 15444
C-No C-Name W-No
Warehouse-loc.

18765 Delta Services 4 Fargo
18830 M. Levy Sons 3 Bismarck
45455 Compuhouse 7 Jamestown
54512 Building Sup. 7 Jamestown
12132 Hydraler Co. 9 Kitikack
98765 Seward System 9 Kitikack
So split the CUSTOMER-WAREHOUSE table into two:

CUSTOMER (Customer-No, Customer-Name, Warehouse-No), and

WAREHOUSE (Warehouse-No, Warehouse-location)

CUSTOMER table




Is this table in 3NF?

WAREHOUSE table


Is this table in 3NF?

W-No Warehouse-loc
4 Fargo
3 Bismarck
7 Jamestown
9 Kitikack


So, the intial unnormalized relation SALES-REPORT produced four 3NF relations:

SALES-REPORT

(Salesperson-No, Salesperson-Name, Sales-Area,
(Customer-No, Customer-Name, Warehouse-No, Warehouse-location, Sales)*)

SALESPERSON (Salesperson-No, Salesperson-name, Sales-area)

SALES (Salesperson-number, Customer-number, Sales-amount)

CUSTOMER (Customer-number, Customer-name, Warehouse-number)

and WAREHOUSE (Warehouse-No, Warehouse-location)

C-No C-Name W-No
18765 Delta Services 4
18830 M. Levy Sons 3
45455 Compuhouse 7
54512 Building Sup. 7 7
12132 Hydraler Co. 9
98765 Seward System 9
Foreign key = a nonkey attribute in one relation which is a key attribute in another relation
What is the foreign key in the CUSTOMER relation?Underline it with a dotted line.