
IM211:FUNDAMENTALS OF DATABASE SYSTEMS

Module #6

Name: _______________________________________________________________
Section: ____________ Date: ________________

Lesson title: Database Normalization

Lesson Objectives:
At the end of this module, I should be able to:
1. Explain normalization and its role in the database design process
2. Identify and describe each of the normal forms: 1NF, 2NF, 3NF, BCNF, and 4NF
3. Apply normalization rules to evaluate and correct table structures

Materials: Student Activity Sheet

References:
https://www.guru99.com/relational-data-model-dbms.html

MAIN LESSON

What Is Database Normalization?

Database normalization, or SQL normalization, helps us group related data in a single table. Any attributive or
indirectly related data is placed in separate tables, and these tables are connected through a logical relationship
between parent and child tables.

In 1970, Edgar F. Codd came up with the concept of normalization. He published a paper named “A Relational
Model of Data for Large Shared Data Banks”, in which he proposed what is now known as the “First Normal Form (1NF)”.

 Advantages Of DBMS Normalization:

Database Normalization provides the following basic advantages:

1. Normalization increases data consistency because it avoids duplication by storing each piece of data in
one place only.
2. Normalization groups like or related data under the same schema, which results in better-organized
data.
3. Normalization can make searching faster because smaller, narrower tables can be indexed more
efficiently. Hence, normalized databases or tables are used for OLTP (Online Transaction Processing).

 Disadvantages Of Database Normalization


DBMS Normalization has the following disadvantages:

1. The associated data for, say, a product or an employee is not available in one place, so we have to join
more than one table. This delays data retrieval.
2. Thus, normalization is not a good option for OLAP (Online Analytical Processing) workloads.

Normalization Rule

Normalization rules are divided into the following normal forms:

1. First Normal Form
2. Second Normal Form
3. Third Normal Form
4. BCNF
5. Fourth Normal Form

First Normal Form (1NF)

For a table to be in the First Normal Form, it should satisfy the following 4 rules:
1. It should only have single (atomic) valued attributes/columns.
2. Values stored in a column should be of the same domain.
3. All the columns in a table should have unique names.
4. The order in which data is stored does not matter.
By definition, an entity that does not have any repeating columns or data groups is in the First Normal
Form. In the First Normal Form, every column is unique.
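
To make rule 1 concrete, here is a minimal Python sketch. The rows, the `phones` column, and the function name `to_first_normal_form` are hypothetical and are not part of this module's tables. It only illustrates how a cell holding several values can be split so that every cell holds a single (atomic) value.

```python
# Hypothetical sample rows: the "phones" cell of the first row holds two
# values, which breaks the atomic-value rule of 1NF.
unnormalized = [
    {"empNum": 1001, "empName": "Andrews", "phones": "555-0101, 555-0102"},
    {"empNum": 1002, "empName": "Schwatz", "phones": "555-0199"},
]


def to_first_normal_form(rows):
    """Return one row per (employee, phone) so every cell holds one value."""
    result = []
    for row in rows:
        for phone in row["phones"].split(","):
            result.append(
                {
                    "empNum": row["empNum"],
                    "empName": row["empName"],
                    "phone": phone.strip(),
                }
            )
    return result


rows_1nf = to_first_normal_form(unnormalized)
for row in rows_1nf:
    print(row)
```

After the split, employee 1001 appears in two rows, one per phone number. The repeated empName is the kind of redundancy that the later normal forms remove.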

Following is how our Employees and Department tables would look in the First Normal Form (1NF):

Here, all the columns of both the Employees and Department tables have been clubbed into one table, and there is
no need for connecting columns, like deptNum, as all the data is available in one place.

But a table like this, with all the required columns in it, would be difficult to manage, difficult
to perform operations on, and inefficient from the storage point of view.

Second Normal Form (2NF)

For a table to be in the Second Normal Form,

1. It should be in the First Normal Form.
2. And, it should not have Partial Dependency.

By definition, an entity is in 2NF when it is in 1NF, one of its attributes (or a combination of attributes) is defined
as the primary key, and every remaining attribute depends on the whole primary key rather than on only a part of it.
Following is an example of how the Employees and Departments tables would look:

Here, we can observe that we have split the table in 1NF form into three different tables. The Employees table
is an entity about all the employees of a company, and its attributes describe the properties of each employee.
The primary key for this table is empNum.

Similarly, the Departments table is an entity about all the departments in a company, and its attributes describe
the properties of each department. The primary key for this table is deptNum.

In the third table, we have combined the primary keys of both the tables. The primary keys of the Employees and
Departments tables are referred to as foreign keys in this third table.

If the user wants an output similar to the one we had in 1NF, then the user has to join all the three tables,
using the primary keys.
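
The three-table split and the join described above can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the linking table name EmpDept, the columns empName and deptName, and all the sample rows are hypothetical. Only empNum and deptNum come from the lesson text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(
    """
    CREATE TABLE Employees (
        empNum  INTEGER PRIMARY KEY,
        empName TEXT NOT NULL            -- hypothetical column
    );
    CREATE TABLE Departments (
        deptNum  INTEGER PRIMARY KEY,
        deptName TEXT NOT NULL           -- hypothetical column
    );
    -- Third table: combines the two primary keys as foreign keys.
    CREATE TABLE EmpDept (
        empNum  INTEGER NOT NULL REFERENCES Employees(empNum),
        deptNum INTEGER NOT NULL REFERENCES Departments(deptNum),
        PRIMARY KEY (empNum, deptNum)
    );
    INSERT INTO Employees VALUES (1001, 'Andrews'), (1002, 'Schwatz');
    INSERT INTO Departments VALUES (1, 'Accounts'), (2, 'Technology');
    INSERT INTO EmpDept VALUES (1001, 1), (1001, 2), (1002, 2);
    """
)

# Join all three tables to rebuild the wide, 1NF-style output.
joined = conn.execute(
    """
    SELECT e.empNum, e.empName, d.deptNum, d.deptName
    FROM EmpDept AS ed
    JOIN Employees   AS e ON e.empNum  = ed.empNum
    JOIN Departments AS d ON d.deptNum = ed.deptNum
    ORDER BY e.empNum, d.deptNum
    """
).fetchall()

for row in joined:
    print(row)
```

Each employee name and each department name is now stored once, and the wide view is produced only when a query asks for it.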

Third Normal Form (3NF)

A table is said to be in the Third Normal Form when,

1. It is in the Second Normal Form.
2. And, it doesn't have Transitive Dependency.

By definition, a table is considered to be in the Third Normal Form if the table/entity is already in the Second
Normal Form and the columns of the table/entity are non-transitively dependent on the primary key.
Let’s understand transitive dependency with the help of the following example.

Say a table named, Customer has the below columns:


CustomerID – Primary Key identifying a unique customer
CustomerZIP – ZIP Code of the locality customer resides in
CustomerCity – City the customer resides in

In the above case, the CustomerCity column is dependent on the CustomerZIP column and the
CustomerZIP column is dependent on CustomerID.
The above scenario is called transitive dependency of the CustomerCity column on the CustomerID i.e. the
primary key. After understanding transitive dependency, now let’s discuss the problem with this dependency.

There could be a scenario where an unwanted update changes the CustomerZIP to the ZIP code of a different
city without updating the CustomerCity, thereby leaving the database in an inconsistent state.
To fix this issue, we need to remove the transitive dependency. This can be done by creating another
table, say, a CustZIP table that holds two columns, i.e. CustomerZIP (as Primary Key) and CustomerCity.

The CustomerZIP column in the Customer table is a foreign key to the CustomerZIP in the CustZIP table.
This relationship ensures that there is no anomaly in the updates wherein a CustomerZIP is updated without
making changes to the CustomerCity.
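
The Customer and CustZIP split can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not a definitive design: the column types, the sample ZIP codes and cities, and the helper name `city_of` are assumptions made for illustration. The table and column names come from the example above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE CustZIP (
        CustomerZIP  TEXT PRIMARY KEY,
        CustomerCity TEXT NOT NULL
    );
    CREATE TABLE Customer (
        CustomerID  INTEGER PRIMARY KEY,
        CustomerZIP TEXT NOT NULL REFERENCES CustZIP(CustomerZIP)
    );
    -- Sample data for illustration only.
    INSERT INTO CustZIP VALUES ('10001', 'New York'), ('60601', 'Chicago');
    INSERT INTO Customer VALUES (1, '10001');
    """
)


def city_of(customer_id):
    """Look up the city through the ZIP code instead of storing it twice."""
    row = conn.execute(
        """
        SELECT z.CustomerCity
        FROM Customer AS c
        JOIN CustZIP  AS z ON z.CustomerZIP = c.CustomerZIP
        WHERE c.CustomerID = ?
        """,
        (customer_id,),
    ).fetchone()
    return row[0]


city_before = city_of(1)
# Changing only the ZIP code is now enough; the city follows automatically.
conn.execute("UPDATE Customer SET CustomerZIP = '60601' WHERE CustomerID = 1")
city_after = city_of(1)

# A ZIP code that is missing from CustZIP is rejected by the foreign key.
try:
    conn.execute("UPDATE Customer SET CustomerZIP = '99999' WHERE CustomerID = 1")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(city_before, city_after, rejected)
```

Because CustomerCity is stored only in CustZIP, a customer's ZIP code and city can no longer disagree.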

Fourth Normal Form (4NF)

A table is said to be in the Fourth Normal Form when,

1. It is in the Boyce-Codd Normal Form.
2. And, it doesn't have Multi-Valued Dependency.

Because 4NF builds on the Boyce-Codd Normal Form (BCNF), we first need to understand BCNF. By definition, a table
is in Boyce-Codd Normal Form if it is already in the Third Normal Form and, for every non-trivial functional
dependency from A to B, A is a super key.

This definition sounds a bit complicated. Let’s break it down to understand it better.

• Functional Dependency: The attributes or columns of a table are said to be functionally dependent
when an attribute or column of a table uniquely identifies another attribute(s) or column(s) of the
same table.

For example, the empNum or Employee Number column uniquely identifies the other columns like
Employee Name, Employee Salary, etc. in the Employee table.

• Super Key: A single column or a group of columns that can uniquely identify a single row in a table
is termed a Super Key. A super key made up of more than one column is also known as a Composite Key.
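
Both definitions can be checked against sample data with a few lines of Python. The rows, the column names empName and empSalary, and the helper names `holds_fd` and `is_super_key` are hypothetical. Note that sample rows can only show that a dependency is violated. They cannot prove that it will hold for all future data.

```python
# Hypothetical Employee rows used only for this illustration.
rows = [
    {"empNum": 1001, "empName": "Andrews", "empSalary": 5000},
    {"empNum": 1002, "empName": "Schwatz", "empSalary": 5000},
    {"empNum": 1003, "empName": "Andrews", "empSalary": 6000},
]


def holds_fd(rows, lhs, rhs):
    """True if equal values on `lhs` always come with equal values on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # Remember the first value seen for this key; any later mismatch
        # means the dependency lhs -> rhs is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_super_key(rows, columns):
    """True if no two rows share the same values on `columns`."""
    keys = [tuple(row[c] for c in columns) for row in rows]
    return len(keys) == len(set(keys))


print(holds_fd(rows, ["empNum"], ["empName", "empSalary"]))  # empNum -> others
print(holds_fd(rows, ["empName"], ["empSalary"]))            # two Andrews rows differ
print(is_super_key(rows, ["empNum"]))
print(is_super_key(rows, ["empNum", "empName"]))             # superset of a key
print(is_super_key(rows, ["empSalary"]))                     # 5000 appears twice
```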

Let’s consider the following scenario to understand when there is a problem with the Third Normal Form and how
the Boyce-Codd Normal Form comes to the rescue.

In the above example, employees with empNum 1001 and 1007 work in two different departments. Each
department has a department head, and a department can have more than one head. For example, Raymond and
Samara are the two heads of the Accounts department.

In this case, empNum and deptName together form a super key: based on these two columns, we can identify
every single row uniquely. This makes deptName a prime attribute.

Also, deptName depends on deptHead, and deptHead is a non-prime attribute that is not a super key. This
dependency disqualifies the table from being in BCNF.
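
The BCNF test can be sketched in Python using attribute closure. This is a simplified illustration under stated assumptions: the table is reduced to the three columns empNum, deptName, and deptHead, and the two functional dependencies listed in `fds` are assumed from the description above. The function names `closure` and `bcnf_violations` are hypothetical.

```python
def closure(attrs, fds):
    """Return every attribute that can be determined from `attrs` using `fds`."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """List the non-trivial dependencies whose left side is not a super key."""
    violations = []
    for lhs, rhs in fds:
        trivial = set(rhs) <= set(lhs)
        left_is_super_key = closure(lhs, fds) == set(all_attrs)
        if not trivial and not left_is_super_key:
            violations.append((lhs, rhs))
    return violations


# Assumed attributes and dependencies for the scenario described above.
attributes = {"empNum", "deptName", "deptHead"}
fds = [
    (("empNum", "deptName"), ("deptHead",)),  # the key determines the head
    (("deptHead",), ("deptName",)),           # a head belongs to one department
]

violations = bcnf_violations(attributes, fds)
print(violations)
```

Under these assumptions, the only dependency reported is the one from deptHead to deptName, because deptHead alone cannot determine empNum and so is not a super key.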

To solve this we will break the table into three different tables as mentioned below:

Now that you have gone through the normal forms, let’s try a short activity to see how much you understood from
this short introduction to our lesson.

Activity 1: Skill-building Activities.

There you go! I expect that you learned something today, and I am
excited to hear your understanding of our lesson. Answer
the following:

In this activity, you are going to normalize the sample database table named
Student_Grade_Report. Listed below are the columns of the table.

Student_Grade_Report (StudentNo, StudentName, Major, CourseNo, CourseName,
InstructorNo, InstructorName, InstructorLocation, Grade)

Activity 2: Check for Understanding (5 mins)

In this activity, create the database schema for activity no 3. The database schema will be based on
the final answer (the final normal form) only.

Activity 3: Answer the following questions below.

1. What is Normalization in a Database?

2. What are the different types of Normalization?

3. What is the Purpose of Normalization?

4. What is Denormalization?
