You are on page 1of 30

Chapter 4

Functional Dependency and Normalization

1
Functional Dependency
 Describes the relationship between attributes in a
relation
 If A and B are attributes of relation R, B is functionally
dependent on A (denoted A->B),
 if value of B can be determined given A
Example
R = Person (ID, Name, Address, Birth_date)
• ID→ (Name, Address, Birth_date)
• A person’s Name, address, and birth date are
functionally dependent on that person's ID
2
Con’t
• Partial Dependency occurs when an attribute in a
table depends on only part of the primary key and
not on the whole key .
• Full Dependency
• If an attribute which is not a member of the
primary key is not dependent on some part of the
primary key but the whole key (if we have
composite primary key) then
• that attribute is fully functionally dependent on the
primary key.
3
Con’t
Transitive Dependency
• In mathematics and logic, a transitive relationship is a
relationship of the following form:
 "If A implies B, and if also B implies C,
then A implies C."
Example:
• If Mr X is a Human, and if every Human is an Animal, then
Mr X must be an Animal
• When a non-prime attribute depends on other non-prime
attribute(s) rather than depending upon the primary key

4
Normalization

 is a formal process for deciding which attributes should be


grouped together in a relation
 is the process of decomposing relations with anomalies to
produce smaller, well- structured relations
• If a relation is not normalized we can encounter the
following problems
• Information Redundancy
• Anomalies:
 Insertion anomalies
 Deletion anomalies
 Update anomalies
5
Insertion anomalies

• An "insertion anomaly" is a failure to place information


about a new database entry into all the places in the
database
• where information about that new entry needs to be
stored.
• In a properly normalized database, information about
a new entry needs to be inserted into only one place in
the database; in an inadequately normalized database,
information about a new entry may need than one
place and, human fallibility being what it is, some of
the needed additional insertions may be missed.

6
Deletion anomalies
• A "deletion anomaly" is a failure to remove information
about an existing database entry when it is time to
remove that entry.
• In a properly normalized database, information about an
old, to-be-gotten-rid-of entry needs to be deleted from
only one place in the database;
• in an inadequately normalized database, information
about that old entry may need to be deleted from more
than one place, and, human fallibility being what it is,
• some of the needed additional deletions may be missed.

7
update anomalies

• Update of a database involves changing some value


of the attribute of a table.
• In a properly normalized database table, whatever
information is modified by the user, the change will
be effected and used accordingly.
• The purpose of normalization is to reduce the
chances for anomalies to occur in a database.

8
Unnormalized Form (UNF)

Example of unnormalized table

9
Problems caused by Unnormalized forms

• Redundant storage
• Potential inconsistency (eg. Spelling errors)
• Anomalies
 Insertion anomaly
 To insert details of a new department that do not
have any students yet, we must enter nulls into the
attributes for studid.
 However, as studid is the primary key, null is not
allowed. We have to wait until students are registered

10
Con’t
 Deletion anomaly:
 if we delete the record associated with studid=234,
 the details about the database course are also lost.
 Update anomaly:
 If we want to change head of the Department of
Information Science to Amare
 then we must update the department head of
Information Science in every record; otherwise, the
database will become inconsistent
11
Advantages of Normalization
 Reduces data redundancies

• Helps eliminate data anomalies


• Produces controlled redundancies to link
tables

12
Normal Forms
• Normalization is often done as a series of steps
i.e UNF, 1st NF, 2nd NF, 3rd NF, BCNF, ...
• Each normal form involves a set of rules that can be
tested against each table in the system
• If a table violates the rules of some normal form,
then
• we decompose the table in to tables that individually
meet the requirements of normalization
• Each higher form of normalization is based on the
form prior to it
13
First Normal Form(1NF)
 Requires that all column values in a table are atomic
 (e.g., a number is an atomic value, while a list or a set
is not).
We have tow ways of achiving this:
1. Putting each repeating group into a separate table and
connecting them with a primary key-foreign key
relationship
2. Moving this repeating groups to a new row by
repeating the common attributes.
 If so then Find the key with which you can find all data
14
Con’t
• Definition: a table (relation) is in 1NF
• If There are no duplicated rows in the table.
Unique identifier
 Each cell is single-valued
 Entries in a column (attribute, field) are of the
same kind.

15
Unnormalized Form (UNF)

 This is not in 1NF because subject is not an atomic attribute

16
Conversion to 1NF

 Make sure that each column has a single/atomic value

 In 1NF, data redundancy increases, as there will be many columns with same
data in multiple rows but each row as a whole will be unique

17
2nd Normal Form (2NF)
 For a relation to be in the 2NF, it must satisfy
two conditions:
 The relation should be in the 1NF
 There should be no Partial Dependency
 Remove part-key dependencies.
 Make all data dependent on the whole key.

18
Partial Dependency
To remove partial dependency
 Write each primary key component on
separate line
 Write dependent attributes after each primary
key
 Each component is new table

19
Conversion to 2NF
Example
R = (stud_id, stud_name, course_id, course_name, grade)
• Primary Key – stud_id, course_id
Functional Dependencies
{stud_id, course_id → grade, stud_id → stud_name,
course_id → course_name }
 For the relation to be in 2NF, we have to remove the
partial dependency
 (course_id → course_name & stud_id → stud_name)
by decomposing the relation into more relations
20
Con’t
The resulting tables in 2NF are the following:
 R1 = Course(course_id, course_name)
 R2 = Student(stud_id, stud_name)
 R3 = Student_Course(studid, course_id, grade)

21
Third Normal Form(3NF)
A relation schema R is in third normal form (3NF) if
 it is in 2NF and
 There are no transitive dependencies between a
primary key and non-primary key attributes
 Make all data dependent on nothing but the key.
 If transitive dependencies exist on the primary key
remove them by placing them in a new relation
along with a copy of their dominant

22
Conversion to 3NF
Consider the following table
 R= Employee(empid, empname, empsal, depid, deptname, depbudget)
Primary Key : empid
 Functional dependencies:
Empid → depid
 Depid → depname and Depid → depbudget.
 Here depname & depbudget are transitively dependent on Empid.
Therefore, the above table is not in 3NF
To normalize it, we can use the functional dependencies:
• Depid → depname, Depid → depbudjet And Empid → depid
The resulting tables in 3NF are the following:
 R1= Employee(empid, empname, empsal, depid)
 R2= Department(depid ,deptname, depbudget)

23
Conversion to 3NF

Sid FName Lname department year Dormitory

125/12 Abebe kebede Civil 2 402

136/13 Mekuria Alemu CS 1 401

141/12 Sumeya Mohamed IT 2 402

138/11 Orion Girma CS 3 403

111/13 Monica Geller Civil 3 403

200/11 Chandler Being IT 2 402

210/12 Rachel Green IS 2 402


24
Con’t
• Let’s take
 SID, Year and Dormitory and see the
dependencies.
 SID → Year AND Year → Dormitery
 Then transitively
 SID → Dormitery

25
Con’t
Student
Sid FName Lname Department Year

125/12 Abebe Kebede Civil 2


136/13 Mekuria Alemu CS 1
141/12 Sumeya Mohamed IT 2
138/11 Orion Grima CS 3
111/13 Monica Geller Civil 3
200/11 Chandler Being IT 2
210/12 Rachel Green IS 2

26
Con’t
Dorm
Year Dormitery

1 401
2 402
3 403

27
Other Levels of Normalization
Boyce-Codd Normal Form (BCNF):
 Isolate Independent Multiple Relationships
 No table may contain two or more 1:n or N:M
relationships that are not directly related.
 The correct solution, to cause the model to be in 4th
normal form,
 is to ensure that all M:M relationships are resolved
independently if they are indeed independent,
 Def: A table is in BCNF if it is in 3NF and if every
determinant is a candidate key.
28
cont
Forth Normal form (4NF) :
• Isolate Semantically Related Multiple
Relationships
• There may be practical constrains on
information that justify separating logically
related many-to-many relationships.
• Def: A table is in 4NF if it is in BCNF and if it
has no multi-valued dependencies.

29
Con’t
Fifth Normal Form (5NF):
• Def: A table is in 5NF, also called "Projection-
Join Normal Form" (PJNF), if it is in 4NF and
• if every join dependency in the table is a
consequence of the candidate keys of the
table.

30

You might also like