You are on page 1of 82

Topics to be covered

• Functional Dependency
• Definition and types of FD
• Closure of FD set
• Closure of attribute set
• Irreducible set of FD
• Normalization and normal forms
• 1NF
• 2NF
• 3NF
• BCNF
• 4NF
• 5NF
What is Functional Dependency?
▪ Let R be a relation schema having n attributes A1, A2, A3,…, An.
Student
RollNo Name SPI BL
101 Raj 8 0
102 Meet 7 1

▪ Let attributes X and Y are two subsets of attributes of relation R.


▪ If the values of the X component of a tuple uniquely (or
functionally) determine the values of the Y component, then
there is a functional dependency from X to Y.
▪ This is denoted by X → Y (i.e RollNo → Name, SPI, BL).
▪ It is referred as: Y is functionally dependent on the X or X
functionally determines Y.
Diagrammatic representation
X→Y {X1, X2} → Y X → {Y1, Y2}

X Y X1 X2 Y X Y1 Y2

▪ Example
▪ Consider the relation Account(account_no, balance, branch).
▪ account_no can determine balance and branch.
▪ So, there is a functional dependency from account_no to balance
and branch.
▪ This can be denoted by account_no → {balance, branch}.

account_no balance branch


Types of Functional Dependencies
1. Full Functional Dependency
• In a relation, the attribute B is fully functional dependent on A if B is
functionally dependent on A, but not on any proper subset of A.
• Eg. {Roll_No, Semester, Department_Name} → SPI
• We need all three {Roll_No, Semester, Department_Name} to find SPI.
2. Partial Functional Dependency
• In a relation, the attribute B is partial functional dependent on A if B is
functionally dependent on A as well as on any proper subset of A.
• If there is some attribute that can be removed from A and the still
dependency holds then it is partial functional dependancy.
• Eg. {Enrollment_No, Department_Name} → SPI
• Enrollment_No is sufficient to find SPI, Department_Name is not required
to find SPI.
Types of Functional Dependencies
3. Transitive Functional Dependency
• In a relation, if attribute(s) A → B and B → C, then C is transitively
depends on A via B.
• Eg. Staff_No → Branch_No ; Branch_No → Branch_Address
then Staff_No → Branch_Address
4. Trivial Functional Dependency
• X → Y is trivial FD if Y is a subset of X
• Eg. {Roll_No, Department_Name} → Roll_No
5. Nontrivial Functional Dependency
• X → Y is nontrivial FD if Y is not a subset of X
• Eg. {Roll_No, Department_Name} → Student_Name
Armstrong's axioms (inference rules)
▪ Armstrong's axioms are a set of rules used to infer (derive) all the
functional dependencies on a relational database.
Armstrong's axioms (inference rules)
1. Reflexivity 5. Self-determination
– If B is a subset of A – A→A
then A → B
6. Decomposition
2. Augmentation – If A → BC
– If A → B
then A → B and A → C
then AC → BC
3. Transitivity 7. Union
– If A → B and B → C – If A → B and A → C
then A → C then A → BC
4. Pseudo Transitivity 8. Composition
– If A → B and BD → C – If A → B and C → D
then AD → C then AC → BD
What is closure of a set of FDs?
▪ Given a set F set of functional dependencies, there are certain
other functional dependencies that are logically implied by F.
▪ E.g.: F = {A → B and B → C}, then we can infer that A → C
▪ The set of functional dependencies (FDs) that is logically implied
by F is called the closure of F.
▪ It is denoted by F+.
Example of closure of a set of FDs
▪ Suppose we are given a relation schema R(A,B,C,G,H,I) and the set
of functional dependencies are:
• F = (A → B, A → C, CG → H, CG → I, B → H)
▪ The functional dependency A → H is logical implied.

We have
A→B
Transitivity rule A→H
B→H
Example of closure of a set of FDs
▪ Suppose we are given a relation schema R(A,B,C,G,H,I) and the set
of functional dependencies are:
• F = (A → B, A → C, CG → H, CG → I, B → H)
▪ The functional dependency CG → HI is logical implied.

We have
CG → H
Union rule CG → HI
CG → I
Example of closure of a set of FDs
▪ Suppose we are given a relation schema R(A,B,C,G,H,I) and the set
of functional dependencies are:
• F = (A → B, A → C, CG → H, CG → I, B → H)
▪ The functional dependency AG → I is logical implied.

We have
A→C
Pseudo-transitivity rule AG → I
CG → I
Example of closure of a set of FDs
▪ Suppose we are given a relation schema R(A,B,C,G,H,I) and the set
of functional dependencies are:
• F = (A → B, A → C, CG → H, CG → I, B → H)
▪ The functional dependency AG → I is logical implied.

We have
A→C Augmentation rule AG → CG

AG → CG
Transitivity rule AG → I
CG → I
Example of closure of a set of FDs
▪ Suppose we are given a relation schema R(A,B,C,G,H,I) and the set
of functional dependencies are:
• F = (A → B, A → C, CG → H, CG → I, B → H)

Several members of F+ are


F+ = (A → H, CG → HI, AG → I)
Exercise

• Compute the closure of the following set F of functional


dependencies for relational schema R = (A,B,C,D,E,F):
F = (A → B, A → C, CD → E, CD → F, B → E)

A → BC A→B&A→C Union Rule


CD → EF CD → E & CD → F Union Rule
A→E A→B&B→E Transitivity Rule
AD → E A → C & CD → E Pseudo-transitivity Rule
AD → F A → C & CD → F Pseudo-transitivity Rule

F+ = {A → BC, CD → EF, A → E, AD → E, AD → F}
Exercise

• Compute the closure of the following set F of functional


dependencies for relational schema R = (A,B,C,D,E):
F = (AB → C, D → AC, D → E)

D → A, D →C D → AC Decomposition Rule
D → ACE D → AC & D → E Union Rule

F+ = {D → A, D → C, D → ACE}
What is a closure of attribute sets?
▪ Given a set of attributes α, the closure of α under F is the set of
attributes that are functionally determined by α under F.
▪ It is denoted by α+.
Algorithm to find closure of attribute sets
▪ Algorithm to compute α+, the closure of α under F
▪ Steps
1. result = α
2. while (changes to result) do
– for each β → γ in F do
» begin
• if β ⊆ result then result = result U γ
• else result = result
» end
Example of closure of attribute sets
▪ Consider the following relation schema
▪ R = (A, B, C, G, H, I).
▪ For this relation, a set of functional dependencies F can be given
as F = {A → B, A → C, CG → H, CG → I, B → H}
▪ Find out the closure of (AG)+.
Example of closure of attribute sets (Solution)
▪ F = {A → B, A → C, CG → H, CG → I, B → H}.
Find out (AG)+
▪ Step-1 :
• result = {AG}
▪ Step-2 : for each β → γ in F do
• if β ⊆ result then result = result U γ
• else result = result
result = ABG A→B A ⊆ AG
result = ABCG A→C A ⊆ ABG
result = ABCGH CG → H CG ⊆ ABCG
result = ABCGHI CG → I CG ⊆ ABCGH
result = ABCGHI B→H B ⊆ ABCGHI
Exercise

• Given functional dependencies (FDs) for relational schema R =


(A,B,C,D,E):
F = {A → BC, CD → E, B → D, E → A}
1. Find Closure for A
2. Find Closure for CD
3. Find Closure for B
4. Find Closure for BC
5. Find Closure for E

A+ = ABCDE
CD+ = ABCDE
B+ = BD
BC+ = ABCDE
E+ = ABCDE
What is extraneous attributes?
▪ Let us consider a relation R with schema R = (A, B, C) and set of
functional dependencies F = { AB → C, A → C }.
▪ In AB → C, B is extraneous attribute. The reason is, there is
another FD A → C, which means when A alone can determine C,
the use of B is unnecessary (extra).
▪ An attribute of a functional dependency is said to be extraneous if
we can remove it without changing the closure of the set of
functional dependencies.
What is canonical cover?
▪ A canonical cover of F is a minimal set of functional dependencies
equivalent to F, having no redundant dependencies or redundant
parts of dependencies.
▪ It is denoted by Fc
▪ A canonical cover for F is a set of dependencies Fc such that
1. F logically implies all dependencies in Fc and
2. Fc logically implies all dependencies in F and
3. No functional dependency in Fc contains an extraneous attribute and
4. Each left side of functional dependency in Fc is unique.

F = {A → B, A → C}
Decomposition Rule Union Rule
FC = {A → BC}
Algorithm to find canonical cover
▪ repeat
Use the union rule to replace any dependencies in F
α1 → β1 and α1 → β2 with α1 → β1 β2
Find a functional dependency α → β with an
extraneous attribute either in α or in β
/* Note: test for extraneous attributes done using Fc, not F*/
If an extraneous attribute is found, delete it from α → β
until F does not change
▪ Note: Union rule may become applicable after some extraneous
attributes have been deleted, so it has to be re-applied
Example to find canonical cover
▪ Consider the relation schema R = (A, B, C) with FDs
F = {A → BC, B → C, A → B, AB → C} find canonical cover.
▪ Combine A → BC and A → B into A → BC (Union Rule)
• Set is now {A → BC, B → C, AB → C}
▪ A is extraneous in AB → C
• Check if the result of deleting A from AB → C is implied by the other
dependencies
• Yes: in fact, B → C is already present
• Set is now {A → BC, B → C}
▪ C is extraneous in A → BC
• Check if A → C is logically implied by A → B and the other dependencies
• Yes: using transitivity on A → B and B → C.
▪ The canonical cover is: A → B, B → C
What is decomposition?
▪ Decomposition is the process of breaking down given relation
into two or more relations.
▪ Relation R is replaced by two or more relations in such a way that:
1. Each new relation contains a subset of the attributes of R
2. Together, they all include all tuples and attributes of R
▪ Types of decomposition
1. Lossy decomposition
2. Lossless decomposition (non-loss decomposition)
Lossy decomposition
Ano Balance Bname
A01 5000 Rajkot
A02 5000 Surat

Ano Balance Balance Bname


A01 5000 Not Same 5000 Rajkot
A02 5000 5000 Surat

Ano Balance Bname


A01 5000 Rajkot
A01 5000 Surat
Extra records
A02 5000 Rajkot
A02 5000 Surat
Lossy decomposition
▪ The decomposition of relation R into R1 and R2 is lossy when the
join of R1 and R2 does not yield the same relation as in R.
▪ This is also referred as lossy-join decomposition.
▪ The disadvantage of such kind of decomposition is that some
information is lost during retrieval of original relation.
▪ From practical point of view, decomposition should not be lossy
decomposition.
Lossless decomposition
Ano Balance Bname
A01 5000 Rajkot
A02 5000 Surat

Ano Balance Ano Bname


A01 5000 Same A01 Rajkot
A02 5000 A02 Surat

Ano Balance Bname


A01 5000 Rajkot Same as original table
A02 5000 Surat No extra records
Lossless decomposition
▪ The decomposition of relation R into R1 and R2 is lossless when
the join of R1 and R2 produces the same relation as in R.
▪ This is also referred as a non-additive (non-loss) decomposition.
▪ All decompositions must be lossless.
What is an anomaly in database design?
▪ Anomalies are problems that can occur in poorly planned, un-
normalized database where all the data are stored in one table.
▪ There are three types of anomalies that can arise in the database
because of redundancy are
1. Insert anomaly
2. Delete anomaly
3. Update / Modification anomaly
Insert anomaly
▪ Consider a relation
• emp_dept (E#, Ename, Address, D#, Dname, Dmgr#) E# as a primary key
E# Ename Address D# Dname Dmgr#
Want to insert new 1 Raj Rajkot 1 CE 1
department detail (IT) 2 Meet Surat 1 CE 1

▪ Suppose a new department (IT) has been started by the organization


but initially there is no employee appointed for that department.
▪ We want to insert that department detail in emp_dept table.
▪ But the tuple for this department cannot be inserted into this table as
the E# will have NULL value, which is not allowed because E# is primary
key.
▪ This kind of problem in the relation where some tuple cannot be
inserted is known as insert anomaly.
What is Insert anomaly?
▪ An insert anomaly occurs when certain attributes cannot be
inserted into the database without the presence of another
attribute.
Delete anomaly
▪ Consider a relation
• emp_dept (E#, Ename, Address, D#, Dname, Dmgr#) E# as a primary key
E# Ename Address D# Dname Dmgr#
Want to delete Meet 1 Raj Rajkot 1 CE 1
employee's detail 2 Meet Surat 1 IT 2

▪ Now consider there is only one employee in some department (IT) and
that employee leaves the organization.
▪ So we need to delete tuple of that employee (Meet).
▪ But in addition to that information about the department also deleted.
▪ This kind of problem in the relation where deletion of some tuples can
lead to loss of some other data not intended to be removed is known as
delete anomaly.
What is Delete anomaly?
▪ A delete anomaly exists when certain attributes are lost because
of the deletion of another attribute.
Update anomaly
▪ Consider a relation
• emp_dept (E#, Ename, Address, D#, Dname, Dmgr#) E# as a primary key
E# Ename Address D# Dname Dmgr#
Want to update CE 1 Raj Rajkot 1 CE M1
department’s manager 2 Meet Surat 2 IT M2
3 Jay Rajkot 2 C.E M2

▪ Suppose the manager of a (CE) department has changed, this requires


that the Dmgr# in all the tuples corresponding to that department
must be changed to reflect the new status.
▪ If we fail to update all the tuples of given department, then two
different records of employee working in the same department might
show different Dmgr# lead to inconsistency in the database.
What is Update anomaly?
▪ An update anomaly exists when one or more records (instance) of
duplicated data is updated, but not all.
Anomaly (Summary)
EmpID EmpName Address DeptID DeptName DeptMngr
E1 Raj Rajkot D1 C.E. Patel
E2 Samir Rajkot D2 Civil Shah
E3 Meet Baroda D1 Computer Patel
E4 Deepak Surat D1 C.E Patel
E5 Suresh Surat D3 Electrical Joshi
null null null D4 Chemical null

Delete Anomaly Insert Anomaly


If we delete Employee having ID “E2” then Do not allow to insert new
Civil department will also delete because Department “Chemical” until an
there is only one record of Civil dept. employee is assign to it.

Update Anomaly
An update anomaly exists when one or more records of duplicated data is updated, but
not all.
How to deal with insert anomaly
EmpID EmpName Address DeptID DeptName DeptMngr
E1 Raj Rajkot D1 Computer
C.E. Patel
E2 Samir Rajkot D2 Civil Shah
E3 Meet Baroda D1 Computer Patel
E4 Deepak Surat D1 C.E
Computer Patel
E5 Suresh Surat D3 Electrical Joshi
null null null D4 Chemical null
Do not allow to insert new department “Chemical” until an employee is assign to it.
EmpID EmpName Address DeptID DeptID DeptName DeptMngr
E1 Raj Rajkot D1 D1 Computer Patel
E2 Samir Rajkot D2 D2 Civil Shah
E3 Meet Baroda D1 D3 Electrical Joshi
E4 Deepak Surat D1 D4 Chemical null
E5 Suresh Surat D3
How to deal with delete anomaly
EmpID EmpName Address DeptID DeptName DeptMngr
E1 Raj Rajkot D1 Computer
C.E. Patel
E2 Samir Rajkot D2 Civil Shah
E3 Meet Baroda D1 Computer Patel
E4 Deepak Surat D1 C.E
Computer Patel
E5 Suresh Surat D3 Electrical Joshi

If we delete Employee having ID “E2” then Civil department will also delete because
there is only one record of Civil dept.

EmpID EmpName Address DeptID DeptID DeptName DeptMngr


E1 Raj Rajkot D1 D1 Computer Patel
E2 Samir Rajkot D2 D2 Civil Shah
E3 Meet Baroda D1 D3 Electrical Joshi
E4 Deepak Surat D1
E5 Suresh Surat D3
How to deal with update anomaly
EmpID EmpName Address DeptID DeptName DeptMngr
E1 Raj Rajkot D1 Computer
C.E. Patel
E2 Samir Rajkot D2 Civil Shah
E3 Meet Baroda D1 Computer Patel
E4 Deepak Surat D1 C.E
Computer Patel
E5 Suresh Surat D3 Electrical Joshi
Changing the name of department D1 from “Computer” to “IT” may update one or
more records, but not all.

EmpID EmpName Address DeptID DeptID DeptName DeptMngr


E1 Raj Rajkot D1 D1 Computer Patel
E2 Samir Rajkot D2 D2 Civil Shah
E3 Meet Baroda D1 D3 Electrical Joshi
E4 Deepak Surat D1
E5 Suresh Surat D3
How anomalies in database design can be solved?
▪ Such type of anomalies in database design can be solved by using
normalization.
What is normalization?
▪ Normalization is the process of removing redundant data from
tables to improve data integrity, scalability and storage
efficiency.
1. data integrity (completeness, accuracy and consistency of data)
2. scalability (ability of a system to continue to function well in a
growing amount of work)
3. storage efficiency (ability to store and manage data that consumes
the least amount of space)
What we do in normalization?
▪ Normalization generally involves splitting an existing table into
multiple (more than one) tables, which can be re-joined or linked
each time a query is issued (executed).
How many normal forms are there?
▪ Normal forms:
1. 1NF (First normal form)
2. 2NF (Second normal form)
3. 3NF (Third normal form)
4. BCNF (Boyce–Codd normal form)
5. 4NF (Forth normal form)
6. 5NF (Fifth normal form)

As we move from 1NF to 5NF number of tables and


complexity increases but redundancy decreases.
1NF (First Normal Form)
▪ Conditions for 1NF
Each cells of a table should contain a single value.

▪ A relation R is in first normal form (1NF) if and only if it does not


contain any composite attribute or multi-valued attributes or
their combinations.
OR
▪ A relation R is in first normal form (1NF) if and only if all
underlying domains contain atomic values only.
1NF (First Normal Form) [Composite attribute]
Customer
CustomerID Name Address
C01 Raj Jamnagar Road, Rajkot
C02 Meet Nehru Road, Jamnagar

▪ In above relation address is composite attribute which is further


divided into sub-attributes as “Road” and “City”.
▪ So above relation is not in 1NF.
1NF (First Normal Form) [Composite attribute]
Customer
CustomerID Name Address
C01 Raj Jamnagar Road, Rajkot
C02 Meet Nehru Road, Jamnagar

▪ Problem: It is difficult to retrieve the list of customers living in


‘Jamnagar’ from above table.
▪ The reason is that address attribute is composite attribute which
contains road name as well as city name in single cell.
▪ It is possible that city name word is also there in road name.
▪ In our example, ‘Jamnagar’ word occurs in both records, in first
record it is a part of road name and in second one it is the name
of city.
1NF (First Normal Form) [Composite attribute]
Customer
CustomerID Name Address
C01 Raj Jamnagar Road, Rajkot
C02 Meet Nehru Road, Jamnagar

▪ Solution: Divide composite attributes into number of sub-


attributes and insert value in proper sub-attribute.

Customer
CustomerID Name Road City
C01 Raj Jamnagar Road Rajkot
C02 Meet Nehru Road Jamnagar
1NF (First Normal Form) [Multivalued attribute]
Student
RollNo Name FailedinSubjects
101 Raj DS, DBMS
102 Meet DBMS, DS
103 Jeet DS, DBMS, DE
104 Harsh DBMS, DE, DS
105 Nayan DE, DBMS, DS

▪ In above relation FailedinSubjects is a multi-valued attribute


which can store more than one values.
▪ So above relation is not in 1NF.
1NF (First Normal Form) [Multivalued attribute]
Student
RollNo Name FailedinSubjects
101 Raj DS, DBMS
102 Meet DBMS, DS
103 Jeet DS, DBMS, DE
104 Harsh DBMS, DE, DS
105 Nayan DE, DBMS, DS

▪ Problem: It is difficult to retrieve the list of students failed in


‘DBMS’ as well as ‘DS’ but not in other subjects from above table.
▪ The reason is that FailedinSubjects attribute is multi-valued
attribute so it contains more than one value.
1NF (First Normal Form) [Multivalued attribute]
Student Student Result
RollNo Name FailedinSubjects RollNo Name RID RollNo Subject
101 Raj DS, DBMS 101 Raj 1 101 DS
102 Meet DBMS, DS 102 Meet 2 101 DBMS
103 Jeet DS, DBMS, DE 103 Jeet 3 102 DBMS
104 Harsh DBMS, DE, DS 104 Harsh 4 102 DS
105 Nayan DE, DBMS, DS 105 Nayan 5 103 DS
… … …

▪ Solution: Split the table into two tables in such as way that
• the first table contains all attributes except multi-valued attribute with
same primary key and
• other table contains multi-valued attribute and place a primary key in it.
• insert the primary key of first table in the second table as a foreign key.
2NF (Second Normal Form)
▪ Conditions for 2NF
It is in 1NF and each table should contain a single primary key.

▪ A relation R is in second normal form (2NF)


• if and only if it is in 1NF and
• every non-primary key attribute is fully dependent on the primary key
OR
▪ A relation R is in second normal form (2NF)
• if and only if it is in 1NF and
• no any non-primary key attribute is partially dependent on the primary
key
2NF (Second Normal Form)
CustomerID AccountNO AccesssDate Balance BranchName
C01 A01 01-01-2017 50000 Rajkot
C02 A01 01-03-2017 50000 Rajkot
C01 A02 01-05-2017 25000 Surat
C03 A02 01-07-2017 25000 Surat

CustomerID AccountNO AccesssDate Balance BranchName

▪ FD1: {CustomerID, AccountNO} → {AccesssDate, Balance, BranchName}


▪ FD2: AccountNO → {Balance, BranchName}
▪ Balance and BranchName are partial dependent on primary key.
So above relation is not in 2NF.
2NF (Second Normal Form)
CustomerID AccountNO AccesssDate Balance BranchName
C01 A01 01-01-2017 50000 Rajkot
C02 A01 01-03-2017 50000 Rajkot
C01 A02 01-05-2017 25000 Surat
C03 A02 01-07-2017 25000 Surat

▪ Problem: For example, in case of a joint account multiple


customers have common (one) accounts.
▪ If an account ‘A01’ is operated jointly by two customers says ‘C01’
and ‘C02’ then data values for attributes Balance and
BranchName will be duplicated in two different tuples of
customers ‘C01’ and ‘C02’.
2NF (Second Normal Form)
Solution: Decompose relation in such a CstID ActNO AccessDate Balance Branch
way that resultant relations do not have
any partial FD. C01 A01 01-01-2017 50000 Rajkot
C02 A01 01-03-2017 50000 Rajkot
C01 A02 01-05-2017 25000 Surat
C03 A02 01-07-2017 25000 Surat

Remove partial dependent attributes


ActNO Balance Branch
from the relation that violets 2NF.
Place them in separate relation along Table 1 A01 50000 Rajkot
with the prime attribute on which they A02 25000 Surat
are fully dependent. CstID ActNO AccessDate
The primary key of new relation will be C01 A01 01-01-2017
the attribute on which it is fully Table 2 C02 A01 01-03-2017
dependent.
C01 A02 01-05-2017
Keep other attributes same as in that
C03 A02 01-07-2017
table with the same primary key.
3NF (Third Normal Form)
▪ Conditions for 3NF

It is in 2NF and there is no transitive dependency.

(Transitive dependency???) A → B & B → C then A → C.

▪ A relation R is in third normal form (3NF)


• if and only if it is in 2NF and
• every non-key attribute is non-transitively dependent on the primary key
OR
▪ A relation R is in third normal form (3NF)
• if and only if it is in 2NF and
• no any non-key attribute is transitively dependent on the primary key
3NF (Third Normal Form)
AccountNO Balance BranchName BranchAddress
A01 50000 Rajkot Kalawad Road
A02 40000 Rajkot Kalawad Road
A03 35000 Rajkot Kalawad Road
A04 25000 Rajkot Kalawad Road

AccountNO Balance BranchName BranchAddress

▪ FD1: AccountNO → {Balance, BranchName, BranchAddress} and


▪ FD2: BranchName → BranchAddress
▪ So AccountNO → BranchAddress (Using Transitivity rule)
▪ Problem: In this relation, branch address will be stored repeatedly for
each account of the same branch which occupies more space.
3NF (Third Normal Form)
Solution: Decompose relation in such a
ANO Balance BName BAddress
way that resultant relations do not have
any transitive FD. A01 50000 Rajkot Kalawad Road
A02 40000 Rajkot Kalawad Road
A03 35000 Rajkot Kalawad Road
Remove transitive dependent attributes A04 25000 Rajkot Kalawad Road
from the relation that violets 3NF.
Place them in a new relation along with BName BAddress
the non-prime attributes due to which
Table 1 Rajkot Kalawad Road
transitive dependency occurred.
The primary key of the new relation will
be non-prime attributes due to which ANO Balance BName
transitive dependency occurred. A01 50000 Rajkot
Table 2
Keep other attributes same as in the A02 40000 Rajkot
table with same primary key and add A03 35000 Rajkot
prime attributes of other relation into it
as a foreign key. A04 25000 Rajkot
BCNF (Boyce-Codd Normal Form)
▪ Conditions for BCNF

BCNF is based on the concept of a determinant.

It is in 3NF and every determinant should be primary key.

Primary Key

Determinant Dependent

AccountNO → {Balance, Branch}

AccountNO → {Balance, Branch}


BCNF (Boyce-Codd Normal Form)
▪ A relation R is in Boyce-Codd normal form (BCNF)
• if and only if it is in 3NF and
• for every functional dependency X → Y, X should be the primary key of the
table.
OR
▪ A relation R is in Boyce-Codd normal form (BCNF)
• if and only if it is in 3NF and
• every prime key attribute is non-transitively dependent on the primary
key
OR
▪ A relation R is in Boyce-Codd normal form (BCNF)
• if and only if it is in 3NF and
• no any prime key attribute is transitively dependent on the primary key
BCNF (Boyce-Codd Normal Form)
▪ FD1: Student, Language→ Faculty Student Language Faculty
Mita JAVA Patel
▪ FD2: Faculty → Language
Nita VB Shah
▪ So {Student, Language} → Language Sita JAVA Jadeja
(Using Transitivity rule) Gita VB Dave
Rita VB Shah
Here, one faculty teaches only one subject, but a
subject may be taught by more than one faculty. Nita JAVA Patel
Mita VB Dave
In FD2, determinant is Faculty which is not a Rita JAVA Jadeja
primary key. So table is not in BCNF.

▪ Problem: In this relation one student has more than one project
with different guide (faculty) then records will be stored
repeatedly for each student, language and guide (faculty)
combination which occupies more space.
BCNF (Boyce-Codd Normal Form)
Solution: Decompose relation in such a
way that resultant relations do not have Student Language Faculty
any transitive FD.

Remove transitive dependent prime


attribute from relation that violets Table 1 Table 2
BCNF.
Faculty Language Student Faculty
Place them in separate new relation along Patel JAVA Mita Patel
with the non-prime attribute due to
Shah VB Nita Shah
which transitive dependency occurred.
Jadeja JAVA Sita Jadeja
The primary key of new relation will be
Dave VB Gita Dave
this non-prime attribute due to which
transitive dependency occurred. Rita Shah

Keep other attributes same as in that Nita Patel


table with same primary key and add a Mita Dave
prime attribute of other relation into it Rita Jadeja
as a foreign key.
Multivalued dependency (MVD)
▪ For a dependency X → Y, if for a single value of X, multiple values
of Y exists, then the table may have multi-valued dependency.
X Y

studentID subject activity Multivalued dependency


101 DS Cricket (MVD) is denoted by →→
101 DBMS Cricket Multivalued dependency
101 DS Football (MVD) is represented as
101 DBMS Football X →→ Y
4NF (Fourth Normal Form)
▪ A relation R is in fourth normal form (4NF)
• if and only if it is in BCNF and
• has no multivalued dependencies
4NF (Fourth Normal Form)
studentID subject activity
101 DS Cricket
101 DBMS Cricket
101 DS Football
101 DBMS Football

Decompose

studentID subject studentID activity


101 DS 101 Cricket
101 DBMS 101 Football
Functional dependency & Multivalued dependency
▪ A table can have both functional dependency as well as multi-
valued dependency to gather.
• studentID → address
• studentID →→ subject
• studentID →→ activity

studentID address subject activity


101 C. G. Road DS Cricket
101 C. G. Road DBMS Cricket
101 C. G. Road DS Football
101 C. G. Road DBMS Football
Functional dependency & Multivalued dependency

studentID address subject activity


101 C. G. Road DS Cricket
101 C. G. Road DBMS Cricket
101 C. G. Road DS Football
101 C. G. Road DBMS Football

Decompose

studentID subject studentID address studentID activity


101 DS 101 C. G. Road 101 Cricket
101 DBMS 101 Football
5NF (Fifth Normal Form)
▪ A relation R is in fifth normal form (5NF)
• if and only if it is in 4NF and
• it cannot have a lossless decomposition in to any number of smaller tables
(relations).
5NF (Fifth Normal Form)
Result
ResultID RollNo StudentName SubjectName Result
1 101 Raj DBMS Pass
2 101 Raj DS Pass
3 101 Raj DE Pass
4 102 Meet DBMS Pass
5 102 Meet DS Fail
6 102 Meet DE Pass
7 103 Suresh DBMS Fail
8 103 Suresh DS Pass
9 103 Suresh DE Fail

Above relation is further decompose into sub-relations.


So above relation is not in 5NF.
5NF (Fifth Normal Form)
Student Result
RollNo StudentName ResultID RollNo SubjectID Result
101 Raj 1 101 1 Pass
102 Meet 2 101 2 Pass
103 Suresh 3 101 3 Pass
4 102 1 Pass
Subject 5 102 2 Fail
SubjectID SubjectName 6 102 3 Pass
1 DBMS 7 103 1 Fail
2 DS 8 103 2 Pass
3 DE 9 103 3 Fail

None of the above relations can be further decomposed


into sub-relations. So above database is in 5NF.
How to find key?
▪ Conditions to find key
1. The attribute is a part of key, if it does not occur on any side of FD
2. The attribute is a part of key, if it occurs on the left-hand side of an FD,
but never occurs on the right-hand side
3. The attribute is not a part of key, if it occurs on the right-hand side of an
FD, but never occurs on the left-hand side
4. The attribute may be a part of key or not, if it occurs on the both side of
an FD
Example to find (candidate) key
▪ Let a relation R with attributes ABCD with FDs C → A, B → C. Find
keys for relation R.
1. attribute not occur on any side of FDs (D)
2. attribute occurs on only left-hand side of an FDs (B)
3. attribute occurs on only right-hand side of an FDs (A)
4. attribute occurs on both the sides of an FDs (C)
▪ The core is BD.
▪ B determines C and C determines A, So using transitivity rule B
determine A also.
▪ So BD is a key.
Exercise to find (candidate) key
▪ Let a relation R with attributes ABCD with FDs C → D, C → A and
B → C. Find keys for relation R.
• The core is B. B determines C which determines A and D, so B is a key.
Therefore B is the only key.
▪ Let a relation R with attributes ABCD with FDs B → C, D → A. Find
keys for relation R.
• The core is BD. B determines C and D determines A, so BD is a key.
Therefore BD is the only key.
▪ Let a relation R with attributes ABCD with FDs A → B, BC → D and
A → C. Find keys for relation R.
• The core is A. A determines B and C which determine D, so A is a key.
Therefore A is the only key.
Find (candidate) key & check for normal forms
▪ Suppose you are given a relation R with four attributes ABCD. For
each of the following sets of FDs, do the following:
1. Identify the candidate key(s) for R.
2. Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).

B → C, D → A

Candidate Key is BD

Relation R is in 1NF but not 2NF. In above FDs there is


partial dependency
(C is depend on only B but Key is BD)
(A is depend on only D but Key is BD)
Find (candidate) key & check for normal forms
▪ Suppose you are given a relation R with four attributes ABCD. For
each of the following sets of FDs, do the following:
1. Identify the candidate key(s) for R.
2. Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).

C → D, C → A, B → C

Candidate Key is B

Relation R is in 2NF but not 3NF. In above FDs there is


transitive dependency
(B → C & C → D so B → D using transitivity rule) or
(B → C & C → A so B → A using transitivity rule)
Find (candidate) key & check for normal forms
▪ Suppose you are given a relation R with four attributes ABCD. For
each of the following sets of FDs, do the following:
1. Identify the candidate key(s) for R.
2. Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).

A → B, BC → D, A → C

Candidate Key is A

Relation R is in 2NF but not 3NF. In above FDs there is


transitive dependency
(A → B & A → C so A → BC using union rule) and
(A → BC & BC → D so A → D using transitivity rule)
Find (candidate) key & check for normal forms
▪ Suppose you are given a relation R with four attributes ABCD. For
each of the following sets of FDs, do the following:
1. Identify the candidate key(s) for R.
2. Identify the best normal form that R satisfies (1NF, 2NF, 3NF or BCNF).

ABC → D, D → A

Candidate Key are ABC, BCD

Relation R is in 3NF but not BCNF. In above FDs both


FDs have prime attribute in dependent (right) side
D and A
How to normalize database?
▪ A software contract and consultancy firm maintains details of all
the various projects in which its employees are currently involved.
These details comprise: Employee Number, Employee Name, Date
of Birth, Department Code, Department Name, Project Code,
Project Description, Project Supervisor
Assume the following:
I. Each employee number is unique.
II. Each department has a single department code.
III. Each project has a single code and supervisor.
IV. Each employee may work on one or more projects.
V. Employee names need not necessarily be unique.
VI. Project Code, Project Description and Project Supervisor are repeating
fields.
VII. Normalize this data to Third Normal Form.
How to normalize database? [UNF]
▪ A software contract and consultancy firm maintains details of all
the various projects in which its employees are currently involved.
These details comprise: Employee Number, Employee Name, Date
of Birth, Department Code, Department Name, Project Code,
Project Description, Project Supervisor

UNF
Employee Employee Date of Department Department Project Project Project
Number Name Birth Code Name Code Description Supervisor
1 Raj 1-1-85 1 CE 1 IOT Patel
2 Meet 4-4-86 2 EC 2 PHP Shah
3 Suresh 2-2-85 1 CE 3 IOT Patel
1 Raj 1-1-85 1 CE 2 PHP Shah
How to normalize database? [1NF]
UNF
Employee Employee Date of Department Department Project Project Project
Number Name Birth Code Name Code Description Supervisor
1 Raj 1-1-85 1 CE 1 IOT Patel
2 Meet 4-4-86 2 EC 2 PHP Shah
3 Suresh 2-2-85 1 CE 1 IOT Patel
1 Raj 1-1-85 1 CE 2 PHP Shah

1NF

Employee Employee Date of Department Department Employee Project Project Project


Number Name Birth Code Name Number Code Description Supervisor
1 Raj 1-1-85 1 CE 1 1 IOT Patel
2 Meet 4-4-86 2 EC 2 2 PHP Shah
3 Suresh 2-2-85 1 CE 3 1 IOT Patel
1 2 PHP Shah
How to normalize database? [2NF]
2NF

Employee Employee Date of Department Department Project Project Project


Number Name Birth Code Name Code Description Supervisor
1 Raj 1-1-85 1 CE 1 IOT Patel
2 Meet 4-4-86 2 EC 2 PHP Shah
3 Suresh 2-2-85 1 CE

Employee Project
Number Code
1 1
2 2
3 1
1 2
How to normalize database? [3NF]
3NF

Employee Employee Date of Department Project Project Project


Number Name Birth Code Code Description Supervisor
1 Raj 1-1-85 1 1 IOT Patel
2 Meet 4-4-86 2 2 PHP Shah
3 Suresh 2-2-85 1

Employee Project
Number Code
Department Department
Code Name 1 1
1 CE 2 2
2 EC 3 1
1 2

You might also like