
Closure

Step-01: Add the attributes contained in the attribute set for
which closure is being calculated to the result set.

Step-02:
Repeatedly add to the result set the attributes which can be
functionally determined from the attributes already contained
in the result set, until no new attribute can be added.
Closure
Example-

Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-

A → BC

BC → DE

D → F

CF → G
Closure
Closure of attribute A-

A+ = { A }

= { A , B , C } ( Using A → BC )

= { A , B , C , D , E } ( Using BC → DE )

= { A , B , C , D , E , F } ( Using D → F )

= { A , B , C , D , E , F , G } ( Using CF → G )

Thus,

A+ = { A , B , C , D , E , F , G }
Closure
Closure of attribute D-

D+ = { D }

= { D , F } ( Using D → F )

We cannot determine any other attribute using attributes D and F contained in the
result set.

Thus,

D+ = { D , F }
Closure
Closure of attribute set {B, C}-

{ B , C }+ = { B , C }

= { B , C , D , E } ( Using BC → DE )

= { B , C , D , E , F } ( Using D → F )

= { B , C , D , E , F , G } ( Using CF → G )

Thus,

{ B , C }+ = { B , C , D , E , F , G }
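
The two steps and the three closures worked out above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `closure` and the encoding of each dependency as a pair of attribute strings are assumptions made for this example.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is a string of
    single-letter attribute names (an encoding assumed for this sketch).
    """
    result = set(attrs)                # Step-01: start with the given attributes
    changed = True
    while changed:                     # Step-02: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Functional dependencies of R(A, B, C, D, E, F, G) from the example
FDS = [("A", "BC"), ("BC", "DE"), ("D", "F"), ("CF", "G")]

print(sorted(closure("A", FDS)))    # A+
print(sorted(closure("D", FDS)))    # D+
print(sorted(closure("BC", FDS)))   # {B, C}+
```

The loop stops as soon as a full pass over the dependencies adds no attribute, which is exactly the point where the hand calculation says "we cannot determine any other attribute".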
Functional Dependency

A functional dependency is a relationship that exists
between two attributes (or two sets of attributes). It typically
exists between the primary key and a non-key attribute within a table.

X → Y
The left side of the FD is known as the determinant; the right side of
the FD is known as the dependent.
Functional Dependency
Assume we have an employee table with attributes: Emp_Id,
Emp_Name, Emp_Address.

Here, the Emp_Id attribute can uniquely identify the Emp_Name
attribute of the employee table because if we know the Emp_Id,
we can tell the employee name associated with it.
Functional dependency can be written as:
Emp_Id → Emp_Name
We can say that Emp_Name is functionally dependent on
Emp_Id.
Functional Dependency
• A Functional Dependency (FD) determines the relation of one
attribute to another attribute in a database management
system (DBMS).
• Functional dependency helps you to maintain the quality of
data in the database.
• A functional dependency is denoted by an arrow →.
• The functional dependency of Y on X is represented by X → Y.
Functional Dependency

• If we know the value of Employee number, we can obtain
Employee Name, city, salary, etc.
• By this, we can say that the city, Employee Name, and salary
are functionally dependent on Employee number.
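
One way to make this concrete is to test whether a dependency is violated by the rows of a table. The sketch below is only an illustration: the function `fd_holds`, the column names, and the three sample rows are all hypothetical and do not come from the slides.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data, invented for this example
employees = [
    {"emp_no": 1, "name": "Asha", "city": "Pune", "salary": 50000},
    {"emp_no": 2, "name": "Ravi", "city": "Pune", "salary": 42000},
    {"emp_no": 3, "name": "Asha", "city": "Delhi", "salary": 61000},
]

print(fd_holds(employees, ["emp_no"], ["name", "city", "salary"]))
print(fd_holds(employees, ["name"], ["city"]))
```

A check like this can only show that the current rows do not violate a dependency. Whether the dependency really holds is a statement about the meaning of the data, not about one snapshot of it.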
Functional Dependency Rules

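• Three most important rules for Functional Dependency:
1. Reflexive rule: if Y is a subset of X, then X → Y.
2. Augmentation rule: if X → Y, then XZ → YZ for any Z.
3. Transitivity rule: if X → Y and Y → Z, then X → Z.
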
Types of Functional Dependency
1. Multivalued dependency
2. Trivial functional dependency
3. Non-trivial functional dependency
4. Transitive dependency
Types of Functional Dependency
1. Multivalued dependency
• Multivalued dependency occurs in the situation where there
are multiple independent multivalued attributes in a single
table.
• A multivalued dependency is a complete constraint between
two sets of attributes in a relation.
• It requires that certain tuples be present in a relation.
Types of Functional Dependency
1. Multivalued dependency

maf_year and color are independent of each other but dependent on
car_model. These two columns are said to be multivalue dependent on
car_model.
Types of Functional Dependency
1. Multivalued dependency

car_model -->> maf_year

car_model -->> color

(In the non-trivial dependency example Company → CEO, CEO is not a subset
of Company, and hence it is a non-trivial functional dependency.)
Types of Functional Dependency
2. Trivial functional dependency

A → B is a trivial functional dependency if B is a subset of A.

Dependencies such as A → A and B → B are also trivial.
Types of Functional Dependency
Example:
Consider a table with two columns Employee_Id and
Employee_Name.
{Employee_Id, Employee_Name} → Employee_Id is a trivial
functional dependency as
Employee_Id is a subset of {Employee_Id, Employee_Name}.

Also, Employee_Id → Employee_Id and Employee_Name →
Employee_Name are trivial dependencies too.
Types of Functional Dependency
3. Non-trivial functional dependency
A → B is a non-trivial functional dependency if B is not a
subset of A.
When the intersection of A and B is empty, A → B is called
completely non-trivial.
Example:
ID → Name,
Name → DOB
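
The trivial / non-trivial distinction depends only on how the two sides overlap, so it can be sketched with plain set operations. The function name `classify` is an assumption for this example, and the third call uses a made-up dependency to show the in-between case.

```python
def classify(lhs, rhs):
    """Classify the dependency lhs → rhs by comparing the two attribute sets."""
    lhs, rhs = set(lhs), set(rhs)
    if rhs <= lhs:
        return "trivial"                  # B is a subset of A
    if lhs & rhs:
        return "non-trivial"              # B is not a subset of A, but they overlap
    return "completely non-trivial"       # A and B share no attribute


print(classify({"Employee_Id", "Employee_Name"}, {"Employee_Id"}))
print(classify({"ID"}, {"Name"}))
print(classify({"ID", "Name"}, {"Name", "DOB"}))   # illustrative dependency
```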
Fully Functional Dependency
X → Y
Y is fully functionally dependent on X if Y is functionally dependent
on X but not functionally dependent on any proper subset of X.
Functional Dependency
X → Y
Y is fully functionally dependent on X if Y is functionally dependent on X
but not functionally dependent on any proper subset of X.

Eid  Ename  Dept
101  A      Mkt
102  B      Fin
103  B      HR

{Eid, Ename} → Dept
Subsets:
Eid → Dept
Ename → Dept
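
The definition can be checked mechanically on the three rows of the example table: test the dependency itself, then test every proper subset of its left side. This is a sketch under the assumption that the three rows are the whole table; the helper names `fd_holds` and `is_fully_dependent` are invented for the illustration.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        if seen.setdefault(key, value) != value:
            return False
    return True


def is_fully_dependent(rows, lhs, rhs):
    """True if lhs → rhs holds and no proper subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(len(lhs)):              # sizes 0 .. len(lhs) - 1
        for subset in combinations(lhs, size):
            if fd_holds(rows, list(subset), rhs):
                return False                  # a smaller determinant is enough
    return True


rows = [
    {"Eid": 101, "Ename": "A", "Dept": "Mkt"},
    {"Eid": 102, "Ename": "B", "Dept": "Fin"},
    {"Eid": 103, "Ename": "B", "Dept": "HR"},
]

print(fd_holds(rows, ["Ename"], ["Dept"]))                   # Ename alone
print(is_fully_dependent(rows, ["Eid"], ["Dept"]))           # Eid alone
print(is_fully_dependent(rows, ["Eid", "Ename"], ["Dept"]))  # the pair
```

On these rows Eid alone already determines Dept, so Dept is not fully dependent on the pair {Eid, Ename}, while Ename alone does not determine Dept because employee name B appears with two departments.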
Normalization
What is it, and why is normalization needed?
Normalization is the process of organizing data in a database. It:
1. Minimizes redundancy in a relation
2. Divides a big table into smaller tables
Need:
To remove anomalies
Normalization
• If a database design is not perfect, it may contain anomalies,
which are like a bad dream for any database administrator.
Managing a database with anomalies is next to impossible.
Problems Without Normalization
If a table is not properly normalized and has data redundancy,
then it will not only consume extra memory space but will also
make it difficult to handle and update the database without
facing data loss.

Insertion, updation and deletion anomalies are very frequent if the
database is not normalized.

To understand these anomalies, let us take an example of a
Student table.
Problems Without Normalization
Data Redundancy:
Duplicate data in the table
Disadvantage:
1. Insertion, deletion and update anomalies

Sid  Name  Age  Branch_code  Branch_name  HOD
1    A     18   101          CSE          XYZ
2    B     19   101          CSE          XYZ
3    C     18   101          CSE          XYZ
4    D     21   102          ECE          PQR
5    E     20   102          ECE          PQR
6    F     19   103          ME           KLM
Problems Without Normalization
1. INSERTION ANOMALY: occurs when certain data attributes cannot be
inserted into the database without the presence of other data.

In the Student table, a new branch, Civil, cannot be added until you have the sid
of a student who took admission in that branch. You cannot
enter the branch data without a sid because the entire branch data is kept in a single
table.
Anomalies
2. Update anomalies − If data items are scattered and are not
linked to each other properly, then it could lead to strange
situations.

For example, when we try to update one data item having its
copies scattered over several tables, a few instances get
updated properly while a few others are left with old values.
Such instances leave the database in an inconsistent state.
Anomalies
3. Deletion anomalies − We try to delete a record that is
unwanted, but this leads to deletion of data that we wanted to keep.
Example: we want to delete the details of a student but do not
want to delete the branch information. Because the branch information
is not stored separately, it will also be deleted.
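
The three anomalies disappear once branch data is stored separately from student data. The sketch below splits the Student table from the earlier slide into two Python dictionaries that stand in for two tables. The branch code 104 for Civil and the new HOD name "ABC" are hypothetical values chosen for the illustration.

```python
# Rows of the unnormalized Student table:
# (sid, name, age, branch_code, branch_name, hod)
rows = [
    (1, "A", 18, 101, "CSE", "XYZ"),
    (2, "B", 19, 101, "CSE", "XYZ"),
    (3, "C", 18, 101, "CSE", "XYZ"),
    (4, "D", 21, 102, "ECE", "PQR"),
    (5, "E", 20, 102, "ECE", "PQR"),
    (6, "F", 19, 103, "ME", "KLM"),
]

# Split into a student table and a branch table keyed by branch_code
student = {sid: (name, age, code) for sid, name, age, code, _, _ in rows}
branch = {code: (bname, hod) for _, _, _, code, bname, hod in rows}

# Insertion: a branch can exist before any student joins it
branch[104] = ("Civil", None)      # hypothetical code; HOD not yet known

# Deletion: removing the only ME student keeps the ME branch data
del student[6]

# Update: the HOD of CSE is changed in exactly one place
branch[101] = ("CSE", "ABC")       # hypothetical new HOD name
cse_hods = {branch[code][1] for _, _, code in student.values() if code == 101}

print(branch[103])   # ME branch survives the deletion
print(cse_hods)      # every CSE student sees the same HOD
```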
Anomalies

Insertion anomaly:

If a tuple is inserted in the referencing relation and the referencing attribute value is not
present in the referenced attribute, the insertion into the referencing relation will not be allowed.

For example, if we try to insert a record in STUDENT_COURSE with
STUD_NO = 7, it will not be allowed.
Anomalies

Deletion and updation anomaly: If a tuple is deleted or updated in the referenced
relation and the referenced attribute value is used by the referencing attribute in the
referencing relation, the deletion or update of that tuple in the referenced relation will not be allowed.

For example, if we try to delete a record from STUDENT with STUD_NO = 1, it will not be allowed.
Normalization
• Normalization is the process of organizing the data in the
database.
• Normalization is used to minimize the redundancy in a
relation or set of relations. It is also used to eliminate
undesirable characteristics like insertion, update and deletion
anomalies.
• Normalization divides a larger table into smaller tables
and links them using relationships.
• The normal forms are used to reduce redundancy in the
database tables.
Normalization Types
First Normal Form (1NF)

•A relation is in 1NF if every attribute contains only atomic values.
•It states that an attribute of a table cannot hold multiple values. It must
hold only a single value.
•First normal form disallows multi-valued attributes, composite
attributes, and their combinations.
Normalization Types: First Normal Form (1NF)
Second Normal Form (2NF)
•To be in 2NF, the relation must be in 1NF.
•In the second normal form, all non-key attributes are
fully functionally dependent on the primary key.
•There should not be any partial dependency.
Examples: check whether this relation is in 2NF or not.
R(ABCD) with FDs: { AB → D, B → C }
Sol:
Attributes ABCD: after reducing, (AB)+ = {ABCD}, so AB is a CK (candidate key).
Prime attributes: {A, B}; non-prime attributes: {C, D}
Concept: as AB is a CK, the {A, B} pair cannot be
empty, but individual values can be null.
From B → C: partial dependency. If B is null, you cannot find C.

A  B
1  -
-  2
-  -
3  4
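
The same reasoning can be sketched in code: find the candidate keys by brute force, derive the prime attributes, and report every listed dependency whose left side is a proper subset of a candidate key and whose right side contains a non-prime attribute. The function names are assumptions for this sketch, and it inspects only the dependencies that are written down, not every dependency they imply.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of an attribute set under a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Minimal attribute sets whose closure is the whole relation."""
    keys = []
    for size in range(1, len(attributes) + 1):       # smallest sets first
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            is_minimal = not any(key <= candidate for key in keys)
            if is_minimal and closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys


def partial_dependencies(attributes, fds):
    """Listed FDs where part of a candidate key determines a non-prime attribute."""
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    found = []
    for lhs, rhs in fds:
        non_prime = set(rhs) - prime - set(lhs)
        if non_prime and any(set(lhs) < key for key in keys):
            found.append((set(lhs), non_prime))
    return found


R = "ABCD"
FDS = [("AB", "D"), ("B", "C")]

print(candidate_keys(R, FDS))
print(partial_dependencies(R, FDS))   # a non-empty list means R is not in 2NF
```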
Example 2: check whether this relation is in 2NF or not.

scoreid  studentid  subjectid  marks  teacher
1        10         1          82     A
2        10         2          87     B
3        11         1          86     C
4        11         2          85     D
5        11         4          80     E
Example 2: check whether this relation is in 2NF or not.
Two queries on the score table:
1. Display the teacher of the student with studentid = 10
2. Display the teacher for subjectid = 1

(studentid, subjectid) → teacher
To know the teacher, the subject id is enough.
subjectid → teacher {partial dependency}

Final tables:
Student: studentid, name, regno, branch, address
Score: scoreid, studentid, subjectid, marks
Subject: subjectid, subjectname, teacher
3NF: No transitive dependency
A transitive dependency exists when a non-prime attribute in a table
depends on another non-prime attribute.
Consider the score table with 2 additional attributes:
scoreid, studentid, subjectid, marks, examname, totalmarks

In the score table, the PK is composite: (studentid, subjectid).

A CSE student will have a graphics exam, but a Civil student will
not.
It means the marks are dependent on the exam.
3NF: No transitive dependency

scoreid, studentid, subjectid, marks, examname, totalmarks

examname → totalmarks {both of these are non-prime attributes}

Decomposed tables:
Score: scoreid, studentid, subjectid, marks
Exam: examid, examname, totalmarks


BCNF: 3.5NF
The table must be in 3NF, and for any dependency A → B,
A should be a super key.

A → B
A: non-prime and B: prime {not allowed in BCNF}
BCNF: 3.5NF
studentid  subject  professor
101        Java     P.Java
101        C        P.C
102        Java     P.Java
103        C++      P.C++
104        Java     P.Java

One student can enroll in more than one subject (student 101 has
2 subjects), and one teacher is assigned for each subject a student takes.
BCNF: 3.5NF
Multiple prof. teach same studenti subject professo
subject d r
101 Java P.Java
(subjectid,subject) professor
101 C P.C
professor  subject
102 Java P.Java
Nonprime  Prime att. 103 C++ P.C++
104 java P.Java

studentid professor

professorid professor subject
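
The BCNF condition (the left side of every non-trivial dependency must be a super key) can be sketched as a closure test on each dependency. The names `closure` and `bcnf_violations` are assumptions for this example, and only the two dependencies listed on the slide are examined.

```python
def closure(attrs, fds):
    """Closure of a set of attribute names under (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial listed FDs whose left side is not a super key."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not rhs <= lhs and not closure(lhs, fds) >= attributes
    ]


ENROLMENT = {"studentid", "subject", "professor"}
FDS = [
    ({"studentid", "subject"}, {"professor"}),
    ({"professor"}, {"subject"}),
]

print(bcnf_violations(ENROLMENT, FDS))
```

Only professor → subject is reported: professor alone does not determine studentid, so it is not a super key, which is why the table is split around the professor.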


4NF
It must be in BCNF and there must not be any
multivalued dependency.
A -->> B
4NF
A table should have at least 3 columns to have a multi-
valued dependency.
Reason:
With fewer columns, distributing the data over multiple rows solves the problem;
there is no need to decompose the table.
For a table with 3 columns A, B and C,
if A -->> B is a multi-valued dependency, then B and C should
be independent of each other.
4NF
There are 3 conditions for a multi-valued dependency.
1. A -->> B: for a single value of A, more than one value of B exists.
2. The table has at least 3 columns.
3. For a table with A, B, C attributes, B and C should be
independent of each other.
If all of these are true, then the table has a multi-valued
dependency.
4NF
Studentid  Course   Hobby
1          Science  Cricket
1          Maths    Hockey
2          C++      Cricket
3          Php      Hockey

For student 1, course and hobby are independent, so every combination appears:

Studentid  Course   Hobby
1          Science  Cricket
1          Science  Hockey
1          Maths    Cricket
1          Maths    Hockey

Decomposed tables:
Course_opted(sid, course)
Hobbies(sid, hobbies)
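
A short sketch shows why the decomposition loses nothing: projecting the four rows for student 1 onto the two smaller tables and joining them back on sid reproduces exactly the original rows. The variable names are chosen for this example; Python sets stand in for the tables.

```python
# The four rows for student 1: (sid, course, hobby)
rows = {
    (1, "Science", "Cricket"),
    (1, "Science", "Hockey"),
    (1, "Maths", "Cricket"),
    (1, "Maths", "Hockey"),
}

# Projections corresponding to Course_opted(sid, course) and Hobbies(sid, hobbies)
course_opted = {(sid, course) for sid, course, _ in rows}
hobbies = {(sid, hobby) for sid, _, hobby in rows}

# Natural join of the two projections on sid
rejoined = {
    (sid, course, hobby)
    for sid, course in course_opted
    for sid2, hobby in hobbies
    if sid == sid2
}

print(len(course_opted), len(hobbies))   # rows stored after decomposition
print(rejoined == rows)                  # the join gives back the original table
```

The two small tables store two rows each, while the combined table needs one row per course and hobby combination.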
5NF: PJNF
Project-Join Normal Form
There must not be any join dependency.
If a relation has a join dependency, it is divided into smaller
relations such that joining them back gives the original relation.
5NF: PJNF
Supplier  Product   Customer
ACM       EXCEL     FORD
ROBO      GEAR      GM
ROBO      SWITCH    FORD
VAT       TOOL BOX  TATA
5NF: PJNF
Ternary relationship:
1. One supplier can supply a given part to one or more customers.
2. A customer buys from a supplier, and a supplier can have one or
more products.
3. A product used by a customer can be supplied by one or more
suppliers.
5NF
Binary Relationships can be:
Supplier-customer
Customer-product
Product-supplier
5NF: final tables
The Supplier-Product-Customer table is decomposed into three tables:
(Supplier, Product)
(Supplier, Customer)
(Customer, Product)
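
For the four sample rows, the join dependency can be checked directly: project the table onto the three binary tables and join all three back. This is a sketch for this particular data only; the variable names are assumptions, and sets of tuples stand in for the tables.

```python
# Original table: (supplier, product, customer)
spc = {
    ("ACM", "EXCEL", "FORD"),
    ("ROBO", "GEAR", "GM"),
    ("ROBO", "SWITCH", "FORD"),
    ("VAT", "TOOL BOX", "TATA"),
}

# The three binary projections
supplier_product = {(s, p) for s, p, c in spc}
supplier_customer = {(s, c) for s, p, c in spc}
customer_product = {(c, p) for s, p, c in spc}

# Joining only two projections produces extra (spurious) rows
two_way = {
    (s, p, c)
    for s, p in supplier_product
    for s2, c in supplier_customer
    if s == s2
}

# Joining all three projections filters the spurious rows out again
three_way = {row for row in two_way if (row[2], row[1]) in customer_product}

print(len(two_way))        # more rows than the original table
print(three_way == spc)    # the three-way join restores the original table
```

The two-way join pairs every ROBO product with every ROBO customer, which invents rows such as (ROBO, GEAR, FORD). The third table removes them, which is why all three binary tables are needed.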


To convert the given table into 2NF, we decompose it into two
tables:
Third Normal Form (3NF)
•A relation will be in 3NF if it is in 2NF and does not contain any transitive
dependency.
•3NF is used to reduce data duplication. It is also used to achieve data
integrity.
•If there is no transitive dependency for non-prime attributes, then the relation
is in third normal form.
•A relation is in third normal form if it satisfies at least one of the following
conditions for every non-trivial functional dependency X → Y.
1. X is a super key.
2. Y is a prime attribute, i.e., each element of Y is part of some candidate key.
Super keys in the table above:

{EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}, and so on.

Candidate key: {EMP_ID}

Non-prime attributes: in the given table, all attributes except EMP_ID are non-
prime.

Here, EMP_STATE and EMP_CITY depend on EMP_ZIP, and EMP_ZIP
depends on EMP_ID. The non-prime attributes (EMP_STATE, EMP_CITY) are therefore
transitively dependent on the super key (EMP_ID). This violates the rule of third normal
form.

That is why we need to move EMP_CITY and EMP_STATE to a new
<EMPLOYEE_ZIP> table, with EMP_ZIP as the primary key.
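
The two 3NF conditions can be sketched as a check over each dependency: it passes if its left side is a super key or if every attribute it adds is prime. The function names are assumptions for this example, and so is the dependency EMP_ID → {EMP_NAME, EMP_ZIP}, which is inferred from EMP_ID being the candidate key rather than stated on the slide.

```python
def closure(attrs, fds):
    """Closure of a set of attribute names under (lhs_set, rhs_set) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def third_nf_violations(attributes, fds, candidate_keys):
    """Listed FDs that satisfy neither of the two 3NF conditions."""
    prime = set().union(*candidate_keys)
    violations = []
    for lhs, rhs in fds:
        added = rhs - lhs
        if not added:
            continue                              # trivial dependency
        if closure(lhs, fds) >= attributes:
            continue                              # condition 1: X is a super key
        if added <= prime:
            continue                              # condition 2: Y is prime
        violations.append((lhs, rhs))
    return violations


EMPLOYEE = {"EMP_ID", "EMP_NAME", "EMP_ZIP", "EMP_STATE", "EMP_CITY"}
BY_ID = ({"EMP_ID"}, {"EMP_NAME", "EMP_ZIP"})       # assumed from the candidate key
BY_ZIP = ({"EMP_ZIP"}, {"EMP_STATE", "EMP_CITY"})

# Original table: the EMP_ZIP dependency is reported
print(third_nf_violations(EMPLOYEE, [BY_ID, BY_ZIP], [{"EMP_ID"}]))

# After moving EMP_STATE and EMP_CITY to EMPLOYEE_ZIP, both tables pass
print(third_nf_violations({"EMP_ID", "EMP_NAME", "EMP_ZIP"}, [BY_ID], [{"EMP_ID"}]))
print(third_nf_violations({"EMP_ZIP", "EMP_STATE", "EMP_CITY"}, [BY_ZIP], [{"EMP_ZIP"}]))
```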
Boyce Codd normal form (BCNF)

•BCNF is the advanced version of 3NF. It is stricter than 3NF.

•A table is in BCNF if, for every non-trivial functional dependency X → Y, X is a
super key of the table.

•For BCNF, the table should be in 3NF, and for every FD, the LHS is a
super key.
Candidate key: {EMP-ID, EMP-DEPT}
Fourth normal form (4NF)

•A relation will be in 4NF if it is in Boyce Codd normal form and has
no multi-valued dependency.

•For a dependency A → B, if multiple values of B exist for a single value
of A, then the dependency is a multi-valued dependency.
The given STUDENT table is in 3NF, but COURSE and HOBBY
are two independent entities. Hence, there is no relationship between
COURSE and HOBBY.

In the STUDENT relation, the student with STU_ID 21 has two
courses, Computer and Math, and two hobbies, Dancing and Singing.
So there is a multi-valued dependency on STU_ID, which leads to
unnecessary repetition of data.

So, to bring the above table into 4NF, we can decompose it into two
tables:
Normalization

1. First Normal Form –

If a relation contains a composite or multi-valued attribute, it
violates first normal form. A relation is in first normal form if it
does not contain any composite or multi-valued attribute.

A relation is in first normal form if every attribute in that relation
is a single-valued attribute.
2. Second Normal Form –
•To be in second normal form, a relation must be in first normal
form and relation must not contain any partial dependency.
• A relation is in 2NF if it has No Partial Dependency, i.e., no
non-prime attribute (attributes which are not part of any candidate
key) is dependent on any proper subset of any candidate key of
the table.
•Partial Dependency – If the proper subset of candidate key
determines non-prime attribute, it is called partial dependency.
2NF
Here,
COURSE_FEE alone cannot decide the value of COURSE_NO or STUD_NO;

COURSE_FEE together with STUD_NO cannot decide the value of
COURSE_NO;

COURSE_FEE together with COURSE_NO cannot decide the value of
STUD_NO.
Hence,

COURSE_FEE is a non-prime attribute, as it does not belong to the
only candidate key {STUD_NO, COURSE_NO}.
2NF
But COURSE_NO → COURSE_FEE, i.e., COURSE_FEE is dependent on
COURSE_NO, which is a proper subset of the candidate key.

The non-prime attribute COURSE_FEE is dependent on a proper subset of the
candidate key, which is a partial dependency, so this relation is not in 2NF.

To convert the above relation to 2NF,
we need to split the table into two tables:
Table 1: STUD_NO, COURSE_NO
Table 2: COURSE_NO, COURSE_FEE
2NF
Practice: MCQ
1. Attributes are
(i)properties of relationship
(ii)attributed to entities
(iii)properties of members of an entity set
(a) i
(b) i and ii
(c) i and iii
(d) iii
Practice: MCQ
By relation cardinality we mean:
(a) number of items in a relationship
(b) number of relationships in which an entity can appear
(c) number of items in an entity
(d) number of entity sets which may be related to a given entity
Practice: MCQ
An attribute y may be functionally dependent on
(i)a composite attribute x,y
(ii)a single attribute x
(iii)no attribute
(a) i and ii
(b) i and iii
(c) ii and iii
(d) iii
Practice: MCQ
The following functional dependencies are given:
AB → CD, AF → D, DE → F, C → G, F → E, G → A
Which one of the following options is false?
(a) {CF}+ = {ACDEFG}
(b) {BG}+ = {ABCDG}
(c) {AF}+ = {ACDEFG}
(d) {AB}+ = {ABCDFG}
Practice: MCQ
Display name of cities (all entries) whose temperature is in the
range 21 to 39
A. SELECT * FROM weather WHERE temperature NOT IN (21 to 39);
B. SELECT * FROM weather WHERE temperature NOT IN (21 and 39);
C. SELECT * FROM weather WHERE temperature NOT BETWEEN 21 to 39;
D. SELECT * FROM weather WHERE temperature BETWEEN 21 AND 39;
