• That is, we need not perform joins of the smaller relations to check
whether a constraint on the original relation is violated.
PROPERTIES OF DECOMPOSITION
1. Lossless Decomposition
2. Dependency Preservation
• A decomposition is said to be a lossless join decomposition if the natural join of all the
decomposed relations gives back the original relation.
2. Dependency Preservation
• Dependencies are important constraints on the database.
• Every dependency must be satisfied by at least one decomposed table.
• If {A → B} holds, then B is functionally dependent on A, and the
dependency is easiest to check when both attribute sets are in the same
relation.
• A decomposition has this property only if it maintains the
functional dependencies.
• This property allows updates to be checked without computing the
natural join of the decomposed relations.
• Let R(A,B,C,D) be a relation on which the following FDs hold:
• {A → B, B → C, C → D, D → B}
• R is now decomposed into R1(AB), R2(BC), R3(BD).
• When a relation is decomposed, its FDs are also divided among the
decomposed tables. If the union of the FDs of all decomposed tables
implies the original set of FDs, the dependencies are preserved.
• Let's check for this relation:
R1(AB)                   R2(BC)    R3(BD)
A → B                    B → C     B → D
(B → A does not hold)    C → B     D → B
• C → D is not contained in any single table, but it follows from C → B and
B → D, so the dependencies are preserved.
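The check above can also be done mechanically. The following is a minimal Python sketch, assuming single-letter attribute names so that a string such as "AB" stands for the set {A, B}. The function names `closure` and `preserves_dependencies` are illustrative and not part of the original slides.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def preserves_dependencies(fds, decomposition):
    """True if every FD in `fds` follows from the FDs that hold inside the tables."""
    for lhs, rhs in fds:
        reachable = set(lhs)
        changed = True
        while changed:
            changed = False
            for table in decomposition:
                table = set(table)
                # Attributes derivable while staying inside this one table.
                gained = closure(reachable & table, fds) & table
                if not gained <= reachable:
                    reachable |= gained
                    changed = True
        if not set(rhs) <= reachable:
            return False
    return True


F = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(preserves_dependencies(F, ["AB", "BC", "BD"]))  # True
```

For contrast, decomposing R(A,B,C) with {A → B, B → C} into (AB) and (AC) loses B → C, and the same function returns False for it.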
ISBN → Title
ISBN, no → author
Pid → publisher
Armstrong rules/Axioms
1. Reflexivity: if Y is a subset of X, then X → Y.
EX: ISBN, no → ISBN
Cust_name, loan no → cust_name
2. Augmentation: if X → Y, then XZ → YZ for any Z.
EX: ISBN → title
ISBN, no → title, no
3. Transitivity: if X → Y and Y → Z, then X → Z.
EX: ISBN → publisher
publisher → pid
ISBN → pid
4. Self-determination: X → X.
EX: author → author
5. Decomposition: if X → YZ, then X → Y and X → Z. The determinant of
any FD uniquely determines each individual attribute on the
right-hand side.
EX: Pincode → state, city
Pincode → state
Pincode → city
6. Union: if X → Y and X → Z, then X → YZ. If two FDs have the
same determinant, they can be combined into a new FD.
EX: ISBN → publisher
ISBN → pid
ISBN → publisher, pid
Superkeys and Candidate Keys
• A set of attributes that determine the entire tuple is a superkey
• {sid, name} is a superkey for the student table.
• Also {sid, name, supervisor_id} etc.
• A minimal set of attributes that determines the entire tuple is a candidate key
• {sid, name} is not a candidate key because name can be removed and {sid} still determines the entire tuple.
• sid is a candidate key
• If there are multiple candidate keys, the DB designer designates one as the primary key.
Reasoning about Functional Dependencies
It is sometimes possible to infer new functional dependencies from a set of given functional dependencies
• independently from any particular instance of the relation schema or of any additional knowledge
Example:
From
{sid} → {first_name} and
{sid} → {last_name}
We can infer
{sid} → {first_name, last_name}
Reasoning About FDs
• There are two ways to reason about FDs:
• 1. CLOSURE OF A SET OF FDs
• 2. ATTRIBUTE CLOSURE
• The set of all FDs implied by a given set F of FDs is called the closure of F
and is denoted as F+.
• Example:
• Contracts(cid,sid,jid,did,pid,qty,value), abbreviated CSJDPQV, and:
• C is the key: C → CSJDPQV
• A project purchases each part using a single contract: JP → C
• A department purchases at most one part from a supplier: SD → P
• JP → C, C → CSJDPQV imply JP → CSJDPQV
• SD → P implies SDJ → JP
• SDJ → JP, JP → CSJDPQV imply SDJ → CSJDPQV
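Whether an FD X → Y belongs to F+ can be tested without enumerating F+: X → Y is implied by F exactly when Y is contained in the attribute closure of X. The sketch below checks the Contracts derivations this way. It assumes the single-letter abbreviations C, S, J, D, P, Q, V for the attributes, and the helper names `closure` and `implies` are illustrative.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if the FD lhs -> rhs is in the closure F+ of `fds`."""
    return set(rhs) <= closure(lhs, fds)


# FDs of the Contracts relation.
F = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]

print(implies(F, "JP", "CSJDPQV"))   # True
print(implies(F, "SDJ", "JP"))       # True
print(implies(F, "SDJ", "CSJDPQV"))  # True
print(implies(F, "SD", "C"))         # False: SD alone does not reach J
```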
2. ATTRIBUTE CLOSURE (AC)
• Properties of a Key:
• A key X, which belongs in relation R has the following
properties:
• 1. X ⊆ R
• 2. X → R (complete key)
• 3. There is no X′ ⊂ X such that X′ →R (minimal key)
Example of deriving an FD using Armstrong's axioms (AA)
• Given a relation R with attributes W, U, V, X, Y, Z and functional
dependencies W → UV, U → Y, VX → YZ, prove that WX → Z holds.
• Solution:
• 1. (with decomposition) From W → UV we get W → U and W → V.
• 2. (with augmentation) From W → V we get WX → VX.
• 3. (with transitivity) Using WX → VX and VX → YZ we get WX → YZ.
• 4. (with decomposition) WX → Y and WX → Z.
Algorithm to find the Closure of a Set of attributes A
• 1. Let X = A.
• 2. Among the functional dependencies of F, search for a dependency
C → D where C ⊆ X. If such a dependency is found, add D to X.
• 3. Repeat Step 2 until no additional attributes can be added to X.
The final X is the closure A+.
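The steps above can be sketched in Python as follows. This is a minimal illustration, assuming single-letter attribute names so that a string such as "VZ" stands for the set {V, Z}. The function name `closure` and the sample FD set are chosen for the illustration.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    x = set(attrs)                      # Step 1: start with X = A
    changed = True
    while changed:                      # Step 3: repeat until nothing is added
        changed = False
        for lhs, rhs in fds:            # Step 2: look for C -> D with C inside X
            if set(lhs) <= x and not set(rhs) <= x:
                x |= set(rhs)
                changed = True
    return x


fds = [("V", "Z"), ("VZ", "W"), ("W", "Y"), ("VY", "W")]
print(sorted(closure("V", fds)))  # ['V', 'W', 'Y', 'Z']
print(sorted(closure("W", fds)))  # ['W', 'Y']
```

Each pass over the FD list either adds at least one attribute or stops the loop, so the loop runs at most as many times as there are attributes.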
Example 1
• Let R = (V, Y, Z, W) and FD = {V → Z, VZ → W, W → Y, VY → W}
• Find the closure of attribute V.
• V+ = {VZWY}
• Solution:
• Step 1: X=V.
• Step 2: X=VZ because of V → Z.
• Step 3: X=VZW because of VZ → W.
• Step 4: X=VZWY because of W → Y.
• Step 5: No more attributes can be added.
Example 2
• Let R = (V, Y, Z, W) and F = {V → Y, W → Y, V → W}
• Find the closure of attribute V.
• V+ = {V,Y,W}
• Solution:
• Step 1: X=V.
• Step 2: X=VY because of V → Y.
• Step 3: X=VYW because of V → W.
• Step 4: No more attributes can be added.
Example 3
• Finding all the candidate keys in a relation R(ABCD)
• FD = {A → B, B → C, C → D}
• Candidate key is a key that determines every other attribute in the table
• A+ = {ABCD} (i.e., A is a candidate key)
• B+ = {BCD} (i.e., B is not a candidate key)
• C+ = {CD} (i.e., C is not a candidate key)
• D+ = {D} (i.e., D is not a candidate key)
• CK={A}
• Prime Attribute : an attribute that is part of some candidate key.
• So the Prime Attributes for this relation are : {A}
• Non-Prime Attributes are : {B,C,D}
• AB+ = {ABCD}
• AB determines every attribute, but it is not a CK because a CK is always minimal.
• A is already a CK; adding anything to it gives a super key, not a CK.
Example 4
• Let R be a relation having R(ABCD)
• FD = {A → B, B → C, C → D, D → A}
• A+ = {ABCD}
• B+ = {BCDA}
• C+ = {CDAB}
• D+ = {DABC}
• So the CK ={A,B,C,D}
• Prime Attributes are : {A,B,C,D}
• Non Prime Attributes are : {}
Example 5
Consider the relation scheme R(A,B,C,D) with functional dependencies
{A} → {C} and {B} → {D}.
{A}+ = {A,C}
{B}+ = {B,D}
{C}+={C}
{D}+={D}
{A,B}+ = {A,B,C,D}
Example 6
• Consider the relation scheme R(A,B,C,D,E) with functional dependencies {A → B, BC → D, E → C,
D → A}
• Attributes appearing on the RHS = {B,D,C,A}
• First check the right-hand sides: an attribute that never appears on any RHS cannot be derived, so it must be part of every candidate key.
• E therefore appears in every CK.
• E+ = {E,C}
• Since E alone is not a key, E has to be combined with the other attributes:
• AE+ = {ABECD}
• BE+ = {BECDA}
• CE+ = {CE}
• DE+ = {DEABC}
• Candidate keys : {AE, BE, DE}
• Prime Attributes : {A,B,D,E}
• Non Prime Attributes : {C}
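The closure method used in these examples can be automated with a brute-force search. The sketch below is one possible approach, not the only one: it tries attribute subsets in order of increasing size and keeps those whose closure is the whole relation and that do not contain a smaller key. Single-letter attribute names are assumed, and the function names are illustrative.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """All minimal attribute sets whose closure is the full relation."""
    attributes = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is only a superkey
            if closure(candidate, fds) == attributes:
                keys.append(candidate)
    return keys


fds = [("A", "B"), ("BC", "D"), ("E", "C"), ("D", "A")]
keys = candidate_keys("ABCDE", fds)
print(["".join(sorted(k)) for k in keys])  # ['AE', 'BE', 'DE']

prime = set().union(*keys)
print(sorted(prime))                 # ['A', 'B', 'D', 'E']
print(sorted(set("ABCDE") - prime))  # ['C']
```

The search is exponential in the number of attributes, which is acceptable for classroom-sized schemas.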
Normal Forms
Normalization of Relations
Normalization: The process of decomposing unsatisfactory "bad" relations
by breaking up their attributes into smaller relations
•Normalization is the process of organizing the data in the database.
•Normalization is used to minimize redundancy in a relation or set of
relations. It is also used to eliminate undesirable characteristics such as insertion,
update, and deletion anomalies.
•Normalization divides a larger table into smaller tables and links them using
relationships.
•Normal forms are used to reduce redundancy in database tables.
Normal form: Condition using keys and FDs of a relation to certify whether
a relation schema is in a particular normal form
Relationship Between
Normal Forms
Types of Normal Form
• First Normal Form (1NF) - included in the definition of a relation.
Department
DNAME DNUMBER DMGRSSN DLOCATIONS
There are three main techniques to achieve first normal form for such a
relation:
1. Remove the attribute DLOCATIONS that violates 1NF and place it in a
separate relation along with the primary key DNUMBER of DEPARTMENT.
2. Expand the key so that there will be a separate tuple in the original
DEPARTMENT relation for each location of a DEPARTMENT. In this
case, the primary key becomes the combination {DNUMBER,
DLOCATION}.
3. If a maximum number of values is known for the attribute (for example, if
it is known that at most three locations can exist for a department),
replace the DLOCATIONS attribute by three atomic attributes:
DLOCATION1, DLOCATION2, and DLOCATION3.
First Normal Form: Best solution
EMP_PROJ1
ENO ENAME
EMP_PROJ2
ENO PNUMBER HOURS
• First Normal Form : Problem
ROLL NO   NAME
1         SAI
2         HARSH
3         OMKAR

ROLL NO   COURSE
1         C
1         C
2         JAVA
3         C
3         DBMS
First Normal Form: Problem
Person
ENO CAR_LIC PHONE
• Here the student Jeet appears twice in the table and the subject PHY is repeated.
• Another method is to divide the relation into two.
Take the following table.
Is it 1NF?
No. There are repeating groups (subject,
subjectcost, grade)
Is it 2NF?
Before learning 2NF, it is important to know
what is partial dependency
Partial Dependency
• Partial Dependency – when a non-key attribute is
determined by a part, but not the whole, of a
COMPOSITE primary key.
Partial Dependency (CUSTOMER example)
A 2NF check
STUDENT TABLE (key = StudentID)
SUBJECTS TABLE (key = Subject)
RESULTS TABLE (key = StudentID+Subject)
• A subject can be listed MANY times in the results table (for different students).
• A student can be listed MANY times in the results table (for different subjects).
• SubjectCost is only dependent on the primary key, Subject.
2NF!
But is it 3NF?
Second Normal Form(2NF): Examples
EMP_PROJ
ENO PNUMBER HOURS ENAME PNAME PLOCATION
FD1: {ENO, PNUMBER} → HOURS
FD2: ENO → ENAME
FD3: PNUMBER → {PNAME, PLOCATION}
The test for 2NF involves testing for functional dependencies whose left-
hand side attributes are part of the primary key.
If the primary key contains a single attribute, the test need not be applied.
The functional dependencies FD2 and FD3 make ENAME, PNAME, and
PLOCATION partially dependent on the primary key {ENO, PNUMBER}.
Hence, decompose EMP_PROJ into the three relation schemas
EMP_PROJ1, EMP_PROJ2, and EMP_PROJ3:
EMP_PROJ1 (ENO, PNUMBER, HOURS) with FD1
EMP_PROJ2 (ENO, ENAME) with FD2
EMP_PROJ3 (PNUMBER, PNAME, PLOCATION) with FD3
Example
• Consider the following relation, which is not in 2NF:
FD : StoreID → Location
Key : CustomerID, StoreID
CustomerID  StoreID  Location
1           1        Delhi
1           3        Mumbai
2           1        Delhi
3           2        Bangalore
4           3        Mumbai
Here, the key is {CustomerID, StoreID}. Location is dependent on StoreID
but not on CustomerID. So there exists a partial dependency here,
which violates the rule of 2NF. So this relation has to be decomposed.
Solution:
Key : StoreID
StoreID  Location
1        Delhi
2        Bangalore
3        Mumbai
Key : CustomerID, StoreID
CustomerID  StoreID
1           1
1           3
2           1
3           2
4           3
Here the LOCATION is completely dependent on the key.
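The decomposition can be reproduced with plain Python sets, which also shows that no information is lost: joining the two smaller tables on StoreID gives back the original rows. This is a small sketch using the sample data from the example, and the variable names are illustrative.

```python
# Original relation: (CustomerID, StoreID, Location)
original = {
    (1, 1, "Delhi"),
    (1, 3, "Mumbai"),
    (2, 1, "Delhi"),
    (3, 2, "Bangalore"),
    (4, 3, "Mumbai"),
}

# Project onto the two decomposed schemas.
stores = {(store, location) for (_, store, location) in original}
visits = {(customer, store) for (customer, store, _) in original}

# StoreID is a key of `stores`: each store appears exactly once.
assert len(stores) == len({store for (store, _) in stores})

# Natural join of the two tables on StoreID.
joined = {
    (customer, store, location)
    for (customer, store) in visits
    for (store2, location) in stores
    if store == store2
}

print(len(stores))         # 3
print(joined == original)  # True
```

Each location is now stored once per store instead of once per customer visit, which is the redundancy that 2NF removes.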
Example:
• For example, let AB be a candidate key of a relation R.
• A and B are each proper subsets of AB, while AB is a subset
of AB but not a proper one.
• If a part of the candidate key (either A or B) determines C,
which is a non-prime attribute, then we say a partial dependency exists.
• Such a relation is not in 2NF.
Example : Find out whether this relation is in 2NF
Let R(A,B,C,D,E,F)
FD = {C → F, E → A, EC → D, A → B}
First, find the candidate keys of the relation using the closure method discussed earlier.
Check the attributes on the R.H.S.: F, A, D, B.
The attributes that do not appear on any R.H.S. are E and C.
These can never be derived, so they must appear on the L.H.S. when computing closures.
So, whatever the candidate key is, E and C are definitely part of it.
Therefore, start with the closure of EC itself.
EC+ = {ECFADB}
Hence all the attributes of the table are determined by EC.
So, EC is the candidate key for this table.
CK = {EC}
Prime Attributes : {E,C}
Non Prime Attributes : {A,B,D,F}
Now, check whether any non-prime attribute is determined by a part of the candidate key.
YES!! C → F and E → A are partial dependencies, so R is not in 2NF.
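The 2NF test can be sketched in Python by taking every proper subset of each candidate key and checking whether its closure reaches a non-prime attribute. This is an illustrative sketch under the assumption of single-letter attribute names, and `partial_dependencies` is a hypothetical helper name.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(fds, keys):
    """List (part_of_key, non_prime_attributes_it_determines) pairs."""
    prime = set().union(*keys)
    found = []
    for key in keys:
        for size in range(1, len(key)):          # proper, non-empty subsets only
            for part in combinations(sorted(key), size):
                determined = closure(part, fds) - set(part) - prime
                if determined:
                    found.append(("".join(part), "".join(sorted(determined))))
    return found


fds = [("C", "F"), ("E", "A"), ("EC", "D"), ("A", "B")]
print(partial_dependencies(fds, ["EC"]))  # [('C', 'F'), ('E', 'AB')]
```

The result also lists B under E, because E → A and A → B together give E → B. An empty list means the relation is in 2NF with respect to the given keys.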
Before learning 3NF you have to learn about
transitive dependency
Transitive Dependency
• Transitive Dependency – A functional dependency is said to be transitive if it
is indirectly formed by two functional dependencies.
• The advantage of removing transitive dependency is -
• Amount of data duplication is reduced.
• Data integrity achieved.
Transitive
Dependency
EMPLOYEE
1 Punjab Mohali
5 Bihar Patna
• Solution :
RC Table
RS Table
What? Is it in 3NF?
STUDENT TABLE (key = StudentID)
• HouseName is dependent on both StudentID + HouseColour,
• OR HouseColour is dependent on both StudentID + HouseName.
• But either way, non-key fields are dependent on MORE than the primary key (StudentID).
• And 3NF says that non-key fields must depend on nothing but the key.
WHAT DO WE DO?
StudentTable (key = StudentID): StudentID*, StudentName, Address, HouseName
GradesTable (key = StudentID+Subject): StudentID*, Subject*, Grade
SubjectTable (key = Subject): Subject*, SubjectCost
HouseTable (key = HouseName): HouseName*, HouseColour
Relationships: StudentTable 1 to ∞ GradesTable, SubjectTable 1 to ∞ GradesTable, HouseTable 1 to ∞ StudentTable
The Reveal: Before… After…
EMP_PROJ
EMP_PROJ1
EMP_PROJ2
-Our new column exam_name depends on both student and subject. For example, a mechanical
engineering student will have a Workshop exam but a computer science student won't. And for some
subjects you have Practical exams and for some you don't. So we can say that exam_name is dependent on
both student_id and subject_id.
-And what about our second new column total_marks? Does it depend on our Score table's primary key?
-Well, the column total_marks depends on exam_name, because the total score changes with the exam type. For
example, practicals carry fewer marks while theory exams carry more marks.
-But exam_name is just another column in the score table. It is not a primary key or even a part of the
primary key, and total_marks depends on it.
-This is a Transitive Dependency: a non-prime attribute depends on other non-prime attributes
rather than on the prime attributes or primary key.
Third Normal Form(3NF): Example
Consider a relation R with simple attributes {A, B, C, D, E}.
R contain two candidate keys {A} and {B,C}
{A} is made primary key.
The following FDs hold over R
A → BCDE
A B C D E
D → E
Is R in 3NF? If not, decompose it.
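One way to answer the question is to test each FD against the 3NF condition: for every non-trivial FD X → Y, either X is a superkey or every attribute of Y outside X is prime. The sketch below takes the candidate keys {A} and {B,C} as given, exactly as stated in the example, rather than computing them. The function name `third_nf_violations` is hypothetical.

```python
def third_nf_violations(fds, candidate_keys):
    """Return the FDs (as (lhs, offending_attributes)) that break 3NF."""
    keys = [set(k) for k in candidate_keys]
    prime = set().union(*keys)
    violations = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        extra = rhs - lhs                     # ignore the trivial part of the FD
        if not extra:
            continue
        if any(key <= lhs for key in keys):   # lhs contains a key: superkey
            continue
        non_prime = extra - prime
        if non_prime:
            violations.append(("".join(sorted(lhs)), "".join(sorted(non_prime))))
    return violations


fds = [("A", "BCDE"), ("D", "E")]
print(third_nf_violations(fds, ["A", "BC"]))  # [('D', 'E')]
```

D → E is reported because D is not a superkey and E is not prime, so R is not in 3NF. A possible decomposition is R1(A, B, C, D) and R2(D, E), which moves the transitive dependency A → D → E into its own table.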
Third Normal Form(3NF): Example to check whether a relation is in 3NF ?
The only difference between the definitions of BCNF and 3NF is that
condition (b) of 3NF, which allows A to be prime, is absent from BCNF.
3NF Table Not in BCNF
AB → C
AB → D
C → B
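The BCNF condition for these dependencies can be checked with a short Python sketch: a non-trivial FD X → Y violates BCNF when the closure of X is not the whole relation. The attribute list R(A, B, C, D) is an assumption, since the slide shows only the FDs, and `bcnf_violations` is an illustrative name.

```python
def closure(attrs, fds):
    """Closure of the attribute set `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attributes, fds):
    """Non-trivial FDs whose left-hand side is not a superkey."""
    attributes = set(attributes)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != attributes
    ]


# Assumed schema R(A, B, C, D) with the FDs listed above.
fds = [("AB", "C"), ("AB", "D"), ("C", "B")]
print(bcnf_violations("ABCD", fds))  # [('C', 'B')]
```

C → B breaks BCNF because C is not a superkey. It does not break 3NF, because B is a prime attribute (it is part of the candidate key AB).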
Figure 4.7
Decomposition of Table
Structure to Meet BCNF
BCNF Conversion Results
Example
ROLL NO NAME VOTER_ID AGE
• A 3NF table which does not have multiple overlapping candidate keys is said to be in BCNF.
103 C#
104 java
STUDENT
Formally, if the MVD X →→ Y holds over R and Z = R−XY, the following must be
true for every legal instance r of R:
Replication: If X → Y, then X →→ Y.
• If all these conditions hold true, then we can say that the table
contains multi valued dependency
• Also, a multi valued dependency can exist for more than one column
Example
• A Student here has opted for two subjects and has 2 hobbies.
a) Y ⊆ X or XY = R, or
b) X is a superkey.
C →→ T
C →→ B
Does a Multi Valued Dependency exist on the above relation?
YES!!
• Each course has several teachers and each course has several
textbooks. The textbook used for a given course is
independent of the teacher.
• If we want to add a new textbook to the Physics101 course, we
need to add two new rows, one for each teacher of the course. This
redundancy is caused by the Multi Valued Dependency.
ME ME201 SUSHANT
EC EC301 IRA
Decomposed in such a way that the tables are
in 4NF now:
DEPT SUBJECT STUDENT
DEPT SUBJECT DEPT STUDENT
CSE CS101 SHREYA
The 5NF (Fifth Normal Form) is also known as project-join normal form. A relation is in
Fifth Normal Form (5NF) if it is in 4NF and cannot be losslessly decomposed into
smaller tables any further.
To bring the above example into 5NF, decompose it
again:
DEPT SUBJECT STUDENT DEPT SUBJECT STUDENT
IT IT501 YUG
ME ME201 SUSHANT
EC EC301 IRA