Relational Model
Relational Algebra
• The relational model has two formal languages: Relational Algebra and Relational Calculus.
• Besides defining the structure of the database and its constraints, a data model includes a
set of operations to manipulate the database. In the relational model, the basic set of operations is called Relational Algebra.
n schemas
GUIDELINE 1: Informally, each tuple in a relation should represent one entity or relationship instance. (Applies to individual
relations and their attributes).
• Attributes of different entities (EMPLOYEEs, DEPARTMENTs, PROJECTs) should not be mixed in the same relation
Bottom Line: Design a schema that can be explained easily relation by relation. The semantics of attributes should be easy
to interpret.
Redundant Information in Tuples and Update Anomalies
• Redundant information wastes storage
• It causes update anomalies: insertion anomalies, deletion anomalies and modification anomalies
What are the Anomalies in DBMS?
In a Database Management System (DBMS), an anomaly is an inconsistency that arises in a relational table as a result of the
operations performed on it.
• For example, if there is a lot of redundant data present in our database then DBMS anomalies can occur.
• If a table is constructed in a very poor manner, then there is a chance of database anomaly. Due to database
anomalies, the integrity of the database suffers.
Another reason for database anomalies is that all the data is stored in a single table.
To remove these anomalies, we use normalization, a process in which a table is split into smaller tables that can be
joined back (using different types of join) when required.
Worker_id Worker_name Worker_dept Worker_address
Update Anomaly
If updating some rows of the table leaves the table inconsistent, it is called an update anomaly.
For example, in the above table, to update the address of Ramesh we have to update every row in which Ramesh is
present. If we miss even one row during the update, Ramesh will have two different addresses, which makes the
database inconsistent and incorrect.
Insertion Anomaly
If inserting a new row into the table creates an inconsistency, or the row cannot be inserted at all, it is called an insertion anomaly.
For example, if in the above table we add a new worker who is not yet allocated to any department, then we
cannot insert the row into the table. This is an insertion anomaly.
Deletion Anomaly
If we delete some rows from the table and other required information is deleted from the database along with them,
it is called a deletion anomaly.
For example, in the above table, if we delete the department number ECT669, then the details of Rajesh will also
be deleted, since Rajesh's details are stored only in the row of ECT669. This is a deletion anomaly.
Stu_id     Stu_name   Stu_branch         Stu_club
2018nk01   Shivani    Computer science   literature
2018nk01   Shivani    Computer science   dancing
2018nk02   Ayush      Electronics        Videography
2018nk03   Mansi      Electrical         dancing
2018nk03   Mansi      Electrical         singing
2018nk04   Gopal      Mechanical         Photography
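• Update anomaly: If Shivani changes her branch from Computer Science to Electronics, then we will have to update all the rows in which she appears.
• Insertion anomaly: If we add a new row for student Ankit, who is not a part of any club, we cannot insert the row unless Stu_club is left NULL.
• Deletion anomaly: If we remove the photography club from the college, the row of Gopal is deleted and his details are lost as well.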
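The anomalies on the student table can be reproduced in a few lines of Python. This is only an illustrative sketch: the helper names (`branches_per_student`, `inconsistent_ids`) are made up for this example, and the table is held as a plain list of tuples rather than in a real DBMS.

```python
# Rows of the student table: (Stu_id, Stu_name, Stu_branch, Stu_club).
students = [
    ("2018nk01", "Shivani", "Computer science", "literature"),
    ("2018nk01", "Shivani", "Computer science", "dancing"),
    ("2018nk02", "Ayush", "Electronics", "Videography"),
    ("2018nk03", "Mansi", "Electrical", "dancing"),
    ("2018nk03", "Mansi", "Electrical", "singing"),
    ("2018nk04", "Gopal", "Mechanical", "Photography"),
]


def branches_per_student(rows):
    """Map each Stu_id to the set of branches recorded for it."""
    seen = {}
    for stu_id, _name, branch, _club in rows:
        seen.setdefault(stu_id, set()).add(branch)
    return seen


def inconsistent_ids(rows):
    """Return the Stu_ids that appear with more than one branch."""
    return sorted(s for s, b in branches_per_student(rows).items() if len(b) > 1)


# Update anomaly: change Shivani's branch in only the first of her two rows.
partial_update = list(students)
partial_update[0] = ("2018nk01", "Shivani", "Electronics", "literature")

# Deletion anomaly: removing the Photography club removes Gopal's only row.
after_delete = [row for row in students if row[3] != "Photography"]
gopal_still_present = any(row[1] == "Gopal" for row in after_delete)

print(inconsistent_ids(students))        # consistent before the update
print(inconsistent_ids(partial_update))  # Shivani now has two branches
print(gopal_still_present)
```

Splitting the table into one relation for students and one for club memberships would store each branch only once, which is what removes the update anomaly.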
GUIDELINE 2:
Design a schema that does not suffer from insertion, deletion and update anomalies.
If there are any anomalies present, then note them so that applications can be made to take them into account.
GUIDELINE 3:
Relations should be designed such that their tuples will have as few NULL values as possible
Attributes that are NULL frequently could be placed in separate relations (with the primary key)
Reasons for nulls:
Attribute not applicable or invalid
Attribute value unknown (may exist)
A value known to exist, but unavailable
Bad designs for a relational database may result in erroneous results for certain JOIN operations
The "lossless join" property is used to guarantee meaningful results for join operations
GUIDELINE 4:
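Note that, of the two properties of a decomposition, (a) lossless (non-additive) join and (b) dependency preservation: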
• Property (a) is extremely important and cannot be sacrificed.
• Property (b) is less stringent and may be sacrificed.
Functional Dependency
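In any relation, a functional dependency α → β holds if any two tuples that have the same value for attribute α also have the
same value for attribute β.
α - Determinant
β - Dependent
Mathematically,
If α and β are two sets of attributes of a relational table R, where
α ⊆ R
β ⊆ R
then, for a functional dependency to exist from α to β, any two tuples t1 and t2 of R must satisfy:
if t1[α] = t2[α], then t1[β] = t2[β]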
Examples-
•If at least one attribute on the RHS of a functional dependency is not a part of the LHS, then it is
called a non-trivial functional dependency.
Examples-
a  b  c  d  e
A  2  3  4  5
2  A  3  4  5
A  2  3  6  5
A  2  3  6  6
Question: Which of the following options are correct:
a) A → BC
b) DE → C
c) C → DE
d) BC → A
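The question above can be checked mechanically by testing each option against the rows of the table. The sketch below is illustrative only: `fd_holds` is a hypothetical helper, and the four rows are my reading of the table as it appears in these notes (columns a to e are written in upper case to match the options).

```python
def fd_holds(rows, columns, lhs, rhs):
    """Return True if the dependency lhs -> rhs holds in the given rows.

    `lhs` and `rhs` are strings of single-letter column names.
    """
    index = {name: i for i, name in enumerate(columns)}
    seen = {}  # maps each LHS value to the RHS value first seen with it
    for row in rows:
        key = tuple(row[index[c]] for c in lhs)
        value = tuple(row[index[c]] for c in rhs)
        # A violation occurs when the same LHS value meets a different RHS value.
        if seen.setdefault(key, value) != value:
            return False
    return True


columns = ["A", "B", "C", "D", "E"]
rows = [
    ("A", 2, 3, 4, 5),
    (2, "A", 3, 4, 5),
    ("A", 2, 3, 6, 5),
    ("A", 2, 3, 6, 6),
]

options = {"a": ("A", "BC"), "b": ("DE", "C"), "c": ("C", "DE"), "d": ("BC", "A")}
results = {name: fd_holds(rows, columns, lhs, rhs)
           for name, (lhs, rhs) in options.items()}
print(results)
```

A check like this only shows that a dependency is not violated by the current rows. It cannot prove that the dependency is a rule of the schema.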
Inference Rules-
Reflexivity-
If B is a subset of A, then A → B always holds.
Transitivity-
If A → B and B → C, then A → C always holds.
Augmentation-
If A → B, then AC → BC always holds.
Decomposition-
If A → BC, then A → B and A → C always hold.
Composition-
If A → B and C → D, then AC → BD always holds.
Additive-
If A → B and A → C, then A → BC always holds.
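As a short worked illustration, the Additive rule can be derived from Augmentation and Transitivity alone. The derivation below is a sketch and treats the attribute set AA as identical to A.

```latex
\begin{align*}
A \to B &\;\Rightarrow\; A \to AB && \text{(Augmentation with } A\text{, since } AA = A\text{)} \\
A \to C &\;\Rightarrow\; AB \to BC && \text{(Augmentation with } B\text{)} \\
A \to AB,\ AB \to BC &\;\Rightarrow\; A \to BC && \text{(Transitivity)}
\end{align*}
```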
Rules for Functional Dependency-
Rule-01:
A functional dependency X → Y will always hold if all the values of X are unique (different) irrespective of the values of Y.
Example-
The following functional dependencies will always hold since all the values of attribute ‘A’ are unique-
• A→B
• A → BC
• A → CD
• A → BCD
• A → DE
• A → BCDE
In general, we can say the following functional dependency will always hold- A → Any combination of attributes A, B, C, D, E
Rule-02:
A functional dependency X → Y will always hold if all the values of Y are same irrespective of the values of X.
Example-
Consider the following table-
The following functional dependencies will always hold since all the values of attribute ‘C’ are same-
• A→C
• AB → C
• ABDE → C
• DE → C
• AE → C
In general, we can say following functional dependency will always hold true- Any combination of attributes A, B, C, D, E → C
Rule-03:
For a functional dependency X → Y to hold, if two tuples in the table agree on the value of attribute X, then they must
also agree on the value of attribute Y.
Rule-04:
For a functional dependency X → Y, violation will occur only when for two or more same values of X, the corresponding
Y values are different.
Step-01:
Add the attributes contained in the attribute set for which closure is being calculated to the result set.
Step-02:
Recursively add the attributes to the result set which can be functionally determined from the attributes already contained in
the result set.
The set of all attributes that can be determined using a given set of attributes is called the attribute closure of that set.
In this example, A → B and B → C, so using the transitivity axiom we can include B and C in the closure of A, and A → A is a trivial
functional dependency. Thus,
A+ = {A, B, C}
B+ = {B, C}
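The two steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `closure` and the representation of each functional dependency as a pair of strings are assumptions made for this example, and the "recursive" step is written as a loop that repeats until nothing new can be added.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs, each side a string of
    single-letter attribute names, e.g. ("A", "B") for A -> B.
    """
    result = set(attrs)      # Step-01: start with the given attributes
    changed = True
    while changed:           # Step-02: repeat until no attribute is added
        changed = False
        for lhs, rhs in fds:
            # If the whole LHS is already in the result, the RHS follows.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


fds = [("A", "B"), ("B", "C")]
print(sorted(closure("A", fds)))
print(sorted(closure("B", fds)))
```

The loop stops because each pass either adds at least one attribute or ends the computation, and the number of attributes is finite.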
EXAMPLE 2
Consider a relation R ( A , B , C , D , E , F , G ) with the functional dependencies-
A → BC
BC → DE
D→F
CF → G
Now, let us find the closure of some attributes and attribute sets-
Closure of attribute A-
A+ = { A }
= {A, B , C } ( Using A → BC )
= {A, B , C , D , E } ( Using BC → DE )
= {A, B , C , D , E , F } ( Using D → F )
= {A, B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }
Closure of attribute D-
D+ = { D }
= { D , F } ( Using D → F )
We can not determine any other attribute using attributes D and F contained in the result set.
Thus,
D+ = { D , F }
Closure of attribute set { B , C }-
{ B , C }+ = { B , C }
= { B , C , D , E } ( Using BC → DE )
= { B , C , D , E , F } ( Using D → F )
= { B , C , D , E , F , G } ( Using CF → G )
Thus,
{ B , C }+ = { B , C , D , E , F , G }
Finding the Keys Using Closure-
Super Key-
•If the closure result of an attribute set contains all the attributes of the relation, then that attribute set is called as a super key
of that relation.
•Thus, we can say-
“The closure of a super key is the entire relation schema.”
Example-
Candidate Key-
•If an attribute set is a super key and no proper subset of it has a closure containing all the attributes of the relation, then that
attribute set is called a candidate key of that relation.
Example-
Problem-
Option-(A):
{ CF }+ = { C , F }
={C,F,G} ( Using C → G )
={C,E,F,G} ( Using F → E )
= { A , C , E , F , G } ( Using G → A )
= { A , C , D , E , F , G } ( Using AF → D )
Since, our obtained result set is same as the given result set, so, it means it is correctly given.
Option-(B):
{ BG }+ = { B , G }
= {A, B , G } ( Using G → A )
= {A, B , C , D , G } ( Using AB → CD )
Since, our obtained result set is same as the given result set, so, it means it is correctly given.
Option-(C):
{ AF }+ = { A , F }
= {A, D , F } ( Using AF → D )
= {A, D , E , F } ( Using F → E )
Since, our obtained result set is different from the given result set, so,it means it is not correctly given.
Option-(D):
{ AB }+ = { A , B }
= {A, B , C , D } ( Using AB → CD )
= {A, B , C , D , G } ( Using C → G )
Thus,
Option (C) and Option (D) are correct.
Candidate Key-
OR
Step-01:
A→B
C→D
D→E
Here, the attributes which are not present on RHS of any functional dependency are A, C and F. These are the essential attributes: each of them must be part of every candidate key.
Step-02:
Case-01:
If all essential attributes together can determine all remaining non-essential attributes, then-
•The combination of essential attributes is the candidate key.
•It is the only possible candidate key.
Case-02:
If all essential attributes together can not determine all remaining non-essential attributes, then-
•The set of essential attributes and some non-essential attributes will be the candidate key(s).
•In this case, multiple candidate keys are possible.
•To find the candidate keys, we check different combinations of essential and non-essential attributes.
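The essential-attribute method can be sketched as follows for the dependencies A → B, C → D, D → E. The relation schema is assumed here to be R(A, B, C, D, E, F), since F is listed among the essential attributes. The helper names are hypothetical.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def essential_attributes(relation, fds):
    """Attributes that never appear on the RHS of any dependency."""
    on_rhs = set()
    for _lhs, rhs in fds:
        on_rhs |= set(rhs)
    return set(relation) - on_rhs


relation = "ABCDEF"  # assumed schema R(A, B, C, D, E, F)
fds = [("A", "B"), ("C", "D"), ("D", "E")]

essential = essential_attributes(relation, fds)
# Case-01 applies when the essential attributes determine everything else.
is_only_key = closure(essential, fds) == set(relation)
print(sorted(essential), is_only_key)
```

When `is_only_key` is false, Case-02 applies and combinations of the essential attributes with non-essential ones have to be tried.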
PRACTICE PROBLEM BASED ON FINDING CANDIDATE KEYS-
Problem-01:
Also, determine the total number of candidate keys and super keys.
Solution-
We will find candidate keys of the given relation in the following steps-
Step-01:
Step-02:
Now,
•We will check if the essential attributes together can determine all remaining non-essential attributes.
•To check, we find the closure of CE.
So, we have-
{ CE }+
={C,E}
={C,E,F} ( Using C → F )
= {A, C , E , F} ( Using E → A )
= {A, C , D , E , F } ( Using EC → D )
= {A, B , C , D , E , F } ( Using A → B )
We conclude that CE can determine all the attributes of the given relation.
So, CE is the only possible candidate key of the relation.
Thus, Option (B) is correct.
Total Number of Candidate Keys-
In general, if a relation with N attributes has only one candidate key and that key consists of K attributes,
then total super keys = 2^(N-K). For a single-attribute candidate key, this gives 2^(N-1).
Consider the relation scheme R(E, F, G, H, I, J, K, L, M, N) and the set of functional dependencies-
{ E, F } → { G }
{F}→{I,J}
{ E, H } → { K, L }
{K}→{M}
{L}→{N}
Also, determine the total number of candidate keys and super keys.
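For a relation of this size the candidate keys and super keys can also be found by brute force, trying every non-empty subset of attributes. The sketch below is illustrative only: `candidate_keys` and `count_super_keys` are hypothetical helper names, and enumerating all subsets is practical only for small schemas.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(relation, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    attrs = sorted(set(relation))
    keys = []
    for size in range(1, len(attrs) + 1):      # smallest subsets first
        for combo in combinations(attrs, size):
            subset = set(combo)
            if any(key <= subset for key in keys):
                continue                        # contains a smaller key
            if closure(subset, fds) == set(attrs):
                keys.append(subset)
    return keys


def count_super_keys(relation, fds):
    """Count the non-empty attribute sets whose closure is the whole relation."""
    attrs = sorted(set(relation))
    return sum(
        1
        for size in range(1, len(attrs) + 1)
        for combo in combinations(attrs, size)
        if closure(combo, fds) == set(attrs)
    )


relation = "EFGHIJKLMN"
fds = [("EF", "G"), ("F", "IJ"), ("EH", "KL"), ("K", "M"), ("L", "N")]

keys = candidate_keys(relation, fds)
print([sorted(k) for k in keys], count_super_keys(relation, fds))
```

E, F and H never appear on the right-hand side of any dependency, so every key must contain all three of them.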
Closure Of Functional Dependency :
The closure of a set of functional dependencies F, denoted F+, is the complete set of all functional dependencies that can be
derived from F using the inference rules known as Armstrong’s Rules.
In practice, F+ is worked out through attribute closures: X → Y is in F+ exactly when Y is contained in the closure of X.
There are three steps to calculate the closure for a given functional dependency. These are:
Step-1 : Add the attributes which are present on Left Hand Side in the original functional dependency.
Step-2 : Now, add the attributes present on the Right-Hand Side of the functional dependency.
Step-3 : With the help of attributes present on Right Hand Side, check the other attributes that can be derived from the other
given functional dependencies.
Repeat this process until all the possible attributes which can be derived are added in the closure.
In a relation R(A,B,C) the following FDs exist:
F: {A → B, B → C}
Q- Find the number of FDs in F+ for a relation with two attributes R(A,B).
Take all possible subsets of {A,B}, then consider every X → Y where X is a subset of {A,B} and Y is a subset of {A,B}, and check which of them hold.
Thus, if R has n attributes, then the number of possible subsets is 2^n, and the number of possible FDs X → Y is 2^n × 2^n = 2^(2n).
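The counting idea can be sketched in Python for R(A,B,C) with F: {A → B, B → C}. For every subset X, each subset Y of the closure of X gives one dependency X → Y in F+. This is an illustrative sketch under one stated convention: the empty set is counted both as a possible X and as a possible Y, and other texts may exclude it. The function name `count_fds_in_closure` is hypothetical.

```python
from itertools import combinations


def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def count_fds_in_closure(relation, fds):
    """Count all X -> Y in F+, with X and Y ranging over every subset
    of the attributes (the empty set included, by the convention above)."""
    attrs = sorted(set(relation))
    total = 0
    for size in range(len(attrs) + 1):
        for lhs in combinations(attrs, size):
            # X -> Y holds exactly when Y is a subset of the closure of X,
            # and a set with m elements has 2**m subsets.
            total += 2 ** len(closure(lhs, fds))
    return total


print(count_fds_in_closure("ABC", [("A", "B"), ("B", "C")]))
print(count_fds_in_closure("AB", []))  # only trivial dependencies
```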
Practice Problem:
The process of breaking up or dividing a single relation into two or more sub relations is called as decomposition of a
relation.
Properties of Decomposition-
The following two properties must be followed when decomposing a given relation-
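1. Lossless decomposition-
No information is lost from the original relation during decomposition: joining the sub relations gives back the original relation.
2. Dependency preservation-
None of the functional dependencies that hold on the original relation are lost.
The sub relations still hold or satisfy the functional dependencies of the original relation.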
Types of Decomposition-
A decomposition of a relation R into sub relations R1, R2, …, Rn is called a lossless join decomposition when the join of the
sub relations results in the same relation R that was decomposed.
R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R
Condition-01:
Union of both the sub relations must contain all the attributes that are present in the original relation R.
Thus, R1 ∪ R2 = R
Condition-02:
Intersection of both the sub relations must not be null. In other words, there must be some common attribute which is
present in both the sub relations.
Thus,
R1 ∩ R2 ≠ ∅
Condition-03:
Intersection of both the sub relations must be a super key of either R1 or R2 or both.
PRACTICE PROBLEM BASED ON DETERMINING WHETHER DECOMPOSITION IS LOSSLESS OR LOSSY-
Problem-01:
Consider a relation schema R ( A , B , C , D ) with the functional dependencies A → B and C → D. Determine whether the
decomposition of R into R1 ( A , B ) and R2 ( C , D ) is lossless or lossy.
Solution-
Condition-01:
According to condition-01, union of both the sub relations must contain all the attributes of relation R.
So, we have-
R1 ( A , B ) ∪ R2 ( C , D )
= R ( A , B , C , D )
Clearly, union of the sub relations contain all the attributes of relation R.
Thus, condition-01 satisfies.
Condition-02:
According to condition-02, intersection of both the sub relations must not be null.
So, we have-
R1 ( A , B ) ∩ R2 ( C , D )
= ∅
Clearly, intersection of the sub relations is null.
So, condition-02 fails.
Thus, we conclude that the decomposition is lossy.
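The three conditions for a decomposition into two sub relations can be sketched as one Python function. This is an illustrative sketch: `is_lossless_binary` is a hypothetical name, and Condition-03 is tested by checking whether the closure of the common attributes (computed under the dependencies of R) covers all of R1 or all of R2. The second call uses a made-up relation R(A, B, C) with A → B, which is not part of the problem above.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless_binary(relation, r1, r2, fds):
    """Check the three conditions for decomposing `relation` into r1 and r2."""
    whole, first, second = set(relation), set(r1), set(r2)
    if first | second != whole:      # Condition-01: union covers R
        return False
    common = first & second
    if not common:                   # Condition-02: intersection not empty
        return False
    reachable = closure(common, fds)
    # Condition-03: common attributes form a super key of R1 or of R2.
    return first <= reachable or second <= reachable


# Problem-01: R(A, B, C, D) with A -> B and C -> D, split into (A, B) and (C, D).
print(is_lossless_binary("ABCD", "AB", "CD", [("A", "B"), ("C", "D")]))

# Hypothetical example: R(A, B, C) with A -> B, split into (A, B) and (A, C).
print(is_lossless_binary("ABC", "AB", "AC", [("A", "B")]))
```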
Problem-02:
A given relation is said to be in First Normal Form (1NF) if each cell of the table contains only an atomic value.
OR
A given relation is said to be in First Normal Form (1NF) if every attribute of every tuple is either single valued or a null value.
Example-
Relation is in 1NF
NOTE-
A given relation is said to be in Second Normal Form (2NF) if and only if-
1.Relation already exists in 1NF.
2.No partial dependency exists in the relation.
Partial Dependency
A partial dependency is a dependency where a few attributes of the candidate key determine non-prime attribute(s).
OR
A partial dependency is a dependency where a portion of the candidate key (an incomplete candidate key) determines non-prime
attribute(s).
In other words,
A → B is called a partial dependency if and only if-
1.A is a proper subset of some candidate key
2.B is a non-prime attribute.
If any one condition fails, then it will not be a partial dependency.
NOTE-
•To avoid partial dependency, incomplete candidate key must not determine any non-prime attribute.
•However, incomplete candidate key can determine prime attributes.
Example-
From here,
•Prime attributes = { V , W , X , Y }
•Non-prime attributes = { Z }
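The two conditions for a partial dependency can be sketched as a simple check over a list of dependencies. This is an illustrative sketch: the relation R(A, B, C, D), its candidate key AB and its dependencies are hypothetical and do not come from the example above, and the function only inspects the dependencies it is given, not every dependency implied by them.

```python
def partial_dependencies(relation, candidate_keys, fds):
    """Return the (lhs, non-prime attributes) pairs that are partial dependencies.

    `candidate_keys` is a list of strings, `fds` a list of (lhs, rhs) strings.
    """
    prime = set()
    for key in candidate_keys:
        prime |= set(key)
    non_prime = set(relation) - prime

    found = []
    for lhs, rhs in fds:
        lhs_set = set(lhs)
        # Condition 1: LHS is a proper subset of some candidate key.
        is_part_of_key = any(lhs_set < set(key) for key in candidate_keys)
        # Condition 2: the dependency determines at least one non-prime attribute.
        determined = (set(rhs) - lhs_set) & non_prime
        if is_part_of_key and determined:
            found.append((lhs, "".join(sorted(determined))))
    return found


# Hypothetical relation R(A, B, C, D) with candidate key AB.
relation = "ABCD"
keys = ["AB"]
fds = [("AB", "C"), ("A", "D")]
print(partial_dependencies(relation, keys, fds))
```

An empty result for the given dependencies means no partial dependency was found among them, which is the second condition of 2NF.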
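A given relation is said to be in Third Normal Form (3NF) if and only if-
1.Relation already exists in 2NF.
2.No transitive dependency exists for non-prime attributes.
Transitive Dependency
A → B is called a transitive dependency if and only if A is not a super key and B is a non-prime attribute.
NOTE-
Equivalently, a relation is in 3NF if for every non-trivial functional dependency A → B at least one of the following holds:
•A is a super key
•B is a prime attribute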