
1. Define and explain redundancy and the problems that it can cause.
Or
Discuss the various problems that arise with redundant storage of information.

Anomalies, or problems faced without normalization (problems due to redundancy):


Anomalies are the problems that arise in poorly planned, unnormalized databases where all the
data is stored in one table, sometimes called a flat-file database. Let us consider a schema of
this type.

Here all the data is stored in a single table, which causes redundancy: values such as SID and
Sname are repeated across rows. Let us discuss the anomalies one by one. Due to redundancy of
data we may get the following problems:
1. Insertion anomalies: It may not be possible to store some information unless some other
information is stored as well.
2. Redundant storage: Some information is stored repeatedly.
3. Update anomalies: If one copy of redundant data is updated, an inconsistency is created
unless all redundant copies of the data are updated.
4. Deletion anomalies: It may not be possible to delete some information without losing some
other information as well.

Update anomaly: If the fee changes from 5000 to 7000, the FEE column must be updated in every
row that stores it; otherwise the data becomes inconsistent.

Insertion Anomaly and Deletion Anomaly: These anomalies exist only because of redundancy;
otherwise they do not arise.
Insertion anomaly: A new course C4 is introduced, but it cannot be recorded because no student
has taken C4 yet.
Deletion anomaly: Deleting student S3 also deletes the course information stored with S3,
because deleting some data forces the deletion of other useful data.
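
The update and deletion anomalies can be illustrated with a small Python sketch. The column names SID, Sname, CID and FEE follow the text, but the rows themselves are hypothetical (the original table is not reproduced here), and the helper `fees_by_course` is an illustrative name.

```python
# Hypothetical flat table: one row per (student, course) enrolment.
# Column names follow the text; the row values are assumed for illustration.
rows = [
    {"SID": "S1", "Sname": "Ann", "CID": "C1", "FEE": 5000},
    {"SID": "S2", "Sname": "Bob", "CID": "C1", "FEE": 5000},
    {"SID": "S3", "Sname": "Cat", "CID": "C2", "FEE": 6000},
]

def fees_by_course(table):
    """Collect every distinct FEE value stored for each course."""
    fees = {}
    for row in table:
        fees.setdefault(row["CID"], set()).add(row["FEE"])
    return fees

# Update anomaly: changing the fee in only one of the C1 rows
# leaves two conflicting fees for the same course.
partial = [dict(r) for r in rows]
partial[0]["FEE"] = 7000
inconsistent = {c for c, f in fees_by_course(partial).items() if len(f) > 1}

# Deletion anomaly: removing student S3 also removes the only record of C2.
after_delete = [r for r in rows if r["SID"] != "S3"]
remaining_courses = {r["CID"] for r in after_delete}

print(inconsistent)        # courses that now have conflicting fees
print(remaining_courses)   # courses still recorded after deleting S3
```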
2. What is the need for normalization of schemas? Are there any demerits of normalization?

Purpose of Normalization:

• Minimize the redundancy in data.

• Remove insert, update, and delete anomalies during database activities.

• Reduce the need to reorganize the data when it is modified or enhanced.

• Reduce a complex user view to a set of small subgroups of fields or relations. This process
helps to design a logical data model, known as the conceptual data model.

Advantages of Normalization:

1. Greater overall database organization is gained.

2. The amount of unnecessary redundant data is reduced.

3. Data integrity is easily maintained within the database.

4. The database and application design processes are much more flexible.
5. Security is easier to maintain or manage.

Disadvantages of Normalization:

1. Normalization produces a lot of tables, each with a relatively small number of columns.
These tables then have to be joined using their primary/foreign key relationships.

2. This has two consequences.

Performance: all the joins required to merge the data slow processing and place additional
stress on the hardware.

Complex queries: developers have to write complex queries in order to merge data from
different tables.

3. Describe the desirable properties of Schema decomposition?

Properties of Decomposition

Following are the properties of Decomposition,


1. Lossless Decomposition
2. Dependency Preservation
3. Lack of Data Redundancy

1. Lossless Decomposition
• Decomposition must be lossless: no information may be lost from the relation that is
decomposed.
• It guarantees that joining the decomposed relations yields the same relation that was
decomposed.
2. Dependency Preservation
• Dependencies are important constraints on the database.
• Every dependency must be satisfied by at least one decomposed table.
• If A → B holds, then B is functionally dependent on A. The dependency is easiest to check
when both A and B are in the same relation.
• A decomposition has this property only if it maintains the functional dependencies.
• This property allows updates to be checked without computing the natural join of the
decomposed relations.
3. Lack of Data Redundancy
• Data redundancy is also known as repetition of information.
• A proper decomposition should not suffer from any data redundancy.
• A careless decomposition may cause problems with the data.
• Lack of data redundancy may be achieved by the normalization process.
4. Explain 1NF, 2NF, and 3NF with suitable examples?

First Normal Form (1NF): A relation is said to be in 1NF if it satisfies the following
conditions:
1. Each attribute name must be unique.
2. Each attribute value must be single or atomic, i.e., only single-valued attributes.
3. Each row / record must be unique.
4. There are no repeating groups.
Example: How do we bring an un-normalized table into first normal form? Consider the
following relation:

Solution: This table is not in first normal form because the [Color] column can contain
multiple values. For example, the first row includes values "red" and "green." To bring this
table to first normal form, we split the table into two tables and now we have the resulting
tables:

Second Normal Form (2NF): A relation is said to be in 2NF if it is already in 1NF and it has
no partial dependency, i.e., no non-prime attribute is dependent on only a part of a
candidate key.
(OR)
A relation is in second normal form if it satisfies the following conditions:
• It is in first normal form.
• All non-key attributes are fully functionally dependent on the primary key.
Note: Partial Functional Dependency: If a non-prime attribute of the relation is determined
by only a part of a candidate key, then such a dependency is known as a partial
dependency.
Example: Consider the following relation

➔This table has a composite primary key [Customer ID, Store ID]. The non-key attribute is
[Purchase Location]. In this case, [Purchase Location] only depends on [Store ID], which is
only part of the primary key. Therefore, this table does not satisfy second normal form.
➔ To bring this table to second normal form, we break the table into two tables, and now we
have the following

Third Normal Form (3NF): A relation is in third normal form if it satisfies the following
conditions:
• It is in 2NF.
• There is no transitive functional dependency.
By transitive functional dependency, we mean we have the following relationships in
the table: B is functionally dependent on A, and C is functionally dependent on B. In
this case, C is transitively dependent on A via B, and a non-key attribute is
depending on another non-key attribute.
Example: Consider the following relation.

➔ In the table above, [Book ID] determines [Genre ID], and [Genre ID] determines [Genre
Type].
Therefore, [Book ID] determines [Genre Type] via [Genre ID]. This is a transitive
functional dependency, so the structure does not satisfy third normal form.
➔ To bring this table to third normal form, we split the table into two as follows:
5. Give a set of FDs for the relation schema R(A, B, C, D) with primary key AB under which R is in
1NF but not in 2NF.
Sol: R = ABCD
Key = AB
(a) 1NF requires only atomic values, whereas 2NF also forbids partial dependencies. R is
therefore in 1NF but not in 2NF when a non-prime attribute depends on only part of the key
AB, as in any of the following FDs:
B → C, A → C, B → D, A → D
(each of these FDs is a partial dependency)
(b) If partial dependencies are not allowed but transitive dependencies are allowed
(R in 2NF but not in 3NF), FDs between the non-prime attributes qualify:
C → D, D → C

6. Explain how to preserve functional dependencies during decomposition.


Or
When is a decomposition said to be dependency-preserving? Explain with example.

Dependency Preserving Decomposition in DBMS


• A relation in the relational model is decomposed in order to convert it into an appropriate
normal form.
• A decomposition of a relation R into two or more relations should be both lossless join
and dependency preserving.

Dependency Preserving Decomposition


• If we decompose a relation R into relations R1 and R2, every dependency of R must either be
part of R1 or R2 or be derivable from the combination of the functional dependencies (FDs)
of R1 and R2.
• Suppose a relation R(A,B,C,D) with FD set {A->BC} is decomposed into R1(ABC) and
R2(AD). This decomposition is dependency preserving because the FD A->BC is a part of R1(ABC).
Theory

Consider a schema R(A,B,C,D) with functional dependencies A->B and C->D, which is decomposed
into R1(AB) and R2(CD).

This decomposition is a dependency preserving decomposition because

A->B can be ensured in R1(AB)

C->D can be ensured in R2(CD)

Example

Let a relation R(A,B,C,D) and a set of FDs F = { A -> B , A -> C , C -> D} be given.

The relation R is decomposed into –

R1 = (A, B, C) with FDs F1 = {A -> B, A -> C}, and

R2 = (C, D) with FDs F2 = {C -> D}.

F' = F1 ∪ F2 = {A -> B, A -> C, C -> D}

so, F' = F.

And so, F'+ = F+.

Thus, the decomposition is dependency preserving decomposition.
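
The check F'+ = F+ can be sketched in Python as follows. This is a minimal sketch, assuming the FD sets F1 and F2 of the sub relations are supplied by the caller (it does not compute the projections of F itself); the names `closure` and `preserves` are illustrative.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserves(fds, decomposed_fds):
    """True if every FD in fds follows from the union of the decomposed FD sets."""
    union = [fd for part in decomposed_fds for fd in part]
    return all(set(rhs) <= closure(lhs, union) for lhs, rhs in fds)

F = [("A", "B"), ("A", "C"), ("C", "D")]
F1 = [("A", "B"), ("A", "C")]   # FDs of R1(A, B, C)
F2 = [("C", "D")]               # FDs of R2(C, D)
print(preserves(F, [F1, F2]))   # True

# Counter-example (assumed for illustration): B -> C is lost when
# R(A, B, C) with A -> B, B -> C is split into R1(A, B) and R2(A, C).
print(preserves([("A", "B"), ("B", "C")], [[("A", "B")], [("A", "C")]]))  # False
```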

7. What is meant by the closure of functional dependencies? Illustrate with an example.

Closure of Functional Dependency: Introduction

The closure of a set of attributes is the complete set of attributes that can be functionally
derived from it using the given functional dependencies and the inference rules known as
Armstrong's axioms.

If F is a set of functional dependencies, its closure is denoted F+; the closure of an attribute
set X under F is denoted X+.

There are three steps to calculate closure of functional dependency. These are:

Step-1 : Add the attributes which are present on Left Hand Side in the original functional
dependency.
Step-2 : Now, add the attributes present on the Right Hand Side of the functional dependency.

Step-3 : With the help of attributes present on Right Hand Side, check the other attributes that can
be derived from the other given functional dependencies. Repeat this process until all the possible
attributes which can be derived are added in the closure.
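
A minimal Python sketch of these three steps, assuming FDs are given as (lhs, rhs) pairs of attribute strings. The function name `closure` and the sample FDs AB → CD, C → A, D → A are illustrative.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)            # Step 1: start with the given attributes
    changed = True
    while changed:                 # Step 3: repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            # Step 2: once the whole left side is covered, add the right side
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

fds = [("AB", "CD"), ("C", "A"), ("D", "A")]
print("".join(sorted(closure("AB", fds))))  # ABCD
print("".join(sorted(closure("B", fds))))   # B
```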

Example 1: Find the candidate keys for the relation R(ABCD) having the following FDs:
AB → CD, C → A, D → A.
Sol: From the given FDs, attribute B must be part of every candidate key because it does not
appear on the RHS of any functional dependency.

B+ = B (not a candidate key, so find the combinations of B)

AB+ = ABCD (∵ AB → CD)

BC+ = BCAD (∵ C → A, AB → CD)

BD+ = BDAC (∵ D → A, AB → CD)

CD+ = CDA (∵ C → A, D → A)

AC+ = AC

AD+ = AD

From the above, AB, BC and BD each determine all the attributes, and no proper subset of them
does. So AB, BC and BD are the candidate keys.
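
The search for candidate keys can also be done by brute force. The sketch below, with illustrative names `closure` and `candidate_keys`, tries attribute combinations in increasing size and keeps the minimal ones whose closure is the whole schema. Note that BD qualifies as well, because D → A supplies A and then AB → CD supplies C.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Enumerate minimal attribute sets whose closure covers the whole schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            if any(set(k) <= set(combo) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(combo, fds) == set(schema):
                keys.append("".join(combo))
    return keys

fds = [("AB", "CD"), ("C", "A"), ("D", "A")]
print(candidate_keys("ABCD", fds))  # ['AB', 'BC', 'BD']
```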

8. When is the decomposition of a relational schema R into two relational schemas X and Y said
to be lossless-join decomposition? Why is this property so important? Give a necessary and
sufficient condition to test whether a decomposition is lossless-join

Decomposition of a Relation-

The process of breaking up or dividing a single relation into two or more sub relations is called as
decomposition of a relation.

Properties of Decomposition-

The following two properties must be followed when decomposing a given relation-

1. Lossless decomposition-

Lossless decomposition ensures-

• No information is lost from the original relation during decomposition.
• When the sub relations are joined back, the same relation is obtained that was decomposed.

Every decomposition must always be lossless.

2. Dependency Preservation-

Dependency preservation ensures-

• None of the functional dependencies that hold on the original relation are lost.
• The sub relations still hold or satisfy the functional dependencies of the original relation.

Types of Decomposition-

Decomposition of a relation can be completed in the following two ways-

1. Lossless Join Decomposition-

• Consider a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.
• This decomposition is called lossless join decomposition when the join of the sub relations
results in the same relation R that was decomposed.
• For lossless join decomposition, we always have-

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn = R

where ⋈ is a natural join operator

Example-

Consider the following relation R( A , B , C )-

A B C
1 2 1
2 5 3
3 3 3

R( A , B , C )

Consider this relation is decomposed into two sub relations R1( A , B ) and R2( B , C )-

The two sub relations are-

A B
1 2
2 5
3 3

R1( A , B )

B C
2 1
5 3
3 3

R2( B , C )

Now, let us check whether this decomposition is lossless or not.

For lossless decomposition, we must have-

R1 ⋈ R2 = R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 , we get-

A B C
1 2 1
2 5 3
3 3 3

This relation is same as the original relation R.

Thus, we conclude that the above decomposition is lossless join decomposition.

NOTE-

• Lossless join decomposition is also known as non-additive join decomposition.
• This is because the resultant relation after joining the sub relations is same as the
decomposed relation.
• No extraneous tuples appear after joining of the sub-relations.

2. Lossy Join Decomposition-

• Consider a relation R which is decomposed into sub relations R1 , R2 , …. , Rn.
• This decomposition is called lossy join decomposition when the join of the sub relations does
not result in the same relation R that was decomposed.
• The natural join of the sub relations is always found to have some extraneous tuples.
• For lossy join decomposition, we always have-

R1 ⋈ R2 ⋈ R3 ……. ⋈ Rn ⊃ R

where ⋈ is a natural join operator

Example-

Consider the following relation R( A , B , C )-

A B C
1 2 1
2 5 3
3 3 3

R( A , B , C )

Consider this relation is decomposed into two sub relations as R1( A , C ) and R2( B , C )-

The two sub relations are-

A C
1 1
2 3
3 3

R1( A , C )

B C
2 1
5 3
3 3

R2( B , C )

Now, let us check whether this decomposition is lossy or not.

For lossy decomposition, we must have-

R1 ⋈ R2 ⊃ R

Now, if we perform the natural join ( ⋈ ) of the sub relations R1 and R2 we get-

A B C
1 2 1
2 5 3
2 3 3
3 5 3
3 3 3

This relation is not same as the original relation R and contains some extraneous tuples.

Clearly, R1 ⋈ R2 ⊃ R.

Thus, we conclude that the above decomposition is lossy join decomposition.
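
Both examples can be reproduced with a few lines of Python. This is a minimal sketch in which relations are lists of dicts; the helper names `project`, `natural_join` and `as_set` are illustrative.

```python
def project(relation, attrs):
    """Projection of a relation (list of dicts) onto attrs, without duplicates."""
    out = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in out:
            out.append(row)
    return out

def natural_join(r1, r2):
    """Natural join: combine tuples that agree on all common attributes."""
    joined = []
    for t1 in r1:
        for t2 in r2:
            common = t1.keys() & t2.keys()
            if all(t1[a] == t2[a] for a in common):
                row = {**t1, **t2}
                if row not in joined:
                    joined.append(row)
    return joined

def as_set(relation):
    """Order-independent view of a relation, for comparisons."""
    return {tuple(sorted(t.items())) for t in relation}

R = [{"A": 1, "B": 2, "C": 1}, {"A": 2, "B": 5, "C": 3}, {"A": 3, "B": 3, "C": 3}]

lossless = natural_join(project(R, "AB"), project(R, "BC"))
lossy = natural_join(project(R, "AC"), project(R, "BC"))

print(as_set(lossless) == as_set(R))  # True: the join gives back R
print(len(lossy))                     # 5: two extraneous tuples appear
print(as_set(lossy) > as_set(R))      # True: a strict superset of R
```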

NOTE-

• Lossy join decomposition is also known as careless decomposition.
• This is because extraneous tuples get introduced in the natural join of the sub-relations.
• Extraneous tuples make the identification of the original tuples difficult.

Determining Whether Decomposition Is Lossless Or Lossy-

Consider a relation R is decomposed into two sub relations R1 and R2.

Then,

• If all the following conditions are satisfied, then the decomposition is lossless.
• If any of these conditions fails, then the decomposition is lossy.

Condition-01:

Union of both the sub relations must contain all the attributes that are present in the original
relation R.

Thus,

R1 ∪ R2 = R

Condition-02:

• Intersection of both the sub relations must not be null.
• In other words, there must be some common attribute which is present in both the sub
relations.

Thus,

R1 ∩ R2 ≠ ∅

Condition-03:

Intersection of both the sub relations must be a super key of either R1 or R2 or both.

Thus,

R1 ∩ R2 = Super key of R1 or R2
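
The three conditions can be sketched as a single Python function for a decomposition into two sub relations. Condition-03 is checked through attribute closure: if the closure of R1 ∩ R2 covers R1 (or R2), the intersection is a super key of that sub relation. The names `closure` and `is_lossless_binary` are illustrative, and the sample inputs are the decompositions of Problem-01 and of R into R'(A, B, C) and R3(B, D) from Problem-02.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless_binary(r, r1, r2, fds):
    """Apply Condition-01 to Condition-03 to a two-way decomposition of r."""
    r, r1, r2 = set(r), set(r1), set(r2)
    if r1 | r2 != r:                 # Condition-01: union covers R
        return False
    common = r1 & r2
    if not common:                   # Condition-02: intersection is not empty
        return False
    reach = closure(common, fds)     # Condition-03: intersection is a super key
    return r1 <= reach or r2 <= reach

fds1 = [("A", "B"), ("C", "D")]
print(is_lossless_binary("ABCD", "AB", "CD", fds1))   # False (lossy)

fds2 = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "B")]
print(is_lossless_binary("ABCD", "ABC", "BD", fds2))  # True (lossless)
```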


Problem-01:

Consider a relation schema R ( A , B , C , D ) with the functional dependencies A → B and C →
D. Determine whether the decomposition of R into R1 ( A , B ) and R2 ( C , D ) is lossless or
lossy.

Solution-

To determine whether the decomposition is lossless or lossy,

• We will check all the conditions one by one.
• If any of the conditions fails, then the decomposition is lossy, otherwise lossless.

Condition-01:

According to condition-01, union of both the sub relations must contain all the attributes of
relation R.

So, we have-

R1 ( A , B ) ∪ R2 ( C , D )

= R ( A , B , C , D )

Clearly, union of the sub relations contains all the attributes of relation R.

Thus, condition-01 is satisfied.

Condition-02:

According to condition-02, intersection of both the sub relations must not be null.

So, we have-

R1 ( A , B ) ∩ R2 ( C , D )

= ∅

Clearly, intersection of the sub relations is null.

So, condition-02 fails.

Thus, we conclude that the decomposition is lossy.

Problem-02:

Consider a relation schema R ( A , B , C , D ) with the following functional dependencies-

A → B
B → C
C → D
D → B

Determine whether the decomposition of R into R1 ( A , B ) , R2 ( B , C ) and R3 ( B , D ) is
lossless or lossy.

Consider the original relation R was decomposed into the given sub relations in two steps:
first R into R' ( A , B , C ) and R3 ( B , D ), then R' into R1 ( A , B ) and R2 ( B , C ).

Decomposition of R(A, B, C, D) into R'(A, B, C) and R3(B, D)-

To determine whether the decomposition is lossless or lossy,

• We will check all the conditions one by one.
• If any of the conditions fails, then the decomposition is lossy, otherwise lossless.
 
Condition-01:

According to condition-01, union of both the sub relations must contain all the attributes of
relation R.

So, we have-

R' ( A , B , C ) ∪ R3 ( B , D )

= R ( A , B , C , D )

Clearly, union of the sub relations contains all the attributes of relation R.

Thus, condition-01 is satisfied.

Condition-02:

According to condition-02, intersection of both the sub relations must not be null.

So, we have-

R' ( A , B , C ) ∩ R3 ( B , D )

= B

Clearly, intersection of the sub relations is not null.

Thus, condition-02 is satisfied.

Condition-03:

According to condition-03, intersection of both the sub relations must be the super key of one of
the two sub relations or both.

So, we have-

R' ( A , B , C ) ∩ R3 ( B , D )

= B

Now, the closure of attribute B is-

B+ = { B , C , D }

Now, we see-

• Attribute 'B' can not determine attribute 'A' of sub relation R'.
• Thus, it is not a super key of the sub relation R'.
• Attribute 'B' can determine all the attributes of sub relation R3.
• Thus, it is a super key of the sub relation R3.


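Clearly, intersection of the sub relations is a super key of one of the sub relations.

So, condition-03 is satisfied.

Thus, the decomposition of R into R' ( A , B , C ) and R3 ( B , D ) is lossless.

The same test applies to the decomposition of R' ( A , B , C ) into R1 ( A , B ) and
R2 ( B , C ): the union is { A , B , C }, the intersection is B, and since B → C, B is a super
key of R2 ( B , C ). So this step is also lossless.

Thus, we conclude that the decomposition is lossless.
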
9. Define multivalued dependencies and join dependencies. Discuss the use of such
dependencies in database design.

Multivalued Dependency in DBMS


A multivalued dependency (MVD) occurs when two attributes in a given table are independent of
each other, and yet both of them depend on a third attribute. A multivalued dependency
involves at least two attributes that depend on a third attribute, which is why it always
requires at least three attributes.

Why Do We Use Multivalued Dependency in DBMS?


We use multivalued dependencies in two ways:

• To test relations and decide whether they are legal under a given set of functional and
multivalued dependencies.
• To specify constraints on the set of legal relations. We then concern ourselves only with
the relations that satisfy a given set of functional and multivalued dependencies.

What is Join Dependency in DBMS


Whenever we can recreate a table by joining several tables, each of which consists of a
subset of the table's attributes, the table is said to satisfy a join dependency (JD). A JD is
thus a generalization of an MVD. Join dependencies relate to 5NF: a relation can be in 5NF
only when it is already in 4NF and cannot be decomposed any further.

Characteristics of Join Dependency in DBMS


• A join dependency is a further generalization of multivalued dependencies.
• If the join of X1 and X2 over C is equal to relation X, then a join dependency (JD)
exists,
• where X1 and X2 are the decompositions X1(A, B, C) and X2(C, D) of a given relation
X (A, B, C, D).
• Equivalently, X1 and X2 form a lossless decomposition of X.
• A JD ⋈ {X1, X2,…, Xn} holds over a relation X if X1, X2,….., Xn is a lossless-join
decomposition of X.
• *((A, B, C), (C, D)) is a join dependency of X if the join of these two projections over
their common attribute is equal to the relation X.
• Here, the notation *(X1, X2, X3, …) indicates that relations X1, X2, X3 and so on are a
join decomposition of X.
10. Define functional dependencies. How are primary keys related to FDs? Explain with example.

Functional dependencies are fundamental to the process of normalization, i.e., functional
dependencies play a key role in differentiating good database designs from bad ones.

A functional dependency is a "type of constraint that is a generalization of the notion of
a key".

A functional dependency describes the relationship between attributes (columns) in a table.

A functional dependency is represented by an arrow sign (→).

In other words, a dependency FD: "X → Y" means that the values of Y are determined by the
values of X. Two tuples sharing the same values of X will necessarily have the same values of
Y. The attribute set on the left hand side is known as the "determinant". Here X is the
determinant.

Prime and non-prime attributes

Attributes which are part of any candidate key of a relation are called prime attributes; the
others are non-prime attributes.

Candidate Key: A candidate key is a minimal set of attributes of a relation which can be used
to identify a tuple uniquely.

Consider student table: student(sno, sname, sphone, age)

We can take sno as a candidate key. Because sno identifies each tuple uniquely, the FD
sno → sname, sphone, age holds; in general, a key functionally determines every attribute of
its relation. We can have more than one candidate key in a table.

Types of candidate keys:

1. Simple (having only one attribute)

2. Composite (having multiple attributes as candidate key)

Super Key: A super key is a set of attributes of a relation which can be used to identify a
tuple uniquely.

• Adding zero or more attributes to a candidate key generates a super key.
• A candidate key is a super key, but the converse is not true.

Consider student table: student(sno, sname, sphone, age)

We can take sno or (sno, sname) as a super key.

11. Consider the schema R=(A,B,C,D,E) and the functional dependencies: A→BC, CD→E, B→D,
E→A. Give a lossless join decomposition into BCNF of the schema R.
Solution:
result := {R};
F+ = {A → ABCDE, B → D, BC → ABCDE, CD → ABCDE, E → ABCDE, …}.
R is not in BCNF.
B → D is a non-trivial FD that holds on R, B ∩ D = ∅, and B → ABCDE is not in F+.
Therefore,
result := (result – R) ∪ (R – D) ∪ (B, D), i.e. (A, B, C, E) ∪ (B, D).
(A, B, C, E) and (B, D) are in BCNF. So this is a decomposition of R into BCNF. It is
lossless because the common attribute B is a key of (B, D).
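
The BCNF claims in this solution can be checked with a short Python sketch. It uses the closure-based test that a schema Ri is in BCNF if, for every subset X of Ri, X+ either adds no attribute of Ri beyond X or covers all of Ri. The function names `closure` and `is_bcnf` are illustrative.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(schema, fds):
    """Closure-based BCNF test for a schema under the FDs of the original relation."""
    schema = set(schema)
    for size in range(1, len(schema)):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            reach = closure(x, fds) & schema
            # X determines something new inside the schema but is not a super key
            if reach != x and reach != schema:
                return False
    return True

F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
print(is_bcnf("ABCDE", F))  # False: B -> D, and B is not a super key
print(is_bcnf("ABCE", F))   # True
print(is_bcnf("BD", F))     # True
```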

12. Consider the relational schema R(A, B, C), which has the FD: B → C. If A is a candidate key for
R, is it possible for R to be in BCNF? If so, under what conditions? If not, explain why not.

Solution:

If A is a candidate key for the given relation, then the following non-trivial dependency is
implied:

A → BC, since a candidate key uniquely identifies each tuple and hence determines the values
of all attributes.

So prime attribute: A

Non-prime attributes: {B, C}

We are also given the dependency B → C.

BCNF requires that the determinant (left hand side) of every non-trivial FD be a superkey.
If A is the only candidate key, B is not a superkey (it does not determine A), so the
relation is not in BCNF.

R can be in BCNF only if B is also a key for R, i.e., if B → A holds as well.
A superkey is a superset of a candidate key; equivalently, a candidate key is a minimal
superkey.

13. Consider the instance of a relation shown below:


List all the functional dependencies that this relational instance satisfies

Solution:

Let us find the FDs.


a) X->Y is an FD: every tuple with x1 has the same Y value y1, and the same is true for x2.
b) Y->Z is not an FD: tuples that agree on Y have different Z values.
c) Z->X is not an FD: tuples that agree on Z have different X values.
d) Z->Y is an FD: each Z value, such as z1 and z2, is associated with the single Y value y1.
e) XY->Z is not an FD: tuples that agree on both X and Y have different Z values.
f) YZ->X is not an FD: tuples that agree on both Y and Z have different X values.
g) XZ->Y is an FD: each combination, such as x1z1, x2z1 and x2z3, is associated with the
single Y value y1.
In conclusion, the following functional dependencies hold over relation R: X->Y, Z->Y
and XZ->Y.
14. Suppose that we have the following four tuples in a relation S with three attributes ABC:
(1,2,3), (4,2,3), (5,3,3), (5,3,4). Which functional and multivalued dependencies hold over
relation S?
Solution:
This can be solved in the same way as the previous question.
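
As a hedged aid for checking the functional dependencies by machine, the Python sketch below tests every non-trivial FD with a single attribute on the right-hand side against the four tuples of S. It does not cover multivalued dependencies, and the names `holds` and `nontrivial` are illustrative.

```python
from itertools import combinations

ATTRS = "ABC"
S = [(1, 2, 3), (4, 2, 3), (5, 3, 3), (5, 3, 4)]

def holds(lhs, rhs, rows=S):
    """True if tuples that agree on lhs always agree on rhs in this instance."""
    seen = {}
    for row in rows:
        t = dict(zip(ATTRS, row))
        key = tuple(t[a] for a in lhs)
        val = tuple(t[a] for a in rhs)
        # setdefault stores the first value seen for this key and returns it
        if seen.setdefault(key, val) != val:
            return False
    return True

# Collect the non-trivial FDs (single attribute on the right) that S satisfies.
nontrivial = []
for size in (1, 2):
    for lhs in combinations(ATTRS, size):
        for rhs in ATTRS:
            if rhs not in lhs and holds(lhs, rhs):
                nontrivial.append(("".join(lhs), rhs))

for lhs, rhs in nontrivial:
    print(f"{lhs} -> {rhs}")
```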
