Design Theory For Relational Databases

Chapter 3: Design Theory For Relational Databases
3.1 Functional Dependencies

3.1.1 Definition Of Functional Dependency
3.1.2 Keys Of Relations
3.1.3 Superkeys
3.1.4 Exercises For Section 3.1
3.2 Rules About Functional Dependencies

3.2.1 Reasoning About Functional Dependencies
3.2.2 The Splitting/Combining Rule
3.2.3 Trivial Functional Dependencies
3.2.4 Computing The Closure Of Attributes
3.2.5 Why The Closure Algorithm Works
3.2.6 The Transitive Rule
3.2.7 Closing Sets Of Functional Dependencies
3.2.8 Projecting Functional Dependencies
3.3 Design Of Relational Database Schemas

3.3.1 Anomalies
3.3.2 Decomposing Relations
3.3.3 Boyce-Codd Normal Form
3.3.4 Decomposing Into BCNF
3.4 Decomposition: The Good, The Bad, The Ugly

3.4.1 Recovering Information From A Decomposition
3.4.2 The Chase Test For Lossless Join
3.4.3 Why The Chase Works
3.4.4 Dependency Preservation
3.4.5 Exercises For Section 3.4
3.5 Third Normal Form

3.5.1 Definition Of Third Normal Form
3.5.2 The Synthesis Algorithm For 3NF Schemas
3.5.3 Why the 3NF Synthesis Algorithm Works
3.6 MultiValued Dependencies

3.6.1 Attribute Independence And Its Consequent Redundancy
3.6.2 Definition Of Multivalued Dependencies
3.6.3 Reasoning About Multivalued Dependencies
3.6.4 Fourth Normal Form
3.6.5 Decomposition Into Fourth Normal Forms
3.6.6 Relationships Among Normal Forms
3.7 An Algorithm To Discover MultiValued Dependencies

3.7.1 The Closure And The Chase
3.7.2 Extending The Chase To MVDs
3.7.3 Why The Chase Works For MVDs
3.7.4 Projecting MVDs
3.8 Summary Of Chapter 3
3.9 References For Chapter 3

Notes
Chapter 3: Design Theory For Relational Databases
Question: given an application, how do we go about designing a database for it?
In Chapter 4, we see high-level notations for describing the structure of data, and
ways in which these high-level designs can be converted to relations.
Alternatively, we can look at the requirements of the application and design the
relations directly, without going through an intermediate stage.
Whichever approach is used, there is usually room for improvement on the
initial schema.
Fortunately, there is a well-developed theory for relational databases:
"dependencies" and their implications for what makes a good relational schema.
This theory leads to "normalization": we decompose one relation into two or more
to eliminate anomalies. Next come "multivalued dependencies", which
intuitively represent a condition where one or more attributes are independent of
the other attributes. These dependencies also lead to normal forms and to
decomposition of relations to eliminate redundancy.
3.1 Functional Dependencies
Design theory lets us
a) examine a given design carefully and
b) improve it based on a few simple principles.
The theory begins by stating the constraints that apply to the relation.
The most common constraint is the "functional dependency", which generalises the
idea of a key for a relation.
Then we see how the theory gives us tools to improve our relations by
"decomposition of relations": the replacement of one relation by two or more
whose sets of attributes, taken together, include all the attributes of the
original relation.
3.1.1 Definition Of Functional Dependency
A Functional Dependency (FD) on a relation R is a statement of the form "If two
tuples of R agree on all of the attributes A1, A2, ..., A_n (i.e. both tuples have
the same value for each of these attributes), then they must also agree on all of
the attributes B1, B2, ..., B_m." (Note: A1, ..., A_n do *not* have to appear
consecutively in the schema, and neither do B1, B2, ..., B_m.)
If the FD is true of every instance of relation R, we say "R satisfies the FD".
When we say that R satisfies an FD, we are asserting a constraint
on R itself, not just on some instances of R.
Notation: "A1, ..., A_n functionally determine B1, ..., B_m" is written
A1, ..., A_n -> B1, ..., B_m
This is equivalent to the set of FDs
A1, ..., A_n -> B1
A1, ..., A_n -> B2
...
A1, ..., A_n -> B_m
Consider the relation instance

title, year, length, genre, studioName, starName
Star Wars, 1977, 124, scifi, Fox, Carrie Fisher
Star Wars, 1977, 124, scifi, Fox, Mark Hamill
Star Wars, 1977, 124, scifi, Fox, Harrison Ford
Gone With The Wind, 1939, 231, drama, MGM, Vivien Leigh
Wayne's World, 1992, 95, comedy, Paramount, Dana Carvey
Wayne's World, 1992, 95, comedy, Paramount, Mike Myers
For this relation, the FD
title, year -> length, genre, studioName
holds.
The FD
title, year -> starName
does not hold (counterexample: the first two tuples have the same title and
year, but their starName values differ).
Note that an FD is a claim about *all* possible instances of the relation, not
about one particular instance: a single instance can refute an FD, but it can
never establish one.
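The check described above can be sketched in Python. This is a minimal illustration, not part of the original notes: the names ATTRS, ROWS and find_violation are made up here, and the function only searches one instance for a counterexample. A result of None means "not refuted by this instance", not "the FD holds".

```python
# Hypothetical helper: look for two tuples that agree on lhs but differ on rhs.
ATTRS = ("title", "year", "length", "genre", "studioName", "starName")

ROWS = [
    ("Star Wars", 1977, 124, "scifi", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, 124, "scifi", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, 124, "scifi", "Fox", "Harrison Ford"),
    ("Gone With The Wind", 1939, 231, "drama", "MGM", "Vivien Leigh"),
    ("Wayne's World", 1992, 95, "comedy", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, 95, "comedy", "Paramount", "Mike Myers"),
]


def find_violation(rows, attrs, lhs, rhs):
    """Return a pair of rows that agree on lhs but differ on rhs, or None."""
    index = {name: i for i, name in enumerate(attrs)}

    def project(row, names):
        return tuple(row[index[n]] for n in names)

    first_seen = {}  # lhs values -> first row seen with those values
    for row in rows:
        key = project(row, lhs)
        if key in first_seen:
            if project(first_seen[key], rhs) != project(row, rhs):
                return first_seen[key], row
        else:
            first_seen[key] = row
    return None


# title, year -> length, genre, studioName is not refuted by this instance.
print(find_violation(ROWS, ATTRS, ["title", "year"],
                     ["length", "genre", "studioName"]))
# title, year -> starName is refuted by the first two tuples.
print(find_violation(ROWS, ATTRS, ["title", "year"], ["starName"]))
```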
3.1.2 Keys Of Relations
We say that attributes {A1, A2, ..., A_n} are a *key* for a relation R if
(1) these attributes functionally determine all other attributes of the
relation, i.e. no two distinct tuples in any instance of R can agree on all of
A1 through A_n, and
(2) no proper *subset* of {A1, ..., A_n} functionally determines all other
attributes of R. In other words, a key must be minimal.
Consider
title, year, length, genre, studioName, starName
Star Wars, 1977, 124, scifi, Fox, Carrie Fisher
Star Wars, 1977, 124, scifi, Fox, Mark Hamill
Star Wars, 1977, 124, scifi, Fox, Harrison Ford
Gone With The Wind, 1939, 231, drama, MGM, Vivien Leigh
Wayne's World, 1992, 95, comedy, Paramount, Dana Carvey
Wayne's World, 1992, 95, comedy, Paramount, Mike Myers

{title, year, starName} is a key.
{year, starName} is not a key, because one star could act in two movies in
the same year. (In the *instance* above, {year, starName} does happen to
identify each tuple uniquely. In other words, what counts as a key comes from
knowledge of the *domain*, not from reverse engineering an *instance*.)
{title, starName} is not a key, because there could be two versions of a movie
(Spiderman!) made in two different years, with a star in common.
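To see why an instance alone cannot tell us what the keys are, here is a small Python sketch. The helper name unique_in_instance is hypothetical, and the sketch assumes the instance contains no duplicate tuples. It only tests whether a set of attributes happens to identify every tuple of this one instance.

```python
ATTRS = ("title", "year", "length", "genre", "studioName", "starName")

ROWS = [
    ("Star Wars", 1977, 124, "scifi", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, 124, "scifi", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, 124, "scifi", "Fox", "Harrison Ford"),
    ("Gone With The Wind", 1939, 231, "drama", "MGM", "Vivien Leigh"),
    ("Wayne's World", 1992, 95, "comedy", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, 95, "comedy", "Paramount", "Mike Myers"),
]


def unique_in_instance(rows, attrs, subset):
    """True if no two rows of this instance agree on every attribute in subset."""
    positions = [attrs.index(name) for name in subset]
    projections = [tuple(row[i] for i in positions) for row in rows]
    return len(set(projections)) == len(projections)


for candidate in (["title", "year", "starName"],
                  ["year", "starName"],
                  ["starName"],
                  ["title", "year"]):
    print(candidate, unique_in_instance(ROWS, ATTRS, candidate))
```

In this instance even {starName} on its own identifies every tuple, yet it is clearly not a key of the relation: the same star can appear in many movies. The instance can rule candidates out (as it does for {title, year}), but it cannot confirm a key.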
What is "Functional" about Functional Dependencies?

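A1, A2, ..., A_n -> B is called a functional dependency because we can conceive of
a function F that takes values for A1, ..., A_n as arguments and produces a unique
value, or no value at all, for B. This is not a function in the usual mathematical
sense, though: there is no rule for computing B from the arguments; the value can
only be found by looking it up in the relation. (So it is a characteristic of the
data.)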
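One way to picture this "function" is as a lookup table built from the data. The Python sketch below is only an illustration under that reading: build_lookup and movie_info are made-up names, and the sample rows are taken from the movie instance earlier in these notes. The dictionary plays the role of F for the FD title, year -> length, genre, studioName.

```python
ROWS = [
    ("Star Wars", 1977, 124, "scifi", "Fox", "Carrie Fisher"),
    ("Star Wars", 1977, 124, "scifi", "Fox", "Mark Hamill"),
    ("Star Wars", 1977, 124, "scifi", "Fox", "Harrison Ford"),
    ("Gone With The Wind", 1939, 231, "drama", "MGM", "Vivien Leigh"),
    ("Wayne's World", 1992, 95, "comedy", "Paramount", "Dana Carvey"),
    ("Wayne's World", 1992, 95, "comedy", "Paramount", "Mike Myers"),
]


def build_lookup(rows):
    """Map (title, year) to (length, genre, studioName); fail if the FD is violated."""
    lookup = {}
    for title, year, length, genre, studio, _star in rows:
        value = (length, genre, studio)
        # setdefault stores value on first sight and returns the stored value after.
        if lookup.setdefault((title, year), value) != value:
            raise ValueError(f"FD violated for {(title, year)}")
    return lookup


movie_info = build_lookup(ROWS)
print(movie_info[("Star Wars", 1977)])   # a unique value for known arguments
print(movie_info.get(("Alien", 1979)))   # no value at all for unknown arguments
```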
3.1.3 Superkeys
A set of attributes that contains a key, in other words any superset of a key,
is a superkey.
Thus every key is a superkey. However, some superkeys are not minimal and so are
not keys. A superkey determines all the other attributes of the relation, but it
need not satisfy the second condition, minimality.
Terminology: some other books use 'key' to mean what we call 'superkey', i.e. a
*not necessarily* minimal collection of attributes that determines the rest of
the attributes of the relation.
3.2 Rules About Functional Dependencies
This section covers how to reason about FDs: given a set of FDs
that a relation satisfies, we can often deduce other FDs that the relation
must also satisfy. This ability to deduce further FDs is essential for designing
good relation schemas (Section 3.3).
3.2.1 Reasoning About Functional Dependencies
A motivating example: if we are told that the relation R(A, B, C) satisfies the
FDs A -> B and B -> C,
we can deduce that R also satisfies the FD A -> C.
"Proof":
Let (a1, b1, c1) and (a1, b2, c2) be two tuples of R that agree on A.
Since the FD A -> B holds and both tuples have the value a1 for
attribute A, b1 must equal b2.
Since b1 = b2 and the FD B -> C holds, c1 must equal c2.
So any two tuples that agree on A also agree on C, which proves that A -> C holds.
A set of FDs T *follows from* a set of FDs S
if every relation instance that satisfies all the FDs in S also satisfies all the
FDs in T. (Note that there may be additional relation instances that satisfy T
but not S: the set of instances satisfying S is 'smaller than' or 'equal to' the
set of instances satisfying T.)
Equivalence of sets of FDs S and T:
S and T are equivalent if the relation instances satisfying S are exactly those
satisfying T.
In other words, S follows from T and T follows from S.
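The definitions of "follows from" and "equivalent" can be tried out directly by brute force on very small instances. The Python sketch below is an illustration of the definitions, not a proof technique from these notes: the function names, the two-value domain and the limit of three rows per instance are all assumptions chosen to keep the search tiny. Attributes are referred to by position (A = 0, B = 1, C = 2).

```python
from itertools import combinations, product


def satisfies(instance, fds):
    """True if every pair of tuples in the instance respects every FD (lhs, rhs)."""
    for t, u in combinations(instance, 2):
        for lhs, rhs in fds:
            agree_on_lhs = all(t[i] == u[i] for i in lhs)
            differ_on_rhs = any(t[j] != u[j] for j in rhs)
            if agree_on_lhs and differ_on_rhs:
                return False
    return True


def follows_on_small_instances(s, t, n_attrs=3, domain=(0, 1), max_rows=3):
    """Check 'T follows from S' on all instances with at most max_rows tuples."""
    all_tuples = list(product(domain, repeat=n_attrs))
    for size in range(max_rows + 1):
        for instance in combinations(all_tuples, size):
            if satisfies(instance, s) and not satisfies(instance, t):
                return False  # an instance satisfying S but not T
    return True


A, B, C = 0, 1, 2
S = [((A,), (B,)), ((B,), (C,))]   # A -> B, B -> C
T = [((A,), (C,))]                 # A -> C

print(follows_on_small_instances(S, T))       # T follows from S
print(follows_on_small_instances(T, S))       # S does not follow from T
print(follows_on_small_instances(S, S + T)
      and follows_on_small_instances(S + T, S))  # S and S + T are equivalent
```

The second check fails because of instances such as {(0, 0, 0), (0, 1, 0)}, which satisfy A -> C but violate A -> B.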
This section presents some rules about FDs. In general, these let us replace one
set of FDs with an equivalent set, or add to a set of FDs further FDs that follow
from the original ones. E.g. by the transitivity argument above, given A -> B and
B -> C we can add A -> C.
We also give an algorithm to test whether an FD follows from a given set of FDs.
3.2.2 The Splitting/Combining Rule

1. Splitting: we can replace the FD
A1, A2, ..., A_n -> B1, B2, ..., B_m
with the set of FDs
A1, A2, ..., A_n -> B1
A1, A2, ..., A_n -> B2
...
A1, A2, ..., A_n -> B_m
2. Combining: we can replace the set of FDs
A1, A2, ..., A_n -> B1
A1, A2, ..., A_n -> B2
...
A1, A2, ..., A_n -> B_m
with the single FD
A1, A2, ..., A_n -> B1, B2, ..., B_m
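Both directions of the rule are easy to express in code. The following Python sketch assumes an FD is represented as a pair (lhs, rhs) of attribute sets; that representation and the names split_fd and combine_fds are choices made here for illustration only.

```python
def split_fd(lhs, rhs):
    """Splitting: one FD per right-hand attribute, all with the same left side."""
    return [(frozenset(lhs), frozenset([b])) for b in sorted(rhs)]


def combine_fds(fds):
    """Combining: merge the right sides of all FDs that share a left side."""
    merged = {}
    for lhs, rhs in fds:
        merged.setdefault(frozenset(lhs), set()).update(rhs)
    return [(lhs, frozenset(rhs)) for lhs, rhs in merged.items()]


parts = split_fd({"title", "year"}, {"length", "genre", "studioName"})
for lhs, rhs in parts:
    print(sorted(lhs), "->", sorted(rhs))

whole = combine_fds(parts)
print(sorted(whole[0][0]), "->", sorted(whole[0][1]))
```

Note that the rule only touches the right-hand side: the left-hand side of each FD is carried along unchanged.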
3.2.3 Trivial Functional Dependencies
A functional dependency is said to be trivial if it holds for all instances of the
relation, irrespective of what other constraints are assumed.
When the constraints are FDs, it is easy to tell whether an FD is trivial.
An FD
A1, A2, ..., A_n -> B1, B2, ..., B_m
is trivial exactly when {B1, ..., B_m} is a subset of {A1, A2, ..., A_n},
e.g.
title, year -> title
title -> title
We can assume any trivial FD without having to justify it in terms of the other
FDs asserted on the relation.
An intermediate situation is when *some*, but not all, of the attributes on the
right-hand side of the FD also appear on the left-hand side.
These attributes can be removed from the right-hand side to simplify the FD, e.g.
title, year -> year, length simplifies to title, year -> length.
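The triviality test and the simplification step amount to two set operations. A minimal Python sketch follows; the function names are made up for this illustration, and an FD is again assumed to be a pair of attribute sets.

```python
def is_trivial(lhs, rhs):
    """An FD is trivial when its right side is a subset of its left side."""
    return set(rhs) <= set(lhs)


def simplify(lhs, rhs):
    """Drop from the right side every attribute that already appears on the left."""
    return set(lhs), set(rhs) - set(lhs)


print(is_trivial({"title", "year"}, {"title"}))        # trivial
print(is_trivial({"title", "year"}, {"length"}))       # not trivial
print(simplify({"title", "year"}, {"year", "length"}))
```

If simplify returns an empty right-hand side, the original FD was trivial.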
3.2.4 Computing The Closure Of Attributes
3.2.5 Why The Closure Algorithm Works

3.2.6 The Transitive Rule
3.2.7 Closing Sets Of Functional Dependencies
3.2.8 Projecting Functional Dependencies
3.3 Design Of Relational Database Schemas

3.3.1 Anomalies
3.3.2 Decomposing Relations
3.3.3 Boyce-Codd Normal Form
3.3.4 Decomposing Into BCNF
3.4 Decomposition: The Good, The Bad, The Ugly

3.4.1 Recovering Information From A Decomposition
3.4.2 The Chase Test For Lossless Join
3.4.3 Why The Chase Works
3.4.4 Dependency Preservation
3.4.5 Exercises For Section 3.4
3.5 Third Normal Form

3.5.1 Definition Of Third Normal Form
3.5.2 The Synthesis Algorithm For 3NF Schemas
3.5.3 Why the 3NF Synthesis Algorithm Works
3.6 MultiValued Dependencies

3.6.1 Attribute Independence And Its Consequent Redundancy
3.6.2 Definition Of Multivalued Dependencies
3.6.3 Reasoning About Multivalued Dependencies
3.6.4 Fourth Normal Form
3.6.5 Decomposition Into Fourth Normal Forms
3.6.6 Relationships Among Normal Forms
3.7 An Algorithm To Discover MultiValued Dependencies

3.7.1 The Closure And The Chase
3.7.2 Extending The Chase To MVDs
3.7.3 Why The Chase Works For MVDs
3.7.4 Projecting MVDs
3.8 Summary Of Chapter 3
3.9 References For Chapter 3
