
Normal Forms

Chapter 19

Database Management Systems, 3ed, R. Ramakrishnan and J. Gehrke

The Evils of Redundancy

Redundancy is at the root of several problems

associated with relational schemas:

redundant storage, insert/delete/update anomalies

Integrity constraints, in particular functional

dependencies, can be used to identify schemas with

such problems and to suggest refinements.

Main refinement technique: decomposition (replacing

ABCD with, say, AB and BCD, or ACD and ABD).

Decomposition should be used judiciously:

Is there reason to decompose a relation?

What problems (if any) does the decomposition cause?


Functional Dependencies (FDs)

A functional dependency X→Y holds over relation R

if, for every allowable instance r of R:

∀ t1 ∈ r, ∀ t2 ∈ r: π_X(t1) = π_X(t2) implies π_Y(t1) = π_Y(t2)

i.e., given two tuples in r, if the X values agree, then the Y

values must also agree. (X and Y are sets of attributes.)

An FD is a statement about all allowable relations.

Must be identified based on semantics of application.

Given some allowable instance r1 of R, we can check if it

violates some FD f, but we cannot tell if f holds over R!

K is a candidate key for R means that K→R

However, K→R alone does not require K to be minimal! (It only makes K a superkey; a candidate key is a minimal superkey.)
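The point that an instance can refute an FD but never prove it can be sketched in Python. This is a minimal illustration: the function name `violates_fd` and the sample rows are assumptions made up for this example, not part of the slides.

```python
def violates_fd(rows, lhs, rhs):
    """Return True if the instance `rows` violates the FD lhs -> rhs."""
    seen = {}  # maps each X-value to the first Y-value observed with it
    for row in rows:
        x_val = tuple(row[a] for a in lhs)
        y_val = tuple(row[a] for a in rhs)
        if seen.setdefault(x_val, y_val) != y_val:
            return True  # two tuples agree on X but differ on Y
    return False


# Hypothetical instance; the values are invented for illustration.
emps = [
    {"ssn": 1, "rating": 8, "hrly_wages": 10},
    {"ssn": 2, "rating": 8, "hrly_wages": 10},
    {"ssn": 3, "rating": 5, "hrly_wages": 7},
]
print(violates_fd(emps, ["rating"], ["hrly_wages"]))  # False: no violation in this instance
emps.append({"ssn": 4, "rating": 5, "hrly_wages": 9})
print(violates_fd(emps, ["rating"], ["hrly_wages"]))  # True: rating 5 now has two wages
```

A `False` result only says that this particular instance is consistent with the FD; whether the FD holds over R still has to come from the application semantics.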


Example: Constraints on Entity Set

Consider relation obtained from Hourly_Emps:

Hourly_Emps (ssn, name, lot, rating, hrly_wages, hrs_worked)

Notation: We will denote this relation schema by

listing the attributes: SNLRWH

This is really the set of attributes {S,N,L,R,W,H}.

Sometimes, we will refer to all attributes of a relation by

using the relation name. (e.g., Hourly_Emps for SNLRWH)

Some FDs on Hourly_Emps:

ssn is the key: S→SNLRWH

rating determines hrly_wages: R→W

Example (Contd.)

Problems due to R→W:

Update anomaly: Can

we change W in just

the 1st tuple of SNLRWH?

Insertion anomaly: What if

we want to insert an

employee and don’t know

the hourly wage for his

rating?

Deletion anomaly: If we

delete all employees with

rating 5, we lose the

information about the

wage for rating 5!

Will 2 smaller tables be better?


Reasoning About FDs

Given some FDs, we can usually infer additional FDs:

ssn→did, did→lot implies ssn→lot

An FD f is implied by a set of FDs F if f holds

whenever all FDs in F hold.

F+ = closure of F is the set of all FDs that are implied by F.

Armstrong’s Axioms (X, Y, Z are sets of attributes):

Reflexivity: If X⊆Y, then Y→X

Augmentation: If X→Y, then XZ→YZ for any Z

Transitivity: If X→Y and Y→Z, then X→Z

These are sound and complete inference rules for FDs!


Reasoning About FDs (Contd.)

A couple of additional rules (that follow from Armstrong’s Axioms):

Union: If X→Y and X→Z, then X→YZ

Decomposition: If X→YZ, then X→Y and X→Z

Example: Contracts(cid,sid,jid,did,pid,qty,value), and:

C is the key: C→CSJDPQV

Project purchases each part using single contract: JP→C

Dept purchases at most one part from a supplier: SD→P

JP→C, C→CSJDPQV imply JP→CSJDPQV

SD→P implies SDJ→JP

SDJ→JP, JP→CSJDPQV imply SDJ→CSJDPQV


Reasoning About FDs (Contd.)

Computing the closure of a set of FDs can be

expensive. (Size of closure is exponential in # attrs!)

Typically, we just want to check if a given FD X→Y is

in the closure of a set of FDs F. An efficient check:

Compute attribute closure of X (denoted X+) wrt F:

• Set of all attributes A such that X→A is in F+

• There is a linear time algorithm to compute this.

Check if Y is in X+

Does F = {A→B, B→C, CD→E} imply A→E?

i.e., is A→E in the closure F+? Equivalently, is E in A+?
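The attribute-closure check can be sketched as follows. This is the simple fixpoint version, not the linear-time algorithm the slide mentions, and the name `attribute_closure` and the encoding of attribute sets as strings of single-letter attributes are assumptions made for this illustration.

```python
def attribute_closure(attrs, fds):
    """Return the closure of `attrs` under `fds`, a list of (lhs, rhs) attribute sets."""
    closure = set(attrs)
    changed = True
    while changed:  # repeat until no FD adds a new attribute
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure


F = [("A", "B"), ("B", "C"), ("CD", "E")]
print(sorted(attribute_closure("A", F)))  # ['A', 'B', 'C']
print("E" in attribute_closure("A", F))   # False, so F does not imply A→E
```

Since E is not in A+, the answer to the question on the slide is no: CD→E can only fire once D is available, and nothing derivable from A yields D.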


Normalization

“There are two rules in life:

Rule #1: Don’t sweat the small stuff.

Rule #2: Everything is small stuff.”

(Finn Taylor)

Life is as complicated as we make it; normalization can be simplified.

Normal Forms

Returning to the issue of schema refinement, the first

question to ask is whether any refinement is needed!

If a relation is in a certain normal form (BCNF, 3NF

etc.), it is known that certain kinds of problems are

avoided/minimized. This can be used to help us

decide whether decomposing the relation will help.

Role of FDs in detecting redundancy:

Consider a relation R with 3 attributes, ABC.

• No FDs hold: There is no redundancy here.

• Given A→B: Several tuples could have the same A

value, and if so, they’ll all have the same B value!


Normalization

What is normalization?

In general, normalization removes

duplication and minimizes redundant chunks

of data.

The result is better organization and more

effective use of physical space, among other

factors.

Normalization is not always the best solution!


1NF: First Normal Form

Eliminate repeating fields. Furthermore, all

fields must contain a single value.

Define primary keys. All records must be

identified uniquely with a primary key. A

primary key is unique and thus no duplicate

values are allowed.

All fields other than the primary key must

depend on the primary key, either directly or

indirectly.


1NF: First Normal Form

Table in 0th

Normal Form!


1NF: First Normal Form

Apply 1NF: remove repeating fields by creating a new table where the original and new table are linked in a one-to-many relationship.

1NF: First Normal Form

Apply 1NF: assign primary keys.

2NF: Second Normal Form

The table must be in 1NF.

All non-key values must be fully functionally

dependent on the primary key. In other words,

non-key fields not completely and individually

dependent on the primary key are not allowed.

Partial dependencies must be removed. A

partial dependency is a special type of

functional dependency that exists when a field

is fully dependent on a part of a composite

primary key.


2NF: Second Normal Form

Full functional dependence

Given X→Y, Y depends on X and X alone.

Therefore, in XZ→Y, Y is not fully functionally dependent on XZ:

Y is only partially dependent on the composite key XZ.
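A partial dependency can be detected mechanically by checking whether a proper subset of the composite key already determines a non-key attribute. The sketch below assumes a single composite key and single-letter attribute names; `closure` and `partial_dependencies` are hypothetical helper names introduced for this example.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(key, fds):
    """List (subset, attribute) pairs where a proper subset of `key` determines a non-key attribute."""
    found = []
    for size in range(1, len(key)):  # proper, non-empty subsets only
        for subset in combinations(sorted(key), size):
            for attr in sorted(closure(subset, fds) - set(key)):
                found.append(("".join(subset), attr))
    return found


# XZ is the composite key, and X -> Y also holds on its own.
fds = [("XZ", "Y"), ("X", "Y")]
print(partial_dependencies("XZ", fds))  # [('X', 'Y')]
```

An empty result means every non-key attribute depends on the whole key, which is the 2NF condition under the single-key assumption made here.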


2NF: Second Normal Form

Partial Dependency

[Figure: a set of attributes X that is part of the key determines attribute A. Case 1: A is not in the key.]

Table in 1NF!


2NF: Second Normal Form

2NF performs a seemingly similar function to that of 1NF, but creates a table where repeating values, rather than repeating fields, are removed to a new table.

Typically, 2NF creates one-to-many relationships between static and dynamic data, removing static data from transactional tables into new tables.


Separate static data from dynamic data. Is this in 2NF already?

Connect the tables with the proper relationship (one-to-one or one-to-many). Is this in 2NF already?

All tables must have a primary key (1NF requirement!). Is this in 2NF already?

2NF: Second Normal Form

Is this in 2NF??


2NF: Second Normal Form

Is this in 2NF??

✕

✔


3NF: Third Normal Form

The table must be in 2NF.

Eliminate transitive dependencies.

A transitive dependency is where a field is

indirectly determined by the primary key

because that field is functionally dependent

on a second field, where that second field is

dependent on the primary key.

In basic terms, every field in a table that is

not a key field must be directly dependent

on the primary key.


3NF: Third Normal Form

Transitive Dependency

[Figure: the key determines attributes X, which in turn determine attribute A. Case 1: A is not in the key. Case 2: A is in the key.]

many-to-many relationships!

how to search for single task

assigned to a single employee?


3NF Transformation!

decomposition of entities involved in a many-to-many relationship

3NF Transformation!

amalgamating duplication

into a new table


3NF Transformation!

Transitive dependency removed.

A transitive dependency exists because it is assumed that:

1. each employee is assigned to a particular department

2. each department within a company is exclusively based in one specific city

CAUTION: too many tables will result in slower queries, which have to join too many tables.

3NF Transformation!

remove calculated fields


Third Normal Form (3NF)

Reln R with FDs F is in 3NF if, for all X→A in F+:

A∈X (called a trivial FD), or

X contains a key for R, or

A is part of some key for R.

Minimality of a key is crucial in third condition above!

If R is in 3NF, some redundancy is possible. It is a

compromise, used when BCNF not achievable.

Lossless-join, dependency-preserving decomposition of R into a

collection of 3NF relations always possible.


3NF: Third Normal Form

Is this in 2NF??

?


3NF: Third Normal Form

Is this in 2NF??

✔

all the nonkey attributes (B and C) are fully dependent on the

primary key (A)


3NF: Third Normal Form

Is this in 3NF??

?


3NF: Third Normal Form

Is this in 3NF??

✕

C, which is a nonkey attribute, is also functionally dependent on

B, which is also a nonkey attribute. Therefore, the relation R is

not in 3NF.


3NF: Third Normal Form

Is this in 3NF??

✕

✔


BCNF: Boyce-Codd Normal Form

The table must be in 3NF.

Every determinant must be a candidate key. (A table may still have more than one candidate key.)

A candidate key has potential for being a table’s primary key.

A table is not allowed more than one primary key: one candidate key is chosen as the primary key, and it is the one that referential integrity constraints refer to.

BCNF: Boyce-Codd Normal Form

BCNF is a slightly stricter version of 3NF.

BCNF requires that every determinant in a table is a candidate key.

If there is only one candidate key, 3NF and BCNF are the same.

Essentially, a table in 3NF can violate BCNF only when it has two or more overlapping candidate keys.

BCNF

Transformation!


Boyce-Codd Normal Form (BCNF)

Reln R with FDs F is in BCNF if, for all X→A in F+:

A∈X (called a trivial FD), or

X contains a key for R.

In other words, R is in BCNF if the only non-trivial

FDs that hold over R are key constraints.

No redundancy in R can be detected using FDs alone.

If we are shown two tuples that agree upon

the X value, we cannot infer the A value in

one tuple from the A value in the other.

If example relation is in BCNF, the 2 tuples

must be identical (since X is a key).
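The BCNF and 3NF conditions can be checked side by side with a small sketch. It inspects only the FDs given in F (with right sides split into single attributes) rather than all of F+, which is understood to be enough when testing the original relation R; that simplification, the brute-force key search, and the names `closure`, `candidate_keys` and `violations` are all assumptions of this illustration.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(schema, fds):
    """Brute force, smallest sets first: minimal attribute sets whose closure is the schema."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = set(combo)
            if closure(s, fds) >= set(schema) and not any(k <= s for k in keys):
                keys.append(s)
    return keys


def violations(schema, fds, form="BCNF"):
    """Return the pairs (X, A) from `fds` that violate BCNF (or 3NF) on `schema`."""
    prime = set().union(*candidate_keys(schema, fds))  # attributes in some key
    bad = []
    for lhs, rhs in fds:
        for a in sorted(set(rhs) - set(lhs)):        # skip trivial parts
            if closure(lhs, fds) >= set(schema):     # X contains a key
                continue
            if form == "3NF" and a in prime:         # A is part of some key
                continue
            bad.append((lhs, a))
    return bad


hourly = [("S", "SNLRWH"), ("R", "W")]
print(violations("SNLRWH", hourly, "3NF"))  # [('R', 'W')]
csz = [("CS", "Z"), ("Z", "C")]
print(violations("CSZ", csz, "BCNF"))       # [('Z', 'C')]
print(violations("CSZ", csz, "3NF"))        # []
```

The CSZ case shows the gap between the two forms: Z→C breaks BCNF because Z is not a key, yet 3NF tolerates it because C belongs to the key CS.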


BCNF: Boyce-Codd Normal Form

Is this in 2NF??

?


BCNF: Boyce-Codd Normal Form

Is this in 2NF??

✕

The R relation is not in 2NF because, like before, C is in a partial

dependence with the primary key.


BCNF: Boyce-Codd Normal Form

Is this in 2NF??

✕

?


BCNF: Boyce-Codd Normal Form

Is this in 2NF??

✕

✕

We can’t because we lose an FD, namely D → C. Therefore, we

need to find another decomposition.


BCNF: Boyce-Codd Normal Form

Is this in 2NF??

✕

?


BCNF: Boyce-Codd Normal Form

Is this in 2NF??

✕

✔

With this decomposition, no FDs are lost. And the resulting

relations are in 2NF.


BCNF: Boyce-Codd Normal Form

Is this in 3NF??

✕

?


BCNF: Boyce-Codd Normal Form

Is this in 3NF??

✕

✔

The resulting relations are not only in 2NF, but they are also in

3NF.


BCNF: Boyce-Codd Normal Form

Is this in BCNF??

?


BCNF: Boyce-Codd Normal Form

Is this in BCNF??

✕

B and D are determinants and are not candidate keys. Therefore,

the relation R2 is not in BCNF, while the relation R1 is.


BCNF: Boyce-Codd Normal Form

Is this in BCNF??

✕ ✔


BCNF: Boyce-Codd Normal Form

Normalized Form!

✔


Recap!

1st Normal Form (1NF): Eliminate repeating groups such that all records in all tables can be identified uniquely by a primary key in each table.

2nd Normal Form (2NF): All non-key values must be fully functionally dependent on the primary key. No partial dependencies are allowed.

3rd Normal Form (3NF): Eliminate transitive dependencies.

Boyce-Codd Normal Form (BCNF): Every determinant in a table is a candidate key.


Decomposition of a Relation Scheme

Suppose that relation R contains attributes A1 ... An.

A decomposition of R consists of replacing R by two or

more relations such that:

Each new relation scheme contains a subset of the attributes

of R (and no attributes that do not appear in R), and

Every attribute of R appears as an attribute of one of the

new relations.

Intuitively, decomposing R means we will store

instances of the relation schemes produced by the

decomposition, instead of instances of R.

E.g., Can decompose SNLRWH into SNLRH and RW.


Example Decomposition

Decompositions should be used only when needed.

SNLRWH has FDs S→SNLRWH and R→W

Second FD causes violation of 3NF; W values repeatedly

associated with R values.

Easiest way to fix this is to create a relation RW to store

these associations, and to remove W from the main schema:

• i.e., we decompose SNLRWH into SNLRH and RW


Problems with Decompositions

There are three potential problems to consider:

Some queries become more expensive.

• e.g., How much did employee Joe earn? (salary = W*H)

Given instances of the decomposed relations, we may not

be able to reconstruct the corresponding instance of the

original relation!

• Fortunately, not in the SNLRWH example.

Checking some dependencies may require joining the

instances of the decomposed relations.

• Fortunately, not in the SNLRWH example.

Tradeoff: Must consider these issues vs. redundancy.


Problems with Decompositions

Illustration of a Lossy Decomposition


Lossless Join Decompositions

Decomposition of R into X and Y is lossless-join w.r.t. a

set of FDs F if, for every instance r that satisfies F:

π_X(r) ⋈ π_Y(r) = r

It is always true that r ⊆ π_X(r) ⋈ π_Y(r)

In general, the other direction does not hold! If it does, the

decomposition is lossless-join.

Definition extended to decomposition into 3 or more

relations in a straightforward way.

It is essential that all decompositions used to deal with

redundancy be lossless! (Avoids Problem (2).)
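A lossy decomposition is easy to reproduce on a tiny instance. In the sketch below, tuples are modelled as frozensets of (attribute, value) pairs; the helper names `project` and `natural_join` and the two-row instance are assumptions invented for this illustration.

```python
def project(rows, attrs):
    """Projection with duplicate elimination; each tuple becomes a frozenset of (attr, value) pairs."""
    return {frozenset((a, row[a]) for a in attrs) for row in rows}


def natural_join(left, right):
    """Natural join of two sets of tuples represented as frozensets of (attr, value) pairs."""
    result = set()
    for t1 in left:
        for t2 in right:
            merged = dict(t1)
            # the tuples join only if they agree on every shared attribute
            if all(merged.get(a, v) == v for a, v in t2):
                merged.update(dict(t2))
                result.add(frozenset(merged.items()))
    return result


# Hypothetical instance of R(A, B, C); both tuples share the same B value.
inst = [{"A": 1, "B": 2, "C": 3}, {"A": 4, "B": 2, "C": 6}]
full = project(inst, "ABC")

lossy = natural_join(project(inst, "AB"), project(inst, "BC"))
print(full < lossy, len(lossy))  # True 4: the join invents two spurious tuples

ok = natural_join(project(inst, "AB"), project(inst, "AC"))
print(ok == full)                # True: joining on A recovers r exactly
```

The AB/BC split joins on B, which determines neither side, so the join produces extra tuples; the AB/AC split joins on A, which is unique in this instance.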


More on Lossless Join

The decomposition of R into X and Y is

lossless-join wrt F if and only if the closure of

F contains:

X∩Y → X, or

X∩Y → Y

in other words, the attributes common to X and Y

must contain a key for either X or Y

In particular, the decomposition of R into

R - Y and XY is lossless-join if X → Y holds

over R.
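The test above reduces to one attribute-closure computation on the common attributes. A minimal sketch, assuming single-letter attribute names and hypothetical helper names `closure` and `is_lossless`:

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(x, y, fds):
    """True if the common attributes of X and Y determine all of X or all of Y."""
    common_closure = closure(set(x) & set(y), fds)
    return common_closure >= set(x) or common_closure >= set(y)


print(is_lossless("SNLRH", "RW", [("S", "SNLRWH"), ("R", "W")]))  # True: R determines RW
print(is_lossless("AB", "BC", [("A", "B")]))                      # False: B determines neither side
```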


Dependency Preserving Decomposition

Consider CSJDPQV, C is key, JP→C and SD→P.

BCNF decomposition: CSJDQV and SDP

Problem: Checking JP→C requires a join!

This is NOT a dependency-preserving decomposition!!

Dependency preserving decomposition (Intuitive):

If R is decomposed into X, Y and Z, and we enforce the FDs

that hold on X, on Y and on Z, then all FDs that were given

to hold on R must also hold. (Avoids Problem (3).)

A dependency preserving decomposition allows us to

enforce all FDs by examining a single relation instance on

each insertion and update.


Dependency Preserving Decomposition

Projection of set of FDs F: Suppose R is decomposed into X and Y, and let F be a set of FDs over R. The projection of F on X (denoted F_X) is the set of FDs in the closure F+ (not just F) that involve only attributes of X.

U→V is in F_X iff U→V is in F+ and U, V are in X.

Decomposition of R into X and Y is dependency preserving if (F_X ∪ F_Y)+ = F+
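Whether a single FD survives a decomposition can be tested without materializing F+ or the projections. The sketch below uses a well-known iterative test, not something stated on the slides: start from the left side and repeatedly add whatever each sub-schema can derive from the attributes it already sees. The names `closure` and `is_preserved` are assumptions of this illustration.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, parts, fds):
    """Test whether lhs -> rhs follows from the projections of `fds` onto `parts`."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            p = set(part)
            # attributes of this part derivable from what the part already sees
            gained = (closure(result & p, fds) & p) - result
            if gained:
                result |= gained
                changed = True
    return set(rhs) <= result


F = [("A", "B"), ("B", "C"), ("C", "A")]
print(is_preserved("C", "A", ["AB", "BC"], F))                    # True
contracts = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
print(is_preserved("JP", "C", ["SDP", "CSJDQV"], contracts))      # False
```

A decomposition is dependency preserving when this test succeeds for every FD in F.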


Dependency Preserving Decompositions

(Contd.)

Important to consider F +, not F, in this definition:

Consider relation R with attributes ABC

decomposed into AB and BC

FDs of R (F) : A→B, B→C, C→A

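A→B is in F_AB, B→C is in F_BC

Is this dependency preserving? Is C→A preserved? Yes: F+ also contains C→B and B→A, so C→B is in F_BC and B→A is in F_AB, and together they imply C→A.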

Dependency preserving does not imply lossless join:

ABC, A→B, decomposed into AB and BC.

And vice-versa! (Example?)


Decomposition into BCNF

Consider relation R with FDs F. If X→Y violates

BCNF, decompose R into R - Y and XY.

Repeated application of this idea will give us a collection of

relations that are in BCNF; lossless join decomposition, and

guaranteed to terminate.

e.g., CSJDPQV, key C, JP→C, SD→P, J→S

To deal with SD→P, decompose into SDP, CSJDQV.

To deal with J→S, decompose CSJDQV into JS and CJDQV
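The repeated splitting into XY and R - Y can be sketched recursively. This version searches for violations by brute force over attribute subsets (exponential, so only suitable for small schemas) and takes Y to be everything X determines inside the current relation; both choices, and the names `closure`, `find_violation` and `bcnf_decompose`, are assumptions of this illustration rather than the slides' procedure.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def find_violation(rel, fds):
    """Return (X, Y) for a nontrivial X -> Y over `rel` where X is not a superkey of `rel`, else None."""
    for size in range(1, len(rel)):
        for combo in combinations(sorted(rel), size):
            x = set(combo)
            determined = closure(x, fds) & rel  # attributes of rel implied by X
            if determined != rel and determined - x:
                return x, determined - x
    return None


def bcnf_decompose(rel, fds):
    """Split `rel` into XY and rel - Y until no BCNF violation remains."""
    rel = set(rel)
    hit = find_violation(rel, fds)
    if hit is None:
        return [rel]
    x, y = hit
    return bcnf_decompose(x | y, fds) + bcnf_decompose(rel - y, fds)


fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P"), ("J", "S")]
parts = bcnf_decompose("CSJDPQV", fds)
print(["".join(sorted(p)) for p in parts])  # ['JS', 'CDJPQV']
```

Because the search happens to find J→S first, the result is JS and CJDPQV rather than SDP, JS and CJDQV; a different search order can give a different BCNF decomposition.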


Decomposition into BCNF

Decomposition of CSJDPQV


Decomposition into BCNF

The decomposition of CSJDPQV into SDP, JS, and

CJDQV is not dependency-preserving. JP→C cannot

be enforced without a join.

One way to deal with this situation is to add a relation

with attributes CJP. In effect, this solution amounts to

storing some information redundantly in order to

make the dependency enforcement cheaper.

Problem: Redundancy across relations!!


Decomposition into BCNF

Suppose that we choose to decompose CSJDPQV

into JS and CJDPQV instead (choose J→S first!).

The only dependencies that hold over CJDPQV are

JP→C and the key dependency C→CJDPQV. Since JP

is a key, CJDPQV is in BCNF.

Thus, the schemas JS and CJDPQV represent a

lossless-join decomposition of Contracts into BCNF

relations too!

designer must discriminate among alternatives!


BCNF and Dependency Preservation

In general, there may not be a dependency preserving

decomposition into BCNF.

e.g., CSZ, CS→Z, Z→C

Can’t decompose while preserving 1st FD; not in BCNF.

Similarly, decomposition of CSJDPQV into SDP, JS and

CJDQV is not dependency preserving (w.r.t. the FDs

JP→C, SD→P and J→S).

However, it is a lossless join decomposition.


Decomposition into 3NF

Clearly, the algorithm for lossless join decomp into

BCNF can be used to obtain a lossless join decomp

into 3NF (can stop earlier).

Refinement: Instead of the given set of FDs F, use a

minimal cover for F.

the resulting decomposition will be lossless join and

dependency preserving!


Minimal Cover for a Set of FDs

Minimal cover G for a set of FDs F:

Closure of F = closure of G.

Right hand side of each FD in G is a single attribute.

If we modify G by deleting an FD or by deleting attributes

from an FD in G, the closure changes.

Intuitively, every FD in G is needed, and “as small as possible” in order to get the same closure as F.

e.g., A→B, ABCD→E, EF→GH, ACDF→EG has the

following minimal cover:

A→B, ACD→E, EF→G and EF→H


Minimal Cover for a Set of FDs

Given (F):

A → B, ABCD → E, EF → G, EF → H, ACDF → EG

Rewrite ACDF → EG as:

ACDF → E and ACDF → G

Delete ACDF → G because it is implied by the following FDs:

A → B, ABCD → E, EF → G

Delete ACDF → E (implied by A → B and ABCD → E), and so on.

Minimal Cover:

A → B, ACD → E, EF → G, and EF → H


Minimal Cover for a Set of FDs

Algorithm for getting the MC (G):

Put the FDs in a standard form: Obtain a collection G of

equivalent FDs with a single attribute on the right side

(using the decomposition axioms).

Minimize the left side of each FD: For each FD in G, check

each attribute in the left side to see if it can be deleted while

preserving equivalence to F+.

Delete redundant FDs: Check each remaining FD in G to

see if it can be deleted while preserving equivalence to F +.

Note that we could produce different minimal covers for a given set of FDs depending on which FD was considered first.
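The three steps above can be sketched as follows. The function names `closure` and `minimal_cover`, the string encoding of attribute sets, and the fixed (alphabetical) order in which attributes and FDs are tried are assumptions of this illustration; a different order may yield a different, equally valid cover.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def minimal_cover(fds):
    # Step 1: standard form, one attribute on each right side.
    g = []
    for lhs, rhs in fds:
        for a in rhs:
            if (frozenset(lhs), a) not in g:
                g.append((frozenset(lhs), a))
    # Step 2: minimize each left side, one attribute at a time.
    for i, (lhs, a) in enumerate(g):
        for attr in sorted(lhs):
            reduced = lhs - {attr}
            if a in closure(reduced, g):  # attr is not needed to derive a
                lhs = reduced
                g[i] = (lhs, a)
    g = list(dict.fromkeys(g))  # drop duplicates created by step 2
    # Step 3: delete FDs that the remaining ones already imply.
    for fd in list(g):
        rest = [other for other in g if other != fd]
        if fd[1] in closure(fd[0], rest):
            g = rest
    return g


def show(cover):
    """Sorted, printable form of a cover: (left side as string, right attribute)."""
    return sorted(("".join(sorted(lhs)), a) for lhs, a in cover)


F = [("A", "B"), ("ABCD", "E"), ("EF", "GH"), ("ACDF", "EG")]
print(show(minimal_cover(F)))
# [('A', 'B'), ('ACD', 'E'), ('EF', 'G'), ('EF', 'H')]
```

Running step 2 before step 3 matters: deleting redundant FDs first can leave a cover whose left sides are still reducible.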


Minimal Cover for a Set of FDs

What is the minimal cover of:

ABCD → E, E → D, A → B, and AC → D

FDs are already in standard form.

Minimize the left side of each FD:

AC → E, E → D, A → B, and AC → D

Delete redundant FDs:

AC → E, E → D implies AC → D

Minimal Cover:

AC → E, E → D, and A → B


Decomposition into 3NF

Refined algorithm for a 3NF decomposition that is lossless-join and dependency preserving:

Consider relation R with FDs G that is a minimal cover. If

X→Y violates 3NF, decompose R into R - Y and XY.

For each FD X→A in G that is not preserved, create a

relation schema XA and add it to the decomposition of R
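The second step, adding a schema XA for each FD that is not preserved, can be sketched on the Contracts example. The cover below was worked out by hand for this illustration and is an assumption, as are the helper names `closure`, `is_preserved` and `add_missing`; the preservation test is the usual iterative one, not something the slides spell out.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_preserved(lhs, rhs, parts, fds):
    """Test whether lhs -> rhs follows from the projections of `fds` onto `parts`."""
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            gained = (closure(result & part, fds) & part) - result
            if gained:
                result |= gained
                changed = True
    return set(rhs) <= result


def add_missing(parts, cover):
    """For each FD X -> A in `cover` that is not preserved, add the schema XA."""
    parts = [set(p) for p in parts]
    for lhs, a in cover:
        if not is_preserved(lhs, a, parts, cover):
            parts.append(set(lhs) | {a})
    return parts


# Assumed minimal cover for Contracts (key C, JP -> C, SD -> P, J -> S).
cover = [("JP", "C"), ("SD", "P"), ("J", "S"),
         ("C", "J"), ("C", "D"), ("C", "Q"), ("C", "V")]
result = add_missing(["SDP", "JS", "CJDQV"], cover)
print(["".join(sorted(p)) for p in result])  # ['DPS', 'JS', 'CDJQV', 'CJP']
```

Only JP→C fails the preservation test, so the single schema CJP is added to the decomposition.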


Decomposition into 3NF

Consider the Contracts relation with attributes

CSJDPQV and FDs JP → C, SD → P, and J → S.

If we decompose CSJDPQV into SDP and CSJDQV, then

SDP is in BCNF, but CSJDQV is not even in 3NF.

So we decompose it further into JS and CJDQV.

The relation schemas SDP, JS, and CJDQV are in 3NF (in

fact, in BCNF), and the decomposition is lossless-join.

However, the dependency JP→ C is not preserved.

This problem can be addressed by adding a relation schema

CJP to the decomposition.


3NF vs BCNF

It is always possible to decompose a relation into

relations in 3NF and

The decomposition is lossless

Dependencies are preserved

It is always possible to decompose a relation into

relations in BCNF and

The decomposition is lossless

It may not be possible to preserve dependencies

But a schema that is in 3NF but not in BCNF may

contain redundancy


Refining an ER Diagram

1st diagram translated:

Workers(S,N,L,D,C)

Departments(D,M,B)

Lots associated with workers.

Suppose all workers in a dept are assigned the same lot: D→L

Redundancy; fixed by:

Workers2(S,N,D,C)

Dept_Lots(D,L)

Can fine-tune this:

Workers2(S,N,D,C)

Departments(D,M,B,L)

[Figure, before: Employees (ssn, name, lot), Works_In (since), Departments (did, dname, budget)]

[Figure, after: Employees (ssn, name), Works_In (since), Departments (did, dname, budget, lot)]


Summary of Schema Refinement

If a relation is in BCNF, it is free of redundancies that

can be detected using FDs. Thus, trying to ensure

that all relations are in BCNF is a good heuristic.

If a relation is not in BCNF, we can try to decompose

it into a collection of BCNF relations.

Must consider whether all FDs are preserved. If a lossless-

join, dependency preserving decomposition into BCNF is

not possible (or unsuitable, given typical queries), should

consider decomposition into 3NF.

Decompositions should be carried out and/or re-examined

while keeping performance requirements in mind.
