You are on page 1of 4

Normalization

• Improve the schema by decomposing relations


and removing anomalies.
CS 235: • Boyce-Codd Normal Form (BCNF): all FD’s
Introduction to Databases follow from the fact key → everything.
• Formally, R is in BCNF if every nontrivial FD for
Svetlozar Nestorov R, say X → A, has X a superkey.
– “Nontrivial” = right-side attribute not in left side.

Lecture Notes #6

BCNF properties Example


1. Guarantees no redundancy due to FD’s. Beers(name, manf, manfAddr).
2. Guarantees no update anomalies = one • FD’s:
– name → manf,
occurrence of a fact is updated, not all.
– manf → manfAddr.
3. Guarantees no deletion anomalies = • Only key is name.
valid fact is lost when tuple is deleted.
• manf → manfAddr violates BCNF with a left
side unrelated to any key.

Decomposition into BCNF Proof


• Setting: relation R, given FD’s F. Suppose • Let Y → A is a BCNF violation and follows
relation R has BCNF violation X → B. from F
• We need only look among FD’s of F for a • Then the computation of Y+ used at least
BCNF violation. one FD X → B from F.
• If there are no violations, then the relation • X must be a subset of Y.
is in BCNF. • Thus, if Y is not a superkey, X cannot be a
• Don’t we have to consider implied FD’s? superkey either, and X → B is also a
• No, because… BCNF violation.

1
Decomposition Algorithm (1/2) Decomposition Algorithm (2/2)
For every violation X → B among given 3. Find the FD’s for the decomposed
FD’s: relations.
1. Compute X+. – Project the FD’s from F = calculate all
ƒ Cannot be all attributes – why? consequents of F that involve only attributes
from X+ or only from (R−X+) ∪ X.
2. Decompose R into X+ and (R–X+) ∪ X.

R X
X+

Example (1/3) Example (2/3)


R = Drinkers(name, addr, beersLiked, manf, favoriteBeer)
F D’s: • BCNF violations?
• name → addr – For Drinkers1, name is key and all left sides
• name → favoriteBeer
of FD’s are superkeys.
• beersLiked → manf
Pick BCNF violation name → addr. – For Drinkers2, {name, beersLiked} is the key,
• Close the left side: name + = name addr favoriteBeer. and beersLiked → manf violates BCNF.
• Decomposed relations:
Drinkers1(name, addr, favoriteBeer)
Drinkers2(name, beersLiked, manf)
• Projected FD’s (skipping a lot of work):
– For Drinkers1: name → addr and name → favoriteBeer.
– For Drinkers2: beersLiked → manf.

Third Normal Form (3NF)


Example (3/3) • Sometimes we have a dilemma:
– If you decompose, you can’t check the FD’s in the
decomposed relations.
• Decompose Drinkers2
– If you don’t decompose, you violate BCNF.
• Close beersLiked + = beersLiked, manf. • Abstractly: AB → C and C → B.
• Decompose: • In book: title city → theatre and theatre →
Drinkers3(beersLiked, manf) city.
• Another example: street city → zip,
Drinkers4(name, beersLiked)
zip → city.
• Resulting relations are all in BCNF: • Keys: AB and AC, but C → B has a left side
Drinkers1(name, addr, favoriteBeer) not a superkey.
Drinkers3(beersLiked, manf) • Suggests decomposition into BC and AC.
– But you can’t check the FD AB → C in these
Drinkers4(name, beersLiked) relations.

2
Example “Elegant” Workaround
• What can go wrong if we decompose: • Define the problem away.
street zip
A = street, • A relation R is in 3NF iff for every nontrivial
545 Tech Sq. 02138
B = city, 545 Tech Sq. 02139 FD X → A, either:
C = zip. city zip 1. X is a superkey, or
Cambridge 02138
Cambridge 02139 2. A is prime = member of at least one
key.
city s treet zip
• Thus, the canonical problem goes away:
Cambridge 545 Tech Sq. 02138 you don’t have to decompose because all
Join: Cambridge 545 Tech Sq. 02139 attributes are prime.

Decomposition Properties 3NF vs. BCNF


1. We should be able to recover from the • Without proof, we assert that it is always
decomposed relations the data of the possible to decompose into BCNF and
original. satisfy (1).
– Recovery involves projection and join (next • Also without proof, we can decompose
time). into 3NF and satisfy both (1) and (2).
2. We should be able to check that the FD’s • But it is not possible to decompose into
for the original relation are satisfied by BCNF and get both (1) and (2).
checking the projections of those FD’s in – Street-city-zip is an example of this point.
the decomposed relations.

Multivalued Dependencies Example


• The multivalued dependency X →→ Y • Drinkers(name, addr, phones, beersLiked)
holds in a relation R if whenever we have • MVD name →→ phones.
two tuples of R that agree in all the • If Drinkers has the two tuples:
attributes of X, then we can swap their Y name addr phones beersLiked
components and get two new tuples that sue a p1 b1
sue a p2 b2
are also in R.
it must also have the same tuples with phones
components swapped:
name addr phones beersLiked
sue a p1 b2
sue a p2 b1

3
MVD Rules Splitting Doesn’t Hold
• Every FD is an MVD: if X →Y, then • Sometimes you need to have several attributes on
swapping Y’s between tuples that agree the right of an MVD.
on X doesn’t create new tuples. • For example: Drinkers(name, areaCode, phones,
beersLiked, beerManf)
• Example, in Drinkers: name →→ addr.
name areaCode phones beersLiked beerManf
• Complementation: if X →→ Y, then Leo 773 555-1111 Bud A.B.
Leo 773 555-1111 Honkers G.I.
X →→ Z, where Z is all attributes not in X Leo 800 555-9999 Bud A.B.
or Y. Leo 800 555-9999 Honkers G.I.
• Example: since name →→ phones • name →→ areaCode phones holds, but neither
holds in Drinkers, so does name →→ areaCode nor name →→ phones do.
name →→ addr beersLiked.

Fourth Normal Form (4NF)


• Eliminate redundancy due to multiplicative effect
of MVD's.
• Roughly: treat MVD's as FD's for decomposition,
but not for finding keys.
• Formally: R is in Fourth Normal Form if
whenever MVD X →→ Y is nontrivial (Y is not a
subset of X, and X ∪ Y is not all attributes), then
X is a superkey.
– Remember, X → Y implies X →→ Y, so 4NF is more
stringent than BCNF.
• Decompose R, using 4NF violation X →→ Y,
into XY and X ∪ (R—Y).

You might also like