I am a pig.

Then, you must have 4 legs!

Decomposition and Functional Dependency
Outline

 Redundancy
 Decomposition at first glance
 Functional dependency
 Dependency properties

Why?
 We have learnt that an ER diagram can be directly converted to relational tables.

 However, these tables may contain redundancy, namely, repetition of the same information.

 To eliminate redundancy (as much as possible), we need to refine the relational tables (a process known as normalization in database terminology).

An example of redundancy
[Figure: ER diagram of the entity employee with attributes id, rating, and hourly-wages.]

An example of redundancy

 Negative impacts of redundancy
 Higher space consumption.
 Higher update overhead.
• Imagine the operations we need to do if we raise the hourly-wage of B1 to 120.
 Insertion/update anomaly.
• The DBA must prevent the insertion of tuples that result in inconsistent hourly-wages, e.g., (7, B1, 200).
 Solution?
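
The update overhead can be made concrete with a small sketch. It assumes a hypothetical EMPLOYEE instance: only "B1 earns 100" and the raise to 120 come from the slides, while the ids, the rating A2, and its wage are made up for illustration. The sketch counts how many tuples change when B1's wage is raised, first in the single table and then when the wage is stored once per rating.

```python
# Hypothetical EMPLOYEE instance: (id, rating, hourly_wage).
# Only "B1 earns 100" is taken from the slides; the rest is illustrative.
employee = [
    (1, "B1", 100),
    (2, "B1", 100),
    (3, "A2", 150),
    (4, "B1", 100),
]

def raise_wage_single_table(rows, rating, new_wage):
    """Return the updated rows and the number of tuples that had to change."""
    updated = [(i, r, new_wage if r == rating else w) for i, r, w in rows]
    touched = sum(1 for old, new in zip(rows, updated) if old != new)
    return updated, touched

# The same information split in two: id -> rating and rating -> hourly wage.
emp_rating = {i: r for i, r, _ in employee}
rating_wage = {r: w for _, r, w in employee}

_, touched_single = raise_wage_single_table(employee, "B1", 120)

rating_wage["B1"] = 120   # in the split design, one entry changes
touched_split = 1

print(touched_single, touched_split)  # prints: 3 1
```

In the split design, a tuple such as (7, B1, 200) can no longer introduce a second wage for B1, because the wage is stored once per rating.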
An example of redundancy (cont.)

An example of redundancy (cont.)
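[Figure: revised ER diagram in which employee is linked to salary by the relationship “have”; attributes shown are id, rating, and hourly-wages.]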

 In fact, the two tables could have been obtained directly if we had designed a “perfect ER diagram”.

 It is not realistic to assume that we can always discover the perfect ER diagram.

 We need a tool to refine our design, even if we started from an imperfect diagram.

Outline

 Redundancy
 Decomposition at first glance
 Functional dependency
 Dependency properties

Basic questions to ask
 Do we need to decompose a relation?

 What problems (if any) does a given decomposition cause? Two properties to check:
 Lossless join
 Dependency preservation

Decomposition
 Just now, we decomposed EMPLOYEE into two tables to avoid redundancy.

 Note that the new tables can reproduce the original EMPLOYEE.

 This is a rule we must obey: joining the decomposed tables must reproduce the original table exactly. Such a decomposition is called a lossless join.

Illegal decomposition

 This decomposition violates the rule mentioned earlier.

 How do we judge whether a decomposition is legal?

Another illegal decomposition
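 The judgment is not that easy! Is this a lossless join? No!

Original table:
A B C
---------
1 2 3
2 2 2
3 3 1

Decomposed table 1:
A B
-----
1 2
2 2
3 3

Decomposed table 2:
B C
-----
2 3
2 2
3 1

Join of the two decomposed tables (contains tuples not in the original):
A B C
---------
1 2 3
1 2 2
2 2 3
2 2 2
3 3 1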
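
The failed join can be reproduced with a short sketch. It uses the three-tuple table from this slide; the helper names `project` and `natural_join` are hypothetical, and rows are modelled as plain dictionaries rather than real database tables.

```python
def project(rows, attrs):
    """Project a list of dict-rows onto attrs, dropping duplicate tuples."""
    seen = []
    for row in rows:
        p = {a: row[a] for a in attrs}
        if p not in seen:
            seen.append(p)
    return seen

def natural_join(left, right):
    """Join two non-empty lists of dict-rows on all attributes they share."""
    common = set(left[0]) & set(right[0])
    out = []
    for l in left:
        for r in right:
            if all(l[a] == r[a] for a in common):
                merged = {**l, **r}
                if merged not in out:
                    out.append(merged)
    return out

original = [
    {"A": 1, "B": 2, "C": 3},
    {"A": 2, "B": 2, "C": 2},
    {"A": 3, "B": 3, "C": 1},
]
ab = project(original, ["A", "B"])
bc = project(original, ["B", "C"])

rejoined = natural_join(ab, bc)
spurious = [t for t in rejoined if t not in original]
print(len(rejoined))  # prints: 5
print(spurious)       # prints: [{'A': 1, 'B': 2, 'C': 2}, {'A': 2, 'B': 2, 'C': 3}]
```

The join returns five tuples instead of three, so the original table cannot be recovered from the two projections.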

Legal decomposition

 Checking the “legitimacy” of a decomposition:
 The new tables must have common attribute(s).
 The common attribute(s) must be a candidate key of at least one new table.
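
The two conditions can be approximated on concrete data. The sketch below is only an instance-level check: being a candidate key is a property of the schema, so distinct values in one sample table are evidence, not proof. The function name and the EMPLOYEE rows are hypothetical; the AB and BC rows are those of the illegal decomposition.

```python
def common_attrs_identify_a_table(t1, t2):
    """Instance-level check of the two conditions for a binary decomposition.

    t1, t2: non-empty lists of dict-rows (the two new tables).
    Returns True if the tables share attributes and the shared attributes
    take distinct values in every tuple of at least one table.
    """
    common = sorted(set(t1[0]) & set(t2[0]))
    if not common:
        return False

    def unique_on(rows):
        keys = [tuple(row[a] for a in common) for row in rows]
        return len(keys) == len(set(keys))

    return unique_on(t1) or unique_on(t2)

# Tables of the illegal decomposition: B = 2 repeats in both of them.
ab = [{"A": 1, "B": 2}, {"A": 2, "B": 2}, {"A": 3, "B": 3}]
bc = [{"B": 2, "C": 3}, {"B": 2, "C": 2}, {"B": 3, "C": 1}]

# Hypothetical EMPLOYEE split: rating is unique in the wage table.
emp_rating = [
    {"id": 1, "rating": "B1"},
    {"id": 2, "rating": "B1"},
    {"id": 3, "rating": "A2"},
]
rating_wage = [
    {"rating": "B1", "hourly_wages": 100},
    {"rating": "A2", "hourly_wages": 150},
]

print(common_attrs_identify_a_table(ab, bc))                   # prints: False
print(common_attrs_identify_a_table(emp_rating, rating_wage))  # prints: True
```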

Some confusing notions
(Better definitions come later.)

 Key?
An abbreviation of candidate key.

 Candidate key?
A minimal set of attributes that uniquely identifies every tuple.

 Primary key?
A candidate key selected by the database designer.

 Superkey?
Any superset of a candidate key.

Decomposition may not be obvious

 How can we decompose the above table to minimize redundancy?

 Before we can answer the question, we need a better understanding of redundancy.

Outline

 Redundancy
 Decomposition at first glance
 Functional dependency
 Dependency properties

Why does redundancy exist?

 Reason:
 rating determines hourly-wages.

 Once a tuple’s rating is known, its hourly-wages is also decided.

 A concise representation: rating → hourly-wages.


Functional dependency

 rating → hourly-wages
 is called a functional dependency (FD).

 Do we have “rating → id”?
 If a tuple’s rating is known, are we sure about its id?
 No.

FD (cont.)
 Do we have hourly-wages → rating?
 Yes, different ratings have different hourly-wages.
 Namely, if we know a tuple’s hourly-wages, then its rating has only one possibility.

 Do we have id → rating?
 Yes, because each employee has only a single rating.

FD (cont.)
 Do we have id → id?
 Of course; this is known as a trivial FD.

 Do we have id → (rating, hourly-wages)?
 Yes, because each employee has only a single (rating, hourly-wages)-combination.

My id = 1.

You are at scale B1 and earn $100 per hour.

FD (cont.)

 Do we have (id, rating) → hourly-wages?

 In English, if a tuple’s (id, rating)-combination is decided, how many possibilities are there for hourly-wages?

 In fact, once (id, rating) is decided, we know exactly which employee is concerned.
 Therefore, hourly-wages has only one possibility.
 So “(id, rating) → hourly-wages” is true.
Functional dependency definition
 Let L and R be two sets of attributes.
 L → R means that
 if we know a tuple’s L, then there is only a single possibility for the tuple’s R!

 I.e., if we know L, we know R. Equivalently, any two tuples that agree on L must also agree on R.

 Examples:
 rating → hourly-wages
 hourly-wages → rating
 id → rating
 id → id
 id → (rating, hourly-wages)
 (id, rating) → hourly-wages
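
The definition translates directly into a check on a table instance. The sketch below uses a hypothetical helper `fd_holds` and a made-up three-row EMPLOYEE instance. Note that an FD is a constraint on every legal instance, so one sample table can refute an FD but cannot prove it.

```python
def fd_holds(rows, lhs, rhs):
    """True if, in this instance, tuples that agree on lhs also agree on rhs."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value.
        if seen.setdefault(left, right) != right:
            return False
    return True

# Hypothetical EMPLOYEE instance (values are illustrative).
employee = [
    {"id": 1, "rating": "B1", "hourly_wages": 100},
    {"id": 2, "rating": "B1", "hourly_wages": 100},
    {"id": 3, "rating": "A2", "hourly_wages": 150},
]

print(fd_holds(employee, ["rating"], ["hourly_wages"]))        # prints: True
print(fd_holds(employee, ["rating"], ["id"]))                  # prints: False
print(fd_holds(employee, ["id"], ["rating", "hourly_wages"]))  # prints: True
```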
Secret of redundancy
 In general, a table has redundancy if there is a non-trivial FD whose left-hand side does not contain a candidate key.

 For example, the only candidate key of EMPLOYEE is id.
 EMPLOYEE has redundancy because we have rating → hourly-wages, and rating does not contain id.

Where are FDs from?
 Two channels.
 First, common sense.
 HK-id → name.
 country → capital.
 (father, mother) → eldest-child.
 …

 Second, special constraints of the underlying application.
 If every employee has her/his own office
• emp-id → office-number.
 If every customer can have only a single account
• cust-id → acc-id.

Outline

 Redundancy
 Decomposition at first glance
 Functional dependency
 Dependency properties

A candidate key determines all
 For example, a candidate key of EMPLOYEE is id.
 Thus, id determines any combination of the attributes.

 id → id
 id → rating
 id → hourly-wages
 id → (rating, hourly-wages)
 id → (id, rating, hourly-wages)

 If we know a tuple’s id, then each of its attributes has only one possibility.

A superkey determines all
 A candidate key is id.
 Then, (id, rating) determines any combination of the attributes.

 (id, rating) → id
 (id, rating) → rating
 (id, rating) → hourly-wages
 (id, rating) → (rating, hourly-wages)
 (id, rating) → (id, rating, hourly-wages)

 We only need id to claim that each attribute of the tuple has only one possibility.

 So, of course, given its (id, rating), we can make the same claim.
Trivial functional dependencies
 “id → id” is trivially true.

 Put it in English, and you will find out.

 “If we know a tuple’s id = 1, then we know its id.”

I have 4 legs. Guess how many legs I have.

Don’t waste my time.


Trivial functional dependencies (cont.)
 L → R is trivial if L contains R.

 Examples:
 (id, rating) → id
 (id, rating) → rating
 (id, rating) → (id, rating)
 (id, hourly-wages) → id
 (id, rating, hourly-wages) → (id, rating)
 …

Inference rules for FDs
 Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold.

 One example:
id → rating
rating → hourly-wages

What can we derive from these two FDs?

id → hourly-wages

 More rules to come on the next page.

Union

 Given
 cust-id → cust-name (1)
 cust-id → cust-city (2)
 we can derive
 cust-id → (cust-name, cust-city) (3)

 Reasoning:
 By (1), if we know cust-id, then we know cust-name.
 By (2), if we know cust-id, then we know cust-city.
 Hence, if we know cust-id, then we know the (cust-name, cust-city)-combination.
Transitivity

 Given
 creditcard-no → cust-id (1)
 cust-id → cust-name (2)
 we can derive
 creditcard-no → cust-name (3)

 Reasoning:
 By (1), if we know creditcard-no, then we know cust-id.
 By (2), if we know cust-id, then we know cust-name.
 Hence, if we know creditcard-no, then we know cust-name.

Augmentation

 Given
 creditcard-no → cust-id (1)
 we can derive
 (creditcard-no, branch-id) → (cust-id, branch-id) (2)

 Reasoning:
 By (1), if we know creditcard-no 40101342, we know cust-id 1.
 Hence, if (40101342, B1) = (creditcard-no, branch-id) of a tuple, we know that (1, B1) = (cust-id, branch-id) of the tuple.

FD derivation

 Given
 creditcard-no → cust-id (1)
 (cust-id, branch-id) → acc-id (2)
 we can derive (creditcard-no, branch-id) → acc-id as follows.

 From (1), by augmentation, we have
 (creditcard-no, branch-id) → (cust-id, branch-id) (3)

 From (3) and (2), by transitivity, we have
 (creditcard-no, branch-id) → acc-id
Summary of Inference Rules
 Let R be a relation schema, and let W, X, Y, Z be subsets of R.
 Reflexivity
 If Y ⊆ X, then X → Y (trivial FDs)
 Augmentation
 If X → Y, then XZ → YZ, for every Z
 Transitivity
 If X → Y and Y → Z, then X → Z
 Union (Combining) Rule
 If X → Y and X → Z, then X → YZ
 Decomposition (Splitting) Rule
 If X → YZ, then X → Y and X → Z
 Pseudo-transitivity Rule
 If X → Y and WY → Z, then XW → Z
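
The last three rules need not be taken on faith: each can be derived from reflexivity, augmentation, and transitivity. A short derivation sketch in LaTeX, using the same symbols as the summary:

```latex
\begin{align*}
\textbf{Union:}\quad
  & X \to Y \;\Rightarrow\; X \to XY
      && \text{(augment with } X\text{; } XX = X\text{)}\\
  & X \to Z \;\Rightarrow\; XY \to YZ
      && \text{(augment with } Y\text{)}\\
  & X \to XY,\; XY \to YZ \;\Rightarrow\; X \to YZ
      && \text{(transitivity)}\\[6pt]
\textbf{Decomposition:}\quad
  & YZ \to Y,\quad YZ \to Z
      && \text{(reflexivity)}\\
  & X \to YZ,\; YZ \to Y \;\Rightarrow\; X \to Y
      && \text{(transitivity)}\\
  & X \to YZ,\; YZ \to Z \;\Rightarrow\; X \to Z
      && \text{(transitivity)}\\[6pt]
\textbf{Pseudo-transitivity:}\quad
  & X \to Y \;\Rightarrow\; XW \to WY
      && \text{(augment with } W\text{)}\\
  & XW \to WY,\; WY \to Z \;\Rightarrow\; XW \to Z
      && \text{(transitivity)}
\end{align*}
```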
Prove FDs
 Consider R(A, B, C, D, E) with FDs F = {A → B, B → D, DE → C}.
 Prove or disprove F |= AE → C.

{A → B, B → D, DE → C}
|= {A → D, DE → C} (Transitivity rule: if X → Y and Y → Z, then X → Z)
|= {AE → C} (Pseudo-transitivity rule: if X → Y and WY → Z, then XW → Z)

Disprove FDs
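 Consider R(A, B, C, D, E) with FDs F = {A → B, B → D, DE → C}.
 Prove or disprove F |= A → C.

Find a counterexample: an instance that satisfies every FD in F but contains two tuples that agree on A and differ on C.

A B C D E
---------------------------
2 4 3 1 3
2 4 6 1 2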
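
A candidate counterexample can be checked mechanically. The sketch below uses illustrative values: a two-tuple instance whose tuples agree on A, B, and D but differ on C and E. It tests each FD in F and then the target A → C. The helper name `satisfies` is hypothetical.

```python
from itertools import combinations

def satisfies(rows, lhs, rhs):
    """True if every pair of tuples that agrees on lhs also agrees on rhs."""
    for t1, t2 in combinations(rows, 2):
        agree_lhs = all(t1[a] == t2[a] for a in lhs)
        agree_rhs = all(t1[a] == t2[a] for a in rhs)
        if agree_lhs and not agree_rhs:
            return False
    return True

# Each FD is a (lhs, rhs) pair; a string lists single-letter attribute names.
F = [("A", "B"), ("B", "D"), ("DE", "C")]

# Illustrative instance: same A, B, D; different C and E.
instance = [
    dict(zip("ABCDE", (2, 4, 3, 1, 3))),
    dict(zip("ABCDE", (2, 4, 6, 1, 2))),
]

holds_F = all(satisfies(instance, lhs, rhs) for lhs, rhs in F)
holds_target = satisfies(instance, "A", "C")
print(holds_F, holds_target)  # prints: True False
```

Since the instance satisfies F but violates A → C, F does not imply A → C.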

Closure Test
 A standard way to test whether an FD Y → B follows from F is to compute the closure of Y, denoted Y+: the FD follows exactly when B is in Y+.
 Note that Y+ is a set of attributes, not FDs.

 Basis step: Y+ = Y.

 Induction:
 Look for an FD whose left side X is a subset of the current Y+.
 If the FD is X → A, add A to Y+.
 Repeat until Y+ no longer changes.
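
The basis and induction steps can be sketched as a small fixed-point loop. This is a straightforward, unoptimized version; the function name `closure` and the encoding of FDs as (lhs, rhs) pairs of attribute strings are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Compute the closure of attrs under fds.

    attrs: iterable of attribute names (e.g. the string "AE").
    fds:   list of (lhs, rhs) pairs, each an iterable of attribute names.
    """
    result = set(attrs)              # basis step: Y+ = Y
    changed = True
    while changed:                   # repeat until nothing new is added
        changed = False
        for lhs, rhs in fds:
            # Induction: the left side lies in Y+, so the right side joins Y+.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("A", "B"), ("B", "D"), ("DE", "C")]

print(sorted(closure("AE", F)))  # prints: ['A', 'B', 'C', 'D', 'E']
print(sorted(closure("A", F)))   # prints: ['A', 'B', 'D']
```

To test whether F implies Y → B, check whether every attribute of B is in `closure(Y, F)`.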

Prove FDs : A revisit
 Consider R(A, B, C, D, E) with FDs F = {A → B, B → D, DE → C}.
 Prove or disprove F |= AE → C.

AE+ = AEBDC

Since C ∈ AE+, AE → C is implied by F.

Disprove FDs : A revisit
 Consider R(A, B, C, D, E) with FDs F = {A → B, B → D, DE → C}.
 Prove or disprove F |= A → C.

A+ = ABD

Since C ∉ A+, A → C is not implied by F.
