Decomposition and Functional Dependency

“I am a pig.”
“Then, you must have 4 legs!”
Outline
• Redundancy
• Decomposition at first glance
• Functional dependency
• Dependency properties
Why?
• We have learnt that an ER diagram can be directly converted to relational tables.
• However, these tables may contain redundancy, namely, repetition of the same information.
• To eliminate redundancy (as much as possible), we need to refine the relational tables (known as normalization in database terminology).
An example of redundancy
[Figure: ER diagram of the entity “employee” with attributes id, rating, and hourly-wages]
An example of redundancy
• Negative impacts of redundancy
  ◦ Higher space consumption.
  ◦ Higher update overhead.
    - Imagine the operations we need to perform if we raise the hourly-wages of rating B1 to 120: every tuple with rating B1 must be updated.
  ◦ Insertion/update anomaly.
    - The DBA must prevent the insertion of tuples that result in inconsistent hourly-wages, e.g., (7, B1, 200).
• Solution?
An example of redundancy (cont.)
[Figure: EMPLOYEE decomposed into two tables to avoid redundancy]
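To make the update overhead concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the rows are hypothetical (only rating B1 and the wage values 100 and 120 appear in the slides; the ids and rating B2 are invented), and the function names are not part of the lecture.

```python
# Hypothetical EMPLOYEE rows: (id, rating, hourly_wages). Only "B1", 100 and 120
# come from the slides; the ids and the rating "B2" are made up for illustration.
employee = [
    (1, "B1", 100),
    (2, "B1", 100),
    (3, "B2", 80),
    (4, "B1", 100),
]

def raise_wage_redundant(rows, rating, new_wage):
    """Update every tuple carrying `rating`; return (new rows, tuples touched)."""
    touched = sum(1 for r in rows if r[1] == rating)
    updated = [(i, rt, new_wage if rt == rating else w) for i, rt, w in rows]
    return updated, touched

# Decomposed design: the wage of each rating is stored exactly once.
emp_rating = {i: rt for i, rt, _ in employee}    # id -> rating
rating_wage = {rt: w for _, rt, w in employee}   # rating -> hourly_wages

def raise_wage_decomposed(wages, rating, new_wage):
    """Update the single entry for `rating`; return (new mapping, entries touched)."""
    updated = dict(wages)
    updated[rating] = new_wage
    return updated, 1

employee_after, n_redundant = raise_wage_redundant(employee, "B1", 120)
rating_wage_after, n_decomposed = raise_wage_decomposed(rating_wage, "B1", 120)
print(n_redundant, n_decomposed)  # tuples touched in each design
```

In the single-table design the number of tuples touched grows with the number of employees at that rating, whereas the decomposed design touches one entry regardless.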
An example of redundancy (cont.)
[Figure: ER diagram with the entities “employee” and “salary” joined by the relationship “have”; attributes id, rating, hourly-wages]
• In fact, the two tables could have been obtained directly if we had designed a “perfect ER diagram”.
• It is not realistic to assume that we can always discover the perfect ER diagram.
• We need a tool to refine our design, even if we started from an imperfect diagram.
Basic questions to ask
• Do we need to decompose a relation?
• What problems (if any) does a given decomposition cause?
  ◦ Lossless-join
  ◦ Dependency-preservation
Decomposition
• Just now, we decomposed EMPLOYEE into two tables to avoid redundancy.
• Note that the new tables can reproduce the original EMPLOYEE.
• This is a rule we must obey: the decomposed tables must be able to reproduce the original table when joined. This is the lossless-join property.
Illegal decomposition
• This decomposition violates the rule mentioned earlier.
• How do we judge whether a decomposition is legal?
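Another illegal decomposition
• The judgment is not that easy! Is the following decomposition a lossless join? No!

Original table:
A B C
1 2 3
2 2 2
3 3 1

Projection on (A, B):
A B
1 2
2 2
3 3

Projection on (B, C):
B C
2 3
2 2
3 1

Join of the two projections:
A B C
1 2 3
1 2 2
2 2 3
2 2 2
3 3 1

The join contains two tuples, (1, 2, 2) and (2, 2, 3), that are not in the original table, so the original table cannot be reproduced.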
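The lossless-join test on the three-tuple table over A, B, C can be replayed in code. The following is a minimal Python sketch, assuming relations are modelled as sets of tuples with a separate attribute list; the helper names `project` and `natural_join` are illustrative, not from the lecture.

```python
def project(rows, attrs, schema):
    """Project a set of tuples onto `attrs` (duplicates collapse, as in a relation)."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(row[i] for i in idx) for row in rows}

def natural_join(r1, schema1, r2, schema2):
    """Natural join of two relations given as sets of tuples plus attribute lists."""
    common = [a for a in schema1 if a in schema2]
    extra = [a for a in schema2 if a not in schema1]
    out_schema = list(schema1) + extra
    result = set()
    for t1 in r1:
        for t2 in r2:
            # keep the pair only if it agrees on every shared attribute
            if all(t1[schema1.index(a)] == t2[schema2.index(a)] for a in common):
                result.add(t1 + tuple(t2[schema2.index(a)] for a in extra))
    return result, out_schema

schema = ["A", "B", "C"]
original = {(1, 2, 3), (2, 2, 2), (3, 3, 1)}

ab = project(original, ["A", "B"], schema)
bc = project(original, ["B", "C"], schema)
joined, joined_schema = natural_join(ab, ["A", "B"], bc, ["B", "C"])

spurious = joined - original   # tuples created by the join but absent originally
print(sorted(spurious))
print(joined == original)      # False means the decomposition is not lossless
```

A decomposition is lossless exactly when the join gives back the original table, so any non-empty `spurious` set is enough to reject it.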
Legal decomposition
• Checking the “legitimacy” of a decomposition:
  ◦ The new tables must have common attribute(s).
  ◦ The common attribute(s) must be a candidate key of at least one new table.
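The two conditions can be approximated on concrete table instances. This Python sketch is only an illustration: a key is a property of the schema, so a table instance can show that a set of attributes is not a key but cannot prove that it is one, and the check below tests uniqueness only, not minimality. The function names and the alternative (A, B) / (A, C) split are hypothetical.

```python
def is_unique_on(rows, schema, attrs):
    """True if no two distinct tuples in `rows` agree on all of `attrs`."""
    idx = [schema.index(a) for a in attrs]
    seen = set()
    for row in set(rows):
        key = tuple(row[i] for i in idx)
        if key in seen:
            return False
        seen.add(key)
    return True

def passes_legitimacy_check(r1, schema1, r2, schema2):
    """Apply the two conditions to a pair of table instances."""
    common = [a for a in schema1 if a in schema2]
    if not common:
        return False  # condition 1: shared attribute(s) are required
    # condition 2 (instance-level approximation): the shared attributes
    # identify tuples uniquely in at least one of the two tables
    return is_unique_on(r1, schema1, common) or is_unique_on(r2, schema2, common)

# The (A, B) / (B, C) decomposition: the value B = 2 repeats in both tables.
ab = {(1, 2), (2, 2), (3, 3)}
bc = {(2, 3), (2, 2), (3, 1)}
check_ab_bc = passes_legitimacy_check(ab, ["A", "B"], bc, ["B", "C"])

# Hypothetical alternative split of the same table on A, which is unique in both parts.
a_b = {(1, 2), (2, 2), (3, 3)}
a_c = {(1, 3), (2, 2), (3, 1)}
check_ab_ac = passes_legitimacy_check(a_b, ["A", "B"], a_c, ["A", "C"])

print(check_ab_bc, check_ab_ac)
```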
Some confusing notions (better definitions later)
• Key?
  An abbreviation of candidate key.
• Candidate key?
  A minimal set of attributes that uniquely identifies every tuple.
• Primary key?
  A candidate key selected by the database designer.
• Superkey?
  Any superset of a candidate key.
Decomposition may not be obvious
• How should we decompose the above table to minimize redundancy?
• Before we can answer the question, we need to gain more understanding of redundancy.
Why does redundancy exist?
• Reason:
  ◦ rating determines hourly-wages.
  ◦ Once a tuple’s rating is known, its hourly-wages is also decided.
• A concise representation: rating → hourly-wages.
Functional dependency
• rating → hourly-wages is called a functional dependency (FD).
• Do we have “rating → id”?
  ◦ If a tuple’s rating is known, are we sure about its id?
  ◦ No: several employees may share the same rating.
FD (cont.)
• Do we have hourly-wages → rating?
  ◦ Yes, different ratings have different hourly-wages.
  ◦ Namely, if we know a tuple’s hourly-wages, then its rating has only one possibility.
• Do we have id → rating?
  ◦ Yes, because each employee has only a single rating.
FD (cont.)
• Do we have id → id?
  ◦ Of course; this is known as a trivial FD.
• Do we have id → (rating, hourly-wages)?
  ◦ Yes, because each employee has only a single (rating, hourly-wages)-combination.
“My id = 1.”
“You are at scale B1 and earn $100 per hour.”
FD (cont.)
• Do we have (id, rating) → hourly-wages?
  ◦ In English: if a tuple’s (id, rating)-combination is decided, how many possibilities are there for hourly-wages?
  ◦ In fact, once (id, rating) is decided, we know exactly which employee is concerned.
  ◦ Therefore, hourly-wages has only one possibility.
• So “(id, rating) → hourly-wages” is true.
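Each “do we have ...?” question above can be tested against a concrete table: an FD is violated as soon as two tuples agree on the left side but differ on the right side. The Python sketch below assumes a hypothetical EMPLOYEE instance (only id 1 at scale B1 earning 100 is taken from the slides) and an illustrative helper `fd_holds`. Note that an instance can refute an FD, but passing the check on one instance does not prove the FD for every legal instance.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether lhs -> rhs holds in `rows` (a list of dicts).

    The FD holds if any two tuples that agree on every attribute in `lhs`
    also agree on every attribute in `rhs`.
    """
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # remember the first right-side value seen for this left-side value
        if seen.setdefault(left, right) != right:
            return False
    return True

# Hypothetical EMPLOYEE instance; only (id 1, B1, 100) comes from the slides.
employee = [
    {"id": 1, "rating": "B1", "hourly_wages": 100},
    {"id": 2, "rating": "B1", "hourly_wages": 100},
    {"id": 3, "rating": "B2", "hourly_wages": 80},
]

results = {
    "rating -> hourly_wages": fd_holds(employee, ["rating"], ["hourly_wages"]),
    "rating -> id": fd_holds(employee, ["rating"], ["id"]),
    "hourly_wages -> rating": fd_holds(employee, ["hourly_wages"], ["rating"]),
    "id -> rating": fd_holds(employee, ["id"], ["rating"]),
    "id -> (rating, hourly_wages)": fd_holds(employee, ["id"], ["rating", "hourly_wages"]),
    "(id, rating) -> hourly_wages": fd_holds(employee, ["id", "rating"], ["hourly_wages"]),
}
for name, holds in results.items():
    print(name, holds)
```

On this instance only “rating -> id” fails, because ids 1 and 2 share the rating B1.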
Functional dependency definition
• Let L and R be two sets of attributes.
• L → R means that if we know a tuple’s L, then there is only a single possibility for the tuple’s R!
  ◦ I.e., if we know L, we know R.
• Examples:
  ◦ rating → hourly-wages
  ◦ hourly-wages → rating
  ◦ id → rating
  ◦ id → id
  ◦ id → (rating, hourly-wages)
  ◦ (id, rating) → hourly-wages
Secret of redundancy
• In general, a table has redundancy if there is a non-trivial FD whose left-hand side does not contain a candidate key (i.e., is not a superkey).
• For example, the only candidate key of EMPLOYEE is id.
  ◦ EMPLOYEE has redundancy because we have rating → hourly-wages, and rating does not contain id.
Where do FDs come from?
• Two channels.
• First, common sense.
  ◦ HK-id → name.
  ◦ country → capital.
  ◦ (father, mother) → eldest-child.
  ◦ …
• Second, special constraints of the underlying application.
  ◦ If every employee has her/his own office:
    - emp-id → office-number.
  ◦ If every customer can have only a single account:
    - cust-id → acc-id.
A candidate key determines all
• For example, a candidate key of EMPLOYEE is id.
• Thus, id determines any combination of the attributes.
  ◦ id → id
  ◦ id → rating
  ◦ id → hourly-wages
  ◦ id → (rating, hourly-wages)
  ◦ id → (id, rating, hourly-wages)
• If we know a tuple’s id, then each of its attributes has only one possibility.
A superkey determines all
• A candidate key is id.
• Then, the superkey (id, rating) determines any combination of the attributes.
  ◦ (id, rating) → id
  ◦ (id, rating) → rating
  ◦ (id, rating) → hourly-wages
  ◦ (id, rating) → (rating, hourly-wages)
  ◦ (id, rating) → (id, rating, hourly-wages)
• We only need id to claim that each attribute of the tuple has only one possibility.
• So, of course, given its (id, rating), we can make the same claim.
Trivial functional dependencies
• “id → id” is trivially true.
• Put it in English, and you will find out.
• “If we know a tuple’s id = 1, then we know its id.”
“I have 4 legs. Guess how many legs I have.”
“Don’t waste my time.”
Trivial functional dependencies (cont.)
• L → R is trivial if L contains R.
• Examples:
  ◦ (id, rating) → id
  ◦ (id, rating) → rating
  ◦ (id, rating) → (id, rating)
  ◦ (id, hourly-wages) → id
  ◦ (id, rating, hourly-wages) → (id, rating)
  ◦ …
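Because triviality depends only on the two attribute sets, it reduces to a subset test. A minimal Python sketch, with the illustrative function name `is_trivial` and attribute sets written as lists of names:

```python
def is_trivial(lhs, rhs):
    """An FD lhs -> rhs is trivial when every attribute of rhs already appears in lhs."""
    return set(rhs) <= set(lhs)

# The examples listed above, followed by one non-trivial FD for contrast.
examples = [
    (["id", "rating"], ["id"]),
    (["id", "rating"], ["rating"]),
    (["id", "rating"], ["id", "rating"]),
    (["id", "hourly-wages"], ["id"]),
    (["id", "rating", "hourly-wages"], ["id", "rating"]),
]
trivial_flags = [is_trivial(lhs, rhs) for lhs, rhs in examples]
non_trivial = is_trivial(["id"], ["rating"])

print(trivial_flags, non_trivial)
```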
Inference rules for FDs
• Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold.
• One example:
  id → rating
  rating → hourly-wages
  What can we derive from these two FDs?
  id → hourly-wages
• More rules follow.
Union
• Given
  ◦ cust-id → cust-name (1)
  ◦ cust-id → cust-city (2)
• we can derive
  ◦ cust-id → (cust-name, cust-city) (3)
• Reasoning:
  ◦ By (1), if we know cust-id, then we know cust-name.
  ◦ By (2), if we know cust-id, then we know cust-city.
  ◦ Hence, if we know cust-id, then we know the (cust-name, cust-city)-combination.
Transitivity
• Given
  ◦ creditcard-no → cust-id (1)
  ◦ cust-id → cust-name (2)
• we can derive
  ◦ creditcard-no → cust-name (3)
• Reasoning:
  ◦ By (1), if we know creditcard-no, then we know cust-id.
  ◦ By (2), if we know cust-id, then we know cust-name.
  ◦ Hence, if we know creditcard-no, then we know cust-name.
Augmentation
• Given
  ◦ creditcard-no → cust-id (1)
• we can derive
  ◦ (creditcard-no, branch-id) → (cust-id, branch-id) (2)
• Reasoning:
  ◦ By (1), if we know the creditcard-no (say 40101342), we know the cust-id (say 1).
  ◦ Hence, if (40101342, B1) = (creditcard-no, branch-id) of a tuple, we know that (1, B1) = (cust-id, branch-id) of the tuple.
FD derivation
• Given
  ◦ creditcard-no → cust-id (1)
  ◦ (cust-id, branch-id) → acc-id (2)
• We can derive (creditcard-no, branch-id) → acc-id as follows.
  ◦ From (1), by augmentation, we have
    (creditcard-no, branch-id) → (cust-id, branch-id) (3)
  ◦ From (3) and (2), by transitivity, we have
    (creditcard-no, branch-id) → acc-id
Summary of Inference Rules
• Let R be a relation schema, and let W, X, Y, Z be subsets of R.
• Reflexivity
  ◦ If Y ⊆ X, then X → Y (trivial FDs)
• Augmentation
  ◦ If X → Y, then XZ → YZ, for every Z
• Transitivity
  ◦ If X → Y and Y → Z, then X → Z
• Union (Combining) Rule
  ◦ If X → Y and X → Z, then X → YZ
• Decomposition (Splitting) Rule
  ◦ If X → YZ, then X → Y and X → Z
• Pseudo-transitivity Rule
  ◦ If X → Y and WY → Z, then XW → Z
Prove FDs
• Consider R(A, B, C, D, E) with FDs F = {A → B, B → D, DE → C}.
• Prove or disprove F |= AE → C.
  {A → B, B → D, DE → C}
  |= {A → D, DE → C}   (Transitivity rule: if X → Y and Y → Z, then X → Z)
  |= {AE → C}          (Pseudo-transitivity rule: if X → Y and WY → Z, then XW → Z)
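The two-step proof can be replayed mechanically by treating each inference rule as a function on FDs. The sketch below is one possible encoding, not the lecture's own: an FD is a pair of frozensets, attribute names are assumed to be single letters (so the string "DE" stands for the set {D, E}), and the function names are illustrative.

```python
def fd(lhs, rhs):
    """Represent an FD as a pair of frozensets of single-letter attribute names."""
    return (frozenset(lhs), frozenset(rhs))

def transitivity(f, g):
    """From X -> Y and Y -> Z derive X -> Z."""
    (x, y), (y2, z) = f, g
    if y != y2:
        raise ValueError("right side of the first FD must equal left side of the second")
    return (x, z)

def pseudo_transitivity(f, g):
    """From X -> Y and WY -> Z derive XW -> Z, where W is the left side of g minus Y."""
    (x, y), (wy, z) = f, g
    if not y <= wy:
        raise ValueError("Y must be contained in the left side of the second FD")
    w = wy - y
    return (x | w, z)

a_b = fd("A", "B")
b_d = fd("B", "D")
de_c = fd("DE", "C")

a_d = transitivity(a_b, b_d)            # A -> D
ae_c = pseudo_transitivity(a_d, de_c)   # AE -> C

print(sorted(ae_c[0]), sorted(ae_c[1]))
```

Each function refuses to fire when its premises do not match, so a derivation that runs without raising an error corresponds to a valid chain of rule applications.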
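Disprove FDs
• Prove or disprove F |= A → C.
• Find a counterexample: an instance that satisfies every FD in F but violates A → C.
  A B C D E
  ---------------------------
  2 4 3 1 3
  2 4 6 1 2
• The two tuples agree on A (and therefore on B and D, as F requires) but differ on E, so DE → C imposes no constraint and C is free to differ.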
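A counterexample for A → C must satisfy every FD in F = {A → B, B → D, DE → C} while containing two tuples that agree on A and differ on C. The following Python sketch checks those two requirements for a two-tuple instance in which both tuples have A = 2; the helper `fd_holds` is illustrative, and attribute names are assumed to be single letters so that "DE" stands for {D, E}.

```python
def fd_holds(rows, lhs, rhs):
    """True if any two tuples agreeing on `lhs` also agree on `rhs`."""
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        if seen.setdefault(left, right) != right:
            return False
    return True

schema = ["A", "B", "C", "D", "E"]
rows = [
    dict(zip(schema, (2, 4, 3, 1, 3))),
    dict(zip(schema, (2, 4, 6, 1, 2))),
]

F = [("A", "B"), ("B", "D"), ("DE", "C")]

satisfies_F = all(fd_holds(rows, lhs, rhs) for lhs, rhs in F)
violates_target = not fd_holds(rows, "A", "C")

# A valid counterexample needs both flags to be True.
print(satisfies_F, violates_target)
```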
Closure Test
• A standard way to test whether an FD Y → A is implied by F is to compute the closure of Y, denoted Y+.
  ◦ Note that Y+ is a set of attributes, not a set of FDs.
• Basis step: Y+ = Y.
• Induction:
  ◦ Look for an FD whose left side X is a subset of the current Y+.
  ◦ If the FD is X → A, add A to Y+.
  ◦ Repeat until Y+ no longer changes.
• Y → A is implied by F exactly when A ∈ Y+.
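The basis and induction steps translate directly into a fixed-point loop. The sketch below is a straightforward (not optimised) Python version, applied to F = {A → B, B → D, DE → C} from the “Prove FDs” slide; the function names are illustrative, and single-letter attribute names are assumed so that a string such as "DE" stands for the set {D, E}.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under `fds`.

    `fds` is a list of (lhs, rhs) pairs; each side is an iterable of attribute names.
    """
    result = set(attrs)                    # basis step: Y+ = Y
    changed = True
    while changed:                         # induction: repeat until nothing is added
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, lhs, rhs):
    """F implies lhs -> rhs exactly when rhs is contained in the closure of lhs."""
    return set(rhs) <= closure(lhs, fds)

F = [("A", "B"), ("B", "D"), ("DE", "C")]

ae_closure = closure("AE", F)
a_closure = closure("A", F)
print(sorted(ae_closure), implies(F, "AE", "C"))
print(sorted(a_closure), implies(F, "A", "C"))
```

The loop rescans all FDs after every change, which is quadratic in the worst case but keeps the correspondence with the basis and induction steps easy to see.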
Prove FDs: A revisit
• Prove or disprove F |= AE → C.
  AE+ = AEBDC
  Since C ∈ AE+, AE → C is implied by F.
Disprove FDs: A revisit
• Prove or disprove F |= A → C.
  A+ = ABD
  Since C ∉ A+, A → C is not implied by F.
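The closure test also connects back to redundancy: a set of attributes is a superkey when its closure covers the whole schema, so FDs whose left side fails that test can be flagged automatically. The following is a hedged Python sketch for EMPLOYEE, assuming the FD set {id → rating, rating → hourly-wages} discussed earlier; the function names are illustrative and not part of the lecture.

```python
def closure(attrs, fds):
    """Closure of `attrs` under `fds`, a list of (lhs, rhs) pairs of attribute lists."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_superkey(attrs, all_attrs, fds):
    """`attrs` is a superkey if it determines every attribute of the schema."""
    return closure(attrs, fds) >= set(all_attrs)

def redundancy_causing_fds(all_attrs, fds):
    """Non-trivial FDs whose left side is not a superkey."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not is_superkey(lhs, all_attrs, fds)
    ]

EMPLOYEE = ["id", "rating", "hourly-wages"]
F = [(["id"], ["rating"]), (["rating"], ["hourly-wages"])]

flagged = redundancy_causing_fds(EMPLOYEE, F)
print(flagged)
```

Here id passes the superkey test while rating does not, so only rating → hourly-wages is flagged.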
