Data Modeling and Database Design 2nd Edition Umanath Solutions Manual
Chapter 7 Objectives
After completing this chapter, the student will understand:
• What constitutes a functional dependency between attributes
• How to identify data redundancies in a relation schema
• How it is possible to eliminate data redundancies in a relation schema by
decomposing the relation schema into a series of relation schemas
• Armstrong’s axioms – Inference rules for functional dependencies
• The use of Armstrong’s axioms to derive the minimal cover for a set of functional
dependencies
• How to compute the closure of a set of attributes
• How to use the methods of synthesis and decomposition to derive candidate keys
• The distinction between prime and non-prime attributes
Chapter 7 Overview
Normalization, the topic of the next chapter (Chapter 8), is a technique that facilitates
systematic validation of participation of attributes in a relation schema from a perspective
of data redundancy. The building block that enables a scientific analysis of data
redundancy and the elimination of anomalies caused by data redundancy through the
process of normalization is called functional dependency. Chapter 7 introduces the
concept of functional dependency and how this concept can be used to scientifically
evaluate the “goodness” of a conceptual/logical design from the perspective of data
redundancy. Topics covered include a definition of functional dependency, a discussion of
inference rules that govern functional dependencies called Armstrong’s axioms, and the
idea of a minimal cover for a set of functional dependencies. Applications of Armstrong’s
axioms to systematically derive the candidate keys of a relation schema, given a set of
functional dependencies that hold on the relation schema, are also presented.
Data Modeling and Database Design 7-2
Chapter 7 Solutions
1. What is the purpose of the normalization technique in the data modeling process?
Answer. The purpose of the normalization technique is the systematic validation of
the participation of attributes in a relation schema from a perspective of data
redundancy.
2. Explain why data redundancy exists for the attributes Discount and Location in the
STOCK table in Figure 7.1c.
Answer. Data redundancy is defined as the superfluous repetition of data that does
not add new meaning. The simple appearance of duplication of data does not
necessarily imply data redundancy. The location of a store in STOCK is always the
same irrespective of any other fact in the table. Therefore, repetition of the value of
Location for every occurrence of Store value is a superfluous repetition and entails
data redundancy. Likewise, the discount associated with a specific quantity in
STOCK is always the same irrespective of any other fact in the table. Thus, repetition
of the same value of Discount every time a specific value of Quantity occurs does not
add any new meaning and so entails data redundancy.
In other words, in the context of the relation state STOCK in Figure 7.1c, Store →
Location since each individual value of Store is always associated with one and only
one individual value of Location. On the other hand, Location ↛ Store since
each value of Location is not always associated with one and only one individual
value of Store.
4. Why can functional dependency not be inferred from a particular relation state?
Answer. A functional dependency is a property of the semantics (meaning) of the
relationship among attributes emerging from the business rules. That is, a functional
dependency is a property of the relation schema R, not of a particular relation state r
of R. Therefore, a functional dependency cannot be automatically inferred from any
relation state r of R (e.g., STOCK in Figure 7.1). It must be explicitly
specified as a constraint, and the source for this specification is the business rules of
the application domain. Thus, in the context of STOCK in Figure 7.1c, Store →
Location holds if and only if there is a business rule specifying that each store exists in one
and only one location. Likewise, while Quantity → Discount is true regardless of the
store in Figure 7.1c, unless this constraint is part of a business rule, this functional
dependency would not exist. A relation state, r of R, willfully constructed to represent
the specified business rules (FDs) is labeled as an “instance” of R in this book.
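The point can be illustrated with a minimal sketch that tests whether a relation state violates a proposed FD. The rows below are hypothetical values invented for illustration (they are not the contents of Figure 7.1c), and the helper name holds_in_state is likewise an assumption. A state can refute an FD, but a state that merely happens to satisfy an FD does not establish it for the schema.

```python
def holds_in_state(rows, lhs, rhs):
    """Return True if no two rows agree on lhs while differing on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for this lhs value
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical rows, invented for illustration only.
stock_state = [
    {"Store": 15, "Location": "Houston", "Quantity": 100, "Discount": 5},
    {"Store": 17, "Location": "Houston", "Quantity": 200, "Discount": 10},
    {"Store": 15, "Location": "Houston", "Quantity": 200, "Discount": 10},
]

# Satisfied by this state, yet only a business rule makes it an FD.
print(holds_in_state(stock_state, ["Store"], ["Location"]))
# Violated by this state: two stores share one location.
print(holds_in_state(stock_state, ["Location"], ["Store"]))
```

The first check succeeds and the second fails for these rows. Only the failure is conclusive; the success is consistent with Store → Location but says nothing about future states.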
5. Identify the set of functional dependencies in the relation instance CAR shown below.
Does this constitute the minimal cover for the set of functional dependencies present
in CAR? If it is not a minimal cover, derive a minimal cover.
CAR
Model         #cylinders   Origin   Tax   Fee
Camry         4            Japan    15    30
Mustang       6            USA      0     45
Fiat          4            Italy    18    30
Accord        4            Japan    15    30
Century       8            USA      0     60
Mustang       4            Canada   0     30
Monte Carlo   6            Canada   0     45
Civic         4            Japan    15    30
Mustang       4            Mexico   15    30
Mustang       6            Mexico   15    45
Civic         4            Korea    15    30
Answer.
Since it is stated that the data above reflects an “instance” of a relation CAR, the
following set of semantically obvious FDs, F, are listed based on an assumed set of
business rules honoring the relationships reflected in the instance of CAR above.
fd1: Origin → Tax; fd2: {Model, #cylinders, Origin} → Fee; fd3: #cylinders → Fee;
fd4: Fee → #cylinders; fd5: {Model, #cylinders, Origin} → Tax
F, above, will be a minimal cover, Fc, only if (F – fdx for x = 1,2,3,4,5) is not equivalent to
F. That is, there are no redundant attributes or redundant FDs in F. Given F, it can
be seen that fd2 and fd5 are redundant FDs in F. Therefore, F is not a minimal cover
for F.
Fc = {fd1, fd3, fd4} is a minimal cover for F. F can be derived from Fc, and further
reduction of Fc will yield a set of FDs that is not equivalent to Fc and F.
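The redundancy claims in this answer can be checked mechanically with attribute closures: an FD is redundant when its dependent already lies in the closure of its determinant under the remaining FDs. The following is a minimal sketch of that check using fd1 through fd5 as listed above; the helper names closure, fd, and is_redundant are assumptions made for illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def fd(lhs, rhs):
    return (frozenset(lhs), frozenset(rhs))

F = {
    "fd1": fd({"Origin"}, {"Tax"}),
    "fd2": fd({"Model", "#cylinders", "Origin"}, {"Fee"}),
    "fd3": fd({"#cylinders"}, {"Fee"}),
    "fd4": fd({"Fee"}, {"#cylinders"}),
    "fd5": fd({"Model", "#cylinders", "Origin"}, {"Tax"}),
}

def is_redundant(name, fds):
    """An FD is redundant if the other FDs in the set already imply it."""
    lhs, rhs = fds[name]
    others = [v for k, v in fds.items() if k != name]
    return rhs <= closure(lhs, others)

redundant = [name for name in F if is_redundant(name, F)]
Fc = {name: dep for name, dep in F.items() if name not in redundant}

print(redundant)                              # ['fd2', 'fd5']
print(any(is_redundant(n, Fc) for n in Fc))   # False
```

Here fd2 is implied by fd3 and fd5 is implied by fd1, so both can be dropped together, and no FD in the remaining set is implied by the other two.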
Answer
a. Rule of Decomposition (Ssn → Dnumber) and Rule of Transitivity (Ssn → Dnumber →
{Dname, Dmgrssn})
b. Rule of Reflexivity
c. Rule of Decomposition
d. Rule of Decomposition (Ssn → Dnumber and Dnumber → Dname) and Rule of
Transitivity (Ssn → Dnumber → Dname)
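For reference, the three rules named in these answers can be stated in their general form as follows. X, Y, and Z are placeholder attribute sets chosen for this statement, not attributes of the exercise.

```latex
\begin{align*}
\text{Reflexivity:}   &\quad Y \subseteq X \;\Rightarrow\; X \to Y \\
\text{Transitivity:}  &\quad X \to Y,\; Y \to Z \;\Rightarrow\; X \to Z \\
\text{Decomposition:} &\quad X \to YZ \;\Rightarrow\; X \to Y \ \text{and}\ X \to Z
\end{align*}
```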
10. Describe the two approaches used in this book to derive candidate keys.
Answer. The two approaches for deriving candidate keys for a relation schema are
the synthesis approach and the decomposition approach. The synthesis approach is
based on the principle of the closure of an attribute set and seeks to derive an
irreducible set of attributes whose closure is precisely all attributes of the URS
(universal relation schema). The decomposition approach starts with a URS and the
set of functional dependencies F that prevails over the URS, and systematically
reduces the superkey until it is further irreducible under F. This irreducible superkey
is a candidate key of the relation schema. After one candidate key is derived using
either of these approaches, the method used to derive the rest of the candidate keys of
the relation schema is the same (see Tables 7.2 and 7.3 for the respective
procedures).
11. What is the difference between (a) a prime attribute and a non-prime attribute and (b)
a key and non-key attribute?
Answer. A prime attribute is any attribute, atomic or composite, in a relation schema
R that is a proper subset of the primary key of R. An attribute of R that is not a subset
of the primary key is a non-prime attribute except when it is a candidate key of R. A
candidate key is neither a prime attribute nor a non-prime attribute; given a primary
key of R, it is an alternate key of R. A key attribute is an attribute, atomic or
composite, in a relation schema R that is a proper subset of any candidate key of R.
Attributes that are not subsets of a candidate key of R are non-key attributes.
12. Given R (X, A, Z, B) and A → {B, Z}, what is the candidate key(s) of R?
Answer. The only candidate key of R is {A, X}.
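This answer can be confirmed by brute force: enumerate attribute subsets from smallest to largest and keep those whose closure is all of R and that contain no smaller key. The sketch below does this for R and the single FD A → {B, Z}; the function names are assumptions, and the exhaustive search is only practical for small schemas.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Smallest-first search for minimal attribute sets whose closure is all of R."""
    everything = set(attributes)
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            subset = set(combo)
            # Skip any superset of a key already found: it is not minimal.
            if any(key <= subset for key in keys):
                continue
            if closure(subset, fds) == everything:
                keys.append(subset)
    return keys

R = ["X", "A", "Z", "B"]
fds = [(frozenset({"A"}), frozenset({"B", "Z"}))]

print(candidate_keys(R, fds))  # a single key containing A and X
```

X appears in no dependent, so it must belong to every key, and A supplies B and Z; no smaller set has a closure equal to R.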
13. Consider the universal relation schema INVENTORY (Store#, Item, Vendor, Date,
Cost, Units, Manager, Price, Sale, Size, Color, Location) and the constraint set F
{fd1, fd2, fd3, fd4, fd5, fd6, fd7} where:
The algorithm to compute the minimal cover for a given set of functional
dependencies F is:
a. Set G to F.
b. Convert all FDs in G to standard (canonical) form—i.e., the right side
(dependent attribute) of every FD in G should be a singleton attribute.
c. Remove all redundant attributes from the left side (determinant) of the FDs in
G.
d. Remove all redundant FDs from G.
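Steps a through d can be sketched in code as follows. This is one possible rendering under stated assumptions, not the book's own procedure text: FDs are modeled as pairs of frozensets, the function names are invented here, and the input used is the set of FDs from the CAR exercise (Question 5) because it is fully listed in this manual. In general a set of FDs can have more than one minimal cover, and the one produced depends on the order in which attributes and FDs are examined.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def minimal_cover(fds):
    # Steps a and b: copy F into G with every right side split into singletons.
    g = [(frozenset(lhs), frozenset({a})) for lhs, rhs in fds for a in sorted(rhs)]
    # Step c: drop determinant attributes that the rest of the determinant implies.
    reduced = []
    for lhs, rhs in g:
        for attr in sorted(lhs):
            smaller = lhs - {attr}
            if smaller and rhs <= closure(smaller, g):
                lhs = smaller
        reduced.append((lhs, rhs))
    # Step d: drop exact duplicates, then every FD implied by the remaining ones.
    cover = list(dict.fromkeys(reduced))
    for item in list(cover):
        rest = [f for f in cover if f != item]
        if item[1] <= closure(item[0], rest):
            cover = rest
    return cover

# FDs fd1 to fd5 from the CAR exercise.
car_fds = [
    ({"Origin"}, {"Tax"}),
    ({"Model", "#cylinders", "Origin"}, {"Fee"}),
    ({"#cylinders"}, {"Fee"}),
    ({"Fee"}, {"#cylinders"}),
    ({"Model", "#cylinders", "Origin"}, {"Tax"}),
]

for lhs, rhs in minimal_cover(car_fds):
    print(sorted(lhs), "->", sorted(rhs))
```

For this input the sketch returns Origin → Tax, #cylinders → Fee, and Fee → #cylinders, which agrees with the minimal cover given in the answer to Question 5.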
A systematic inspection of the standard form FDs above indicates that there are
no redundant attributes in the determinants of the FDs above, nor are there any
redundant FDs. Thus, G (equivalent to F) is a minimal cover (Fc) for F.
The principle underlying the synthesis approach to deriving the candidate keys of
a URS is to condense a minimum set of attributes in URS whose attribute closure
is the URS – i.e., all attributes of URS. The heuristics specified in Table 7.2 offer
a step-by-step procedure to accomplish this. Note that a minimal cover (Fc) of F
that prevails over URS is the basis for this derivation. So, given Fc, as per the
heuristics stated in Table 7.2, the following steps ensue:
The target determinant, TD1, the determinant of an FD with the largest number of
attributes as its determinant, is:
{Store#, Item, Date}
The attribute closure, TD1+ (TD1 | Fc), is {Store#, Item, Date, Manager, Sale,
Units, Size, Color}.
Since TD1+ is not precisely the set of all attributes in URS, TD1, {Store#, Item,
Date} cannot be a candidate key of URS.
The attribute closure, TD2+ (TD2 | Fc), is {Item, Vendor, Cost, Price, Size, Color,
Location}. Since TD2+ is not precisely the set of all attributes in URS, TD2,
{Item, Vendor}, cannot be a candidate key of URS either.
TD1+ U TD2+ = {Store#, Item, Date, Manager, Sale, Units, Size, Color} U
{Item, Vendor, Cost, Price, Size, Color, Location}, which is precisely the set of
all the attributes in URS. Therefore, we have TD1 U TD2 as the first candidate
key of URS – i.e., {Store#, Item, Vendor, Date} is a candidate key of
INVENTORY.
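The set arithmetic in this step can be verified directly from the two closures stated above. The sketch below only checks coverage, using the attribute lists exactly as they appear in this answer; the variable names are assumptions. Coverage alone shows that TD1 U TD2 is a superkey, while the candidate key claim additionally rests on no attribute of it being removable.

```python
URS = {"Store#", "Item", "Vendor", "Date", "Cost", "Units", "Manager",
       "Price", "Sale", "Size", "Color", "Location"}

# Closures as stated in the answer.
td1_plus = {"Store#", "Item", "Date", "Manager", "Sale", "Units", "Size", "Color"}
td2_plus = {"Item", "Vendor", "Cost", "Price", "Size", "Color", "Location"}

# Neither closure alone covers the URS ...
print(sorted(URS - td1_plus))  # attributes TD1 fails to determine
print(sorted(URS - td2_plus))  # attributes TD2 fails to determine

# ... but together they do.
print(td1_plus | td2_plus == URS)  # True
```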
It is easy to observe that there are no other candidate keys for INVENTORY | F.
Accordingly, {Store#, Item, Vendor, Date, Cost, Units, Manager, Price, Sale, Size,
Color, Location} is a superkey, K, of INVENTORY.
Next, removal of Item from the current K’ will result in {K’ – Item} ↛ Item
since Item is not a dependent in any FD in Fc. Thus, the resulting new K’ will
not be a superkey of INVENTORY. So, Item cannot be removed from K’.
Likewise, Vendor and Date also cannot be removed from K’.
Removal of Manager from the current K’’ will result in {K’’ – Manager} ↛
Manager. Observe that Manager is a dependent in fd2a in Fc. Nonetheless, Store#,
part of the determinant in fd2a, is no longer present in the current K’’. Thus, the
new K’’ excluding Manager will not be a superkey of INVENTORY. So, Manager
cannot be removed from K’’.
Next, removing Price, Sale, Size, Color, Location from K’’, we have
K’’’ = {K’’ – (Price, Sale, Size, Color, Location)} = {Item, Vendor, Date, Manager}.
K’’’ remains a superkey of INVENTORY since {K’’ – (Price, Sale, Size, Color,
Location)} → {Price, Sale, Size, Color, Location}.
Since further reduction of K’’’ = {Item, Vendor, Date, Manager} will not yield a
superkey of INVENTORY | Fc, {Item, Vendor, Date, Manager} is, by definition, a
candidate key of INVENTORY.
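The reduction loop at the heart of the decomposition approach can be sketched as follows: start with every attribute as the superkey and drop an attribute whenever the remainder still determines the whole schema. The function names are assumptions, the right-to-left order is an arbitrary choice, and the input is the small schema of Question 12, R (X, A, Z, B) with A → {B, Z}, because its FDs are fully stated in this manual.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozenset pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def reduce_superkey(attributes, fds):
    """Drop attributes right to left while the remainder still determines all of R."""
    everything = set(attributes)
    key = list(attributes)          # the full attribute set is trivially a superkey
    for attr in reversed(attributes):
        trial = [a for a in key if a != attr]
        if closure(trial, fds) == everything:
            key = trial             # attr is derivable from the rest, so remove it
    return key

R = ["X", "A", "Z", "B"]
fds = [(frozenset({"A"}), frozenset({"B", "Z"}))]

print(reduce_superkey(R, fds))  # ['X', 'A']
```

A single pass suffices: an attribute that could not be removed from a larger superkey cannot be removed from any subset of it either. A different removal order can lead to a different candidate key when the schema has more than one.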
14. Given the set of functional dependencies F {fd1, fd2, fd3, fd4, fd5, fd6, fd7, fd8, fd9,
fd10} where:
a. Construct the universal relation schema that includes (i.e., preserves) the set of
functional dependencies in F.
URS (Tenant#, Name, Job, Phone#, Address, Salary, Gender, Deposit, County,
Tax_rate, Area, Rent, Survey#, Lot)
Following the algorithm prescribed to derive a minimal cover for a set of FDs, F
(see Section 7.2.3.2, p. 370), it can be shown that {F – fd: Tenant# → Address} in
this case is a minimal cover, Fc, for F.
Using the decomposition approach to deriving a candidate key (see Table 7.3, pp.
379 - 380),
{Tenant#, Name, Job, Phone#, Address, Salary, Gender, Deposit, County, Tax_rate,
Area, Rent, Survey#, Lot} is a superkey, K, of URS.
Next, arbitrarily starting at the right end of the URS, {K – Lot} → Lot | Fc, since Fc
includes fd8: Survey# → Lot. Therefore, K’ = {Tenant#, Name, Job, Phone#, Address,
Salary, Gender, Deposit, County, Tax_rate, Area, Rent, Survey#}, is a superkey of
URS.
Next in line for removal, since we have arbitrarily chosen to work from the right,
is Survey#. But, {K’ – Survey#} ↛ Survey# | Fc. Therefore, Survey# must
remain in K’ in order for it to remain a superkey of URS.
{K’ – Rent} → Rent | Fc while {K’ – Area} ↛ Area | Fc. Accordingly, we have a
reduced superkey of URS, K’’, where K’’ = {Tenant#, Name, Job, Phone#, Address,
Salary, Gender, Deposit, County, Tax_rate, Area, Survey#}.
Continuing to work from right to left, we see that Tax_rate, County, Deposit,
Gender, Salary, Address, Phone#, Job, and Name can be purged from K’’ without
sacrificing the superkey status of the reduced K’’. Accordingly, we have,
K’’’ = {Tenant#, Area, Survey#}, a further reduced superkey of URS – i.e.,
{Tenant#, Area, Survey#} → URS | Fc. Since further reduction of K’’’ without
sacrificing its superkey status is impossible, K’’’ is a candidate key of URS.
Following Steps 4 through 6 in Table 7.3, the rest of the candidate keys of URS
can be derived as follows:
{Tenant#, Area, Survey#} and {Tenant#, Lot, County} seem equally viable as the
primary key of URS. Both have the least number of attributes constituting the set,
and based on the FDs in F, the attributes in these two sets seem to be most
reflective of user-specified business rules conveyed through the FDs.
So, let us say that {Tenant#, Lot, County}
e. Considering your primary key and candidate key(s), distinguish between (1) key
versus non-key attributes and (2) prime versus non-prime attributes.
URS (Tenant#, Name, Job, Phone#, Address, Salary, Gender, Deposit, Lot, County,
Tax_rate, Area, Rent, Survey#)
Note: Any composite attribute that includes one or more non-key attribute(s) is a non-key attribute.
15. Given the set of functional dependencies F {fd1, fd2, fd3, fd4, fd5, fd6, fd7, fd8, fd9,
fd10, fd11} where:
a. Construct the universal relation schema that includes (i.e., preserves) the set of
functional dependencies in F.
Following the algorithm prescribed to derive a minimal cover for a set of FDs, F
(see Section 7.2.3.2, p. 320), it can be shown that F in this case is a minimal
cover, Fc, for F since there are no redundant attributes or redundant FDs in F.
The target determinant, TD1, the determinant of an FD with the largest number of
attributes as its determinant, is:
{Stock, Broker}
The attribute closure, TD1+ (TD1 | Fc), is, through multiple iterations through Fc,
{Stock, Broker, Exchange, Dividend, Profile, Investment, Volume, Company, Client,
Office, Risk_profile, Analyst, Commission, Return}.
Since TD1+ is not precisely the set of all attributes in URS, TD1, {Stock, Broker}
cannot be a candidate key of URS.
The attribute closure, TD2+ (TD2 | Fc), is {Account, Assets}. Since TD2+ is not
precisely the set of all attributes in URS, TD2, {Account}, cannot be a candidate key
of URS either.
TD1+ U TD2+ = {Stock, Broker, Exchange, Dividend, Profile, Investment, Volume,
Company, Client, Office, Risk_profile, Analyst, Commission, Return} U {Account,
Assets} which is precisely the set of all the attributes in URS. Therefore, we have
TD1 U TD2 as the first candidate key of URS – i.e., {Stock, Broker, Account} is a
candidate key of URS | F.
Having derived one candidate key of URS, following Steps 7 through 9 in Table
7.2, the other candidate keys, if any, can be derived as follows.
It is easy to observe that there are no other candidate keys for URS | F.
Based on the FDs in F, the attributes Stock and Broker seem to be most reflective
of user-specified business rules conveyed through these FDs. Accordingly,
{Stock, Broker, Account} seems to be the semantically ideal choice for the primary
key of URS.
e. Considering your primary key and candidate key(s), distinguish between (1) key
versus non-key attributes and (2) prime versus non-prime attributes.
16. Given the Universal Relation Schema URS (A, B, C, D, F, G) and the set of FDs prevailing
over URS F {fd1, fd2, fd3, fd4, fd5, fd6}, where:
Observe that based on fd1 and fd4 using the rule of transitivity (Armstrong’s axiom) we
infer that in fd2a B is redundant; therefore, we get fd2ax: A → C; likewise, we also can
derive fd2bx: A → D.
Next, based on fd6 and fd4, we can derive fd3ax: C → F because we can infer
here that in fd3a B is redundant.
Given fd2ax and fd6, it is seen that fd1 is redundant. Likewise, given fd6, fd3b
becomes a redundant FD.
Further application of the algorithm for deriving the canonical cover will clarify
that there are no more redundancies in F. Thus we have,
17. Given the Universal Relation Schema URS (A, B, C, G, W, X, Y, Z) and the set of FDs
prevailing over URS, F {fd1, fd2, fd3, fd4, fd5}, where: