
Database Dependencies & Data Integration

G. Raschia
December 11, 2017

Examination Instructions
• The exam duration is 1h30.
• A handwritten double-sided A4 sheet of paper is allowed.
• The exam is closed notes/documents/class book.
• The exam is closed electronic devices: mobile phone, laptop, tablet PC, etc.
• English to/from Native language dictionary is allowed.
• The exam has 5 exercises and 21 questions, up to a total of 40 points.

Grade Policy
Exercise 1 (Knowledge Test) is worth 16 points. Then, you must hunt for 14 more
points in the 4 subsequent exercises in order to reach the maximum grade of 30 points
(out of 40 available). In other words, you can answer as many questions as you like from
exercises 2 to 5, but they will be globally rewarded with 14 points maximum.

Write answers after each question in the dedicated area. If your answer requires
more space, please continue at the end of the exam paper.

Name: Email address:

Student number: Signature:

1 Knowledge Test (16 pts)


(2) 1. Give the definition of a functional dependency.

Solution: Given a relation R and two sets of attributes X and Y of R, if 2 tuples
agree on X in R then they must agree on Y . We denote the functional dependency
by X → Y .
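
As an illustration, the definition can be tested on a concrete instance. The sketch below assumes a relation stored as a list of Python dicts; the helper name `holds` and the sample tuples are hypothetical and not part of the exam.

```python
from itertools import combinations

def holds(relation, lhs, rhs):
    """Return True if the FD lhs -> rhs holds in the given instance."""
    for t1, t2 in combinations(relation, 2):
        agree_lhs = all(t1[a] == t2[a] for a in lhs)
        agree_rhs = all(t1[a] == t2[a] for a in rhs)
        if agree_lhs and not agree_rhs:
            return False  # two tuples agree on X but not on Y
    return True

# Hypothetical instance of R(A, B, C)
r = [
    {"A": 1, "B": "x", "C": 10},
    {"A": 1, "B": "x", "C": 20},
    {"A": 2, "B": "y", "C": 10},
]
a_to_b = holds(r, ["A"], ["B"])  # tuples agreeing on A also agree on B
a_to_c = holds(r, ["A"], ["C"])  # the first two tuples agree on A, differ on C
```

Such a check only tells whether an FD is satisfied by one instance; an FD is a constraint on every valid instance of the schema.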

(2) 2. Write the equality generating dependency (egd) counterpart of the functional
dependency AB → C in relation R(ABCD).

Solution:
∀x, y, z, t : R(x, y, z, _) ∧ R(x, y, t, _) ⇒ z = t

(2) 3. Express in natural language (English) the following Conjunctive Query (CQ) against
the database Sells(bar,beer,price) that stores records about “a bar that sells a
beer at a given price”:

Q(x, t) ← Sells(x, y, z), Sells(t, y, v), z < v


Solution: Pairs of bars that sell the same beer, such that the first bar charges a
lower price than the second one.
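
The reading above can be checked by evaluating the query on a small instance. This is a minimal sketch; the bars, beers and prices below are invented for the example.

```python
# Hypothetical instance of Sells(bar, beer, price)
sells = [
    ("Joe's", "Bud", 3.0),
    ("Sue's", "Bud", 4.0),
    ("Sue's", "Miller", 2.5),
    ("Moe's", "Miller", 2.5),
]

def q(sells):
    """Q(x, t) <- Sells(x, y, z), Sells(t, y, v), z < v"""
    return {
        (x, t)
        for (x, y, z) in sells
        for (t, y2, v) in sells
        if y == y2 and z < v  # same beer, strictly lower price at bar x
    }

answers = q(sells)
```

Only the pair (Joe's, Sue's) qualifies here: both bars sell Bud and Joe's is cheaper, whereas the two Miller prices are equal.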

(2) 4. Show that the size, in total number of subgoals, of an unfolded query can grow
exponentially (Hint: consider views defined as unions of CQ’s).

DS-DD&DI Final Exam Page 2 of 15


Solution: Let Vi denote view i and consider
Q ← V1 , V2 , . . . , Vn
where each Vi has 2 rules with one subgoal each. Then unfolding Q on V1 gives 2 rules
of n subgoals each, i.e. |Q1 | = 2 × n. It follows that |Q2 | = 2 × 2 × n, and finally
|Qn | = 2^n × n, where |Qi | is the total number of subgoals after unfolding V1 , . . . , Vi .
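
The growth can be observed by enumerating the unfolded rules explicitly. The sketch below assumes each view Vi is the union of two single-subgoal rules, named here Ai and Bi (hypothetical names), and counts the subgoals of the fully unfolded query.

```python
from itertools import product

def unfold(n):
    """Unfold Q <- V1, ..., Vn where each Vi is the union of the two
    single-subgoal rules Vi <- Ai and Vi <- Bi."""
    choices = [(f"A{i}", f"B{i}") for i in range(1, n + 1)]
    # One conjunctive query per way of picking a rule for each view.
    return [list(body) for body in product(*choices)]

def total_subgoals(n):
    return sum(len(body) for body in unfold(n))

sizes = [total_subgoals(n) for n in range(1, 6)]
```

For n = 1..5 the totals are 2, 8, 24, 64 and 160, i.e. n × 2^n.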

(2) 5. Give a definition of Query Containment.



Solution: Q ⊆ Q′ ⇔ ∀D : Q(D) ⊆ Q′ (D). In other words, query answers must
satisfy the containment property for every database instance.

(2) 6. What is a GaV setting for schema mapping?



Solution: Global-as-View is a declarative language, a proper subset of FO, that
admits sentences of the form:
∀x̄1 . . . x̄n : S1 (x̄1 ) ∧ . . . ∧ Sn (x̄n ) ⇒ ∃ȳ : T (ȳ)

where the x̄i ’s and ȳ are tuples of constants and variables, the Si ’s are relations of
the source schemas and T is one single relation of the target schema. Hence, in the
GaV setting, each Global (Target) relation is described as a view on the source
schemas.

(2) 7. How does the Tane algorithm work for pruning the search space?


Solution: It maintains data structures that are candidate RHS of FD’s for every
subset of attributes at each level of the lattice. It also performs minimality
checking and uses (super)key properties.

(2) 8. Give 3 more types of dependencies (other than functional dependency) in databases.

Solution:

• AFD: Approximate Functional Dependency

• CFD: Conditional Functional Dependency

• IND: Inclusion Dependency

• OD: Order Dependency

• MVD: Multi-Valued Dependency

• DC: Denial Constraint

• JD: Join Dependency

• etc.

From this point, we use FD as a shorthand for Functional Dependency, wherever it is
unambiguous.

2 Implication Problem of FD’s (6 pts)


(2) 1. Using Armstrong’s Axioms only, show that

Σ = {AB → CD, C → EH, D → G} |= AB → EHG



Solution: A1 is Reflexivity, A2 is Augmentation and A3 is Transitivity.
Derivation:

1. AB → CD (in Σ)

2. C → EH (in Σ)

3. CD → EHD (A2 on 2., augmenting with D)

4. D → G (in Σ)

5. EHD → EHG (A2 on 4., augmenting with EH)

6. CD → EHG (A3 on 3. and 5.)

7. AB → EHG (A3 on 1. and 6.) □
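
The result of the derivation can be cross-checked with the attribute-closure test. This is a different technique from the axiomatic proof requested by the question. The sketch below assumes single-letter attributes and FDs encoded as pairs of strings; the function name `closure` is our own.

```python
def closure(attrs, fds):
    """Iterative closure of a set of attributes under a list of FDs.
    Each FD is a pair (lhs, rhs) of strings of single-letter attributes."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its LHS is covered and it adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

sigma = [("AB", "CD"), ("C", "EH"), ("D", "G")]
ab_plus = closure("AB", sigma)
implied = set("EHG") <= ab_plus  # AB -> EHG holds iff EHG is inside AB+
```

Here AB+ contains every attribute mentioned in Σ, so EHG ⊆ AB+ and the implication is confirmed.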



(2) 2. Prove by induction that the iterative closure algorithm for X + w.r.t. Σ is sound,
i.e. W ⊆ X + , where X is a set of attributes, Σ denotes a set of FD’s and W is the
output of the algorithm¹.

Solution: Recall that X + = {A | Σ |= X → A}. Base case: X0 = X ⊆ X + holds
trivially, since X → X by reflexivity whatever Σ is. Inductive step: assume
Xn ⊆ X + ; if Xn+1 = Xn = W then Xn+1 ⊆ X + trivially. Otherwise assume, w/o loss
of generality, Xn+1 = Xn ∪ {A}. It means that there exists Y → Z in Σ with
Y ⊆ Xn and A ∈ Z. By decomposition of Z, we know that Σ |= Y → A. Since
Xn ⊆ X + , we have Σ |= X → Xn , and then Σ |= X → Y by decomposition of Xn ,
because Y ⊆ Xn . Hence, Σ |= X → A by transitivity. That is to say, A is in X + .
And finally, Xn+1 = Xn ∪ {A} ⊆ X + . □

(2) 3. Give the sketch of a method to actually check for equivalence of 2 sets of FD’s

Solution: Σ1 ≡ Σ2 ⇔ Σ1+ = Σ2+ . But computing and comparing the two closures is
impractical, since a closure may contain exponentially many FD’s. Thus, it is much
better to check for Σi |= Σj both ways. In other words:

1. for each FD f = X → Y in Σ1 , check for Σ2 |= f by the closure test
Y ⊆ X + w.r.t. Σ2 ;

2. idem for each FD in Σ2 w.r.t. Σ1 .
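
The two-way check can be sketched as follows. The encoding (FDs as pairs of strings of single-letter attributes) and the sets s1, s2, s3 are assumptions made for the example.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def implies(fds, fd):
    """Closure test: fds |= X -> Y iff Y is included in the closure of X."""
    lhs, rhs = fd
    return set(rhs) <= closure(lhs, fds)

def equivalent(fds1, fds2):
    """Check that each set implies every FD of the other one."""
    return (all(implies(fds2, f) for f in fds1)
            and all(implies(fds1, f) for f in fds2))

# Hypothetical sets of FDs
s1 = [("A", "B"), ("B", "C")]
s2 = [("A", "BC"), ("B", "C")]
s3 = [("A", "B")]
same = equivalent(s1, s2)       # both sets imply each other
different = equivalent(s1, s3)  # s3 does not imply B -> C
```

Each closure test runs in time polynomial in the size of the FD sets, which is what makes this method practical compared with comparing full closures.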

¹ The completeness X + ⊆ W , the other way around, is much trickier.



3 Database Design (6 pts)
Consider a database with the table schema R(ABCDE), equipped with the set of FD’s
Σ = {AB → C, DE → C, B → D, CD → AE}.

(1) 1. Find all the keys of R.

Solution: B belongs to any key (not part of any RHS). Keys are {AB, BC, BE}.
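
The keys can be double-checked by brute force over all attribute subsets, which is affordable for 5 attributes. This is a sketch under our own encoding (single-letter attributes, FDs as pairs of strings), not the method expected in the exam.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Enumerate subsets by increasing size and keep the minimal superkeys."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            attrs = set(combo)
            if any(key <= attrs for key in keys):
                continue  # contains a smaller key, hence not minimal
            if closure(attrs, fds) == set(schema):
                keys.append(attrs)
    return keys

sigma = [("AB", "C"), ("DE", "C"), ("B", "D"), ("CD", "AE")]
keys = candidate_keys("ABCDE", sigma)
key_names = sorted("".join(sorted(k)) for k in keys)
```

The enumeration returns AB, BC and BE, in line with the solution.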



(1) 2. Show that R is not in Third Normal Form (3NF).

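Solution: B → D holds in R, B is not a superkey (B + = BD) and D is non-prime,
since it belongs to none of the keys AB, BC, BE. Then R is not in 3NF. Note that
DE → C is not a 3NF violation, because C is prime (it belongs to the key BC).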

(2) 3. Decompose R up to Boyce-Codd Normal Form (BCNF). Tell whether the
decomposition is dependency-preserving or not.

Solution: DE → C violates BCNF since DE is not a superkey. From DE → C, one
builds R1 (DEAC), since DE + = ACDE, and R2 (BDE) with Σ1 = {DE → C, CD → AE}
and Σ2 = {B → D}. Note that we lose AB → C.
The decomposition is not dependency-preserving. Keys of R1 are DE and CD.
R1 is in BCNF. R2 has one single key BE and B → D violates BCNF. Thus, one
decomposes R2 into R3 (BD) and R4 (BE). They are both trivially in BCNF.
The final BCNF decomposition is then R1 (ACDE), R3 (BD) and R4 (BE). It is
worth noting that it is not the only possible decomposition.
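
Both claims (BCNF of each fragment, loss of AB → C) can be verified mechanically. The sketch below uses closure-based tests; the helper names `is_bcnf` and `preserved` are ours, and the subset enumeration is only reasonable for small schemas like this one.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_bcnf(sub, fds):
    """A fragment is in BCNF if every attribute set X inside it determines
    either nothing new inside the fragment or the whole fragment."""
    sub = set(sub)
    for size in range(1, len(sub)):
        for combo in combinations(sorted(sub), size):
            x = set(combo)
            determined = closure(x, fds) & sub
            if determined != x and determined != sub:
                return False
    return True

def preserved(fd, decomposition, fds):
    """Test whether fd follows from the FDs projected onto the fragments."""
    lhs, rhs = fd
    z = set(lhs)
    changed = True
    while changed:
        changed = False
        for sub in decomposition:
            gained = closure(z & set(sub), fds) & set(sub)
            if not gained <= z:
                z |= gained
                changed = True
    return set(rhs) <= z

sigma = [("AB", "C"), ("DE", "C"), ("B", "D"), ("CD", "AE")]
decomposition = ["ACDE", "BD", "BE"]
all_bcnf = all(is_bcnf(sub, sigma) for sub in decomposition)
lost = [fd for fd in sigma if not preserved(fd, decomposition, sigma)]
```

On this input every fragment passes the BCNF test and the only FD that is not preserved is AB → C.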

(2) 4. Prove by the Chase Test that the decomposition is lossless join.



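Solution: Given t = (a, b, c, d, e) in the result set of R1 ⋈ R3 ⋈ R4 ; then there
exist u = (a, b1 , c, d, e), v = (a1 , b, c1 , d, e1 ) and w = (a2 , b, c2 , d1 , e) in R,
which project onto R1 (ACDE), R3 (BD) and R4 (BE) respectively. Since B → D and
v[B] = w[B] = b, then w[D] = v[D] = d. Since DE → C, then w[C] = u[C] = c. And
CD → AE implies w[A] = u[A] = a. Thus, w = t and t is also in R. □
Note: it suffices to show that one single tuple of the tableau becomes equal to t
to establish the lossless-join property.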
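
The chase test can also be run as a small program on the tableau. In this sketch, distinguished symbols are lower-case attribute names and the other symbols carry the row number as a suffix; this encoding and the function name `chase_lossless` are assumptions made for the example.

```python
def chase_lossless(schema, decomposition, fds):
    """Chase the tableau of a decomposition with FDs; the join is lossless
    if some row ends up made of distinguished symbols only."""
    attrs = list(schema)
    # One row per fragment: distinguished symbol on its own attributes,
    # a row-specific symbol (e.g. 'c2') everywhere else.
    rows = [
        {a: (a.lower() if a in sub else f"{a.lower()}{i}") for a in attrs}
        for i, sub in enumerate(decomposition, start=1)
    ]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if not all(r1[a] == r2[a] for a in lhs):
                        continue
                    for a in rhs:
                        if r1[a] != r2[a]:
                            # Equate both symbols, keeping the distinguished
                            # one (the shorter string) when there is one.
                            keep, drop = sorted((r1[a], r2[a]), key=len)
                            for row in rows:
                                if row[a] == drop:
                                    row[a] = keep
                            changed = True
    return any(all(row[a] == a.lower() for a in attrs) for row in rows)

sigma = [("AB", "C"), ("DE", "C"), ("B", "D"), ("CD", "AE")]
lossless = chase_lossless("ABCDE", ["ACDE", "BD", "BE"], sigma)
```

On this input the row built from R4(BE) becomes (a, b, c, d, e) after applying B → D, DE → C and CD → AE in turn, so the test succeeds.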



4 Query Answering using Views (6 pts)
(2) 1. Given the two following queries:

Q1 (x, w) ← R(x, y), R(x, z), ¬T (w), z ≠ 3


Q2 (a, d) ← R(a, b), R(a, c), ¬T (d), U (a), b > 5

Show that Q1 contains Q2 .

Solution: Define the following containment mapping (sort of unification) from
Q1 to Q2 :
σ = {(x, a), (w, d), (y, c), (z, b)}
σ fulfills the requirements of the query containment theorem, i.e.:

• it maps each positive subgoal of Q1 to a positive subgoal of Q2 ;

• it maps each negative subgoal of Q1 to a negative subgoal of Q2 ;

• it maps the distinguished variables (the head) of Q1 to those of Q2 ;

• it satisfies b > 5 |= σ(z) ≠ 3, where |= denotes logical implication: since
σ(z) = b, the comparison b > 5 of Q2 entails b ≠ 3, so the comparisons do not
conflict.

(2) 2. Given the 2 following views:

V1 (x, y) ← R(y, x), S(y, x)


V2 (x) ← R(x, 1)

and a query Q against the database:

Q(x, y) ← R(1, y), R(x, 1), S(x, 1)

Perform a Bucket-like algorithm to find all the candidate rewritings of Q using views
V1 and V2 .



Solution: For each query subgoal, list the candidate views that match it, then
combine these per-subgoal rewritings:

• R(1, y): {V1 (y, 1), V2 (1)}

• R(x, 1): {V1 (1, x), V2 (x)}

• S(x, 1): {V1 (1, x)}

Thus, there are 4 rewritings to check:

1. Q1 (x, y) ← V1 (y, 1), V1 (1, x)

2. Q2 (x, y) ← V1 (y, 1), V2 (x), V1 (1, x)

3. Q3 (x, y) ← V2 (1), V1 (1, x)

4. Q4 (x, y) ← V2 (1), V2 (x), V1 (1, x)
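
The combination step amounts to a Cartesian product of the buckets. The sketch below takes the three buckets of the solution as given (it does not compute them) and represents view atoms as plain strings, which is an assumption made for brevity.

```python
from itertools import product

# Buckets from the solution: view atoms that can cover each subgoal of Q.
buckets = {
    "R(1,y)": ["V1(y,1)", "V2(1)"],
    "R(x,1)": ["V1(1,x)", "V2(x)"],
    "S(x,1)": ["V1(1,x)"],
}

def candidate_rewritings(buckets):
    """Pick one view atom per bucket and drop repeated atoms in each body."""
    rewritings = []
    for choice in product(*buckets.values()):
        body = list(dict.fromkeys(choice))  # remove repeats, keep order
        rewritings.append(body)
    return rewritings

candidates = candidate_rewritings(buckets)
```

The product has 2 × 2 × 1 = 4 elements, which are the four candidate bodies once the duplicated atom V1(1,x) is removed.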

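(2) 3. From the previous rewritings, give an equivalent query of Q if it exists, the
maximally-contained query otherwise.
Note: unfold the four candidate rewritings and check whether each one is equivalent
to Q, paying attention to the details; if none is, take the union of all the
rewritings contained in Q as the maximally-contained query.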


Solution: The process requires unfolding each candidate rewriting and checking
for containment (and equivalence). Since the variables are all distinguished, the
containment mapping of each rewriting is straightforward: σ = {(x, x), (y, y)}. It
is actually the identity.

1. Q1 (x, y) ← R(1, y), S(1, y), R(x, 1), S(x, 1): Q1 ⊆ Q but Q ⊈ Q1 since the
S(1, y) subgoal of Q1 cannot map to any subgoal of Q.

2. Q2 = Q1 : nothing to do.

3. Q3 (x, y) ← R(1, 1), R(x, 1), S(x, 1): Q3 ⊆ Q but Q ⊈ Q3 since the R(1, 1)
subgoal of Q3 cannot map to any subgoal of Q. Note that the substitution y/1
is allowed (not 1/y) since 1 is a constant.

4. Q4 = Q3 : nothing to do.

Finally, we are left with 2 rewritings that are contained in Q, but none of them is
equivalent to Q. Hence, the maximally-contained rewriting of Q is the union of Q1
and Q3 .
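
The containment checks can be reproduced by searching for containment mappings by brute force. In this sketch a query is a pair (head, body), variables are strings and constants are integers. One assumption is made explicit: the unfolded Q3 is encoded with head (x, 1), since covering R(1, y) with V2(1) forces y = 1.

```python
from itertools import product

def is_var(term):
    return isinstance(term, str)

def substitute(mapping, terms):
    """Apply a variable mapping to a tuple of terms; constants are fixed."""
    return tuple(mapping[t] if is_var(t) else t for t in terms)

def contained_in(q_sub, q_sup):
    """True if q_sub is contained in q_sup, i.e. there is a containment
    mapping from q_sup to q_sub (head onto head, body into body)."""
    head_sub, body_sub = q_sub
    head_sup, body_sup = q_sup
    variables = sorted(
        {t for _, terms in body_sup for t in terms if is_var(t)}
        | {t for t in head_sup if is_var(t)}
    )
    targets = list({t for _, terms in body_sub for t in terms} | set(head_sub))
    for image in product(targets, repeat=len(variables)):
        mapping = dict(zip(variables, image))
        if substitute(mapping, head_sup) != tuple(head_sub):
            continue
        if all((pred, substitute(mapping, terms)) in body_sub
               for pred, terms in body_sup):
            return True
    return False

Q = (("x", "y"), [("R", (1, "y")), ("R", ("x", 1)), ("S", ("x", 1))])
Q1 = (("x", "y"), [("R", (1, "y")), ("S", (1, "y")),
                   ("R", ("x", 1)), ("S", ("x", 1))])
Q3 = (("x", 1), [("R", (1, 1)), ("R", ("x", 1)), ("S", ("x", 1))])  # y = 1

results = {
    "Q1 in Q": contained_in(Q1, Q),
    "Q in Q1": contained_in(Q, Q1),
    "Q3 in Q": contained_in(Q3, Q),
    "Q in Q3": contained_in(Q, Q3),
}
```

Both rewritings are contained in Q and neither contains Q, which matches the conclusion that no candidate is equivalent to Q.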

5 Data Integration & Data Exchange (6 pts)


Consider a data exchange scenario, given by the following s-t tgds:

∀y : U (y) ⇒ ∃z : T (y, z)
∀x, y : W (x, y) ⇒ T (y, x)

and the target dependency:

∀x, y, z : (T (x, y) ∧ T (x, z)) ⇒ (y = z)



You are given the following source instance:

r = {W (1, 0), W (0, 2), U (0), U (3)}

(2) 1. Construct a universal solution for r, explaining the construction.

Solution:

1. add T (0, 1) from W (1, 0);

2. add T (2, 0) from W (0, 2);

3. add T (0, x) from U (0), with a labeled null x;

4. add T (3, y) from U (3), with a fresh new labeled null y;

5. replace x by 1, as required by the target dependency applied to T (0, 1) and
T (0, x); T (0, x) then coincides with T (0, 1) and is dropped.

Then, a universal solution would be {T (0, 1), T (2, 0), T (3, y)}.
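
The construction can be sketched as a naive chase specialised to the dependencies of this exercise. The encoding is an assumption: constants are integers, labeled nulls are strings 'N1', 'N2', ..., and the function returns None when the chase fails.

```python
from itertools import count

def chase(source):
    """Naive chase for the two s-t tgds and the target egd of this exercise."""
    fresh = (f"N{i}" for i in count(1))
    target = set()
    for x, y in source.get("W", []):   # W(x, y) => T(y, x)
        target.add((y, x))
    for y in source.get("U", []):      # U(y) => exists z: T(y, z)
        target.add((y, next(fresh)))
    # egd: T(x, y) and T(x, z) => y = z
    changed = True
    while changed:
        changed = False
        for k1, v1 in list(target):
            for k2, v2 in list(target):
                if k1 == k2 and v1 != v2:
                    if isinstance(v1, int) and isinstance(v2, int):
                        return None  # two distinct constants: chase fails
                    null, other = (v1, v2) if isinstance(v1, str) else (v2, v1)
                    # Replace the null everywhere by the other value.
                    target = {(k, other if v == null else v)
                              for k, v in target}
                    changed = True
                    break
            if changed:
                break
    return target

r = {"W": [(1, 0), (0, 2)], "U": [0, 3]}
solution = chase(r)
```

With this encoding, the null created for U(0) is merged into the constant 1 and the null created for U(3) survives as 'N2', giving the same three facts as above.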



(2) 2. Compute the certain answers to the queries:
(a) Q1 (x, y) ← T (x, y)
(b) Q2 (x) ← T (x, y)
given the source instance r.

Solution: Certain answers to Q1 are {(0, 1), (2, 0)} and certain answers to Q2
are {0, 2, 3}.
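
For conjunctive queries, the certain answers can be obtained by evaluating the query on a universal solution and discarding every answer that contains a labeled null. A minimal sketch, where the string "N" stands for the labeled null y (our encoding):

```python
# Universal solution from the previous question; "N" encodes the null y.
universal = {(0, 1), (2, 0), (3, "N")}

def is_null(value):
    return isinstance(value, str)

def certain(answers):
    """Keep only the answer tuples made of constants."""
    return {t for t in answers if not any(is_null(v) for v in t)}

q1 = certain({(x, y) for (x, y) in universal})  # Q1(x, y) <- T(x, y)
q2 = certain({(x,) for (x, _) in universal})    # Q2(x) <- T(x, y)
```

The tuple (3, N) is filtered out of Q1 because its second component is a null, while 3 remains a certain answer of Q2 since the null is projected away.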

(2) 3. Show a source instance r0 for which no universal solution exists, assuming the above
dependencies.

Solution: It suffices to show an instance whose chase violates the target dependency
on constants. An easy one would be {W (0, 1), W (1, 1)}: the tgd produces T (1, 0)
and T (1, 1), which cannot coexist since the target dependency would require 0 = 1.
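
The conflict can be exhibited directly by applying the tgd on W and looking for T facts that share their first component. The helper names below are hypothetical.

```python
def apply_w_tgd(w_facts):
    """W(x, y) => T(y, x)"""
    return {(y, x) for (x, y) in w_facts}

def egd_conflicts(t_facts):
    """Pairs of T facts with the same first component and different
    second components (each pair is reported once)."""
    return {(f, g) for f in t_facts for g in t_facts
            if f[0] == g[0] and f[1] < g[1]}

r0 = [(0, 1), (1, 1)]
t_facts = apply_w_tgd(r0)
conflicts = egd_conflicts(t_facts)
```

Since both conflicting values are constants, no renaming of nulls can repair the violation, so there is no solution at all and in particular no universal one.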



Blank space
To be filled only in case of overflow.
