Department of Computer and Information Science and Engineering

University of Florida

COP5725 Fall 2013

Exam II

Instructor: Dr. Daisy Zhe Wang


TAs: Yang Chen, Kun Li, Yang Peng

This is a 120-minute, closed-book exam.


This exam contains 5 single-sided sheets of paper (excluding this one).
Write all answers on these pages, preferably on the white space in the problem statement.
Continue on the draft pages if running out of space but clearly number your answers
if doing so.
Make sure you attack every problem; partial credit will be awarded for incomplete or
partially correct results.

THIS IS A CLOSED BOOK, CLOSED NOTES EXAM.

Name:
UFID:

For grading use only:

Question:    I   II  III   IV  Total
Points:     25   25   25   25    100
Score:
COP5725, Fall 2013 Exam II Page 1 of 5

I. [25 points] SQL.


Consider the following database for movies and stars:
Movies(title, year, length, genre, studioName, producerC#)
MovieStar(name, address, gender, birthyear)
StarsIn(movieTitle, movieYear, starName)
MovieExec(name, address, cert#, netWorth)
Studio(name, address, presC#)

The movie producers and studio presidents are all movie executives, and Movies(producerC#)
and Studio(presC#) are foreign keys referring to MovieExec(cert#).
(1) [6 points] Find the producers (names) of movies in which Harrison Ford stars.
Solution:

SELECT name
FROM MovieExec, Movies, StarsIn
WHERE cert# = producerC# AND
      title = movieTitle AND
      year = movieYear AND
      starName = 'Harrison Ford';

(2) [6 points] For the producers who made at least one film before 1930, find their
names and the total film length they made.
Solution:

SELECT name, SUM(length)
FROM MovieExec, Movies
WHERE producerC# = cert#
GROUP BY producerC#, name
HAVING MIN(year) < 1930;
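As a quick sanity check of the GROUP BY / HAVING logic, the query can be run against a tiny in-memory SQLite database. This is only a sketch: the sample rows are hypothetical and not part of the exam, the identifiers containing # are double-quoted because SQLite requires it, and the sketch groups by both producerC# and name so that it also runs under strict SQL grouping rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MovieExec(name TEXT, address TEXT,
                           "cert#" INTEGER PRIMARY KEY, netWorth INTEGER);
    CREATE TABLE Movies(title TEXT, year INTEGER, length INTEGER, genre TEXT,
                        studioName TEXT, "producerC#" INTEGER);
    -- Hypothetical sample rows (assumption, not exam data).
    INSERT INTO MovieExec VALUES ('Ann Example', 'n/a', 1, 0),
                                 ('Bob Sample',  'n/a', 2, 0);
    INSERT INTO Movies VALUES
        ('Old Film',    1925,  80, 'drama', 'S1', 1),
        ('New Film',    1950, 100, 'drama', 'S1', 1),
        ('Modern Film', 1990, 120, 'drama', 'S2', 2);
""")

# Producers with at least one film before 1930, with their total film length.
rows = conn.execute("""
    SELECT name, SUM(length)
    FROM MovieExec, Movies
    WHERE "producerC#" = "cert#"
    GROUP BY "producerC#", name
    HAVING MIN(year) < 1930
""").fetchall()

print(rows)  # [('Ann Example', 180)]
```

Only the first producer qualifies, and the sum covers all of that producer's films, not just the ones made before 1930, because HAVING filters whole groups rather than individual rows.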

(3) [6 points] For each movie executive who is the president of a studio, prepend the
title 'Pres. ' to their name. (Hint: use || for string concatenation.)
Solution:

UPDATE MovieExec
SET name = 'Pres. ' || name
WHERE cert# IN (SELECT presC# FROM Studio);

(4) [7 points] Create a CHECK or ASSERTION for the following constraint:


A star may not appear in a movie made before they were born.
Solution:

CREATE ASSERTION birthCheck CHECK (
    NOT EXISTS (
        SELECT *
        FROM MovieStar, StarsIn
        WHERE starName = name AND movieYear < birthyear
    )
);

II. [25 points] Indexes.


Consider an empty B+ tree with at most 4 keys (and 5 pointers) per node. Bulk-load
the B+ tree with data entries for the even numbers from 2 to 100 (i.e., 2, 4, 6, ..., 100).
(1) [6 points] What is the height of the tree after inserting all the above keys?
Solution: 2 (3 levels).
(2) [6 points] List all the keys whose insertion increased the height of the tree, e.g.
when key 2 was inserted, tree height increased from 0 to 1.
Solution: {2,42}.
Common Mistake: Several students gave {2, 10, 42} as the answer. This is wrong because
key 2 never goes into the root. Keys {2, 4, 6, 8} are inserted directly into a leaf
(say L1), while the root holds nothing except its first child pointer, to L1. The next
key (10) starts a new leaf and takes the first key slot in the root, and so on.
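The height argument can be checked with a small counting sketch. It assumes a simplified capacity model that is not stated in the exam: leaves are filled completely with 4 keys, the root sits above the first leaf from the start (the convention described above), and a tree of height h can address at most 5^h leaves. The function name height_for is made up for this illustration.

```python
import math

LEAF_CAPACITY = 4   # keys per leaf
FANOUT = 5          # pointers per internal node

def height_for(num_keys):
    """Height (edges from root to leaf) after bulk-loading num_keys keys.

    Assumed convention: the root exists above the first leaf, so a single
    key already gives height 1; an empty tree has height 0.
    """
    if num_keys == 0:
        return 0
    leaves = math.ceil(num_keys / LEAF_CAPACITY)
    height = 1
    while FANOUT ** height < leaves:   # not enough pointers for all leaves
        height += 1
    return height

keys = list(range(2, 101, 2))          # 2, 4, ..., 100
growth_keys = [k for i, k in enumerate(keys, start=1)
               if height_for(i) > height_for(i - 1)]
final_height = height_for(len(keys))

print(final_height, growth_keys)  # 2 [2, 42]
```

Key 42 is the 21st key: it needs a sixth leaf, which the five pointers of a single root cannot address.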

Consider the B+-tree shown in the following figure. Its nodes have at most 4 keys (and
5 pointers). The subtrees A, B and C are valid B+ sub-trees. Answer the following
questions:

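Figure: the B+-tree.

Root: [10 20 30 80]
  pointers 1-3: subtrees A, B, C
  pointer 4: internal node [35 42 50 65]
    leaves: [30 31] [36 38] [42 43] [51 52 56 60] [68 69 70 79]
  pointer 5: internal node [90 98]
    leaves: [81 82] [94 95 96 97] [98 99 100 105]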

(3) [8 points] Show the tree after deleting keys 31 and 81.
Solution: See the tree shown after question (4).
(4) [5 points] Starting from the original tree, find a key which increases the height of
the tree when inserted.
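Solution to (3): the tree after deleting keys 31 and 81.

Root: [10 20 30 80]
  pointers 1-3: subtrees A, B, C
  pointer 4: internal node [42 50 65]
    leaves: [30 36 38] [42 43] [51 52 56 60] [68 69 70 79]
  pointer 5: internal node [95 98]
    leaves: [82 94] [95 96 97] [98 99 100 105]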

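Solution to (4): Any new key in the range [53, 78], e.g. 55, works. It falls into a full
leaf, [51 52 56 60] or [68 69 70 79]; the leaf split overflows the full parent
[35 42 50 65], whose split in turn overflows the full root, so the split propagates all
the way to the root.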

III. [25 points] Query Evaluation.


Assume two relations R(A, B) and S(A, C) with |R| = 15,000,000 tuples and |S| =
800,000 tuples. The disk block size is 2,000 bytes; the tuple size is 400 bytes for both
relations. The values of the integer attribute A are uniformly distributed between 1 and
500,000 in relation R.
(1) [6 points] Consider a clustered alternative 2 B+-tree index on attribute A in
relation R. Assume the index and data entries are 20 Bytes long. Determine
the number of disk blocks (pages) at each level of the tree.
Solution: Each node is one index block and holds 2,000/20 = 100 index entries. Index
blocks required at each level:
• level 3: ⌈500,000/100⌉ = 5,000 blocks (leaf nodes);
• level 2: ⌈5,000/100⌉ = 50 blocks;
• level 1: ⌈50/100⌉ = 1 block (root).
⇒ 5,051 index blocks are needed in total.
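The level-by-level count can be reproduced with a short loop. The sketch follows the solution's assumption that the leaf level holds one entry per distinct A value (500,000 entries); the constant and variable names are chosen only for this illustration.

```python
import math

BLOCK_SIZE = 2000        # bytes per disk block
ENTRY_SIZE = 20          # bytes per index or data entry
DISTINCT_KEYS = 500_000  # entries at the leaf level (assumption of the solution)

entries_per_block = BLOCK_SIZE // ENTRY_SIZE   # 100

# Walk upward from the leaves until a single root block remains.
levels = []              # blocks per level, leaf level first
n = DISTINCT_KEYS
while True:
    n = math.ceil(n / entries_per_block)
    levels.append(n)
    if n == 1:
        break

total_blocks = sum(levels)
print(levels, total_blocks)  # [5000, 50, 1] 5051
```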
(2) [6 points] Estimate the number of disk I/O’s for the query σ_{A=x}(R), assuming the
B+-tree index in Part (1) is used. Compare the cost if we use an unclustered B+-tree
index instead (you need to give the estimated I/O’s for the unclustered index).

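Solution: Clustered:
• Traverse the tree: 2 internal nodes + 1 leaf = 3 block I/O’s;
• Read all data blocks with qualifying tuples (i.e., A = x);
• On average, 15,000,000/500,000 = 30 tuples qualify; at 2,000/400 = 5 tuples per
block, they fill 30/5 = 6 data blocks (their index entries fit in 1 leaf block);
• Total block I/O’s: 3 + 6 = 9.
Unclustered: each of the 30 qualifying tuples may lie in a different data block, so the
estimate is 3 + 30 = 33 block I/O’s.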
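The two estimates differ only in how many data blocks hold the matching tuples. A minimal sketch of that arithmetic, with names chosen for illustration and the 3-level tree from Part (1) taken as given:

```python
import math

R_TUPLES = 15_000_000
DISTINCT_A = 500_000
BLOCK_SIZE = 2000
TUPLE_SIZE = 400
INDEX_LEVELS = 3   # root, one internal level, leaf level

tuples_per_block = BLOCK_SIZE // TUPLE_SIZE   # 5
matches = R_TUPLES // DISTINCT_A              # 30 qualifying tuples on average

# Clustered: matching tuples are stored contiguously.
clustered_cost = INDEX_LEVELS + math.ceil(matches / tuples_per_block)

# Unclustered: worst case, every matching tuple is in a different block.
unclustered_cost = INDEX_LEVELS + matches

print(clustered_cost, unclustered_cost)  # 9 33
```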
(3) [13 points] Estimate the number of disk I/O’s for the following evaluation plans
for S ⋈ R, when M = 3 main memory buffer blocks are available:
• Plan p1: Index nested loop join using the B+ index in (1).
• Plan p2: Sort-merge-join (assume that the relations are already sorted).
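Solution: Plan p1: Index nested loop join
• Let n_S = 800,000 be the number of tuples in S, b_S = 800,000/5 = 160,000 the
number of blocks of S, and c = 9 the cost of one index lookup from Part (2);
• Cost = n_S × c + b_S = 800,000 × 9 + 160,000 = 7,360,000 block I/O’s.
Note: Do not deduct points if they get c = 9 wrong in the previous problem.
Plan p2: Merge join
• Avg. number of tuples with the same A-value in S: 800,000/500,000 = 1.6 ≈ 2;
• All tuples with the same A-value fit in memory;
• Cost = b_S + b_R = 160,000 + 3,000,000 = 3,160,000 block I/O’s, where
b_R = 15,000,000/5 = 3,000,000 is the number of blocks of R.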
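The two plan costs can be compared directly. In this sketch the function names are illustrative, and the per-lookup cost is a parameter so that the effect of a different index cost is easy to see.

```python
TUPLES_PER_BLOCK = 2000 // 400           # 5 tuples of 400 bytes per 2,000-byte block

n_s, n_r = 800_000, 15_000_000           # tuples in S and R
b_s = n_s // TUPLES_PER_BLOCK            # 160,000 blocks of S
b_r = n_r // TUPLES_PER_BLOCK            # 3,000,000 blocks of R

def index_nested_loop_cost(lookup_cost):
    """Scan S once and probe the index on R.A once per S tuple."""
    return n_s * lookup_cost + b_s

def merge_join_cost():
    """Both inputs already sorted: one pass over each relation."""
    return b_s + b_r

p1 = index_nested_loop_cost(9)           # clustered index, c = 9
p2 = merge_join_cost()

print(p1, p2)  # 7360000 3160000
```

With the unclustered estimate of 33 I/O's per lookup, index_nested_loop_cost(33) gives 26,560,000, so the gap to the merge join widens further.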

IV. [25 points] Transactions.


Given the following schedule for transactions T1, T2, and T3.

T1        T2        T3
          r2(Z)
          r2(Y)
          w2(Y)
                    r3(Y)
                    r3(Z)
r1(X)
w1(X)
                    w3(Y)
                    w3(Z)
          r2(X)
r1(Y)
w1(Y)
          w2(X)

(1) [5 points] Draw the conflict graph of this schedule and show whether it is conflict
serializable or not.
Solution:

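Conflict graph (nodes T1, T2, T3) with edges:
  T1 → T2 (conflicts on X)
  T2 → T1 (conflicts on Y)
  T2 → T3 (conflicts on Y and Z)
  T3 → T1 (conflicts on Y)

The schedule is not conflict serializable since the conflict graph contains cycles
(e.g., T1 → T2 → T1).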
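The conflict graph can also be derived mechanically from the schedule. The following is a minimal sketch, not a prescribed method: the schedule is encoded as (transaction, action, item) triples in time order, and the helper names conflict_edges and has_cycle are made up for this illustration.

```python
from itertools import combinations

# The schedule above, in time order: (transaction, action, item).
schedule = [
    (2, "r", "Z"), (2, "r", "Y"), (2, "w", "Y"),
    (3, "r", "Y"), (3, "r", "Z"),
    (1, "r", "X"), (1, "w", "X"),
    (3, "w", "Y"), (3, "w", "Z"),
    (2, "r", "X"),
    (1, "r", "Y"), (1, "w", "Y"),
    (2, "w", "X"),
]

def conflict_edges(schedule):
    """Edge (Ti, Tj) when an action of Ti precedes a conflicting action of Tj."""
    edges = set()
    # combinations() keeps input order, so the first element is the earlier action.
    for (t1, a1, x1), (t2, a2, x2) in combinations(schedule, 2):
        if t1 != t2 and x1 == x2 and "w" in (a1, a2):
            edges.add((t1, t2))
    return edges

def has_cycle(edges):
    """Depth-first search for a back edge in the directed graph."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
    visiting, done = set(), set()

    def dfs(u):
        visiting.add(u)
        for v in graph.get(u, ()):
            if v in visiting or (v not in done and dfs(v)):
                return True
        visiting.discard(u)
        done.add(u)
        return False

    return any(dfs(u) for u in list(graph) if u not in done)

edges = conflict_edges(schedule)
print(sorted(edges), has_cycle(edges))
# [(1, 2), (2, 1), (2, 3), (3, 1)] True
```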
(2) [5 points] Write down the conflicting action pairs and the type of anomaly each such
pair can result in: dirty read, overwriting uncommitted data, or unrepeatable read.

For (3) to (5), consider the following two transactions:


T1: r1(A); r1(B); if A = 0 then B := B + 1; w1(B).
T2: r2(B); r2(A); if B = 0 then A := A + 1; w2(A).
(3) [5 points] Add lock and unlock instructions to T1 and T2 so that they obey the
two-phase locking protocol. Use shared locks for read-only elements.
Solution: Lock and unlock instructions:
T1: sl1(A);                         T2: sl2(B);
    r1(A);                              r2(B);
    xl1(B);                             xl2(A);
    r1(B);                              r2(A);
    if A = 0 then B := B + 1;           if B = 0 then A := A + 1;
    w1(B);                              w2(A);
    u1(A);                              u2(B);
    u1(B).                              u2(A).
(4) [5 points] Show a conflict serializable schedule of T1 and T2 with some degree of
concurrent execution. If it does not exist, explain why.
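Solution: It does not exist. T1 reads A first and writes B last, while T2 reads B first
and writes A last. In any schedule that is not serial, r1(A) therefore precedes w2(A)
and r2(B) precedes w1(B), so the conflict graph contains the cycle T1 → T2 → T1. With
the locks from (3), an attempted interleaving either makes one transaction wait until
the other finishes or ends in a deadlock.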
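The claim can be checked by brute force, since T1 and T2 have only 20 possible interleavings. This sketch treats each transaction as its fixed sequence of reads and writes (the conditional update does not change which items are read or written); the helper names are made up for this illustration.

```python
from itertools import combinations

T1 = [(1, "r", "A"), (1, "r", "B"), (1, "w", "B")]
T2 = [(2, "r", "B"), (2, "r", "A"), (2, "w", "A")]

def interleavings(a, b):
    """Yield every merge of a and b that keeps each transaction's own order."""
    n = len(a) + len(b)
    for slots in combinations(range(n), len(a)):
        chosen = set(slots)
        ia, ib = iter(a), iter(b)
        yield [next(ia) if i in chosen else next(ib) for i in range(n)]

def conflict_serializable(schedule):
    """With two transactions, the only possible cycle is T1 -> T2 -> T1."""
    edges = {(p[0], q[0]) for p, q in combinations(schedule, 2)
             if p[0] != q[0] and p[2] == q[2] and "w" in (p[1], q[1])}
    return not ((1, 2) in edges and (2, 1) in edges)

def is_serial(schedule):
    order = [t for t, _, _ in schedule]
    return order == sorted(order) or order == sorted(order, reverse=True)

all_schedules = list(interleavings(T1, T2))
good = [s for s in all_schedules if conflict_serializable(s)]
all_serial = all(is_serial(s) for s in good)

print(len(all_schedules), len(good), all_serial)  # 20 2 True
```

Only the two serial orders pass the test, which matches the answer that no conflict serializable schedule with real concurrency exists.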
(5) [5 points] Show a concurrent schedule of T1 and T2 that results in a deadlock.
Show also the evolution of the wait-for graph.
Solution:
Step  T1        T2        Wait-for graph
1     sl1(A)
2               sl2(B)
3               r2(B)
4     r1(A)
5     xl1(B)              T1 → T2
6               xl2(A)    T1 ↔ T2
