
UP, FAMNIT, DBD 2020/21 – 2nd exam

Databases for Big Data

2020/21

2nd exam, FAMNIT

NAME AND SURNAME:


STUDENT NUMBER:
PROGRAM:
SIGNATURE:

Exercise 1. (25%)

Suppose we are building an information system for the library called NewBooks. The
central table of the information system is a table Books.

Each book is described by using an identifier, a title, an author, a rating, a publisher, and
a year of publishing.

Books(book_id,title,author,rating,publisher,year)

The most important applications that access the table Books include the following
queries.

1. For each publisher, find the number of published books with a rating greater than
or equal to 6.
2. For a given author, find the books published after the year 2000.

a) Determine the minimal and complete set of predicates for the primary horizontal
fragmentation. Show how the minterms and minterm fragments can be defined.

b) In order to improve availability, we replicate each fragment to one additional server of
our network. We have the following requirements for the information system NewBooks.

• The data must be mutually consistent at all times.
• All the servers in the cluster must be able to answer a query.

Which replication strategy do you suggest? Justify the selection.

Exercise 2. (25%)

Besides the table Books, another important table of our information system is the table
Borrow. Each row of the table Borrow states that a book was lent to a member
on the given date and for the given number of days.

Books(book_id,title,author,rating,publisher,year)
Borrow(borrow_id,book_id,member_id,date,num_days)

The tables Books and Borrow are fragmented on the basis of a set of predicates:
• B1 = σ_{year > 2000}(Books), B2 = σ_{year <= 2000}(Books)
• R1 = σ_{date > 1/1/2000}(Borrow), R2 = σ_{date <= 1/1/2000}(Borrow)
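
The fragment definitions above can be read as selection predicates that split each table into two disjoint parts. The following is only a minimal sketch of that idea: the sample rows, the helper `fragment`, and the constant `CUTOFF` are hypothetical and are not part of the exam data.

```python
from datetime import date

# Hypothetical sample rows; the values are illustrative only.
books = [
    {"book_id": 1, "title": "Book A", "year": 1975},
    {"book_id": 2, "title": "Book B", "year": 2000},
    {"book_id": 3, "title": "Book C", "year": 2015},
]
borrows = [
    {"borrow_id": 10, "book_id": 1, "date": date(1999, 6, 1)},
    {"borrow_id": 11, "book_id": 3, "date": date(2021, 2, 1)},
]


def fragment(rows, predicate):
    """Split rows into (rows satisfying predicate, remaining rows)."""
    selected = [r for r in rows if predicate(r)]
    rest = [r for r in rows if not predicate(r)]
    return selected, rest


# Assumed reading of the date literal 1/1/2000: 1 January 2000.
CUTOFF = date(2000, 1, 1)

# B1 = year > 2000, B2 = year <= 2000
b1, b2 = fragment(books, lambda r: r["year"] > 2000)
# R1 = date > 1/1/2000, R2 = date <= 1/1/2000
r1, r2 = fragment(borrows, lambda r: r["date"] > CUTOFF)
```

Because each pair of predicates is complementary, every row lands in exactly one fragment, so the union of the two fragments reconstructs the original table.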

The following SQL query is given.

Select B.title, B.author
from Books B, Borrow R
where B.title like '%%'
and B.year<1980
and R.date>15/1/2021
and B.book_id=R.book_id;

a) Translate the query into a query tree, i.e., a tree of relational algebra operations.

b) Present a query graph for the given query.

c) Perform the reduction of the above query for the given horizontal fragmentation.

Exercise 3. (25%)

a) Briefly present the alternative multiprocessor architectures that are used for parallel
database systems.

b) Suppose we have six sites denoted Site1 to Site6. Let B1 and B2 be two fragments of the
table Books stored at Site1 and Site2, respectively, and let R1 and R2 be two fragments
of the table Borrow stored at Site3 and Site4, respectively.

Simplified instances of fragments are given below. Let the join result be constructed at
Site5 and Site6. The simple hash function h(k) = { even(k)->5, odd(k)->6 } can be
used.
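
The hash function above maps a join key to the site that receives the tuple. The following is only a minimal sketch of that mapping: the helper `route` and the sample keys are hypothetical and deliberately do not use the fragment instances of this exercise.

```python
def h(k: int) -> int:
    """Hash function from the exercise: even keys go to Site5, odd keys to Site6."""
    return 5 if k % 2 == 0 else 6


def route(keys):
    """Group join keys by the destination site chosen by h (hypothetical helper)."""
    buckets = {5: [], 6: []}
    for k in keys:
        buckets[h(k)].append(k)
    return buckets


# Hypothetical join keys, not taken from the fragments of the exercise.
sample_keys = [10, 11, 12, 13]
destinations = route(sample_keys)
```

Since both relations are routed with the same function on the join attribute, tuples with equal keys always meet at the same site.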

Sketch the steps in the computation of the parallel hash join Books ⋈ Borrow.

B1 (Site1)
book_id  title
2        Book X
7        Book Z

B2 (Site2)
book_id  title
5        Book U
4        Book Q

R1 (Site3)
brrw_id  book_id  member_id
101      3        55
102      2        44

R2 (Site4)
brrw_id  book_id  member_id
103      5        48
104      2        76

c) Why is the join Books ⋈ Borrow sped up by choosing the two sites where the join
results are computed from among Site1-Site4?

Exercise 4. (25%)

Answer the following questions about novel database systems.

a) Describe the structure of the Google SSTable and the essential functions of the
associated tablet servers.

b) Present the main ideas involved in partitioning a key-value database by using
consistent hashing.

c) Present the replication mechanism implemented in the key-value store Dynamo.