
Midterm Examination

COEN 280 Database Systems
Department of Computer Engineering
Santa Clara University
Dr. Ming-Hwa Wang
Fall Quarter 2020
Phone: (408) 805-4175
Email address: mwang2@cse.scu.edu
Course website: http://www.cse.scu.edu/~mwang2/bigdata/
Office Hours: Friday 9:00pm-9:30pm

Honor Code of the School of Engineering

"All students taking courses in the School of Engineering agree, individually
and collectively, that they will not give or receive unpermitted aid in
examinations or other course work that is to be used by the instructor as the
basis of grading."
-From the Graduate/Undergraduate Bulletin

I have read, understood, and agree to abide by the Honor Code of the School of
Engineering.

Name: ___________________________
ID: ___________________________
Signature: ___________________________
Date: ___________________________

1. [20 points]
2. [20 points]
3. [20 points]
4. [20 points]
5. [20 points]
6. [20 points]
7. [20 points]
8. [20 points]
9. [20 points]
10. [20 points]

Total Score:

1. [20 points] True or false problems with wrong-answer penalties:
a) A traditional database can support either structured or unstructured data.
b) A column-oriented database can handle a much larger database than a
row-oriented database.
c) An aggregate database or distributed database normally does not support
ACID transactions.
d) JSON is typically used in content-management systems, and XML is typically
used in web-based operational loads.

2. [20 points] Among the four kinds of NoSQLs (key-value NoSQLs, document NoSQLs,
column-family NoSQLs, or graph NoSQLs),
a) Which do not support, or have difficulty supporting, the join() operation?
b) Which focus more on relationships than on data?

3. [20 points] Which of the following operations are more efficient on columnar
databases than on row-oriented databases:
a) CRUD (create, read, update, delete) operations
b) Aggregate functions (e.g., sum(), average(), etc.)
c) OLTP (online transaction processing)
d) OLAP (online analytic processing)
e) Data compression and decompression (especially for sorted data)
f) ETL (extract, transform, load) or ELT tasks
g) Projection operations

4. [20 points] Which of the following support foreign keys:
a) Relational databases
b) Key-value NoSQLs
c) Document NoSQLs
d) Column-family NoSQLs
e) Graph NoSQLs
5. [20 points] Which of the following NoSQLs are key-value NoSQLs, document
NoSQLs, column-family NoSQLs, or graph NoSQLs: Cassandra, CouchDB,
Dynamo, HBase, MongoDB, Neo4j, OrientDB, Redis, Riak, Scylla.

6. [20 points] For the following applications, please classify them based on the
CAP theorem (i.e., CA, CP, or AP):
a) Traditional relational databases.
b) Global social networks.
c) Applications that need to support ACID transactions in a distributed database.
d) Distributed applications with imperfect internet connections.

7. [20 points] For NoSQLs with tunable or configurable consistency, e.g., Dynamo,
we can use 1) N for the replication factor, 2) W for the number of copies of the
data item that must be written before the write can complete, and 3) R for the
number of copies that the application will access when reading the data item.
Please use N, W, and R to represent how to do tunable consistency.

8. [20 points] Consider a distributed NoSQL, e.g., Cassandra, with 3 nodes: node1,
node2, and node3. The nodes use vector clocks with monotonically increasing
counters to detect conflicts. Initially, all nodes start with counters (0, 0, 0),
and then increment the local clock whenever a send/receive event happens. Please
indicate the clock vector for each event and indicate which events conflict and
need to be resolved. Events are named a, b, c, d, e, f, g, h. Each node has the
events listed in chronological order below:
node1: send b to node2 as d, receive node2’s d as f and send f to node2 as h
node2: receive node3’s a as c, receive node1’s b as d and send d to both node1
as f and node3 as e, receive node1’s f and node3’s g as h
node3: send a to node2 as c, receive node2’s d as e, send g to node2 as h

  node1        node2        node3
(0, 0, 0)    (0, 0, 0)    (0, 0, 0)
    |            |            a
    b            c            |
    |            d            |
    f            |            e
    |            |            g
    |            h            |

9. [20 points] Computer Science Department frequent fliers have been
complaining to Dane County Airport officials about the poor organization at the
airport. As a result, the officials decided that all information related to the
airport should be organized using a DBMS, and you have been hired to design
the database. Your first task is to organize the information about all the
airplanes stationed and maintained at the airport. The relevant information is
as follows:
• Every airplane has a registration number, and each airplane is of a specific
model.
• The airport accommodates a number of airplane models, and each model
is identified by a model number (e.g., DC-10) and has a capacity and a
weight.
• A number of technicians work at the airport. You need to store the name,
SSN, address, phone number, and salary of each technician.
• Each technician is an expert on one or more plane model(s), and his or her
expertise may overlap with that of other technicians. This information
about technicians must also be recorded.
• Traffic controllers must have an annual medical examination. For each
traffic controller, you must store the date of the most recent exam.
• All airport employees (including technicians) belong to a union. You must
store the union membership number of each employee. You can assume
that each employee is uniquely identified by a social security number.
• The airport has a number of tests that are used periodically to ensure that
airplanes are still airworthy. Each test has a Federal Aviation
Administration (FAA) test number, a name, and a maximum possible score.
• The FAA requires the airport to keep track of each time a given airplane is
tested by a given technician using a given test. For each testing event, the
information needed is the date, the number of hours the technician spent
doing the test, and the score the airplane received on the test.
a) Draw an ER diagram for the airport database. Be sure to indicate the
various attributes of each entity and relationship set; also specify the key
and participation constraints for each relationship set. Specify any
necessary overlap and covering constraints as well (in English).
b) The FAA passes a regulation that tests on a plane must be conducted by a
technician who is an expert on that model. How would you express this
constraint in the ER diagram? If you cannot express it, explain why.

10. [20 points] Although you always wanted to be an artist, you ended up being an
expert on databases because you love to cook data and you somehow confused
database with data baste. Your old love is still there, however, so you set up a
database company, ArtBase, that builds a product for art galleries. The core of
this product is a database with a schema that captures all the information that
galleries need to maintain. Galleries keep information about artists, their
names (which are unique), birthplaces, age, and style of art. For each piece of
artwork, the artist, the year it was made, its unique title, its type of art (e.g.,
painting, lithograph, sculpture, photograph), and its price must be stored.
Pieces of artwork are also classified into groups of various kinds, e.g., portraits,
still lifes, works by Picasso, or works of the 19th century; a given piece may
belong to more than one group. Each group is identified by a name (like those
just given) that describes the group. Finally, galleries keep information about
customers. For each customer, galleries keep that person’s unique name,
address, total amount of dollars spent in the gallery (very important), and the
artists and groups of art that the customer tends to like. Please draw the ER
diagram for the database.
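
Note on question 8: the vector-clock bookkeeping described there (increment the local counter on every send or receive event) can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not a model answer: the function names, the element-wise merge on receive, and the three sample events p, q, r are hypothetical and are not taken from the exam's events a to h.

```python
from typing import Tuple

Clock = Tuple[int, ...]

def send(local: Clock, me: int) -> Clock:
    """Tick the sender's own counter; the result is stamped on the message."""
    ticked = list(local)
    ticked[me] += 1
    return tuple(ticked)

def receive(local: Clock, msg: Clock, me: int) -> Clock:
    """Merge element-wise with the message's clock, then tick the receiver's own counter."""
    merged = [max(a, b) for a, b in zip(local, msg)]
    merged[me] += 1
    return tuple(merged)

def happened_before(u: Clock, v: Clock) -> bool:
    """u precedes v if every counter in u is <= the one in v and the clocks differ."""
    return u != v and all(a <= b for a, b in zip(u, v))

def concurrent(u: Clock, v: Clock) -> bool:
    """Neither clock precedes the other, so the two events may conflict."""
    return u != v and not happened_before(u, v) and not happened_before(v, u)

# Hypothetical scenario on three nodes with indices 0, 1, 2 (assumed, not from the exam).
start = (0, 0, 0)
p = send(start, 0)        # node 0 sends a message
q = send(start, 2)        # node 2 sends independently of node 0
r = receive(start, p, 1)  # node 1 receives node 0's message

print(p, q, r)
print("p before r:", happened_before(p, r))
print("p and q concurrent:", concurrent(p, q))
```

Two events whose clocks are concurrent in this sense are the ones a store would have to reconcile; events ordered by happened_before need no resolution.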
