
PRES PRESENTATIONS

PRES NO, PRES MEMBERS AND QUESTIONS (DUE 4 March 2022) | PRES AREAS, TOPICS AND POINTS FOR PRESENTATION
Sir
Presentation area: The relational model
• Relation and its properties
• Relational Model Concepts
• Relational Constraints and Relational Database Schemas
• Update Operations and Dealing with Constraint Violations
• Basic Relational Algebra Operations
• Additional Relational Operations
• Examples of Queries in Relational Algebra

Group 1
Members: Charlton Gotami (B1851654), Blessing Chikonhi (B1851993), Tanaka Zvakaramba (B1953658), Kundiso Nyowani (B1852769)
Presentation area: Database analysis and design
• Overview of the Entity relationship (ER) model and Extended Entity relationship (EER) model
• Design methodologies, Conceptual, Logical and Physical
• ER- and EER-to-Relational Mapping
• Relational Database Design Using ER-to-Relational Mapping
• Mapping EER Model Concepts to Relations

a) Suppose you and your friends are starting an e-commerce company which sells various everyday products, such as perfume and toys, online. Now you are trying to design the company’s website. Based on the following requirements, design an ER diagram for the database of the website. For each binary relationship you identify, state the cardinalities (1:1, 1:m or m:n) on the entities participating in this relationship. [20]

• The database maintains the information of customers, including the customer’s name, email address, shipping address, billing address, credit card number, and phone number. In order to arrange the shipment efficiently and reduce the cost, the shipping address is composed of street, state and zip code.
• There are two kinds of customers: registered customers and non-registered customers. Registered customers are identified by their registered ids, and for each non-registered customer, a temporary id is used.
• A product has a product id, a name, its price, a supplier (from where this product is purchased) and a description. Each product is identified by the product id.
• Each product has a number of items. All the items of the same product look identical; however, they differ in their item ids (imagine going to the supermarket: although you buy two of the same thing, they have different barcodes). In addition, each item has a production date. The item id alone is not enough to distinguish different items across all kinds of products; instead, it must be associated with its corresponding product id.
• Each customer can order many items at a time. When he/she makes an order, the date, time, and total amount of that order are recorded. The total amount is not stored but is calculated each time a customer makes an order, by adding together the prices of all the items.
• Each product belongs to one or more categories. For example, a photographer’s book can belong to both “book” and “photography”. Each category includes many kinds of products. A category has a category number and a category name, and is identified by the category number.
• For each registered customer, you will keep track of his/her favorite categories. This will be useful when you suggest products for his/her future purchases. A customer can have one or more favorite categories, and for each favorite you will keep a record of the number of purchases he/she made in that category.

b) Consider the following information about a university database:
• Professors have an SSN, a name, an age, a rank, and a research specialty.
• Projects have a project number, a sponsor name (e.g., NSF), a starting date, an ending date, and a budget.
• Graduate students have an SSN, a name, an age, and a degree program (e.g., M.S. or Ph.D.).
• Each project is managed by one professor (known as the project’s principal investigator).
• Each project is worked on by one or more professors (known as the project’s co-investigators).
• Professors can manage and/or work on multiple projects.
• Each project is worked on by one or more graduate students (known as the project’s research assistants).
• When graduate students work on a project, a professor must supervise their work on the project. Graduate students can work on multiple projects, in which case they will have a (potentially different) supervisor for each one.
• Departments have a department number, a department name, and a main office.
• Departments have a professor (known as the chairman) who runs the department.
• Professors work in one or more departments, and for each department that they work in, a time percentage is associated with their job.
• Graduate students have one major department in which they are working on their degree.
• Each graduate student has another, more senior graduate student (known as a student advisor) who advises him or her on what courses to take.

Design and draw an ER diagram that captures the information about the university. [20]

c) The academic world is an interesting example of international cooperation and exchange. This problem is concerned with modelling a database that contains information on researchers, academic institutions, and collaborations among researchers. A researcher can be employed either as a professor or as a lab assistant. There are three kinds of professors: assistant, associate, and full professors. The following should be stored:
• For each researcher, his/her name, year of birth, and current position (if any).
• For each institution, its name, country, and inauguration year.
• For each institution, the names of its schools (e.g. School of Law, School of Business, School of Computer Science, ...). A school belongs to exactly one institution.
• An employment history, including information on all employments (start and end date, position, and what school).
• Information about co-authorships, i.e., which researchers have co-authored a research paper. The titles of common research papers should also be stored.
• For each researcher, information on his/her highest degree (BSc, MSc or PhD), including who was the main supervisor, and at what school.
• For each professor, information on what research projects (title, start date, and end date) he/she is involved in, and the total amount of grant money for which he/she was the main applicant.

Design and draw an ER diagram for the data sets described above. [20]

d) Transform the following ER model to a database schema, identifying the primary keys and foreign keys. [12]

a)

Group 2
Members: Emelda Tsvuura (B190069A), James Dahwa (B1953172), Chantelle Matsheza (B191118A), Wilson Mafunga (B1953482)
Presentation area: Index Structures for Files
• Types of Single-Level Ordered Indexes
• Multilevel Indexes
• Dynamic Multilevel Indexes Using B-Trees and B+-Trees
• Indexes on Multiple Keys
• Other Types of Indexes

a) Suppose you are given a relation R = (A, B, C, D, E) with the following functional dependencies: {CE → D, D → B, C → A}.
i. Identify the best normal form that R satisfies. [2]
ii. If the relation is not in BCNF, decompose it until it becomes BCNF. At each step, identify a new relation, decompose it, and re-compute the keys and the normal forms they satisfy. [4]
b) Prove Armstrong’s union rule. [4]
c) Given R (A, B, C, D, E) with the set of FDs F = {AB → CD, A → E, C → D}. Is the decomposition of R into R1 (A, B, C), R2 (B, C, D) and R3 (C, D, E) lossless? Prove your answer. [10]
d) In recent years alternatives to the Relational data model have emerged. Describe the key characteristics of each of the following data models:
i. The Document Oriented; [5]
ii. A Spatial/Geographical. [5]
b) Using your own suitable example code and diagrams, explain how the following database concepts are implemented:
i. Trigger; [5]
ii. Function; [5]
iii. Cursor; [5]
iv. View; [5]
v. Stored Procedure. [5]
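
The functional-dependency questions for Group 2 (normal form of R, and whether the three-way decomposition is lossless) both rest on mechanical procedures that can be checked by hand or by a short program. The following is a minimal Python sketch, not a model answer: `closure` computes an attribute closure and `is_lossless` runs the standard tableau (chase) test. The function names and the encoding of attribute sets as strings are assumptions made for illustration only.

```python
def closure(attrs, fds):
    """Attribute closure of `attrs` under `fds`, a list of (lhs, rhs) strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_lossless(schema, fds, parts):
    """Chase test: True if decomposing `schema` into `parts` is lossless."""
    # One tableau row per fragment: "a" is the distinguished symbol,
    # every other cell gets a unique (row, column) placeholder.
    rows = [{x: "a" if x in part else (i, x) for x in schema}
            for i, part in enumerate(parts)]
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            for r1 in rows:
                for r2 in rows:
                    if r1 is r2 or any(r1[x] != r2[x] for x in lhs):
                        continue
                    for y in rhs:
                        if r1[y] != r2[y]:
                            # Equate the two symbols, preferring "a".
                            if r1[y] == "a":
                                old, new = r2[y], r1[y]
                            else:
                                old, new = r1[y], r2[y]
                            for r in rows:
                                if r[y] == old:
                                    r[y] = new
                            changed = True
    # Lossless iff some row consists only of distinguished symbols.
    return any(all(v == "a" for v in r.values()) for r in rows)


# Question a): FDs {CE -> D, D -> B, C -> A}
fds_a = [("CE", "D"), ("D", "B"), ("C", "A")]
print(sorted(closure("CE", fds_a)))   # ['A', 'B', 'C', 'D', 'E']

# Question c): FDs {AB -> CD, A -> E, C -> D}, fragments ABC, BCD, CDE
fds_c = [("AB", "CD"), ("A", "E"), ("C", "D")]
print(is_lossless("ABCDE", fds_c, ["ABC", "BCD", "CDE"]))   # False
```

The program only reports the outcome of the test; a written proof still has to show the tableau steps and explain why no row becomes fully distinguished.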
Group 3
Members: Terrence Zakwa (B1851379), Kudzanayi Kawara (B190619A), Benovelent Mudzinganyama (B1851682), Cloudious Chikazhe (B1953683)
Presentation area: Database Languages
• Relational algebra
• Tuple Relational Calculus
• Domain Relational Calculus
• Overview of the QBE Language

b) Let us assume a database for Web pages that keeps data related to courses taught at a university. We consider the following part of the database schema:

webpage(webID, webTitle, url, hits)
courseware(cID, cDescription, ftpLocation, category)
ftpLink(webID, cID)

The relation webpage stores data about web pages. Each web page is described by a unique identifier webID, a title webTitle, its url, and the number of hits to it. The relation courseware contains data about the course material of a course. Each courseware record stores a unique identifier cID to identify a courseware file, a description cDescription of the courseware file, a directory ftpLocation from where the file can be downloaded, and the category of courseware (‘P’ = publication, ‘N’ = notes, ‘D’ = document, ‘E’ = executable). The relation ftpLink stores for each Web page the ftp links to multiple courseware files and for each courseware file the different Web pages to which it is linked.

Write relational algebra for the following queries:
i. Find web ids and web titles of web pages which provide courseware with the cID of “COP5725”.
ii. Find web ids and web titles of web pages which ONLY provide courseware with the cID of “COP5725”.
iii. Find the url of web pages which have a number of hits in the range [5,000, 10,000).
iv. Find web ids and web titles of web pages which provide course materials with the cID of “COP5725” and the category of publication.
[4X3]
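
One way to sanity-check relational algebra expressions for the webpage/courseware/ftpLink queries is to evaluate them on a few tuples, treating each relation as a Python set. The sketch below is an illustration only: the sample tuples and URLs are hypothetical, only the schema comes from the question, and each comment gives one possible algebra expression rather than the expected answer.

```python
# Hypothetical sample data; only the attribute order follows the schema.
webpage = {
    (1, "Databases", "http://example.edu/db", 7200),
    (2, "Algorithms", "http://example.edu/algo", 12000),
    (3, "Mixed", "http://example.edu/mixed", 5000),
}
courseware = {
    ("COP5725", "Paper", "/ftp/cop5725", "P"),
    ("COP4600", "Lab code", "/ftp/cop4600", "E"),
}
ftp_link = {(1, "COP5725"), (3, "COP5725"), (3, "COP4600"), (2, "COP4600")}

# i.  project[webID, webTitle](webpage JOIN select[cID = 'COP5725'](ftpLink))
with_5725 = {w for (w, c) in ftp_link if c == "COP5725"}
q1 = {(w, t) for (w, t, _, _) in webpage if w in with_5725}

# ii. as in i, MINUS the pages that also link to any other courseware
with_other = {w for (w, c) in ftp_link if c != "COP5725"}
q2 = {(w, t) for (w, t, _, _) in webpage if w in with_5725 - with_other}

# iii. project[url](select[hits >= 5000 AND hits < 10000](webpage))
q3 = {u for (_, _, u, h) in webpage if 5000 <= h < 10000}

# iv. as in i, with the join extended to
#     select[cID = 'COP5725' AND category = 'P'](courseware)
pubs = {c for (c, _, _, cat) in courseware if c == "COP5725" and cat == "P"}
q4 = {(w, t) for (w, t, _, _) in webpage
      if any((w, c) in ftp_link for c in pubs)}

print(sorted(q1))  # [(1, 'Databases'), (3, 'Mixed')]
print(sorted(q2))  # [(1, 'Databases')]
```

Query ii is the one most often answered incorrectly: a selection alone cannot express "only", so the sketch uses a set difference, which corresponds to the difference operator in relational algebra.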
Group 4
Members: Mavis Mundenda (B190230A), Salomu Muzondo (B190676B), Maranatha Chitare (B1953785), Melisa Musa (B190049A), Tapiwa Shumba (B1851232)
Presentation area: Structured Query Language (PRACTICAL)
• Data Definition, Constraints, and Schema Changes in SQL
• Basic Queries in SQL
• More Complex SQL Queries
• Views (Virtual Tables) in SQL
• Specifying General Constraints as Assertions
• Additional Features of SQL

a) Let us assume a database for Web pages that keeps data related to courses taught at a university. We consider the following part of the database schema:

webpage(webID, webTitle, url, hits)
courseware(cID, cDescription, ftpLocation, category)
ftpLink(webID, cID)

The relation webpage stores data about web pages. Each web page is described by a unique identifier webID, a title webTitle, its url, and the number of hits to it. The relation courseware contains data about the course material of a course. Each courseware record stores a unique identifier cID to identify a courseware file, a description cDescription of the courseware file, a directory ftpLocation from where the file can be downloaded, and the category of courseware (‘P’ = publication, ‘N’ = notes, ‘D’ = document, ‘E’ = executable). The relation ftpLink stores for each Web page the ftp links to multiple courseware files and for each courseware file the different Web pages to which it is linked.

Write relational algebra for the following queries:
i. Find web ids and web titles of web pages which provide courseware with the cID of “COP5725”.
ii. Find web ids and web titles of web pages which ONLY provide courseware with the cID of “COP5725”.
iii. Find the url of web pages which have a number of hits in the range [5,000, 10,000).
iv. Find web ids and web titles of web pages which provide course materials with the cID of “COP5725” and the category of publication.
[4X3]

c) Consider the relations defined below:

PHYSICIAN (regno, name, telno, city)
PATIENT (pname, street, city)
VISIT (pname, regno, date-of-visit, fee)

Express the following queries in SQL.

i. Create an assertion that the total fees charged for a patient cannot be more than Rs.1000/-, assuming that patients can visit the same doctor more than once. [2]
ii. Create a view Patient_visits(name, times) where name is the name of the patient and times is the number of visits of a patient. [3]
iii. Display the ptname, ptaddr of the patient(s) who have visited more than one physician in the month of May 2000 in ascending order of ptname. [3]

Group 5
Members: Moses Jawi (B1952984), Liberty Alois Chandiwana (B1953675), Trisher T Mpatsi (B191368A), Munashe Ticharambeyi (B1953003)
Presentation area: Query Processing and Optimization
• Translating SQL Queries into Relational Algebra
• Using Heuristics in Query Optimization
• Using Selectivity and Cost Estimates in Query Optimization

a) How does a query tree represent a relational algebra expression? [4]
b) Discuss any two rules for query optimisation, giving an example of when each rule should be applied. [4]
c) Examine the typical phases of query processing. [10]
d) Given the following three linked tables:

Customers (custID, name, country)
Products (prodID, price)
Orders (orderID, custID*, prodID*, date)

Suppose we have the following query and its corresponding initial parse tree:

SELECT Customers.name
FROM Customers, Orders, Products
WHERE Customers.custID = Orders.custID
AND Orders.prodID = Products.prodID
AND Orders.date = '16-Sep-2017'
AND Products.price > 50;

e) What problems arise if the query is executed based on the above parse tree? [2]
f) Transform the above parse tree into one that corresponds to the most efficient way of processing the query. [12]
g) Consider the following three linked tables that contain information about students, the modules they are studying, and results of assessments for those modules:

students (studID, name, course)
modules (modNbr, title, credits)
results (studID*, modNbr*, date, grade)

Consider the following query:

SELECT M.title, S.name
FROM students S, modules M, results R
WHERE S.studID = R.studID
AND M.modNbr = R.modNbr
AND S.course = 'Computing'
AND M.credits = 15
AND R.grade = 'A';

Draw a query tree that corresponds to the most efficient way of processing this query. [12]
h) Consider the following three linked tables that contain information about employees and the projects they work on:

employees (empID, name, salary)
project (projNbr, title, budget)
workload (empID*, projNbr*, duration)

Consider the following query:

SELECT P.title, E.name
FROM employees E, project P, workload W
WHERE E.empID = W.empID
AND P.projNbr = W.projNbr
AND E.salary > 15000
AND W.duration < 20;

i) Draw an initial relational algebra tree for the above query. [8]
j) Apply a series of transformations to the tree obtained in part (i) to make the query more efficient. Discuss each step and state the heuristic used. [12]
k) Consider the following tables:

Film (filmNbr, title, year)
Director (directID, name)
Directing (directID, filmNbr)

And the following query:

SELECT Film.title
FROM Film, Director, Directing
WHERE Film.filmNbr = Directing.filmNbr
AND Director.directID = Directing.directID
AND Director.name = 'Lucas'
AND Film.year = 2015;

l) Draw a query tree that corresponds to the most efficient way of processing this query. [10]
m) Assume there is a B-Tree index on the column “title” of the table “Film”. For each of the following queries, explain how this index could be used when executing the query:

SELECT * FROM Film WHERE title = 'Up';
SELECT * FROM Film ORDER BY title;
SELECT COUNT(title) FROM Film; [6]

Group 6
Members: Knowledge Sithole (B1953752), Tinashe Taruvinga (B1953017), Allan Kamuruko (B190044A), Rowson Zingwina (B190191A), Kudzai Tsikira (B1953895)
Presentation area: Transaction management and Database Recovery
• Schedules and Recoverability
• Serializability of Schedules
• Transaction Support in SQL
• Concurrency Control Techniques
• Database Recovery Techniques

a) Describe the shadow paging technique using a diagram. [9]
b) Examine the deferred modification and immediate modification techniques for recovery, explaining how recovery takes place in case of a failure under each technique. [10]
c) Let transactions T1, T2 and T3 be defined to perform the following operations:

T1: Add one to A
T2: Double A
T3: Display A on the screen and then set A to one.

(where A is some item in the database)
Suppose transactions T1, T2 and T3 are allowed to execute concurrently. If A has initial value zero, how many possible correct results are there? Enumerate them. [12]
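
For the T1/T2/T3 question, one common reading (assumed here, not stated in the question) is that a concurrent execution is "correct" if it is equivalent to some serial order of the three transactions. Under that assumption the candidate results can be enumerated by running every serial order. The sketch below is only an illustration; the function names are made up.

```python
from itertools import permutations


def t1(a, shown):
    return a + 1          # T1: add one to A


def t2(a, shown):
    return a * 2          # T2: double A


def t3(a, shown):
    shown.append(a)       # T3: display A ...
    return 1              # ... then set A to one


def serial_results(initial=0):
    """Map each serial order to (value displayed by T3, final value of A)."""
    results = {}
    for order in permutations([("T1", t1), ("T2", t2), ("T3", t3)]):
        a, shown = initial, []
        for _, txn in order:
            a = txn(a, shown)
        results["-".join(name for name, _ in order)] = (shown[0], a)
    return results


for order, (displayed, final) in serial_results().items():
    print(order, "displays", displayed, "and leaves A =", final)
```

Whether two orders count as the same "result" depends on whether the displayed value is considered part of the result or only the final value of A; the answer should state which convention it uses.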

d) Using your own simple examples and/or diagrams, discuss the three potential data integrity and consistency problems that may occur in a multi-user database if concurrency control techniques are not fully implemented. [12]
e) Consider the transactions t1, t2 and t3 and a schedule S given below:
S: read1(A); read2(B); write1(C); read3(B); read3(C); write2(B); write3(A)
where the number attached to each operation denotes the transaction number. Assume that the timestamps satisfy t1 < t2 < t3. Using the timestamp-ordering scheme for concurrency control, find out whether the schedule will go through. If there is to be a rollback, which transaction(s) will be rolled back? [5]
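
The timestamp-ordering check for schedule S can be traced mechanically. The sketch below assumes the basic timestamp-ordering protocol (no Thomas write rule) and assigns the timestamps 1, 2 and 3 to t1, t2 and t3; both choices are assumptions, since the question does not name a variant. It also simplifies by skipping the remaining operations of a rolled-back transaction instead of undoing its earlier writes and restarting it.

```python
def basic_timestamp_ordering(schedule, ts):
    """Return the transactions rolled back under basic timestamp ordering.

    schedule: list of (op, txn, item) with op "r" or "w"
    ts: dict mapping each transaction to its timestamp
    """
    read_ts, write_ts = {}, {}
    aborted = []
    for op, txn, item in schedule:
        if txn in aborted:
            continue  # simplification: ignore the rest of an aborted txn
        t = ts[txn]
        if op == "r":
            # A read is rejected if a younger txn already wrote the item.
            if t < write_ts.get(item, 0):
                aborted.append(txn)
            else:
                read_ts[item] = max(read_ts.get(item, 0), t)
        else:
            # A write is rejected if a younger txn already read or wrote it.
            if t < read_ts.get(item, 0) or t < write_ts.get(item, 0):
                aborted.append(txn)
            else:
                write_ts[item] = t
    return aborted


S = [("r", "t1", "A"), ("r", "t2", "B"), ("w", "t1", "C"),
     ("r", "t3", "B"), ("r", "t3", "C"), ("w", "t2", "B"),
     ("w", "t3", "A")]
TS = {"t1": 1, "t2": 2, "t3": 3}
print(basic_timestamp_ordering(S, TS))  # ['t2']
```

A written answer should still show the read and write timestamps of A, B and C after each step, since that is what justifies the rejected operation.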
f) The following programming techniques can be thought of as extending the range and functionality of SQL, performing tasks that SQL cannot perform on its own. Describe, with the aid of examples, how each programming technique extends the range and functionality of SQL.
g) Stored procedures [3]
h) Triggers [3]
i) Embedding SQL in a programming language such as Java/PHP [3]

j) Using your own simple examples and suitable diagrams, explain the following transaction-processing concepts.
k) Two-phase locking
l) Two-phase commit
m) Cascaded rollback
n) Wait-for graphs
o) Checkpoints [5X3]

Group 7
Members: LOCADIA L MURISA (B1953743), SHORAI MATIYA (B1851534), FARAI DZIMIRI (B1953661), VINCENT KAPIKINYU (B190301A), LAANCE MUNZWEMBIRI (B190625A)
Presentation area: Distributed Databases
• Data Fragmentation, Replication, and Allocation Techniques for Distributed Database Design
• Types of Distributed Database Systems
• Query Processing in Distributed Databases
• Concurrency Control and Recovery in Distributed Databases
• Overview of Client-Server Architecture and Its Relationship to Distributed Databases

The tables below are currently held on a database at the single site called CTR. Following reorganization it is intended to distribute the journals held at the central database across 3 new branch libraries located at remote sites called GTR, UTC and TWE. The central library becomes the HQ (an administrative centre), meaning that it no longer keeps or loans out any journals itself. Instead, journals are made available for loan to borrowers registered at any of the 3 new sites.

a) Describe three different proposals for data distribution of the central database (CTR). [15]
Hint: Show the distribution/replication of table fragments/partitions and explain any trade-offs and pros/cons you think are relevant.
b) Describe the criteria you would use to assess the effectiveness of the approach you recommend. [15]
c) A small bank has two branches, one located in York and one located in Leeds. Currently the bank’s centralised database is managed at its HQ in London, where it keeps data about its customer accounts. Other uses include collecting data for the generation of reports and monitoring the services that customers use. Applications have been installed at the two branches to access the centralised database via a fast communication network for accessing the data they need. There is also a communication link between the two branches, which is currently used only when one of the main links to the London headquarters fails. For simplicity the centralised database system holds a single Customer table (see figure below), where data about customer accounts are kept. The columns of the Customer table are the account number, the customer’s name, the branch where the account is kept and its current balance. The bank has been asked to split its branches so that they can support customers when there is a failure of the centralised database or the communication system is slow or breaks down.

Describe three ways to distribute data across the three sites. Comment on the pros and cons of each option. [15]

d) Describe transparency features that exist in a distributed database environment. [9]

e) By their very nature, data warehouses get bigger over time. As the search space increases, the query performance decreases and tuning techniques are required. For each of the two approaches listed below, using your own appropriate diagrams and examples, discuss the essential concepts and techniques.
f) Aggregation and Summary Data [7]
g) Indexing and Optimization [8]

Group 8
Members: Bessie W Chitote (B190709A), Susan Guni (B1953572), Joel Tinashe Mashawi (B190144A), Elton Bangajena (B1953092)
Presentation area: Data Warehousing And Data Mining
• Data Warehousing
• Data Mining

a) Explain the following terms in data warehousing and data science.
i. Snowflake Schema
ii. Fact Constellation
iii. Star schema
iv. Data Marts
v. Data Cleansing
vi. Materialized Views
vii. Roll-up, drill-down and slicing & dicing [7X5]

b) Discuss the key characteristics of a data warehouse and how it differs in content, structure and function from an on-line transaction processing (OLTP) database. You should support your discussion with suitable diagrams and examples. [12]
c) Describe the five types of knowledge produced from data mining, giving an example of each. [10]
d) Recently there has been a lot of interest in alternatives to relational database systems known colloquially as 'NoSQL' databases and often referred to as 'Not Only SQL'. These alternatives have emerged to handle the massive growth of data, so-called Big Data. Examples of Big Data applications include Facebook (and social media in general) and Google Analytics (Intelligent Product marketing).
With reference to the above statement, discuss any four main requirements and challenges of developing and running Big Data applications. [12]
e) With the aid of appropriate examples explain the following data mining algorithms:
i. Apriori Algorithm [5]
ii. Frequent Pattern Tree Algorithm [5]

Group 9
Members: JUSTIN JAPI (B1953909), TANAKA T CHIKOMO (B191139A), PANASHE W KASEKE (B1953765), OTHINIEL T MAKAKA (B190591A), ROBERT NEMON (B190177A)
Presentation area: Database Security and Emerging Database Technologies and Applications
• Introduction to Database Security Issues
• Access Control Based on Granting/Revoking of Privileges
• Databases on the World Wide Web
• Multimedia Databases
• Mobile Databases
• Geographic Information Systems Databases

The hosting of database services on the ‘Cloud’ is representative of a Service Oriented Architecture (SOA). The SOA company that hosts a client’s database distributes the database physically and geographically, replicating the data across multiple database servers.

a) Outline the benefits of the distributed database architecture described above as opposed to a database that is centralised and not hosted. Illustrate your answer with a scenario of your choice that describes a database application that would benefit from this approach. [10]
b) Describe the concept of data replication as a particular technique for distributing data and explain, using your application in a) as a source of examples, how this is achieved. (Hint: use diagrams and examples to illustrate how a distributed database is configured and accessed) [10]
c) A fundamental principle of a distributed system is that, to a database user, a distributed database system should behave exactly like a non-distributed system. Discuss the implications of this statement for database integrity and the consistency of distributed queries and transactions in a distributed database that supports replication. [10]
b) Discuss any four techniques that can be used to ensure data security in both distributed databases and centralized databases. [20]
c) Examine how supercomputers can be effectively used in data analytics and data science. [10]
