Database Management Systems
2. Define query.
4. Which operators are called unary operators and why are they called so?
Part B (5 × 16 = 80 marks)
11. (a) (i) Discuss the main characteristics of the database approach and
how it differs from the traditional file system.
(ii) What are the three levels of abstraction in DBMS? (8 + 8)
Or
(b) (i) A database is being constructed to keep track of the teams and
games of a sports league. A team has a number of players, not all
of whom participate in each game. It is desired to keep track
of the players participating in each game for each team, the positions
they played in that game, and the result of each game. Draw the ER
diagram and list its entities and attributes. (10)
(ii) Briefly explain mapping cardinality in detail (6)
12. (a) Consider the database schema
Emp (emp-name, type, birthday, set of (Exam-names), set of (Skills))
Children(emp-name, ch-name, birthday)
Skills(type, set of (exam-names))
Exams(exam-name, year, city)
Write SQL statements for the following queries.
(i) Find the names of all employees who have a child who has a birthday
in March.
(ii) Find those employees who took an examination for the skill type
“typing” in the city “Chennai”.
(iii) List all exam names under a specific skill type for the given
employee, other than his own exam names.
(iv) Find the name of the city and the year in which the examination is
going to be held for the given skill type. (8)
(v) Explain referential integrity with an example. (8)
Or
(b) What is the need for building a distributed database? Explain important
issues in building a distributed database with an example. Explain
how a distributed database is used in a client/server environment. (16)
13. (a) (i) What is redundant data? What are the problems caused by
redundant data? (6)
(ii) Explain the process of normalization from 1NF to the BCNF stage
with an example. (10)
Or
(b) Consider the relation R(A, B, C, D, E) with the functional dependencies
{A → BC, CD → E, B → D, E → A}. Identify the super keys.
Find Fc and F+. (16)
14. (a) Explain the following:
(i) Different locking mechanism used in lock based concurrency
control. (10)
(ii) Validation based protocol with an example. (6)
Or
(b) (i) What is the difference between conflict serializability and view
serializability? Explain in detail with an example. (12)
(ii) Briefly explain the ACID properties with an example. (4)
15. (a) What is RAID? Briefly explain different levels of RAID. Discuss the
factors to be considered in choosing a RAID level. (16)
Or
(b) (i) Explain the three kinds of database tuning in detail. (6)
(ii) Explain the structure of B+ tree and how to process queries in B+
tree. (10)
Solutions
Part A
3. Data integrity means making sure the data is correct and not corrupt.
Data security means making sure that only the people who should have
access to the data can access it, and keeping straight who can read the
data and who can write the data.
9. Slotted Page
Part B
11. (a) (i)
Database Approach
The problems inherent in file-oriented systems make using the database
system very desirable. Unlike the file-oriented system, with its many
separate and unrelated files, the database system consists of logically
related data stored in a single logical data repository. Therefore, the database
approach represents the change in the way end user data are stored,
accessed and managed. It emphasizes the integration and sharing of
data throughout the organisation. Database systems overcome the
disadvantages of the file-oriented system. They eliminate problems related
to data redundancy and data control by supporting an integrated and
centralized data structure. Data are controlled via a data dictionary (DD)
system which itself is controlled by database administrators (DBAs).
The following figure illustrates a comparison between file-oriented and
database systems.
File-oriented versus database systems
Conceptual/Internal Mapping
The conceptual schema is related to the internal schema through
conceptual/internal mapping. The conceptual internal mapping defines
the correspondence between the conceptual view and the stored
database. It specifies how conceptual records and fields are presented
at the internal level.
External/Conceptual Mapping
Each external schema is related to the conceptual schema by the external/
conceptual mapping. The external/conceptual mapping defines the
correspondence between a particular external view and the conceptual
view. It gives the correspondence among the records and relationships of
the external and conceptual views.
12. (a)
(i) select emp-name from children where birthday = 'March';
(ii) select e.emp-name from emp e, exams x
     where e.type = 'typing' and x.city = 'Chennai'
       and x.exam-name in e.exam-names;
12. (b)
Client/Server Database System
The client/server architecture of a database system has two logical components,
namely the client and the server. Clients are generally personal computers or
workstations, whereas servers are larger workstations, midrange computer
systems or mainframe computer systems. The applications and tools of the
DBMS run on one or more client platforms, while the DBMS software
resides on the server. The server computer is called the backend and the
client's computer is called the front-end. These server and client computers
are connected into a network. The applications and tools act as clients
of the DBMS, making requests for its services. The DBMS, in turn,
processes these requests and returns the results to the client(s). The
client handles the graphical user interface (GUI) and does
computations and other programming of interest to the end user. The
server handles parts of the job that are common to many clients, for
example, database access and updates. The following figure illustrates
client/server database architecture.
13. (a) (ii)
Example 1
As shown in Fig. 10.3, the partial dependency of the doctor’s contact
number on the key DOCTORNAME indicates that the relation is not
in 2NF. Therefore, to bring the relation in 2NF, the information about
doctors and their contact numbers has to be separated from the information
about patients and their appointments with doctors. Thus, the relation is
decomposed into two tables, namely PATIENT_DOCTOR and DOCTOR,
as shown in Table 10.4. The relational table can be depicted as:
PATIENT_DOCTOR (PATIENT-NAME, DATE-OF-BIRTH,
DOCTOR-NAME,
DATE-TIME, DURATION-MINUTES)
DOCTOR (DOCTOR-NAME, CONTACT-NO)
A relation is in third normal form (3NF) if its non-key attributes are:
•• mutually independent, and
•• fully functionally dependent on the primary (or relation) key.
Both of the above relations are in BCNF. The only FD between the USE
attributes is
(PROJECT, MACHINE) → QTY-USED
and (PROJECT, MACHINE) is a super key.
The two FDs between the PROJECTS attributes are
PROJECT → PROJ-MANAGER
PROJ-MANAGER → PROJECT
Both PROJECT and PROJ-MANAGER are super keys of the relation
PROJECTS, and so PROJECTS is in BCNF.
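As a minimal SQL sketch of this BCNF decomposition (datatypes are assumed, hyphens in the attribute names are replaced by underscores, and the USE relation is renamed machine_use to avoid a reserved word):
create table projects
( project      varchar(20) primary key,    -- PROJECT → PROJ-MANAGER
  proj_manager varchar(20) unique not null -- PROJ-MANAGER → PROJECT
);
create table machine_use
( project  varchar(20) references projects,
  machine  varchar(20),
  qty_used integer,
  primary key (project, machine)           -- (PROJECT, MACHINE) → QTY-USED
);
Declaring proj_manager unique records the second functional dependency, so both determinants are keys and each relation is in BCNF.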
13. (b)
The closure of a set of functional dependencies (also called the complete
set) defines all the FDs that can be derived from a given set of FDs. Given
a set F of FDs on the attributes of a table T, the closure of F can be computed. The
notation F+ is used to denote the closure of the set of all FDs implied by
F. Armstrong's axioms can be used to develop an algorithm that allows
computing F+ from F.
F = {EMPLOYEE-NO →EMPLOYEE-NAME,
PROJECT-NO →{PROJECT-NAME, PROJECT-LOCATION},
{EMPLOYEE-NO, PROJECT-NO} →HOURS-SPENT}
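As a sketch of how Armstrong's axioms generate members of F+ for this set (only a few of the derivable FDs are shown):
1. EMPLOYEE-NO → EMPLOYEE-NAME is given; by augmentation with PROJECT-NO,
   {EMPLOYEE-NO, PROJECT-NO} → {EMPLOYEE-NAME, PROJECT-NO},
   and by decomposition, {EMPLOYEE-NO, PROJECT-NO} → EMPLOYEE-NAME.
2. Similarly, from PROJECT-NO → {PROJECT-NAME, PROJECT-LOCATION},
   {EMPLOYEE-NO, PROJECT-NO} → PROJECT-NAME and
   {EMPLOYEE-NO, PROJECT-NO} → PROJECT-LOCATION.
3. {EMPLOYEE-NO, PROJECT-NO} → HOURS-SPENT is given, and by reflexivity
   {EMPLOYEE-NO, PROJECT-NO} → {EMPLOYEE-NO, PROJECT-NO}.
Hence the closure of {EMPLOYEE-NO, PROJECT-NO} under F contains every attribute, so {EMPLOYEE-NO, PROJECT-NO} is a candidate key, and every FD derived above belongs to F+.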
14. (a) (i)
Locks are basically of two types: (a) S locks – shared or Read lock, and
(b) X locks – exclusive or Write lock. The lock manager refuses
incompatible requests, so that:
(a) Transaction T1 holds an S lock on granule G1. A request by
transaction T2 for an S lock will be granted. In other words, Read-
Read is permutable.
(b) Transaction T1 holds an S lock on granule G1. A request by transaction
T2 for an X lock will be refused. In other words, Read-Write is not
permutable.
(c) Transaction T1 holds an X lock on granule G1. No request by
transaction T2 for a lock on G1 will be granted. In other words, Write
is not permutable with any other access.
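In SQL these S and X locks are normally acquired implicitly, but many systems also let a transaction request them explicitly. A minimal, dialect-dependent sketch (PostgreSQL-style row locking; the account table is illustrative):
-- Transaction T1: acquires an S (Read) lock on the selected row
SELECT balance FROM account WHERE acc_no = '101' FOR SHARE;
-- Transaction T2: this S lock request is granted (Read-Read is permutable)
SELECT balance FROM account WHERE acc_no = '101' FOR SHARE;
-- Transaction T3: this X (Write) lock request waits until T1 and T2 release their locks
SELECT balance FROM account WHERE acc_no = '101' FOR UPDATE;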
Lock Granularity
A database is basically represented as a collection of named data items.
The size of the data item chosen as the unit of protection by a concurrency
control program is called granularity. Granularity can be a field of some
record in the database, or it may be a larger unit such as record or even
a whole disk block. Granule is a unit of data individually controlled by
the concurrency control subsystem. Granularity is a lockable unit in a
lock-based concurrency control scheme. Lock granularity indicates the
level of lock use. Most often, the granule is a page, although smaller or
larger units (for example, tuple, relation) can be used. Most commercial
database systems provide a variety of locking granularities. Locking can
take place at the following levels:
•• Database level.
•• Table level.
•• Page level.
•• Row (tuple) level.
•• Attributes (fields) level.
Thus, the granularity affects the concurrency control of the data items,
that is, what portion of the database a data item represents. An item
can be as small as a single attribute (or field) value or as large as a disk
block, or even a whole file or the entire database.
Database Level Locking
At database level locking, the entire database is locked. Thus, it prevents
the use of any tables in the database by transaction T2 while transaction
T1 is being executed.
Lock Types
The DBMS mainly uses the following types of locking techniques:
•• Binary locking.
•• Exclusive locking.
•• Shared locking.
•• Two-phase locking (2PL).
•• Three-phase locking (3PL).
Binary Locking
In binary locking, there are two states of locking namely (a) locked (or
‘1’) or (b) unlocked (‘0’). If an object of a database table, page, tuple
(row) or attribute (field) is locked by a transaction, no other transaction
can use that object. A distinct lock is associated with each database item.
If the value of lock on data item X is 1, item X cannot be accessed by a
database operation that requires the item. If an object (or data item) X
is unlocked, any transaction can lock the object for its use. As a rule, a
transaction must unlock the object after its termination. Any database
operation requires that the affected object be locked. Therefore, every
transaction requires a lock and unlock operation for each data item that
is accessed. The DBMS manages and schedules these operations.
Two operations, lock_item(data item) and unlock_item(data item)
are used with binary locking. A transaction requests access to a data
item X by first issuing a lock_item(X) operation. If LOCK(X) = 1,
the transaction is forced to wait. If LOCK(X) = 0, it is set to 1 (that
is, transaction locks the data item X) and the transaction is allowed to
access item X. When the transaction is through using the data item, it
issues unlock_item(X) operation, which sets LOCK(X) to 0 (unlocks
the data item) so that X may be accessed by other transactions. Hence, a
binary lock enforces mutual exclusion on the data item.
It can be observed that the lock and unlock features eliminate the lost
update problem. The binary locking system has the advantage of being
easy to implement. However, the binary locking technique is too
restrictive to yield optimal concurrency conditions. For example, the
DBMS will not allow two transactions to read the same database object,
even though neither transaction updates the database and therefore no
concurrency problem, such as a lost update, can occur.
Binary lock
15. (a)
RAID TECHNOLOGY
With fast growing database applications such as World Wide Web,
multimedia and so on, the data storage requirements are also growing
at the same pace. Also, faster microprocessors with larger and larger
primary memories are continually becoming available with the
exponential growth in the performance and capacity of semiconductor
devices and memories. Therefore, it is expected that secondary storage
technology must also take steps to keep up in performance and reliability
with processor technology to match the growth. Development of
redundant arrays of inexpensive disks (RAID) was a major advancement
in secondary storage technology to achieve improved performance and
reliability of storage system. Lately, the “I” in RAID is said to stand for
independent. The main goal of RAID is to even out the widely different
rates of performance improvement of disks against those in memory and
microprocessor. RAID technology provides a disk array arrangement
in which a large number of small independent disks operate in parallel
and act as a single higher-performance logical disk in place of a single
very large disk. The parallel operation of several disks improves the rate
at which data can be read or written and allows several independent
reads and writes to be performed in parallel. In a RAID system, a combination of data
striping (also called parallelism) and data redundancy is implemented.
Data is distributed over several disks and redundant information is
stored on multiple disks.
Thus, in case of disk failure the redundant information is used to
reconstruct the content of the failed disk. Therefore, failure of one
disk does not lead to the loss of data. The RAID system increases the
performance and improves reliability of the resulting storage system.
10. What is the basic difference between static hashing and dynamic
hashing?
Part B (5 × 16 = 80 marks)
and
Salesman # → Commission %
Based on the given primary key, is this relation in 1NF, 2NF, or
3NF? Why or why not? How would you successively normalize it
completely?
Or
(b) Explain the principles of
(i) Loss less join decomposition (5)
(ii) Join dependencies (5)
(iii) Fifth normal form. (6)
14. (a) Illustrate deadlock and conflict serializability with suitable examples.
Or
(b) (i) Explain two phase commit protocol. (10)
(ii) Write different SQL facilities for recovery. (6)
15. (a) Construct a B+ tree to insert the following (order of the tree is 3)
Solutions
Part A
4. Triggers can be used to:
•• audit changes (e.g. keep a log of the users and roles involved in changes);
•• enhance changes (e.g. ensure that every change to a record is
time-stamped by the server's clock);
•• enforce business rules (e.g. require that every invoice have at least
one line item);
•• execute business rules (e.g. notify a manager every time an employee's
bank account number changes);
•• replicate data (e.g. store a record of every change, to be shipped
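A minimal sketch of a trigger of the first (auditing) kind, using MySQL-style syntax with illustrative table and column names:
CREATE TRIGGER audit_salary_change
AFTER UPDATE ON employee
FOR EACH ROW
  INSERT INTO salary_audit (emp_id, old_salary, new_salary, changed_at)
  VALUES (OLD.emp_id, OLD.salary, NEW.salary, CURRENT_TIMESTAMP);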
10. Static hashing: the number of buckets is fixed; as the file grows,
performance degrades; space overhead is higher. Dynamic hashing: the
number of buckets is not fixed; performance does not degrade as the file
grows; space overhead is minimal.
Part B
11. (a)
DATABASE SYSTEM
A database system, also called a database management system
(DBMS), is a generalized software system for manipulating databases.
It is basically a computerized record-keeping system that stores
information and allows users to add, delete, change, retrieve and
update that information on demand. It provides for simultaneous use
of a database by multiple users and tools for accessing and manipulating
the data in the database. A DBMS is also a collection of programs that
enables users to create and maintain a database. It is a general-purpose
software system that facilitates the processes of defining (specifying the
data types, structures and constraints), constructing (storing
data on storage media) and manipulating (querying to retrieve specific
data, updating to reflect changes and generating reports from the data)
databases for various applications.
Typically, a DBMS has three basic components, as shown in Fig. 1.16,
and provides the following facilities:
DBMS Components
•• Data description language (DDL): It allows users to define the
database, specify the data types and data structures, and the
constraints on the data to be stored in the database, usually through a
data definition language. The DDL translates the schema written in a
source language into the object schema, thereby creating a logical and
physical layout of the database.
•• Data manipulation language (DML) and query facility: It allows
users to insert, update, delete and retrieve data from the database,
usually through a data manipulation language (DML). It provides a
general query facility through the structured query language (SQL).
•• Software for controlled access of database: It provides controlled
access to the database, for example, preventing an unauthorized user
from trying to access the database, providing a concurrency control
system to allow shared access of the database, activating a recovery
control system to restore the database to a previous consistent state
following a hardware or software failure, and so on.
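For instance, in a minimal sketch with an illustrative student table, the first statement below is DDL and the remaining statements are DML issued through the query facility:
CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name VARCHAR(30), dept VARCHAR(10));  -- DDL
INSERT INTO student VALUES (1, 'Abhishek', 'CSE');                                       -- DML
UPDATE student SET dept = 'IT' WHERE roll_no = 1;                                        -- DML
SELECT name FROM student WHERE dept = 'IT';                                              -- DML (query)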
11. (b)
[E-R diagram of the banking enterprise, built from the entities branch, account, depositor, employee, manager, worker, savings-account and checking-account, the relationships loan-branch, cust-banker and works-for (with an ISA hierarchy), and attributes such as branch-name, branch-city, assets, account-number, balance, access-date, payment-date, type, social-security, e-social-security, employee-name, dependent-name and telephone-number.]
12. (a)
create table employee
(person-name char(20),
street char(30),
city char(30),
primary key (person-name) )
create table works
(person-name char(20),
company-name char(15),
salary integer,
primary key (person-name),
foreign key (person-name) references employee,
foreign key (company-name) references company)
create table company
(company-name char(15),
city char(30),
primary key (company-name))
create table manages
(person-name char(20),
manager-name char(20),
primary key (person-name),
foreign key (person-name) references employee,
foreign key (manager-name) references employee)
Note that alternative datatypes are possible. Other choices for not null
attributes may be acceptable.
12. (b)
select sum(salary) as total_salary from employee;
ii)SELECT MIN(salary) AS “Lowest salary” FROM employees;
iii) SELECT AVG(salary) AS avg_salary, company_name FROM employees
GROUP BY company_name;
iv) SELECT e.employee_name, e.salary, e.company_name,
       (SELECT ROUND(AVG(salary)) FROM employees
        WHERE company_name = e.company_name) AS avg_sal
    FROM employees e
    WHERE e.salary > (SELECT AVG(salary) FROM employees
                      WHERE company_name = e.company_name)
    ORDER BY avg_sal DESC;
13. (a)
The relation is in 1NF because all attribute values are single atomic
values.
The relation is not in 2NF because:
Car# → DateSold
Car# → DiscountAmount
Salesman# → Commission%
Thus, these attributes are not fully functionally dependent on the primary
key.
•• 2NF decomposition:
CAR_SALE1(Car#, DateSold, DiscountAmount)
CAR_SALE2(Car#, Salesman#)
CAR_SALE3(Salesman#, Commission%)
•• The relations are not in 3NF because:
Car# → DateSold → DiscountAmount
Thus, DateSold is neither a key itself nor a subset of a key and
DiscountAmount is not a prime attribute.
•• 3NF decomposition:
CAR_SALES1A(Car#, DateSold)
CAR_SALES1B(DateSold, DiscountAmount)
CAR_SALE2(Car#, Salesman#)
CAR_SALE3(Salesman#, Commission%)
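As a minimal SQL sketch of the final 3NF design (datatypes are assumed and the '#'/'%' characters in the attribute names are rewritten as ordinary identifiers):
CREATE TABLE car_sales1a (car_no VARCHAR(10) PRIMARY KEY, date_sold DATE);
CREATE TABLE car_sales1b (date_sold DATE PRIMARY KEY, discount_amount NUMERIC(8,2));
CREATE TABLE car_sale2  (car_no VARCHAR(10), salesman_no VARCHAR(10),
                         PRIMARY KEY (car_no, salesman_no));
CREATE TABLE car_sale3  (salesman_no VARCHAR(10) PRIMARY KEY, commission_pct NUMERIC(5,2));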
13. (b) (i)
A decomposition D = {R1, R2, ..., Rn} of a relation R has the lossless-join
property if, for every legal instance r of R,
⋈ (ΠR1(r), ΠR2(r), ..., ΠRn(r)) = r
where Π denotes projection and ⋈ denotes the natural join of all the
relations in D.
The lossless-join decomposition is a property of decomposition, which
ensures that no spurious tuples are generated when a natural join
operation is applied to the relations in the decomposition.
Let us consider the relation scheme (or table) R(X, Y, Z) with functional
dependencies YZ → X, X → Y and X → Z, as shown in Fig. 9.17. The
relation R is decomposed into two relations, R1 and R2, that are defined by the
following two projections:
R1 = projection of R over X, Y
R2 = projection of R over X, Z
where X is the set of common attributes of R1 and R2.
The decomposition is lossless if R = R1 ⋈ R2 over X, and the
decomposition is lossy if R is a proper subset of R1 ⋈ R2 over X.
It can be seen in Fig. 9.17 that the join of R1 and R2 yields the same
number of rows as does R. The decomposition of R(X, Y, Z) into R1(X, Y)
and R2(X, Z) is lossless if, for the attributes X common to both R1 and R2,
either X → Y or X → Z. Thus, in the example of Fig. 9.16 the common
attribute of R1 and R2 is B, but neither B → A nor B → C is true. Hence
the decomposition is lossy. In Fig. 9.17, however, the decomposition is
lossless because for the common attribute X, both X → Y and X → Z hold.
Lossless decomposition
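A small made-up instance illustrates the lossy case. Let R(A, B, C) contain the tuples (a1, b1, c1) and (a2, b1, c2), and decompose R into R1(A, B) and R2(B, C):
R1 = {(a1, b1), (a2, b1)}
R2 = {(b1, c1), (b1, c2)}
R1 ⋈ R2 = {(a1, b1, c1), (a1, b1, c2), (a2, b1, c1), (a2, b1, c2)}
The join contains the two spurious tuples (a1, b1, c2) and (a2, b1, c1) because the common attribute B is not a determinant (neither B → A nor B → C holds), so the decomposition is lossy.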
13 (b) (ii)
Join Dependencies (JD)
A join dependency (JD) can be said to exist if the join of R1 and R2 over
C is equal to the relation R, where R1 and R2 are the decompositions R1(A,
B, C) and R2(C, D) of a given relation R(A, B, C, D). Alternatively, R1
and R2 form a lossless decomposition of R. In other words, *((A, B, C),
(C, D)) will be a join dependency of R if the join of these components over
their common attributes is equal to the relation R. Here, *(R1, R2, R3, ...)
indicates that relations R1, R2, R3 and so on form a join dependency (JD)
of R. Therefore, a necessary condition for a relation R to satisfy a JD
*(R1, R2, ..., Rn) is that
R = R1 ∪ R2 ∪ ... ∪ Rn
Thus, whenever we decompose a relation R into R1 = X ∪ Y and R2 = (R −
Y) based on an MVD X →→ Y that holds in relation R, the decomposition
has the lossless join property. Therefore, lossless-join dependency can be
defined as a property of decomposition which ensures that no spurious
tuples are generated when the relations are recombined through a natural
join operation.
13 (b) iii)
Fifth Normal Form (5NF)
A relation is said to be in fifth normal form (5NF) if every join dependency
is a consequence of its relation (candidate) keys. Alternatively, for every
non-trivial join dependency *(R1, R2, R3) each decomposed relation Ri
is a super key of the main relation R. 5NF is also called project-join
normal form (PJNF).
There are some relations which cannot be decomposed into two or more
higher normal form relations by means of projections as discussed for
1NF, 2NF, 3NF and BCNF. Such relations are decomposed into three
or more relations, which can be reconstructed by means of a three-way
or more join operation. This is addressed by the fifth normal form (5NF). The
5NF eliminates the problems of 4NF. 5NF allows for relations with join
dependencies. Any relation that is in 5NF is also in the other normal forms,
namely 2NF, 3NF and 4NF. 5NF is mainly used from a theoretical point of
view and not for practical database design.
14 (a)
Deadlocks
A deadlock is a condition in which two (or more) transactions in a set
are waiting simultaneously for locks held by some other transaction in
the set. Neither transaction can continue because each transaction in the
set is on a waiting queue, waiting for one of the other transactions in the
set to release the lock on an item. Thus, a deadlock is an impasse that
may result when two or more transactions are each waiting for locks to
be released that are held by the other. Transactions whose lock requests
have been refused are queued until the lock can be granted. A deadlock
is also called a circular waiting condition where two transactions are
waiting (directly or indirectly) for each other. Thus in a deadlock, two
transactions are mutually excluded from accessing the next record
required to complete their transactions, also called a deadly embrace.
14. (b)
Limitations of the two-phase commit protocol
•• A failure of the coordinator of sub-transactions can result in the
transaction being blocked from completion until the coordinator is
restored.
•• Requirement of coordinator results into more messages and more
overhead.
15. (b)
Tuples of r1: nr1 = 20,000
Tuples of r2: nr2 = 45,000
Blocks required for r1: br1 = ⌈20,000/25⌉ = 800 blocks
Blocks required for r2: br2 = ⌈45,000/30⌉ = 1,500 blocks
(a) cost(p1) = br1 + nr1 * br2 = 800 + 20, 000 * 1, 500 = 30, 000, 800
(b) cost(p2) = br1 + br1 * br2 = 800 + 800 * 1, 500 = 1, 200, 800
(c) We assume that all tuples for any given value of the join attributes
fit in memory:
cost(p3) = br1 + br2 = 800 + 1, 500 = 2, 300
(d) We have to add the cost of sorting (using external sort-merge) to the
cost of p3, i.e.,
cost(p4) = cost(p3) + cost(sorting)
cost(sorting relation r) = br(2⌈logM−1(br/M)⌉ + 1), where M is the
number of blocks in the buffer. We have to add to these costs the
output of the sorted relation, i.e., add br block transfers.
⇒ cost(sorting) = 800 * (2⌈logM−1(800/M)⌉ + 2) + 1,500 *
(2⌈logM−1(1,500/M)⌉ + 2)
(e) cost (p5) = 3* (br1 + br2) = 3*(800 + 1,500) = 6,900
3. Mention the six fundamental operations of relational algebra and their
symbols.
4. List two reasons why null values might be introduced into the database.
9. What can be done to reduce the occurrence of bucket overflows in a hash
file organization?
10. How might the type of index available influence the choice of a query
processing strategy?
Part B (5 × 16 = 80 marks)
11. (a) (i) Discuss in detail about the major disadvantages of file-
processing system. (6)
(ii) Explain in detail about different data models with neat diagram.
(10)
Or
(b) Draw an E-R diagram for a Life insurance company with almost all
components and explain. (16)
12. (a) Give an introduction to Distributed database and Client/Server
database. (16)
Or
(b) Illustrate the uses of Embedded SQL and Dynamic SQL with
suitable examples. (16)
13. (a) Explain in detail about all functional dependencies based normal
forms with suitable examples. (16)
Or
(b) Describe about the Join Dependencies and Fifth normal form with
suitable example. (16)
14. (a) Discuss in detail about transaction concepts and two phase commit
protocol. (16)
Or
(b) Write down in detail about intent locking and isolation levels. (16)
15. (a) Jot down detailed notes on ordered indices and B – Tree index files.
(16)
Or
(b) Describe in detail about RAID and Tertiary storage. (16)
Solutions
Part A
7. SQL Server is designed to recover from system and media failures, and
the recovery system can scale to machines with very large buffer pools
and thousands of disk drives.
Part B
for all programs that access the file. It can also be noticed in Fig.
1.18 that SALES file has been used in both “Account receivable
program” and “Sales statement program”. If it is decided to change
the CUST-ID field length from 4 characters to 6 characters, the
file descriptions in each program that is affected would have to be
modified to conform to the new file structure. It is often difficult to
even locate all programs affected by such changes. It could be very
time consuming and subject to error when making changes. This
characteristic of file-oriented system is known as program-data
dependence.
(d) Poor data control: As shown in Fig. 1.19, a file-oriented system,
being decentralised in nature, had no centralised control at
the data element (field) level. It was very common for a data
field to have multiple names defined by the various departments
of an organisation, depending on the file it was in. This could
lead to different meanings of a data field in different contexts and,
conversely, the same meaning for different fields. This leads to poor
data control, resulting in confusion.
(e) Limited data sharing: There are limited data-sharing opportunities
with the traditional file-oriented system. Each application has its own
private files and users have little opportunity to share data outside
their own applications. To obtain data from several incompatible
files in separate systems will require a major programming effort.
In addition, a major management effort may also be required since
different organisational units may own these different files.
(f) Inadequate data manipulation capabilities: File-oriented
systems do not provide strong connections between data in different
files, and therefore their data manipulation capability is very limited.
(g) Excessive programming effort: There was a very high
interdependence between program and data in file-oriented system
and therefore an excessive programming effort was required for a
new application program to be written. Even though an existing
file may contain some of the data needed, the new application often
requires a number of other data fields that may not be available in
the existing file. As a result, the programmer had to rewrite the code
for definitions for needed data fields from the existing file as well
as definitions of all new data fields. Therefore, each new application
required that the developers (or programmers) essentially start from
scratch by designing new file formats and descriptions and then
write the file access logic for each new program. Also, both initial
and maintenance programming efforts for management information
applications were significant.
(h) Security problems: Not every user of the database system should
be allowed to access all the data. Each user should be allowed
to access only the data concerning his area of application. Since
application programs were added to the file-oriented system in an
ad hoc manner, it was difficult to enforce such a security system.
university academic departments along with data on all faculties for each
department and all courses taught by each faculty within a department.
Fig. 2.13 (b) shows the defined fields or data types for department,
faculty, and course record types. A single department record at the root
level represents one instance of the department record type. Multiple
instances of a given record type are used at lower levels to show that
a department may employ many (or no) faculties and that each faculty
may teach many (or no) courses. For example, we have a COMPUTER
department at the root level and as many instances of the FACULTY
record type as there are faculties in the computer department. Similarly, there
will be as many COURSE record instances for each FACULTY record
as that faculty teaches. Thus, there is a one-to-many (1:m) association
among record instances, moving from the root to the lowest level of
the tree. Since there are many departments in the university, there are
many instances of the DEPARTMENT record type, each with its own
FACULTY and COURSE record instances connected to it by appropriate
branches of the tree. This database then consists of a forest of such tree
instances; as many instances of the tree type as there are departments in
the university at any given time. Collectively, these comprise a single
hierarchic database and multiple databases will be online at a time.
11. (b)
12. (a)
Client/Server Database System
The client/server architecture of a database system has two logical components,
namely the client and the server. Clients are generally personal computers or
workstations, whereas servers are larger workstations, midrange computer
systems or mainframe computer systems. The applications and tools of the
DBMS run on one or more client platforms, while the DBMS software
resides on the server. The server computer is called the backend and the
client's computer is called the front-end. These server and client computers
are connected into a network. The applications and tools act as clients
of the DBMS, making requests for its services. The DBMS, in turn,
processes these requests and returns the results to the client(s). The
client handles the graphical user interface (GUI) and does
computations and other programming of interest to the end user. The
server handles parts of the job that are common to many clients, for
example, database access and updates.
12 (b)
Embedded Structured Query Language (SQL)
We have looked at a wide range of SQL query constructs in the previous
sections, wherein SQL is treated as an independent language in its
own right. An RDBMS supports an interactive SQL interface through
which users directly enter these SQL commands. However, in practice,
often we need the greater flexibility of a general-purpose programming
language, for example to integrate a database application with a graphical user
interface. This is in addition to the data manipulation facilities provided
by SQL. To deal with such requirements, SQL statements can be directly
embedded in procedural language (that is, program’s source code such
as COBOL, C, Java, PASCAL, FORTRAN, PL/I and so on) along with
other statements of the programming language. A language in which
programs can create SQL queries as strings at run time and can either
have them executed immediately or have them prepared for subsequent use
is called dynamic SQL. SQL defines standards for dynamic SQL calls in a
host language, such as C, as in the following example.
char *sqlprog = "update account "
                "set balance = balance * 1.05 "
                "where acc_no = ?";
EXEC SQL prepare dynprog from :sqlprog;
char acc[10] = "101";
EXEC SQL execute dynprog using :acc;
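For contrast, a statically embedded SQL statement is fixed at compile time and exchanges values with the host program through host variables. A minimal sketch (the account table and variable names are illustrative):
EXEC SQL BEGIN DECLARE SECTION;
  char  acc_no[11];
  float bal;
EXEC SQL END DECLARE SECTION;
/* ... acc_no is filled in by the host program ... */
EXEC SQL SELECT balance INTO :bal FROM account WHERE acc_no = :acc_no;
EXEC SQL UPDATE account SET balance = balance * 1.05 WHERE acc_no = :acc_no;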
13. (a)
First Normal Form (1NF)
A relation is said to be in first normal form (1NF) if the values in the
domain of each attribute of the relation are atomic (that is simple and
indivisible). In 1NF, all domains are simple and in a simple domain, all
elements are atomic. Every tuple (row) in the relational schema contains
only one value of each attribute and no repeating groups. 1NF data
requires that every data entry, or attribute (field) value, must be non-
decomposable. Hence, 1NF disallows having a set of values, a tuple of
values or a combination of both as an attribute value for a single tuple.
1NF disallows multi-valued attributes that are themselves composites.
This is called “relations within relations”, or nested relations, or
“relations as attributes of tuples”.
Example 1
Consider a relation LIVED_IN, which keeps records of person and his
residence in different cities. In this relation, the domain RESIDENCE is
not simple. For example, a person "Abhishek" can have residence in
Jamshedpur, Mumbai or Delhi. Therefore, the relation is un-normalised.
Now, the relation LIVED_IN is normalised by combining each row in
residence with its corresponding value of PERSON and making this
combination a tuple (row) of the relation, as shown in the figure below.
Thus, now non-simple domain RESIDENCE is replaced with simple
domains.
Relation LIVED-IN
Example 1
The partial dependency of the doctor’s contact number on the key
DOCTORNAME indicates that the relation is not in 2NF. Therefore, to
bring the relation in 2NF, the information about doctors and their contact
numbers has to be separated from the information about patients and their
appointments with doctors. Thus, the relation is decomposed into two
tables, namely PATIENT_DOCTOR and DOCTOR, as shown in Table.
The relational table can be depicted as:
PATIENT_DOCTOR (PATIENT-NAME, DATE-OF-BIRTH,
DOCTOR-NAME,
DATE-TIME, DURATION-MINUTES)
DOCTOR (DOCTOR-NAME, CONTACT-NO)
Table 10.4 Relation PATIENT_DOCTOR decomposed into two tables
for refinement into 2NF
A relation is in third normal form (3NF) if its non-key attributes are:
•• mutually independent, and
•• fully functionally dependent on the primary (or relation) key.
A relation R is in BCNF if, for every functional dependency X → Y that
holds in R, at least one of the following is true:
•• X is a super key of R,
•• X → Y is a trivial FD, that is, Y ⊂ X.
13. (b)
Join Dependencies (JD)
A join dependency (JD) can be said to exist if the join of R1 and R2 over
C is equal to the relation R, where R1 and R2 are the decompositions R1(A,
B, C) and R2(C, D) of a given relation R(A, B, C, D). Alternatively, R1
and R2 form a lossless decomposition of R. In other words, *((A, B, C),
(C, D)) will be a join dependency of R if the join of these components over
their common attributes is equal to the relation R. Here, *(R1, R2, R3, ...)
indicates that relations R1, R2, R3 and so on form a join dependency (JD)
of R. Therefore, a necessary condition for a relation R to satisfy a JD
*(R1, R2, ..., Rn) is that
R = R1 ∪ R2 ∪ ... ∪ Rn
Thus, whenever we decompose a relation R into R1 = X ∪ Y and R2 = (R −
Y) based on an MVD X →→ Y that holds in relation R, the decomposition
has the lossless join property. Therefore, lossless-join dependency can be
defined as a property of decomposition which ensures that no spurious
tuples are generated when the relations are recombined through a natural
join operation.
Example 1
Let us consider a relation PERSONS_ON_JOB_SKILLS, as shown
in Table 10.14. This relation can be decomposed into three relations
namely, HAS_SKILL, NEEDS_SKILL and JOB_ASSIGNED.
Fig. 10.11 illustrates the join dependencies of decomposed relations. It
can be noted that no pair of the decomposed relations forms a lossless
decomposition of PERSONS_ON_JOB_SKILLS. In fact, a join of all
three decomposed relations yields a relation that has the same data as
does the original relation PERSONS_ON_JOB_SKILLS. Thus, each
relation acts as a constraint on the join of the other two relations.
Now, if we join decomposed relations HAS_SKILL and NEEDS_
SKILL, a relation CAN_USE_JOB_SKILL is obtained, as shown in
Fig. 10.11. This relation stores the data about persons who have skills
applicable to a particular job. But, each person who has a skill required
for a particular job need not be assigned to that job. The actual job
assignments are given by the relation JOB_ASSIGNED. When this
relation is joined with HAS_SKILL, a relation is obtained that will
contain all possible skills that can be applied to each job. This happens
because persons assigned to that job, possesses those skills. However,
some of the jobs do not require all the skills. Thus, redundant tuples
(rows) that show unnecessary SKILL-TYPE and JOB combinations are
removed by joining with relation NEEDS_SKILL.
Example 1
Let us consider the relation PERSONS_ON_JOB_SKILLS of Fig.
10.11. The three relations are
HAS_SKILL (PERSON, SKILL-TYPE)
NEEDS_SKILL (SKILL-TYPE, JOB)
JOB_ASSIGNED (PERSON, JOB)
Now by applying the definition of 5NF, the join dependency is given as:
*((PERSON, SKILL-TYPE), (SKILL-TYPE, JOB), (PERSON, JOB))
The above statement is true because a join relation of these three
relations is equal to the original relation PERSONS_ON_JOB_SKILLS.
The consequence of this join dependency is that none of (PERSON,
SKILL-TYPE), (SKILL-TYPE, JOB) or (PERSON, JOB) is a relation
key, and hence the relation is not in 5NF. Now suppose the second tuple
(row 2) is removed from the relation
PERSONS_ON_JOB_SKILLS; a new relation is created that no longer
has any join dependencies. Thus the new relation will be in 5NF.
14. (a)
Two-phase Commit (2PC)
Two-phase commit protocol (2PC) is the simplest and most widely used
technique for recovery and concurrency control in a distributed database
environment. The 2PC mechanism guarantees that all database servers
participating in a distributed transaction either all commit or all abort.
In a distributed database system, each sub-transaction (that is, part of a
transaction getting executed at each site) must show that it is prepared-
to-commit. Otherwise, the transaction and all of its changes are entirely
aborted. For a transaction to be ready to commit, all of its actions must
have been completed successfully. If any sub-transaction indicates that
its actions cannot be completed, then all the sub-transactions are aborted
and none of the changes are committed. The two-phase commit process
requires the coordinator to communicate with every participant site.
As the name implies, two-phase commit (2PC) protocol has two phases
namely the voting phase and the decision phase. Both phases are initiated
by a coordinator. The coordinator asks all the participants whether they
are prepared to commit the transaction. In the voting phase, the sub-
transactions are requested to vote on their readiness to commit or abort.
In the decision phase, a decision as to whether all sub-transactions
should commit or abort is made and carried out. If one participant votes
to abort or fails to respond within a timeout period, then the coordinator
instructs all the participants to abort the transaction.
Limitations
•• A failure of the coordinator of sub-transactions can result in the
transaction being blocked from completion until the coordinator is
restored.
•• Requirement of coordinator results into more messages and more
overhead.
TRANSACTION CONCEPTS
A transaction is a logical unit of work of database processing that
includes one or more database access operations. A transaction can be
defined as an action or series of actions that is carried out by a single
user or application program to perform operations for accessing the
contents of the database. The operations can include retrieval (Read),
insertion (Write), deletion and modification. A transaction must be either
completed or aborted. A transaction is a program unit whose execution
may change the contents of a database. It can either be embedded
within an application program or can be specified interactively via a
high-level query language such as SQL. Its execution preserves the
consistency of the database. No intermediate states are acceptable. If the
database is in a consistent state before a transaction executes, then the
database should still be in consistent state after its execution. Therefore,
to ensure these conditions and preserve the integrity of the database a
database transaction must be atomic. An atomic
transaction is a transaction in which either all actions associated with the
transaction are executed to completion or none are performed. In other
words, each transaction should access shared data without interfering
with the other transactions and whenever a transaction successfully
completes its execution; its effect should be permanent. However, if due
to any reason, a transaction fails to complete its execution (for example,
system failure) it should not have any effect on the stored database. This
basic abstraction frees the database application programmer from the
following concerns:
•• Inconsistencies caused by conflicting updates from concurrent users.
•• Partially completed transactions in the event of systems failure.
•• User-directed undoing of transactions.
14. (b)
In the ideal world of isolation, a transaction completes before another
begins
•• In the real world, isolation has levels
•• The higher the level of isolation, the less the interference and the lower
the concurrency
•• Cursor stability permits shared locks for reading, that are released
before the transaction completes
•• Repeatable read ensures that a read is repeatable throughout the
transaction
•• Locking granularity: locks can be taken at the tuple level, or by rel-
var, database, or attribute
•• To avoid examining every tuple to determine if any are locked,
the intent locking protocol declares a lock at the relvar to forecast
conflicts
•• Modes are intent shared, intent exclusive and shared intent exclusive
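In SQL the isolation level is chosen per transaction with the standard SET TRANSACTION statement; as a rough sketch, cursor stability corresponds to READ COMMITTED in most systems:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;   -- cursor-stability-like behaviour
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;  -- reads are repeatable within the transaction
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;     -- highest isolation, least interference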
15. (a)
In an ordered index, index entries are stored sorted on the search key
value. E.g., author catalog in library.
Primary index: in a sequentially ordered file, the index whose search key
specifies the sequential order of the file. Also called clustering index.
The search key of a primary index is usually but not necessarily the
primary key.
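In SQL, an ordered index on a search key is typically requested with CREATE INDEX (a minimal sketch; the book table and column names are illustrative, and clustering syntax varies by system):
CREATE INDEX book_author_idx ON book (author);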
Dense and sparse indices: there are two kinds of ordered indices, dense
and sparse.
Dense index:
An index record appears for every search key value in the file. The
index record contains the search key value and a pointer to the first data
records with that search key value.
Multilevel Indices:
If our file is extremely large, even the outer index may grow too large to
fit in main memory. In such a case we can create yet another level of index.
Indeed, we can repeat this process as many times as necessary. Indices with
two or more levels are called multilevel indices.
Index Update:
Insertion
The system performs a lookup using the search key value that appears in
the record to be inserted.
Dense indices:
1. If the search-key value does not appear in the index, the system
inserts an index record with the search-key value in the index at the
appropriate position.
2. Otherwise, if the index record stores pointers to all records with the
same search-key value, the system adds a pointer to the new record to
the index record.
Sparse indices:
If the system creates a new block it inserts the first search key value
appearing in the new block into the index.
Deletion
If the deleted record was the only record in the file with its particular
search-key value, the search key is deleted from the index also.
Single-level index deletion:
Dense indices – deletion of search-key is similar to file record deletion.
Sparse indices – if an entry for the search key exists in the index, it is
deleted by replacing the entry in the index with the next search-key value
in the file (in search-key order). If the next search-key value already has
an index entry, the entry is deleted instead of being replaced.
B-Tree Index Files
Similar to B+-tree, but B-tree allows search-key values to appear only
once; eliminates redundant storage of search keys.
Search keys in nonleaf nodes appear nowhere else in the B-tree; an
additional pointer field for each search key in a nonleaf node must
therefore be included.
Nonleaf node – the pointers Bi are the bucket or file-record pointers.
Example
15. (b)
Raid Technology
With fast growing database applications such as World Wide Web,
multimedia and so on, the data storage requirements are also growing
at the same pace. Also, faster microprocessors with larger and larger
primary memories are continually becoming available with the
exponential growth in the performance and capacity of semiconductor
devices and memories. Therefore, it is expected that secondary storage
technology must also take steps to keep up in performance and reliability
with processor technology to match the growth.
Development of redundant arrays of inexpensive disks (RAID) was a
major advancement in secondary storage technology to achieve improved
performance and reliability of storage system. Lately, the “I” in RAID
is said to stand for independent. The main goal of RAID is to even out
the widely different rates of performance improvement of disks against
1. List four significant differences between a file-processing system and a
DBMS.
3. Describe a circumstance in which you would choose to use embedded
SQL rather than using SQL alone.
4. List two major problems with processing of update operations expressed
in terms of views.
5. Give an example of a relation schema R and a set of dependencies such
that R is in BCNF, but not in 4NF.
6. Why are certain functional dependencies called trivial functional
dependencies?
8. What benefit does strict two-phase locking provide? What disadvantages
result?
10. When is it preferable to use a dense index rather than a sparse index?
Explain your answer.
Part B (5 × 16 = 80 marks)
11. (a) Discuss in detail about database system architecture with neat
diagram.
Or
(b) Draw an E-R diagram for a banking enterprise with almost all
components and explain.
12. (a) Explain in detail about Relational Algebra, Domain Relational
Calculus and Tuple Relational Calculus with suitable examples.
Or
(b) Briefly present a survey on Integrity and Security.
13. (a) Explain in detail about 1NF, 2NF, 3NF and BCNF with suitable
examples.
Or
(b) Describe about the Multi-Valued Dependencies and Fourth normal
form with suitable example.
14. (a) Discuss in detail about Transaction Recovery, System Recovery and
Media Recovery.
Or
(b) Write down in detail about Deadlock and Serializability.
15. (a) Construct a B+ tree to insert the following key elements (order of the
tree is 3) 5, 3, 4, 9, 7, 15, 14, 21, 22, 23.
Or
(b) Describe in detail about how records are represented in a file and
how to organize them in a file.
Solutions
Part A
2. Data models can be broadly classified into the following three categories:
•• Record-based data models
•• Object-based data models
•• Physical data models
4. (a) Since the view may not have all the attributes of the underlying tables,
insertion of a tuple into the view will insert tuples into the underlying
tables, with those attributes not participating in the view getting null
values. This may not be desirable, especially if the attribute in question
is part of the primary key of the table.
(b) If a view is a join of several underlying tables and an insertion
results in tuples with nulls in the join columns, the desired effect of the
insertion will not be achieved. In other words, an update to a view may
not be expressible at all as updates to base relations. For an explanatory
example, see the loaninfo updation example in Section 3.5.2.
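A minimal sketch of case (a), with illustrative table and column names: the view below projects away amount, so a row inserted through the view leaves a null amount in the underlying table:
CREATE TABLE loan (loan_number VARCHAR(10) PRIMARY KEY,
                   branch_name VARCHAR(20),
                   amount NUMERIC(10,2));
CREATE VIEW branch_loan AS SELECT branch_name, loan_number FROM loan;
INSERT INTO branch_loan VALUES ('Perryridge', 'L-307');  -- loan.amount becomes NULL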
A B C D
a1 b1 c1 d1
a1 b2 c2 d1
a1 b2 c3 d1
a2 b2 c3 d2
7. The SQL query processing components include:
•• Query Optimizer – translates SQL into an ordered expression of
relational DB operators (Select, Project, Join)
•• Query Executor – executes the ordered expression by running a
program for each operator, which in turn accesses records of files
Linear Hashing
•• This is another dynamic hashing scheme, an alternative to Extendible
Hashing.
•• LH handles the problem of long overflow chains without using a
directory, and handles duplicates.
•• Idea: Use a family of hash functions h0, h1, h2,...
Part B
11. (a)
Refer Qn 11 (b) from May/June 2013
11. (b)
[E-R diagram of the banking enterprise, built from the entities branch, account, depositor, employee, manager, worker, savings-account and checking-account, the relationships loan-branch, cust-banker and works-for (with an ISA hierarchy), and attributes such as branch-name, branch-city, assets, account-number, balance, access-date, payment-date, type, social-security, e-social-security, employee-name, dependent-name and telephone-number.]
12. (a)
RELATIONAL ALGEBRA
Relational algebra is a collection of operations to manipulate or access
relations. It is a procedural (or abstract) language with operations that
are performed on one or more existing relations to derive a result (another)
relation without changing the original relation(s). Furthermore,
relational algebra defines the complete scheme for each of the result
relations. Relational algebra consists of a set of relational operators. Each
operator has one or more relations as its input and produces a relation as
its output. Thus, both the operands and the results are relations, and so the
output from one operation can become the input to another operation.
The relational algebra is a relation-at-a-time (or set) language in which
all tuples, possibly from several relations, are manipulated in one
statement without looping. There are many variations of the operations
that are included in relational algebra. Originally eight operations were
proposed by Dr. Codd, but several others have been developed. These
eight operators are divided into the following two categories:
•• Set-theoretic operations.
•• Native relational operations.
Set-theoretic operations make use of the fact that tables are essentially
sets of rows. There are four set-theoretical operations, as shown in Table
4.3.
Table 4.3 Set-theoretic operations
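For example (the EMPLOYEE relation and its attributes are illustrative), the selection and projection operators can be combined to list the names of employees located in Chennai; the SQL statement below is the equivalent query:
π EMP-NAME (σ CITY = 'Chennai' (EMPLOYEE))
SELECT emp_name FROM employee WHERE city = 'Chennai';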
¬ = negation
∃ = existential quantifier (meaning 'there EXISTS'), used in formulae
that must be true for at least one instance
∀ = universal quantifier (meaning 'FOR ALL'), used in statements about
every instance
Tuple variables that are quantified by ∀ or ∃ are called bound variables.
Otherwise, they are called free variables.
Dr. Codd defined the well-formed formulas (WFFs) as follows:
•• Any term is a WFF.
•• If x is a WFF, so are (x) and ¬x. All free tuple variables in x remain free
in (x) and ¬x, and all bound tuple variables in x remain bound in (x)
and ¬x.
•• If x, y are WFFs, so are x ∧ y and x ∨ y. All free tuple variables in x
and y remain free in x ∧ y and x ∨ y.
•• If x is a WFF containing a free tuple variable T, then ∃T(x) and
∀T(x) are WFFs. T now becomes a bound tuple variable, but any
other free tuple variables remain free. All bound terms in x remain
bound in ∃T(x) and ∀T(x).
•• No other formulas are WFFs.
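As a small illustration (using the same illustrative EMPLOYEE relation as above), the first tuple relational calculus expression below retrieves the names of employees located in Chennai, and the second uses the existential quantifier over an assumed WORKS relation:
{ T.EMP-NAME | EMPLOYEE(T) ∧ T.CITY = 'Chennai' }
{ T.EMP-NAME | EMPLOYEE(T) ∧ ∃W (WORKS(W) ∧ W.EMP-NAME = T.EMP-NAME ∧ W.SALARY > 50000) }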
13. (a)
Refer Q. No. 13 (a) November/December 2012
13. (b)
MULTI-VALUED DEPENDENCIES (MVD) AND FOURTH
NORMAL FORM (4NF)
To deal with the problem of BCNF, R. Fagin introduced the idea of
multi-valued dependency (MVD) and the fourth normal form (4NF). A
multi-valued dependency (MVD) is a functional dependency where the
dependency may be to a set and not just a single value. It is defined as
X →→Y in relation R (X, Y, Z), if each X value is associated with a set
of Y values in a way that does not depend on the Z values. Here X and
Y are both subsets of R. The notation X →→Y is used to indicate that a
set of attributes of Y shows a multi-valued dependency (MVD) on a set
of attributes of X.
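A standard illustration (the relation is made up for this example): in COURSE(SUBJECT, TEACHER, BOOK), each subject is taught by a set of teachers and uses a set of books, and the set of teachers is independent of the set of books. Then SUBJECT →→ TEACHER and SUBJECT →→ BOOK hold while SUBJECT → TEACHER does not, so the relation is not in 4NF; decomposing it into SUBJECT_TEACHER(SUBJECT, TEACHER) and SUBJECT_BOOK(SUBJECT, BOOK) removes the redundancy, and both projections are in 4NF.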
14. (a)
TYPES OF DATABASE RECOVERY
In case of any type of failures, a transaction must either be aborted or
committed to maintain data integrity. Transaction log plays an important
role for database recovery and bringing the database in a consistent state
in the event of failure. Transactions represent the basic unit of recovery
in a database system. The recovery manager guarantees the atomicity
and durability properties of transactions in the event of failures. During
recovery from failure, the recovery manager ensures that either all the
effects of a given transaction are permanently recorded in the database
or none of them are recorded. A transaction begins with successful
execution of a <T, BEGIN> (begin transaction) statement. It ends with
successful execution of a COMMIT statement. The following two types
of transaction recovery are used:
•• Forward recovery.
•• Backward recovery.
since the last database copy was made. In this case, only the last one of
those changes at the point that the disk was destroyed needs to be used
in updating the database copy in the rolled-forward operation. Another
roll-forward variation is to record an indication of what the transaction
itself looked like at the point of being executed, along with other necessary
supporting information, instead of reading before and after images of
the data in the log.
Backward Recovery (or UNDO)
Backward recovery (also called roll-backward) is the recovery procedure,
which is used in case an error occurs in the midst of normal operation on
the database. The error could be a human error keying in a value, or a program
ending abnormally and leaving incomplete some of the changes to the database
that it was supposed to make. If the transaction had not committed at the time
of failure, it may cause inconsistency in the database because, in the
interim, other programs may have read the incorrect data and made use
of it. Then the recovery manager must undo (roll back) any effects of the
transaction on the database. The backward recovery guarantees the atomicity
property of transactions.
Fig. 13.2 illustrates an example of the backward recovery method. In case
of a backward recovery, the recovery is started with the database in its
current state and the transaction log is positioned at the last entry that
was made in it. Then a program reads 'backward' through the log, resetting
each updated data value in the database to its "before image" as recorded
in the log, until it reaches the point where the error was made. Thus,
the program ‘undoes’ each transaction in the reverse order from that in
which it was made.
Example 1
Roll-backward (undo) and roll forward (redo) can be explained with
an example as shown in Fig. 13.3 in which there are a number of
concurrently executing transactions T1, T2, ....., T6 . Now, let us assume
that the DBMS starts execution of transactions at time ts but fails at
time tf due to disk crash at time tc. Let us also assume that the data for
transactions T2 and T3 has already been written to the disk (secondary
storage) before failure at time tf.
It can be observed from Fig. 13.3 that transactions T1 and T6 had not
committed at the point of the disk crash. Therefore, the recovery manager
must undo the transactions T1 and T6 at the start. However, it is not clear
from Fig. 13.3 to what extent the changes made by the already
committed transactions T2, T3, T4 and T5 have been propagated to the database
on secondary storage. This uncertainty could be because the buffers may
or may not have been flushed to secondary storage. Thus, the recovery
manager would be forced to redo transactions T2, T3, T4 and T5 .
the write log (W, 1, A, 50, 20), the value 50 is the before image for the
balance column in this row and 20 is the after image for this column.
Now, let us assume that a system crash occurs immediately after the
operation W1(B, 80) has completed, in the sequence of events of Table
13.1. This means that the log entry (W, 1, B, 50, 80) has been placed
in the log buffer, but the last point at which the log buffer was written
out to disk was with the log entry (C, 2). This is the final log entry that
will be available when recovery is started to recover from the crash. At
this time, since transaction T2 has committed while transaction T1 has
not, we want to make sure that all updates performed by transaction T2
are placed on disk an that all updates performed by transaction T1 are
rolled back on disk. The final values for these data items after recovery
has been performed should be A = 50, B = 50, and C = 50, which are the
values they had just before Table 13.1.
After the crash, the system is reinitialised and a command is given to initiate
database recovery. The process of recovery takes place in two phases
namely (a) roll backward or ROLLBACK and (b) roll forward or ROLL
FORWARD. In the ROLLBACK phase, the entries in the sequential
log file are read in reverse order back to system start-up, when all data
access activity began. We assume that the system start-up happened
just before the first operation 1 R (A, 50) of transaction history. In the
ROLL FORWARD phase, the entries in the sequential log file are read
forward again to the last entry. During the ROLLBACK step, recovery
performs UNDO of all the updates that should not have occurred,
because the transaction that made them did not commit. It also makes a
list of all transactions that have committed. We have assumed here that
the ROLLBACK phase occurs first and the ROLL FORWARD phase
afterward, as is the case in most of the commercial DBMSs such as DB2,
System R of IBM.
Table 13.2 and 13.3 list all the log entries encountered and the actions
taken during ROLLBACK and ROLL FORWARD phases of recovery.
It is to be noted that the steps of ROLLBACK are numbered on the left
and the numbering is continued during the ROLL FORWARD phase of
table 13.3. During ROLLBACK the system reads backward through the
log entries of the sequential log file and makes a list of all transactions
that did and did not commit. The list of committed transactions is used in
the ROLL FORWARD, but the list of transactions that did not commit is
used to decide when to UNDO updates. Since the system knows which
transactions did not commit as soon as it encounters (reading backward)
the final log entry, it can immediately begin to UNDO write log changes
of uncommitted transactions by writing before images onto disk over
the row values affected. Disk buffering is used during recovery to read
in pages containing rows that need to be updated by UNDO or REDO
steps. An example of UNDO write is shown in step 4 of table 13.2. Since
the transaction responsible for the write log entry did not commit, it
should not have any transactional updates out on disk. It is possible that
some values given in the after images of these write log entries are not
out on disk. But, in any event it is clear that writing the before images
in place of these data items cannot hurt. Eventually, we return to the
value such data items had before any uncommitted transactions tried to
change them.
Table 13.3 ROLL FORWARD process for transaction history taking
place after ROLLBACK of table 13.2
During the ROLL FORWARD phase of table 13.3, the system simply
uses the list of committed transactions gathered during the ROLLBACK
phase as a guide to REDO updates of committed transactions that might
not have gotten out of disk. An example of REDO is shown in step 9
of table 13.3. At the end of this phase the data item would have the
right values. All updates of transactions that committed are applied and
all updates of transactions that did not complete are rolled back. It can
be noted that in step 4 of ROLLBACK of table 13.2, the value 50 is
written to the data item A and in step 9 of ROLL FORWARD of table
13.3, the value 50 is written to data item C. It can be recalled that the crash occurred just after the operation W1(B, 80) of the transaction history. Since the log entry for this operation did not get to the disk, as can be seen in Table 13.1, the before image of B cannot be applied during recovery. The update of B to the value 80 also did not get out to disk. Thus, the final values for the three data items mentioned in the original transaction history are A = 50, B = 50 and C = 50, which are the values just before the events of Table 13.1.
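The two phases can be illustrated with a small, self-contained sketch. Python is used here purely for illustration; the log format (W, txn, item, before, after) / (C, txn) follows the entries above, but the before image of C and the on-disk state at the crash are assumed values for the example, not figures taken from Table 13.1.

# Minimal sketch of the two-phase (ROLLBACK then ROLL FORWARD) recovery idea.
# Log entries: ("W", txn, item, before, after) for writes, ("C", txn) for commits.

def recover(log, disk):
    committed = set()

    # ROLLBACK phase: read the log backwards, collect committed transactions
    # and UNDO the writes of transactions that never committed.
    for entry in reversed(log):
        if entry[0] == "C":
            committed.add(entry[1])
        else:                               # ("W", txn, item, before, after)
            _, txn, item, before, _ = entry
            if txn not in committed:
                disk[item] = before         # restore the before image

    # ROLL FORWARD phase: read the log forwards and REDO the writes of
    # committed transactions, in case they never reached the disk.
    for entry in log:
        if entry[0] == "W":
            _, txn, item, _, after = entry
            if txn in committed:
                disk[item] = after          # reapply the after image
    return disk

# The log entries available at the crash, up to (C, 2); the before image of C
# is illustrative only, and so is the assumed on-disk state below.
log = [("W", 1, "A", 50, 20), ("W", 2, "C", 100, 50), ("C", 2)]
disk = {"A": 20, "B": 50, "C": 100}
print(recover(log, disk))                   # {'A': 50, 'B': 50, 'C': 50}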
Media Recovery
Media recovery is performed when there is a head crash on the disk (akin to a record being scratched by a phonograph needle). During a head crash, the data stored on the disk is lost. Media recovery is based on periodically
making a copy of the database. In the simplest form of media recovery,
before system start-up, bulk copy is performed for all disks being run
on a transactional system. The copies are made to duplicate disks or
to less expensive tape media. When a database object such as a file
or a page is corrupted or a disk has been lost in a system crash, the disk is replaced with a back-up disk, and the normal recovery process is performed. During this recovery, however, ROLLBACK is performed all the way back to system start-up, since one cannot depend on the backup disk having the updates that were forced out up to the last checkpoint. Then, ROLL FORWARD is performed from that point to the time of the system crash. Thus, the normal recovery process recovers all updates onto this backup disk.
14. (b)
Refer Q. No. 14. (a) from May/June 2013
15. (a)
Refer Qn. No. 15. (a) from May/June 2013
15. (b)
FILE ORGANISATION
A file organisation in a database system essentially is a technique of
physical arrangement of records of a file on secondary storage device.
It is a method of arranging data on secondary storage devices and
addressing them such that it facilitates storage and read/write (input/
output) operations of data or information requested by the user. The
organisation of data in a file is influenced by a number of factors that must be taken into consideration while choosing a particular technique. Some of these factors are as follows:
• Fast response time to access a record (data retrieval), transfer the data to main memory, and write or modify a record.
• High throughput.
• Intended use (type of application).
• Efficient utilisation of secondary storage space.
• Efficient file manipulation operations.
• Protection from failure or data loss (disk crashes, power failures and so on).
• Security from unauthorised use.
• Provision for growth.
• Cost.
The records of a file reside on one or several pages in secondary storage. Each record has a unique identifier called a record-id. A file can consist of:
(a) Fixed length records.
(b) Variable length records.
Fixed-length Records
In a file with fixed-length records, all records on the page are of the
same slot length. Record slots are uniform, and records are arranged
consecutively within a page. Every record in the file has exactly
the same size (in bytes). The below Figure (a) shows a structure of
PURCHASE record and the below Figure (b) shows number of records
in the PURCHASE record. As shown, all records are having same fixed
length of total 50 bytes, if we assume that each character occupies 1 byte
of space. That means, each record uses 50 bytes and occupies slots in the
page one after another in a serial sequence. A record is identified using
both page-id and slot number of the record.
PURCHASE record
The first operation is to insert records in the first available slots (or empty spaces). Whenever a record is deleted, the empty slot created by the deletion must be filled with some other record of the file. This can be achieved using a number of alternatives. The first alternative is that the record that came after the deleted record is moved into the empty space formerly occupied by the deleted record. This operation continues until every record following the deleted record has been moved ahead. Fig. (a) below shows an empty slot created by the deletion of record 5, whereas in Fig. (b) all the subsequent records, from record 6 onwards, have moved one slot upward. All empty slots appear together at the end of the page. Such an approach requires moving a large number of records, depending on the position of the deleted record in a page of the file.
Deletion operation on PURCHASE record
The second alternative is that only the last record is shifted into the empty slot of the deleted record, instead of disturbing a large number of records, as shown in Fig. (c). In both these alternatives it is undesirable to move records to occupy the empty slot of a deleted record, because doing so requires additional block accesses. As insertion of records is a more frequently performed operation than deletion, it is more appropriate to keep the empty slot of the deleted record vacant for a subsequent insertion of a record. Therefore, a third alternative is used in which the deletion of a record is handled by using an array of bits (or bytes), called the file header, at the beginning of the file, one per slot, to keep track of free (or empty) slot information. As long as a record is stored in a slot, its bit is ON; when the record is deleted, its bit is turned OFF. The file header thus tracks the free slots.
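A minimal sketch of this third alternative follows, assuming an in-memory page and illustrative class and method names (FixedLengthPage, insert, delete); a real DBMS would of course keep the bitmap in the file header on disk.

# Sketch of fixed-length record slots managed through a free-slot bitmap
# (the "file header" of the third alternative). Names are illustrative.

class FixedLengthPage:
    def __init__(self, num_slots):
        self.bitmap = [False] * num_slots   # False = slot is free (bit OFF)
        self.slots = [None] * num_slots

    def insert(self, record):
        """Store the record in the first free slot and return its slot number."""
        for slot, used in enumerate(self.bitmap):
            if not used:
                self.slots[slot] = record
                self.bitmap[slot] = True     # bit ON: slot is occupied
                return slot
        raise RuntimeError("page full")

    def delete(self, slot):
        """Turn the slot's bit OFF; no other record is moved."""
        self.bitmap[slot] = False
        self.slots[slot] = None

page = FixedLengthPage(num_slots=4)
r0 = page.insert(("KLY System", "ORD-1"))
r1 = page.insert(("Concept Shapers", "ORD-2"))
page.delete(r0)                              # the slot simply becomes free
print(page.insert(("Trinity Agency", "ORD-3")) == r0)   # True: slot reused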
In the reserved-space method for variable-length records, a field shorter than the maximum size is filled with a special null or end-of-record symbol. Fig. 3.12 shows the fixed-length representation of the file of
Fig. 3.11. As shown, suppliers KLY System, Concept Shapers and Trinity
Agency have maximum of two order numbers (ORD-NO). Therefore, the
PURCHASE-INFO array of PURCHASE-LIST record contains exactly
two records for maximum of two ORD-NO per supplier. The suppliers
with less than two ORD-NO will have records with null field (symbol
⊥) in the place of second ORD-NO. The reserved-space method is useful
when most records have a length close to the maximum. Otherwise, a
significant amount of space may be wasted.
Fig. 3.12 Reserved-space method of Fixed-length representation for
implementing variable-length records
10. When is it preferable to use a dense index rather than a sparse index?
Explain your answer.
PART–B (5 × 16 = 80 Marks)
11. (a) Discuss in detail about database system architecture with neat dia-
gram.
Or
(b) Draw an E-R diagram for a banking enterprise with almost all com-
ponents and explain.
12. (a) Explain in detail about Relational Algebra, Domain Relational Cal-
culus and Tuple Relational Calculus with suitable examples.
Or
(b) Briefly present a survey on Integrity and Security.
13. (a) Explain in detail about 1NF, 2NF, 3NF and BCNF with suitable
examples.
Or
(b) Describe about the Multi-Valued Dependencies and Fourth normal
form with suitable example
14. (a) Discuss in detail about Transaction Recovery, System Recovery and
Media Recovery.
15. (a) Construct a B+ tree to insert the following key elements (order of the
tree is 3) 5, 3, 4, 9, 7, 15, 14, 21, 22, 23.
Or
(b) Describe in detail about how records are represented in a file and
how to organize them in a file.
2. Data Models:-
Data models can be classified into four different categories:
• Relational Model.
• Entity-Relationship Model.
• Object-Based Data Model.
• Semi-structured Data Model.
5. result := {R};
done := false;
compute F+;
while (not done) do
    if (there is a schema Ri in result that is not in BCNF)
    then begin
        let α → β be a non-trivial functional dependency that holds on Ri
        such that α → Ri is not in F+, and α ∩ β = ∅;
        result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
    end
    else done := true;
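A rough Python sketch of this loop follows. It uses attribute closures to find a violating dependency and, for simplicity, projects only the given dependencies rather than the full F+ that the algorithm requires, so it is an approximation for illustration, not the textbook algorithm verbatim.

# Sketch of BCNF decomposition driven by attribute closures. Illustrative only.

def closure(attrs, fds):
    """Closure of a set of attributes under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_decompose(relation, fds):
    result = [frozenset(relation)]
    done = False
    while not done:
        done = True
        for ri in result:
            # keep only the given dependencies whose attributes lie in Ri
            local = [(l, r & ri) for l, r in fds if l <= ri and (r & ri)]
            for alpha, beta in local:
                beta = beta - alpha                      # keep alpha and beta disjoint
                if beta and not ri <= closure(alpha, fds):   # alpha is not a key of Ri
                    result.remove(ri)
                    result += [frozenset(ri - beta), frozenset(alpha | beta)]
                    done = False
                    break
            if not done:
                break
    return result

fds = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(bcnf_decompose(set("ABC"), fds))   # two schemas: {A, B} and {B, C}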
8.
• Ensures serializability.
• Because it produces only cascadeless schedules, recovery is very easy.
• A transaction always reads a value written by a committed transaction.
10. Using the dense index, we follow the pointer directly to the first "Perryridge" record. We process this record, and follow the pointer in that record to locate the next record in search-key order. We continue processing records until we encounter a record for a branch other than "Perryridge".
PART B
11. (a)
[Figure: database system structure. Users (naive users such as tellers, agents and web users; application programmers; sophisticated users/analysts; database administrator) issue DML queries and DDL statements. The query processor contains the compiler and linker, the DDL interpreter, the DML compiler and organizer, application program object code and the query evaluation engine. The storage manager contains the buffer manager, file manager, authorization and integrity manager and transaction manager.]
DML compiler:
Translates DML statements in a query language into an evaluation plan consisting of low-level instructions.
Query evaluation engine:
Executes the low-level instructions generated by the DML compiler.
Data Model:
It is a collection of concepts that can be used to describe the structure
of a database.
11. (b)
[E-R diagram for a banking enterprise: Branch (br-name, br-city, assets); Customer (cus-id, cus-name, cus-street, cus-city); Loan (loan-number, amount) linked to Branch via loan-branch; Account (acc-num, balance) linked to Customer via depositor (access-date); cust-banker relationship (type); Employee (emp-id, emp-name, dept-name, telephone-num) with a recursive manager/worker relationship.]
Select operation:
σ<selection condition>(R), where R is a relation.
The selection condition is of the form
<attribute name> <comparison operator> <constant value> or
<attribute name> <comparison operator> <attribute name>
Eg:- σ salary > 1000(EMPLOYEE)
Project operation:
It is used to select some of the columns from the tables and discard
the other column.
π<attribute list>(R)
Eg: π name, DNO(EMPLOYEE)
Rename operation:-
This operation can rename either the relation name or the attribute
names, or both.
ρ S(B1, B2, …, Bn)(R) = renames both the relation and its attributes
OR
ρ S(R) = renames only the relation
Division operation (÷):
It is suited to queries that include the phrase "for all".
Eg: Retrieve the names of employees who work on all the projects that "ABC" works on.
ABC ← σ ename = 'ABC'(EMPLOYEE)
ABC_PNOS ← π PNO(WORKS-ON ⋈ EEID = EID ABC)
EID_PNOS ← π EEID, PNO(WORKS-ON)
EIDS ← EID_PNOS ÷ ABC_PNOS
RESULT ← π ename(EIDS * EMPLOYEE)
In general, R(Z) ÷ S(X), where X ⊆ Z and Y = Z − X (i.e., Z = X ∪ Y).
Aggregate functions and grouping:
The script-F notation is
<grouping attributes> ℱ <function list>(R)
where the grouping attributes are attributes of the relation R, and the function list consists of (<function> <attribute>) pairs, <function> being a function name.
Relational calculus:-
It is a formal query language where we can write one declara-
tive expression to specify a retrieval request and hence there is no
description of how to retrieve it.
A calculus expression specifies what is to be retrieved rather than
how to retrieve it.
Relational calculus is considered to be non procedural language.
Tuple Relational calculus:-
Tuple variables and Range Relations:-
A tuple relational calculus is based on specifying a number of tuple
variables.
{t | COND(t)}
where t is a tuple variable and COND(t) is a conditional expression involving t.
Eg: {t | EMPLOYEE(t) and t.salary > 50,000}
Expressions and Formulas:-
A general expression of the tuple relational calculus is of the form
{t1.A1, t2.A2, …, tn.An | COND(t1, t2, …, tn, tn+1, …, tn+m)}
where t1, t2, …, tn+m are tuple variables, each Ai is an attribute of the relation over which ti ranges, and COND is a condition or formula built from atoms:
1. An atom of the form R(ti).
2. An atom of the form ti.A op tj.B, where op is one of the comparison operators {=, <, ≤, >, ≥, ≠}.
3. An atom of the form ti.A op c or c op tj.B, where c is a constant.
Existential and universal quantifiers:-
A tuple variable t is bound if it is quantified, meaning that it appears in an (∃t) or (∀t) clause; otherwise it is free.
• An occurrence of a tuple variable in a formula F that is an atom is free in F.
12. (b) It is a mechanism used to prevent invalid data entry into the table.
Types
Domain integrity constraints
Entity integrity constraints
Referential integrity constraints
Domain integrity constraints
Types
(i) Not Null constraint
(ii) Check constraint
(i) Not Null constraint
It is used to enforce that a particular column will not accept null values.
Eg: Create table employee (eid Number(5) constraint emp not null, ename Varchar(2));
Here emp is the constraint name (it is optional).
[Figure: 2NF decomposition of EMP-PROJ into EMP-PROJ 1 (eid, ename) and EMP-PROJ 2 (eid, Pnumber, Hours), with a further relation (Pnumber, Pname, PLOCATION); and decomposition of EMP-DEPT (ename, eid, DOB, address, Dnumber, Dname, DMGRid) into (ename, eid, DOB, address, Dnumber) and (Dnumber, Dname, DMGRid).]
Relational Decomposition:-
A single relation schema R = {A1, A2, …, An} includes all the attributes of the database. A decomposition of R is a set of relation schemas {R1, R2, …, Rm} such that
R1 ∪ R2 ∪ … ∪ Rm = R
[Fig. (a): decomposition of EMP into EMP-PROJECTS and EMP-DEPENDENTS]
EMP is not in 4NF, because in the non-trivial MVDs ENAME →→ PNAME and ENAME →→ DNAME, ENAME is not a super key of EMP.
Importance of 4NF:
Suppose the EMP relation has an additional employee BB who has three dependents (CC, DD, EE) and works on four different projects (A, B, C, D). There are then 16 tuples in EMP, but with the decomposition into (a) and (b) we need to store only 11 tuples in the two relations. Not only does the decomposition save storage, but the update anomalies associated with multivalued dependencies are also avoided.
Serializability:-
There are two types of serializability:
• Conflict serializability
• View serializability
Conflict Serializability:
Let us consider a schedule S in which there are two consecutive
instructions Ii and Ij of transactions Ti and Tj respectively
(i ≠ j)
(i) Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not matter, since the same value of Q is read by Ti and Tj regardless of the order.
(ii) Ii = read(Q), Ij = write(Q). If Ii comes before Ij, then Ti does not read the value of Q that is written by Tj in instruction Ij. If Ij comes before Ii, then Ti reads the value of Q that is written by Tj. Thus, the order of Ii and Ij matters in S.
(iii) Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters, for the same reason as in the previous case.
(iv) Ii = write(Q), Ij = write(Q). Since both instructions are write operations, the order of these instructions does not affect either Ti or Tj. However, the value obtained by the next read(Q) instruction of S is affected, since the result of only the latter of the two write instructions is preserved in the database.
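These four conflict cases are exactly what a precedence-graph test for conflict serializability is built on. The sketch below is illustrative only (the schedule encoding and function names are assumptions made here, not part of the answer), applied to the first schedule shown after it.

# Sketch of a conflict-serializability test: build a precedence graph from the
# conflicting pairs of cases (ii)-(iv) above and look for a cycle.

def precedence_graph(schedule):
    """schedule: list of (txn, op, item) with op in {'R', 'W'}."""
    edges = set()
    for i, (ti, opi, x) in enumerate(schedule):
        for tj, opj, y in schedule[i + 1:]:
            # conflict: same item, different transactions, at least one write
            if x == y and ti != tj and (opi == 'W' or opj == 'W'):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visited, stack = set(), set()

    def dfs(node):
        visited.add(node)
        stack.add(node)
        for nxt in graph.get(node, ()):
            if nxt in stack or (nxt not in visited and dfs(nxt)):
                return True
        stack.discard(node)
        return False

    return any(dfs(n) for n in graph if n not in visited)

# The first schedule below: T1 reads and writes A then B, interleaved with T2.
s = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
     ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
edges = precedence_graph(s)
print(edges)                 # {('T1', 'T2')}
print(has_cycle(edges))      # False, so the schedule is conflict serializable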
Fig: A schedule showing only the read and write operations:
T1: read(A); T1: write(A); T2: read(A); T2: write(A); T1: read(B); T1: write(B); T2: read(B); T2: write(B).
Fig: After swapping a pair of instructions:
T1: read(A); T1: write(A); T2: read(A); T1: read(B); T2: write(A); T1: write(B); T2: read(B); T2: write(B).
T1: read(Q); T2: write(Q); T1: write(Q).
View serializability:
Schedules S and S′ are view equivalent if the following conditions hold:
(1) For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S′, also read the initial value of Q.
(2) For each data item Q, if transaction Ti executes read(Q) in schedule S, and that value was produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of transaction Ti must, in schedule S′, also read the value of Q that was produced by the same write(Q) operation of transaction Tj.
(3) For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S′.
T3: read(Q); T4: write(Q); T3: write(Q); T5: write(Q)
(view equivalent to the serial schedule <T3, T4, T5>, but not conflict serializable).
15. (a) A B+ tree is a data structure used in query processing in the database; tree pointers are used to search for key elements. Now we construct the B+ tree for the following elements (the order of the tree is 3): 5, 3, 4, 9, 7, 15, 14, 21, 22, 23.
[Figures: step-by-step construction of the order-3 B+ tree, inserting 5, 3, 4, 9, 7, 15, 14, 21, 22, 23 in turn; intermediate trees show leaves such as (3, 4), (5, 7), (9, 15) and (22, 23), with splits propagating to the internal nodes.]
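Once such a tree exists, a search simply follows one root-to-leaf path. The sketch below is illustrative only: a hand-built two-level tree over a few of the keys above, not the full tree produced by the construction, and the class and function names are assumptions.

# Sketch of searching a B+ tree of order 3: internal nodes route the search,
# the leaves hold the keys.

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys
        self.children = children          # None for a leaf node

def search(node, key):
    while node.children is not None:      # descend until a leaf is reached
        i = 0
        while i < len(node.keys) and key >= node.keys[i]:
            i += 1
        node = node.children[i]
    return key in node.keys

leaves = [Node([3, 4]), Node([5, 7]), Node([9, 15])]
root = Node([5, 9], leaves)
print(search(root, 7), search(root, 8))   # True False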
[Figure: a file of account records with a free-list header. Record 0: (A-102, Perryridge, 400); record 2: (A-215, Mianus, 700); record 3: (A-101, Downtown, 500); record 5: (A-201, Perryridge, 900); records 1, 4 and 6 are free.]
Fig: The depositor relation:
Customer-name   Account-number
Hayes           A-102
Hayes           A-220
Hayes           A-503
Hayes           A-305
PART B – (5 × 16 = 80 marks)
11. (a) (i) Construct an E-R diagram for a car-insurance company whose
customers own one or more cars each. Each car has associated
with it zero to any number of recorded accidents. State any
assumptions you make. (6)
(ii) A university registrar’s office maintains data about the following
entities:
1. Courses, including number, title, credits, syllabus, and pre-
requisites,
2. Course offerings, including course number, year, semester,
section number, instructor, timings, and classroom;
3. Students, including student-id, name, and program; and
4. Instructors, including identification number, name, depart-
ment, and title. Further, the enrollment of students in
courses and grades awarded to students in each course they
are enrolled for must be appropriately modeled. Construct
an E-R diagram for the registrar’s office. Document all
assumptions that you make about the mapping constraints.
(10)
Or
(b) (i) With a neat-sketch discuss the three-schema architecture of a
DBMS, (8)
(ii) What is aggregation in an ER model? Develop an E-R diagram
using aggregation that captures the following information:
Employees work for projects. An employee working for a par-
ticular project use various machinery. Assume necessary attri-
butes. State any assumptions you make. Also discuss about the
ER diagram you have designed. (2 + 6)
12. (a) (i) Explain the distinctions among the terms primary key, candidate
key, and super key. Give relevant examples. (6)
(ii) What is referential integrity? Give relevant example. (4)
(iii) Consider the following six relations for an Order-processing
Database
Application in a Company:
CUSTOMER (CUSTNO, CNAME, CITY)
ORDER (ORDERNO, ODATE, CUSTNO, ORD_AMT)
ORDER_ITEM (ORDERNO, ITEMNO, QTY)
ITEM (ITEMNO, ITEM_NAME, UNIT_PRICE)
SHIPMENT (ORDERNO, ITEMNO, WAREHOUSENO,
SHIP_DATE)
WAREHOUSE (WAREHOUSENO, CITY)
Here, ORD_AMT refers to total amount of an order; ODATE is
the date the order was placed: SHIP_DATE is the date an order
is shipped from the warehouse. Assume that an order can be
shipped from several warehouses. Specify the foreign keys for
this schema, stating any assumptions you make. (6)
(b) With relevant examples discuss the various operations in Relational
Algebra. (16)
13. (a) Define a functional dependency. List and discuss the six inference
rules for functional dependencies. Give relevant examples.(16)
Or
(b) (i) Give a set of Functional dependencies for the relation schema
R(A,B,C,D,E) with primary key AB under which R is in 2NF, but
not in 3NF. (5)
(ii) Prove that any relation schema with two attributes is in
BCNF. (5)
(iii) Consider a relation R that has three attributes ABC. It is
decomposed into relations R1, with attributes AB and R2 with
attributes BC. State the definition of lossless-join decomposition
with respect to this example, Answer this question concisely by
writing a relational algebra equation involving R,R1, and R2. (6)
14. (a) (i) Define a transaction. Then discuss the following with relevant
examples: (8)
1. A read only transaction
2. A read write transaction
3. An aborted transaction
(ii) With a neat sketch discuss the states a transaction can be in. (4)
(iii) Explain the distinction between the terms serial schedule and
serializable schedule. Give relevant example. (4)
(b) (i) Discuss the ACID properties of a transaction. Give relevant
example. (8)
(ii) Discuss two phase locking protocol. Give relevant example. (8)
15. (a) (i) When is it preferable to use a dense index rather than a sparse
index? Explain your answer. (4)
(ii) Since indices speed query processing, why might they not be
kept on several search keys? List as many reasons as possible.
(6)
(iii) Explain the distinction between closed and open hashing. Dis-
cuss the relative merits of each technique in database applica-
tions. (6)
Or
(b) Diagrammatically illustrate and discuss the steps involved in pro-
cessing a query. (16)
Solutions
PART A
2. A derived attribute is one that represents a value derivable from the value of a related attribute or set of attributes.
Eg: the age attribute can be derived from the Date of Birth attribute.
4. 1 Open
2 Fetch
5. Yes, the above relation is in second normal form because it is in first normal form and there is no partial dependency.
6. No, the above relation is not in third normal form, because it contains a transitive dependency.
7. 1. Commit: It saves all transactions that have not already been saved to
the database since the last commit or rollback command was issued.
2. Rollback: It is used to undo transactions that have not already been
saved to the database.
3. Savepoint: It establishes a point back to which you may later roll back.
4. Set Transaction: It establishes properties for the current transaction.
8. 1. Lock-Based Protocols:
To ensure serializability, data items must be accessed in a mutually exclusive manner, e.g. the two-phase locking protocol.
2. Timestamp-Based Protocols:
These ensure serializability by selecting an ordering among transactions in advance using timestamps.
PART B
11. (a) (i) [E-R diagram for the car-insurance company: Driver (driver-id, name, address), Car (license, model, year), Accident (report-no, date, location, damage-amount); customers own cars, and cars participated in accidents.]
(ii) [E-R diagram fragment for the registrar's office: course offerings with section number, a "taught by" relationship to Instructor and a "has" relationship to Course.]
Tables used:
1. University (name, address, award)
2. Student (student-id, name, program)
3. Courses (title, credits, syllabus, prerequisites)
4. Course offerings (course number, year, semester, section number, instructor, timings, classroom)
5. Instructor (identification number, name, department, title)
11. (b) (i) A database system is partitioned into modules that deal with each of the responsibilities of the overall system.
Storage Manager:-
A storage manager is a program module that provides the inter-
face between the low level data stored in the database and the
application programs and queries submitted to the system.
• The storage manager is responsible for the interaction with
the file manager.
• The storage manager translates the various DML statements
into low level file system commands.
Components of Storage Manager:
Authorization and Integrity Manager:
It tests for satisfaction of various integrity constraints and checks
the authority of users accessing the data.
[Figure: components of the query processor (DML compiler and organizer, query evaluation engine, application program object code) and of the storage manager (buffer manager, file manager, authorization and integrity manager, transaction manager).]
Transaction Manager:
It ensures that the database remains in a consistent state despite system failures, and that concurrent transaction executions proceed without conflict.
File Manager:
It manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
Buffer Manager:
It is responsible for fetching data from disk storage into main
memory & deciding what data to cache in main memory.
Data files:
which store the database itself.
Data Dictionary:
It contains metadata that is about data.The schema of a table is an
example of metadata.
Indices:
which provides fast access to data items that hold particular
values.
The query Processor:
It helps the database system to simplify and facilitate access to
data.
DDL Interpreter:
which interprets DDL statements and records the definitions in
the data dictionary
DML Compiler:
which translates DML statements in a query language into an
evaluation plan consisting of low level instructions that query
evaluation engine understands.
Query Evaluation Engine:
which executes low level instructions generated by the DML
compiler.
11. (b) (ii) [E-R diagram using aggregation: the works-for relationship between Employee and Project is aggregated and related to Machinery through a "uses" relationship.]
12. (a) (i)
• A key is a set of attributes that allows us to distinguish entities from each other.
• A key also helps to uniquely identify relationships.
Super key:
A super key is a set of one or more attributes that allows us to
identify uniquely an entity in the entity set.
Eg Roll_No attribute of the entity set ‘student’ distinguishes
one student entity from another.
Candidate Key:
A super key may contain extraneous attributes, and we are often interested in the smallest super key. A super key for which no proper subset is a super key is called a candidate key.
Eg: {stu_name, stu_street} uniquely identifies a student; Roll_no and {stu_name, stu_street} are both candidate keys.
12. (a) (ii) Referential integrity means that "a value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation".
→ The relation schema for a weak entity set must include the primary key of the entity set on which it depends.
→ The relation schema for a weak entity set therefore includes a foreign key, which leads to a referential integrity constraint.
Eg: create table Deposit (B_name char(15), Acc_no char(10), cust_name char(20) not null, Bal integer, primary key (Acc_no, cust_name), foreign key (B_name) references Branch, foreign key (cust_name) references Customer);
12. (a) (iii) An order can be shipped from several warehouses, so in SHIPMENT the combination of ORDERNO, ITEMNO and WAREHOUSENO identifies each shipment. The foreign keys for this schema are:
1. CUSTNO in ORDER references CUSTOMER.
2. ORDERNO in ORDER_ITEM and in SHIPMENT references ORDER.
3. ITEMNO in ORDER_ITEM and in SHIPMENT references ITEM.
4. WAREHOUSENO in SHIPMENT references WAREHOUSE.
Consider 2 relations
Depositor
Cust-name city
Hayes Pune
Johnson Mumbai
Jones Solapur
Lindsay Nashik
Smith Pune
Turner Mumbai
Borrower
Cust-name city
Adamar Mumbai
curry Pune
Hayes Pune
Jackson Solapur
Jones Solapur
Smith Pune
William Kolhapur
Union:
→ operation denoted by ∪
→ It includes all the tuples that are in either depositor or borrower (or both).
Result
Depositor ∪ Borrower
Cust-name city
Hayes Pune
.
.
.
.
Williams Kolhapur
Intersection:
The result of ∩ is a relation that includes all tuples that are in both depositor and borrower.
O/P
Depositor ∩ Borrower
Cust - name City
Hayes Pune
Jones Solapur
Smith Pune
Difference:
→ denoted by depositor − borrower
→ the result contains all tuples that are in depositor but not in borrower.
Depositor-Borrower
Cust-name City
Johnson Mumbai
Lindsay Nashik
Turner Mumbai
Cartesian Product:
→ also known as CROSS PRODUCT or CROSS JOIN, denoted by '×'.
→ the result relation has one tuple for each combination of tuples from the participating relations.
Pub_Info
Pub_Code Name
P001 MCGRAW
P002 PHI
P003 Pearson
BOOK_INFO
BOOK-ID Title
B001 DBMS
B002 Compiler
Pub_Info × book_info
Pub_Code Name Book_ID Title
P001 MCGRAW B001 DBMS
P002 PHI B001 DBMS
P003 Pearson B001 DBMS
P001 MCGRAW B002 Compiler
P002 PHI B002 Compiler
P003 Pearson B002 Compiler
Select operation:
The select operation selects tuples that satisfy a given predicate.
→ represented as σ<selection condition>(R)
Query: Display details of account holders living in the city 'Pune'
σ city = 'Pune'(depositor)
The project operation:
→ The project operation selects certain columns from a table while discarding others.
π<attribute list>(R)
Eg Query: π name(borrower)
Rename Operation:
→ we can rename either the relation or the attributes, or both.
ρ S(new attribute names)(R)
Eg: ρ Temp(Bname, Aname, Pyear, Bprice)(BOOK)
→ the attributes are renamed.
13. (a)
• Functional dependencies differentiate good database designs from bad ones.
• A functional dependency is a type of constraint that is a generalization of the notion of a key.
Consider a relation schema R. Let α ⊆ R and β ⊆ R. The functional dependency α → β holds on R if, for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β].
K is a super key of R if K → R, i.e., K is a super key if, whenever t1[K] = t2[K], it is also the case that t1[R] = t2[R] (i.e., t1 = t2).
Examples of functional dependencies:
Loan-no → Branch-name
Loan-no → Amount
There is no functional dependency Loan-no → Customer-name, as a given loan can be made to more than one customer.
Uses:
→ To test relations to see whether they are legal under a given set of
functional dependencies
→ To specify constraints on the set of legal relations.
The six rules:
1. Armstrong's Axioms:
(i) Reflexivity rule: If α is a set of attributes and β ⊆ α, then α → β holds.
(ii) Augmentation rule: If α → β holds and γ is a set of attributes, then γα → γβ holds.
(iii) Transitivity rule: If α → β holds and β → γ holds, then α → γ holds.
2. Additional rules:
(i) Union rule: If α → β holds and α → γ holds, then α → βγ holds.
(ii) Decomposition rule: If α → βγ holds, then α → β holds and α → γ holds.
(iii) Pseudotransitivity rule: If α → β holds and γβ → δ holds, then αγ → δ holds.
Eg: R = (A, B, C, G, H, I)
F = {A → B, A → C, CG → H, CG → I, B → H}
Some members of F+ are:
* A → H
Proof: A → B and B → H, so by the transitivity rule, A → H.
* CG → HI
Proof: CG → H and CG → I, so by the union rule, CG → HI.
* AG → I
Proof: A → C and CG → I, so by the pseudotransitivity rule, AG → I.
Alternatively, from A → C, by the augmentation rule, AG → CG; together with CG → I, by the transitivity rule, AG → I.
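The same memberships of F+ can be checked mechanically with an attribute-closure computation. The sketch below is illustrative only; the closure function and the string encoding of attribute sets are assumptions made for the example.

# Sketch: compute the closure of a set of attributes under F and use it to
# check members of F+, e.g. that A -> H and AG -> HI follow from F.

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(closure("A", F))               # {'A', 'B', 'C', 'H'} (order may vary)
print("H" in closure("A", F))        # True, so A -> H holds
print(set("HI") <= closure("AG", F)) # True, so AG -> HI holds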
Canonical cover:
A canonical cover Fc for F is a set of dependencies such that F logically implies all dependencies in Fc, and Fc logically implies all dependencies in F. Fc must satisfy:
• No functional dependency in Fc contains an extraneous attribute.
• Each left side of a functional dependency in Fc is unique.
Eg: consider the following set F of functional dependencies on schema (A, B, C):
A → BC
B → C
A → B
AB → C
The canonical cover Fc is computed as follows:
A → BC and A → B are combined into A → BC.
A is an extraneous attribute in AB → C, because B → C is already in F, so (F − {AB → C}) ∪ {B → C} logically implies AB → C.
13. (b) (i) A relation schema is in second normal form when it is in first normal form and no non-key attribute is partially dependent on the primary key; every non-key attribute must depend on the whole key (A, B).
For R(A, B, C, D, E) with primary key AB to be in 2NF but not in 3NF, every non-key attribute must fully depend on AB, while some non-key attribute depends transitively on the key through another non-key attribute. For example, the set of functional dependencies {AB → C, AB → D, AB → E, D → E} keeps R in 2NF (there are no partial dependencies) but violates 3NF, since E depends transitively on AB through the non-key attribute D.
∴ Relation R is in 2NF but not in 3NF.
[Tables: (A#, Title, Royalty): (A1, T1, 5000), (A2, T2, 7000); (A#, Aname, Aquali): (A1, John, Ph.D), (A2, Tom, M.Tech).]
[Figure: transaction state diagram with states active, partially committed, committed, failed and aborted.]
T1: read(A); A := A − 50; write(A); read(B); B := B + 50; write(B).
T2: read(A); temp := A * 0.1; A := A − temp; write(A); read(B); B := B + temp; write(B).
Serializable schedule:
A serializable schedule is a schedule that has the same effect on the database as some serial schedule.
Eg:
[Schedule over T1, T2 and T3 showing only the operations on Q: T1 reads and writes Q, and T2 and T3 each write Q.]
14. (b) (i) Atomicity: either all operations of the transaction are reflected properly in the database, or none are.
Eg: before execution of the transaction, the values of A and B are 1000 and 2000. If a failure occurs in the middle (i.e. after write(A) but before write(B)), then A = 950 and B = 2000, and the database is inconsistent; ensuring atomicity avoids this.
Advantage:
* Ensures conflict serializability.
Disadvantage:
(i) Two-phase locking does not ensure freedom from deadlock, e.g.:
T3: lock-X(B); read(B); B := B − 50; write(B).
T4: lock-S(A); read(A); lock-S(B)  (T4 waits for T3).
T3: lock-X(A)  (T3 waits for T4); deadlock results.
15. (b)
• Query processing refers to the range of activities involved in
extracting data from a database.
• It includes translation of queries in high-level database languages into expressions that can be used at the physical level of the file system.
[Figure: steps in query processing. The parser and translator turn the query into a relational algebra expression; the optimizer, using statistics about the data, produces a query evaluation plan, e.g. π balance(σ balance < 2500(account)).]
• To implement the preceding selection, we can search every tuple in account to find tuples with balance less than 2500.
• If a B+ tree index is available on the attribute balance, we can use the index instead to locate the tuples.
• A relational algebra operation annotated with instructions on how to evaluate it is called an evaluation primitive.
• A sequence of primitive operations that can be used to evaluate a query is a query evaluation plan.
• Different evaluation plans for a query can have different costs.
4. What is the difference between tuple relational calculus and domain rela-
tional calculus?
PART B – (5 × 16 = 80 marks)
11. (a) (i) With a neat diagram, explain the structure of a DBMS. (9)
(b) (i) What are the relational algebra operations supported in SQL?
Write the SQL statement for each operation. (12)
13. (a) (i) Explain 1NF, 2NF, 3NF, and BCNF with suitable example. (8)
(b) What are the pitfalls in relational database design? With a suitable
example, explain the role of functional dependency in the process of
normalization. (16)
14. (a) (i) Explain about immediate update and deferred update recovery
techniques. (8)
15. (a) (i) List the different levels in RAID technology and explain its fea-
tures. (12)
(b) (i) Explain the various indexing schemes used in database environ-
ment. (12)
4.
6. It is not in third normal form, because it contains a transitive dependency.
9. Advantage:
Records are stored in sequential order, according to the value of a “search
key” of each record.
Disadvantage:
Over flow occurs in indexed sequential file organization.
10. Database tuning describes a group of activities used to optimize and ho-
mogenize the performance of a database.
• Goal is to maximize use of system resources to perform work.
PART B
11. (a) (i)
[Figure: database system structure. Users (naive users, application programmers, sophisticated users/analysts, database administrator), the query processor (DML compiler and organizer, application program object code, query evaluation engine) and the storage manager (buffer manager, file manager, authorization and integrity manager, transaction manager).]
A database system is partitioned into modules that deal with each of the responsibilities of the overall system.
Storage Manager:
• A Storage Manager is a program module that provides the interface
between the low level data stored in the database and the application
programs and queries submitted to the system.
• The Storage Manager is responsible for the interaction with the file
manager.
DDL Interpreter:
Which interprets DDL statements and records the definitions in the data
dictionary
DML compiler:
Which translates DML statements in a query language into an evalua-
tion plan consisting of low level instructions that query evaluation engine
understands.
Query Evaluation Engine:
Which executes low level instructions generated by the DML compiler.
[E-R diagram fragment: Customer (cus-name, cus-street, cus-city), Loan (loan-no, amount), Account (acc-no, balance), Branch.]
11. (b) (i) Technically, both of them support the basic features necessary for
data access. For example both of them ensure
• Data is managed to ensure its integrity and quality
• Allow shared access by a community of users
• Use of well defined schema for data-access
• Support a query language
But, file-systems seriously lack some of the critical features neces-
sary for managing data. Lets take a look at some of these feature.
Transaction support
Atomic transactions guarantee complete failure or success of an
operation. This is especially needed when there is concurrent ac-
cess to same data-set. This is one of the basic features provided
by all databases.
But most file-systems don't have this feature. Only the lesser known file-systems, such as Transactional NTFS (TxF), Sun ZFS and Veritas VxFS, support it. Most of the popular open-source file-systems (including ext3, xfs, reiserfs) are not even POSIX compliant.
Fast Indexing
Databases allow indexing based on any attribute or data-property
(i.e. SQL columns). This helps fast retrieval of data, based on
the indexed attribute. This functionality is not offered by most
file-systems i.e. you can’t quickly access “all files created after
2PM today”.
The desktop search tools like Google or MAC spotlight of-
fer this functionality. But for this, they have to scan and index
the complete file-system and store the information in a internal
relational-database.
Snapshots
Snapshot is a point-in-time copy/view of the data. Snapshots are
needed for backup applications, which need consistent point-in-
time copies of data.
The transactional and journaling capabilities enable most databases to offer snapshots without stopping access to the data. Most file-systems, however, don't provide this feature (ZFS and VxFS being the only exceptions). Backup software has to depend on either the running application or the underlying storage for snapshots.
Clustering
Advanced databases like Oracle (and now MySQL) also offer
clustering capabilities.The “g” in “Oracle 11g” actually stands
for “grid” or clustering capability. MySQL offers shared-nothing
clusters using synchronous replication. This helps the databases
scale up and support larger & more-fault tolerant production en-
vironments.
File systems still don’t support this option . The only excep-
tions are Veritas CFS and GFS (Open Source).
Replication
Replication is a commodity feature with databases and forms the basis for disaster-recovery plans. File-systems still have to evolve to handle it.
Relational View of Data
File systems store files and other objects only as a stream of bytes,
and have little or no information about the data stored in the files.
Such file systems also provide only a single way of organizing the
files, namely via directories and file names. The associated attri-
butes are also limited in number, e.g. type, size, author, creation time. This does not help in managing related data, as disparate items do not have any relationships defined.
Databases, on the other hand, offer easy means to relate stored data. They also offer a flexible query language (SQL) to retrieve the
11. (b) (ii) A major purpose of a database system is to provide users with
an abstract view of data.
Physical Level:
The lowest level of abstraction describes how the data are
actually stored.
The physical level describes complex low level data
structures in detail.
Logical Level:
Next higher level of abstraction describes what data are
stored in the database.
View Level:
The highest level of abstraction describes only part of the entire database. Even though the logical level uses simpler structures, complexity remains because of the variety of information stored in a large database.
[Figure: the three levels of data abstraction. The view level (view 1, view 2, …, view n) sits above the logical level, which sits above the physical level.]
11. (b) (iii) Schema definition: DBA creates the original database schema
by executing a set of data definition statements in DDL.
Storage structure and access-method definition.
Schema and physical-organization modification:
The DBA carries out changes to the schema and physical organization to reflect the changing needs of the organization.
Granting of authorization for data access:
By granting different types of authorization, the database admin
can regulate which parts of the database various users can access.
Routine Maintenance:
→ Periodically backing up the database.
→ Ensuring that enough free disk space is available.
→ Monitoring jobs running on the database.
Depositor Borrower
Cust-name city Cust-name City
Hayes Pune Adamas Mumbai
Johnson Mumbai Carry Pune
Jones Solapur Hayes Pune
Lindsay Nashik Jackson Solapur
Smith Pune Jones Solapur
Turner Mumbai Smith Pune
Willians Kolhapur
Union
• Operation denoted by U.
• It includes all the tuples that are in either depositor or borrower (or both).
Result
Depositor ∪ Borrower
Cust-name City
Hayes Pune
.
.
.
.
Willians Kolhapur
Intersection:
→ The result of ∩ is a relation that includes all tuples that are
in both depositor & borrower.
Depositor ∩ Borrower
Cust-name City
Hayes Pune
Jones Solapur
Smith Pune
Difference:
→ denoted by depositor-borrower.
→ the result contains all tuples that are in depositor but not in borrower.
Depositor-Borrower
Cust-name City
Johnson Mumbai
Lindsay Nashik
Turner Mumbai
Cartesian product:
→ also known as CROSS PRODUCT or CROSS JOIN, denoted by '×'.
→ The result relation has one tuple for each combination of tuples from the participating relations.
Pub-Info:
Pub-code   Name
P001       McGraw
P002       PHI
P003       Pearson

Book-Info:
Book-ID    Title
B001       DBMS
B002       Compiler

Pub-Info × Book-Info:
Pub-code   Name      Book-ID   Title
P001       McGraw    B001      DBMS
P002       PHI       B001      DBMS
P003       Pearson   B001      DBMS
P001       McGraw    B002      Compiler
P002       PHI       B002      Compiler
P003       Pearson   B002      Compiler
Select operation:
→ The select operation selects tuples that satisfy a given predicate.
→ Represented as σ<selection condition>(R)
Rename operation:
→ We can rename either the relation or the attributes, or both.
ρ S(new attribute names)(R)
Eg: ρ Temp(Bname, Aname, Pyear, Bprice)(BOOK)
→ The attributes are renamed.
12 (b) (ii) "Integrity constraints guard against accidental damage to the database, by ensuring that authorized changes to the database do not result in a loss of data consistency." This is also called data integrity.
Constraints types:
Domain constraint:
→ the most elementary form of constraint.
→ tests queries to ensure that the comparisons make sense.
→ new domains can be created.
Null value constraint:
If a row lacks a data value for a particular column, that value is said to be null.
Primary key constraint:
A primary key is one or more columns in a table used to uniquely identify each row in the table. Primary key values must not be null.
[Tables: sample rows (Research, 5, 100, B), (Research, 5, 100, C), (Administration, 4, 200, D), (Head, 1, 400, E); and the 2NF/3NF decomposition of EMP-PROJ into EMP-PROJ 1 (eid, ename) and EMP-PROJ 2 (eid, Pnumber, Hours), with (Pnumber, Pname, PLOCATION), and of EMP-DEPT into (ename, eid, DOB, address, Dnumber) and (Dnumber, Dname, DMGRid).]
Relational Decomposition:
A single relation schema R = {A1, A2, …, An} includes all the attributes of the database. A decomposition of R is a set of relation schemas {R1, R2, …, Rm} such that
R1 ∪ R2 ∪ … ∪ Rm = R
13. (a) (ii) As shown in the figure, all the attributes depend on attributes A and B; hence the key of relation R is (A, B).
[Dependency diagram for R over attributes A, B, C, D, E, F, G, H, I, J.]
3NF:
3NF removes transitive dependencies. The dependency diagrams for R1 and R2 are shown above.
Thus R1 is divided into the following relations:
(i) R3 (A, D, I)
(ii) R4 (A, E)
(iii) R5 (A, D, J).
R2 is divided into following relations to avoid transitive depen-
dency.
(i) R6 (B, F, G)
(ii) R7 (B, F, H)
13. (b) Pitfalls in Relational Database Design
Creating an effective design for a relational database is a key element in
building a reliable system. There is no one “correct” relational database
design for any particular project, and developers must make choices
to create a design that will work efficiently. There are a few common design pitfalls that can harm a database system. Watching out for these errors at the design stage can help to avoid problems later on.
Careless Naming Practices
• Choosing names is an aspect of database design that is often ne-
glected but can have a considerable impact on usability and future
development. To avoid this, both table and column names should
be chosen to be meaningful and to conform to the established
conventions, ensuring that consistency is maintained throughout
a system. A number of conventions can be used in relational da-
tabase names, including the following two examples for a record
storing a client name: “client _ name” and “clientName”.
Lack of Documentation
• Creating documentation for a relational database can be a vital
step in safeguarding future development. There are different lev-
els of documentation that can be created for databases, and some
database management systems are able to generate the documen-
tation automatically. For projects where formal documentation is
not considered necessary, simply including comments within the
SQL code can be helpful.
Failure to Normalize
• Normalization is a technique for analyzing, and improving on,
an initial database design. A variety of techniques are involved,
including identifying features of a database design that may
FUNCTIONAL DEPENDENCIES
A functional dependency occurs between two attributes in a data-
base, A and B, if there exists a relationship such that for each value
of A there is only one corresponding value of B (A → B). This can be
extended to a functional dependency where A may be a set of tuples
(x, y, z) that correspond to a single value B([x, y, z] → B). In simple
mathematical terms the functional dependency must pass the vertical
line test for proper functions.
Normalization of a relation database means that the relation
(tables) in the database conform to a set of rules for a certain form
(First through Sixth Normal Form [1NF-6NF] and/or Boyce-Codd Normal Form [BCNF]). The higher the normal form of a table, the less
vulnerable it is to data inconsistency and data anomalies formed
during updates, inserts, and deletes. Normalization often reduces
data redundancy in a database which reduces data inconsistency
and anomaly risks. Normalizing a database requires analysis of the
closure of the set of functional dependencies to ensure that the set
complies with the rules for the given normal form. If the table does
not comply with the rules then the table is split following specific
procedures to achieve the desired normal form. Every table in a da-
tabase has a normal form and to make a statement that a database is
in a certain normal form (ex. 3NF) means that every table complies
with the rules for 3NF.
14. (a) (ii) Serializability is the generally accepted "criterion for correctness" for the interleaved execution of a set of transactions; that is, such an execution is considered to be correct if and only if it is serializable. A given execution of a given set of transactions is serializable, and therefore correct, if and only if it is equivalent to some serial execution of the same transactions.
Serial execution:
It is one in which the transactions are run one at a time in some
sequence.
Guaranteed:
It means that the given execution and the serial one always produce the same result as each other, no matter what the initial state of the database might be.
Individual transactions are assumed to be correct; that is, they are assumed to transform a correct state of the database into another correct state. Running the transactions one at a time in any serial order is therefore also correct. Let I be an interleaved schedule involving some set of transactions t1, t2, …, tn. If I is serializable, then there exists at least one serial schedule S involving t1, t2, …, tn such that I is equivalent to S; S is said to be a serialization of I.
14. (b) (i) This protocol requires that each transaction issue lock and unlock
requests in 2 phase.
Growing phase:
In this phase, a transaction may obtain locks, but may not re-
lease any lock.
Shrinking phase:
• A transaction may release locks, but may not obtain any
new locks.
• Initially a transaction is in the growing phase.
• Once the transaction releases a lock, it enters the shrinking phase, and it cannot issue any more lock requests.
Transactions T1 and T2 are not two-phase, while T3 is two-phase.
T3: lock – X (B);
read (B);
B: = B − 50;
write (B);
lock − X (A);
read (A);
A: = A + 50;
write (A);
unlock (B);
unlock (A);
Advantage:
* Ensures conflict serializability.
Disadvantage:
(i) Two-phase locking does not ensure freedom from deadlock, e.g.:
T3: lock-X(B); read(B); B := B − 50; write(B).
T4: lock-S(A); read(A); lock-S(B)  (T4 waits for T3).
T3: lock-X(A)  (T3 waits for T4); deadlock results.
(ii) Cascading rollbacks may occur under two-phase locking. To avoid this:
(1) The strict two-phase locking protocol: all exclusive-mode locks taken by a transaction are held until the transaction finishes.
(2) The rigorous two-phase locking protocol: all locks are held until the transaction commits.
14. (b) (ii) One approach ensures that no cyclic waits can occur, by ordering the requests for locks or by requiring all locks to be acquired together.
Disadvantage:
(a) It is hard to predict before the transaction begins, what data
items need to be locked.
(b) Data item utilization may be very low, since many of the
data items may be locked but unused for a long time.
• Second approach for dead lock prevention is to use preemp-
tion and transaction rollbacks.
• The system uses timestamps to decide whether a transaction should wait or roll back.
Wait-die:
• Non-preemptive technique.
• When Ti requests a data item held by Tj, Ti is allowed to wait only if it has a timestamp smaller than that of Tj (i.e., Ti is older); otherwise Ti is rolled back (dies).
Wound-wait:
• Preemptive technique.
• When Ti requests a data item held by Tj, Ti is allowed to wait only if it has a timestamp greater than that of Tj (i.e., Ti is younger); otherwise Tj is rolled back (wounded).
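The two rules can be summarised in a small sketch; the function names and the "smaller timestamp means older transaction" encoding are assumptions made for illustration.

# Sketch of the two timestamp-based deadlock-prevention rules described above.

def wait_die(ts_requester, ts_holder):
    """Non-preemptive: the requester may wait only if it is older."""
    return "wait" if ts_requester < ts_holder else "rollback requester"

def wound_wait(ts_requester, ts_holder):
    """Preemptive: an older requester wounds (rolls back) the holder;
    a younger requester is allowed to wait."""
    return "rollback holder" if ts_requester < ts_holder else "wait"

print(wait_die(ts_requester=5, ts_holder=9))    # wait (requester is older)
print(wound_wait(ts_requester=9, ts_holder=5))  # wait (requester is younger)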
15. (a) (i) Having a large number of disks in a system presents opportunities for improving the rate at which data can be read or written, if the disks are operated in parallel; several independent reads or writes can also be performed in parallel. A variety of disk organization techniques are collectively called Redundant Arrays of Independent Disks (RAID).
RAID levels:
[Figure: RAID levels 0 to 6, showing data blocks (C) and parity blocks (P) across the disks in each organization.]
RAID 0:
Refers to disk arrays with striping at the level of blocks, but
without any redundancy.
RAID 1:
Refers to disk mirroring with block striping it shows mirrored
organization.
RAID level 2:
It is known as memory-style error-correcting-code organization, which employs parity bits. Memory systems have long used parity bits for error detection and correction. Each byte in a memory system may have a parity bit associated with it that records whether the number of bits in the byte set to 1 is even or odd.
RAID level 3:
Bit-interleaved parity organization improves on level 2 by exploiting the fact that disk controllers, unlike memory systems, can detect whether a sector has been read correctly, so a single parity bit can be used for error correction as well as for detection.
RAID level 4:
Block-interleaved parity organization uses block-level striping like RAID 0 and, in addition, keeps a parity block on a separate disk for the corresponding blocks from the N other disks.
RAID level 5:
Block-interleaved distributed parity improves on level 4 by partitioning data and parity among all N + 1 disks, instead of storing data on N disks and parity on one disk.
RAID level 6:
The P + Q redundancy scheme is much like level 5, but stores extra redundant information to guard against multiple disk failures.
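The parity idea behind levels 4 and 5 can be illustrated with a short sketch; the block contents and the xor_blocks helper are made up for the example.

# Sketch of block-interleaved parity: the parity block is the XOR of the data
# blocks, so any single lost block can be rebuilt from the survivors.

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = [b"disk", b"arra", b"ys!!"]          # three equal-sized data blocks
parity = xor_blocks(data)

lost = data[1]                              # pretend disk 1 fails
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == lost)                      # True: the lost block is recovered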
15. (a) (ii) Variable length records arise in database systems in several
ways.
→ Storage of multiple record type in a file.
→ Record types that allow variable lengths for one or more
fields
→ Record types that allow repeating fields
Eg: (1) Byte-string representation.
(2) Fixed-length (reserved-space) representation.
(3) Pointer (list) representation, which uses an anchor block and an overflow block; the overflow block contains records other than those that are the first records of a chain.
Dense Index:
15. (b) (ii) (1) Nested-loop join with r1 as the outer relation:
⇒ Number of pairs of tuples examined
= nr ∗ ns
= 20000 × 45000
= 9 × 10^8 pairs of tuples
⇒ Worst case
= nr ∗ bs + br
= 20000 ∗ 30 + 25
= 600025 block accesses
(2) Block nested-loop join with r1 as the outer relation:
⇒ br ∗ bs + br
= 25 ∗ 30 + 25
= 775 block accesses are required
(3) Merge join, if r1 and r2 are initially sorted:
= br + bs
= 25 + 30
= 55
So, 55 block accesses are required.
(4) Hash join: h(r) ≠ h(s), so a greater number of block accesses is required here.
r1 → outer relation
r2 → inner relation
br → blocks in the outer relation
bs → blocks in the inner relation
nr → tuples in the outer relation
ns → tuples in the inner relation
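The same estimates can be reproduced directly from the figures used above; the variable names simply mirror the glossary, and this is only a restatement of the arithmetic, not a general cost model.

# Sketch of the join cost estimates above: r1 has nr = 20000 tuples in
# br = 25 blocks, r2 has ns = 45000 tuples in bs = 30 blocks.

nr, br = 20000, 25
ns, bs = 45000, 30

pairs_examined   = nr * ns            # tuple pairs compared by nested-loop join
nested_loop_cost = nr * bs + br       # worst-case block accesses, r1 as outer
block_nested     = br * bs + br       # block nested-loop join, r1 as outer
merge_join_cost  = br + bs            # both relations already sorted

print(pairs_examined, nested_loop_cost, block_nested, merge_join_cost)
# 900000000 600025 775 55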
NOV/DEC 2010
Fourth Semester
Computer Science and Engineering
CS 2255- DATABASE MANAGEMENT SYSTEMS
(Common to Information Technology)
Regulation 2008
Time: Three Hours Maximum: 100 MARKS
Answer ALL questions
PART A – (10 × 2 = 20 Marks)
9. Which are the factors to be considered for the evaluation of indexing and
hashing techniques?
PART B – (5 × 16 = 80 Marks)
11. (a) Explain the three different groups of data models with examples.
(16)
Or
12. (a) Describe the features of Embedded SQL and Dynamic SQL. Give
suitable examples. (16)
Or
13. (a) Explain non loss decomposition and functional dependencies with
suitable example. (16)
Or
(b) Discuss join Dependencies and Fifth Normal Form, and explain why
5NF? (16)
14. (a) (i) State the Two-Phase Commit protocol. Discuss the implications
of a failure of the coordinator and some participants. (10)
(b) (i) State and explain the three concurrency problems. (9)
(ii) What is meant by isolation level and define the five different
isolation levels. (7)
15. (a) (i) Discuss the improvement of reliability and performance of RAID
(8)
(ii) Explain the structure of a B+- tree. (8)
Or
3. The various operators used in relational algebra are unary and binary.
Unary operators are select, project and rename.
Binary operators are set union, set intersection, set difference and Cartesian product.
[Fig: E-R diagram showing ITEM and CUSTOMER linked by a SHIPMENT relationship with attribute Qty.]
Advantages:
(i) conceptual simplicity
(ii) Database security
(iii) Data independence
(iv) Database integrity
(v) Efficiency.
Disadvantages:
(i) Complex implementation
(ii) Difficult to manage
(iii) Lack of structural independence
(iv) Implementation limitations
(v) Lack of standards.
Network Data Model:
Network data model uses two different data structures to represent
the database entities and relationships between entities, namely
record type and set type. A record type is used to represent an
entity type. It is made up of a number of data items that represent
the attributes of the entity. A set type is used to represent a direct
relationship between two record types.
[Figure: network model example records - C1 Alicia A UK, C2 Malcom A USA, C3 Francis C UK.]
Advantages:
(i) Conceptual simplicity
(ii) Handles more relationship types
(iii) Data access flexibility
(iv) Promotes database integrity
Advantages:
1. Structural independence
2. Improved conceptual simplicity
3. Easier database design, management and use
4. Adhoc query capability
5. A powerful database management system
Disadvantages:
1. Substantial hardware and system software overhead.
2. Poor design and implementation.
3. May promote “islands of information” problem.
11. (b) The Entity-Relationships (E-R) data model, which is popular for high
level database design, provides a means for representing relationships
between entities.
Features of ER-Model:
• This is used to give structure to the data.
• Model can be evolved independent of any DBMS.
• It is an aid for database design.
• It is easy to visualize and understand.
[Fig: ITEM and CUSTOMER related through a Shipment relationship with attribute Qty.]
Basic Concepts:
Entity set:
An entity is a thing or object in the real world i.e., distinguishable
from all other objects.
Eg: a person is an entity. An entity can be concrete, such as a person or a book, or it may be abstract, such as a loan or an account.
An entity set is a set of entities of the same type that share the same properties or attributes.
Eg: set of all persons who are customers in a bank.
Attributes:
An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. For each attribute there is a set of permitted values called the domain of the attribute.
Eg: The domain of the attribute customer-name might be the set of all text strings of a certain length.
Types of attributes:
Simple and Composite attribute:
A simple attribute cannot be divided into subparts.
Eg: customer-age = 30
A composite attribute can be divided into subparts.
Eg: customer-address = street + city + state
Single value and Multi value Attribute:
Attribute which has only one value is called as single value
attribute.
Eg: cust-no = 321467
Attribute which can have more than one value is said to be
multi valued attribute.
Eg: cust-phone no = 24765214
Derived attribute:
The value for the this type of attribute can be derived form the
values of other related attribute or entities.
Eg: The value for loan-held attribute can be derived from the
entity loan.
An attribute can be take a null value when entity does not have
a value for it. Null may indicate that the value is not applicable
or value is unknown.
Relationship set:
A relationship is an association among several entities. A relation-
ship set is a set of relationships of the same type. If E1, E2, …, En are
entity sets, then a relationship set is a subset of
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}.
Eg: Consider two entity sets customer and loan; we can define a
relationship set borrower to denote the association between the
customers and bank loans.
A relationship may also have descriptive attributes that describe the
relationship among the entities.
Eg: We can use access-date as a descriptive attribute of the borrower
relationship set.
Types of relationships:
Unary relationship:
[Fig: Employee related to itself through a Boss relationship, with roles Manager and Worker]
Binary relationship:
[Fig: a relationship between two entity sets]
Ternary relationship:
[Fig: a relationship among Teacher, Subject and Student]
Quaternary relationship:
[Fig: a Studies relationship among Teacher, Course, Student and study material]
Mapping cardinalities:
A mapping cardinality expresses the number of entities to which another
entity can be associated via a relationship set.
One to one:
An entity in A is associated with at most one entity in B, and an
entity in B is associated with at most one entity in A.
Eg: a1–b1, a2–b2, a3–b3
One to many:
An entity in A is associated with any number of entities in B; an
entity in B, however, can be associated with at most one entity in A.
Eg: a1–{b1, b2}, a2–{b3, b4}, a3–{b5}
Many to one:
An entity in A is associated with at most one entity in B; an
entity in B can be associated with any number of entities in A.
Eg: {a1, a2}–b1, {a3, a4}–b2, {a5}–b3
Many to many:
An entity in A is associated with any number of entities in B, and
any entity in B is associated with any number of entities in A.
Eg: each of a1, a2, a3, a4 is associated with one or more of b1, b2, b3, b4.
How these cardinalities are realized as SQL tables is sketched below.
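A minimal sketch of how these cardinalities are commonly realized as SQL tables; the customer, loan, account and depositor names are illustrative assumptions, not part of the question.

CREATE TABLE customer (cust_no INT PRIMARY KEY, cust_name VARCHAR(30));

-- one to many: a customer may hold many loans, each loan belongs to at most one customer
CREATE TABLE loan (
  loan_no INT PRIMARY KEY,
  cust_no INT REFERENCES customer(cust_no)
);

-- many to many: realized through a separate relationship table keyed on both sides
CREATE TABLE account (acc_no INT PRIMARY KEY);
CREATE TABLE depositor (
  cust_no INT REFERENCES customer(cust_no),
  acc_no  INT REFERENCES account(acc_no),
  PRIMARY KEY (cust_no, acc_no)
);

-- one to one: a UNIQUE foreign key allows each customer at most one profile row
CREATE TABLE customer_profile (
  cust_no INT UNIQUE REFERENCES customer(cust_no),
  pan_no  VARCHAR(10)
);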
12. (a) EMBEDDED SQL: Any SQL statement that can be used interac-
tively can also be embedded in an application program called embed-
ded SQL.
It is necessary to know a number of preliminary details.
1. Embedded SQL statements are prefixed by EXEC SQL, to
distinguish them from host language statements.
2. An executable SQL statement can appear whenever an executable
host statement can appear.
3. SQL statements can include references to host variables such
references must include a colon prefix to distinguish them from
SQL column names.
4. The purpose of INTO clause is to specify the target variables into
which values are to be retrieved.
5. All host variables referenced in SQL statements must be declared
within an embedded SQL declare section, which is delimited by
the BEGIN DECLARE SECTION and END DECLARE SECTION statements.
6. Every program containing embedded SQL statements must in-
clude a host variable called SQLSTATE.
7. Every host variable must have a data type appropriate to the uses
to which it is put.
8. Host variables and SQL column can have the same name.
9. Every SQL statement should in principle be followed by a test of the
returned SQLSTATE value (a minimal sketch follows).
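A minimal embedded SQL sketch illustrating the points above, assuming C as the host language and a hypothetical Emp(emp_no, emp_name) table; the error-handling branch is only a placeholder, and exact precompiler syntax varies by product.

EXEC SQL BEGIN DECLARE SECTION;        /* declare section (point 5)            */
  long eno;                            /* input employee number                */
  char ename[21];                      /* target for the retrieved name        */
  char SQLSTATE[6];                    /* status variable (point 6)            */
EXEC SQL END DECLARE SECTION;

EXEC SQL SELECT emp_name
         INTO  :ename                  /* INTO names the target host variable (point 4) */
         FROM  Emp
         WHERE emp_no = :eno;          /* colon prefix marks a host variable (point 3)  */

if (SQLSTATE[0] != '0' || SQLSTATE[1] != '0')   /* test SQLSTATE after the statement (point 9) */
    /* handle the error or no-data condition here */ ;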
Eg:
AUTHORITY SA
GRANT SELECT {eno, stock, qty}
ON supplier
TO {Ajith, Arun};
Revoke:
Syntax:
AUTHORITY <name>
REVOKE <privilege list>
ON <relvar name>
TO <user list>;
Eg:
AUTHORITY SA
REVOKE SELECT {sno, stock, qty}
ON supplier
TO {Ajith, Arun};
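For comparison, a minimal sketch of the same idea in standard SQL, assuming a table supplier(sno, stock, qty) and user names ajith and arun; column-level SELECT grants are not supported by every product.

GRANT SELECT (sno, stock, qty) ON supplier TO ajith, arun;

REVOKE SELECT ON supplier FROM ajith, arun;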
Mandatory access control:
Objects and users are assigned security classes such as top secret,
secret and confidential. Each object has a classification level (top
secret, secret or confidential) and each user is assigned a clearance
level in the same way. User i can access object j only if the clearance
level of i is greater than or equal to the classification level of j.
Data Encryption:
Encryption is the process of changing plaintext into ciphertext using an
encryption algorithm.
The algorithm takes the plaintext and a key as inputs.
There are two types of encryption: public key and private key.
Missing information:
If we do not know the value of a particular column, that value is
NULL. A value that exists but is not known is an unknown value.
The concept of NULL covers any value which is not known or which
does not exist. NULL leads us to a logic in which there are three truth
values: true, false and unknown.
Note that null values are not considered to be either equal or unequal
to one another; a relational expression involving a null in turn evaluates
to null (unknown).
Information is often missing in the real world.
Eg: date of birth unknown.
AND | t  u  f
----+---------
 t  | t  u  f
 u  | u  u  f
 f  | f  f  f

OR  | t  u  f
----+---------
 t  | t  t  t
 u  | t  u  u
 f  | t  u  f

NOT
 t → f
 u → u
 f → t
Eg: A = 3, B = 4, C = UNK
A > B AND A > C → false
A > B OR B > C → UNK
A < B OR B < C → true.
NOT (A = C) →UNK.
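A minimal SQL sketch of the same three-valued logic, assuming a hypothetical table person(name, date_of_birth):

-- rows whose date_of_birth is NULL are not returned:
-- the comparison evaluates to unknown, not true
SELECT name FROM person WHERE date_of_birth = date_of_birth;

-- the correct way to test for a missing value
SELECT name FROM person WHERE date_of_birth IS NULL;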
(b) [Fig: SUPPLIER — SUPP NO, NAME, STATUS, CITY; shipment attributes SUPP NO, PART NO, QTY]
(c) [Fig: PART — PART NO, NAME, COLOR, WEIGHT, CITY]
Original shipments (SPJ):
  S1  P2  J1
  S2  P1  J1
  S1  P1  J1

SHIPMENT (SP)            PJ                      JS
SUPPLIER  PART           PART     PROJECT        PROJECT  SUPPLIER
NUMBER    NUMBER         NUMBER   NUMBER         NUMBER   NUMBER
S1        P1             P1       J2             J2       S1
S1        P2             P2       J1             J1       S1
S2        P1             P1       J1             J1       S2

Join of the projections:
  S1  P2  J1   original shipments
  S2  P1  J1
  S2  P1  J2   spurious
  S1  P1  J1
Join dependency:
Let R be a relvar and let A, B, …, Z be subsets of the attributes of R.
Then we say that R satisfies the JD *{A, B, …, Z} if and only if every legal
value of R is equal to the join of its projections on A, B, …, Z.
Fifth Normal Form:
A relvar R is in 5NF, also called projection-join normal form, if and only if
every non-trivial join dependency that is satisfied by R is implied by the
candidate key(s) of R, where:
(a) The join dependency *{A, B, …, Z} on R is trivial if and only if at least
one of A, B, …, Z is the set of all attributes of R.
(b) The join dependency *{A, B, …, Z} on R is implied by the candidate
key(s) of R if and only if each of A, B, …, Z is a super key for R.
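A minimal DDL sketch of the projection-join decomposition illustrated above, with assumed column names sno, pno and jno:

CREATE TABLE sp (sno CHAR(3), pno CHAR(3), PRIMARY KEY (sno, pno));
CREATE TABLE pj (pno CHAR(3), jno CHAR(3), PRIMARY KEY (pno, jno));
CREATE TABLE js (jno CHAR(3), sno CHAR(3), PRIMARY KEY (jno, sno));
-- The original shipments can be recovered only by joining all three
-- projections; joining just two of them yields the spurious tuple shown above.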
14. (a) (i) In this section we briefly discuss a very important elaboration
on the basic commit/Rollback concept called two phase commit.
Two phase commit is important whenever a given transaction
can interact with several independent resource managers each
managing its own set of recoverable resources and maintaining
its own recovery log.
For example, consider a transaction running on an IBM
mainframe that updates both an IMS database and a DB2 database.
If the transaction completes successfully, then all of its updates, to
both IMS and DB2 data, must be committed; conversely, if it fails,
then all of its updates must be rolled back.
It follows that it does not make sense for the transaction to
issue, say, a COMMIT to IMS and a ROLLBACK to DB2; and
even if it issued the same instruction to both, the system could
still crash between the two, with unfortunate results.
When the transaction has completed its processing, it issues COM-
MIT. On receiving that COMMIT request, the coordinator
goes through the following two phases:
• Prepare: First it instructs all resource managers to get ready to
go either way on the transaction. In practice this means that
each participant in the process that is, each resource manager
involved – must force all log records for local resources used
by the transaction out to its own physical log. Assuming the
forced write is successful, the resource manager now replies
ok to coordinator, otherwise it replies not ok.
• Commit: When the coordinator has received replies from all
participants, it forces a record to its own physical log, record-
ing its decision regarding the transaction. If all replies were ok,
that decision is commit; if any reply was not ok, the decision is
rollback. Either way, the coordinator then informs every participant
of its decision, and each participant commits or rolls back the
transaction locally.
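The same prepare/commit split is visible in systems that expose two-phase commit to the application. PostgreSQL, for example, provides PREPARE TRANSACTION and COMMIT PREPARED; the account table and the transaction name below are assumptions for illustration only.

BEGIN;
UPDATE account SET balance = balance - 100 WHERE acc_no = 'A1';
PREPARE TRANSACTION 'txn_42';   -- phase 1: force the log, vote "ok", become ready

COMMIT PREPARED 'txn_42';       -- phase 2: the coordinator's decision was commit
-- ROLLBACK PREPARED 'txn_42';  -- taken instead if any participant voted "not ok"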
14. (b) (i) The term concurrency refers to the fact that a DBMS typically allows
many transactions to access the same database at the same time. If their
operations are interleaved without control, problems such as the following arise.

The lost update problem:
T1                           T2
read_item(X);
X := X − N;
                             read_item(X);
                             X := X + M;
write_item(X);
read_item(Y);
                             write_item(X);   ← item X has an incorrect value
                                                because its update by T1 is lost
Y := Y + N;
write_item(Y);

The incorrect summary problem:
T1                           T3
                             SUM := 0;
                             read_item(A);
                             SUM := SUM + A;
read_item(X);
X := X − N;
write_item(X);
                             read_item(X);    ← T3 reads X after N is subtracted
                             SUM := SUM + X;
                             read_item(Y);    ← and reads Y before N is added;
                                                a wrong summary is the result
read_item(Y);
Y := Y + N;
write_item(Y);
                             SUM := SUM + Y;
15. (a) (ii) • Data pointers are stored only at leaf nodes.
• Leaf nodes have an entry for every value of the search field.
• Leaf nodes are linked together to provide ordered access.
• Internal nodes contain tree pointers and act as a multilevel sparse index.
[Fig: internal node structure ⟨P1, K1, …, Pi, Ki, …, Kq−1, Pq⟩ — the first pointer leads
to search values X ≤ K1, pointer Pi to values Ki−1 < X ≤ Ki, and pointer Pq to values X > Kq−1]
[Fig: example B+ tree on branch-name values such as Mianus and Redwood]
15. (b) File scan: Search algorithms that locate and retrieve records that
fulfill a selection condition.
Two scan algorithms that implement the selection operation are A1
(linear search) and A2 (binary search).
Algorithm A1 (linear search)
• The system scans each file block and tests all records to see
whether they satisfy the selection condition.
• The cost of linear search, in terms of disk operations, is one
seek plus br block transfers, where br is the number of blocks in the file.
• For a selection on a key attribute, the system can terminate the
scan as soon as the required record is found, so the average cost
is about br/2 block transfers.
A2 (Binary search)
• Applicable if selection is an equality comparison on the at-
tribute on which file is ordered.
• Assume that the blocks of the relation are stored contiguously.
• Cost estimate:
→ ⌈log2(br)⌉ — the cost of locating the first matching tuple by a
binary search on the blocks,
→ plus the number of blocks containing records that satisfy the
selection condition (a small worked example follows).
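A small worked example, assuming a relation stored in br = 1000 contiguous blocks: linear search costs one seek plus 1000 block transfers in the worst case (about 500 transfers on average for an equality selection on a key attribute), whereas binary search needs only ⌈log2 1000⌉ = 10 block transfers to locate the first matching tuple, plus whatever further blocks hold additional matching records.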
Selection using indices:
• Index scan — search algorithms that use an index; the selection
condition must be on a search key of the index.
5. What is normalization?
PART B – (5 × 16 = 80 marks)
11. (a) (i) Discuss the various disadvantages in the file system and explain
how it can be overcome by the database system. (6)
(ii) What are the different Data models present? Explain in detail.
(10)
Or
(b) (i) Explain the Database system structure with a neat diagram. (10)
(ii) Construct an ER diagram for an employee payroll system. (6)
12. (a) (i) Explain the use of trigger with your own example. (8)
(ii) Discuss the terms Distributed databases and client/server data-
bases. (8)
Or
(b) (i) What is a view? How can it be created? Explain with an example.
(7)
(ii) Discuss in detail the operators SELECT, PROJECT, UNION with
suitable examples. (9)
13. (a) Explain 1NF, 2NF and 3NF with an example. (16)
Or
(b) Explain the Boyce–Codd normal form with an example. Also state
how it differs from that of 3NF. (16)
14. (a) (i) How can you implement atomicity in transactions? Explain. (8)
(ii) Describe the concept of serializability with suitable example.
(8)
Or
(b) How is concurrency performed? Explain the protocol that is used to
maintain the concurrency concept. (16)
PART A
3. Rename:
Rename operation can rename either the relation name or the attribute
name or both.
Syntax:
ρS(R)
Eg:
ρTemp(Branch)
4. An entity set may not have sufficient attributes to form a primary key.
• Such an entity set is known as a weak entity set.
5.
• Normalization is the process of making the database design to its ulti-
mate normal form.
• The non-key attributes must be mutually independent and irreducibly
dependent on the primary key.
6.
• A functional dependency is basically a many to one relationship from
one set of attributes to another within a given relvar or relation.
• Types of Functional Dependencies are:
Full Functional Dependency
Partial Functional Dependency
Transitive Functional Dependency
7.
• A transaction is a logical unit of work which alters or accesses the da-
tabase.
• It begins with the execution of begin transaction keyword.
8. Every transaction possesses the ACID properties.
PART – B
11. (a) (i) Disadvantages in the file system:
Data redundancy and inconsistency:
Since different programmers create the files and application
programs over a long period, the various files are likely to
have different structures and the programs may be written in
several programming languages. Moreover the same infor-
mation may be duplicated in several places.
Difficulty in accessing data:
The conventional file processing environments do not al-
low needed data to be retrieved in a convenient and efficient
manner. More responsive data retrieval systems are required
for general use.
Data isolation:
Because data are scattered in various files, and files may be
in different formats, writing new application programs to
retrieve the appropriate data is difficult.
Integrity problems:
The data values stored in the database must satisfy certain types
of consistency constraints. The problem is compounded when
constraints involve several data items from different files.
Atomicity problems:
A computer system like any other device is subject to fail-
ure. In many applications, it is crucial that if a failure oc-
curs, the data be restored to the consistent state that existed
prior to the failure. It is difficult to ensure atomicity in a
conventional file processing system.
Concurrent access anomalies:
For the sake of overall performance of the system and faster
response, many systems allow multiple users to update the
data simultaneously. But supervision is difficult to provide
because data may be accessed by many different applica-
tion programs that have not been coordinated previously.
Security problems:
Not every user of the database system should be able to ac-
cess all the data. Since application programs are added to
the file processing system in an ad hoc manner, enforcing
such security constraint is difficult.
11. (a) (ii) Data models:
Underlying the structure of database is the data model: a collec-
tion of conceptual tools for describing data, data relationship,
data semantics and consistency constraints. A data model pro-
vides a way to describe the design of a database at the physi-
cal, logical and view level. There are a number of different data
models that can be classified into four different categories.
Relational Model:
The relational model uses a collection of tables to represent
both data and the relationship among those data. Each table
has multiple columns and each column has a unique name.
Tables are known as relations. The relational model is an
example of record based model. Record based models are so
named because the database is structured in fixed format re-
cords of several types. Each record of particular type is con-
tained in a table. Each record type defines a fixed number of
fields, or attributes. The columns of the table correspond to
the attributes of the record type. The relational data model is
the most widely used data model and a vast majority of cur-
rent database systems are based on the relational model.
Entity Relationship Model:
The Entity Relationship (E-R) data model uses a collection
of basic objects called entities and relationships among these objects.
Transaction Manager:
It ensures that the database remains in a consistent state after con-
current transaction without conflicts.
File Manager:
It manages the allocation of space on disc and data structures for
representation of data.
[Fig: database system structure — naive users (tellers, agents, web users),
application programmers, sophisticated users (analysts) and the database
administrator submit work through application program object code (produced
by the compiler and linker), DML queries and DDL; the query processor contains
the DML compiler and organizer, the DML/DDL interpreter and the query
evaluation engine; the storage manager contains the buffer manager, file
manager, authorization and integrity manager and transaction manager, and
works on disc storage holding data files, indices, the data dictionary and
statistical data]
Disc Storage:
Data files:
The actual data stored in the database.
Data dictionary:
It contains meta data.
Indices:
It is used to provide fast access of data items that holds key values.
Query processor:
It is an important part of database system. The various subcompo-
nents of query processor are:
• DDL Interpreter
• DML compiler
• Query evaluation engine
Database users:
Naive users:
They are unsophisticated users who interact with the system
by invoking application programs that are written previously.
Application programmers:
They are computer professionals who write and develop ap-
plication programs using application tools.
Sophisticated users:
These users interact with the system without writing programs.
They form their queries and statements for request.
Specialized users:
These users are sophisticated users who write sophisticated da-
tabase application that is beyond traditional data processing.
Database Administrator (DBA):
• DBA has central control over both programs and data. The
functions of DBA are as follows:
The DBA creates schema definition.
DBA is responsible for storage structure and access method
definition.
DBA carries out the changes in schema and physical orga-
nization according to the changing needs.
The DBA periodically takes backups of the database.
The DBA ensures that enough disc space is available.
The DBA monitors the database and improves its performance.
11 (b) (ii)
[Fig: ER diagram for an employee payroll system — an Employee (emp. no,
emp. name, address, contact no) works in a Department (dept. no, dept. name,
manager, assistants), earns a Salary (basic salary, HRA, DA, lunch allowance,
travel allowance) and undergoes deductions (income tax, P.F.)]
Eg 1:
CREATE OR REPLACE TRIGGER warn_trig
BEFORE DELETE ON emp
BEGIN
  RAISE_APPLICATION_ERROR(-20101, 'you cannot delete');
END;
Eg 2:
CREATE OR REPLACE TRIGGER trig_update
AFTER INSERT ON reservation
FOR EACH ROW
BEGIN
  IF :new.class = 'A' THEN
    UPDATE flight SET seatsfc = seatsfc + 1
    WHERE flightno = :new.fno;
  ELSE
    UPDATE flight SET seatssc = seatssc + 1
    WHERE flightno = :new.fno;
  END IF;
END;
Communication
network
Transparent remote
accesses
DBMS
Server machine
12. (b) (i) Views provide a shorthand or macro capability. There is a strong
analogy here with macros in a programming language system. In
principle a user in a programming language system could write
out the expanded form of a given macro directly. Analogous re-
marks apply to views. Thus views in a database system play a role
somewhat analogous to that of macros in a programming language
system. Views allow the same data to be seen by different users in
different ways at the same time. Views provide automatic security
for hidden data. Views can provide logical data independence.
Views follow two principles:
• Principle of Interchangeability
• Principle of Database Relativity
Creation of view:
Eg create view city_pair (scity, pcity)
Eg of views:
CREATE VIEW v AS
  SELECT * FROM suppliers
  WHERE status > 25 OR city = 'Chennai';
suppliers (suppno, status, city, partno, qty) — sample data:
suppno   status   city
S1       20       Chennai
S2       10       Bombay
S3       30       Delhi
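A minimal sketch of how the view behaves with the sample data above; v contains only S1 and S3, since only they satisfy status > 25 or city = 'Chennai', and the user name clerk is an assumption for illustration.

-- the view is queried exactly like a base table (principle of interchangeability)
SELECT suppno, city FROM v;

-- granting access on the view but not on suppliers hides S2 from this user,
-- which is how views give automatic security for hidden data
GRANT SELECT ON v TO clerk;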
/* assume A → D holds */
then normalize as,
R1 {A, D} primary key {A}
R2 {A, B, C} primary key {A}
FK {A} references R
Third Normal Form:
A relvar is in 3NF if and only if it is in 2NF and every non-key attri-
bute is non-transitively dependent on the primary key. Note that
transitive dependencies imply mutual dependencies. The second
step in the normalization procedure is to take projections to eliminate
transitive dependencies.
R {A, B, C}
Primary key {A}
/* assume B → C holds */
then normalize as,
R1 {B, C} primary key {B}
R2 {A, B} primary key {A}
FK {B} references R1
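A minimal DDL sketch of this 3NF decomposition, assuming integer attributes:

CREATE TABLE R1 (
  B INT PRIMARY KEY,      -- B -> C is now a whole-key dependency
  C INT
);
CREATE TABLE R2 (
  A INT PRIMARY KEY,
  B INT REFERENCES R1(B)  -- foreign key {B} referencing R1
);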
then there exists at least one serial schedule S involving T1, T2,
….Tn such that I is equivalent to S. S is said to be the serializa-
tion of I.
14. (b) The term concurrency refers to the fact that DBMS’s typically allow
many transactions to access the same database at the same time.
Concurrency control ensures that concurrent transactions do not interfere with each other.
Three concurrency problems:
There are essentially three ways in which a transaction, though cor-
rect in itself, can produce the wrong answer if some other transaction
interferes with it in some way.
The lost update problem:
Transaction A        Time     Transaction B
retrieve t            t1
                      t2      retrieve t
update t              t3
                      t4      update t
Transaction A retrieves some tuple t at time t1. Transaction B retrieves
the same tuple t at time t2. A updates the tuple at time t3, and B updates the
same tuple at time t4 on the basis of the values it saw at time t2. Transaction
A's update is lost at time t4, because transaction B overwrites it without even
looking at it.
The Uncommitted Dependency Problem:
The uncommitted dependency problem arises if one transaction is al-
lowed to retrieve or update a tuple that has been updated by another
transaction that has not yet been committed.
15. (a) Having a large number of disks in a system presents opportunities for
improving the rate at which data can be read or written, if the disks
are operated in parallel. Several independent reads or writes can also
be performed in parallel. A variety of disk organization techniques
are collectively called Redundant Arrays of Independent Disks
(RAID).
RAID levels:
(a) RAID 0: striping without redundancy
(b) RAID 1: mirrored disks
(c) RAID 2: memory-style error-correcting codes
(d) RAID 3: bit-interleaved parity
(e) RAID 4: block-interleaved parity
(f) RAID 5: block-interleaved distributed parity
(g) RAID 6: P + Q redundancy
[Fig: each level drawn as an array of data disks (C) and parity disks (P)]
RAID 0: Refers to disk arrays with striping at the level of blocks, but
without any redundancy.
15. (b) An index entry or index record consists of a search key value and a
pointer to one or more records with that value as their search key
value. The pointer to a record consists of the identifier of a disk block
and an offset within the disk block to identify the record within a
block.
B+ tree:
The main disadvantage of the index sequential file organization is
that performance degrades as the file grows both for index lookups
and for sequential scans through the data. The B+ tree index structure
is the more widely used of several index structures that maintain
their efficiency despite insertion and deletion of data.
Structure of a B+ tree:
A B+ tree index is a multilevel index but it has a structure that differs
from that of the multilevel index sequential file. Structure consists of
a typical node of a B+ tree. It contains up to n − 1 search key values K1,
K2, …, Kn−1 and n pointers P1, P2, …, Pn. The search key values within
a node are kept in sorted order; thus if i < j, then Ki < Kj. First let us
consider the structure of leaf nodes. For i = 1, 2, …, n − 1, pointer Pi
points to a file record with search key value Ki; pointer Pn has a special
purpose (it chains the leaf to the next leaf node in search key order).
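As a practical aside (an assumption about typical engines, not something the question states), most relational systems implement ordinary secondary indices as B+ trees, so the structure just described is what a statement like the following usually builds; the emp(emp_no, emp_name) table is illustrative.

CREATE INDEX idx_emp_name ON emp (emp_name);

-- range queries can then follow the linked leaf nodes in search key order
SELECT emp_no FROM emp WHERE emp_name BETWEEN 'A' AND 'C';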
5. Mention the advantage of using magnetic tape for storing the data.
10. How does XML data differ from relational data?
PART B – (5 × 16 = 80 Marks)
11. (a) Discuss the various design issues involved in ER Database schema.
(16).
(b) (i) Explain the distinction between condition-defined and user-
defined constraints in Generalization, which of these constraints
can the system check automatically? Give your answer. (8)
12. (a) Explain the Third Normal Form with suitable example and compare
with BCNF.
13. (a) Explain the different methods of storing variable size records. (16)
15. (a) (i) A car rental company maintains a vehicle database for all vehi-
cles in its current fleet. For all vehicles it includes vehicle-id,
license-no, manufacturer, model, date-of-purchase and color,
Special data types are included for certain vehicles.
Trucks: cargo-capacity
Sports Car: horsepower, renter-age requirement
Vans: number of passengers
Construct an object-oriented database schema definition for this
database.
Use inheritance wherever appropriate. (8)
(ii) For the following schema
Books (title, authorset setoff (Author), publisherset
setoff(Publisher))
Author (first-name, last-name)
Publisher (name, branch)
Give the XML representation and its DTD. (8)
(b) (i) Describe the Data Warehouse Architecture with a neat diagram.
(7)
(ii) Explain the data mining applications – classifications, associa-
tion and clustering. (9)
2.
S.No  Relational Algebra                          Relational Calculus
(i)   It is a procedural query language.          It is a non-procedural query language.
(ii)  It consists of a set of operations.         It writes one declarative expression to
                                                  specify a retrieval; hence there is no
                                                  description of how to retrieve it.
(iii) It takes one or more relations as input     It consists of two types:
      and produces a new relation as the          (i) Tuple relational calculus
      result.                                     (ii) Domain relational calculus.
(iv)  Its operations are select, project,
      rename, set operations, Cartesian
      product and join.
4. Triggers are useful mechanism for database designers for starting certain
tasks automatically when certain conditions are met.
5. Tapes have a high capacity (5-gigabyte tapes) and can be removed from
the tape drive, facilitating cheap archival storage.
7. Concurrency control:
→ Concurrency control is the technique used to control concurrent execu-
tion of transaction
→ The concurrency control schemes are based on the serializability property.
8. (i) Atomicity
(ii) Consistency
(iii) Isolation
(iv) Durability
PART B
11 (a) Refer Nov/Dec 2010 Q.NO 11.(b)
[Fig: ER diagram for a hospital — the hospital employs Doctors (doc code,
doc name, fees, specialization, contact no), specialized via an ISA relation
into Permanent doctor (salary) and Consulting doctor; it contains Labs (tests)
and Rooms (room code, room type, charges); a Patient (patient-code, name,
address: street, city, state, pincode; D.O.B, age, blood group, reasons for
admission) is admitted to a Room and related to Doctors]
12. (b) (i) SELECT item FROM Sales WHERE dept NOT IN ('second floor');
(ii) SELECT dname FROM Dept WHERE employee_salary < manager_no_salary;
(iii) CREATE VIEW emp_view AS
        SELECT manager_no, salary FROM employee
        WHERE job = 'Manager' AND ename = 'Aarthi';
(iv) Yes, the view is updatable.
15 (a) (i) Trucks:
CREATE TYPE Trucks UNDER Vehicles (
  vehicle_id varchar(10), license_no varchar(10), manufacturer varchar(20),
  model varchar(20), date_of_purchase date, color varchar(15),
  cargo_capacity varchar(10));
Sports Car:
CREATE TYPE SportsCar UNDER Vehicles (
  vehicle_id varchar(10), license_no varchar(10), manufacturer varchar(20),
  model varchar(20), date_of_purchase date, color varchar(15),
  horsepower varchar(10), renter_age_requirement number(3));
Vans:
CREATE TYPE Vans UNDER Vehicles (
  vehicle_id varchar(10), license_no varchar(10), manufacturer varchar(20),
  model varchar(20), date_of_purchase date, color varchar(15),
  number_of_passengers number(3));
In the above object-oriented database schema:
Super-type: Vehicles
Sub-types: Trucks, SportsCar and Vans.
Trucks, SportsCar and Vans inherit the attributes of Vehicles.
[Fig: data warehouse architecture — data source 1 … data source n feed data
loaders, which populate the data warehouse; query and analysis tools work
on the warehouse]
6. How does a B-tree differ from B+ trees? Why is B+ tree usually preferred
as an access structure to a data file?
10. What are structured data types? What are collection types, in particular?
PART B (5 × 16 = 80 marks)
11. (a) (i) Explain the component modules of a DBMS and their
interactions with the architecture. (10)
or
(b) (i) Explain the basic relational algebra operations with the
symbol used and example for each. (10)
or
13. (a) (i) Describe the different types of file organization? Explain
using a sketch of each of them with their advantages and
disadvantages. (10)
(ii) Describe static hashing and dynamic hashing. (6)
or
(b) (i) Explain the index schemas used in DBMS. (10)
(ii) How does a DBMS represent a relational query evaluation
plan? (6)
or
(b) (i) Describe strict two-phase locking protocol. (10)
(ii) Explain the log based recovery technique (6)
15. (a) (i) Explain 2-phase commitment protocol and the behavior
of this protocol during lost messages and site failures. (12)
(ii) Describe X path and X query with an example. (4)
or
(b) (i) Explain Data mining and data warehousing. (12)
(ii) Describe the anatomy of XML document.
8. Define deadlock.
PART B – (5 × 16 = 80 marks)
11. (a) (i) Construct an ER diagram for a car insurance company that has
a set of customers, each of whom owns one/more cars. Each car
has associated with it zero to any number of recorded accidents.
(ii) Construct appropriate tables for the above ER diagram.
Or
(b) (i) Define data model. Explain the different types of data models
with relevant examples. (10)
(ii) Explain the role and functions of the database administrator. (6)
15 (a) State and explain the features of object oriented data model. Use
banking application as an example. (16)
Or
(b) Write detail notes on following:
(i) Distributed Databases (8)
(ii) Data Mining (8)
2. Give the reasons why null values might be introduced into the database.
PART B – (5 × 16 = 80 marks)
11. (a) Explain the system structure of a database system with neat block
diagram. (16)
Or
(b) (i) Construct an ER-diagram for hospital with a set of patients and a
set of medical doctors. Associate with each patient a log of the vari-
ous tests and examinations conducted.
(ii) Discuss on various relational algebra operators with suitable exam-
ple.
12. (a) (i) Consider the employee database, where the primary keys are
underlined.
employee (empname, street, city)
works (empname, companyname, salary)
company (companyname, city)
manages (empname, managername)
And given an expression in SQL for the following queries:
(1) Find the names of all employees who work for First Bank
Corporation.
(2) Find the names, street addresses, and cities of residence of
all employees who work for First Bank Corporation and earn
more than 200000 per annum.
(3) Find the names of all employees in this database who live in
the same city as the companies for which they work.
(4) Find the names of all the employees who earn more than
every employees of Small Bank Corporation.
(ii) Discuss the strengths and weaknesses of the trigger mechanism.
Compare triggers with other integrity constraints supported by
SQL. (8)
Or
(b) (i) What is normalization? Explain the various normalization tech-
niques with suitable example. (12)
(ii) Give the comparison between BCNF and 3NF. (4)
13. (a) (i) Explain how the RAID system improves performance and reli-
ability. (8)
(ii) Describe the structure of B+ tree and list the characteristics of a
B+ tree. (8)
Or
8. What benefit is provided by strict two-phase locking? What disadvantages
result?
PART B – (5 × 16 = 80 marks)
11. (a) (i) What are the types of knowledge discovered during data mining?
Explain with suitable examples. (8)
(ii) Highlight the features of object oriented database. (8)
Or
(b) (i) What are nested relations? Give an example. (8)
(ii) Explain the structure of XML with suitable example. (8)
12. (a) (i) Compare file system with database system. (8)
(ii) Explain the architecture of DBMS. (8)
Or
(b) (i) What are the steps involved in designing a database application?
Explain with an application. (10)
(ii) List the possible types of relations that may exist between two
entities. How would you realize that into tables for a binary rela-
tion? (6)
13. (a) (i) What are the relational algebra operations supported in SQL?
Write the SQL statement for each operation. (8)
(ii) Justify the need for normalization with examples. (8)
Or
(b) (i) What is normalization? Explain 1NF, 2NF, 3NF and BCNF with
suitable example. (8)
(ii) What is FD? Explain the role of FD in the process of normaliza-
tion. (8)
14. (a) (i) Explain the security features provided in commercial query lan-
guages. (8)
(ii) What are the steps involved in query processing? How would
you estimate the cost of the query? (8)
(b) (i) Explain the different properties of indexes in detail. (8)
(ii) Explain various hashing techniques. (8)
15. (a) (i) Explain the four important properties of transaction that a DBMS
must ensure to maintain database. (8)
(ii) What is RAID? List the different levels in RAID technology and
explain its features. (8)
Or
2. Give the distinction between primary key, candidate key and super key.
3. Write a SQL statement to find the names and loan numbers of all custom-
ers who have a loan at Chennai branch.
PART B – (5 × 16 = 80 marks)
11. (a) (i) Describe the system structure of database system. (12)
(ii) List out the functions of DBA (4)
Or
12. (a) (i) Discuss about triggers. How do triggers offer a powerful mech-
anism for dealing with the changes to database with suitable
example. (10)
(ii) What are nested queries? Explain with example. (6)
Or
(b) (i) What is normalization? Give the various normal forms of rela-
tional schema and define a relation which is in BCNF and explain
with suitable example. (12)
(ii) Compare BCNF versus 3NF. (4)
13. (a) (ii) Explain why the allocation of records to blocks affects database
system performance significantly. (6)
Or
(b) (i) Describe the structure of B+ tree and give the algorithm for search
in the B+ tree with example. (12)
(ii) Give the comparison between ordered indexing and hashing
Or
(b) (i) Discuss on two-phase locking protocol and timestamp-based
protocol. (12)
(ii) Write short notes on log-based recovery. (4)
15. (a) (i) Discuss in detail about the object relational database and its
advantages.
(ii) Illustrate the issues to implement distributed database.
(b) (i) Give the basic structure of XML and its document schema.
(ii) What are the two important classes of data mining problems?
Explain about rule discovery using those classes.
4. Define triggers.
PART B (5 × 16 = 80 Marks)
11. (a) (i) Explain the Database Management System architecture
with a neat diagram. (10)
(ii) What is the need for the development of relational
databases? (6)
12. (a) (i) Explain briefly about various relational algebra expressions
with examples. (8)
(ii) Discuss about the evolution of distributed database.
Compare with client/server mode. (8)
or
(b) Consider the relational table given below and answer the
following SQL queries. (16)
Employee (SSN-No, Name, Department, Salary)
(i) List all the employees whose name starts with the letter ‘L’
(ii) Find the maximum salary given to employees in each
department.
(iii) Find the number of employees working in the ‘accounts’
department.
(iv) Find the second maximum salary from the table.
(v) Find the employee who is getting the minimum salary.
14. (a) Explain briefly about the working of two phase locking
protocol using a sample transaction. (16)
or
(b) (i) When is a transaction said to be deadlocked? Explain the
deadlock prevention methods with an example. (8)
15. (a) (i) Draw and explain the structure of B+ tree index files. (10)
(ii) Write notes on RAID. (6)
or
(b) Explain briefly about query processing with examples to
perform sort and join operation. (16)