Database Management Systems
2. Define query.
4. Which operators are called unary operators and why are they called so?
Part B (5 × 16 = 80 marks)
11. (a) (i) Discuss the main characteristics of the database approach and
how it differs from the traditional file system.
(ii) What are the three levels of abstraction in DBMS? (8 + 8)
Or
(b) (i) A database is being constructed to keep track of the teams and
games of a sports league. A team has a number of players, not all
of whom participate in each game. It is desired to keep track
of the players participating in each game for each team, the positions
they played in that game, and the result of each game. Draw the ER
diagram and list its entities and attributes. (10)
(ii) Briefly explain mapping cardinality in detail (6)
12. (a) Consider the database schema
Emp (emp-name, type, birthday, set of (Exam-names), set of (Skills))
Children(emp-name, ch-name, birthday)
Skills(type, set of (exam-names))
Exams(exam-name, year, city)
Write SQL statements for the following queries.
(i) Find the names of all employees who have a child who has a birthday
in March.
(ii) Find those employees who took an examination for the skill type
“typing” in the city “Chennai”.
(iii) List all exam names under a specific skill type for the given
employee, other than his own exam names.
(iv) Find the name of the city and the year in which the examination is
going to be held for the given skill type. (8)
(v) Explain referential integrity with an example. (8)
Or
(b) What is the need for building a distributed database? Explain important
issues in building a distributed database with an example. Explain
how a distributed database is used in a client/server environment. (16)
13. (a) (i) What is redundant data? What are the problems caused by
redundant data? (6)
(ii) Explain the process of normalization from 1NF to the BCNF stage
with an example. (10)
Or
(b) Consider the relation R(A, B, C, D, E) with the functional dependencies
{A → BC, CD → E, B → D, E → A}. Identify the super keys.
Find Fc and F+. (16)
14. (a) Explain the following:
(i) Different locking mechanism used in lock based concurrency
control. (10)
(ii) Validation based protocol with an example. (6)
Or
(b) (i) What is the difference between conflict serializability and view
serializability? Explain in detail with an example. (12)
(ii) Briefly explain the ACID properties with an example. (4)
15. (a) What is RAID? Briefly explain different levels of RAID. Discuss the
factors to be considered in choosing a RAID level. (16)
Or
(b) (i) Explain the three kinds of database tuning in detail. (6)
(ii) Explain the structure of B+ tree and how to process queries in B+
tree. (10)
Solutions
Part A
3. Data integrity means making sure the data is correct and not corrupt.
Data security means making sure that only the people who should have
access to the data can access it, and keeping straight who can read the
data and who can write the data.
9. Slotted Page
Part B
11. (a) (i)
Database Approach
The problems inherent in file-oriented systems make using the database
system very desirable. Unlike the file-oriented system, with its many
separate and unrelated files, the database system consists of logically
related data stored in a single logical data repository. Therefore, the database
approach represents the change in the way end user data are stored,
accessed and managed. It emphasizes the integration and sharing of
data throughout the organisation. Database systems overcome the
disadvantages of the file-oriented system. They eliminate problems related
to data redundancy and data control by supporting an integrated and
centralized data structure. Data are controlled via a data dictionary (DD)
system which itself is controlled by database administrators (DBAs).
The following figure illustrates a comparison between file-oriented and
database systems.
File-oriented versus database systems
Conceptual/Internal Mapping
The conceptual schema is related to the internal schema through
conceptual/internal mapping. The conceptual internal mapping defines
the correspondence between the conceptual view and the stored
database. It specifies how conceptual records and fields are presented
at the internal level.
External/Conceptual Mapping
Each external schema is related to the conceptual schema by the external/
conceptual mapping. The external/conceptual mapping defines the
correspondence between a particular external view and the conceptual
view. It gives the correspondence among the records and relationships of
the external and conceptual views.
12. (a)
(i) select emp-name from children where birthday = 'March';
(ii) select e.emp-name from emp e, exams x
     where e.type = 'typing' and x.city = 'Chennai'
       and x.exam-name in e.exam-names;
12. (b)
Client/Server Database System
The client/server architecture of a database system has two logical components,
namely the client and the server. Clients are generally personal computers or
workstations, whereas servers are larger workstations, midrange computer
systems or mainframe computer systems. The applications and tools of the
DBMS run on one or more client platforms, while the DBMS software
resides on the server. The server computer is called the backend and the
client's computer is called the front-end. These server and client computers
are connected into a network. The applications and tools act as clients
of the DBMS, making requests for its services. The DBMS, in turn,
processes these requests and returns the results to the client(s). The
client handles the graphical user interface (GUI) and does
computations and other programming of interest to the end user. The
server handles parts of the job that are common to many clients, for
example, database access and updates. The following figure illustrates
client/server database architecture.
13. (a) (ii)
Example 1
As shown in Fig. 10.3, the partial dependency of the doctor’s contact
number on the key DOCTORNAME indicates that the relation is not
in 2NF. Therefore, to bring the relation in 2NF, the information about
doctors and their contact numbers has to be separated from the information
about patients and their appointments with doctors. Thus, the relation is
decomposed into two tables, namely PATIENT_DOCTOR and DOCTOR,
as shown in Table 10.4. The relational table can be depicted as:
PATIENT_DOCTOR (PATIENT-NAME, DATE-OF-BIRTH,
DOCTOR-NAME,
DATE-TIME, DURATION-MINUTES)
DOCTOR (DOCTOR-NAME, CONTACT-NO)
A relation is in third normal form (3NF) if its non-key attributes are:
•• mutually independent, and
•• fully functionally dependent on the primary (or relation) key.
Both of the above relations are in BCNF. The only FD between the USE
attributes is
(PROJECT, MACHINE) → QTY-USED
and (PROJECT, MACHINE) is a super key.
The two FDs between the PROJECTS attributes are
PROJECT → PROJ-MANAGER
PROJ-MANAGER → PROJECT
Both PROJECT and PROJ-MANAGER are super keys of the relation
PROJECTS, and so PROJECTS is in BCNF.
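As a minimal SQL sketch of this BCNF decomposition (datatypes are assumed, hyphens in the attribute names are replaced by underscores, and the USE relation is renamed machine_use to avoid a reserved word):
create table projects
( project      varchar(20) primary key,    -- PROJECT → PROJ-MANAGER
  proj_manager varchar(20) unique not null -- PROJ-MANAGER → PROJECT
);
create table machine_use
( project  varchar(20) references projects,
  machine  varchar(20),
  qty_used integer,
  primary key (project, machine)           -- (PROJECT, MACHINE) → QTY-USED
);
Declaring proj_manager unique records the second functional dependency, so both determinants are keys and each relation is in BCNF.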
13. (b)
The closure of a set of functional dependencies (also called the complete
set) defines all the FDs that can be derived from a given set of FDs. Given
a set F of FDs on the attributes of a table T, the closure of F can be computed. The
notation F+ is used to denote the closure of the set of all FDs implied by
F. Armstrong's axioms can be used to develop an algorithm that allows
computing F+ from F.
F = {EMPLOYEE-NO →EMPLOYEE-NAME,
PROJECT-NO →{PROJECT-NAME, PROJECT-LOCATION},
{EMPLOYEE-NO, PROJECT-NO} →HOURS-SPENT}
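As a sketch of how Armstrong's axioms generate members of F+ for this set (only a few of the derivable FDs are shown):
1. EMPLOYEE-NO → EMPLOYEE-NAME is given; by augmentation with PROJECT-NO,
   {EMPLOYEE-NO, PROJECT-NO} → {EMPLOYEE-NAME, PROJECT-NO},
   and by decomposition, {EMPLOYEE-NO, PROJECT-NO} → EMPLOYEE-NAME.
2. Similarly, from PROJECT-NO → {PROJECT-NAME, PROJECT-LOCATION},
   {EMPLOYEE-NO, PROJECT-NO} → PROJECT-NAME and
   {EMPLOYEE-NO, PROJECT-NO} → PROJECT-LOCATION.
3. {EMPLOYEE-NO, PROJECT-NO} → HOURS-SPENT is given, and by reflexivity
   {EMPLOYEE-NO, PROJECT-NO} → {EMPLOYEE-NO, PROJECT-NO}.
Hence the closure of {EMPLOYEE-NO, PROJECT-NO} under F contains every attribute, so {EMPLOYEE-NO, PROJECT-NO} is a candidate key, and every FD derived above belongs to F+.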
14. (a) (i)
Locks are basically of two types: (a) S locks – shared or Read lock, and
(b) X locks – exclusive or Write lock. The lock manager refuses
incompatible requests, so that:
(a) Transaction T1 holds an S lock on granule G1. A request by
transaction T2 for an S lock will be granted. In other words, Read-
Read is permutable.
(b) Transaction T1 holds an S lock on granule G1. A request by transaction
T2 for an X lock will be refused. In other words, Read-Write is not
permutable.
(c) Transaction T1 holds an X lock on granule G1. No request by
transaction T2 for a lock on G1 will be granted. In other words, Write
is not permutable with any other access.
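In SQL these S and X locks are normally acquired implicitly, but many systems also let a transaction request them explicitly. A minimal, dialect-dependent sketch (PostgreSQL-style row locking; the account table is illustrative):
-- Transaction T1: acquires an S (Read) lock on the selected row
SELECT balance FROM account WHERE acc_no = '101' FOR SHARE;
-- Transaction T2: this S lock request is granted (Read-Read is permutable)
SELECT balance FROM account WHERE acc_no = '101' FOR SHARE;
-- Transaction T3: this X (Write) lock request waits until T1 and T2 release their locks
SELECT balance FROM account WHERE acc_no = '101' FOR UPDATE;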
Lock Granularity
A database is basically represented as a collection of named data items.
The size of the data item chosen as the unit of protection by a concurrency
control program is called granularity. Granularity can be a field of some
record in the database, or it may be a larger unit such as record or even
a whole disk block. Granule is a unit of data individually controlled by
the concurrency control subsystem. Granularity is a lockable unit in a
lock-based concurrency control scheme. Lock granularity indicates the
level of lock use. Most often, the granule is a page, although smaller or
larger units (for example, tuple, relation) can be used. Most commercial
database systems provide a variety of locking granularities. Locking can
take place at the following levels:
•• Database level.
•• Table level.
•• Page level.
•• Row (tuple) level.
•• Attributes (fields) level.
Thus, the granularity affects the concurrency control of the data items,
that is, what portion of the database a data item represents. An item
can be as small as a single attribute (or field) value or as large as a disk
block, or even a whole file or the entire database.
Database Level Locking
At database level locking, the entire database is locked. Thus, it prevents
the use of any tables in the database by transaction T2 while transaction
T1 is being executed.
Lock Types
The DBMS mainly uses the following types of locking techniques:
•• Binary locking.
•• Exclusive locking.
•• Shared locking.
•• Two-phase locking (2PL).
•• Three-phase locking (3PL).
Binary Locking
In binary locking, there are two states of locking namely (a) locked (or
‘1’) or (b) unlocked (‘0’). If an object of a database table, page, tuple
(row) or attribute (field) is locked by a transaction, no other transaction
can use that object. A distinct lock is associated with each database item.
If the value of lock on data item X is 1, item X cannot be accessed by a
database operation that requires the item. If an object (or data item) X
is unlocked, any transaction can lock the object for its use. As a rule, a
transaction must unlock the object after its termination. Any database
operation requires that the affected object be locked. Therefore, every
transaction requires a lock and unlock operation for each data item that
is accessed. The DBMS manages and schedules these operations.
Two operations, lock_item(data item) and unlock_item(data item)
are used with binary locking. A transaction requests access to a data
item X by first issuing a lock_item(X) operation. If LOCK(X) = 1,
the transaction is forced to wait. If LOCK(X) = 0, it is set to 1 (that
is, transaction locks the data item X) and the transaction is allowed to
access item X. When the transaction is through using the data item, it
issues unlock_item(X) operation, which sets LOCK(X) to 0 (unlocks
the data item) so that X may be accessed by other transactions. Hence, a
binary lock enforces mutual exclusion on the data item.
It can be observed that the lock and unlock features eliminate the lost
update problem. The binary locking system has the advantage of being
easy to implement. However, the binary locking technique is too
restrictive to yield optimal concurrency conditions. For example, the
DBMS will not allow two transactions to read the same database object,
even though neither transaction updates the database and therefore no
concurrency problem, such as a lost update, can occur.
Binary lock
15. (a)
RAID TECHNOLOGY
With fast growing database applications such as World Wide Web,
multimedia and so on, the data storage requirements are also growing
at the same pace. Also, faster microprocessors with larger and larger
primary memories are continually becoming available with the
exponential growth in the performance and capacity of semiconductor
devices and memories. Therefore, it is expected that secondary storage
technology must also take steps to keep up in performance and reliability
with processor technology to match the growth. Development of
redundant arrays of inexpensive disks (RAID) was a major advancement
in secondary storage technology to achieve improved performance and
reliability of storage system. Lately, the “I” in RAID is said to stand for
independent. The main goal of RAID is to even out the widely different
rates of performance improvement of disks against those in memory and
microprocessor. RAID technology provides a disk array arrangement
in which a large number of small independent disks operate in parallel
and act as a single higher-performance logical disk in place of a single
very large disk. The parallel operation of several disks improves the rate
at which data can be read or written and allows several independent
reads and writes to be performed in parallel. In a RAID system, a combination of data
striping (also called parallelism) and data redundancy is implemented.
Data is distributed over several disks and redundant information is
stored on multiple disks.
Thus, in case of disk failure the redundant information is used to
reconstruct the content of the failed disk. Therefore, failure of one
disk does not lead to the loss of data. The RAID system increases the
performance and improves reliability of the resulting storage system.
10. What is the basic difference between static hashing and dynamic
hashing?
Part B (5 × 16 = 80 marks)
and
Salesman # → Commission %
Based on the given primary key, is this relation in 1NF, 2NF, or
3NF? Why or why not? How would you successively normalize it
completely?
Or
(b) Explain the principles of
(i) Loss less join decomposition (5)
(ii) Join dependencies (5)
(iii) Fifth normal form. (6)
14. (a) Illustrate deadlock and conflict serializability with suitable examples.
Or
(b) (i) Explain two phase commit protocol. (10)
(ii) Write different SQL facilities for recovery. (6)
15. (a) Construct a B+ tree to insert the following (order of the tree is 3)
Solutions
Part A
4. Triggers can be used to:
•• audit changes (e.g. keep a log of the users and roles involved in changes);
•• enhance changes (e.g. ensure that every change to a record is
time-stamped by the server's clock);
•• enforce business rules (e.g. require that every invoice have at least
one line item);
•• execute business rules (e.g. notify a manager every time an employee's
bank account number changes);
•• replicate data (e.g. store a record of every change, to be shipped
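A minimal sketch of a trigger of the first (auditing) kind, using MySQL-style syntax with illustrative table and column names:
CREATE TRIGGER audit_salary_change
AFTER UPDATE ON employee
FOR EACH ROW
  INSERT INTO salary_audit (emp_id, old_salary, new_salary, changed_at)
  VALUES (OLD.emp_id, OLD.salary, NEW.salary, CURRENT_TIMESTAMP);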
10. Static hashing: the number of buckets is fixed; as the file grows,
performance degrades; space overhead is higher. Dynamic hashing: the
number of buckets is not fixed; performance does not degrade as the file
grows; space overhead is minimal.
Part B
11. (a)
DATABASE SYSTEM
A database system, also called a database management system
(DBMS), is a generalized software system for manipulating databases.
It is basically a computerized record-keeping system that stores
information and allows users to add, delete, change, retrieve and
update that information on demand. It provides for simultaneous use
of a database by multiple users and tools for accessing and manipulating
the data in the database. A DBMS is also a collection of programs that
enables users to create and maintain a database. It is a general-purpose
software system that facilitates the processes of defining (specifying the
data types, structures and constraints), constructing (storing
data on storage media) and manipulating (querying to retrieve specific
data, updating to reflect changes and generating reports from the data)
databases for various applications.
Typically, a DBMS has three basic components, as shown in Fig. 1.16,
and provides the following facilities:
DBMS Components
•• Data description language (DDL): It allows users to define the
database, specify the data types and data structures, and the
constraints on the data to be stored in the database, usually through a
data definition language. The DDL translates the schema written in a
source language into the object schema, thereby creating a logical and
physical layout of the database.
•• Data manipulation language (DML) and query facility: It allows
users to insert, update, delete and retrieve data from the database,
usually through a data manipulation language (DML). It provides a
general query facility through the structured query language (SQL).
•• Software for controlled access of database: It provides controlled
access to the database, for example, preventing an unauthorized user
from trying to access the database, providing a concurrency control
system to allow shared access of the database, activating a recovery
control system to restore the database to a previous consistent state
following a hardware or software failure, and so on.
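For instance, in a minimal sketch with an illustrative student table, the first statement below is DDL and the remaining statements are DML issued through the query facility:
CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name VARCHAR(30), dept VARCHAR(10));  -- DDL
INSERT INTO student VALUES (1, 'Abhishek', 'CSE');                                       -- DML
UPDATE student SET dept = 'IT' WHERE roll_no = 1;                                        -- DML
SELECT name FROM student WHERE dept = 'IT';                                              -- DML (query)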
11. (b)
[E-R diagram of the banking enterprise, built from the entities branch, account, depositor, employee, manager, worker, savings-account and checking-account, the relationships loan-branch, cust-banker and works-for (with an ISA hierarchy), and attributes such as branch-name, branch-city, assets, account-number, balance, access-date, payment-date, type, social-security, e-social-security, employee-name, dependent-name and telephone-number.]
12. (a)
create table employee
(person-name char(20),
street char(30),
city char(30),
primary key (person-name) )
create table works
(person-name char(20),
company-name char(15),
salary integer,
primary key (person-name),
foreign key (person-name) references employee,
foreign key (company-name) references company)
create table company
(company-name char(15),
city char(30),
primary key (company-name))
create table manages
(person-name char(20),
manager-name char(20),
primary key (person-name),
foreign key (person-name) references employee,
foreign key (manager-name) references employee)
Note that alternative datatypes are possible. Other choices for not null
attributes may be acceptable.
12. (b)
select sum(salary) as total_salary from employee;
ii)SELECT MIN(salary) AS “Lowest salary” FROM employees;
iii) SELECT AVG(salary) AS avg_salary, company_name FROM employees
GROUP BY company_name;
iv) SELECT e.employee_name, e.salary, e.company_name,
       (SELECT ROUND(AVG(salary)) FROM employees
        WHERE company_name = e.company_name) AS avg_sal
    FROM employees e
    WHERE e.salary > (SELECT AVG(salary) FROM employees
                      WHERE company_name = e.company_name)
    ORDER BY avg_sal DESC;
13. (a)
The relation is in 1NF because all attribute values are single atomic
values.
The relation is not in 2NF because:
Car# → DateSold
Car# → DiscountAmount
Salesman# → Commission%
Thus, these attributes are not fully functionally dependent on the primary
key.
•• 2NF decomposition:
CAR_SALE1(Car#, DateSold, DiscountAmount)
CAR_SALE2(Car#, Salesman#)
CAR_SALE3(Salesman#, Commission%)
•• The relations are not in 3NF because:
Car# → DateSold → DiscountAmount
Thus, DateSold is neither a key itself nor a subset of a key and
DiscountAmount is not a prime attribute.
•• 3NF decomposition:
CAR_SALES1A(Car#, DateSold)
CAR_SALES1B(DateSold, DiscountAmount)
CAR_SALE2(Car#, Salesman#)
CAR_SALE3(Salesman#, Commission%)
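As a minimal SQL sketch of the final 3NF design (datatypes are assumed and the '#'/'%' characters in the attribute names are rewritten as ordinary identifiers):
CREATE TABLE car_sales1a (car_no VARCHAR(10) PRIMARY KEY, date_sold DATE);
CREATE TABLE car_sales1b (date_sold DATE PRIMARY KEY, discount_amount NUMERIC(8,2));
CREATE TABLE car_sale2  (car_no VARCHAR(10), salesman_no VARCHAR(10),
                         PRIMARY KEY (car_no, salesman_no));
CREATE TABLE car_sale3  (salesman_no VARCHAR(10) PRIMARY KEY, commission_pct NUMERIC(5,2));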
13. (b) (i)
A decomposition D = {R1, R2, ..., Rn} of a relation R has the lossless-join
property if, for every legal instance r of R,
⋈ (ΠR1(r), ΠR2(r), ..., ΠRn(r)) = r
where Π denotes projection and ⋈ denotes the natural join of all the
relations in D.
The lossless-join decomposition is a property of decomposition, which
ensures that no spurious tuples are generated when a natural join
operation is applied to the relations in the decomposition.
Let us consider the relation scheme (or table) R(X, Y, Z) with functional
dependencies YZ → X, X → Y and X → Z, as shown in Fig. 9.17. The
relation R is decomposed into two relations, R1 and R2, that are defined by the
following two projections:
R1 = projection of R over X, Y
R2 = projection of R over X, Z
where X is the set of common attributes of R1 and R2.
The decomposition is lossless if R = R1 ⋈ R2 over X, and the
decomposition is lossy if R is a proper subset of R1 ⋈ R2 over X.
It can be seen in Fig. 9.17 that the join of R1 and R2 yields the same
number of rows as does R. The decomposition of R(X, Y, Z) into R1(X, Y)
and R2(X, Z) is lossless if, for the attributes X common to both R1 and R2,
either X → Y or X → Z. Thus, in the example of Fig. 9.16 the common
attribute of R1 and R2 is B, but neither B → A nor B → C is true. Hence
the decomposition is lossy. In Fig. 9.17, however, the decomposition is
lossless because for the common attribute X, both X → Y and X → Z hold.
Lossless decomposition
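A small made-up instance illustrates the lossy case. Let R(A, B, C) contain the tuples (a1, b1, c1) and (a2, b1, c2), and decompose R into R1(A, B) and R2(B, C):
R1 = {(a1, b1), (a2, b1)}
R2 = {(b1, c1), (b1, c2)}
R1 ⋈ R2 = {(a1, b1, c1), (a1, b1, c2), (a2, b1, c1), (a2, b1, c2)}
The join contains the two spurious tuples (a1, b1, c2) and (a2, b1, c1) because the common attribute B is not a determinant (neither B → A nor B → C holds), so the decomposition is lossy.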
13 (b) (ii)
Join Dependencies (JD)
A join dependency (JD) can be said to exist if the join of R1 and R2 over
C is equal to the relation R, where R1 and R2 are the decompositions R1(A,
B, C) and R2(C, D) of a given relation R(A, B, C, D). Alternatively, R1
and R2 form a lossless decomposition of R. In other words, *((A, B, C),
(C, D)) will be a join dependency of R if the join of these components over
their common attributes is equal to the relation R. Here, *(R1, R2, R3, ...)
indicates that relations R1, R2, R3 and so on form a join dependency (JD)
of R. Therefore, a necessary condition for a relation R to satisfy a JD
*(R1, R2, ..., Rn) is that
R = R1 ∪ R2 ∪ ... ∪ Rn
Thus, whenever we decompose a relation R into R1 = X ∪ Y and R2 = (R −
Y) based on an MVD X →→ Y that holds in relation R, the decomposition
has the lossless join property. Therefore, lossless-join dependency can be
defined as a property of decomposition which ensures that no spurious
tuples are generated when the relations are recombined through a natural
join operation.
13 (b) iii)
Fifth Normal Form (5NF)
A relation is said to be in fifth normal form (5NF) if every join dependency
is a consequence of its relation (candidate) keys. Alternatively, for every
non-trivial join dependency *(R1, R2, R3) each decomposed relation Ri
is a super key of the main relation R. 5NF is also called project-join
normal form (PJNF).
There are some relations which cannot be decomposed into two or more
higher normal form relations by means of projections as discussed for
1NF, 2NF, 3NF and BCNF. Such relations are decomposed into three
or more relations, which can be reconstructed by means of a three-way
or more join operation. This is addressed by the fifth normal form (5NF). The
5NF eliminates the problems of 4NF. 5NF allows for relations with join
dependencies. Any relation that is in 5NF is also in the other normal forms,
namely 2NF, 3NF and 4NF. 5NF is mainly used from a theoretical point of
view and not for practical database design.
14 (a)
Deadlocks
A deadlock is a condition in which two (or more) transactions in a set
are waiting simultaneously for locks held by some other transaction in
the set. Neither transaction can continue because each transaction in the
set is on a waiting queue, waiting for one of the other transactions in the
set to release the lock on an item. Thus, a deadlock is an impasse that
may result when two or more transactions are each waiting for locks to
be released that are held by the other. Transactions whose lock requests
have been refused are queued until the lock can be granted. A deadlock
is also called a circular waiting condition where two transactions are
waiting (directly or indirectly) for each other. Thus in a deadlock, two
transactions are mutually excluded from accessing the next record
required to complete their transactions, also called a deadly embrace.
14. (b)
Limitations of the two-phase commit protocol
•• A failure of the coordinator of sub-transactions can result in the
transaction being blocked from completion until the coordinator is
restored.
•• Requirement of coordinator results into more messages and more
overhead.
15. (b)
Tuples of r1: nr1 = 20,000
Tuples of r2: nr2 = 45,000
Blocks required for r1: br1 = ⌈20,000/25⌉ = 800 blocks
Blocks required for r2: br2 = ⌈45,000/30⌉ = 1,500 blocks
(a) cost(p1) = br1 + nr1 * br2 = 800 + 20, 000 * 1, 500 = 30, 000, 800
(b) cost(p2) = br1 + br1 * br2 = 800 + 800 * 1, 500 = 1, 200, 800
(c) We assume that all tuples for any given value of the join attributes
fit in memory:
cost(p3) = br1 + br2 = 800 + 1, 500 = 2, 300
(d) We have to add the cost of sorting (using external sort-merge) to the
cost of p3, i.e.,
cost(p4) = cost(p3) + cost(sorting)
cost(sorting relation r) = br(2⌈logM−1(br/M)⌉ + 1), where M is the
number of blocks in the buffer. We have to add to these costs the
output of the sorted relation, i.e., add br block transfers.
⇒ cost(sorting) = 800 * (2⌈logM−1(800/M)⌉ + 2) + 1,500 *
(2⌈logM−1(1,500/M)⌉ + 2)
(e) cost (p5) = 3* (br1 + br2) = 3*(800 + 1,500) = 6,900
3. Mention the six fundamental operations of relational algebra and their
symbols.
4. List two reasons why null values might be introduced into the database.
9. What can be done to reduce the occurrence of bucket overflows in a hash
file organization?
10. How might the type of index available influence the choice of a query
processing strategy?
Part B (5 × 16 = 80 marks)
11. (a) (i) Discuss in detail about the major disadvantages of file-
processing system. (6)
(ii) Explain in detail about different data models with neat diagram.
(10)
Or
(b) Draw an E-R diagram for a Life insurance company with almost all
components and explain. (16)
12. (a) Give an introduction to Distributed database and Client/Server
database. (16)
Or
(b) Illustrate the uses of Embedded SQL and Dynamic SQL with
suitable examples. (16)
13. (a) Explain in detail about all functional dependencies based normal
forms with suitable examples. (16)
Or
(b) Describe about the Join Dependencies and Fifth normal form with
suitable example. (16)
14. (a) Discuss in detail about transaction concepts and two phase commit
protocol. (16)
Or
(b) Write down in detail about intent locking and isolation levels. (16)
15. (a) Jot down detailed notes on ordered indices and B – Tree index files.
(16)
Or
(b) Describe in detail about RAID and Tertiary storage. (16)
Solutions
Part A
7. SQL Server is designed to recover from system and media failures, and
the recovery system can scale to machines with very large buffer pools
and thousands of disk drives.
Part B
for all programs that access the file. It can also be noticed in Fig.
1.18 that SALES file has been used in both “Account receivable
program” and “Sales statement program”. If it is decided to change
the CUST-ID field length from 4 characters to 6 characters, the
file descriptions in each program that is affected would have to be
modified to conform to the new file structure. It is often difficult to
even locate all programs affected by such changes. It could be very
time consuming and subject to error when making changes. This
characteristic of file-oriented system is known as program-data
dependence.
(d) Poor data control: As shown in Fig. 1.19, a file-oriented system,
being decentralised in nature, had no centralised control at
the data element (field) level. It was very common for a data
field to have multiple names defined by the various departments
of an organisation, depending on the file it was in. This could
lead to different meanings of a data field in different contexts and,
conversely, the same meaning for different fields. This leads to poor
data control, resulting in confusion.
(e) Limited data sharing: There are limited data-sharing opportunities
with the traditional file-oriented system. Each application has its own
private files and users have little opportunity to share data outside
their own applications. To obtain data from several incompatible
files in separate systems will require a major programming effort.
In addition, a major management effort may also be required since
different organisational units may own these different files.
(f) Inadequate data manipulation capabilities: File-oriented
systems do not provide strong connections between data in different
files, and therefore their data manipulation capability is very limited.
(g) Excessive programming effort: There was a very high
interdependence between program and data in file-oriented system
and therefore an excessive programming effort was required for a
new application program to be written. Even though an existing
file may contain some of the data needed, the new application often
requires a number of other data fields that may not be available in
the existing file. As a result, the programmer had to rewrite the code
for definitions for needed data fields from the existing file as well
as definitions of all new data fields. Therefore, each new application
required that the developers (or programmers) essentially start from
scratch by designing new file formats and descriptions and then
write the file access logic for each new program. Also, both initial
and maintenance programming efforts for management information
applications were significant.
(h) Security problems: Not every user of the database system should
be allowed to access all the data. Each user should be allowed
to access only the data concerning his area of application. Since
application programs were added to the file-oriented system in an
ad hoc manner, it was difficult to enforce such a security system.
university academic departments along with data on all faculties for each
department and all courses taught by each faculty within a department.
Fig. 2.13 (b) shows the defined fields or data types for department,
faculty, and course record types. A single department record at the root
level represents one instance of the department record type. Multiple
instances of a given record type are used at lower levels to show that
a department may employ many (or no) faculties and that each faculty
may teach many (or no) courses. For example, we have a COMPUTER
department at the root level and as many instances of the FACULTY
record type as there are faculties in the computer department. Similarly, there
will be as many COURSE record instances for each FACULTY record
as that faculty teaches. Thus, there is a one-to-many (1:m) association
among record instances, moving from the root to the lowest level of
the tree. Since there are many departments in the university, there are
many instances of the DEPARTMENT record type, each with its own
FACULTY and COURSE record instances connected to it by appropriate
branches of the tree. This database then consists of a forest of such tree
instances; as many instances of the tree type as there are departments in
the university at any given time. Collectively, these comprise a single
hierarchic database and multiple databases will be online at a time.
11. (b)
12. (a)
Client/Server Database System
The client/server architecture of a database system has two logical components,
namely the client and the server. Clients are generally personal computers or
workstations, whereas servers are larger workstations, midrange computer
systems or mainframe computer systems. The applications and tools of the
DBMS run on one or more client platforms, while the DBMS software
resides on the server. The server computer is called the backend and the
client's computer is called the front-end. These server and client computers
are connected into a network. The applications and tools act as clients
of the DBMS, making requests for its services. The DBMS, in turn,
processes these requests and returns the results to the client(s). The
client handles the graphical user interface (GUI) and does
computations and other programming of interest to the end user. The
server handles parts of the job that are common to many clients, for
example, database access and updates.
12 (b)
Embedded Structured Query Language (SQL)
We have looked at a wide range of SQL query constructs in the previous
sections, wherein SQL is treated as an independent language in its
own right. An RDBMS supports an interactive SQL interface through
which users directly enter these SQL commands. However, in practice,
often we need the greater flexibility of a general-purpose programming
language, for example to integrate a database application with a graphical user
interface. This is in addition to the data manipulation facilities provided
by SQL. To deal with such requirements, SQL statements can be directly
embedded in procedural language (that is, program’s source code such
as COBOL, C, Java, PASCAL, FORTRAN, PL/I and so on) along with
other statements of the programming language. A language in which
programs can create SQL queries as strings at run time and can either
have them executed immediately or have them prepared for subsequent use
is called dynamic SQL. SQL defines standards for dynamic SQL calls in a
host language, such as C, as in the following example.
char *sqlprog = "update account "
                "set balance = balance * 1.05 "
                "where acc_no = ?";
EXEC SQL prepare dynprog from :sqlprog;
char acc[10] = "101";
EXEC SQL execute dynprog using :acc;
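For contrast, a statically embedded SQL statement is fixed at compile time and exchanges values with the host program through host variables. A minimal sketch (the account table and variable names are illustrative):
EXEC SQL BEGIN DECLARE SECTION;
  char  acc_no[11];
  float bal;
EXEC SQL END DECLARE SECTION;
/* ... acc_no is filled in by the host program ... */
EXEC SQL SELECT balance INTO :bal FROM account WHERE acc_no = :acc_no;
EXEC SQL UPDATE account SET balance = balance * 1.05 WHERE acc_no = :acc_no;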
13. (a)
First Normal Form (1NF)
A relation is said to be in first normal form (1NF) if the values in the
domain of each attribute of the relation are atomic (that is simple and
indivisible). In 1NF, all domains are simple and in a simple domain, all
elements are atomic. Every tuple (row) in the relational schema contains
only one value of each attribute and no repeating groups. 1NF data
requires that every data entry, or attribute (field) value, must be non-
decomposable. Hence, 1NF disallows having a set of values, a tuple of
values or a combination of both as an attribute value for a single tuple.
1NF disallows multi-valued attributes that are themselves composites.
This is called “relations within relations”, or nested relations, or
“relations as attributes of tuples”.
Example 1
Consider a relation LIVED_IN, which keeps records of person and his
residence in different cities. In this relation, the domain RESIDENCE is
not simple. For example, a person "Abhishek" can have residence in
Jamshedpur, Mumbai or Delhi. Therefore, the relation is un-normalised.
Now, the relation LIVED_IN is normalised by combining each row in
residence with its corresponding value of PERSON and making this
combination a tuple (row) of the relation, as shown in the figure below.
Thus, now non-simple domain RESIDENCE is replaced with simple
domains.
Relation LIVED-IN
Example 1
The partial dependency of the doctor’s contact number on the key
DOCTORNAME indicates that the relation is not in 2NF. Therefore, to
bring the relation in 2NF, the information about doctors and their contact
numbers has to be separated from the information about patients and their
appointments with doctors. Thus, the relation is decomposed into two
tables, namely PATIENT_DOCTOR and DOCTOR, as shown in Table.
The relational table can be depicted as:
PATIENT_DOCTOR (PATIENT-NAME, DATE-OF-BIRTH,
DOCTOR-NAME,
DATE-TIME, DURATION-MINUTES)
DOCTOR (DOCTOR-NAME, CONTACT-NO)
Table 10.4 Relation PATIENT_DOCTOR decomposed into two tables
for refinement into 2NF
A relation is in third normal form (3NF) if its non-key attributes are:
•• mutually independent, and
•• fully functionally dependent on the primary (or relation) key.
A relation R is in BCNF if, for every functional dependency X → Y that
holds in R, at least one of the following is true:
•• X is a super key of R,
•• X → Y is a trivial FD, that is, Y ⊂ X.
13. (b)
Join Dependencies (JD)
A join dependency (JD) can be said to exist if the join of R1 and R2 over
C is equal to the relation R, where R1 and R2 are the decompositions R1(A,
B, C) and R2(C, D) of a given relation R(A, B, C, D). Alternatively, R1
and R2 form a lossless decomposition of R. In other words, *((A, B, C),
(C, D)) will be a join dependency of R if the join of these components over
their common attributes is equal to the relation R. Here, *(R1, R2, R3, ...)
indicates that relations R1, R2, R3 and so on form a join dependency (JD)
of R. Therefore, a necessary condition for a relation R to satisfy a JD
*(R1, R2, ..., Rn) is that
R = R1 ∪ R2 ∪ ... ∪ Rn
Thus, whenever we decompose a relation R into R1 = X ∪ Y and R2 = (R −
Y) based on an MVD X →→ Y that holds in relation R, the decomposition
has the lossless join property. Therefore, lossless-join dependency can be
defined as a property of decomposition which ensures that no spurious
tuples are generated when the relations are recombined through a natural
join operation.
Example 1
Let us consider a relation PERSONS_ON_JOB_SKILLS, as shown
in Table 10.14. This relation can be decomposed into three relations
namely, HAS_SKILL, NEEDS_SKILL and JOB_ASSIGNED.
Fig. 10.11 illustrates the join dependencies of decomposed relations. It
can be noted that no pair of the decomposed relations forms a lossless
decomposition of PERSONS_ON_JOB_SKILLS. In fact, a join of all
three decomposed relations yields a relation that has the same data as
does the original relation PERSONS_ON_JOB_SKILLS. Thus, each
relation acts as a constraint on the join of the other two relations.
Now, if we join decomposed relations HAS_SKILL and NEEDS_
SKILL, a relation CAN_USE_JOB_SKILL is obtained, as shown in
Fig. 10.11. This relation stores the data about persons who have skills
applicable to a particular job. But, each person who has a skill required
for a particular job need not be assigned to that job. The actual job
assignments are given by the relation JOB_ASSIGNED. When this
relation is joined with HAS_SKILL, a relation is obtained that will
contain all possible skills that can be applied to each job. This happens
because persons assigned to that job, possesses those skills. However,
some of the jobs do not require all the skills. Thus, redundant tuples
(rows) that show unnecessary SKILL-TYPE and JOB combinations are
removed by joining with relation NEEDS_SKILL.
Example 1
Let us consider the relation PERSONS_ON_JOB_SKILLS of Fig.
10.11. The three relations are
HAS_SKILL (PERSON, SKILL-TYPE)
NEEDS_SKILL (SKILL-TYPE, JOB)
JOB_ASSIGNED (PERSON, JOB)
Now by applying the definition of 5NF, the join dependency is given as:
*((PERSON, SKILL-TYPE), (SKILL-TYPE, JOB), (PERSON, JOB))
The above statement is true because a join relation of these three
relations is equal to the original relation PERSONS_ON_JOB_SKILLS.
The consequence of this join dependency is that none of (PERSON,
SKILL-TYPE), (SKILL-TYPE, JOB) or (PERSON, JOB) is a relation
key, and hence the relation is not in 5NF. Now suppose the second tuple
(row 2) is removed from the relation
PERSONS_ON_JOB_SKILLS; a new relation is created that no longer
has any join dependencies. Thus the new relation will be in 5NF.
14. (a)
Two-phase Commit (2PC)
Two-phase commit protocol (2PC) is the simplest and most widely used
technique for recovery and concurrency control in a distributed database
environment. The 2PC mechanism guarantees that all database servers
participating in a distributed transaction either all commit or all abort.
In a distributed database system, each sub-transaction (that is, part of a
transaction getting executed at each site) must show that it is prepared-
to-commit. Otherwise, the transaction and all of its changes are entirely
aborted. For a transaction to be ready to commit, all of its actions must
have been completed successfully. If any sub-transaction indicates that
its actions cannot be completed, then all the sub-transactions are aborted
and none of the changes are committed. The two-phase commit process
requires the coordinator to communicate with every participant site.
As the name implies, two-phase commit (2PC) protocol has two phases
namely the voting phase and the decision phase. Both phases are initiated
by a coordinator. The coordinator asks all the participants whether they
are prepared to commit the transaction. In the voting phase, the sub-
transactions are requested to vote on their readiness to commit or abort.
In the decision phase, a decision as to whether all sub-transactions
should commit or abort is made and carried out. If one participant votes
to abort or fails to respond within a timeout period, then the coordinator
instructs all the participants to abort the transaction.
Limitations
•• A failure of the coordinator of sub-transactions can result in the
transaction being blocked from completion until the coordinator is
restored.
•• Requirement of coordinator results into more messages and more
overhead.
TRANSACTION CONCEPTS
A transaction is a logical unit of work of database processing that
includes one or more database access operations. A transaction can be
defined as an action or series of actions that is carried out by a single
user or application program to perform operations for accessing the
contents of the database. The operations can include retrieval (Read),
insertion (Write), deletion and modification. A transaction must be either
completed or aborted. A transaction is a program unit whose execution
may change the contents of a database. It can either be embedded
within an application program or can be specified interactively via a
high-level query language such as SQL. Its execution preserves the
consistency of the database. No intermediate states are acceptable. If the
database is in a consistent state before a transaction executes, then the
database should still be in consistent state after its execution. Therefore,
to ensure these conditions and preserve the integrity of the database a
database transaction must be atomic. An atomic
transaction is a transaction in which either all actions associated with the
transaction are executed to completion or none are performed. In other
words, each transaction should access shared data without interfering
with the other transactions and whenever a transaction successfully
completes its execution; its effect should be permanent. However, if due
to any reason, a transaction fails to complete its execution (for example,
system failure) it should not have any effect on the stored database. This
basic abstraction frees the database application programmer from the
following concerns:
•• Inconsistencies caused by conflicting updates from concurrent users.
•• Partially completed transactions in the event of systems failure.
•• User-directed undoing of transactions.
14. (b)
In the ideal world of isolation, a transaction completes before another
begins
•• In the real world, isolation has levels
•• The higher the level of isolation, the less the interference and the lower
the concurrency
•• Cursor stability permits shared locks for reading, that are released
before the transaction completes
•• Repeatable read ensures that a read is repeatable throughout the
transaction
•• Locking granularity: locks can be taken at the tuple level, or by rel-
var, database, or attribute
•• To avoid examining every tuple to determine if any are locked,
the intent locking protocol declares a lock at the relvar to forecast
conflicts
•• Modes are intent shared, intent exclusive and shared intent exclusive
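In SQL the isolation level is chosen per transaction with the standard SET TRANSACTION statement; as a rough sketch, cursor stability corresponds to READ COMMITTED in most systems:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;   -- cursor-stability-like behaviour
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;  -- reads are repeatable within the transaction
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;     -- highest isolation, least interference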
15. (a)
In an ordered index, index entries are stored sorted on the search key
value. E.g., author catalog in library.
Primary index: in a sequentially ordered file, the index whose search key
specifies the sequential order of the file. Also called clustering index.
The search key of a primary index is usually but not necessarily the
primary key.
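In SQL, an ordered index on a search key is typically requested with CREATE INDEX (a minimal sketch; the book table and column names are illustrative, and clustering syntax varies by system):
CREATE INDEX book_author_idx ON book (author);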
Dense and sparse indices: there are two kinds of ordered indices, dense
and sparse.
Dense index:
An index record appears for every search key value in the file. The
index record contains the search key value and a pointer to the first data
records with that search key value.
Multilevel Indices:
If our file is extremely large, even the outer index may grow too large to
fit in main memory. In such a case we can create yet another level of index.
Indeed, we can repeat this process as many times as necessary. Indices with
two or more levels are called multilevel indices.
Index Update:
Insertion
The system performs a lookup using the search key value that appears in
the record to be inserted.
Dense indices:
1. If the search-key value does not appear in the index, the system
inserts an index record with the search-key value in the index at the
appropriate position.
2. Otherwise, if the index record stores pointers to all records with the
same search-key value, the system adds a pointer to the new record to
the index record.
Sparse indices:
If the system creates a new block it inserts the first search key value
appearing in the new block into the index.
Deletion
If the deleted record was the only record in the file with its particular
search-key value, the search key is deleted from the index also.
Single-level index deletion:
Dense indices – deletion of search-key is similar to file record deletion.
Sparse indices – if an entry for the search key exists in the index, it is
deleted by replacing the entry in the index with the next search-key value
in the file (in search-key order). If the next search-key value already has
an index entry, the entry is deleted instead of being replaced.
B-Tree Index Files
Similar to B+-tree, but B-tree allows search-key values to appear only
once; eliminates redundant storage of search keys.
Search keys in nonleaf nodes appear nowhere else in the B-tree; an
additional pointer field for each search key in a nonleaf node must
therefore be included.
Nonleaf node – the pointers Bi are the bucket or file-record pointers.
Example
15. (b)
Raid Technology
With fast growing database applications such as World Wide Web,
multimedia and so on, the data storage requirements are also growing
at the same pace. Also, faster microprocessors with larger and larger
primary memories are continually becoming available with the
exponential growth in the performance and capacity of semiconductor
devices and memories. Therefore, it is expected that secondary storage
technology must also take steps to keep up in performance and reliability
with processor technology to match the growth.
Development of redundant arrays of inexpensive disks (RAID) was a
major advancement in secondary storage technology to achieve improved
performance and reliability of storage system. Lately, the “I” in RAID
is said to stand for independent. The main goal of RAID is to even out
the widely different rates of performance improvement of disks against
1. List four significant differences between a file-processing system and a
DBMS.
3. Describe a circumstance in which you would choose to use embedded
SQL rather than using SQL alone.
4. List two major problems with processing of update operations expressed
in terms of views.
5. Give an example of a relation schema R and a set of dependencies such
that R is in BCNF, but not in 4NF.
6. Why are certain functional dependencies called trivial functional
dependencies?
8. What benefit does strict two-phase locking provide? What disadvantages
result?
10. When is it preferable to use a dense index rather than a sparse index?
Explain your answer.
Part B (5 × 16 = 80 marks)
11. (a) Discuss in detail about database system architecture with neat
diagram.
Or
(b) Draw an E-R diagram for a banking enterprise with almost all
components and explain.
12. (a) Explain in detail about Relational Algebra, Domain Relational
Calculus and Tuple Relational Calculus with suitable examples.
Or
(b) Briefly present a survey on Integrity and Security.
13. (a) Explain in detail about 1NF, 2NF, 3NF and BCNF with suitable
examples.
Or
(b) Describe about the Multi-Valued Dependencies and Fourth normal
form with suitable example.
14. (a) Discuss in detail about Transaction Recovery, System Recovery and
Media Recovery.
Or
(b) Write down in detail about Deadlock and Serializability.
15. (a) Construct a B+ tree to insert the following key elements (order of the
tree is 3) 5, 3, 4, 9, 7, 15, 14, 21, 22, 23.
Or
(b) Describe in detail about how records are represented in a file and
how to organize them in a file.
Solutions
Part A
2. Data models can be broadly classified into the following three categories:
•• Record-based data models
•• Object-based data models
•• Physical data models
4. (a) Since the view may not have all the attributes of the underlying tables,
insertion of a tuple into the view will insert tuples into the underlying
tables, with those attributes not participating in the view getting null
values. This may not be desirable, especially if the attribute in question
is part of the primary key of the table.
(b) If a view is a join of several underlying tables and an insertion
results in tuples with nulls in the join columns, the desired effect of the
insertion will not be achieved. In other words, an update to a view may
not be expressible at all as updates to base relations. For an explanatory
example, see the loaninfo updation example in Section 3.5.2.
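A minimal sketch of case (a), with illustrative table and column names: the view below projects away amount, so a row inserted through the view leaves a null amount in the underlying table:
CREATE TABLE loan (loan_number VARCHAR(10) PRIMARY KEY,
                   branch_name VARCHAR(20),
                   amount NUMERIC(10,2));
CREATE VIEW branch_loan AS SELECT branch_name, loan_number FROM loan;
INSERT INTO branch_loan VALUES ('Perryridge', 'L-307');  -- loan.amount becomes NULL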
A B C D
a1 b1 c1 d1
a1 b2 c2 d1
a1 b2 c3 d1
a2 b2 c3 d2
7. The SQL query processing components include:
•• Query Optimizer – translates SQL into an ordered expression of
relational DB operators (Select, Project, Join)
•• Query Executor – executes the ordered expression by running a
program for each operator, which in turn accesses records of files
Linear Hashing
•• This is another dynamic hashing scheme, an alternative to Extendible
Hashing.
•• LH handles the problem of long overflow chains without using a
directory, and handles duplicates.
•• Idea: Use a family of hash functions h0, h1, h2,...
Part B
11. (a)
Refer Qn 11 (b) from May/June 2013
11. (b)
[E-R diagram of the banking enterprise, built from the entities branch, account, depositor, employee, manager, worker, savings-account and checking-account, the relationships loan-branch, cust-banker and works-for (with an ISA hierarchy), and attributes such as branch-name, branch-city, assets, account-number, balance, access-date, payment-date, type, social-security, e-social-security, employee-name, dependent-name and telephone-number.]
12. (a)
RELATIONAL ALGEBRA
Relational algebra is a collection of operations to manipulate or access
relations. It is a procedural (or abstract) language with operations that
are performed on one or more existing relations to derive a result (another)
relation without changing the original relation(s). Furthermore,
relational algebra defines the complete scheme for each of the result
relations. Relational algebra consists of a set of relational operators. Each
operator has one or more relations as its input and produces a relation as
its output. Thus, both the operands and the results are relations, and so the
output from one operation can become the input to another operation.
The relational algebra is a relation-at-a-time (or set) language in which
all tuples, possibly from several relations, are manipulated in one
statement without looping. There are many variations of the operations
that are included in relational algebra. Originally eight operations were
proposed by Dr. Codd, but several others have been developed. These
eight operators are divided into the following two categories:
•• Set-theoretic operations.
•• Native relational operations.
Set-theoretic operations make use of the fact that tables are essentially
sets of rows. There are four set-theoretical operations, as shown in Table
4.3.
Table 4.3 Set-theoretic operations
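For example (the EMPLOYEE relation and its attributes are illustrative), the selection and projection operators can be combined to list the names of employees located in Chennai; the SQL statement below is the equivalent query:
π EMP-NAME (σ CITY = 'Chennai' (EMPLOYEE))
SELECT emp_name FROM employee WHERE city = 'Chennai';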
¬ = negation
∃ = existential quantifier (meaning 'there EXISTS'), used in formulae
that must be true for at least one instance
∀ = universal quantifier (meaning 'FOR ALL'), used in statements about
every instance
Tuple variables that are quantified by ∀ or ∃ are called bound variables.
Otherwise, they are called free variables.
Dr. Codd defined the well-formed formulas (WFFs) as follows:
•• Any term is a WFF.
•• If x is a WFF, so are (x) and ¬x. All free tuple variables in x remain free
in (x) and ¬x, and all bound tuple variables in x remain bound in (x)
and ¬x.
•• If x, y are WFFs, so are x ∧ y and x ∨ y. All free tuple variables in x
and y remain free in x ∧ y and x ∨ y.
•• If x is a WFF containing a free tuple variable T, then ∃T(x) and
∀T(x) are WFFs. T now becomes a bound tuple variable, but any
other free tuple variables remain free. All bound terms in x remain
bound in ∃T(x) and ∀T(x).
•• No other formulas are WFFs.
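As a small illustration (using the same illustrative EMPLOYEE relation as above), the first tuple relational calculus expression below retrieves the names of employees located in Chennai, and the second uses the existential quantifier over an assumed WORKS relation:
{ T.EMP-NAME | EMPLOYEE(T) ∧ T.CITY = 'Chennai' }
{ T.EMP-NAME | EMPLOYEE(T) ∧ ∃W (WORKS(W) ∧ W.EMP-NAME = T.EMP-NAME ∧ W.SALARY > 50000) }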
13. (a)
Refer Q. No. 13 (a) November/December 2012
13. (b)
MULTI-VALUED DEPENDENCIES (MVD) AND FOURTH
NORMAL FORM (4NF)
To deal with the problem of BCNF, R. Fagin introduced the idea of
multi-valued dependency (MVD) and the fourth normal form (4NF). A
multi-valued dependency (MVD) is a functional dependency where the
dependency may be to a set and not just a single value. It is defined as
X →→Y in relation R (X, Y, Z), if each X value is associated with a set
of Y values in a way that does not depend on the Z values. Here X and
Y are both subsets of R. The notation X →→Y is used to indicate that a
set of attributes of Y shows a multi-valued dependency (MVD) on a set
of attributes of X.
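A standard illustration (the relation is made up for this example): in COURSE(SUBJECT, TEACHER, BOOK), each subject is taught by a set of teachers and uses a set of books, and the set of teachers is independent of the set of books. Then SUBJECT →→ TEACHER and SUBJECT →→ BOOK hold while SUBJECT → TEACHER does not, so the relation is not in 4NF; decomposing it into SUBJECT_TEACHER(SUBJECT, TEACHER) and SUBJECT_BOOK(SUBJECT, BOOK) removes the redundancy, and both projections are in 4NF.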
14. (a)
TYPES OF DATABASE RECOVERY
In case of any type of failures, a transaction must either be aborted or
committed to maintain data integrity. Transaction log plays an important
role for database recovery and bringing the database in a consistent state
in the event of failure. Transactions represent the basic unit of recovery
in a database system. The recovery manager guarantees the atomicity
and durability properties of transactions in the event of failures. During
recovery from failure, the recovery manager ensures that either all the
effects of a given transaction are permanently recorded in the database
or none of them are recorded. A transaction begins with successful
execution of a <T, BEGIN> (begin transaction) statement. It ends with
successful execution of a COMMIT statement. The following two types
of transaction recovery are used:
•• Forward recovery.
•• Backward recovery.
since the last database copy was made. In this case, only the last one of
those changes at the point that the disk was destroyed needs to be used
in updating the database copy in the rolled-forward operation. Another
roll-forward variation is to record an indication of what the transaction
itself looked like at the point of being executed, along with other necessary
supporting information, instead of reading before and after images of
the data in the log.
Backward Recovery (or UNDO)
Backward recovery (also called roll-backward) is the recovery procedure,
which is used in case an error occurs in the midst of normal operation on
the database. The error could be a human error keying in a value, or a program
ending abnormally and leaving incomplete some of the changes to the database
that it was supposed to make. If the transaction had not committed at the time
of failure, it may cause inconsistency in the database because, in the
interim, other programs may have read the incorrect data and made use
of it. Then the recovery manager must undo (roll back) any effects of the
transaction on the database. The backward recovery guarantees the atomicity
property of transactions.
Fig. 13.2 illustrates an example of the backward recovery method. In case
of a backward recovery, the recovery is started with the database in its
current state and the transaction log is positioned at the last entry that
was made in it. Then a program reads 'backward' through the log, resetting
each updated data value in the database to its "before image" as recorded
in the log, until it reaches the point where the error was made. Thus,
the program ‘undoes’ each transaction in the reverse order from that in
which it was made.
Example 1
Roll-backward (undo) and roll forward (redo) can be explained with
an example as shown in Fig. 13.3 in which there are a number of
concurrently executing transactions T1, T2, ....., T6 . Now, let us assume
that the DBMS starts execution of transactions at time ts but fails at
time tf due to disk crash at time tc. Let us also assume that the data for
transactions T2 and T3 has already been written to the disk (secondary
storage) before failure at time tf.
It can be observed from Fig. 13.3 that transactions T1 and T6 had not
committed at the point of the disk crash. Therefore, the recovery manager
must undo the transactions T1 and T6 at the start. However, it is not clear
from Fig. 13.3 to what extent the changes made by the already
committed transactions T2, T3, T4 and T5 have been propagated to the database
on secondary storage. This uncertainty could be because the buffers may
or may not have been flushed to secondary storage. Thus, the recovery
manager would be forced to redo transactions T2, T3, T4 and T5 .
the write log (W, 1, A, 50, 20), the value 50 is the before image for the
balance column in this row and 20 is the after image for this column.
Now, let us assume that a system crash occurs immediately after the
operation W1(B, 80) has completed, in the sequence of events of Table
13.1. This means that the log entry (W, 1, B, 50, 80) has been placed
in the log buffer, but the last point at which the log buffer was written
out to disk was with the log entry (C, 2). This is the final log entry that
will be available when recovery is started to recover from the crash. At
this time, since transaction T2 has committed while transaction T1 has
not, we want to make sure that all updates performed by transaction T2
are placed on disk an that all updates performed by transaction T1 are
rolled back on disk. The final values for these data items after recovery
has been performed should be A = 50, B = 50, and C = 50, which are the
values they had just before Table 13.1.
After the crash, the system is reinitialised and a command is given to initiate
database recovery. The process of recovery takes place in two phases
namely (a) roll backward or ROLLBACK and (b) roll forward or ROLL
FORWARD. In the ROLLBACK phase, the entries in the sequential
log file are read in reverse order back to system start-up, when all data
access activity began. We assume that the system start-up happened
just before the first operation 1 R (A, 50) of transaction history. In the
ROLL FORWARD phase, the entries in the sequential log file are read
forward again to the last entry. During the ROLLBACK step, recovery
performs UNDO of all the updates that should not have occurred,
because the transaction that made them did not commit. It also makes a
list of all transactions that have committed. We have assumed here that
the ROLLBACK phase occurs first and the ROLL FORWARD phase
afterward, as is the case in most of the commercial DBMSs such as DB2,
System R of IBM.
Table 13.2 and 13.3 list all the log entries encountered and the actions
taken during ROLLBACK and ROLL FORWARD phases of recovery.
It is to be noted that the steps of ROLLBACK are numbered on the left
and the numbering is continued during the ROLL FORWARD phase of
table 13.3. During ROLLBACK the system reads backward through the
log entries of the sequential log file and makes a list of all transactions
that did and did not commit. The list of committed transactions is used in
the ROLL FORWARD, but the list of transactions that did not commit is
used to decide when to UNDO updates. Since the system knows which
transactions did not commit as soon as it encounters (reading backward)
the final log entry, it can immediately begin to UNDO write log changes
of uncommitted transactions by writing before images onto disk over
the row values affected. Disk buffering is used during recovery to read
in pages containing rows that need to be updated by UNDO or REDO
steps. An example of UNDO write is shown in step 4 of table 13.2. Since
the transaction responsible for the write log entry did not commit, it
should not have any transactional updates out on disk. It is possible that
some values given in the after images of these write log entries are not
out on disk. But, in any event it is clear that writing the before images
in place of these data items cannot hurt. Eventually, we return to the
value such data items had before any uncommitted transactions tried to
change them.
Table 13.3 ROLL FORWARD process for transaction history taking
place after ROLLBACK of table 13.2
During the ROLL FORWARD phase of table 13.3, the system simply
uses the list of committed transactions gathered during the ROLLBACK
phase as a guide to REDO updates of committed transactions that might
not have gotten out of disk. An example of REDO is shown in step 9
of table 13.3. At the end of this phase the data item would have the
right values. All updates of transactions that committed are applied and
all updates of transactions that did not complete are rolled back. It can
be noted that in step 4 of ROLLBACK of table 13.2, the value 50 is
written to the data item A and in step 9 of ROLL FORWARD of table
13.3, the value 50 is written to data item C. It can be recalled that the crash occurred just after the operation W1(B, 80) of the transaction history. Since the log entry for this operation did not get to the disk, as can be seen in Table 13.1, the before image of B cannot be applied during recovery. The update of B to the value 80 also did not get out to disk. Thus, the final values for the three data items mentioned in the original transaction history are A = 50, B = 50 and C = 50, which are the values just before the events of Table 13.1.
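The two phases can be illustrated with a small, self-contained sketch. Python is used here purely for illustration; the log format (W, txn, item, before, after) / (C, txn) follows the entries above, but the before image of C and the on-disk state at the crash are assumed values for the example, not figures taken from Table 13.1.

# Minimal sketch of the two-phase (ROLLBACK then ROLL FORWARD) recovery idea.
# Log entries: ("W", txn, item, before, after) for writes, ("C", txn) for commits.

def recover(log, disk):
    committed = set()

    # ROLLBACK phase: read the log backwards, collect committed transactions
    # and UNDO the writes of transactions that never committed.
    for entry in reversed(log):
        if entry[0] == "C":
            committed.add(entry[1])
        else:                               # ("W", txn, item, before, after)
            _, txn, item, before, _ = entry
            if txn not in committed:
                disk[item] = before         # restore the before image

    # ROLL FORWARD phase: read the log forwards and REDO the writes of
    # committed transactions, in case they never reached the disk.
    for entry in log:
        if entry[0] == "W":
            _, txn, item, _, after = entry
            if txn in committed:
                disk[item] = after          # reapply the after image
    return disk

# The log entries available at the crash, up to (C, 2); the before image of C
# is illustrative only, and so is the assumed on-disk state below.
log = [("W", 1, "A", 50, 20), ("W", 2, "C", 100, 50), ("C", 2)]
disk = {"A": 20, "B": 50, "C": 100}
print(recover(log, disk))                   # {'A': 50, 'B': 50, 'C': 50}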
Media Recovery
Media recovery is performed when there is a head crash on the disk (akin to a record being scratched by a phonograph needle). During a head crash, the data stored on the disk is lost. Media recovery is based on periodically
making a copy of the database. In the simplest form of media recovery,
before system start-up, bulk copy is performed for all disks being run
on a transactional system. The copies are made to duplicate disks or
to less expensive tape media. When a database object such as a file
or a page is corrupted or a disk has been lost in a system crash, the disk is replaced with a back-up disk, and the normal recovery process is performed. During this recovery, however, ROLLBACK is performed all the way back to system start-up, since one cannot depend on the backup disk having the updates that were forced out up to the last checkpoint. Then, ROLL FORWARD is performed from that point to the time of the system crash. Thus, the normal recovery process recovers all updates onto this backup disk.
14. (b)
Refer Q. No. 14. (a) from May/June 2013
15. (a)
Refer Qn. No. 15. (a) from May/June 2013
15. (b)
FILE ORGANISATION
A file organisation in a database system essentially is a technique of
physical arrangement of records of a file on secondary storage device.
It is a method of arranging data on secondary storage devices and
addressing them such that it facilitates storage and read/write (input/
output) operations of data or information requested by the user. The
organisation of data in a file is influenced by a number of factors that must be taken into consideration while choosing a particular technique. Some of these factors are as follows:
• Fast response time to access a record (data retrieval), transfer the data to main memory, and write or modify a record.
• High throughput.
• Intended use (type of application).
• Efficient utilisation of secondary storage space.
• Efficient file manipulation operations.
• Protection from failure or data loss (disk crashes, power failures and so on).
• Security from unauthorised use.
• Provision for growth.
• Cost.
The records of a file reside on one or several pages in secondary storage. Each record has a unique identifier called a record-id. A file can consist of:
(a) Fixed length records.
(b) Variable length records.
Fixed-length Records
In a file with fixed-length records, all records on the page are of the
same slot length. Record slots are uniform, and records are arranged
consecutively within a page. Every record in the file has exactly
the same size (in bytes). The below Figure (a) shows a structure of
PURCHASE record and the below Figure (b) shows number of records
in the PURCHASE record. As shown, all records are having same fixed
length of total 50 bytes, if we assume that each character occupies 1 byte
of space. That means, each record uses 50 bytes and occupies slots in the
page one after another in a serial sequence. A record is identified using
both page-id and slot number of the record.
PURCHASE record
The first operation is to insert records in the first available slots (or empty spaces). Whenever a record is deleted, the empty slot created by the deletion must be filled with some other record of the file. This can be achieved using a number of alternatives. The first alternative is that the record that came after the deleted record is moved into the empty space formerly occupied by the deleted record. This operation continues until every record following the deleted record has been moved ahead. Fig. (a) below shows an empty slot created by the deletion of record 5, whereas in Fig. (b) all the subsequent records, from record 6 onwards, have moved one slot upward. All empty slots appear together at the end of the page. Such an approach requires moving a large number of records, depending on the position of the deleted record in a page of the file.
Deletion operation on PURCHASE record
The second alternative is that only the last record is shifted into the empty slot of the deleted record, instead of disturbing a large number of records, as shown in Fig. (c). In both these alternatives it is undesirable to move records to occupy the empty slot of a deleted record, because doing so requires additional block accesses. As insertion of records is a more frequently performed operation than deletion, it is more appropriate to keep the empty slot of the deleted record vacant for a subsequent insertion of a record. Therefore, a third alternative is used in which the deletion of a record is handled by using an array of bits (or bytes), called the file header, at the beginning of the file, one per slot, to keep track of free (or empty) slot information. As long as a record is stored in a slot, its bit is ON; when the record is deleted, its bit is turned OFF. The file header thus tracks the free slots.
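A minimal sketch of this third alternative follows, assuming an in-memory page and illustrative class and method names (FixedLengthPage, insert, delete); a real DBMS would of course keep the bitmap in the file header on disk.

# Sketch of fixed-length record slots managed through a free-slot bitmap
# (the "file header" of the third alternative). Names are illustrative.

class FixedLengthPage:
    def __init__(self, num_slots):
        self.bitmap = [False] * num_slots   # False = slot is free (bit OFF)
        self.slots = [None] * num_slots

    def insert(self, record):
        """Store the record in the first free slot and return its slot number."""
        for slot, used in enumerate(self.bitmap):
            if not used:
                self.slots[slot] = record
                self.bitmap[slot] = True     # bit ON: slot is occupied
                return slot
        raise RuntimeError("page full")

    def delete(self, slot):
        """Turn the slot's bit OFF; no other record is moved."""
        self.bitmap[slot] = False
        self.slots[slot] = None

page = FixedLengthPage(num_slots=4)
r0 = page.insert(("KLY System", "ORD-1"))
r1 = page.insert(("Concept Shapers", "ORD-2"))
page.delete(r0)                              # the slot simply becomes free
print(page.insert(("Trinity Agency", "ORD-3")) == r0)   # True: slot reused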
In the reserved-space method for variable-length records, a field shorter than the maximum size is filled with a special null or end-of-record symbol. Fig. 3.12 shows the fixed-length representation of the file of
Fig. 3.11. As shown, suppliers KLY System, Concept Shapers and Trinity
Agency have maximum of two order numbers (ORD-NO). Therefore, the
PURCHASE-INFO array of PURCHASE-LIST record contains exactly
two records for maximum of two ORD-NO per supplier. The suppliers
with less than two ORD-NO will have records with null field (symbol
⊥) in the place of second ORD-NO. The reserved-space method is useful
when most records have a length close to the maximum. Otherwise, a
significant amount of space may be wasted.
Fig. 3.12 Reserved-space method of Fixed-length representation for
implementing variable-length records
10. When is it preferable to use a dense index rather than a sparse index?
Explain your answer.
PART–B (5 × 16 = 80 Marks)
11. (a) Discuss in detail about database system architecture with neat dia-
gram.
Or
(b) Draw an E-R diagram for a banking enterprise with almost all com-
ponents and explain.
12. (a) Explain in detail about Relational Algebra, Domain Relational Cal-
culus and Tuple Relational Calculus with suitable examples.
Or
(b) Briefly present a survey on Integrity and Security.
13. (a) Explain in detail about 1NF, 2NF, 3NF and BCNF with suitable
examples.
Or
(b) Describe about the Multi-Valued Dependencies and Fourth normal
form with suitable example
14. (a) Discuss in detail about Transaction Recovery, System Recovery and
Media Recovery.
15. (a) Construct a B+ tree to insert the following key elements (order of the
tree is 3) 5, 3, 4, 9, 7, 15, 14, 21, 22, 23.
Or
(b) Describe in detail about how records are represented in a file and
how to organize them in a file.
2. Data Models:-
Data models can be classified into four different categories:
• Relational Model.
• Entity-Relationship Model.
• Object-Based Data Model.
• Semi-structured Data Model.
5. result := {R};
done := false;
compute F+;
while (not done) do
    if (there is a schema Ri in result that is not in BCNF)
    then begin
        let α → β be a non-trivial functional dependency that holds on Ri
        such that α → Ri is not in F+, and α ∩ β = ∅;
        result := (result − Ri) ∪ (Ri − β) ∪ (α, β);
    end
    else done := true;
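A rough Python sketch of this loop follows. It uses attribute closures to find a violating dependency and, for simplicity, projects only the given dependencies rather than the full F+ that the algorithm requires, so it is an approximation for illustration, not the textbook algorithm verbatim.

# Sketch of BCNF decomposition driven by attribute closures. Illustrative only.

def closure(attrs, fds):
    """Closure of a set of attributes under the functional dependencies fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_decompose(relation, fds):
    result = [frozenset(relation)]
    done = False
    while not done:
        done = True
        for ri in result:
            # keep only the given dependencies whose attributes lie in Ri
            local = [(l, r & ri) for l, r in fds if l <= ri and (r & ri)]
            for alpha, beta in local:
                beta = beta - alpha                      # keep alpha and beta disjoint
                if beta and not ri <= closure(alpha, fds):   # alpha is not a key of Ri
                    result.remove(ri)
                    result += [frozenset(ri - beta), frozenset(alpha | beta)]
                    done = False
                    break
            if not done:
                break
    return result

fds = [(frozenset("A"), frozenset("B")), (frozenset("B"), frozenset("C"))]
print(bcnf_decompose(set("ABC"), fds))   # two schemas: {A, B} and {B, C}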
8.
• Ensures serializability.
• Because it produces only cascadeless schedules, recovery is very easy.
• A transaction always reads a value written by a committed transaction.
10. Using the dense index, we follow the pointer directly to the first "Perryridge" record. We process this record, and follow the pointer in that record to locate the next record in search-key order. We continue processing records until we encounter a record for a branch other than "Perryridge".
PART B
11. (a)
[Figure: database system structure. Users (naive users such as tellers, agents and web users; application programmers; sophisticated users/analysts; database administrator) issue DML queries and DDL statements. The query processor contains the compiler and linker, the DDL interpreter, the DML compiler and organizer, application program object code and the query evaluation engine. The storage manager contains the buffer manager, file manager, authorization and integrity manager and transaction manager.]
DML compiler:
Translates DML statements in a query language into an evaluation plan consisting of low-level instructions.
Query evaluation engine:
Executes the low-level instructions generated by the DML compiler.
Data Model:
It is a collection of concepts that can be used to describe the structure
of a database.
11. (b)
[E-R diagram for a banking enterprise: Branch (br-name, br-city, assets); Customer (cus-id, cus-name, cus-street, cus-city); Loan (loan-number, amount) linked to Branch via loan-branch; Account (acc-num, balance) linked to Customer via depositor (access-date); cust-banker relationship (type); Employee (emp-id, emp-name, dept-name, telephone-num) with a recursive manager/worker relationship.]
Select operation:
σ<selection condition>(R), where R is a relation.
The selection condition is of the form
<attribute name> <comparison operator> <constant value> or
<attribute name> <comparison operator> <attribute name>
Eg:- σ salary > 1000(EMPLOYEE)
Project operation:
It is used to select some of the columns from the tables and discard
the other column.
π<attribute list>(R)
Eg: π name, DNO(EMPLOYEE)
Rename operation:-
This operation can rename either the relation name or the attribute
names, or both.
ρ S(B1, B2, …, Bn)(R) = renames both the relation and its attributes
OR
ρ S(R) = renames only the relation
Division operation (÷):
It is suited to queries that include the phrase "for all".
Eg: Retrieve the names of employees who work on all the projects that "ABC" works on.
ABC ← σ ename = 'ABC'(EMPLOYEE)
ABC_PNOS ← π PNO(WORKS-ON ⋈ EEID = EID ABC)
EID_PNOS ← π EEID, PNO(WORKS-ON)
EIDS ← EID_PNOS ÷ ABC_PNOS
RESULT ← π ename(EIDS * EMPLOYEE)
In general, R(Z) ÷ S(X), where X ⊆ Z and Y = Z − X (i.e., Z = X ∪ Y).
Aggregate functions and grouping:
The script-F notation is
<grouping attributes> ℱ <function list>(R)
where the grouping attributes are attributes of the relation R, and the function list consists of (<function> <attribute>) pairs, <function> being a function name.
Relational calculus:-
It is a formal query language where we can write one declara-
tive expression to specify a retrieval request and hence there is no
description of how to retrieve it.
A calculus expression specifies what is to be retrieved rather than
how to retrieve it.
Relational calculus is considered to be non procedural language.
Tuple Relational calculus:-
Tuple variables and Range Relations:-
A tuple relational calculus is based on specifying a number of tuple
variables.
{t | COND(t)}
where t is a tuple variable and COND(t) is a conditional expression involving t.
Eg: {t | EMPLOYEE(t) and t.salary > 50,000}
Expressions and Formulas:-
A general expression of the tuple relational calculus is of the form
{t1.A1, t2.A2, …, tn.An | COND(t1, t2, …, tn, tn+1, …, tn+m)}
where t1, t2, …, tn+m are tuple variables, each Ai is an attribute of the relation over which ti ranges, and COND is a condition or formula built from atoms:
1. An atom of the form R(ti).
2. An atom of the form ti.A op tj.B, where op is one of the comparison operators {=, <, ≤, >, ≥, ≠}.
3. An atom of the form ti.A op c or c op tj.B, where c is a constant.
Existential and universal quantifiers:-
A tuple variable t is bound if it is quantified, meaning that it appears in an (∃t) or (∀t) clause; otherwise it is free.
• An occurrence of a tuple variable in a formula F that is an atom is free in F.
12. (b) It is a mechanism used to prevent invalid data entry into the table.
Types
Domain integrity constraints
Entity integrity constraints
Referential integrity constraints
Domain integrity constraints
Types
(i) Not Null constraint
(ii) Check constraint
(i) Not Null constraint
It is used to enforce that a particular column will not accept null values.
Eg: Create table employee (eid Number(5) constraint emp not null, ename Varchar(2));
Here emp is the constraint name (it is optional).
[Figure: 2NF decomposition of EMP-PROJ into EMP-PROJ 1 (eid, ename) and EMP-PROJ 2 (eid, Pnumber, Hours), with a further relation (Pnumber, Pname, PLOCATION); and decomposition of EMP-DEPT (ename, eid, DOB, address, Dnumber, Dname, DMGRid) into (ename, eid, DOB, address, Dnumber) and (Dnumber, Dname, DMGRid).]
Relational Decomposition:-
A single relation schema R = {A1, A2, …, An} includes all the attributes of the database. A decomposition of R is a set of relation schemas {R1, R2, …, Rm} such that
R1 ∪ R2 ∪ … ∪ Rm = R
[Fig. (a): decomposition of EMP into EMP-PROJECTS and EMP-DEPENDENTS]
EMP is not in 4NF, because in the non-trivial MVDs ENAME →→ PNAME and ENAME →→ DNAME, ENAME is not a super key of EMP.
Importance of 4NF:
Suppose the EMP relation has an additional employee BB who has three dependents (CC, DD, EE) and works on four different projects (A, B, C, D). There are then 16 tuples in EMP, but with the decomposition into (a) and (b) we need to store only 11 tuples in the two relations. Not only does the decomposition save storage, but the update anomalies associated with multivalued dependencies are also avoided.
Serializability:-
There are two types of serializability:
• Conflict serializability
• View serializability
Conflict Serializability:
Let us consider a schedule S in which there are two consecutive
instructions Ii and Ij of transactions Ti and Tj respectively
(i ≠ j)
(i) Ii = read(Q), Ij = read(Q). The order of Ii and Ij does not matter, since the same value of Q is read by Ti and Tj regardless of the order.
(ii) Ii = read(Q), Ij = write(Q). If Ii comes before Ij, then Ti does not read the value of Q that is written by Tj in instruction Ij. If Ij comes before Ii, then Ti reads the value of Q that is written by Tj. Thus, the order of Ii and Ij matters in S.
(iii) Ii = write(Q), Ij = read(Q). The order of Ii and Ij matters, for the same reason as in the previous case.
(iv) Ii = write(Q), Ij = write(Q). Since both instructions are write operations, the order of these instructions does not affect either Ti or Tj. However, the value obtained by the next read(Q) instruction of S is affected, since the result of only the latter of the two write instructions is preserved in the database.
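These four conflict cases are exactly what a precedence-graph test for conflict serializability is built on. The sketch below is illustrative only (the schedule encoding and function names are assumptions made here, not part of the answer), applied to the first schedule shown after it.

# Sketch of a conflict-serializability test: build a precedence graph from the
# conflicting pairs of cases (ii)-(iv) above and look for a cycle.

def precedence_graph(schedule):
    """schedule: list of (txn, op, item) with op in {'R', 'W'}."""
    edges = set()
    for i, (ti, opi, x) in enumerate(schedule):
        for tj, opj, y in schedule[i + 1:]:
            # conflict: same item, different transactions, at least one write
            if x == y and ti != tj and (opi == 'W' or opj == 'W'):
                edges.add((ti, tj))
    return edges

def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visited, stack = set(), set()

    def dfs(node):
        visited.add(node)
        stack.add(node)
        for nxt in graph.get(node, ()):
            if nxt in stack or (nxt not in visited and dfs(nxt)):
                return True
        stack.discard(node)
        return False

    return any(dfs(n) for n in graph if n not in visited)

# The first schedule below: T1 reads and writes A then B, interleaved with T2.
s = [("T1", "R", "A"), ("T1", "W", "A"), ("T2", "R", "A"), ("T2", "W", "A"),
     ("T1", "R", "B"), ("T1", "W", "B"), ("T2", "R", "B"), ("T2", "W", "B")]
edges = precedence_graph(s)
print(edges)                 # {('T1', 'T2')}
print(has_cycle(edges))      # False, so the schedule is conflict serializable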
Fig: A schedule showing only the read and write operations:
T1: read(A); T1: write(A); T2: read(A); T2: write(A); T1: read(B); T1: write(B); T2: read(B); T2: write(B).
Fig: After swapping a pair of instructions:
T1: read(A); T1: write(A); T2: read(A); T1: read(B); T2: write(A); T1: write(B); T2: read(B); T2: write(B).
T1: read(Q); T2: write(Q); T1: write(Q).
View serializability:
Schedules S and S′ are view equivalent if the following conditions hold:
(1) For each data item Q, if transaction Ti reads the initial value of Q in schedule S, then transaction Ti must, in schedule S′, also read the initial value of Q.
(2) For each data item Q, if transaction Ti executes read(Q) in schedule S, and that value was produced by a write(Q) operation executed by transaction Tj, then the read(Q) operation of transaction Ti must, in schedule S′, also read the value of Q that was produced by the same write(Q) operation of transaction Tj.
(3) For each data item Q, the transaction (if any) that performs the final write(Q) operation in schedule S must perform the final write(Q) operation in schedule S′.
T3: read(Q); T4: write(Q); T3: write(Q); T5: write(Q)
(view equivalent to the serial schedule <T3, T4, T5>, but not conflict serializable).
15. (a) A B+ tree is a data structure used in query processing in the database; tree pointers are used to search for key elements. Now we construct the B+ tree for the following elements (the order of the tree is 3): 5, 3, 4, 9, 7, 15, 14, 21, 22, 23.
[Figures: step-by-step construction of the order-3 B+ tree, inserting 5, 3, 4, 9, 7, 15, 14, 21, 22, 23 in turn; intermediate trees show leaves such as (3, 4), (5, 7), (9, 15) and (22, 23), with splits propagating to the internal nodes.]
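Once such a tree exists, a search simply follows one root-to-leaf path. The sketch below is illustrative only: a hand-built two-level tree over a few of the keys above, not the full tree produced by the construction, and the class and function names are assumptions.

# Sketch of searching a B+ tree of order 3: internal nodes route the search,
# the leaves hold the keys.

class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted keys
        self.children = children          # None for a leaf node

def search(node, key):
    while node.children is not None:      # descend until a leaf is reached
        i = 0
        while i < len(node.keys) and key >= node.keys[i]:
            i += 1
        node = node.children[i]
    return key in node.keys

leaves = [Node([3, 4]), Node([5, 7]), Node([9, 15])]
root = Node([5, 9], leaves)
print(search(root, 7), search(root, 8))   # True False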
[Figure: a file of account records with a free-list header. Record 0: (A-102, Perryridge, 400); record 2: (A-215, Mianus, 700); record 3: (A-101, Downtown, 500); record 5: (A-201, Perryridge, 900); records 1, 4 and 6 are free.]
Fig: The depositor relation:
Customer-name   Account-number
Hayes           A-102
Hayes           A-220
Hayes           A-503
Hayes           A-305
PART B – (5 × 16 = 80 marks)
11. (a) (i) Construct an E-R diagram for a car-insurance company whose
customers own one or more cars each. Each car has associated
with it zero to any number of recorded accidents. State any
assumptions you make. (6)
(ii) A university registrar’s office maintains data about the following
entities:
1. Courses, including number, title, credits, syllabus, and pre-
requisites,
2. Course offerings, including course number, year, semester,
section number, instructor, timings, and classroom;
3. Students, including student-id, name, and program; and
4. Instructors, including identification number, name, depart-
ment, and title. Further, the enrollment of students in
courses and grades awarded to students in each course they
are enrolled for must be appropriately modeled. Construct
an E-R diagram for the registrar’s office. Document all
assumptions that you make about the mapping constraints.
(10)
Or
(b) (i) With a neat-sketch discuss the three-schema architecture of a
DBMS, (8)
(ii) What is aggregation in an ER model? Develop an E-R diagram
using aggregation that captures the following information:
Employees work for projects. An employee working for a par-
ticular project use various machinery. Assume necessary attri-
butes. State any assumptions you make. Also discuss about the
ER diagram you have designed. (2 + 6)
12. (a) (i) Explain the distinctions among the terms primary key, candidate
key, and super key. Give relevant examples. (6)
(ii) What is referential integrity? Give relevant example. (4)
(iii) Consider the following six relations for an Order-processing
Database
Application in a Company:
CUSTOMER (CUSTNO, CNAME, CITY)
ORDER (ORDERNO, ODATE, CUSTNO, ORD_AMT)
ORDER_ITEM (ORDERNO, ITEMNO, QTY)
ITEM (ITEMNO, ITEM_NAME, UNIT_PRICE)
SHIPMENT (ORDERNO, ITEMNO, WAREHOUSENO,
SHIP_DATE)
WAREHOUSE (WAREHOUSENO, CITY)
Here, ORD_AMT refers to total amount of an order; ODATE is
the date the order was placed: SHIP_DATE is the date an order
is shipped from the warehouse. Assume that an order can be
shipped from several warehouses. Specify the foreign keys for
this schema, stating any assumptions you make. (6)
(b) With relevant examples discuss the various operations in Relational
Algebra. (16)
13. (a) Define a functional dependency. List and discuss the six inference
rules for functional dependencies. Give relevant examples.(16)
Or
(b) (i) Give a set of Functional dependencies for the relation schema
R(A,B,C,D,E) with primary key AB under which R is in 2NF, but
not in 3NF. (5)
(ii) Prove that any relation schema with two attributes is in
BCNF. (5)
(iii) Consider a relation R that has three attributes ABC. It is
decomposed into relations R1, with attributes AB and R2 with
attributes BC. State the definition of lossless-join decomposition
with respect to this example, Answer this question concisely by
writing a relational algebra equation involving R,R1, and R2. (6)
14. (a) (i) Define a transaction. Then discuss the following with relevant
examples: (8)
1. A read only transaction
2. A read write transaction
3. An aborted transaction
(ii) With a neat sketch discuss the states a transaction can be in. (4)
(iii) Explain the distinction between the terms serial schedule and
serializable schedule. Give relevant example. (4)
(b) (i) Discuss the ACID properties of a transaction. Give relevant
example. (8)
(ii) Discuss two phase locking protocol. Give relevant example. (8)
15. (a) (i) When is it preferable to use a dense index rather than a sparse
index? Explain your answer. (4)
(ii) Since indices speed query processing, why might they not be
kept on several search keys? List as many reasons as possible.
(6)
(iii) Explain the distinction between closed and open hashing. Dis-
cuss the relative merits of each technique in database applica-
tions. (6)
Or
(b) Diagrammatically illustrate and discuss the steps involved in pro-
cessing a query. (16)
Solutions
PART A
2. A derived attribute is one that represents a value derivable from the value of a related attribute or set of attributes.
Eg: the age attribute can be derived from the Date of Birth attribute.
4. 1 Open
2 Fetch
5. Yes, the above relation is in second normal form because it is in first normal form and there is no partial dependency.
6. No, the above relation is not in third normal form, because it contains a transitive dependency.
7. 1. Commit: It saves all transactions that have not already been saved to
the database since the last commit or rollback command was issued.
2. Rollback: It is used to undo transactions that have not already been
saved to the database.
3. Savepoint: It establishes a point back to which you may later roll back.
4. Set Transaction: It establishes properties for the current transaction.
8. 1. Lock-Based Protocols:
To ensure serializability, data items must be accessed in a mutually exclusive manner, e.g. the two-phase locking protocol.
2. Timestamp-Based Protocols:
These ensure serializability by selecting an ordering among transactions in advance using timestamps.
PART B
11. (a) (i) [E-R diagram for the car-insurance company: Driver (driver-id, name, address), Car (license, model, year), Accident (report-no, date, location, damage-amount); customers own cars, and cars participated in accidents.]
(ii) [E-R diagram fragment for the registrar's office: course offerings with section number, a "taught by" relationship to Instructor and a "has" relationship to Course.]
Tables used:
1. University (name, address, award)
2. Student (student-id, name, program)
3. Courses (title, credits, syllabus, prerequisites)
4. Course offerings (course number, year, semester, section number, instructor, timings, classroom)
5. Instructor (identification number, name, department, title)
11. (b) (i) A database system is partitioned into modules that deal with each of the responsibilities of the overall system.
Storage Manager:-
A storage manager is a program module that provides the inter-
face between the low level data stored in the database and the
application programs and queries submitted to the system.
• The storage manager is responsible for the interaction with
the file manager.
• The storage manager translates the various DML statements
into low level file system commands.
Components of Storage Manager:
Authorization and Integrity Manager:
It tests for satisfaction of various integrity constraints and checks
the authority of users accessing the data.
[Figure: components of the query processor (DML compiler and organizer, query evaluation engine, application program object code) and of the storage manager (buffer manager, file manager, authorization and integrity manager, transaction manager).]
Transaction Manager:
It ensures that the database remains in a consistent state despite system failures, and that concurrent transaction executions proceed without conflict.
File Manager:
It manages the allocation of space on disk storage and the data
structures used to represent information stored on disk.
Buffer Manager:
It is responsible for fetching data from disk storage into main
memory & deciding what data to cache in main memory.
Data files:
which store the database itself.
Data Dictionary:
It contains metadata that is about data.The schema of a table is an
example of metadata.
Indices:
which provides fast access to data items that hold particular
values.
The query Processor:
It helps the database system to simplify and facilitate access to
data.
DDL Interpreter:
which interprets DDL statements and records the definitions in
the data dictionary
DML Compiler:
which translates DML statements in a query language into an
evaluation plan consisting of low level instructions that query
evaluation engine understands.
Query Evaluation Engine:
which executes low level instructions generated by the DML
compiler.
11. (b) (ii) [E-R diagram using aggregation: the works-for relationship between Employee and Project is aggregated and related to Machinery through a "uses" relationship.]
12. (a) (i)
• A key is a set of attributes that allows us to distinguish entities from each other.
• A key also helps to uniquely identify relationships.
Super key:
A super key is a set of one or more attributes that allows us to
identify uniquely an entity in the entity set.
Eg Roll_No attribute of the entity set ‘student’ distinguishes
one student entity from another.
Candidate Key:
A super key may contain extraneous attributes, and we are often interested in the smallest super key. A super key for which no proper subset is a super key is called a candidate key.
Eg: {stu_name, stu_street} uniquely identifies a student; Roll_no and {stu_name, stu_street} are both candidate keys.
12. (a) (ii) Referential integrity means that "a value that appears in one relation for a given set of attributes also appears for a certain set of attributes in another relation".
→ The relation schema for a weak entity set must include the primary key of the entity set on which it depends.
→ The relation schema for a weak entity set therefore includes a foreign key, which leads to a referential integrity constraint.
Eg: create table Deposit (B_name char(15), Acc_no char(10), cust_name char(20) not null, Bal integer, primary key (Acc_no, cust_name), foreign key (B_name) references Branch, foreign key (cust_name) references Customer);
12. (a) (iii) An order can be shipped from several warehouses, so in SHIPMENT the combination of ORDERNO, ITEMNO and WAREHOUSENO identifies each shipment. The foreign keys for this schema are:
1. CUSTNO in ORDER references CUSTOMER.
2. ORDERNO in ORDER_ITEM and in SHIPMENT references ORDER.
3. ITEMNO in ORDER_ITEM and in SHIPMENT references ITEM.
4. WAREHOUSENO in SHIPMENT references WAREHOUSE.
Consider 2 relations
Depositor
Cust-name city
Hayes Pune
Johnson Mumbai
Jones Solapur
Lindsay Nashik
Smith Pune
Turner Mumbai
Borrower
Cust-name city
Adamar Mumbai
curry Pune
Hayes Pune
Jackson Solapur
Jones Solapur
Smith Pune
William Kolhapur
Union:
→ operation denoted by ∪
→ It includes all the tuples that are in either depositor or borrower (or both).
Result
Depositor ∪ Borrower
Cust-name city
Hayes Pune
.
.
.
.
Williams Kolhapur
Intersection:
The result of ∩ is a relation that includes all tuples that are in both depositor and borrower.
O/P
Depositor ∩ Borrower
Cust - name City
Hayes Pune
Jones Solapur
Smith Pune
Difference:
→ denoted by depositor − borrower
→ the result contains all tuples that are in depositor but not in borrower.
Depositor-Borrower
Cust-name City
Johnson Mumbai
Lindsay Nashik
Turner Mumbai
Cartesian Product:
→ also known as CROSS PRODUCT or CROSS JOIN, denoted by '×'.
→ the result relation has one tuple for each combination of tuples from the participating relations.
Pub_Info
Pub_Code Name
P001 MCGRAW
P002 PHI
P003 Pearson
BOOK_INFO
BOOK-ID Title
B001 DBMS
B002 Compiler
Pub_Info × book_info
Pub_Code Name Book_ID Title
P001 MCGRAW B001 DBMS
P002 PHI B001 DBMS
P003 Pearson B001 DBMS
P001 MCGRAW B002 Compiler
P002 PHI B002 Compiler
P003 Pearson B002 Compiler
Select operation:
The select operation selects tuples that satisfy a given predicate.
→ represented as σ<selection condition>(R)
Query: Display details of account holders living in the city 'Pune'
σ city = 'Pune'(depositor)
The project operation:
→ The project operation selects certain columns from a table while discarding others.
π<attribute list>(R)
Eg Query: π name(borrower)
Rename Operation:
→ we can rename either the relation or the attributes, or both.
ρ S(new attribute names)(R)
Eg: ρ Temp(Bname, Aname, Pyear, Bprice)(BOOK)
→ the attributes are renamed.
13. (a)
• Functional dependencies differentiate good database designs from bad ones.
• A functional dependency is a type of constraint that is a generalization of the notion of a key.
Consider a relation schema R. Let α ⊆ R and β ⊆ R. The functional dependency α → β holds on R if, for all pairs of tuples t1 and t2 in r such that t1[α] = t2[α], it is also the case that t1[β] = t2[β].
K is a super key of R if K → R, i.e., K is a super key if, whenever t1[K] = t2[K], it is also the case that t1[R] = t2[R] (i.e., t1 = t2).
Examples of functional dependencies:
Loan-no → Branch-name
Loan-no → Amount
There is no functional dependency Loan-no → Customer-name, as a given loan can be made to more than one customer.
Uses:
→ To test relations to see whether they are legal under a given set of
functional dependencies
→ To specify constraints on the set of legal relations.
The six rules:
1. Armstrong's Axioms:
(i) Reflexivity rule: If α is a set of attributes and β ⊆ α, then α → β holds.
(ii) Augmentation rule: If α → β holds and γ is a set of attributes, then γα → γβ holds.
(iii) Transitivity rule: If α → β holds and β → γ holds, then α → γ holds.
2. Additional rules:
(i) Union rule: If α → β holds and α → γ holds, then α → βγ holds.
(ii) Decomposition rule: If α → βγ holds, then α → β holds and α → γ holds.
(iii) Pseudotransitivity rule: If α → β holds and γβ → δ holds, then αγ → δ holds.
Eg: R = (A, B, C, G, H, I)
F = {A → B, A → C, CG → H, CG → I, B → H}
Some members of F+ are:
* A → H
Proof: A → B and B → H, so by the transitivity rule, A → H.
* CG → HI
Proof: CG → H and CG → I, so by the union rule, CG → HI.
* AG → I
Proof: A → C and CG → I, so by the pseudotransitivity rule, AG → I.
Alternatively, from A → C, by the augmentation rule, AG → CG; together with CG → I, by the transitivity rule, AG → I.
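The same memberships of F+ can be checked mechanically with an attribute-closure computation. The sketch below is illustrative only; the closure function and the string encoding of attribute sets are assumptions made for the example.

# Sketch: compute the closure of a set of attributes under F and use it to
# check members of F+, e.g. that A -> H and AG -> HI follow from F.

def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

F = [("A", "B"), ("A", "C"), ("CG", "H"), ("CG", "I"), ("B", "H")]
print(closure("A", F))               # {'A', 'B', 'C', 'H'} (order may vary)
print("H" in closure("A", F))        # True, so A -> H holds
print(set("HI") <= closure("AG", F)) # True, so AG -> HI holds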
Canonical cover:
A canonical cover Fc for F is a set of dependencies such that F logically implies all dependencies in Fc, and Fc logically implies all dependencies in F. Fc must satisfy:
• No functional dependency in Fc contains an extraneous attribute.
• Each left side of a functional dependency in Fc is unique.
Eg: consider the following set F of functional dependencies on schema (A, B, C):
A → BC
B → C
A → B
AB → C
The canonical cover Fc is computed as follows:
A → BC and A → B are combined into A → BC.
A is an extraneous attribute in AB → C, because B → C is already in F, so (F − {AB → C}) ∪ {B → C} logically implies AB → C.
13. (b) (i) A relation schema is in second normal form when it is in first normal form and no non-key attribute is partially dependent on the primary key; every non-key attribute must depend on the whole key (A, B).
For R(A, B, C, D, E) with primary key AB to be in 2NF but not in 3NF, every non-key attribute must fully depend on AB, while some non-key attribute depends transitively on the key through another non-key attribute. For example, the set of functional dependencies {AB → C, AB → D, AB → E, D → E} keeps R in 2NF (there are no partial dependencies) but violates 3NF, since E depends transitively on AB through the non-key attribute D.
∴ Relation R is in 2NF but not in 3NF.
[Tables: (A#, Title, Royalty): (A1, T1, 5000), (A2, T2, 7000); (A#, Aname, Aquali): (A1, John, Ph.D), (A2, Tom, M.Tech).]
[Figure: transaction state diagram with states active, partially committed, committed, failed and aborted.]
T1: read(A); A := A − 50; write(A); read(B); B := B + 50; write(B).
T2: read(A); temp := A * 0.1; A := A − temp; write(A); read(B); B := B + temp; write(B).
Serializable schedule:
A serializable schedule is a schedule that has the same effect on the database as some serial schedule.
Eg:
[Schedule over T1, T2 and T3 showing only the operations on Q: T1 reads and writes Q, and T2 and T3 each write Q.]
14. (b) (i) Atomicity: either all operations of the transaction are reflected properly in the database, or none are.
Eg: before execution of the transaction, the values of A and B are 1000 and 2000. If a failure occurs in the middle (i.e. after write(A) but before write(B)), then A = 950 and B = 2000, and the database is inconsistent; ensuring atomicity avoids this.
Advantage:
* Ensures conflict serializability.
Disadvantage:
(i) Two-phase locking does not ensure freedom from deadlock, e.g.:
T3: lock-X(B); read(B); B := B − 50; write(B).
T4: lock-S(A); read(A); lock-S(B)  (T4 waits for T3).
T3: lock-X(A)  (T3 waits for T4); deadlock results.
15. (b)
• Query processing refers to the range of activities involved in
extracting data from a database.
• It includes translation of queries in high-level database languages into expressions that can be used at the physical level of the file system.
[Figure: steps in query processing. The parser and translator turn the query into a relational algebra expression; the optimizer, using statistics about the data, produces a query evaluation plan, e.g. π balance(σ balance < 2500(account)).]
• To implement the preceding selection, we can search every tuple in account to find tuples with balance less than 2500.
• If a B+ tree index is available on the attribute balance, we can use the index instead to locate the tuples.
• A relational algebra operation annotated with instructions on how to evaluate it is called an evaluation primitive.
• A sequence of primitive operations that can be used to evaluate a query is a query evaluation plan.
• Different evaluation plans for a query can have different costs.
4. What is the difference between tuple relational calculus and domain rela-
tional calculus?
PART B – (5 × 16 = 80 marks)
11. (a) (i) With a neat diagram, explain the structure of a DBMS. (9)
(b) (i) What are the relational algebra operations supported in SQL?
Write the SQL statement for each operation. (12)
13. (a) (i) Explain 1NF, 2NF, 3NF, and BCNF with suitable example. (8)
(b) What are the pitfalls in relational database design? With a suitable
example, explain the role of functional dependency in the process of
normalization. (16)
14. (a) (i) Explain about immediate update and deferred update recovery
techniques. (8)
15. (a) (i) List the different levels in RAID technology and explain its fea-
tures. (12)
(b) (i) Explain the various indexing schemes used in database environ-
ment. (12)
4.
6. It is not in third normal form, because it contains a transitive dependency.
9. Advantage:
Records are stored in sequential order, according to the value of a “search
key” of each record.
Disadvantage:
Over flow occurs in indexed sequential file organization.
10. Database tuning describes a group of activities used to optimize and ho-
mogenize the performance of a database.
• Goal is to maximize use of system resources to perform work.
PART B
11. (a) (i)
[Figure: database system structure. Users (naive users, application programmers, sophisticated users/analysts, database administrator), the query processor (DML compiler and organizer, application program object code, query evaluation engine) and the storage manager (buffer manager, file manager, authorization and integrity manager, transaction manager).]
A database system is partitioned into modules that deal with each of the responsibilities of the overall system.
Storage Manager:
• A Storage Manager is a program module that provides the interface
between the low level data stored in the database and the application
programs and queries submitted to the system.
• The Storage Manager is responsible for the interaction with the file
manager.
DDL Interpreter:
Which interprets DDL statements and records the definitions in the data
dictionary
DML compiler:
Which translates DML statements in a query language into an evalua-
tion plan consisting of low level instructions that query evaluation engine
understands.
Query Evaluation Engine:
Which executes low level instructions generated by the DML compiler.
[E-R diagram fragment: Customer (cus-name, cus-street, cus-city), Loan (loan-no, amount), Account (acc-no, balance), Branch.]
11. (b) (i) Technically, both of them support the basic features necessary for
data access. For example both of them ensure
• Data is managed to ensure its integrity and quality
• Allow shared access by a community of users
• Use of well defined schema for data-access
• Support a query language
But, file-systems seriously lack some of the critical features neces-
sary for managing data. Lets take a look at some of these feature.
Transaction support
Atomic transactions guarantee complete failure or success of an
operation. This is especially needed when there is concurrent ac-
cess to same data-set. This is one of the basic features provided
by all databases.
But most file-systems don't have this feature. Only the lesser known file-systems, such as Transactional NTFS (TxF), Sun ZFS and Veritas VxFS, support it. Most of the popular open-source file-systems (including ext3, xfs, reiserfs) are not even POSIX compliant.
Fast Indexing
Databases allow indexing based on any attribute or data-property
(i.e. SQL columns). This helps fast retrieval of data, based on
the indexed attribute. This functionality is not offered by most
file-systems i.e. you can’t quickly access “all files created after
2PM today”.
The desktop search tools like Google or MAC spotlight of-
fer this functionality. But for this, they have to scan and index
the complete file-system and store the information in a internal
relational-database.
Snapshots
Snapshot is a point-in-time copy/view of the data. Snapshots are
needed for backup applications, which need consistent point-in-
time copies of data.
The transactional and journaling capabilities enable most databases to offer snapshots without stopping access to the data. Most file-systems, however, don't provide this feature (ZFS and VxFS being the only exceptions). Backup software has to depend on either the running application or the underlying storage for snapshots.
Clustering
Advanced databases like Oracle (and now MySQL) also offer
clustering capabilities.The “g” in “Oracle 11g” actually stands
for “grid” or clustering capability. MySQL offers shared-nothing
clusters using synchronous replication. This helps the databases
scale up and support larger & more-fault tolerant production en-
vironments.
File systems still don’t support this option . The only excep-
tions are Veritas CFS and GFS (Open Source).
Replication
Replication is a commodity feature with databases and forms the basis for disaster-recovery plans. File-systems still have to evolve to handle it.
Relational View of Data
File systems store files and other objects only as a stream of bytes,
and have little or no information about the data stored in the files.
Such file systems also provide only a single way of organizing the
files, namely via directories and file names. The associated attri-
butes are also limited in number, e.g. type, size, author, creation time. This does not help in managing related data, as disparate items do not have any relationships defined.
Databases, on the other hand, offer easy means to relate stored data. They also offer a flexible query language (SQL) to retrieve the
11. (b) (ii) A major purpose of a database system is to provide users with
an abstract view of data.
Physical Level:
The lowest level of abstraction describes how the data are
actually stored.
The physical level describes complex low level data
structures in detail.
Logical Level:
Next higher level of abstraction describes what data are
stored in the database.
View Level:
The highest level of abstraction describes only part of the entire database. Even though the logical level uses simpler structures, complexity remains because of the variety of information stored in a large database.
[Figure: the three levels of data abstraction. The view level (view 1, view 2, …, view n) sits above the logical level, which sits above the physical level.]
11. (b) (iii) Schema definition: DBA creates the original database schema
by executing a set of data definition statements in DDL.
Storage structure and access-method definition.
Schema and physical-organization modification:
The DBA carries out changes to the schema and physical organization to reflect the changing needs of the organization.
Granting of authorization for data access:
By granting different types of authorization, the database admin
can regulate which parts of the database various users can access.
Routine Maintenance:
→ Periodically backing up the database.
→ Ensuring that enough free disk space is available.
→ Monitoring jobs running on the database.
Depositor Borrower
Cust-name city Cust-name City
Hayes Pune Adamas Mumbai
Johnson Mumbai Carry Pune
Jones Solapur Hayes Pune
Lindsay Nashik Jackson Solapur
Smith Pune Jones Solapur
Turner Mumbai Smith Pune
Willians Kolhapur
Union
• Operation denoted by U.
• It includes all the tuples that are in either depositor or borrower (or both).
Result
Depositor ∪ Borrower
Cust-name City
Hayes Pune
.
.
.
.
Willians Kolhapur
Intersection:
→ The result of ∩ is a relation that includes all tuples that are
in both depositor & borrower.
Depositor ∩ Borrower
Cust-name City
Hayes Pune
Jones Solapur
Smith Pune
Difference:
→ denoted by depositor-borrower.
→ the result contains all tuples that are in depositor but not in borrower.
Depositor-Borrower
Cust-name City
Johnson Mumbai
Lindsay Nashik
Turner Mumbai
Cartesian product:
→ also known as CROSS PRODUCT or CROSS JOIN, denoted by '×'.
→ The result relation has one tuple for each combination of tuples from the participating relations.
Pub-Info:
Pub-code   Name
P001       McGraw
P002       PHI
P003       Pearson

Book-Info:
Book-ID    Title
B001       DBMS
B002       Compiler

Pub-Info × Book-Info:
Pub-code   Name      Book-ID   Title
P001       McGraw    B001      DBMS
P002       PHI       B001      DBMS
P003       Pearson   B001      DBMS
P001       McGraw    B002      Compiler
P002       PHI       B002      Compiler
P003       Pearson   B002      Compiler
Select operation:
→ The select operation selects tuples that satisfy a given predicate.
→ Represented as σ<selection condition>(R)
Rename operation:
→ We can rename either the relation or the attributes, or both.
ρ S(new attribute names)(R)
Eg: ρ Temp(Bname, Aname, Pyear, Bprice)(BOOK)
→ The attributes are renamed.
12 (b) (ii) "Integrity constraints guard against accidental damage to the database, by ensuring that authorized changes to the database do not result in a loss of data consistency." This is also called data integrity.
Constraints types:
Domain constraint:
→ the most elementary form of constraint.
→ tests queries to ensure that the comparisons make sense.
→ new domains can be created.
Null value constraint:
If a row lacks a data value for a particular column, that value is said to be null.
Primary key constraint:
A primary key is one or more columns in a table used to uniquely identify each row in the table. Primary key values must not be null.
[Tables: sample rows (Research, 5, 100, B), (Research, 5, 100, C), (Administration, 4, 200, D), (Head, 1, 400, E); and the 2NF/3NF decomposition of EMP-PROJ into EMP-PROJ 1 (eid, ename) and EMP-PROJ 2 (eid, Pnumber, Hours), with (Pnumber, Pname, PLOCATION), and of EMP-DEPT into (ename, eid, DOB, address, Dnumber) and (Dnumber, Dname, DMGRid).]
Relational Decomposition:
A single relation schema R = {A1, A2, …, An} includes all the attributes of the database. A decomposition of R is a set of relation schemas {R1, R2, …, Rm} such that
R1 ∪ R2 ∪ … ∪ Rm = R
13. (a) (ii) As shown in the figure, all the attributes depend on attributes A and B; hence the key of relation R is (A, B).
[Dependency diagram for R over attributes A, B, C, D, E, F, G, H, I, J.]
3NF:
3NF removes transitive dependencies. The dependency diagrams for R1 and R2 are shown above.
Thus R1 is divided into the following relations:
(i) R3 (A, D, I)
(ii) R4 (A, E)
(iii) R5 (A, D, J).
R2 is divided into following relations to avoid transitive depen-
dency.
(i) R6 (B, F, G)
(ii) R7 (B, F, H)
13. (b) Pitfalls in Relational Database Design
Creating an effective design for a relational database is a key element in
building a reliable system. There is no one “correct” relational database
design for any particular project, and developers must make choices
to create a design that will work efficiently. There are a few common design pitfalls that can harm a database system. Watching out for these errors at the design stage can help to avoid problems later on.
Careless Naming Practices
• Choosing names is an aspect of database design that is often ne-
glected but can have a considerable impact on usability and future
development. To avoid this, both table and column names should
be chosen to be meaningful and to conform to the established
conventions, ensuring that consistency is maintained throughout
a system. A number of conventions can be used in relational da-
tabase names, including the following two examples for a record
storing a client name: “client _ name” and “clientName”.
Lack of Documentation
• Creating documentation for a relational database can be a vital
step in safeguarding future development. There are different lev-
els of documentation that can be created for databases, and some
database management systems are able to generate the documen-
tation automatically. For projects where formal documentation is
not considered necessary, simply including comments within the
SQL code can be helpful.
Failure to Normalize
• Normalization is a technique for analyzing, and improving on,
an initial database design. A variety of techniques are involved,
including identifying features of a database design that may
FUNCTIONAL DEPENDENCIES
A functional dependency occurs between two attributes in a data-
base, A and B, if there exists a relationship such that for each value
of A there is only one corresponding value of B (A → B). This can be
extended to a functional dependency where A may be a set of tuples
(x, y, z) that correspond to a single value B([x, y, z] → B). In simple
mathematical terms the functional dependency must pass the vertical
line test for proper functions.
Normalization of a relation database means that the relation
(tables) in the database conform to a set of rules for a certain form
(First through Sixth Normal Form [1NF-6NF] and/or Boyce-Codd Normal Form [BCNF]). The higher the normal form of a table, the less
vulnerable it is to data inconsistency and data anomalies formed
during updates, inserts, and deletes. Normalization often reduces
data redundancy in a database which reduces data inconsistency
and anomaly risks. Normalizing a database requires analysis of the
closure of the set of functional dependencies to ensure that the set
complies with the rules for the given normal form. If the table does
not comply with the rules then the table is split following specific
procedures to achieve the desired normal form. Every table in a da-
tabase has a normal form and to make a statement that a database is
in a certain normal form (ex. 3NF) means that every table complies
with the rules for 3NF.
14. (a) (ii) Serializability is the generally accepted "criterion for correctness" for the interleaved execution of a set of transactions; that is, such an execution is considered to be correct if and only if it is serializable. A given execution of a given set of transactions is serializable, and therefore correct, if and only if it is equivalent to some serial execution of the same transactions.
Serial execution:
It is one in which the transactions are run one at a time in some
sequence.
Guaranteed:
It means that the given execution and the serial one always produce the same result as each other, no matter what the initial state of the database might be.
Individual transactions are assumed to be correct; that is, they are assumed to transform a correct state of the database into another correct state. Running the transactions one at a time in any serial order is therefore also correct. Let I be an interleaved schedule involving some set of transactions t1, t2, …, tn. If I is serializable, then there exists at least one serial schedule S involving t1, t2, …, tn such that I is equivalent to S; S is said to be a serialization of I.
14. (b) (i) This protocol requires that each transaction issue lock and unlock
requests in 2 phase.
Growing phase:
In this phase, a transaction may obtain locks, but may not re-
lease any lock.
Shrinking phase:
• A transaction may release locks, but may not obtain any
new locks.
• Initially a transaction is in the growing phase.
• Once the transaction releases a lock, it enters the shrinking phase, and it cannot issue any more lock requests.
Transactions T1 and T2 are not two-phase, while T3 is two-phase.
T3: lock – X (B);
read (B);
B: = B − 50;
write (B);
lock − X (A);
read (A);
A: = A + 50;
write (A);
unlock (B);
unlock (A);
Advantage:
* Ensures conflict serializability.
Disadvantage:
(i) Two-phase locking does not ensure freedom from deadlock, e.g.:
T3: lock-X(B); read(B); B := B − 50; write(B).
T4: lock-S(A); read(A); lock-S(B)  (T4 waits for T3).
T3: lock-X(A)  (T3 waits for T4); deadlock results.
(ii) Cascading rollbacks may occur under two-phase locking. To avoid this:
(1) The strict two-phase locking protocol: all exclusive-mode locks taken by a transaction are held until the transaction finishes.
(2) The rigorous two-phase locking protocol: all locks are held until the transaction commits.
14. (b) (ii) One approach ensures that no cyclic waits can occur, by ordering the requests for locks or by requiring all locks to be acquired together.
Disadvantage:
(a) It is hard to predict before the transaction begins, what data
items need to be locked.
(b) Data item utilization may be very low, since many of the
data items may be locked but unused for a long time.
• Second approach for dead lock prevention is to use preemp-
tion and transaction rollbacks.
• The system uses timestamps to decide whether a transaction should wait or roll back.
Wait-die:
• Non-preemptive technique.
• When Ti requests a data item held by Tj, Ti is allowed to wait only if it has a timestamp smaller than that of Tj (i.e., Ti is older); otherwise Ti is rolled back (dies).
Wound-wait:
• Preemptive technique.
• When Ti requests a data item held by Tj, Ti is allowed to wait only if it has a timestamp greater than that of Tj (i.e., Ti is younger); otherwise Tj is rolled back (wounded).
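The two rules can be summarised in a small sketch; the function names and the "smaller timestamp means older transaction" encoding are assumptions made for illustration.

# Sketch of the two timestamp-based deadlock-prevention rules described above.

def wait_die(ts_requester, ts_holder):
    """Non-preemptive: the requester may wait only if it is older."""
    return "wait" if ts_requester < ts_holder else "rollback requester"

def wound_wait(ts_requester, ts_holder):
    """Preemptive: an older requester wounds (rolls back) the holder;
    a younger requester is allowed to wait."""
    return "rollback holder" if ts_requester < ts_holder else "wait"

print(wait_die(ts_requester=5, ts_holder=9))    # wait (requester is older)
print(wound_wait(ts_requester=9, ts_holder=5))  # wait (requester is younger)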
15. (a) (i) Having a large number of disks in a system presents opportunities for improving the rate at which data can be read or written, if the disks are operated in parallel; several independent reads or writes can also be performed in parallel. A variety of disk organization techniques are collectively called Redundant Arrays of Independent Disks (RAID).
RAID levels:
[Figure: RAID levels 0 to 6, showing data blocks (C) and parity blocks (P) across the disks in each organization.]
RAID 0:
Refers to disk arrays with striping at the level of blocks, but
without any redundancy.
RAID 1:
Refers to disk mirroring with block striping it shows mirrored
organization.
RAID level 2:
It is known as memory-style error-correcting-code organization, which employs parity bits. Memory systems have long used parity bits for error detection and correction. Each byte in a memory system may have a parity bit associated with it that records whether the number of bits in the byte set to 1 is even or odd.
RAID level 3:
Bit-interleaved parity organization improves on level 2 by exploiting the fact that disk controllers, unlike memory systems, can detect whether a sector has been read correctly, so a single parity bit can be used for error correction as well as for detection.
RAID level 4:
Block-interleaved parity organization uses block-level striping like RAID 0 and, in addition, keeps a parity block on a separate disk for the corresponding blocks from the N other disks.
RAID level 5:
Block-interleaved distributed parity improves on level 4 by partitioning data and parity among all N + 1 disks, instead of storing data on N disks and parity on one disk.
RAID level 6:
The P + Q redundancy scheme is much like level 5, but stores extra redundant information to guard against multiple disk failures.
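The parity idea behind levels 4 and 5 can be illustrated with a short sketch; the block contents and the xor_blocks helper are made up for the example.

# Sketch of block-interleaved parity: the parity block is the XOR of the data
# blocks, so any single lost block can be rebuilt from the survivors.

def xor_blocks(blocks):
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)

data = [b"disk", b"arra", b"ys!!"]          # three equal-sized data blocks
parity = xor_blocks(data)

lost = data[1]                              # pretend disk 1 fails
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == lost)                      # True: the lost block is recovered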
15. (a) (ii) Variable length records arise in database systems in several
ways.
→ Storage of multiple record type in a file.
→ Record types that allow variable lengths for one or more
fields
→ Record types that allow repeating fields
Eg: (1) Byte-string representation.
(2) Fixed-length (reserved-space) representation.
(3) Pointer (list) representation, which uses an anchor block and an overflow block; the overflow block contains records other than those that are the first records of a chain.
Dense Index:
15. (b) (ii) (1) Nested-loop join with r1 as the outer relation:
⇒ Number of pairs of tuples examined
= nr ∗ ns
= 20000 × 45000
= 9 × 10^8 pairs of tuples
⇒ Worst case
= nr ∗ bs + br
= 20000 ∗ 30 + 25
= 600025 block accesses
(2) Block nested-loop join with r1 as the outer relation:
⇒ br ∗ bs + br
= 25 ∗ 30 + 25
= 775 block accesses are required
(3) Merge join, if r1 and r2 are initially sorted:
= br + bs
= 25 + 30
= 55
So, 55 block accesses are required.
(4) Hash join: h(r) ≠ h(s), so a greater number of block accesses is required here.
r1 → outer relation
r2 → inner relation
br → blocks in the outer relation
bs → blocks in the inner relation
nr → tuples in the outer relation
ns → tuples in the inner relation
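The same estimates can be reproduced directly from the figures used above; the variable names simply mirror the glossary, and this is only a restatement of the arithmetic, not a general cost model.

# Sketch of the join cost estimates above: r1 has nr = 20000 tuples in
# br = 25 blocks, r2 has ns = 45000 tuples in bs = 30 blocks.

nr, br = 20000, 25
ns, bs = 45000, 30

pairs_examined   = nr * ns            # tuple pairs compared by nested-loop join
nested_loop_cost = nr * bs + br       # worst-case block accesses, r1 as outer
block_nested     = br * bs + br       # block nested-loop join, r1 as outer
merge_join_cost  = br + bs            # both relations already sorted

print(pairs_examined, nested_loop_cost, block_nested, merge_join_cost)
# 900000000 600025 775 55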
NOV/DEC 2010
Fourth Semester
Computer Science and Engineering
CS 2255- DATABASE MANAGEMENT SYSTEMS
(Common to Information Technology)
Regulation 2008
Time: Three Hours Maximum: 100 MARKS
Answer ALL questions
PART A – (10 × 2 = 20 Marks)
9. Which are the factors to be considered for the evaluation of indexing and
hashing techniques?
PART B – (5 × 16 = 80 Marks)
11. (a) Explain the three different groups of data models with examples.
(16)
Or
12. (a) Describe the features of Embedded SQL and Dynamic SQL. Give
suitable examples. (16)
Or
13. (a) Explain non loss decomposition and functional dependencies with
suitable example. (16)
Or
(b) Discuss join Dependencies and Fifth Normal Form, and explain why
5NF? (16)
14. (a) (i) State the Two-Phase Commit protocol. Discuss the implications
of a failure of the coordinator and some participants. (10)
(b) (i) State and explain the three concurrency problems. (9)
(ii) What is meant by isolation level and define the five different
isolation levels. (7)
15. (a) (i) Discuss the improvement of reliability and performance of RAID
(8)
(ii) Explain the structure of a B+- tree. (8)
Or
3. The various operators used in relational algebra are unary and binary.
Unary operators are select, project and rename.
Binary operators are set union, set intersection, set difference and Cartesian product.
[Fig: E-R diagram showing ITEM and CUSTOMER linked by a SHIPMENT relationship with attribute Qty.]
Advantages:
(i) conceptual simplicity
(ii) Database security
(iii) Data independence
(iv) Database integrity
(v) Efficiency.
Disadvantages:
(i) Complex implementation
(ii) Difficult to manage
(iii) Lack of structural independence
(iv) Implementation limitations
(v) Lack of standards.
Network Data Model:
Network data model uses two different data structures to represent
the database entities and relationships between entities, namely
record type and set type. A record type is used to represent an
entity type. It is made up of a number of data items that represent
the attributes of the entity. A set type is used to represent a direct
relationship between two record types.
[Figure: network model example records - C1 Alicia A UK, C2 Malcom A USA, C3 Francis C UK.]
Advantages:
(i) Conceptual simplicity
(ii) Handles more relationship types
(iii) Data access flexibility
(iv) Promotes database integrity
Advantages:
1. Structural independence
2. Improved conceptual simplicity
3. Easier database design, management and use
4. Adhoc query capability
5. A powerful database management system
Disadvantages:
1. Substantial hardware and system software overhead.
2. Poor design and implementation.
3. May promote “islands of information” problem.
11. (b) The Entity-Relationships (E-R) data model, which is popular for high
level database design, provides a means for representing relationships
between entities.
Features of ER-Model:
• This is used to give structure to the data.
• Model can be evolved independent of any DBMS.
• It is an aid for database design.
• It is easy to visualize and understand.
[Fig: ITEM and CUSTOMER related through a Shipment relationship with attribute Qty.]
Basic Concepts:
Entity set:
An entity is a thing or object in the real world i.e., distinguishable
from all other objects.
Eg: a person is an entity. An entity can be concrete, such as a person or a book, or it may be abstract, such as a loan or an account.
An entity set is a set of entities of the same type that share the same properties or attributes.
Eg: set of all persons who are customers in a bank.
Attributes:
An entity is represented by a set of attributes. Attributes are descriptive properties possessed by each member of an entity set. For each attribute there is a set of permitted values called the domain of the attribute.
Eg: The domain of the attribute customer-name might be the set of all text strings of a certain length.
Types of attributes:
Simple and Composite attribute:
A simple attribute cannot be divided into subparts.
Eg: customer-age = 30
A composite attribute can be divided into subparts.
Eg: customer-address = street + city + state
Single value and Multi value Attribute:
Attribute which has only one value is called as single value
attribute.
Eg: cust-no = 321467
Attribute which can have more than one value is said to be
multi valued attribute.
Eg: cust-phone no = 24765214
Derived attribute:
The value for the this type of attribute can be derived form the
values of other related attribute or entities.
Eg: The value for loan-held attribute can be derived from the
entity loan.
An attribute can be take a null value when entity does not have
a value for it. Null may indicate that the value is not applicable
or value is unknown.
Relationship set:
A relationship is an association among several entities. A relation-
ship set is a set of relationships of the same type. If E1, E2, …, En are
entity sets, then a relationship set is a subset of
{(e1, e2, …, en) | e1 ∈ E1, e2 ∈ E2, …, en ∈ En}.
Eg: Consider two entity sets customer and loan; we can define a
relationship set borrower to denote the association between the
customers and bank loans.
A relationship may also have descriptive attributes that describe the
relationship among the entities.
Eg: We can use access-date as a descriptive attribute of the borrower
relationship set.
Types of relationships:
Unary relationship:
[Fig: Employee related to itself through a Boss relationship, with roles Manager and Worker]
Binary relationship:
[Fig: a relationship between two entity sets]
Ternary relationship:
[Fig: a relationship among Teacher, Subject and Student]
Quaternary relationship:
[Fig: a Studies relationship among Teacher, Course, Student and study material]
Mapping cardinalities:
A mapping cardinality expresses the number of entities to which another
entity can be associated via a relationship set.
One to one:
An entity in A is associated with at most one entity in B, and an
entity in B is associated with at most one entity in A.
Eg: a1–b1, a2–b2, a3–b3
One to many:
An entity in A is associated with any number of entities in B; an
entity in B, however, can be associated with at most one entity in A.
Eg: a1–{b1, b2}, a2–{b3, b4}, a3–{b5}
Many to one:
An entity in A is associated with at most one entity in B; an
entity in B can be associated with any number of entities in A.
Eg: {a1, a2}–b1, {a3, a4}–b2, {a5}–b3
Many to many:
An entity in A is associated with any number of entities in B, and
any entity in B is associated with any number of entities in A.
Eg: each of a1, a2, a3, a4 is associated with one or more of b1, b2, b3, b4.
How these cardinalities are realized as SQL tables is sketched below.
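A minimal sketch of how these cardinalities are commonly realized as SQL tables; the customer, loan, account and depositor names are illustrative assumptions, not part of the question.

CREATE TABLE customer (cust_no INT PRIMARY KEY, cust_name VARCHAR(30));

-- one to many: a customer may hold many loans, each loan belongs to at most one customer
CREATE TABLE loan (
  loan_no INT PRIMARY KEY,
  cust_no INT REFERENCES customer(cust_no)
);

-- many to many: realized through a separate relationship table keyed on both sides
CREATE TABLE account (acc_no INT PRIMARY KEY);
CREATE TABLE depositor (
  cust_no INT REFERENCES customer(cust_no),
  acc_no  INT REFERENCES account(acc_no),
  PRIMARY KEY (cust_no, acc_no)
);

-- one to one: a UNIQUE foreign key allows each customer at most one profile row
CREATE TABLE customer_profile (
  cust_no INT UNIQUE REFERENCES customer(cust_no),
  pan_no  VARCHAR(10)
);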
12. (a) EMBEDDED SQL: Any SQL statement that can be used interac-
tively can also be embedded in an application program called embed-
ded SQL.
It is necessary to know a number of preliminary details.
1. Embedded SQL statements are prefixed by EXEC SQL, to
distinguish them from host language statements.
2. An executable SQL statement can appear whenever an executable
host statement can appear.
3. SQL statements can include references to host variables such
references must include a colon prefix to distinguish them from
SQL column names.
4. The purpose of INTO clause is to specify the target variables into
which values are to be retrieved.
5. All host variables referenced in SQL statements must be declared
within an embedded SQL declare section, which is delimited by
the BEGIN DECLARE SECTION and END DECLARE SECTION statements.
6. Every program containing embedded SQL statements must in-
clude a host variable called SQLSTATE.
7. Every host variable must have a data type appropriate to the uses
to which it is put.
8. Host variables and SQL column can have the same name.
9. Every SQL statement should in principle be followed by a test of the
returned SQLSTATE value (a minimal sketch follows).
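A minimal embedded SQL sketch illustrating the points above, assuming C as the host language and a hypothetical Emp(emp_no, emp_name) table; the error-handling branch is only a placeholder, and exact precompiler syntax varies by product.

EXEC SQL BEGIN DECLARE SECTION;        /* declare section (point 5)            */
  long eno;                            /* input employee number                */
  char ename[21];                      /* target for the retrieved name        */
  char SQLSTATE[6];                    /* status variable (point 6)            */
EXEC SQL END DECLARE SECTION;

EXEC SQL SELECT emp_name
         INTO  :ename                  /* INTO names the target host variable (point 4) */
         FROM  Emp
         WHERE emp_no = :eno;          /* colon prefix marks a host variable (point 3)  */

if (SQLSTATE[0] != '0' || SQLSTATE[1] != '0')   /* test SQLSTATE after the statement (point 9) */
    /* handle the error or no-data condition here */ ;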
Eg:
AUTHORITY SA
GRANT SELECT {eno, stock, qty}
ON supplier
TO {Ajith, Arun};
Revoke:
Syntax:
AUTHORITY <name>
REVOKE <privilege list>
ON <relvar name>
TO <user list>;
Eg:
AUTHORITY SA
REVOKE SELECT {sno, stock, qty}
ON supplier
TO {Ajith, Arun};
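For comparison, a minimal sketch of the same idea in standard SQL, assuming a table supplier(sno, stock, qty) and user names ajith and arun; column-level SELECT grants are not supported by every product.

GRANT SELECT (sno, stock, qty) ON supplier TO ajith, arun;

REVOKE SELECT ON supplier FROM ajith, arun;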
Mandatory access control:
Objects and users are assigned security classes such as top secret,
secret and confidential. Each object has a classification level (top
secret, secret or confidential) and each user is assigned a clearance
level in the same way. User i can access object j only if the clearance
level of i is greater than or equal to the classification level of j.
Data Encryption:
Encryption is the process of changing plaintext into ciphertext using an
encryption algorithm.
The algorithm takes the plaintext and a key as inputs.
There are two types of encryption: public key and private key.
Missing information:
If we do not know the value of a particular column, that value is
NULL. A value that exists but is not known is an unknown value.
The concept of NULL covers any value which is not known or which
does not exist. NULL leads us to a logic in which there are three truth
values: true, false and unknown.
Note that null values are not considered to be either equal or unequal
to one another; a relational expression involving a null in turn evaluates
to null (unknown).
Information is often missing in the real world.
Eg: date of birth unknown.
AND | t  u  f
----+---------
 t  | t  u  f
 u  | u  u  f
 f  | f  f  f

OR  | t  u  f
----+---------
 t  | t  t  t
 u  | t  u  u
 f  | t  u  f

NOT
 t → f
 u → u
 f → t
Eg: A = 3, B = 4, C = UNK
A > B AND A > C → false
A > B OR B > C → UNK
A < B OR B < C → true.
NOT (A = C) →UNK.
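A minimal SQL sketch of the same three-valued logic, assuming a hypothetical table person(name, date_of_birth):

-- rows whose date_of_birth is NULL are not returned:
-- the comparison evaluates to unknown, not true
SELECT name FROM person WHERE date_of_birth = date_of_birth;

-- the correct way to test for a missing value
SELECT name FROM person WHERE date_of_birth IS NULL;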
(b) [Fig: SUPPLIER — SUPP NO, NAME, STATUS, CITY; shipment attributes SUPP NO, PART NO, QTY]
(c) [Fig: PART — PART NO, NAME, COLOR, WEIGHT, CITY]
Original shipments (SPJ):
  S1  P2  J1
  S2  P1  J1
  S1  P1  J1

SHIPMENT (SP)            PJ                      JS
SUPPLIER  PART           PART     PROJECT        PROJECT  SUPPLIER
NUMBER    NUMBER         NUMBER   NUMBER         NUMBER   NUMBER
S1        P1             P1       J2             J2       S1
S1        P2             P2       J1             J1       S1
S2        P1             P1       J1             J1       S2

Join of the projections:
  S1  P2  J1   original shipments
  S2  P1  J1
  S2  P1  J2   spurious
  S1  P1  J1
Join dependency:
Let R be a relvar and let A, B, …, Z be subsets of the attributes of R.
Then we say that R satisfies the JD *{A, B, …, Z} if and only if every legal
value of R is equal to the join of its projections on A, B, …, Z.
Fifth Normal Form:
A relvar R is in 5NF, also called projection-join normal form, if and only if
every non-trivial join dependency that is satisfied by R is implied by the
candidate key(s) of R, where:
(a) The join dependency *{A, B, …, Z} on R is trivial if and only if at least
one of A, B, …, Z is the set of all attributes of R.
(b) The join dependency *{A, B, …, Z} on R is implied by the candidate
key(s) of R if and only if each of A, B, …, Z is a super key for R.
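A minimal DDL sketch of the projection-join decomposition illustrated above, with assumed column names sno, pno and jno:

CREATE TABLE sp (sno CHAR(3), pno CHAR(3), PRIMARY KEY (sno, pno));
CREATE TABLE pj (pno CHAR(3), jno CHAR(3), PRIMARY KEY (pno, jno));
CREATE TABLE js (jno CHAR(3), sno CHAR(3), PRIMARY KEY (jno, sno));
-- The original shipments can be recovered only by joining all three
-- projections; joining just two of them yields the spurious tuple shown above.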
14. (a) (i) In this section we briefly discuss a very important elaboration
on the basic commit/Rollback concept called two phase commit.
Two phase commit is important whenever a given transaction
can interact with several independent resource managers each
managing its own set of recoverable resources and maintaining
its own recovery log.
For example, consider a transaction running on an IBM
mainframe that updates both an IMS database and a DB2 database.
If the transaction completes successfully, then all of its updates, to
both IMS and DB2 data, must be committed; conversely, if it fails,
then all of its updates must be rolled back.
It follows that it does not make sense for the transaction to
issue, say, a COMMIT to IMS and a ROLLBACK to DB2; and
even if it issued the same instruction to both, the system could
still crash between the two, with unfortunate results.
When the transaction has completed its processing, it issues COM-
MIT. On receiving that COMMIT request, the coordinator
goes through the following two phases:
• Prepare: First it instructs all resource managers to get ready to
go either way on the transaction. In practice this means that
each participant in the process that is, each resource manager
involved – must force all log records for local resources used
by the transaction out to its own physical log. Assuming the
forced write is successful, the resource manager now replies
ok to coordinator, otherwise it replies not ok.
• Commit: When the coordinator has received replies from all
participants, it forces a record to its own physical log, record-
ing its decision regarding the transaction. If all replies were ok,
that decision is commit; if any reply was not ok, the decision is
rollback. Either way, the coordinator then informs every participant
of its decision, and each participant commits or rolls back the
transaction locally.
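The same prepare/commit split is visible in systems that expose two-phase commit to the application. PostgreSQL, for example, provides PREPARE TRANSACTION and COMMIT PREPARED; the account table and the transaction name below are assumptions for illustration only.

BEGIN;
UPDATE account SET balance = balance - 100 WHERE acc_no = 'A1';
PREPARE TRANSACTION 'txn_42';   -- phase 1: force the log, vote "ok", become ready

COMMIT PREPARED 'txn_42';       -- phase 2: the coordinator's decision was commit
-- ROLLBACK PREPARED 'txn_42';  -- taken instead if any participant voted "not ok"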
14. (b) (i) The term concurrency refers to the fact that a DBMS typically allows
many transactions to access the same database at the same time. If their
operations are interleaved without control, problems such as the following arise.

The lost update problem:
T1                           T2
read_item(X);
X := X − N;
                             read_item(X);
                             X := X + M;
write_item(X);
read_item(Y);
                             write_item(X);   ← item X has an incorrect value
                                                because its update by T1 is lost
Y := Y + N;
write_item(Y);

The incorrect summary problem:
T1                           T3
                             SUM := 0;
                             read_item(A);
                             SUM := SUM + A;
read_item(X);
X := X − N;
write_item(X);
                             read_item(X);    ← T3 reads X after N is subtracted
                             SUM := SUM + X;
                             read_item(Y);    ← and reads Y before N is added;
                                                a wrong summary is the result
read_item(Y);
Y := Y + N;
write_item(Y);
                             SUM := SUM + Y;
15. (a) (ii) • Data pointers are stored only at leaf nodes.
• Leaf nodes have an entry for every value of the search field.
• Leaf nodes are linked together to provide ordered access.
• Internal nodes contain tree pointers and act as a multilevel sparse index.
[Fig: internal node structure ⟨P1, K1, …, Pi, Ki, …, Kq−1, Pq⟩ — the first pointer leads
to search values X ≤ K1, pointer Pi to values Ki−1 < X ≤ Ki, and pointer Pq to values X > Kq−1]
[Fig: example B+ tree on branch-name values such as Mianus and Redwood]
15. (b) File scan: Search algorithms that locate and retrieve records that
fulfill a selection condition.
Two scan algorithms that implement the selection operation are A1
(linear search) and A2 (binary search).
Algorithm A1 (linear search)
• The system scans each file block and tests all records to see
whether they satisfy the selection condition.
• The cost of linear search, in terms of disk operations, is one
seek plus br block transfers, where br is the number of blocks in the file.
• For a selection on a key attribute, the system can terminate the
scan as soon as the required record is found, so the average cost
is about br/2 block transfers.
A2 (Binary search)
• Applicable if selection is an equality comparison on the at-
tribute on which file is ordered.
• Assume that the blocks of the relation are stored contiguously.
• Cost estimate:
→ ⌈log2(br)⌉ — the cost of locating the first matching tuple by a
binary search on the blocks,
→ plus the number of blocks containing records that satisfy the
selection condition (a small worked example follows).
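A small worked example, assuming a relation stored in br = 1000 contiguous blocks: linear search costs one seek plus 1000 block transfers in the worst case (about 500 transfers on average for an equality selection on a key attribute), whereas binary search needs only ⌈log2 1000⌉ = 10 block transfers to locate the first matching tuple, plus whatever further blocks hold additional matching records.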
Selection using indices:
• Index scan — search algorithms that use an index; the selection
condition must be on a search key of the index.
5. What is normalization?
PART B – (5 × 16 = 80 marks)
11. (a) (i) Discuss the various disadvantages in the file system and explain
how it can be overcome by the database system. (6)
(ii) What are the different Data models present? Explain in detail.
(10)
Or
(b) (i) Explain the Database system structure with a neat diagram. (10)
(ii) Construct an ER diagram for an employee payroll system. (6)
12. (a) (i) Explain the use of trigger with your own example. (8)
(ii) Discuss the terms Distributed databases and client/server data-
bases. (8)
Or
(b) (i) What is a view? How can it be created? Explain with an example.
(7)
(ii) Discuss in detail the operators SELECT, PROJECT, UNION with
suitable examples. (9)
13. (a) Explain 1NF, 2NF and 3NF with an example. (16)
Or
(b) Explain the Boyce–Codd normal form with an example. Also state
how it differs from that of 3NF. (16)
14. (a) (i) How can you implement atomicity in transactions? Explain. (8)
(ii) Describe the concept of serializability with suitable example.
(8)
Or
(b) How is concurrency performed? Explain the protocol that is used to
maintain the concurrency concept. (16)
PART A
3. Rename:
Rename operation can rename either the relation name or the attribute
name or both.
Syntax:
ρS(R)
Eg:
ρTemp(Branch)
4. An entity set may not have sufficient attributes to form a primary key.
• Such an entity set is known as a weak entity set.
5.
• Normalization is the process of making the database design to its ulti-
mate normal form.
• The non-key attributes must be mutually independent and irreducibly
dependent on the primary key.
6.
• A functional dependency is basically a many to one relationship from
one set of attributes to another within a given relvar or relation.
• Types of Functional Dependencies are:
Full Functional Dependency
Partial Functional Dependency
Transitive Functional Dependency
7.
• A transaction is a logical unit of work which alters or accesses the da-
tabase.
• It begins with the execution of begin transaction keyword.
8. Every transaction possesses the ACID properties.
PART – B
11. (a) (i) Disadvantages in the file system:
Data redundancy and inconsistency:
Since different programmers create the files and application
programs over a long period, the various files are likely to
have different structures and the programs may be written in
several programming languages. Moreover the same infor-
mation may be duplicated in several places.
Difficulty in accessing data:
The conventional file processing environments do not al-
low needed data to be retrieved in a convenient and efficient
manner. More responsive data retrieval systems are required
for general use.
Data isolation:
Because data are scattered in various files, and files may be
in different formats, writing new application programs to
retrieve the appropriate data is difficult.
Integrity problems:
The data values stored in the database must satisfy certain types
of consistency constraints. The problem is compounded when
constraints involve several data items from different files.
Atomicity problems:
A computer system like any other device is subject to fail-
ure. In many applications, it is crucial that if a failure oc-
curs, the data be restored to the consistent state that existed
prior to the failure. It is difficult to ensure atomicity in a
conventional file processing system.
Concurrent access anomalies:
For the sake of overall performance of the system and faster
response, many systems allow multiple users to update the
data simultaneously. But supervision is difficult to provide
because data may be accessed by many different applica-
tion programs that have not been coordinated previously.
Security problems:
Not every user of the database system should be able to ac-
cess all the data. Since application programs are added to
the file processing system in an ad hoc manner, enforcing
such security constraint is difficult.
11. (a) (ii) Data models:
Underlying the structure of database is the data model: a collec-
tion of conceptual tools for describing data, data relationship,
data semantics and consistency constraints. A data model pro-
vides a way to describe the design of a database at the physi-
cal, logical and view level. There are a number of different data
models that can be classified into four different categories.
Relational Model:
The relational model uses a collection of tables to represent
both data and the relationship among those data. Each table
has multiple columns and each column has a unique name.
Tables are known as relations. The relational model is an
example of record based model. Record based models are so
named because the database is structured in fixed format re-
cords of several types. Each record of particular type is con-
tained in a table. Each record type defines a fixed number of
fields, or attributes. The columns of the table correspond to
the attributes of the record type. The relational data model is
the most widely used data model and a vast majority of cur-
rent database systems are based on the relational model.
Entity Relationship Model:
The Entity Relationship (E-R) data model uses a collection
of basic objects called entities and relationships among these objects.
Transaction Manager:
It ensures that the database remains in a consistent state after con-
current transaction without conflicts.
File Manager:
It manages the allocation of space on disc and data structures for
representation of data.
[Fig: database system structure — naive users (tellers, agents, web users),
application programmers, sophisticated users (analysts) and the database
administrator submit work through application program object code (produced
by the compiler and linker), DML queries and DDL; the query processor contains
the DML compiler and organizer, the DML/DDL interpreter and the query
evaluation engine; the storage manager contains the buffer manager, file
manager, authorization and integrity manager and transaction manager, and
works on disc storage holding data files, indices, the data dictionary and
statistical data]
Disc Storage:
Data files:
The actual data stored in the database.
Data dictionary:
It contains meta data.
Indices:
It is used to provide fast access of data items that holds key values.
Query processor:
It is an important part of database system. The various subcompo-
nents of query processor are:
• DDL Interpreter
• DML compiler
• Query evaluation engine
Database users:
Naive users:
They are unsophisticated users who interact with the system
by invoking application programs that are written previously.
Application programmers:
They are computer professionals who write and develop ap-
plication programs using application tools.
Sophisticated users:
These users interact with the system without writing programs.
They form their queries and statements for request.
Specialized users:
These users are sophisticated users who write sophisticated da-
tabase application that is beyond traditional data processing.
Database Administrator (DBA):
• DBA has central control over both programs and data. The
functions of DBA are as follows:
The DBA creates schema definition.
DBA is responsible for storage structure and access method
definition.
DBA carries out the changes in schema and physical orga-
nization according to the changing needs.
The DBA periodically takes backups of the database.
The DBA ensures that enough disc space is available.
The DBA monitors the database and improves its performance.
11 (b) (ii)
[Fig: ER diagram for an employee payroll system — an Employee (emp. no,
emp. name, address, contact no) works in a Department (dept. no, dept. name,
manager, assistants), earns a Salary (basic salary, HRA, DA, lunch allowance,
travel allowance) and undergoes deductions (income tax, P.F.)]
Eg 1:
CREATE OR REPLACE TRIGGER warn_trig
BEFORE DELETE ON emp
BEGIN
  RAISE_APPLICATION_ERROR(-20101, 'you cannot delete');
END;
Eg 2:
CREATE OR REPLACE TRIGGER trig_update
AFTER INSERT ON reservation
FOR EACH ROW
BEGIN
  IF :new.class = 'A' THEN
    UPDATE flight SET seatsfc = seatsfc + 1
    WHERE flightno = :new.fno;
  ELSE
    UPDATE flight SET seatssc = seatssc + 1
    WHERE flightno = :new.fno;
  END IF;
END;
Communication
network
Transparent remote
accesses
DBMS
Server machine
12. (b) (i) Views provide a shorthand or macro capability. There is a strong
analogy here with macros in a programming language system. In
principle a user in a programming language system could write
out the expanded form of a given macro directly. Analogous re-
marks apply to views. Thus views in a database system play a role
somewhat analogous to that of macros in a programming language
system. Views allow the same data to be seen by different users in
different ways at the same time. Views provide automatic security
for hidden data. Views can provide logical data independence.
Views follow two principles:
• Principle of Interchangeability
• Principle of Database Relativity
Creation of view:
Eg create view city_pair (scity, pcity)
Eg of views:
CREATE VIEW v AS
  SELECT * FROM suppliers
  WHERE status > 25 OR city = 'Chennai';
suppliers (suppno, status, city, partno, qty) — sample data:
suppno   status   city
S1       20       Chennai
S2       10       Bombay
S3       30       Delhi
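A minimal sketch of how the view behaves with the sample data above; v contains only S1 and S3, since only they satisfy status > 25 or city = 'Chennai', and the user name clerk is an assumption for illustration.

-- the view is queried exactly like a base table (principle of interchangeability)
SELECT suppno, city FROM v;

-- granting access on the view but not on suppliers hides S2 from this user,
-- which is how views give automatic security for hidden data
GRANT SELECT ON v TO clerk;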
/* assume A → D holds */
then normalize as,
R1 {A, D} primary key {A}
R2 {A, B, C} primary key {A}
FK {A} references R
Third Normal Form:
A relvar is in 3NF if and only if it is in 2NF and every non-key attri-
bute is non-transitively dependent on the primary key. Note that
transitive dependencies imply mutual dependencies. The second
step in the normalization procedure is to take projections to eliminate
transitive dependencies.
R {A, B, C}
Primary key {A}
/* assume B → C holds */
then normalize as,
R1 {B, C} primary key {B}
R2 {A, B} primary key {A}
FK {B} references R1
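A minimal DDL sketch of this 3NF decomposition, assuming integer attributes:

CREATE TABLE R1 (
  B INT PRIMARY KEY,      -- B -> C is now a whole-key dependency
  C INT
);
CREATE TABLE R2 (
  A INT PRIMARY KEY,
  B INT REFERENCES R1(B)  -- foreign key {B} referencing R1
);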
then there exists at least one serial schedule S involving T1, T2,
….Tn such that I is equivalent to S. S is said to be the serializa-
tion of I.
14. (b) The term concurrency refers to the fact that DBMS’s typically allow
many transactions to access the same database at the same time.
Concurrency control ensures that concurrent transactions do not interfere with each other.
Three concurrency problems:
There are essentially three ways in which a transaction, though cor-
rect in itself, can produce the wrong answer if some other transaction
interferes with it in some way.
The lost update problem:
Transaction A        Time     Transaction B
retrieve t            t1
                      t2      retrieve t
update t              t3
                      t4      update t
Transaction A retrieves some tuple t at time t1. Transaction B retrieves
the same tuple t at time t2. A updates the tuple at time t3, and B updates the
same tuple at time t4 on the basis of the values it saw at time t2. Transaction
A's update is lost at time t4, because transaction B overwrites it without even
looking at it.
The Uncommitted Dependency Problem:
The uncommitted dependency problem arises if one transaction is al-
lowed to retrieve or update a tuple that has been updated by another
transaction that has not yet been committed.
15. (a) Having a large number of disks in a system presents opportunities for
improving the rate at which data can be read or written, if the disks
are operated in parallel. Several independent reads or writes can also
be performed in parallel. A variety of disk organization techniques
are collectively called Redundant Arrays of Independent Disks
(RAID).
RAID levels:
(a) RAID 0: striping without redundancy
(b) RAID 1: mirrored disks
(c) RAID 2: memory-style error-correcting codes
(d) RAID 3: bit-interleaved parity
(e) RAID 4: block-interleaved parity
(f) RAID 5: block-interleaved distributed parity
(g) RAID 6: P + Q redundancy
[Fig: each level drawn as an array of data disks (C) and parity disks (P)]
RAID 0: Refers to disk arrays with striping at the level of blocks, but
without any redundancy.
15. (b) An index entry or index record consists of a search key value and a
pointer to one or more records with that value as their search key
value. The pointer to a record consists of the identifier of a disk block
and an offset within the disk block to identify the record within a
block.
B+ tree:
The main disadvantage of the index sequential file organization is
that performance degrades as the file grows both for index lookups
and for sequential scans through the data. The B+ tree index structure
is the more widely used of several index structures that maintain
their efficiency despite insertion and deletion of data.
Structure of a B+ tree:
A B+ tree index is a multilevel index but it has a structure that differs
from that of the multilevel index sequential file. Structure consists of
a typical node of a B+ tree. It contains up to n − 1 search key values K1,
K2, …, Kn−1 and n pointers P1, P2, …, Pn. The search key values within
a node are kept in sorted order; thus if i < j, then Ki < Kj. First let us
consider the structure of leaf nodes. For i = 1, 2, …, n − 1, pointer Pi
points to a file record with search key value Ki; pointer Pn has a special
purpose (it chains the leaf to the next leaf node in search key order).
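As a practical aside (an assumption about typical engines, not something the question states), most relational systems implement ordinary secondary indices as B+ trees, so the structure just described is what a statement like the following usually builds; the emp(emp_no, emp_name) table is illustrative.

CREATE INDEX idx_emp_name ON emp (emp_name);

-- range queries can then follow the linked leaf nodes in search key order
SELECT emp_no FROM emp WHERE emp_name BETWEEN 'A' AND 'C';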
5. Mention the advantage of using magnetic tape for storing the data.
10. How does XML data differ from relational data?
PART B – (5 × 16 = 80 Marks)
11. (a) Discuss the various design issues involved in ER Database schema.
(16).
(b) (i) Explain the distinction between condition-defined and user-
defined constraints in Generalization, which of these constraints
can the system check automatically? Give your answer. (8)
12. (a) Explain the Third Normal Form with suitable example and compare
with BCNF.
13. (a) Explain the different methods of storing variable size records. (16)
15. (a) (i) A car rental company maintains a vehicle database for all vehi-
cles in its current fleet. For all vehicles it includes vehicle-id,
license-no, manufacturer, model, date-of-purchase and color,
Special data types are included for certain vehicles.
Trucks: cargo-capacity
Sports Car: horsepower, renter-age requirement
Vans: number of passengers
Construct an object-oriented database schema definition for this
database.
Use inheritance wherever appropriate. (8)
(ii) For the following schema
Books (title, authorset setoff (Author), publisherset
setoff(Publisher))
Author (first-name, last-name)
Publisher (name, branch)
Give the XML representation and its DTD. (8)
(b) (i) Describe the Data Warehouse Architecture with a neat diagram.
(7)
(ii) Explain the data mining applications – classifications, associa-
tion and clustering. (9)
2.
S.No  Relational Algebra                          Relational Calculus
(i)   It is a procedural query language.          It is a non-procedural query language.
(ii)  It consists of a set of operations.         It writes one declarative expression to
                                                  specify a retrieval; hence there is no
                                                  description of how to retrieve it.
(iii) It takes one or more relations as input     It consists of two types:
      and produces a new relation as the          (i) Tuple relational calculus
      result.                                     (ii) Domain relational calculus.
(iv)  Its operations are select, project,
      rename, set operations, Cartesian
      product and join.
4. Triggers are useful mechanism for database designers for starting certain
tasks automatically when certain conditions are met.
5. Tapes have a high capacity (5-gigabyte tapes) and can be removed from
the tape drive, facilitating cheap archival storage.
7. Concurrency control:
→ Concurrency control is the technique used to control concurrent execu-
tion of transaction
→ The concurrency control schemes are based on the serializability property.
8. (i) Atomicity
(ii) Consistency
(iii) Isolation
(iv) Durability
PART B
11 (a) Refer Nov/Dec 2010 Q.NO 11.(b)
[Fig: ER diagram for a hospital — the hospital employs Doctors (doc code,
doc name, fees, specialization, contact no), specialized via an ISA relation
into Permanent doctor (salary) and Consulting doctor; it contains Labs (tests)
and Rooms (room code, room type, charges); a Patient (patient-code, name,
address: street, city, state, pincode; D.O.B, age, blood group, reasons for
admission) is admitted to a Room and related to Doctors]
12. (b) (i) SELECT item FROM Sales WHERE dept NOT IN ('second floor');
(ii) SELECT dname FROM Dept WHERE employee_salary < manager_no_salary;
(iii) CREATE VIEW emp_view AS
        SELECT manager_no, salary FROM employee
        WHERE job = 'Manager' AND ename = 'Aarthi';
(iv) Yes, the view is updatable.
15 (a) (i) Trucks:
CREATE TYPE Trucks UNDER Vehicles (
  vehicle_id varchar(10), license_no varchar(10), manufacturer varchar(20),
  model varchar(20), date_of_purchase date, color varchar(15),
  cargo_capacity varchar(10));
Sports Car:
CREATE TYPE SportsCar UNDER Vehicles (
  vehicle_id varchar(10), license_no varchar(10), manufacturer varchar(20),
  model varchar(20), date_of_purchase date, color varchar(15),
  horsepower varchar(10), renter_age_requirement number(3));
Vans:
CREATE TYPE Vans UNDER Vehicles (
  vehicle_id varchar(10), license_no varchar(10), manufacturer varchar(20),
  model varchar(20), date_of_purchase date, color varchar(15),
  number_of_passengers number(3));
In the above object-oriented database schema:
Super-type: Vehicles
Sub-types: Trucks, SportsCar and Vans.
Trucks, SportsCar and Vans inherit the attributes of Vehicles.
[Fig: data warehouse architecture — data source 1 … data source n feed data
loaders, which populate the data warehouse; query and analysis tools work
on the warehouse]
6. How does a B-tree differ from B+ trees? Why is B+ tree usually preferred
as an access structure to a data file?
10. What are structured data types? What are collection types, in particular?
PART B (5 × 16 = 80 marks)
11. (a) (i) Explain the component modules of a DBMS and their
interactions with the architecture. (10)
or
(b) (i) Explain the basic relational algebra operations with the
symbol used and example for each. (10)
or
13. (a) (i) Describe the different types of file organization? Explain
using a sketch of each of them with their advantages and
disadvantages. (10)
(ii) Describe static hashing and dynamic hashing. (6)
or
(b) (i) Explain the index schemas used in DBMS. (10)
(ii) How does a DBMS represent a relational query evaluation
plan? (6)
or
(b) (i) Describe strict two-phase locking protocol. (10)
(ii) Explain the log based recovery technique (6)
15. (a) (i) Explain 2-phase commitment protocol and the behavior
of this protocol during lost messages and site failures. (12)
(ii) Describe X path and X query with an example. (4)
or
(b) (i) Explain Data mining and data warehousing. (12)
(ii) Describe the anatomy of XML document.
8. Define deadlock.
PART B – (5 × 16 = 80 marks)
11. (a) (i) Construct an ER diagram for a car insurance company that has
a set of customers, each of whom owns one/more cars. Each car
has associated with it zero to any number of recorded accidents.
(ii) Construct appropriate tables for the above ER diagram.
Or
(b) (i) Define data model. Explain the different types of data models
with relevant examples. (10)
(ii) Explain the role and functions of the database administrator. (6)
15 (a) State and explain the features of object oriented data model. Use
banking application as an example. (16)
Or
(b) Write detail notes on following:
(i) Distributed Databases (8)
(ii) Data Mining (8)
2. Give the reasons why null values might be introduced into the database.
PART B – (5 × 16 = 80 marks)
11. (a) Explain the system structure of a database system with neat block
diagram. (16)
Or
(b) (i) Construct an ER-diagram for hospital with a set of patients and a
set of medical doctors. Associate with each patient a log of the vari-
ous tests and examinations conducted.
(ii) Discuss on various relational algebra operators with suitable exam-
ple.
12. (a) (i) Consider the employee database, where the primary keys are
underlined.
employee (empname, street, city)
works (empname, companyname, salary)
company (companyname, city)
manages (empname, managername)
And given an expression in SQL for the following queries:
(1) Find the names of all employees who work for First Bank
Corporation.
(2) Find the names, street addresses, and cities of residence of
all employees who work for First Bank Corporation and earn
more than 200000 per annum.
(3) Find the names of all employees in this database who live in
the same city as the companies for which they work.
(4) Find the names of all the employees who earn more than
every employees of Small Bank Corporation.
(ii) Discuss the strengths and weaknesses of the trigger mechanism.
Compare triggers with other integrity constraints supported by
SQL. (8)
Or
(b) (i) What is normalization? Explain the various normalization tech-
niques with suitable example. (12)
(ii) Give the comparison between BCNF and 3NF. (4)
13. (a) (i) Explain how the RAID system improves performance and reli-
ability. (8)
(ii) Describe the structure of B+ tree and list the characteristics of a
B+ tree. (8)
Or
8. What benefit is provided by strict two-phase locking? What disadvantages
result?
PART B – (5 × 16 = 80 marks)
11. (a) (i) What are the types of knowledge discovered during data mining?
Explain with suitable examples. (8)
(ii) Highlight the features of object oriented database. (8)
Or
(b) (i) What are nested relations? Give an example. (8)
(ii) Explain the structure of XML with suitable example. (8)
12. (a) (i) Compare file system with database system. (8)
(ii) Explain the architecture of DBMS. (8)
Or
(b) (i) What are the steps involved in designing a database application?
Explain with an application. (10)
(ii) List the possible types of relations that may exist between two
entities. How would you realize that into tables for a binary rela-
tion? (6)
13. (a) (i) What are the relational algebra operations supported in SQL?
Write the SQL statement for each operation. (8)
(ii) Justify the need for normalization with examples. (8)
Or
(b) (i) What is normalization? Explain 1NF, 2NF, 3NF and BCNF with
suitable example. (8)
(ii) What is FD? Explain the role of FD in the process of normaliza-
tion. (8)
14. (a) (i) Explain the security features provided in commercial query lan-
guages. (8)
(ii) What are the steps involved in query processing? How would
you estimate the cost of the query? (8)
(b) (i) Explain the different properties of indexes in detail. (8)
(ii) Explain various hashing techniques. (8)
15. (a) (i) Explain the four important properties of transaction that a DBMS
must ensure to maintain database. (8)
(ii) What is RAID? List the different levels in RAID technology and
explain its features. (8)
Or
2. Give the distinction between primary key, candidate key and super key.
3. Write a SQL statement to find the names and loan numbers of all custom-
ers who have a loan at Chennai branch.
PART B – (5 × 16 = 80 marks)
11. (a) (i) Describe the system structure of database system. (12)
(ii) List out the functions of DBA (4)
Or
12. (a) (i) Discuss about triggers. How do triggers offer a powerful mech-
anism for dealing with the changes to database with suitable
example. (10)
(ii) What are nested queries? Explain with example. (6)
Or
(b) (i) What is normalization? Give the various normal forms of rela-
tional schema and define a relation which is in BCNF and explain
with suitable example. (12)
(ii) Compare BCNF versus 3NF. (4)
13. (a) (ii) Explain why the allocation of records to blocks affects database
system performance significantly. (6)
Or
(b) (i) Describe the structure of B+ tree and give the algorithm for search
in the B+ tree with example. (12)
(ii) Give the comparison between ordered indexing and hashing
Or
(b) (i) Discuss on two-phase locking protocol and timestamp-based
protocol. (12)
(ii) Write short notes on log-based recovery. (4)
15. (a) (i) Discuss in detail about the object relational database and its
advantages.
(ii) Illustrate the issues to implement distributed database.
(b) (i) Give the basic structure of XML and its document schema.
(ii) What are the two important classes of data mining problems?
Explain about rule discovery using those classes.
4. Define triggers.
PART B (5 × 16 = 80 Marks)
11. (a) (i) Explain the Database Management System architecture
with a neat diagram. (10)
(ii) What is the need for the development of relational
databases? (6)
12. (a) (i) Explain briefly about various relational algebra expressions
with examples. (8)
(ii) Discuss about the evolution of distributed database.
Compare with client/server mode. (8)
or
(b) Consider the relational table given below and answer the
following SQL queries. (16)
Employee (SSN-No, Name, Department, Salary)
(i) List all the employees whose name starts with the letter ‘L’
(ii) Find the maximum salary given to employees in each
department.
(iii) Find the number of employees working in the ‘accounts’
department.
(iv) Find the second maximum salary from the table.
(v) Find the employee who is getting the minimum salary.
14. (a) Explain briefly about the working of two phase locking
protocol using a sample transaction. (16)
or
(b) (i) When is a transaction said to be deadlocked? Explain the
deadlock prevention methods with an example. (8)
15. (a) (i) Draw and explain the structure of B+ tree index files. (10)
(ii) Write notes on RAID. (6)
or
(b) Explain briefly about query processing with examples to
perform sort and join operation. (16)