
Module-I

1) Discuss the characteristics of the database approach.


1. Self-Describing Nature of a Database System
A database system contains a complete definition or description of the database structure and constraints. This definition is stored in the system catalog, which contains information such as the structure of each file, the type and storage format of each data item, and various constraints on the data. The information stored in the system catalog is called meta-data, and it describes the structure of the primary database. This allows the DBMS software to work with different databases.
2. Insulation between Programs and Data, and Data Abstraction
This is called program-data independence: it allows changing data storage structures and operations without having to change the DBMS access programs, because the structure of data files is stored in the DBMS catalog separately from the access programs.
Data Abstraction: A data model is used to hide storage details and present the users with a conceptual
view of the database.
3. Support of Multiple Views of the Data
A database typically has many users, each of whom may require a different perspective or view of
the database. A view may be a subset of the database or it may contain virtual data that is derived
from the database files but is not explicitly stored.
4. Sharing of Data and Multiuser Transaction Processing
A multiuser DBMS, as its name implies, must allow multiple users to access the database at the
same time. This is essential if data for multiple applications is to be integrated and maintained in a
single database. The DBMS must include concurrency control software to ensure that several
users trying to update the same data do so in a controlled manner so that the result of the updates
is correct.
2) With a neat diagram, explain the 3-schema architecture.
The 3-schema architecture is also called the three-level architecture.
This architecture contains three layers or levels of the database management system:
1. The internal level has an internal schema, which describes the physical storage structure of the
database. The internal schema uses a physical data model and describes the complete details of data
storage and access paths for the database.

2. The conceptual level has a conceptual schema, which describes the structure of the whole database
for a community of users. The conceptual schema hides the details of physical storage structures and
concentrates on describing entities, data types, relationships, user operations, and constraints. A high-
level data model or an implementation data model can be used at this level.
3. The external or view level includes a number of external schemas or user views. Each external schema
describes the part of the database that a particular user group is interested in and hides the rest of the
database from that user group. A high-level data model or an implementation data model can be used at
this level.
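Each external schema is often realized as a view over the conceptual schema. A minimal SQL sketch (the EMPLOYEE table and its columns are assumed for illustration), defining an external view that hides the Salary column from one user group:

CREATE VIEW EMP_PUBLIC AS
SELECT Name, Dno FROM EMPLOYEE; -- users of this view never see Salary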
3) Explain data independence and its types.
Data independence refers to the capacity to modify the schema at one level of the database system without altering the schema at the next higher level.
1. Logical data independence is the capacity to change the conceptual schema without having to change
external schemas or application programs.
2. Physical data independence is the capacity to change the internal schema without having to change
the conceptual (or external) schemas.
4) Discuss the advantages of using a DBMS.
1. Controlling Redundancy
In a file processing system, duplicate data is created in many places because all the programs have their own files. This creates data redundancy, which in turn wastes labor and space. In a database management system, all the files are integrated into a single database. The whole data is stored only once, at a single place, so there is little chance of duplicate data.
2. Restricting Unauthorized Access
A DBMS should provide a security and authorization subsystem, which is used for specifying restrictions
on user accounts. Common kinds of restrictions are to allow read-only access (no updating), or access
only to a subset of the data.
3. Providing Persistent Storage for Program Objects: Object-oriented database systems make it easier
for complex runtime objects (e.g., lists, trees) to be saved in secondary storage so as to survive beyond
program termination and to be retrievable at a later time.
4. Providing Storage Structures for Efficient Query Processing: The DBMS maintains indexes (typically in
the form of trees and/or hash tables) that are utilized to improve the execution time of queries and
updates.
The query processing and optimization module is responsible for choosing an efficient query execution
plan for each query submitted to the system.
5. Providing Multiple User Interfaces
Because many types of users with varying levels of technical knowledge use a database, a DBMS should
provide a variety of user interfaces. These include query languages for casual users; programming language
interfaces for application programmers; forms and command codes for parametric users; and menu-
driven interfaces and natural language interfaces for stand-alone users. Both forms-style interfaces and
menu-driven interfaces are commonly known as graphical user interfaces (GUIs).
6. Representing Complex Relationships among Data
A database may include numerous varieties of data that are interrelated in many ways. A DBMS must
have the capability to represent a variety of complex relationships among the data as well as to retrieve
and update related data easily and efficiently.
7. Enforcing Integrity Constraints
Data integrity means that the data in the database is accurate and consistent. Data integrity is very important because the data is shared among multiple users and applications, so it is necessary to ensure that the data remains correct and consistent for all users.
8. Providing Backup and Recovery
A database management system automatically takes care of backup and recovery. Users don't need to back up data periodically because this is taken care of by the DBMS. Moreover, it also restores the database to its previous consistent state after a crash or system failure.
5) With a neat diagram, illustrate the typical component modules of a DBMS.

DBMS Component Modules


• A higher-level stored data manager module of the DBMS controls access to DBMS information that is
stored on disk.
• The DDL compiler processes schema definitions, specified in the DDL, and stores descriptions of the
schemas (meta-data) in the DBMS catalog.
• The run-time database processor handles database accesses at run time. The query compiler handles
high-level queries that are entered interactively.
• The precompiler extracts DML commands from an application program written in a host programming language. These commands are sent to the DML compiler for compilation into object code for database access.
Database System Utilities
1. Loading - loads existing data files (e.g., text files or sequential files) into the database.
2. Backup - this utility provides a backup copy of the database, usually by dumping the entire database
onto tape.
3. File reorganization - can be used to reorganize a database file into a different file organization to
improve performance.
4. Performance monitoring - monitors database usage and provides statistics to the DBA.

6) Illustrate client-server architectures of DBMS.


The client/server architecture was developed to deal with computing environments in which a large number of PCs, workstations, file servers, printers, database servers, Web servers, and other equipment are connected via a network. The idea is to define specialized servers with specific functionalities.
Logical and physical two-tier client/server architecture.
The 2-tier architecture is the same as the basic client-server model. In the two-tier architecture, applications on the client end can directly communicate with the database at the server side. For this interaction, APIs such as ODBC and JDBC are used.
The user interfaces and application programs run on the client side.
The server side is responsible for providing functionalities such as query processing and transaction management.
To communicate with the DBMS, the client-side application establishes a connection with the server side.

fig: Logical two-tier client/server architecture.

3- tier client/server architecture.

 The 3-tier architecture contains another layer between the client and server. In this architecture, the client can't directly communicate with the server.
 The application on the client end interacts with an application server, which further communicates with the database system.
 The end user has no idea about the existence of the database beyond the application server, and the database has no idea about any user beyond the application server.
 The 3-tier architecture is used for large web applications.
7) Discuss Database languages and interfaces.

Database languages:
a) Data Definition Language
This language is used to create databases and tables, alter them, etc. With it, you can also rename a database or drop it. It specifies the database schema.
CREATE: Create a new database, table, etc. ALTER: Alter an existing database, table, etc.
DROP: Drop a database or table. RENAME: Set a new name for a table.

b) Data Manipulation Language


The language used to manipulate the database, such as inserting data, updating a table, or retrieving records from a table, is known as Data Manipulation Language.

SELECT: Retrieve data from the database. INSERT: Insert data. UPDATE: Update data.
DELETE: Delete records.

c) Data Control Language: Grants and revokes privileges using the GRANT and REVOKE statements.
GRANT: Give privilege to access the database. REVOKE: Take back the privilege to access the database.

d) Transaction Control Language


Transactions in the database are managed using Transaction Control Language:
COMMIT: Save the work. SAVEPOINT: Set a point in the transaction to roll back to later.
ROLLBACK: Restore the database to its state at the last COMMIT.
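A short illustrative script touching each category (the table, column, and user names are assumed for illustration):

CREATE TABLE Student (USN int PRIMARY KEY, Name varchar(30)); -- DDL
INSERT INTO Student VALUES (1, 'Asha');                       -- DML
GRANT SELECT ON Student TO clerk;                             -- DCL
SAVEPOINT before_update;                                      -- TCL
UPDATE Student SET Name = 'Usha' WHERE USN = 1;
ROLLBACK TO before_update;                                    -- undo the update
COMMIT;                                                       -- make the remaining work permanent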

Database Interfaces:
Menu-Based Interfaces for Web Clients or Browsing –
These interfaces present the user with lists of options (called menus) that lead the user through the formulation of a request. The basic advantage of menus is that they remove the burden of remembering the specific commands and syntax of a query language.
Forms-Based Interfaces –
A forms-based interface displays a form to each user. Users can fill out all of the form entries to insert new data, or they can fill out only certain entries, in which case the DBMS will retrieve matching data for the remaining entries.
Graphical User Interface –
A GUI typically displays a schema to the user in diagrammatic form. The user then can
specify a query by manipulating the diagram.
Natural language Interfaces –
These interfaces accept requests written in English or some other language and attempt to understand them. A natural language interface has its own schema, which is similar to the database conceptual schema, as well as a dictionary of important words.
Speech Input and Output –
Limited use of speech, whether as an input query or as a spoken answer to a question or result of a request, is becoming commonplace in applications with limited vocabularies, such as inquiries for telephone directories and flight arrivals/departures.
Interfaces for the DBA –
Most database systems contain privileged commands that can be used only by the DBA's staff. These include commands for creating accounts, setting system parameters, granting account authorization, changing a schema, and reorganizing the storage structures of a database.
8) Explain the different types of end users and the main activities of each.
a) Casual End Users –
These are users who occasionally access the database but may require different information each time. They use a sophisticated database query language to specify their requests and are typically middle- or high-level managers or other occasional browsers. They learn only the few facilities they use repeatedly out of the many provided by the DBMS.
b) Naive or parametric end users –
These users make up a sizeable portion of database end users. Their main job function revolves around constantly querying and updating the database.

The following tasks are typically performed by naive end users:

1. Bank tellers check account balances and post withdrawals and deposits.
2. Reservation clerks for airlines, railways, hotels, and car rental companies check availability for a given request and make reservations.
3. Clerks working at receiving stations for shipping companies enter package identifications via bar codes and descriptive information through buttons, to update a central database of received and in-transit packages.
c) Sophisticated end users –
These users include engineers, scientists, business analysts, and others who thoroughly familiarize themselves with the facilities of the DBMS in order to implement their applications and meet their complex requirements. They try to learn most of the DBMS facilities to achieve this.

d) Standalone users –
These are users whose job is to maintain personal databases by using a ready-made program package that provides easy-to-use menu-based or graphics-based interfaces. An example is the user of a tax package that stores a variety of personal financial data for tax purposes. These users become very proficient in using a specific software package.

9) Write a note on applications of DBMS.

Sector              Use of DBMS
Banking             For customer information, account activities, payments, deposits, loans, etc.
Airlines            For reservations and schedule information.
Universities        For student information, course registrations, colleges and grades.
Telecommunication   It helps to keep call records, monthly bills, maintaining balances, etc.
10) Discuss data models and their types.
Data Model: A collection of concepts that are used to describe the overall structure of a database. Structure of a database means the types of data, the relationships among the data, and the constraints that apply to the data. Most data models also describe the basic operations applied to the database (updating and retrieving).
A data model is also used to hide details of the database that are not needed by the users.
Types of Data Models:
a) High-Level or Conceptual Data Models: These describe what data is stored and how it can be retrieved, using concepts such as entities, attributes, and relationships. The E-R model falls into this category.
b) Low-Level or Physical Data Models: These describe how data is stored in the computer. Because users have no interest in knowing where data is stored, this portion is always hidden from them; these models are meant mainly for computer specialists.
c) Representational or Implementation Data Models: Between these two comes the representational model, which hides some details of data storage but can still be understood by end users. It represents data using record structures, so it is sometimes called a record-based data model. This kind of model is mostly used in traditional DBMSs and includes the relational data model as well as the network and hierarchical models (legacy models).
Module-II
1) Design an ER-Diagram for company database taking into account of at least 4-entities.

2) Design an ER-Diagram for Bank database taking into account of at least 4-entities.
3) With respect to the ER model, explain with examples:
A) Strong entity B) Weak entity C) Participation constraints D) Cardinality ratio (1:1, 1:N, N:1, M:N)
E) Recursive relationship.
Strong Entity Type: A strong entity type has a key attribute of its own, so it can be identified independently. It is represented by a single-outlined rectangle.
Weak Entity Type: A weak entity type doesn't have a key attribute and can't be identified on its own. It depends upon some other strong entity for its distinct identity. A weak entity is represented by a double-outlined rectangle.

Example: If we have two tables, Customer (Customer_id, Name, Mobile_no, Age, Gender) and Address (Locality, Town, State, Customer_id), we cannot identify an address uniquely, as there can be many customers from the same locality. So for this we need an attribute of a strong entity type, here 'Customer', to uniquely identify entities of the 'Address' entity type.

Participation constraints
In a relationship, a participation constraint specifies whether the existence of an entity depends on its being related to another entity via the relationship type. It is also called the minimum cardinality constraint, and it specifies the minimum number of relationship instances that each entity must participate in.
Recursive relationship
A relationship in which the same entity type participates more than once, in different roles, is called a recursive relationship (for example, an EMPLOYEE supervises another EMPLOYEE).

Cardinality ratio:

 One to One (1:1) – “Student is allotted a project” signifies a one-to-one relationship, because one instance of an entity type is related to exactly one instance of another entity type.
 One to Many (1:N) – “A department recruits faculty” is a one-to-many relationship, because a department can recruit more than one faculty member, but a faculty member is related to only one department.
 Many to One (N:1) – “Many houses are owned by a person” is a many-to-one relationship, because a person can own many houses but a particular house is owned by only one person.
 Many to Many (M:N) – “An author writes books” is a many-to-many relationship, because an author can write many books and a book can be written by many authors.
4) Discuss naming conventions and design issues used for ER-diagram.
Naming conventions
When designing a database schema, the choice of names for entity types, attributes, relationship
types, and (particularly) roles is not always straightforward. One should choose names that
convey, as much as possible, the meanings attached to the different constructs in the schema. We
choose to use singular names for entity types, rather than plural ones, because the entity type
name applies to each individual entity belonging to that entity type. In our ER diagrams, we will
use the convention that entity type and relationship type names are uppercase letters, attribute
names have their initial letter capitalized, and role names are lowercase letters.
Design Issues
It is occasionally difficult to decide whether a particular concept in the miniworld should be
modeled as an entity type, an attribute, or a relationship type. In general, the schema design
process should be considered an iterative refinement process, where an initial design is created
and then iteratively refined until the most suitable design is reached. Some of the refinements that
are often used include the following:
A concept may first be modeled as an attribute and then refined into a relationship when it is determined that the attribute is a reference to another entity type. It is often the case that a pair of such attributes that are inverses of one another is refined into a binary relationship. Once an attribute is replaced by a relationship, the attribute itself should be removed from the entity type to avoid duplication and redundancy.

5) With neat diagram illustrate different phases of database design.


Database design is the framework that the database uses for planning, storing, and managing data in companies and organizations.
Consistency of data is achieved when the database is designed to store only useful and often most required data.
Main phases of database design are

1. Conceptual design
When every data requirement has been stored and analyzed, the next step is to create a conceptual database design using a high-level conceptual data model. This phase is called conceptual design.
While the conceptual design phase is in progress, basic data model operations can be used to define the high-level user operations identified during the analysis of functions.
2. Logical Design
The logical phase of database design is also called the data model mapping phase. This phase produces relation schemas, whose basis is the ER or class diagram.
Creating the relation schemas is a largely mechanical operation; there are rules for transforming the ER model or class diagram into relation schemas.
3. Normalization
Normalization is, in fact, the last piece of the logical design puzzle. Its main purpose is to remove redundancy and every other potential anomaly that could occur during updates. Normalization changes the relation schemas to reduce redundancy; with every normalization phase, new tables may be added to the database.
4. Physical Design
The last phase of database design is the physical design phase. In this phase we implement the database design; here, a DBMS (Database Management System) must be chosen.

6) Discuss entity and attributes, discuss different types of attribute occur in ER-diagram with
examples.
Entities and Attributes
An entity is a "thing" in the real world with an independent existence. An entity may be an object with a physical existence—a particular person, car, house, or employee.
Attributes are the particular properties that describe an entity. For example, an employee entity may be described by the employee's name, age, address, salary, and job.
Types of Attributes
Simple attribute − Simple attributes are atomic values, which cannot be divided further.
For example, a student's phone number is an atomic value of 10 digits.

Composite attribute − Composite attributes are made of more than one simple attribute. For
example, a student's complete name may have first_name and last_name.

Derived attribute − Derived attributes are the attributes that do not exist in the physical database,
but their values are derived from other attributes present in the database.
For example, average_salary in a department should not be saved directly in the database; instead, it can be derived. As another example, age can be derived from date_of_birth.

Single-valued attribute − Single-valued attributes contain a single value.
For example − Social_Security_Number.

Multi-valued attribute − Multi-valued attributes may contain more than one value. For example, a person can have more than one phone number, email address, etc.
Null Values: In some cases a particular entity may not have an applicable value for an attribute.

7) Discuss Basic steps to Conversion of ER diagram into relational schema.

The basic rules for converting ER diagrams into tables are:
Convert all the entities in the diagram to tables. All the entities represented as rectangular boxes in the ER diagram become independent tables in the database. All single-valued attributes of an entity are converted to columns of the table; every attribute whose value at any instant of time is unique is considered a column of that table.
The key attribute in the ER diagram becomes the primary key of the table.
Declare a foreign key column, if applicable.
Any multi-valued attribute is converted into a new table.
Any composite attribute is merged into the same table as separate columns.
One can ignore derived attributes, since they can be calculated at any time. A sketch of these rules follows.
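A minimal SQL sketch of these rules for a hypothetical STUDENT entity with key attribute USN, composite attribute Name (First, Last), and multi-valued attribute Phone (all names and types are assumed for illustration):

CREATE TABLE STUDENT (
  USN varchar(10) PRIMARY KEY, -- key attribute becomes the primary key
  FName varchar(20),           -- composite attribute Name is merged into
  LName varchar(20),           --   the same table as separate columns
  DeptNo int                   -- would be declared a foreign key, if a DEPARTMENT table exists
);
CREATE TABLE STUDENT_PHONE (
  USN varchar(10) REFERENCES STUDENT(USN),
  Phone varchar(15),           -- multi-valued attribute becomes a new table
  PRIMARY KEY (USN, Phone)
);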

Module-III
1) Discuss characteristics of a relation with example.

A relation has certain characteristics, which are given here.


1. Ordering of tuples in a relation: Since a relation is a set of tuples, and a set has no particular order among its elements, tuples in a relation do not have any specified order. However, tuples in a relation can be logically ordered by the values of various attributes; in that case, the information in the relation remains the same, only the order of tuples varies. Hence, tuple ordering in a relation is irrelevant.
2. Ordering of values within a tuple: An n-tuple is an ordered set of attribute values belonging to the domain D, so the order in which the values appear in the tuple is significant. However, if a tuple is defined as a set of (<attribute> : <value>) pairs, the order in which attributes appear in a tuple is irrelevant, because there is then no preference for one attribute value over another.
3. Values and nulls in the tuples: The relational model is based on the assumption that each tuple in a relation contains a single value for each of its attributes. Hence, a relation does not allow composite or multivalued attributes. However, it allows the value of an attribute to be null if the value does not exist for that attribute or is unknown.
4. No two tuples are identical in a relation: Since a relation is a set of tuples, and a set does not have identical elements, each tuple in a relation must be uniquely identified by its contents. In other words, two tuples with the same values for all attributes (that is, duplicate tuples) cannot exist in a relation.
5. Interpretation of a relation: A relation can be used to interpret facts about entities as a type of assertion. A relation can also be used to interpret facts about relationships.

2) Illustrate update operations dealing with constraint violations.
There are three basic update operations on relations: Insert, Delete, and Update (Modify).
Insert can violate any constraint: a domain constraint (if an attribute value is of the wrong data type), a key constraint (if the new key value already exists), entity integrity (if the primary key is NULL), or referential integrity (if a foreign key value refers to a tuple that does not exist).
Delete can violate only referential integrity, when the deleted tuple is referenced by foreign keys in other tuples. The options are to restrict (reject the deletion), cascade (propagate the deletion to the referencing tuples), or set the referencing foreign keys to NULL.
Update (Modify) can violate domain and NOT NULL constraints on the changed attribute; updating a primary key is similar to deleting one tuple and inserting another, and updating a foreign key can violate referential integrity.
3) Explain basic constraints in SQL.
a) NOT NULL - Ensures that a column cannot have a NULL value

EX: CREATE TABLE Persons (ID int NOT NULL, LastName varchar(255) NOT NULL, FirstName varchar(255) NOT NULL, Age int);

UNIQUE - Ensures that all values in a column are different

PRIMARY KEY - A combination of NOT NULL and UNIQUE; uniquely identifies each row in a table.

CREATE TABLE Persons (PID int NOT NULL, FName varchar(25), Age int, PRIMARY KEY (PID));

FOREIGN KEY - References the primary key of another table, enforcing referential integrity between the two tables.

CREATE TABLE Orders (O_ID int PRIMARY KEY, O_Name varchar(25), PID int,
FOREIGN KEY (PID) REFERENCES Persons(PID));

CHECK - Ensures that all values in a column satisfy a specific condition.

CREATE TABLE Persons (ID int NOT NULL, Name varchar(255), Age int, CHECK (Age >= 18));

DEFAULT - Sets a default value for a column when no value is specified

INDEX - Used to create and retrieve data from the database very quickly
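The last two entries have no examples above; minimal sketches (the column, default value, and index names are assumed for illustration):

CREATE TABLE Persons (ID int NOT NULL, Name varchar(255),
City varchar(255) DEFAULT 'Bengaluru'); -- DEFAULT: City becomes 'Bengaluru' when no value is given
CREATE INDEX idx_name ON Persons (Name); -- INDEX: speeds up retrieval of rows by Name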

4) Explain schema change statements in SQL.

Schema Change Statements in SQL


a) Drop
b) Alter

i) Drop table: It is used to remove a relation (base table) and its definition. The relation can no longer be used for querying, updates, or any other commands, since its description no longer exists.
Ex: DROP TABLE dependent;
ii) Alter table: It is used to add an attribute to one of the base relations. The new attribute will have NULLs in all tuples of the relation right after the command is executed; hence, the NOT NULL constraint is not allowed for such an attribute.
Ex: ALTER TABLE Employee ADD JOB varchar2(12);

The ALTER TABLE command adds, deletes, or modifies columns in a table.

The ALTER TABLE command also adds and deletes various constraints in a table.

Ex: ALTER TABLE Customers ADD Email varchar(25);

ALTER TABLE Customers DROP COLUMN Email;

DROP is used to delete a whole database or just a table. The DROP statement destroys the
objects like an existing database, table, index, or view.

Examples:
DROP TABLE table_name;
table_name: Name of the table to be deleted.
DROP DATABASE database_name;
database_name: Name of the database to be deleted.

5) Consider the following tables:

WORKS (PNAME, CNAME, SALARY)
LIVE (PNAME, STREET, CITY)
LOCATED_IN (CNAME, CITY)
MANAGER (PNAME, MGRNAME)
Write the SQL query for the following:
i) Find the names of all persons who live in city 'Mumbai'.
ii) Retrieve the names of all persons of 'Infosys' whose salary is between Rs. 30,000 and Rs. 50,000.
iii) Find the names of all persons who live and work in the same city.
iv) List the names of the people who work for 'Wipro' along with the cities they live in.
v) Find the average salary of all 'Infosians'.

i) Select PNAME from LIVE where CITY = 'Mumbai';

ii) Select PNAME from WORKS where CNAME = 'Infosys' and SALARY between 30000 and 50000;

iii) Select L.PNAME from LIVE L, WORKS W, LOCATED_IN LO
where L.PNAME = W.PNAME and W.CNAME = LO.CNAME and L.CITY = LO.CITY;

iv) Select W.PNAME, L.CITY from WORKS W, LIVE L
where W.PNAME = L.PNAME and W.CNAME = 'Wipro';

v) Select avg(SALARY) from WORKS where CNAME = 'Infosys';

6. Consider the following tables:

EMPLOYEE (FNAME, MINIT, LNAME, SSN, BDATE, ADDRESS, SEX, SALARY,


#SUPERSSN, #DNO)
DEPARTMENT (DNAME, DNUMBER, #MGRSSN, MGRSTARTDATE)
DEPT_LOCATIONS (#DNUMBER, DLOCATION)
PROJECT (PNAME, PNUMBER, PLOCATION, #DNUM)
WORKS_ON (#ESSN, #PNO, HOURS)
DEPENDENT (#ESSN, DEPENDENT_NAME, SEX, BDATE, RELATIONSHIP)

Retrieve the birthdate and address of the employee whose name is "John B. Smith".

Select bdate, address from Employee where fname = 'John' and minit = 'B' and lname =
'Smith' ;

Retrieve the name and address of all employees who work for the ``Research'' department.

Select fname, lname, address from Employee, Department where dname = 'Research' and dnumber = dno ;

For every project located in "Stafford", list the project number, the controlling department
number, and the department manager's last name, address birthdate.

Select pnumber, dnum, lname, address, bdate from Project, Department, Employee
where dnum = dnumber and mgrssn = ssn and plocation = 'Stafford' ;

Find the names of employees who work on *all* the projects controlled by department
number 5.

Select fname,lname from Employee where not exists ((select pnumber from Project where
dnum = 5) except (select pno from Works_On where ssn = essn)) ;

Retrieve the name of all employees who have no dependents.


Select fname,lname from Employee where not exists (Select * from Dependent where ssn =
essn) ;
Find the sum of the salaries of all the employees, and the maximum, minimum and average salaries.

Select sum(salary), max(salary), min(salary), avg(salary) from Employee ;


Retrieve the total number of employees in the "Research" department.

Select count(*) from Employee, Department where dno = dnumber and dname='Research' ;

For each dept, retrieve the dept number, the number of employees in the dept and their
average salary.

Select dno, count(*), avg(salary) from Employee group by dno ;

For each project on which more than two employees work, retrieve the project number,
the project name and the number of employees who work on that project.
Select pnumber, pname, count(*) from Project, Works_On where pnumber = pno
group by pnumber, pname having count(*) > 2 ;

For each department having more than 5 employees, retrieve the department number and the number of its employees making more than $40,000.
Select dnumber, count(*) from Department, Employee
where dnumber = dno and salary > 40000
and dno in (select dno from Employee group by dno having count(*) > 5)
group by dnumber ;
Module-IV
1) Discuss informal design guidelines of relational schema.
1. Semantics of the attributes.
2. Reducing the redundant values in tuples.
3. Reducing null values in tuples.
4. Disallowing spurious tuples.
1. Semantics of the Attributes
Whenever we form a relation schema, there should be some meaning among its attributes. This meaning, called semantics, relates one attribute to another.
Eg: STUDENT (USN No, Student name, Sem)
2. Reducing the Redundant Values in Tuples
Mixing attributes of multiple entities may cause problems:
Information is stored redundantly, wasting storage.
Problems with update anomalies:
Insertion anomalies
Deletion anomalies
Modification anomalies
Eg: STUDENT (USN No, Student name, Sem) and DEPARTMENT (Dept No, Dept Name).
If we integrate these two and use them as a single table, i.e., STUDENT (USN No, Student name, Sem, Dept No, Dept Name):
Whenever we insert tuples, there may be N students in one department, so the Dept No and Dept Name values are repeated N times, which leads to data redundancy.
Another problem is insertion anomalies: we cannot insert a new department that has no students.
If we delete the last student of a department, then the whole information about that department will be deleted (deletion anomaly).
If we change the value of one of a department's attributes, we must update the tuples of all the students belonging to that department, else the database will become inconsistent (modification anomaly).
Note: Design in such a way that no insertion, deletion, or modification anomalies occur.
3. Reducing Null Values in Tuples
Note: Relations should be designed such that their tuples have as few NULL values as possible. Attributes that are frequently NULL could be placed in separate relations (with the primary key).
Reasons for nulls:
attribute not applicable or invalid
attribute value unknown (may exist)
value known to exist, but unavailable
4. Disallowing Spurious Tuples
Bad designs for a relational database may result in erroneous results for certain JOIN operations. The "lossless join" property is used to guarantee meaningful results for join operations.
Note: Relations should be designed to satisfy the lossless join condition; no spurious tuples should be generated by doing a natural join of any relations.

2) Define functional dependency? Give axioms and inference rules of FD.


Functional Dependency

The functional dependency is a relationship that exists between two attributes. It typically
exists between the primary key and non-key attribute within a table.
Functional dependency can be written as:

Emp_Id → Emp_Name

Axioms and Inference Rules (IR):

Armstrong's axioms are the basic inference rules, used to derive functional dependencies on a relational database.
1. Reflexive Rule (IR1)
In the reflexive rule, if Y is a subset of X, then X determines Y.
If X ⊇ Y then X → Y
2. Augmentation Rule (IR2)
In augmentation, if X determines Y, then XZ determines YZ for any Z.
If X → Y then XZ → YZ
3. Transitive Rule (IR3)
In the transitive rule, if X determines Y and Y determines Z, then X must also determine Z.
If X → Y and Y → Z then X → Z

4. Union Rule (IR4)
The union rule says, if X determines Y and X determines Z, then X must also determine Y and Z together.
If X → Y and X → Z then X → YZ

3) Define normalization and discuss the different types of anomalies occur in Normalization.
Normalization
Database Normalization is a technique of organizing the data in the database.

If a database design is not perfect, it may contain anomalies, which are like a bad dream for
any database administrator. Managing a database with anomalies is next to impossible.

Update anomalies − If data items are scattered and are not linked to each other properly,
then it could lead to strange situations. For example, when we try to update one data item
having its copies scattered over several places, a few instances get updated properly while a
few others are left with old values. Such instances leave the database in an inconsistent
state.

Deletion anomalies − We try to delete a record, but parts of it are left undeleted because, unknown to us, the data is also saved somewhere else.

Insert anomalies − We try to insert data in a record that does not exist at all.

Normalization is a method to remove all these anomalies and bring the database to a
consistent state.
4) What is the need for normalization? Discuss 1st NF,2nd NF and 3rd NF with example.
The purpose of normalization.
• The problems associated with redundant data.
• The identification of various types of update anomalies such as insertion, deletion, and
modification anomalies.
• How to recognize the appropriateness or quality of the design of relations.
• The concept of functional dependency, the main tool for measuring the appropriateness
of attribute groupings in relations.

1st NF: A relation is in 1NF if it contains only atomic values; each attribute must contain only a single value from its predefined domain. A relation with multivalued or composite attributes must be rearranged to convert it to First Normal Form.

2nd NF: A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key. A relation with partial dependencies is rearranged (split) to convert it to Second Normal Form.

3rd NF: A relation is in 3NF if it is in 2NF and no transitive dependency exists.
For example, in a Student_detail (Stu_ID, Stu_Name, Zip, City) relation, Stu_ID is the key and the only prime attribute. City can be identified by Stu_ID as well as by Zip itself; Zip is not a superkey, City is not a prime attribute, and Stu_ID → Zip → City, so a transitive dependency exists.
To bring this relation into third normal form, we break it into two relations, as follows.
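A minimal SQL sketch of this decomposition (column types are assumed for illustration):

CREATE TABLE Student_Detail (
  Stu_ID int PRIMARY KEY,
  Stu_Name varchar(30),
  Zip varchar(10)  -- City is removed; it depends on Zip, not on Stu_ID directly
);
CREATE TABLE ZipCodes (
  Zip varchar(10) PRIMARY KEY,
  City varchar(30)
);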

5) Define BCNF? How does it differ from 3NF? why it’s considered stronger from 3NF.

Boyce-Codd Normal Form

Boyce-Codd Normal Form (BCNF) is an extension of Third Normal Form on strict terms. BCNF states that:

For any non-trivial functional dependency, X → A, X must be a super-key.

In the Student_Detail and ZipCodes decomposition above, Stu_ID is the super-key in the relation Student_Detail and Zip is the super-key in the relation ZipCodes. So,

Stu_ID → Stu_Name, Zip

and

Zip → City

which confirms that both relations are in BCNF.
BCNF is considered stronger than 3NF because every relation in BCNF is also in 3NF, but not vice versa: 3NF still permits a dependency X → A in which X is not a superkey, provided A is a prime attribute, whereas BCNF allows no such exception.

6) Explain the different properties of relational decomposition.
A decomposition of a relation schema should satisfy the following properties. Attribute preservation: every attribute of the original schema must appear in at least one of the decomposed relations. Dependency preservation: every functional dependency of the original schema must either appear directly in one of the decomposed relations or be inferable from the dependencies that do. Lossless (nonadditive) join: joining the decomposed relations back by natural join must generate no spurious tuples, so the original relation is recovered exactly.
7) What is multivalued dependency? What type of constraints does it specify?
Multivalued Dependency

Multivalued dependency occurs when two attributes in a table are independent of each
other but, both depend on a third attribute.
A multivalued dependency consists of at least two attributes that are dependent on a
third attribute that's why it always requires at least three attributes.

Example: Suppose there is a bike manufacturer company which produces two colors (white and black) of each model every year.
Here the columns COLOR and MANUF_YEAR are dependent on BIKE_MODEL and independent of each other.

BIKE_MODEL   MANUF_YEAR   COLOR
M2011        2008         White
M2001        2008         Black
M3001        2013         White
M3001        2013         Black
M4006        2017         White
M4006        2017         Black

In this case, these two columns are said to be multivalued dependent on BIKE_MODEL. These dependencies are represented as:

1. BIKE_MODEL →→ MANUF_YEAR
2. BIKE_MODEL →→ COLOR

This can be read as "BIKE_MODEL multidetermines MANUF_YEAR" and "BIKE_MODEL multidetermines COLOR".
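Such a multivalued dependency is removed by decomposing the relation into two projections, one per independent attribute. A minimal SQL sketch (column types are assumed for illustration):

CREATE TABLE BIKE_MODEL_YEAR (
  BIKE_MODEL varchar(10),
  MANUF_YEAR int,
  PRIMARY KEY (BIKE_MODEL, MANUF_YEAR)
);
CREATE TABLE BIKE_MODEL_COLOR (
  BIKE_MODEL varchar(10),
  COLOR varchar(10),
  PRIMARY KEY (BIKE_MODEL, COLOR)
);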

Module-V
1) Discuss ACID properties?
a. Atomicity b. Consistency c. Isolation d. Durability
Atomicity
It states that all operations of the transaction take place at once; if not, the transaction is aborted. There is no midway: a transaction cannot occur partially. Each transaction is treated as one unit and either runs to completion or is not executed at all.
Atomicity involves the following two operations:
Abort: If a transaction aborts then all the changes made are not visible.
Commit: If a transaction commits then all the changes made are visible.
Consistency
The integrity constraints are maintained so that the database is consistent before and after the
transaction.
The execution of a transaction will leave a database in either its prior stable state or a new stable
state.
The consistent property of database states that every transaction sees a consistent database
instance.
The transaction is used to transform the database from one consistent state to another consistent
state.
Isolation
It shows that the data which is used at the time of execution of a transaction cannot be used by the
second transaction until the first one is completed.
In isolation, if the transaction T1 is being executed and using the data item X, then that data item
can't be accessed by any other transaction T2 until the transaction T1 ends.
The concurrency control subsystem of the DBMS enforces the isolation property.
Durability
The durability property indicates the permanence of the database's consistent state: once a transaction completes, the changes it made are permanent.
They cannot be lost by the erroneous operation of a faulty transaction or by the system failure.
When a transaction is completed, then the database reaches a state known as the consistent state.
That consistent state cannot be lost, even in the event of a system's failure.
The recovery subsystem of the DBMS has the responsibility of Durability property.
2) With neat diagram discuss different states of transaction.

Active state

The active state is the first state of every transaction. In this state, the transaction is being executed.
For example: Insertion or deletion or updating a record is done here. But all the records are still not
saved to the database.
Partially committed
In the partially committed state, a transaction executes its final operation, but the data is still not
saved to the database.
In the total mark calculation example, a final display of the total marks step is executed in this state.
Committed
A transaction is said to be in a committed state if it executes all its operations successfully. In this
state, all the effects are now permanently saved on the database system.
Failed state
If any of the checks made by the database recovery system fails, then the transaction is said to be in
the failed state.
In the example of total mark calculation, if the database is not able to fire a query to fetch the
marks, then the transaction will fail to execute.
Aborted
If any of the checks fail and the transaction has reached a failed state then the database recovery
system will make sure that the database is in its previous consistent state. If not then it will abort or
roll back the transaction to bring the database into a consistent state.
If the transaction fails in the middle of execution, then all the operations executed so far are rolled back to restore a consistent state.
After aborting the transaction, the database recovery module will select one of the two operations:
Re-start the transaction
Kill the transaction

3) Define transaction and schedules in transaction management.


Transaction
A transaction is a set of logically related operations; it contains a group of tasks. A transaction is an action, or series of actions, performed by a single user to access the contents of the database.
Example: Suppose an employee of bank transfers Rs 800 from X's account to Y's account. This small
transaction contains several low-level tasks:
X's Account

Open_Account(X)
Old_Balance = X.balance
New_Balance = Old_Balance - 800
X.balance = New_Balance
Close_Account(X)
Y's Account

Open_Account(Y)
Old_Balance = Y.balance
New_Balance = Old_Balance + 800
Y.balance = New_Balance
Close_Account(Y)
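The same transfer written as a single SQL transaction (the Account table and its columns are assumed for illustration):

UPDATE Account SET balance = balance - 800 WHERE id = 'X'; -- debit X; the transaction starts implicitly
UPDATE Account SET balance = balance + 800 WHERE id = 'Y'; -- credit Y
COMMIT; -- make both updates permanent; ROLLBACK here would undo both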
Operations of Transaction:

Following are the main operations of transaction:

Read(X): Read operation is used to read the value of X from the database and stores it in a buffer
in main memory.

Write(X): Write operation is used to write the value back to the database from the buffer.

Schedule

A schedule is a sequence of operations from one or more transactions. It is used to preserve the order of the operations within each individual transaction.

1. serial schedule

The serial schedule is a type of schedule where one transaction is executed completely before
starting another transaction. In the serial schedule, when the first transaction completes its
cycle, then the next transaction is executed.

2. Non-serial Schedule

 If interleaving of operations is allowed, then there will be non-serial schedule.


 It contains many possible orders in which the system can execute the individual operations
of the transactions.

4) Write a note on concurrent execution of transaction.

Problems with Concurrent Execution

In a database transaction, the two main operations are READ and WRITE. These operations must be managed during concurrent execution of transactions, because if they are performed in an interleaved manner without control, the data may become inconsistent. The following problems occur with concurrent execution of the operations:

Problem 1: Lost Update Problems (W - W Conflict)

The problem occurs when two different database transactions perform the read/write
operations on the same database items in an interleaved manner (i.e., concurrent execution)
that makes the values of the items incorrect hence making the database inconsistent.

For example:

Consider two transactions TX and TY performed on the same account A, where the balance of account A is $300. Both read the balance, both compute an update from the value they read, and the later write overwrites the earlier one, so one update is lost.

Hence the data becomes incorrect, and the database becomes inconsistent.

Dirty Read Problems (W-R Conflict)

The dirty read problem occurs when one transaction updates an item of the database and then fails, and before the update is rolled back, the updated item is accessed by another transaction. This is the write-read conflict between the two transactions.

For example:

Consider two transactions TX and TY performing read/write operations on account A, where the available balance in account A is $300. TY reads the value written by TX before TX commits; when TX fails and rolls back, TY is left holding a dirty (invalid) value.
Unrepeatable Read Problem (R-W Conflict)

Also known as the Inconsistent Retrievals Problem, it occurs when, within a single transaction, two different values are read for the same database item.

For example:

Consider two transactions, TX and TY, performing read/write operations on account A, having an available balance of $300. If TX reads A twice and TY updates A between the two reads, TX sees two different values for the same item.

5) What is 2PL (2 phase locking) protocol?


Two-phase locking (2PL)

 The two-phase locking protocol divides the execution phase of a transaction into three parts.
 In the first part, when the execution of the transaction starts, it seeks permission for the locks it requires.
 In the second part, the transaction acquires all the locks. The third part starts as soon as the transaction releases its first lock.
 In the third part, the transaction cannot demand any new locks; it only releases the acquired locks.
There are two phases of 2PL:

Growing phase: In the growing phase, a new lock on the data item may be acquired by the
transaction, but none can be released.

Shrinking phase: In the shrinking phase, existing locks held by the transaction may be released, but no new locks can be acquired.

If lock conversion is allowed, then the following applies:

1. Upgrading of a lock (from S(a) to X(a)) is allowed in the growing phase.
2. Downgrading of a lock (from X(a) to S(a)) must be done in the shrinking phase.

Example:

The following shows how unlocking and locking work with 2PL, with step numbers referring to an interleaved schedule of T1 and T2.

Transaction T1:
 Growing phase: steps 1-3
 Shrinking phase: steps 5-7
 Lock point: at step 3

Transaction T2:
 Growing phase: steps 2-6
 Shrinking phase: steps 8-9
 Lock point: at step 6
6) Discuss Transaction support in SQL.
A single SQL statement is always considered to be atomic.
Either the statement completes execution without error or it fails and leaves the
database unchanged.
With SQL, there is no explicit Begin Transaction statement. Transaction
initiation is done implicitly when particular SQL statements are encountered.
Every transaction must have an explicit end statement,
COMMIT or ROLLBACK.
Characteristics specified by a SET TRANSACTION statement in SQL2:
 Access mode: READ ONLY or READ WRITE.
The default is READ WRITE, unless the isolation level READ UNCOMMITTED is specified, in which case READ ONLY is assumed.
 Diagnostic size n, specifies an integer value n, indicating the number of conditions
that can be held simultaneously in the diagnostic area.
 Isolation level <isolation>, where <isolation> can be READ UNCOMMITTED, READ
COMMITTED, REPEATABLE READ or SERIALIZABLE. The default is SERIALIZABLE.
With SERIALIZABLE: the interleaved execution of transactions
will adhere to our notion of serializability. However, if any transaction executes at a lower
level, then serializability may be violated.
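A short sketch of these statements in use (the update itself is only illustrative):

SET TRANSACTION READ WRITE,
  DIAGNOSTIC SIZE 5,
  ISOLATION LEVEL SERIALIZABLE;
UPDATE Employee SET salary = salary * 1.1 WHERE dno = 5; -- the transaction begins implicitly
COMMIT; -- every transaction ends with an explicit COMMIT or ROLLBACK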

7) Write a note on ARIES Algorithm.


The ARIES Recovery Algorithm is based on:
1. WAL (Write-Ahead Logging).
2. Repeating history during redo: ARIES will retrace all actions of the database system prior to the crash to reconstruct the database state when the crash occurred.
3. Logging changes during undo: this prevents ARIES from repeating completed undo operations if a failure occurs during recovery, which causes a restart of the recovery process.
The ARIES recovery algorithm consists of three steps:
1. Analysis: identifies the dirty (updated) pages in the buffer and the set of transactions active at the time of the crash. The appropriate point in the log where redo is to start is also determined.
2. Redo: the necessary redo operations are applied.
3. Undo: the log is scanned backwards, and the operations of transactions active at the time of the crash are undone in reverse order.
The Log and Log Sequence Number (LSN)
A log record is written for (a) a data update, (b) a transaction commit, (c) a transaction abort, (d) an undo, and (e) a transaction end. In the case of undo, a compensating log record is written.
A unique LSN is associated with every log record. LSNs increase monotonically and indicate the disk address of the log record they are associated with. In addition, each data page stores the LSN of the latest log record corresponding to a change for that page.
A log record stores:
1. The previous LSN of that transaction: it links the log records of each transaction, like a back pointer to the previous record of the same transaction.
2. The transaction ID.
3. The type of log record.
4. The page ID of the page that includes the item.
5. The length of the updated item.
6. Its offset from the beginning of the page.
7. The BFIM (before image) of the item.
8. The AFIM (after image) of the item.
Checkpointing does the following:
1. Writes a begin_checkpoint record in the log.
2. Writes an end_checkpoint record in the log. With this record, the contents of the transaction table and the dirty page table are appended to the end of the log.
3. Writes the LSN of the begin_checkpoint record to a special file. This special file is accessed during recovery to locate the last checkpoint information.
8) Write a note on checkpointing.

Checkpoint
 The checkpoint is a type of mechanism where all the previous logs are removed from the
system and permanently stored in the storage disk.
 The checkpoint is like a bookmark. During the execution of a transaction, such checkpoints are marked, and as the transaction executes, log files are created for its steps.
 When execution reaches a checkpoint, the updates logged so far are written into the database, and the log file up to that point is removed. The log file is then updated with the new transaction steps until the next checkpoint, and so on.
 The checkpoint is used to declare a point before which the DBMS was in the consistent state,
and all transactions were committed.

Recovery using Checkpoint


In the following manner, a recovery system recovers the database from this failure:
 The recovery system reads the log files from the end to the start, i.e., from T4 to T1.
 The recovery system maintains two lists: a redo-list and an undo-list.
 A transaction is put into the redo-list if the recovery system sees a log with <Tn, Start> and <Tn, Commit>, or just <Tn, Commit>. All transactions in the redo-list are redone from their logs.
 For example: in the log file, transactions T2 and T3 will have <Tn, Start> and <Tn, Commit>, and the T1 transaction will have only <Tn, Commit>, because it committed after the checkpoint was crossed. Hence transactions T1, T2 and T3 are put into the redo-list.
 A transaction is put into the undo-list if the recovery system sees a log with <Tn, Start> but no commit or abort record. All transactions in the undo-list are undone, and their logs are removed.

9) What is serializability? How can serializability be ensured? Justify your answer.

Serializability of Schedules
 If no interleaving of operations is permitted, there are only two possible arrangements for transactions T1 and T2:
 Execute all the operations of T1 (in sequence) followed by all the operations of T2 (in sequence).
 Execute all the operations of T2 (in sequence) followed by all the operations of T1 (in sequence).
 A schedule S is serial if, for every transaction T participating in it, all the operations of T are executed consecutively in the schedule.
 A schedule S of n transactions is serializable if it is equivalent to some serial schedule of the same n transactions. Serializability can be ensured by concurrency control protocols such as two-phase locking (question 5 above), which allow only serializable schedules.
Figure: Examples of serial and nonserial schedules involving transactions T1 and T2. (a) Serial schedule A: T1 followed by T2. (b) Serial schedule B: T2 followed by T1. (c) Two nonserial schedules C and D with interleaving of operations.