Assignment Set – 1

Database Management System (DBMS and Oracle 9i)

1. Explain the functions and advantages of a DBMS over a traditional file system.
Ans:-
A DBMS is a set of software programs that controls the organization,
storage, management, and retrieval of data in a database. DBMSs are
categorized according to their data structures or types. The DBMS
accepts requests for data from an application program and instructs the
operating system to transfer the appropriate data. The queries and
responses must be submitted and received according to a format that
conforms to one or more applicable protocols. When a DBMS is used,
information systems can be changed more easily as the organization's
information requirements change. New categories of data can be added
to the database without disruption to the existing system.
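For example, a new category of data can often be introduced simply by adding a column, without disturbing existing applications (a minimal sketch, assuming a hypothetical customer table):

-- existing programs that name their columns explicitly keep working unchanged
ALTER TABLE customer ADD (loyalty_tier VARCHAR2(20));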

Advantages of a DBMS:-

• Management of distributed data with different levels of transparency.


• Increased reliability and availability.
• Easier expansion.
• Reflects organizational structure - database fragments are located in the
departments they relate to.
• Local autonomy - a department can control its own data, since it is the one
most familiar with it.
• Protection of valuable data - if there were ever a catastrophic event
such as a fire, all of the data would not be in one place, but distributed in
multiple locations.
• Improved performance - data is located near the site of greatest
demand, and the database systems themselves are parallelized, allowing
load on the databases to be balanced among servers. (A high load on one
module of the database won't affect other modules of the database in a
distributed database.)
• Economics - it costs less to create a network of smaller computers with
the power of a single large computer.
• Modularity - systems can be modified, added and removed from the
distributed database without affecting other modules (systems).
• Reliable transactions - due to replication of the database.
• Hardware, Operating System, Network, Fragmentation, DBMS,
Replication and Location Independence.
• Continuous operation.

2. Describe indexing and clustering techniques with relevant real time examples.
Ans:-
The most important characteristics of column data are the clustering factor
for the column and the selectivity of column values, even though other
important characteristics within tables are available to the CBO. A column
called clustering_factor in the dba_indexes view offers information on how
well the table rows are synchronized with the index. When the clustering
factor is close to the number of data blocks, the table rows are synchronized
with the index; when the clustering_factor approaches the number of rows in
the table, the rows are not ordered by the column value.
To illustrate this, consider the following query, for which the choice of
access path depends on how the customer_state values are clustered:

SELECT customer_name
FROM customer
WHERE customer_state = 'New Mexico';

An index scan is faster for this query if the percentage of customers in New
Mexico is small and the values are clustered on the data blocks. The decision
to use an index versus a full-table scan is at least partially determined by
the percentage of customers in New Mexico. So, why would a CBO choose to
perform a full-table scan when only a small number of rows are retrieved? The
clustering_factor has the answer.
Four factors work together to help the CBO choose whether to use an index or
a full-table scan: the selectivity of a column value, the db_block_size, the
avg_row_len, and the cardinality. An index scan is usually faster if a data
column has high selectivity and a low clustering_factor.
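A quick way to see these statistics is to query the data dictionary (a minimal sketch, assuming an index exists on the hypothetical customer table and that optimizer statistics have been gathered):

-- clustering_factor close to the number of table blocks = well clustered;
-- clustering_factor close to num_rows = poorly clustered
SELECT index_name, clustering_factor, num_rows
FROM dba_indexes
WHERE table_name = 'CUSTOMER';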

3. Describe various integrity rules with a relevant example.


Ans:-
It is important that data adhere to a predefined set of rules, as
determined by the database administrator or application developer. As
an example of data integrity, consider the tables employees and
departments and the business rules for the information in each of the
tables.
Types of Data Integrity

This section describes the rules that can be applied to table columns to
enforce different types of data integrity.

Null Rule
A null rule is a rule defined on a single column that allows or disallows
inserts or updates of rows containing a null (the absence of a value) in
that column.

Unique Column Values


A unique value rule defined on a column (or set of columns) allows the insert
or update of a row only if it contains a unique value in that column (or
set of columns).
Primary Key Values
A primary key rule defined on a key (a column or set of columns)
specifies that each row in the table can be uniquely identified by the
values in the key.

Referential Integrity Rules


Referential integrity is a rule defined on a key (a column or set of columns)
in one table that guarantees that the values in that key match the values in
a key in a related table (the referenced value). Referential integrity also
includes the rules that dictate what types of data manipulation are allowed
on referenced values and how these actions affect dependent values.

The rules associated with referential integrity are:


• Restrict: Disallows the update or deletion of referenced data.
• Set to Null: When referenced data is updated or deleted, all associated
dependent data is set to NULL.
• Set to Default: When referenced data is updated or deleted, all
associated dependent data is set to a default value.
• Cascade: When referenced data is updated, all associated dependent
data is correspondingly updated. When a referenced row is deleted, all
associated dependent rows are deleted.

Complex Integrity Checking


Complex integrity checking is a user-defined rule for a column (or set of
columns) that allows or disallows inserts, updates, or deletes of a row
based on the value it contains for the column (or set of columns).
Integrity Constraints Description
An integrity constraint is a declarative method of defining a rule for a
column of a table.

Oracle supports the following integrity constraints:


• NOT NULL constraints for the rules associated with nulls in a column
• UNIQUE key constraints for the rule associated with unique column
values
• PRIMARY KEY constraints for the rule associated with primary
identification values
• FOREIGN KEY constraints for the rules associated with referential
integrity. Oracle supports the use of FOREIGN KEY integrity constraints to
define the referential integrity actions, including:
 Update and delete NO ACTION
 Delete CASCADE
 Delete SET NULL
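A minimal DDL sketch showing these constraints together, assuming hypothetical departments and employees tables:

CREATE TABLE departments (
  dept_id   NUMBER       PRIMARY KEY,        -- primary key rule
  dept_name VARCHAR2(30) NOT NULL UNIQUE     -- null rule and unique rule
);

CREATE TABLE employees (
  emp_id   NUMBER       PRIMARY KEY,
  emp_name VARCHAR2(50) NOT NULL,
  salary   NUMBER       CHECK (salary > 0),  -- complex integrity checking
  dept_id  NUMBER       REFERENCES departments (dept_id)
                        ON DELETE CASCADE    -- referential integrity action
);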

4. Explain the three-level architecture of a DBMS with a labeled diagram.


Ans:-
  Data are actually stored as bits, or numbers and strings, but it is
difficult to work with data at this level.
It is necessary to view data at different levels of abstraction.
Schema:
• Description of data at some level. Each level has its own schema.
We will be concerned with three forms of schemas:
 physical - describes how the data is actually stored (files, records, indexes),
 conceptual - describes the logical structure of the entire database, and
 external - describes the individual user views of the database.
5. Describe the relational algebra operations with relevant real time
examples.
Ans:-
In order to implement a DBMS, there must exist a set of rules which
state how the database system will behave. For instance, somewhere in
the DBMS must be a set of statements which indicate that when
someone inserts data into a row of a relation, it has the effect which the
user expects. One way to specify this is to use words to write an 'essay'
as to how the DBMS will operate, but words tend to be imprecise and
open to interpretation. Instead, relational databases are more usually
defined using Relational Algebra.

Relational Algebra is:


• the formal description of how a relational database operates
• an interface to the data stored in the database itself 
• the mathematics which underpin SQL operations
Operators in relational algebra are not necessarily the same as SQL
operators, even if they have the same name. For example, the SELECT
statement exists in SQL, and also exists in relational algebra. These two
uses of SELECT are not the same. The DBMS must take whatever SQL
statements the user types in and translate them into relational algebra
operations before applying them to the database.

One fundamental relational algebra operation is the Cartesian product R × S,
which pairs every tuple of R with every tuple of S. More formally, R × S is
defined as follows:

R × S = { (r1, r2, ..., rn, s1, s2, ..., sm) | (r1, r2, ..., rn) ∈ R, (s1, s2, ..., sm) ∈ S }
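As a sketch of how these algebra operations surface in SQL (assuming two hypothetical relations r and s that share a column id), the Cartesian product and a selection over it can be written as:

-- Cartesian product: every row of r paired with every row of s
SELECT * FROM r CROSS JOIN s;

-- selection over the product restricted to matching rows,
-- which is how a join is built from product plus selection
SELECT * FROM r, s WHERE r.id = s.id;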
6. Write about the database system environment.
Ans:-
 
Database Management System (DBMS) is a set of computer
programs that controls the creation, maintenance, and the use of a
database. It allows organizations to place control of database
development in the hands of database administrators (DBAs) and
other specialists. A DBMS is a system software package that facilitates
the use of an integrated collection of data records and files known as
databases. It allows different user application programs to easily
access the same database. DBMSs may use any of a variety
of database models, such as the network model or the relational model.
In large systems, a DBMS allows users and other software to
store and retrieve data in a structured way. Instead of having to write
computer programs to extract information, users can ask simple
questions in a query language. Thus, many DBMS packages
provide fourth-generation programming languages (4GLs) and other
application development features. It helps to specify the logical
organization for a database and access and use the information within a
database. It provides facilities for controlling data access, enforcing data
integrity, managing concurrency, and restoring the database from
backups. A DBMS also provides the ability to logically
present database information to users.
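For instance, instead of writing a program, a user can retrieve data with one declarative statement (a minimal sketch, assuming a hypothetical employees table):

-- ask for the names of all employees in department 10
SELECT emp_name
FROM employees
WHERE dept_id = 10;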

7. Describe Entity Types, Entity Sets, Attributes and Keys.


Ans:-
Entity Types & Sets:-
An entity is a term from the entity-relationship model. A relational
model (your database schema) is one of the ways to implement the
ER model. Relational tables represent relations between simple types
like integers and strings, which, in their turn, can represent everything:
entities, attributes, relationships.
• An entity is any object in the system that we want to model and store
information about
• Individual objects are called entities
• Groups of the same type of objects are called entity types or entity sets
• Entities are represented by rectangles (either with round or square
corners)

Attribute:-
• All the data relating to an entity is held in its attributes.
• An attribute is a property of an entity.
• Each attribute can have any value from its domain.
• Each entity within an entity type:
 May have any number of attributes.
 Has the same set of attributes as every other entity of the same type.
 Can have attribute values different from those of any other entity.
•Attributes can be
 simple or composite
 single-valued or multi-valued
• Attributes can be shown on ER models
• They appear inside ovals and are attached to their entity.
• Note that entity types can have a large number of attributes... If all are
shown then the diagrams would be confusing. Only show an attribute if
it adds information to the ER diagram, or clarifies a point.
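As a sketch of how an entity type and its attributes end up in a relational schema (assuming a hypothetical Student entity type with simple, single-valued attributes):

-- each row is one entity; each column holds one attribute value
CREATE TABLE student (
  student_id NUMBER       PRIMARY KEY,   -- key attribute
  name       VARCHAR2(50),
  birth_date DATE
);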
8. Explain the following database operations with one query example for
each:
A) Insert:- INSERT statements have the following form:
INSERT INTO table (column1, [column2, ...]) VALUES (value1,
[value2, ...])
The number of columns and values must be the same. If a column is not
specified, the default value for the column is used. The values specified
(or implied) by the INSERT statement must satisfy all the applicable
constraints (such as primary keys, CHECK constraints, and NOT
NULL constraints). If a syntax error occurs or if any constraints are
violated, the new row is not added to the table and an error is returned
instead.
Example:
INSERT INTO phone_book (name, number) VALUES ('John Doe',
'555-1212');
B) Delete: - The DELETE statement follows the syntax:
DELETE FROM table_name [ WHERE condition]
Any rows that match the WHERE condition will be removed from the
table. If the WHERE clause is omitted, all rows in the table are removed.
The DELETE statement does not return any rows; that is, it will not
generate a result set.
Executing a DELETE statement can cause triggers to run that can cause
deletes in other tables. For example, if two tables are linked by a foreign
key and rows in the referenced table are deleted, then it is common that
rows in the referencing table would also have to be deleted to maintain
referential integrity.
Delete rows from mytable using a subquery in the where condition:
DELETE FROM mytable WHERE id IN (SELECT id FROM
mytable2);
C) Update:-
 An SQL UPDATE statement changes the data of one or more records in
a table. Either all the rows can be updated, or a subset may be chosen
using a condition.
The UPDATE statement has the following form:
UPDATE table_name SET column_name = value [, column_name =
value ...] [ WHERE condition]
For the UPDATE to be successful, the user must have data manipulation
privileges (the UPDATE privilege) on the table or column, and the updated
value must not conflict with any of the applicable constraints (such as
primary keys, unique indexes, CHECK constraints, and NOT NULL
constraints).
Set the value of column C1 in table T to 1, only in those rows where the
value of column C2 is 'a':
UPDATE T SET C1 = 1 WHERE C2 = 'a';
Assignment Set – 2
Database Management System (DBMS and Oracle 9i)

1. Describe the following normalization techniques:


A) Third Normal Form:-
 The normal forms (abbrev. NF) of relational database theory provide
criteria for determining a table's degree of vulnerability to logical
inconsistencies and anomalies. The higher the normal form applicable to
a table, the less vulnerable it is to inconsistencies and anomalies. Each
table has a "highest normal form" (HNF): by definition, a table always
meets the requirements of its HNF and of all normal forms lower than its
HNF; also by definition, a table fails to meet the requirements of any
normal form higher than its HNF.
The normal forms are applicable to individual tables; to say that an
entire database is in normal form n is to say that all of its tables are in
normal form n. A table is in Third Normal Form (3NF) when every non-prime
attribute is non-transitively dependent on every candidate key in the table.
B) Boyce-Codd Normal Form:-
Newcomers to database design sometimes suppose that
normalization proceeds in an iterative fashion, i.e. a 1NF design is first
normalized to 2NF, then to 3NF, and so on. This is not an accurate
description of how normalization typically works. A sensibly designed
table is likely to be in 3NF on the first attempt; furthermore, if it is 3NF,
it is overwhelmingly likely to have an HNF of 5NF. Achieving the
"higher" normal forms (above 3NF) does not usually require an extra
expenditure of effort on the part of the designer, because 3NF tables
usually need no modification to meet the requirements of these higher
normal forms. A table is in Boyce-Codd Normal Form (BCNF) when every
non-trivial functional dependency in the table is a dependency on a superkey.
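As an illustrative sketch (assuming a hypothetical orders design in which customer_city depends on customer_id, which in turn depends on order_id), a 3NF decomposition removes the transitive dependency by splitting the table:

-- not in 3NF: order_id -> customer_id -> customer_city (transitive dependency)
CREATE TABLE orders_flat (
  order_id      NUMBER PRIMARY KEY,
  customer_id   NUMBER,
  customer_city VARCHAR2(40)
);

-- 3NF decomposition: the transitively dependent attribute moves to its own table
CREATE TABLE customers (
  customer_id   NUMBER PRIMARY KEY,
  customer_city VARCHAR2(40)
);

CREATE TABLE orders (
  order_id    NUMBER PRIMARY KEY,
  customer_id NUMBER REFERENCES customers (customer_id)
);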

2. Describe the theory of Discretionary Access control based on granting and revoking privileges.
Ans:-
Discretionary access control is based on the idea of access rights, or privileges,
and mechanisms for giving users such privileges. A privilege allows a user to
access some data object in a certain manner (e.g. to read or modify). A user who
creates data object such as a table or a view automatically gets all applicable
privileges on that object and the user can also propagate privileges using "Grant
Option". The DBMS subsequently keeps track of how these privileges are granted
to other users, and possibly revoked, and ensures that at all times only users with
the necessary privileges can access an object.
SQL Syntax SQL supports discretionary access control through the GRANT and
REVOKE commands.
The GRANT command gives users privileges to base tables and views.
The REVOKE command cancels users' privileges.
For example, the general syntax and a concrete example:
GRANT privilege1, privilege2, ...
ON object_name
TO user1, user2, ...;

REVOKE privilege1, privilege2, ...
ON object_name
FROM user1, user2, ...;

GRANT SELECT, ALTER
ON student
TO db2_14;

REVOKE SELECT, ALTER
ON student
FROM db2_14;
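The "Grant Option" mentioned above lets a grantee pass a privilege on to other users; a brief sketch, assuming hypothetical users db2_14 and db2_15:

GRANT SELECT
ON student
TO db2_14 WITH GRANT OPTION;

-- db2_14 can now, in turn, grant SELECT on student to db2_15
GRANT SELECT
ON student
TO db2_15;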

3. Write about Encryption and Public Key Infrastructures.


Ans:-
Encryption Algorithm:-
Input: an encryption key K (one key for the entire database), a page of
plaintext values (P bytes), and a random permutation (function) associated
with the page (one for each page), perm: {1, ..., P} → {1, ..., P}.
Output: a page of ciphertext values.
We consider and encrypt each byte of the page separately:
For i = 1 to P,
(1) Let di = perm(i) mod |K|, which is clearly in the range [0, |K| - 1].
Public Key Infrastructure (PKI) is a set of hardware, software, people,
policies, and procedures needed to create, manage, distribute, use,
store, and revoke digital certificates.[1] In cryptography, a PKI is an
arrangement that binds public keys with respective user identities by
means of a certificate authority (CA). The user identity must be
unique within each CA domain. The binding is established through
the registration and issuance process, which, depending on the level
of assurance the binding has, may be carried out by software at a CA,
or under human supervision. The PKI role that assures this binding is
called the Registration Authority (RA). For each user, the user
identity, the public key, their binding, validity conditions and other
attributes are made unforgeable in public key certificates issued by
the CA. The term trusted third party (TTP) may also be used
for certificate authority (CA). The term PKI is sometimes erroneously
used to denote public key algorithms, which do not require the use of
a CA.

4. Describe the categories of Informix Universal Server.


Ans:-
This used to be where I'd let off steam after uncovering a nasty bug in
the Illustra object-relational database management system. And, in fact,
sometimes I reached such poetic heights of vitriol that I'm leaving the
old stuff at the bottom (also, it might be useful if you are still running
Illustra for some reason).
However, there really aren't any good reasons to pick on Illustra
anymore. The company was bought by Informix, one of the "big three"
traditional RDBMS vendors (Oracle and Sybase being the other two).
Informix basically folded the interesting features of the old Illustra
system into their industrial-strength enterprise-scale RDBMS and calls
the result "Informix Universal Server" (IUS). To the extent that IUS is
based on old code, it is based on Informix's tried and true Online Server,
which has been keeping banks and insurance companies with thousands
of simultaneous users up and running for many years.
I plan to be experimenting with IUS in some heavily accessed sites
during the latter portion of 1997. I'm going to record my experiences
here and hope to have lots of tips and source code to distribute.
5. Describe the following concepts in the context of Joins:
A). Keys:-
A key is an attribute (also known as a column or field) or a combination of
attributes that is used to identify records. The purpose of the key is to
bind data together across tables without repeating all of the data in every
table.
(I) Super Key – 
An attribute or a combination of attributes that is used to identify the
records uniquely is known as a Super Key. A table can have many Super
Keys.
(II) Candidate Key – 
It can be defined as a minimal or irreducible Super Key. In other words,
it is an attribute or a combination of attributes that identifies the
records uniquely, but none of whose proper subsets can identify the records
uniquely.
(III) Primary Key – 
A Candidate Key that is used by the database designer for unique
identification of each row in a table is known as Primary Key. A
Primary Key can consist of one or more attributes of a table.
(IV) Foreign Key – 
A foreign key is an attribute or combination of attributes in one base table
that points to the candidate key (generally the primary key) of
another table. The purpose of the foreign key is to ensure referential
integrity of the data, i.e. only values that are supposed to appear in the
database are permitted.
(V) Composite Key – 
If we use multiple attributes to create a Primary Key then that Primary
Key is called Composite Key.
(VI) Alternate Key – 
Alternate Key can be any of the Candidate Keys except for the Primary
Key.
(VII) Secondary Key – 
The attributes that are not even the Super Key but can be still used for
identification of records (not unique) are known as Secondary Key.
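A minimal DDL sketch of several of these key types, assuming hypothetical student, course and enrollment tables:

CREATE TABLE student (
  student_id NUMBER       PRIMARY KEY,     -- primary key (the chosen candidate key)
  email      VARCHAR2(80) NOT NULL UNIQUE  -- alternate key (candidate key not chosen)
);

CREATE TABLE course (
  course_id NUMBER PRIMARY KEY
);

CREATE TABLE enrollment (
  student_id NUMBER REFERENCES student (student_id),  -- foreign key
  course_id  NUMBER REFERENCES course (course_id),    -- foreign key
  PRIMARY KEY (student_id, course_id)                 -- composite primary key
);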
B. Performing a Join :-
A SQL JOIN clause combines records from two or more tables in a
database.[1]  It creates a set that can be saved as a table or used as is. A
JOIN is a means for combining fields from two tables by using values
common to each. ANSI-standard SQL specifies four types of JOIN:
INNER, LEFT OUTER, RIGHT OUTER, and FULL OUTER. In special cases, a table (base
table, view, or joined table) can JOIN to itself in a self-join. A
programmer writes a JOIN predicate to identify the records for joining.
If the evaluated predicate is true, the combined record is then produced
in the expected format, a record set or a temporary table.
The explanations of join types below conventionally use two tables, Employee
and Department, whose rows serve to illustrate the effect of different types
of joins and join-predicates. In these tables the DepartmentID column of the
Department table (which can be designated as Department.DepartmentID) is the
primary key, while Employee.DepartmentID is a foreign key, as in the sketch
that follows.
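A minimal inner-join sketch, assuming hypothetical Employee and Department tables with the keys described above (column names such as LastName and DepartmentName are illustrative):

-- list each employee with the name of his or her department;
-- employees whose DepartmentID has no match are excluded by the inner join
SELECT e.LastName, d.DepartmentName
FROM Employee e
INNER JOIN Department d
  ON e.DepartmentID = d.DepartmentID;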
C. Distinct and Eliminating Duplicates:-
A duplicate is a record in which every field is identical to every field in a
different record, i.e. there is no way of telling two or more such records
apart. If you just need to remove records which are similar (i.e. one or more
fields are identical but there are one or more fields which are different),
then refer instead to how to delete similar records. To check that you have
duplicate records in your table, compare the results of the following two
queries:
select count(*) from MyTable
and
select distinct * from MyTable
If the first query returns a larger number than the number of rows returned
by the second, the table contains duplicates.
Unfortunately there is no way in SQL to delete one of these duplicates
without deleting all of them. They are identical after all, so there is no
SQL query that you could put together which could distinguish between
them.
What you can do is to copy all the distinct records into a new table:
select distinct *
into NewTable
from MyTable
This query will create a new table (NewTable in my example)
containing all the records in the original table but without any records
being duplicated. It will therefore preserve a single copy of those records
which were duplicated.
6. Explain the E-R to relational mapping with a suitable example.
Ans:-
For each regular entity type E in the ER schema,
– create a relation R that includes all the simple attributes of E
– include only simple component attributes of composite attribute
– choose one of the key attributes of E as primary key for R 
• Ex. Employee, Department, Project relations
– primary key :
• Employee(SSN), Department(DNUMBER), Project(PNUMBER) - see the DDL sketch below.
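A minimal DDL sketch of this mapping step, assuming only simple attributes (the non-key column names are illustrative):

CREATE TABLE employee (
  ssn   CHAR(9)      PRIMARY KEY,   -- chosen key attribute of Employee
  fname VARCHAR2(30),
  lname VARCHAR2(30)
);

CREATE TABLE department (
  dnumber NUMBER       PRIMARY KEY, -- chosen key attribute of Department
  dname   VARCHAR2(30)
);

CREATE TABLE project (
  pnumber NUMBER       PRIMARY KEY, -- chosen key attribute of Project
  pname   VARCHAR2(30)
);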
Upon completion of this module, you will be introduced to the following
 The use of high-level conceptual data models to support database
design.
 The basic concepts associated with the Entity–Relationship (ER)
model, a high-level conceptual data model.
 A diagrammatic technique for displaying an ER model.
 How to identify problems called connection traps, which may
occur when creating an ER model.
 The limitations of the basic ER modeling concepts and the
requirements to model more complex applications using enhanced
data modeling concepts.
 The main concepts associated with the Enhanced Entity–
Relationship (EER) model called specialization/generalization and
categorization.
7. Describe the Concurrency Control Techniques in a DBMS.
Ans:-
Concurrency control in Database management systems (DBMS;
Bernstein et al. 1987, Weikum and Vossen 2001), other transactional
objects, and related distributed applications (e.g., Grid computing and
Cloud computing) ensures that database transactions are performed
concurrently without violating the data integrity of the respective
databases. Thus concurrency control is an essential element for
correctness in any system where two or more database transactions,
executed with time overlap, can access the same data, e.g., virtually in
any general-purpose database system. A well established concurrency
control theory exists for database systems: serializability theory, which
allows effectively designing and analyzing concurrency control methods
and mechanisms.
To ensure correctness, a DBMS usually guarantees that only
serializable transaction schedules are generated, unless serializability is
intentionally relaxed. For maintaining correctness in cases of failed
(aborted) transactions (which can always happen for many reasons)
schedules also need to have the recoverability property. A DBMS also
guarantees that no effect of committed transactions is lost, and no effect
of aborted (rolled back) transactions remains in the related database.
Overall transaction characterization is usually summarized by the
following ACID rules.
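As a brief sketch of how an application requests serializable execution (assuming Oracle-style SQL and a hypothetical accounts table):

-- run the following statements as one serializable transaction
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;
UPDATE accounts SET balance = balance + 100 WHERE account_id = 2;

COMMIT;  -- make the effects permanent; ROLLBACK would undo them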
8. Explain the following with respect to files:
A) Sorted Files:-
 Many alternatives exist, each with its strengths and weaknesses:
 Heap (random order) files: Suitable when typical access is a file
scan retrieving all records.
 Sorted Files: Best if records must be retrieved in some order, or
only a 'range' of records is needed.
 Indexes: Data structures to organize records via trees or hashing.
 Like sorted files, they speed up searches for a subset of records,
based on values in certain (“search key”) fields
 Updates are much faster than in sorted files.
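For example, an index can be created on a search-key field so that equality and range searches need not scan the whole file (a minimal sketch, assuming a hypothetical employees table):

-- B-tree index on the search key
CREATE INDEX emp_salary_idx ON employees (salary);

-- this range query can now use the index instead of a full scan
SELECT emp_name FROM employees WHERE salary BETWEEN 40000 AND 50000;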
B) Heap Files:-
Rows are simply appended to the end of the file as they are inserted; hence
the file is unordered. Deleted rows will create gaps in the file, which must
then be periodically compacted to recover space.
Heap File – Performance (where F is the number of pages in the file):-
Inserting a row:
•Access path is to append to the end
•Retrieve and store one page
Updating a row:
•Access path is to scan whole file
•Avg. F/2 page transfers if row already exists
• F+1 page transfers if row does not already exist
Deleting a row:
•Access path is scan
•Avg. F/2+1 page transfers if row exists
• F page transfers if row does not exist
