Chapter 1

Introduction to DBMS
1.1 INTRODUCTION
The database management system (DBMS) has evolved from a specialized computer application into a
central component of the computing environment. A database system plays a vital role in
organizing data about a particular enterprise. Consider the example of a company that
stores data about the following:
Departments (Department_ID, Department_Name, Location)
Employees (Employee_ID, Employee_Name, Address, Salary, Department_ID)
Project (Project_Name, Project_ID, Department_ID)
These may have the following relationships:
An Employee works in a Department.
An Employee works on many Projects.
A Department handles many Projects.
Therefore, a system is needed that can organize the data effectively and also use it to
analyze and guide the operations of the company.
Nowadays, the amount of information to be stored is growing enormously, and so is the
need for a flexible and powerful system that can not only organize and maintain
a large collection of data effectively but also provide easy
access to that data.

1.2 CONCEPT OF DBMS


This section covers basic definitions related to DBMS and explains its components.
A database management system consists of two components:
(i) Database, i.e. a collection of data
(ii) System, i.e. a set of programs used to access and manage the database.
By combining these two components, a DBMS organizes information, maintains it and
retrieves it efficiently as and when required. A set of programs is therefore needed both to
use and understand the system and to maintain the database.
1.2.1 Concept of Data and Database
The word data is the plural of the Latin word ‘datum’, which means ‘something given’; thus data are
given facts from which additional facts can be inferred. Data are raw facts
used in computations or calculations. For example, the facts related
to an employee in a company, such as EmployeeID, Name, Salary and Designation, are data.
When these data are retrieved or processed to answer questions like:
What is the EmployeeID of each employee whose salary is more than 10,000?
What is the name of the employee whose EmployeeID is 8637?
the result is information. Thus, information is a processed form of data, and a database
is a logically coherent collection of data (not of information) with some inherent meaning.
A database is a collection of interrelated data that represents some aspect of the real world.
A database has some inherent meaning and is related to a particular group of users or
applications. For example, the database of a college may contain data about students,
faculty, courses etc., which are related to each other through relationships such as: faculty
teach students, and students are enrolled in courses. Thus, we can say that a database
contains the data related to a real-world enterprise, and is designed, built and populated
with data for specific applications related to that enterprise.
1.2.2 Definition of DBMS
A database management system (DBMS) is essentially a collection of interrelated data and
a set of programs to access this data. This collection of data is usually called the database.
Database systems are designed to maintain large volumes of data. Management of data
involves:
• Defining the structures for the storage of data.
• Providing the mechanisms for the security of data against unauthorized access.
The primary objective of a DBMS is to provide an environment that is convenient
and efficient to use, in retrieving information from and storing information into the
database.
The user of the DBMS is provided the following facilities among others:
• Adding empty files to the database.
• Inserting new data into the existing files.
• Retrieving data from the files.
• Updating data in the files.
• Deleting data from the files.
• Removing existing files from the database.
Therefore, besides storing data, a DBMS serves several other purposes, which are as
follows:
(i) Efficient access to data.
(ii) Avoiding data redundancy and inconsistency.
(iii) Providing security of data.
(iv) Enforcing different integrity constraints.
(v) Providing access for data to multiple users concurrently.

1.3 HISTORY OF DBMS


From the earliest days of computing, most software applications have focused on the
manipulation of data, so a need arose for a system that helps in storing and
manipulating data. The first general-purpose DBMS was designed by Charles Bachman
in the early 1960s and was called the Integrated Data Store. It formed the basis for the network
data model and influenced database systems through the 1960s. IBM began developing the
Information Management System (IMS) in 1966; it became one of the first commercially available
DBMSs. IMS is based on the hierarchical data model and assumes all data relationships to be
structured as hierarchies. The Conference on Data Systems Languages (CODASYL) set standards for
network database products in 1969.
Dr. E. F. Codd, an IBM researcher, proposed the relational data model in a theoretical paper in
1970. The publication of Codd’s paper initiated a great deal of activity in
both the research and the commercial system development communities, which worked to bring
out a relational DBMS. IBM developed a relational model prototype in 1976. In the 1980s,
the relational model became the standard approach for DBMS. SQL was developed
as a part of IBM’s System R project and became a standard query language. IBM
released its first commercially available database product based on the relational model, SQL/DS,
for interactive operating system in 1981. SQL was standardized and adopted as a query
language by ANSI and ISO. Many developments took place in the 1980s and 1990s in
the area of database systems, including the release of Paradox, dBase, FoxPro and
Access. Researchers also worked to develop more powerful and richer data models
that can support complex data types.
Later on, Enterprise Resource Planning (ERP) and Manufacturing Resource Planning (MRP)
packages evolved. Both identify a set of common tasks of a large organization, e.g. human resource
planning and inventory management, and provide a general
application layer to carry out these tasks. Thereafter, DBMS entered the
age of the internet: data stored in a DBMS can now be accessed with the help of web browsers
from anywhere and at any time. Stored data is provided on the web in the form of
HTML and XML documents. Around 2000, the fashionable area for innovation was the XML
database. XML databases aim to remove the traditional division between documents and
data, allowing an organization’s information resources to be held in one place, whether they
are highly structured or not. Database vendors continue to develop more advanced DBMSs
that can support complex data such as video, streaming data and digital libraries on the web.
Thus, the database system evolved from sequential file access to the object-oriented
database systems in use today.

1.4 FILE SYSTEM V/S DBMS


Initially, the computer systems used by an enterprise mainly performed data processing tasks,
i.e. inserting information about employees, retrieving information about the employees of a
particular department, performing accounting functions on employee salaries etc. Since these
systems performed routine record-keeping functions, they were called data processing
systems. Thus, a data processing system is an automated system for processing the data of an
organization. The conventional data processing approach is to develop a program (or many
programs) for each application. This results in one or more data files for each application
(fig. 1.1). Some of the data may be common between files. A major drawback of the
conventional method is that the storage and access techniques are built into the program.
Therefore, even though the same data may be required by two applications, the data will have to
be stored in two different places, because each application depends on the way that its data
is stored.
There are various drawbacks of the conventional data file processing environment. Some
of them are listed below.

(i) Data Redundancy and inconsistency


Some data elements, such as name, address and identification code, are used in various
applications. Since the data is required by multiple applications, it is stored in multiple data
files. In most cases, the same data is repeated across files. This is referred to as data redundancy,
and it leads to higher storage and access cost. In addition, it may lead to data inconsistency;
that is, the various copies of the same data may no longer agree.

(ii) Difficulty in accessing the data


Suppose one of the university departments needs to find the names of all students
who live in a particular city, but the software as originally designed does not provide any
report of this kind. There are two ways to meet this requirement:
either obtain the list of all students and extract the required information manually, or
ask a programmer to write the necessary application program. Both
alternatives are unsatisfactory. Conventional file-processing environments therefore
do not allow needed data to be retrieved in a convenient and efficient manner. More
responsive data-retrieval systems are required for general use.

(iii) Data Isolation


When data is scattered across different files, and the files may be in different formats,
it is difficult to retrieve information that combines data from several files.

Figure 1.1 One to one correspondences between applications and data files

(iv) Integrity problems


The data values stored in the database must satisfy certain types of consistency
constraints. For example, a university maintains the record of each student and requires
that the enrollment number of each student should be unique. Developers enforce these
constraints in the system by adding appropriate code in the various application programs.
However, when new constraints are added, it is difficult to change the programs to enforce
them.

(v) Atomicity problems


A computer system, like any other device, is subject to failure. In many applications, it is
crucial that, if a failure occurs, the data be restored to the consistent state that existed prior
to the failure. Consider a program to transfer ₹1000 from account A to account B.
If a system failure occurs during the execution of the program, it is possible that the ₹1000
was debited from account A but was not credited to account B. This results in an inconsistent
database state. Clearly, it is essential to database consistency that either both the credit and the
debit occur, or that neither occurs. That is, the funds transfer must be atomic: it must
happen in its entirety or not at all. It is difficult to ensure atomicity in a conventional
file-processing system.

(vi) Concurrent-access anomalies


For the sake of overall system performance and faster response, many systems allow
multiple users to update the data simultaneously. Today, the largest internet retailers may
have millions of accesses per day to their data by shoppers. In such an environment,
interaction of concurrent updates is possible and may result in inconsistent data. Consider
a registration program that maintains a count of the students registered for a course in order to
enforce a limit on the number of students registered. When a student registers, the program
reads the current count for the course, verifies that the limit has not been reached, adds 1 to
the count and writes the new count back to the database. Suppose two
students register concurrently while the count is (say) 39. Both executions of the program may
read the value 39, and both would then write back
40, leading to an incorrect increase of only 1, even though two students successfully
registered for the course and the count should be 41.
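
The lost update described above can be reproduced without any real concurrency by interleaving the read and write steps by hand. The sketch below is only an illustration: the `store` dictionary and the helper names are hypothetical stand-ins for the registration file and program, not part of any real system.

```python
# Hypothetical in-memory "file" holding the registration count for one course.
store = {"count": 39}


def read_count():
    return store["count"]


def write_count(value):
    store["count"] = value


def register_interleaved():
    """Two registrations whose read and write steps interleave."""
    a = read_count()    # student A reads 39
    b = read_count()    # student B also reads 39, before A has written back
    write_count(a + 1)  # A stores 40
    write_count(b + 1)  # B overwrites it with 40, so A's update is lost
    return read_count()


def register_serial():
    """The same two registrations executed one after the other."""
    for _ in range(2):
        write_count(read_count() + 1)
    return read_count()


lost = register_interleaved()  # starts from 39
store["count"] = 39            # reset to the same starting point
correct = register_serial()
print(lost, correct)
```

Only the serial execution ends with the count the text expects; the interleaved one silently drops a registration.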

(vii) Security problems


Not every user of the database system should be able to access all the data. For example,
in a university, accounts personnel need to see only the part of the database that has financial
information. They do not need access to information about academic records. But, since
application programs are added to the file-processing system in an ad hoc manner,
enforcing such security constraints is difficult.

1.4.1 Advantage of DBMS over File System


A file system stores data as records in files that are managed by the operating
system, and relies on application programs to extract information from those files.
A major advantage that the database approach has over the conventional approach is that a
database system provides centralized control of data.

(i) Reduced Redundancy


Unlike in the conventional approach, each application does not have to maintain its own data files.
Data can be integrated and used by multiple applications at the same time.

(ii) Ensure Consistency


It is very difficult to maintain a consistent file format in a file system. Different
programmers may use different programming languages, which may cause duplication of
information in several files. This duplication results in higher storage and access cost. In
addition, it may lead to data inconsistency, i.e. the various copies of the same information may not
agree. For example, in an employee management system, if the address of an employee
is stored in two places and is updated in only one of them, the system will give
conflicting information and become inconsistent. A DBMS can guard against such
inconsistency by providing a fixed format for data and by ensuring that a
change made to any entry is automatically applied to the other entries as well. This process
is known as propagating updates.

(iii) Data Manipulation Capabilities


A file system requires an application program to process the data stored in files according
to the needs of the user. If the user’s needs change, a different application program is
required. For example, consider the employee management system. Suppose we want to find the
names of the employees in the city “Jaipur”. In a file system, either a new application program
must be developed or we have to find the employees whose city is Jaipur manually.
This is not efficient, as developing a new application
program takes a lot of time, and it is possible that after the program is developed our needs
change from finding employees in Jaipur to finding employees in ‘Malviya Nagar, Jaipur’.
A database system solves such problems by simply issuing queries to the database as
needed and retrieving the answers in response.
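
As a minimal sketch of this point, the example below uses Python's built-in `sqlite3` module with an in-memory database. The sample rows and the helper `names_where_address_like` are assumptions made for illustration; only the `Employees` attribute names come from the company example in Section 1.1. When the requirement changes from Jaipur to Malviya Nagar, only the query argument changes, not the program.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Employees ("
    "Employee_ID INTEGER PRIMARY KEY, Employee_Name TEXT, Address TEXT)"
)
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [
        (1, "Asha", "Malviya Nagar, Jaipur"),
        (2, "Ravi", "Vaishali Nagar, Jaipur"),
        (3, "Meena", "Andheri, Mumbai"),
    ],
)


def names_where_address_like(pattern):
    """Return employee names whose address matches an SQL LIKE pattern."""
    rows = conn.execute(
        "SELECT Employee_Name FROM Employees "
        "WHERE Address LIKE ? ORDER BY Employee_ID",
        (pattern,),
    )
    return [name for (name,) in rows]


in_jaipur = names_where_address_like("%Jaipur")
in_malviya_nagar = names_where_address_like("Malviya Nagar, Jaipur")
print(in_jaipur, in_malviya_nagar)
```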

(iv) Data Independence (Reduced Programming Efforts)


In non-database systems, the requirements of the application dictate the way in which the
data is stored and the access techniques used. Moreover, knowledge of the organization of the
data and of the access techniques is built into the logic and code of the application. Such
systems are data dependent. For example, suppose the university (mentioned
previously) has an application that processes the student file. For performance reasons, the
file is indexed on the roll number. The application would be aware of the existing index,
and the internal structure of the application would be built around this knowledge. Now
suppose that, for some reason, the file is to be indexed on the registration date instead. In this case,
it is impossible to change the structure of the stored data without affecting the application
too. Such an application is data dependent. It is desirable to have data-independent
applications. Suppose two applications X and Y need to access the same file, but
they require a particular field to be stored in different formats: application X
requires the field “customer-balance” to be stored in decimal format, while application Y
requires it to be stored in binary format. This would pose a problem in the old systems. In a
DBMS, differences may exist between the way that data is actually stored and the way that it is
seen and used by a given application. To conform to the changing requirements of the
enterprise, the database administrator (DBA) may need to change the storage structure or
access techniques. The DBA should be able to do this without having to modify the existing
applications. If applications are data dependent, programmer effort that could otherwise
be available for the creation of new applications would be needed to modify existing
applications to match the changes made.

(v) Atomicity and Transaction Management


A file system does not ensure that a transaction completes, and this may cause data
inconsistency. For example, consider an employee management system in which the company
wants to move an employee from the sales department to the finance department. This
transaction consists of two operations: reducing the number of employees in the sales
department and increasing the number of employees in the finance department. In a file
system, there is no way to combine these two operations into a single
unit, so if only one operation is performed and the system then crashes,
the database becomes inconsistent. A database management system solves this
problem. It ensures that either the whole transaction, which may combine more
than one operation, is completed, or no operation is performed on behalf of the transaction. This
property is called ‘atomicity’.

(vi) Security
A file system provides little security for the stored data, as there is no fine-grained control
over what each user may do with a file; the complete file is exposed to the user. In a DBMS, the DBA has to
guarantee that only authorized persons have access to the database. The DBA defines the
security checks to be carried out. Different checks can be applied to different operations on
the same data. For instance, a person may have the right to query a file, but may
not have the right to delete or update it. The DBMS allows such security checks to
be established for each piece of data in the database.

(vii) Integrity
Inconsistency between two entries can lead to integrity problems. However, even if there
is no redundancy, the database can still be inconsistent. For example, a student may be
enrolled in 10 courses in a semester when the maximum number of courses in which one can enroll
is 7. Another example is a student enrolling in a course that is not being
offered that semester. Such problems can be avoided in a DBMS by establishing certain
integrity checks to be carried out whenever any update operation is done.
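
The two examples above can be expressed as integrity checks that the DBMS itself enforces on every insert. The following is a sketch using Python's `sqlite3` module; the table names, the trigger and the helper `try_enroll` are hypothetical, and other systems may express the same rules differently. The "course must be offered" rule is a foreign key, and the 7-course limit is a trigger.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE Course (course_id TEXT PRIMARY KEY);
CREATE TABLE Enrollment (
    student_id INTEGER NOT NULL,
    course_id  TEXT NOT NULL REFERENCES Course(course_id),
    PRIMARY KEY (student_id, course_id)
);
-- Reject an enrollment if the student already has 7 courses.
CREATE TRIGGER course_limit BEFORE INSERT ON Enrollment
BEGIN
    SELECT RAISE(ABORT, 'a student may enroll in at most 7 courses')
    WHERE (SELECT COUNT(*) FROM Enrollment
           WHERE student_id = NEW.student_id) >= 7;
END;
""")
# Eight courses are offered this semester (made-up identifiers C1..C8).
conn.executemany("INSERT INTO Course VALUES (?)", [(f"C{i}",) for i in range(1, 9)])


def try_enroll(student_id, course_id):
    """Attempt an enrollment; return False if an integrity check rejects it."""
    try:
        conn.execute("INSERT INTO Enrollment VALUES (?, ?)", (student_id, course_id))
        return True
    except sqlite3.DatabaseError:
        return False


accepted = [try_enroll(1, f"C{i}") for i in range(1, 9)]  # eighth attempt exceeds the limit
not_offered = try_enroll(2, "C99")                        # C99 is not in Course
print(accepted, not_offered)
```

Because the checks live in the database, every application that inserts enrollments is subject to them without repeating the logic.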

(viii) Data administration


When several users share the data, centralizing the administration of data can offer
significant improvements. Experienced professionals who understand the nature of the
data being managed, and how different groups of users use it, can be responsible for
organizing the data representation to minimize redundancy and fine-tuning the storage
of the data to make retrieval efficient.
(ix) Concurrent access and crash recovery
A DBMS schedules concurrent accesses to the data in such a manner that users can
think of the data as being accessed by only one user at a time. Further, the DBMS
protects users from the effects of system failures.

(x) Reduced application development time


Clearly, the DBMS supports many important functions that are common to many
applications accessing data stored in the DBMS. This, in conjunction with the high-
level interface to the data, facilitates quick development of applications. Such
applications are also likely to be more robust than applications developed from scratch
because many important tasks are handled by the DBMS instead of being implemented
by the application.

1.5 DISADVANTAGES OF DBMS


In spite of its many advantages, a DBMS is not always the most suitable system,
for the following reasons:
i. Overhead for providing security, integration of data, transaction management,
concurrency control etc.
ii. More investment is required for hardware and software.
iii. Special training is required to use DBMS.
iv. Its performance may not be adequate for certain specialized applications.
v. Many applications may need to manipulate the data in ways not supported by the
query language.

So, it can be advantageous to use a file system in certain situations:
a. The database and application are simple and not expected to change.
b. Concurrent access is not required.
c. The application is real-time, as strict time constraints are not easy to meet with a DBMS.

1.6 DESCRIBING AND STORING DATA IN DBMS


A DBMS is always concerned with some real-world enterprise. Data stored in a DBMS
describe real-world entities and represent relationships between these entities. For example,
there are employees, departments and projects in a company, and the data in the company
database describe these entities in terms of their attributes and their relationships to other entities.
Data can be described through different data models and at different levels of abstraction.

1.6.1 Data Abstraction


Data abstraction is one of the fundamental characteristics of any database management
system, and it makes data easier to use. Abstraction refers to
the act of representing essential features without including background details or
explanations. So, data abstraction refers to the act of representing data without giving
details of how the data are stored or maintained. Data abstraction hides information that is
irrelevant at a particular level. The complexity of the data is hidden through several levels of
abstraction so as to simplify user interaction with the system. The different levels of abstraction
are:

Figure 1.2 Level of Data Abstraction


(i) Physical Level or Internal Level
It is the lowest level of abstraction, which specifies how data are actually
stored on disks or on tapes. It specifies the manner in which records are stored, either as
a collection of pages or as a collection of records. Complex low-level data structures
are described in detail at this level. The design of the data structures described at this level is
called the physical schema. The data structures at this level may include B trees, B+ trees,
hashing etc.

(ii) Logical Level or Conceptual View


The next higher level of abstraction describes what data are stored in the database, and
what relationships exist among those data. There is only one conceptual schema per
database. This schema also contains the method of deriving the objects in the conceptual
view from the internal views. The description of data at this level is in a format independent
of its physical representation. It also includes features that specify the checks needed to retain data
consistency and integrity. The logical level of abstraction is used by database
administrators, who decide what information is to be kept in the database.

(iii) View Level


It is the highest level of abstraction, which describes different views of the entire database.
These views are designed according to the requirements of users who need to access only
a part of the database. A database may have several views, according to the demands of
individual users or groups of users. The data in these views are not stored as such in the
DBMS; they are computed from the view specification whenever they are needed. An analogy
to the concept of data types in programming language may clarify the distinction among
levels of abstraction. Most high-level programming languages support the notion of a
record type.

External (view) level:
    Customer_Loan
        Cust_ID           : 101
        Loan_No           : 1011
        Amount_in_Dollars : 8755.00

Conceptual (logical) level:
    CREATE TABLE Customer_Loan (
        Cust_ID           NUMBER(4),
        Loan_No           NUMBER(4),
        Amount_in_Dollars NUMBER(7,2))

Internal (physical) level:
    Customer
        Cust_ID           TYPE = BYTE (4), OFFSET = 0
        Loan_No           TYPE = BYTE (4), OFFSET = 4
        Amount_in_Dollars TYPE = BYTE (7), OFFSET = 8

Figure 1.3 Database Abstraction

At the physical level, a customer, account or employee record can be described as a block of
consecutive storage locations, for example words or bytes. The language compiler hides
this level of detail from programmers. Similarly, the database system hides many of
the lowest-level storage details from database programmers.
At the logical level, each such record is described by a type definition, and the interrelationships
among these record types are defined. Programmers using a programming language work at
this level of abstraction. Similarly, database administrators usually work at this level of
abstraction.
Finally, at the view level, computer users see a set of application programs that hide details
of the data types. Similarly, at the view level, several views of the database are defined, and
database users see these views. In addition to hiding details of the logical level of the
database, the views also provide a security mechanism to prevent users from accessing
certain parts of the database.
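
A view can be sketched with Python's `sqlite3` module as follows. The view name `Employee_Directory` and the sample rows are hypothetical. SQLite has no user accounts, so this only shows how a view hides a column (here `Salary`); in a multi-user DBMS, access rights would additionally be granted on the view rather than on the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (
    Employee_ID   INTEGER PRIMARY KEY,
    Employee_Name TEXT,
    Salary        REAL,
    Department_ID INTEGER
);
INSERT INTO Employees VALUES (8637, 'Asha', 52000, 10), (8638, 'Ravi', 48000, 20);

-- The view exposes only part of the logical schema; Salary is left out.
CREATE VIEW Employee_Directory AS
    SELECT Employee_ID, Employee_Name, Department_ID FROM Employees;
""")

cursor = conn.execute("SELECT * FROM Employee_Directory ORDER BY Employee_ID")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)
print(view_rows)
```

The rows of the view are computed from `Employees` each time the view is queried; nothing is stored separately for it.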

1.6.2 Instances and Schemas


The overall design of the database is called the database schema. The collection of
information stored in the database at a particular moment is called an instance of the
database. Schemas are changed infrequently. Database systems have several schemas,
partitioned according to the levels of abstraction. At the lowest level is the physical schema,
at the intermediate level is the logical schema and at the highest level is a subschema. In
general, a database system supports one physical schema, one logical schema and several
subschemas.

1.7 DATA INDEPENDENCE


The three levels of abstraction, along with the mappings from internal to conceptual and from
conceptual to external, provide two distinct levels of data independence: logical and
physical data independence.
1.7.1 Logical Data Independence
Logical data independence means that the conceptual (logical) schema can be changed without
affecting the existing external (view) schemas. The change is absorbed by the mapping between the
external and conceptual (logical) levels. Consider a change in the conceptual view such as
merging two records into one or adding fields to an existing record. This would require a
change in the mapping from the external view to the conceptual view so as to leave the
external view unchanged. Some changes, such as the deletion of a conceptual view field or
record, may require changes in the external view and the application programs.
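
The "adding fields to an existing record" case can be sketched with Python's `sqlite3` module. The `Payroll` view and the `Designation` column are hypothetical names chosen for illustration. The view plays the role of the external schema: a field is added at the logical level, and the view's result is the same before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (
    Employee_ID   INTEGER PRIMARY KEY,
    Employee_Name TEXT,
    Salary        REAL
);
INSERT INTO Employees VALUES (8637, 'Asha', 52000);

-- External schema used by a (hypothetical) payroll application.
CREATE VIEW Payroll AS SELECT Employee_ID, Salary FROM Employees;
""")


def payroll():
    return conn.execute("SELECT * FROM Payroll").fetchall()


before = payroll()
# Logical-level change: a new field is added to the existing record type.
conn.execute("ALTER TABLE Employees ADD COLUMN Designation TEXT")
after = payroll()
print(before, after)
```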
1.7.2 Physical Data Independence
Physical data independence means that the physical storage structures or devices used for storing
the data can be changed without necessitating a change in the conceptual view or any of the external views.
The change is absorbed by the mapping between the conceptual and internal levels.
Modifications at the physical level are occasionally necessary to improve performance.
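
As a sketch of physical data independence, the example below (Python `sqlite3`, with a hypothetical `Students` table and index name) adds an index, which is a physical-level change, and runs the same query text before and after. The query and its result do not change; only the way the DBMS may locate the rows does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (Roll_No INTEGER, Name TEXT, Registered TEXT)")
# Made-up sample rows, for illustration only.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [
        (1, "Asha", "2024-07-01"),
        (2, "Ravi", "2024-06-15"),
        (3, "Meena", "2024-07-01"),
    ],
)

QUERY = "SELECT Name FROM Students WHERE Registered = ? ORDER BY Roll_No"


def registered_on(day):
    return [name for (name,) in conn.execute(QUERY, (day,))]


before = registered_on("2024-07-01")
# Physical-level change: add an index on the registration date.
conn.execute("CREATE INDEX idx_students_registered ON Students (Registered)")
after = registered_on("2024-07-01")
print(before, after)
```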

1.8 DATABASE LANGUAGES


Most database management systems provide specialized languages, called database
languages, for interacting with the database. The commands
of these languages allow the user to operate and manage the database
efficiently. The value of a database to its users depends on the ease with which
information can be obtained from it, and one of the biggest reasons for the popularity of
relational database management systems is that they allow a rich class of questions to be asked
of the database in an easy manner. Considering our sample Employee database, a user may
ask:
(i) Who is getting the highest salary?
(ii) How many employees are working under Mr. Reddy?
(iii) Which department has the largest number of employees?
Such questions, which involve the data stored in a database, are called queries, and a database
management system provides a specialized language, called a query language, for asking queries
of the database.
A DBMS provides data manipulation facilities such as retrieval of data, insertion of data,
modification of existing data and deletion of data. Such facilities are provided by Data
Manipulation Language (DML) commands. The query language is only one part of the DML.
A DBMS also supports commands that make changes to the structure of the
database, i.e. the schema; such commands are known as Data Definition Language (DDL)
commands. The DML and DDL are collectively known as data sublanguages when
embedded within a host language. DDL commands change the metadata (data about data) stored
in the data dictionary, while DML commands retrieve data from a database (query) or make changes to an
instance of the database.
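
The split between DDL and DML can be sketched with Python's `sqlite3` module. The sample rows are made up, and `sqlite_master` is SQLite's own catalog table, standing in here for the data dictionary; other systems name their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: defines structure, which is recorded as metadata in the catalog.
conn.execute(
    "CREATE TABLE Employees ("
    "Employee_ID INTEGER PRIMARY KEY, Employee_Name TEXT, Salary REAL)"
)

# DML: inserts, updates and deletes change the instance, not the schema.
conn.executemany(
    "INSERT INTO Employees VALUES (?, ?, ?)",
    [(1, "Asha", 52000), (2, "Ravi", 48000), (3, "Meena", 61000)],
)
conn.execute("UPDATE Employees SET Salary = Salary + 1000 WHERE Employee_ID = 2")
conn.execute("DELETE FROM Employees WHERE Employee_ID = 1")

# DML (query part): who is getting the highest salary?
highest_paid = conn.execute(
    "SELECT Employee_Name, Salary FROM Employees ORDER BY Salary DESC LIMIT 1"
).fetchone()

# The effect of the DDL statement is visible in the catalog.
tables = [name for (name,) in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
remaining = conn.execute("SELECT COUNT(*) FROM Employees").fetchone()[0]
print(highest_paid, tables, remaining)
```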
The relational model supports a powerful language based on mathematical logic, called relational
calculus, which is a nonprocedural language for defining query solutions. Queries in this
language have a precise meaning.

There are two types of relational calculus:


(i) Tuple Relational Calculus
(ii) Domain Relational Calculus
Relational algebra is another formal query language; it is procedural and is based on a collection of
operators for manipulating relations.
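
To make the idea of "operators for manipulating relations" concrete, here is a minimal sketch of three relational algebra operators (selection, projection and natural join) in plain Python. Relations are modelled as lists of dictionaries; this representation, the function names and the sample rows are assumptions for illustration only and say nothing about how a DBMS implements these operators.

```python
# Made-up sample relations based on the company example.
departments = [
    {"Department_ID": 10, "Department_Name": "Sales"},
    {"Department_ID": 20, "Department_Name": "Finance"},
]
employees = [
    {"Employee_ID": 1, "Employee_Name": "Asha", "Salary": 52000, "Department_ID": 10},
    {"Employee_ID": 2, "Employee_Name": "Ravi", "Salary": 9000, "Department_ID": 10},
    {"Employee_ID": 3, "Employee_Name": "Meena", "Salary": 61000, "Department_ID": 20},
]


def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """Projection: keep only the named attributes and drop duplicate tuples."""
    seen, result = set(), []
    for row in relation:
        key = tuple(row[a] for a in attributes)
        if key not in seen:
            seen.add(key)
            result.append(dict(zip(attributes, key)))
    return result


def natural_join(left, right):
    """Natural join: combine tuples that agree on all common attributes."""
    result = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[a] == r[a] for a in common):
                result.append({**l, **r})
    return result


# Names of Sales employees earning more than 10,000, built step by step.
joined = natural_join(employees, departments)
well_paid_sales = select(
    joined, lambda row: row["Department_Name"] == "Sales" and row["Salary"] > 10000
)
answer = project(well_paid_sales, ["Employee_Name"])
print(answer)
```

The query is procedural in the sense the text describes: it states the order in which the operators are applied, whereas a calculus query would only state the condition the result must satisfy.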

1.9 TRANSACTION MANAGEMENT


A transaction is a sequence of one or more SQL statements that together form a logical
unit of work. Each statement in the transaction performs a part of the task, but all of them
are required to complete the task. All the statements forming a transaction must be executed
for the database to be in a consistent state. A transaction occurs when the database is
modified. Here is an example of a typical transaction. A customer orders a product. The
order processing program will:
(i) Query the table containing product details to see if the product is in stock.
(ii) Insert the order details into a table that holds the order details.
(iii) Update the table containing product details to reduce the quantity on hand by the
quantity ordered.
These three actions, in the sequence shown above, form a single logical unit, which is called a
transaction. As a rule, the statements in a transaction are executed as a single unit of work
in the database. Either all the statements will be executed successfully, or none of the
statements will be executed. The concept of transaction processing is critical for programs
that update a database because it ensures the integrity of the database.
The DBMS is responsible for ensuring the consistent state of the database even in the case
of the application program aborting in the middle of the transaction, or a hardware failure
in the middle of a transaction. Consider the above example of a transaction. What do you
suppose will happen if the order processing program aborted after step (ii)? The database
would reflect a partial transaction. It would be in an inconsistent state. To maintain
consistency, the DBMS undoes all the changes made if a transaction is not completed.
Thus, a DBMS guarantees that if a transaction executes some updates, and then for
whatever reasons, a failure occurs before the transaction reaches its normal termination,
then those updates will be undone.
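
The order-processing transaction can be sketched with Python's `sqlite3` module. The table layouts, the function `place_order` and its `fail_after_insert` flag (which simulates the program aborting after step (ii)) are hypothetical. The `with conn:` block is the sqlite3 connection context manager, which commits if the block finishes and rolls back if an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Products (Product_ID INTEGER PRIMARY KEY, Qty_On_Hand INTEGER);
CREATE TABLE Orders (Order_ID INTEGER PRIMARY KEY, Product_ID INTEGER, Qty INTEGER);
INSERT INTO Products VALUES (1, 5);
""")


def place_order(product_id, qty, fail_after_insert=False):
    """Run steps (i)-(iii) as one transaction; return True if it committed."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            # (i) check that the product is in stock
            (on_hand,) = conn.execute(
                "SELECT Qty_On_Hand FROM Products WHERE Product_ID = ?",
                (product_id,),
            ).fetchone()
            if on_hand < qty:
                raise ValueError("not in stock")
            # (ii) record the order
            conn.execute(
                "INSERT INTO Orders (Product_ID, Qty) VALUES (?, ?)",
                (product_id, qty),
            )
            if fail_after_insert:
                raise RuntimeError("simulated abort after step (ii)")
            # (iii) reduce the quantity on hand
            conn.execute(
                "UPDATE Products SET Qty_On_Hand = Qty_On_Hand - ? WHERE Product_ID = ?",
                (qty, product_id),
            )
        return True
    except (ValueError, RuntimeError):
        return False


def snapshot():
    """Return (number of orders, quantity on hand for product 1)."""
    orders = conn.execute("SELECT COUNT(*) FROM Orders").fetchone()[0]
    on_hand = conn.execute(
        "SELECT Qty_On_Hand FROM Products WHERE Product_ID = 1"
    ).fetchone()[0]
    return orders, on_hand


failed = place_order(1, 2, fail_after_insert=True)
after_failure = snapshot()
succeeded = place_order(1, 2)
after_success = snapshot()
print(failed, after_failure, succeeded, after_success)
```

After the simulated abort, the order row inserted in step (ii) is gone again, so the database never shows an order without the matching stock reduction.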
To ensure integrity of the data, we require that the database system maintains the following
properties of the transactions:
(i) Atomicity
Either all operations of the transaction are reflected properly in the database or none are. The
basic idea behind ensuring atomicity is as follows. The database system keeps track of the
old values of any data on which a transaction performs a write, and, if the transaction does
not complete its execution, the old values are restored to make it appear as though the
transaction never executed. Atomicity is handled by a component called the Transaction Manager.
(ii) Consistency
If the database is consistent before an execution of the transaction, the database remains
consistent after the execution of the transaction. Execution of a transaction in isolation
preserves the consistency of the database. Ensuring consistency for an individual
transaction is the responsibility of the application programmer who codes the transaction.
(iii) Isolation
Even if the consistency and atomicity properties are ensured for each transaction, if several
transactions are executed concurrently, their operations may interleave in some undesirable
way, resulting in an inconsistent state. The isolation property of a transaction ensures that
the concurrent execution of transactions results in a system state that is equivalent to a state
that could have been obtained had these transactions executed one at a time in some order.
(iv) Durability
The durability property guarantees that, once a transaction completes successfully, all the
updates that it carried out on the database persist, even if there is a system failure after the
transaction completes execution. We can guarantee durability by ensuring that either
(a) The updates carried out by the transaction have been written to disk before the
transaction completes, or
(b) Information about the updates carried out by the transaction and written to disk is
sufficient to enable the database to reconstruct the updates when the database
system is restarted after the failure.
Ensuring durability is the responsibility of a component of the database system called the
Recovery Manager.
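Option (b) of the durability property can be illustrated with a simple log that is replayed on restart. The following is a minimal sketch in Python, not the design of any particular recovery manager. The DurableStore class, the JSON log format and the file name redo.log are assumptions made for illustration, and the sketch ignores partially written log records.

```python
import json
import os
import tempfile


class DurableStore:
    """Hypothetical key-value store that logs every committed update to disk."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.data = {}
        self._recover()

    def _recover(self):
        # On restart, reconstruct the committed updates from the log.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                self.data.update(json.loads(line))

    def commit(self, updates):
        # Force the log record to disk before treating the transaction
        # as complete.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps(updates) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.data.update(updates)


log_path = os.path.join(tempfile.mkdtemp(), "redo.log")

store = DurableStore(log_path)
store.commit({"stock": 7, "orders": 1})
del store  # simulate a failure: the in-memory state is lost

recovered = DurableStore(log_path).data
```

The second DurableStore starts with empty memory, yet recovered contains the committed updates, because the log on disk is sufficient to reconstruct them.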

1.10 STORAGE MANAGEMENT


The main memory of a computer is usually too small to store even a medium-sized database,
which may run to gigabytes or even terabytes. Data are therefore stored on disk and moved
between disk and main memory as needed. A database management system should structure
the data so that movement between main memory and disk is kept to a minimum. Minimizing
data movement improves data access speed because transfers to and from the disk are slow
relative to the speed of the central processing unit.
The goal of a database management system is to simplify and facilitate access to data. The
details of how the data are stored on physical media are hidden from the user. The
performance of the system with respect to data access can be measured in terms of response
time, i.e. the time elapsed between submitting a command and its data to the system and
getting the result of the computation. Response time depends on the efficiency of the data
structures used to represent the data in the database and on how efficiently the system is
able to operate on these data structures. The storage manager component of a database
management system is responsible for storing, retrieving and updating data in the database.
It provides an interface between the low–level data stored in the database and the application
programs and queries submitted to the system. It also translates the various DML statements
into low–level file system commands.

1.11 DATABASE USERS


Different people use the database in different ways, and the interaction between a user
and the database depends on the type of user. We can distinguish five types of
users by the way they expect to interact with the DBMS :
1.11.1 Database Administrator
The databases of an enterprise are typically important enough and complex enough that the
task of designing and maintaining them requires a professional, called the DataBase
Administrator (DBA). The DBA has central control over the system. The database
administrator is responsible for the following functions :

(i) Schema Design and Maintenance


The DBA creates the database schema after interacting with the users of the system
and analyzing what data are to be stored in the database. He or she writes a set of definitions
that are translated by the DDL compiler into a set of tables, which are stored in the data
dictionary.
(ii) Physical Schema and Organisation Modification
The DBA is responsible for designing the physical schema and deciding how the data will be
stored on the physical media. The DBA defines storage structures and access methods by
writing a set of definitions.
(iii) Authorization and Security
The DBA is responsible for ensuring that unauthorized data access is not permitted. The DBA
also decides which parts of the database various users can access and in which mode (read,
write or both). The database administrator grants different types of authorization to different
users. For example, a clerk may be authorized to view the salaries of employees but
not to update them; that authority may be given to the accounts officer only. Different
access rights are thus given to different users according to their requirements.
(iv) Integrity Constraint Specification
Data stored in the database must satisfy certain constraints. For example, in an employee
database, Employee No. must be unique for each employee and the Address value must not be
left blank. These constraints are called integrity constraints, and it is the responsibility of
the DBA to identify all such constraints and apply them to the database.
(v) Recovery from Failure
The DBA is responsible for taking the steps required to restore the database if the
system fails. The DBA should back up the database from time to time and maintain logs of
system activities so that recovery is possible.
(vi) Database Upgradation
The DBA keeps track of and analyzes the changing requirements of users and
upgrades the database accordingly.
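The integrity constraints mentioned under function (iv), a unique Employee No. and a non-blank Address, can be declared so that the DBMS itself rejects violating rows. The following is a minimal sketch in Python using the standard sqlite3 module. The table layout, column names and sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " employee_no INTEGER PRIMARY KEY,"  # must be unique for each employee
    " name TEXT NOT NULL,"
    " address TEXT NOT NULL)"            # must not be left blank
)
conn.execute("INSERT INTO employee VALUES (1, 'Asha', '12 Park Road')")

# Hypothetical rows that violate the declared constraints.
bad_rows = [
    (1, "Ravi", "4 Hill Street"),  # duplicate employee_no
    (2, "Meena", None),            # missing address
]

rejected = []
for row in bad_rows:
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(str(exc))  # the DBMS refused the row

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
```

Both invalid rows are refused and only the original row remains, so the constraint is enforced in one place rather than in every application program.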
1.11.2 Application Programmers
They are computer professionals who write application programs. They embed DML
calls in a host language program. These DML calls are converted into normal procedure calls
in the host language by the DML precompiler. The resulting program is then run through the
host language compiler, which generates the appropriate object code.
1.11.3 Sophisticated Users
They work like analysts and submit queries to a query processor or directly to the DBMS,
which breaks down each query into instructions that the storage manager understands.
1.11.4 Specialized Users
Specialized users write specialized database applications such as computer-aided design
systems, knowledge-base systems and expert systems. Such systems differ from the
traditional data processing framework and use complex data types.
1.11.5 Naive Users
They use the DBMS only by interacting through application programs written previously.
For example, a clerk at a ticket booking window uses an application program to make
reservations for passengers.

1.12 DATABASE STRUCTURE


A database system is divided into different components. The exact architecture of a DBMS
depends on the operating system on which it has to work, because the operating system
provides the basic services that the database system uses as a base for accomplishing its tasks.

A database system can be divided into two broad parts:


(i) Query Processor Components
(ii) Storage Manager Components
1.12.1 Query Processor Components
These components are used in evaluating DDL and DML queries and include the following
components.
(i) DML Compiler
It first attempts to transform the user’s request into an equivalent but more efficient form and
then translates that into a set of low-level instructions that can be used by the query
evaluation engine.
(ii) Embedded DML Pre–Compiler
It converts DML statements embedded in an application program to normal procedure calls
in the host language. This precompiler consults the DML compiler to generate the appropriate
code.
(iii) DDL Interpreter
It interprets DDL statements and records the resulting definitions in the data dictionary,
which contains metadata.
(iv) Query Evaluation Engine
It executes the low-level instructions generated by the DML compiler.
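Many systems let the user inspect the low-level form that the query processor chooses for a request. The following is a minimal sketch in Python using the standard sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement. The employee table, the index name idx_emp_dept and the plan helper are assumptions made for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (employee_id INTEGER, name TEXT, department_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(i, f"emp{i}", i % 5) for i in range(100)],
)


def plan(sql):
    """Return the plan chosen for a query as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the description


query = "SELECT * FROM employee WHERE department_id = 3"

plan_before = plan(query)  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_emp_dept ON employee (department_id)")
plan_after = plan(query)   # same request, now answered through the index

print(plan_before)
print(plan_after)
```

The request itself does not change between the two calls. Only the plan differs, which is the sense in which a request is transformed into an equivalent but more efficient form.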
1.12.2 Storage Manager Components
The storage manager provides the interface between the low-level data stored in the database
and the application programs and queries submitted to the system. Its components include:
(i) File Manager
It manages disk space allocation and the data structures used to store the data.
(ii) Buffer Manager
It is responsible for fetching data from disk into main memory and for deciding on a caching
strategy suitable for the application.
(iii) Transaction Manager
It ensures consistency of the database after any transaction performed on it.
(iv) Authorization and Integrity Manager
It tests for satisfaction of integrity constraints and checks authority of users to access data.
It uses all the integrity constraints and authorization rules specified by the DBA.
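The role of the buffer manager can be illustrated with a small cache of disk blocks. The following is a minimal sketch in Python that assumes a least-recently-used (LRU) replacement policy, which is only one of several possible caching strategies. The BufferPool class, its read_block callback and the block numbers are assumptions made for illustration.

```python
from collections import OrderedDict


class BufferPool:
    """Hypothetical buffer manager that keeps at most `capacity` blocks in memory."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block  # function that reads one block from disk
        self.frames = OrderedDict()   # block_id -> data, least recently used first
        self.disk_reads = 0

    def get(self, block_id):
        if block_id in self.frames:
            # Hit: no disk access, just mark the block as recently used.
            self.frames.move_to_end(block_id)
            return self.frames[block_id]
        # Miss: fetch the block from disk into main memory.
        self.disk_reads += 1
        data = self.read_block(block_id)
        self.frames[block_id] = data
        if len(self.frames) > self.capacity:
            self.frames.popitem(last=False)  # evict the least recently used block
        return data


# A stand-in for a real disk read.
pool = BufferPool(capacity=2, read_block=lambda block_id: f"block-{block_id}")

for block_id in [1, 2, 1, 3, 1, 2]:
    pool.get(block_id)
```

Six requests cause only four disk reads here, because two of the requests for block 1 are served from memory. Reducing disk reads in this way is the purpose of caching in the buffer manager.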

1.12.3 Data Structures Used by DBMS


A DBMS can use several kinds of data structures as part of the physical system
implementation. Each structure has its own importance. The following are some common data
structures.
(i) Data Files
They store the database itself on the disk.
(ii) Data Dictionary
Information pertaining to the structure and usage of the data contained in the database, the
metadata, is maintained in a data dictionary. The data dictionary is itself a database, which
documents the data. Each database user can consult the data dictionary to learn what each
piece of data and the various synonyms of the data fields mean. In a system where the data
dictionary is part of the DBMS (an integrated system), the data dictionary stores information
concerning the source of each data-field value, the frequency of its use, and an audit trail
concerning updates, including the who and when of each update.
Currently, data dictionary systems are available as add-ons to the DBMS.
The data dictionary stores:
• Names of relations
• Names of the attributes of each relation
• Domains and lengths of attributes
• Names of views defined on the database, and definitions of those views
• Names of authorized users
• Accounting information about users
• Number of tuples in each relation
• Method of storage used for each relation
• Name of the index
• Name of the relation being indexed
• Attributes on which the index is defined
• Type of index formed
(iii) Indices
These are used to provide fast access to data items that hold particular values.
(iv) Statistical Data
It stores statistical information about the data stored in the database, such as the number of
records and blocks in a table. This information can be used to execute a query efficiently.
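Because the data dictionary is itself a database, it can usually be queried like any other data. The following is a minimal sketch in Python using the standard sqlite3 module, where the catalog is exposed as the sqlite_master table and the table_info pragma. The department table, the index idx_dept_name and the view department_names are assumptions made for illustration. SQLite's catalog holds only part of the information listed above (it has no user or accounting data, for example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (
        department_id   INTEGER PRIMARY KEY,
        department_name TEXT NOT NULL
    );
    CREATE INDEX idx_dept_name ON department (department_name);
    CREATE VIEW department_names AS SELECT department_name FROM department;
    """
)

# Names of relations, views and indices, and the relation each one belongs to.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# Names and declared types of the attributes of one relation.
columns = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(department)")
]

for entry in catalog:
    print(entry)
print(columns)
```

The catalog query returns one entry each for the table, the index and the view, without reading any row of department itself.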
Figure 1.4 Diagram of Database Management System
