
Introduction to Databases

Databases have become ubiquitous, touching almost every activity. Every IT application today uses databases in some form or the other. They have had a tremendous impact on all applications and have made qualitative changes in fields as diverse as health, education, entertainment, industry, and banking.

Database systems have evolved from the hierarchical and network models of the late 1960s to today's relational model.

From the earlier file-based systems (basically repositories of data providing very simple retrieval facilities), they now address complex environments, offering a range of functionalities in a user-friendly environment.

The academic community has strived, and continues to strive, to improve these services. Behind these complex software packages lie mathematics and other research which provide the backbone and basic building blocks of these systems. It is a challenge to provide good database services in a dynamic and flexible environment in a user-friendly way. An understanding of the basics of database systems is crucial to designing good applications.

INTRODUCTION TO DATA ARCHITECTURE

An organisation requires accurate and reliable data and an efficient database system for effective decision-making. To achieve this goal, the organisation maintains records of its varied operations by building appropriate database models and by capturing the essential properties of the objects and the relationships among them. Users of a database system in the organisation look for an abstract view of the data they are interested in.

Furthermore, since a database is a shared resource, each user may require a different view of the data held in the database. Therefore, one of the main aims of a database system is to provide users with an abstract view of data, hiding certain details of how the data is stored and manipulated.

To satisfy these needs, we need to develop an architecture for the database system. The database architecture is a framework in which the structure of the DBMS is described.
The DBMS architecture has evolved from early centralized monolithic systems to the modern distributed DBMS with a modular design. Large centralized mainframe computers have been replaced by hundreds of distributed workstations and personal computers connected via communications networks.

In the early systems, the whole DBMS package was a single, tightly integrated system, whereas the modern DBMS is based on a client-server architecture. Under the client-server architecture, the majority of the users of the DBMS are not present at the site of the database system but are connected to it through a network. The database system runs on server machines, whereas remote database users work on client machines (typically workstations or personal computers).

The database applications are usually partitioned into a two-tier architecture or a three-tier architecture, as shown in Fig. 2.1. In a two-tier architecture, the application is partitioned into a component that resides at the client machine, which invokes database system functionality at the server machine through query language statements. Application program interface standards are used for interaction between the client and the server.

Fig. 2.1. Database system architectures

In a three-tier architecture, the client machine acts merely as a front-end and does not contain any direct database calls. Instead, the client communicates with an application server, usually through a forms interface. The application server in turn communicates with a database system to access the data. The business logic of the application, which says what actions to carry out and under what conditions, is embedded in the application server, instead of being distributed across multiple clients. Three-tier architectures are more appropriate for large applications and for applications that run on the World Wide Web (WWW).
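The division of responsibilities in a three-tier architecture can be sketched in miniature as follows. Python's built-in sqlite3 module stands in for the database tier, and a plain function plays the role of the application server; the account table and the overdraft rule are invented for illustration, not taken from the figures.

```python
import sqlite3

# Data tier: the database system (sqlite3 used here as a local stand-in
# for a networked database server).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL)")
conn.execute("INSERT INTO account VALUES (1, 500.0)")

# Middle tier: the application server holding the business logic.
# The rule "no withdrawal may overdraw an account" lives here, in one
# place, instead of being distributed across multiple clients.
def withdraw(account_id: int, amount: float) -> bool:
    row = conn.execute("SELECT balance FROM account WHERE id = ?",
                       (account_id,)).fetchone()
    if row is None or row[0] < amount:   # business rule enforced centrally
        return False
    conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?",
                 (amount, account_id))
    return True

# Front-end tier: the client issues no SQL at all -- it only calls the
# application server's interface.
ok = withdraw(1, 200.0)
```

In a real deployment the `withdraw` call would travel over a network (for example, as an HTTP request to the application server), but the layering is the same.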
It is not always possible to fit every database system to a particular framework, nor is there any single framework that can be said to be the only possible one for defining database architecture. However, in this topic, a generalized architecture of the database system, which fits most systems reasonably well, will be discussed.

2.2. SCHEMAS, SUBSCHEMAS AND INSTANCES

When the database is designed to meet the information needs of an organisation, the plan (or scheme) of the database and the actual data to be stored in it become the most important concerns of the organisation. It is important to note that the data in the database changes frequently, while the plans remain the same over long periods of time (although not necessarily forever).

The database plans consist of the types of entities that a database deals with, the relationships among these entities, and the ways in which the entities and relationships are expressed from one level of abstraction to the next for the users' view. The users' view of the data (also called the logical organisation of data) should be in the form that is most convenient for the users, and they should not be concerned about the way the data is physically organised. Therefore, a DBMS should perform the translation between the logical (users' view) organisation and the physical organisation of the data in the database.

2.2.1. Schema
The plan (or formulation of the scheme) of the database is known as the schema. The schema gives the names of the entities and attributes and specifies the relationships among them. It is a framework into which the values of the data items (or fields) are fitted. The format of the schema remains the same, but the values fitted into this format change from instance to instance. In other words, a schema is an overall plan of all the data item (field) types and record types stored in a database. The schema includes the definition of the database name, the record types and the components that make up those records.

Let us look at Fig. 1.23 and assume that it is the sales record database of M/s ABC, a manufacturing company. The structure of the database, consisting of three files (or tables), namely PRODUCT, CUSTOMER and SALES, is the schema of the database. A database schema corresponds to the variable declarations (along with associated type definitions) in a program.
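A schema of this kind could be declared with data definition language (DDL) statements. The sketch below uses Python's built-in sqlite3 module; since Fig. 1.23 is not reproduced here, the column names and types are illustrative assumptions, not the book's actual layout.

```python
import sqlite3

# A sketch of the M/s ABC sales schema. The three table names come from
# the text; the columns and types are invented for illustration.
SCHEMA_DDL = """
CREATE TABLE PRODUCT  (PROD_ID   INTEGER PRIMARY KEY,
                       PROD_NAME TEXT,
                       PRICE     REAL);
CREATE TABLE CUSTOMER (CUST_ID   INTEGER PRIMARY KEY,
                       CUST_NAME TEXT,
                       CUST_CITY TEXT);
CREATE TABLE SALES    (SALE_ID   INTEGER PRIMARY KEY,
                       PROD_ID   INTEGER REFERENCES PRODUCT(PROD_ID),
                       CUST_ID   INTEGER REFERENCES CUSTOMER(CUST_ID),
                       QTY       INTEGER);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA_DDL)

# The schema (table and column definitions) is now recorded in the
# system catalog; the database itself still holds no data.
tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
```

Note how the DDL mirrors variable declarations in a program: it names the record types and their components but supplies no values.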

Fig. 2.2 shows a schema diagram for the database structure shown in Fig. 1.23. The schema diagram displays the structure of each record type but not the actual instances of records. Each object in the schema, for example PRODUCT, CUSTOMER or SALES, is called a schema construct.
Fig. 2.2. Schema diagram for database of M/s ABC Company
(a) Schema diagram for sales record database

(b) Schema defined using database language


Fig. 2.3 shows the schema diagram and the relationships for another example, the purchasing system of M/s KLY System. The purchasing system schema has five records (or objects), namely PURCHASE-ORDER, SUPPLIER, PURCHASE-ITEM, QUOTATION and PART. Solid arrows connecting different blocks show the relationships among the objects. For example, the PURCHASE-ORDER record is connected to the PURCHASE-ITEM records of which that purchase order is composed, and the SUPPLIER record to the QUOTATION records showing the parts that the supplier can provide, and so forth. The dotted arrows show the cross-references between attributes (or data items) of different objects or records.

Fig. 2.3. Schema diagram for database of M/s KLY System

(a) Schema diagram of purchasing system database


(b) Schema defined using database language
(c) Schema relationship diagrams

As can be seen in Fig. 2.3 (c), the duplication of attributes is avoided using relationships and cross-referencing. For example, the attributes SUP-NAME, SUP-ADD and SUP-DETAILS are included in a separate SUPPLIER record and not in the PURCHASE-ORDER record. Similarly, attributes such as PART-NAME, PART-DETAILS and QTY-ON-HAND are included in a separate PART record and not in the PURCHASE-ITEM record. Thus, the duplication of including PART-DETAILS and SUPPLIER details in every PURCHASE-ITEM is avoided. With the help of relationships and cross-referencing, the records are linked appropriately with each other to complete the information, and data is located quickly.
The database system can have several schemas, partitioned according to the levels of abstraction. In general, schemas can be categorised into two parts: (a) the logical schema and (b) the physical schema. The logical schema is concerned with exploiting the data structures offered by a DBMS in order to make the scheme understandable to the computer. The physical schema, on the other hand, deals with the manner in which the conceptual database gets represented in the computer as a stored database. The logical schema is the more important of the two, as programs use it to construct applications. The physical schema is hidden beneath the logical schema and can usually be changed easily without affecting application programs. DBMSs provide a data definition language (DDL) and a data storage definition language (DSDL) to make the specification of both the logical and physical schemas easy for the DBA.

2.2.2. Subschema
A subschema is a subset of the schema and inherits the same properties that a schema has. The plan (or scheme) for a view is often called a subschema. A subschema refers to an application programmer's (user's) view of the data item types and record types which he or she uses. It gives the user a window through which he or she can view only that part of the database which is of interest. In other words, a subschema defines the portion of the database as "seen" by the application programs that actually produce the desired information from the data contained within the database. Therefore, different application programs can have different views of the data. Fig. 2.4 shows subschemas viewed by two different application programs, derived from the example of Fig. 2.3.
Fig. 2.4. Subschema views of two applications programs
(a) Subschema for first application program

(b) Subschema for second application program

As shown in Fig. 2.4, the SUPPLIER-MASTER record of the first application program {Fig. 2.4 (a)} now contains additional attributes such as SUP-NAME and SUP-ADD from the SUPPLIER record of Fig. 2.3, and the PURCHASE-ORDER-DETAILS record contains additional attributes such as PART-NAME, SUP-NAME and PRICE from the two records PART and SUPPLIER, respectively. Similarly, the ORDER-DETAILS record of the second application program {Fig. 2.4 (b)} contains additional attributes such as SUP-NAME and QTY-ORDRD from the two records SUPPLIER and PURCHASE-ITEM, respectively.
Individual application programs can change their respective subschemas without affecting the subschema views of others. The DBMS software derives the subschema data requested by application programs from the schema data. The database administrator (DBA) ensures that the subschema requested by an application program is derivable from the schema. The application programs are not concerned about the physical organisation of data. The physical organisation of data in the database can change without affecting application programs; in other words, with a change in the physical organisation of data, the application programs for a subschema need not be changed or modified.

Subschemas also act as a unit for enforcing controlled access to the database; for example, a user of a subschema may be barred from updating a certain value in the database but allowed to read it. Further, the subschema can be made the basis for controlling concurrent operations on the database. A subschema definition language (SDL) is used to specify a subschema in the DBMS. The nature of this language depends upon the data structure on which a DBMS is based and also upon the host language within which the DBMS facilities are used. The subschema is sometimes referred to as an LVIEW or logical view. Many different subschemas can be derived from one schema.
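In a relational DBMS, a subschema is typically realised as a view. The sketch below, again using sqlite3, gives one application a window onto the SUPPLIER record that omits SUP-DETAILS; the records are simplified from Fig. 2.3 and the sample data is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual schema (simplified from Fig. 2.3; types are assumptions).
conn.executescript("""
CREATE TABLE SUPPLIER (SUP_ID INTEGER PRIMARY KEY,
                       SUP_NAME TEXT, SUP_ADD TEXT, SUP_DETAILS TEXT);
CREATE TABLE PART     (PART_ID INTEGER PRIMARY KEY,
                       PART_NAME TEXT, QTY_ON_HAND INTEGER);
""")
conn.execute("INSERT INTO SUPPLIER VALUES (1, 'Acme', 'Pune', 'approved')")

# A subschema: this application sees only supplier ids, names and
# addresses; SUP_DETAILS lies outside its window on the database.
conn.execute("""
CREATE VIEW SUPPLIER_MASTER AS
SELECT SUP_ID, SUP_NAME, SUP_ADD FROM SUPPLIER
""")

cols = [d[0] for d in
        conn.execute("SELECT * FROM SUPPLIER_MASTER").description]
```

Granting the application access to SUPPLIER_MASTER but not to SUPPLIER is also how a subschema enforces controlled access: the hidden attribute can be neither read nor updated through the view.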
2.2.3. Instances
When the schema framework is filled in with data item values, the contents of the database at any point of time (the current contents) are referred to as an instance of the database. The term instance is also called the state of the database, or a snapshot. Each variable has a particular value at a given instant. The values of the variables in a program at a point in time correspond to an instance of a database schema, as shown in Fig. 2.5.
Fig. 2.5. Instance of the database of M/s ABC Company
(a) Instance of the PRODUCT relation

(b) Instance of the CUSTOMER relation

(c) Instance of the SALES relation

The difference between a database schema and a database state (or instance) is very distinct. The database schema is specified to the DBMS when a new database is defined; at that point of time, the corresponding database state is empty, with no data in the database. Once the database is first populated with the initial data, we get a new database state whenever an update operation is applied to the database. At any point of time, the current state of the database is called the instance.
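The distinction can be demonstrated directly: the schema is declared once, after which every update yields a new state. The table and values below are illustrative, not taken from Fig. 2.5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The schema is defined once; at this moment the state is empty.
conn.execute("CREATE TABLE PRODUCT (PROD_ID INTEGER PRIMARY KEY, "
             "PROD_NAME TEXT)")

def current_instance():
    # The instance (state) is simply the stored contents at this moment.
    return conn.execute("SELECT * FROM PRODUCT ORDER BY PROD_ID").fetchall()

empty_state = current_instance()            # defined schema, empty state
conn.execute("INSERT INTO PRODUCT VALUES (10, 'Bolt')")
state_1 = current_instance()                # a new state after the update
conn.execute("INSERT INTO PRODUCT VALUES (20, 'Nut')")
state_2 = current_instance()                # and another
```

Throughout, the schema (the PRODUCT table definition) never changes; only the sequence of states does.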

2.3. THREE-LEVEL ANSI-SPARC DATABASE ARCHITECTURE


In 1971, the Database Task Group (DBTG), appointed by the Conference on Data Systems and Languages (CODASYL), produced the first proposal for a general architecture for database systems. The DBTG proposed a two-tier architecture, as shown in Fig. 2.1 (a), with a system view called the schema and user views called subschemas. In 1975, ANSI-SPARC (American National Standards Institute – Standards Planning and Requirements Committee) produced a three-tier architecture with a system catalog. The architecture of most commercial DBMSs available today is based to some extent on the ANSI-SPARC proposal.
The ANSI-SPARC three-tier database architecture is shown in Fig. 2.6. It consists of the following three levels:
 Internal level,
 Conceptual level,
 External level.

Fig. 2.6. ANSI-SPARC three-tier database structure

The view at each of the above levels is described by a scheme or schema. As explained in Section 2.2, a schema is an outline or plan that describes the records, attributes and relationships existing in the view. The terms view, scheme and schema are used interchangeably. A data definition language (DDL), as explained in Section 1.10.1, is used to define the conceptual and external schemas. Structured query language (SQL) commands are used to describe aspects of the physical (or internal) schema. Information about the internal, conceptual and external schemas is stored in the system catalog, as explained in Section 1.2.6.

Let us take the example of the CUSTOMER record of Fig. 2.2, as shown in Fig. 2.7 (a). The integrated record definition of the CUSTOMER record is shown in Fig. 2.7 (b). The data has been abstracted into three levels corresponding to three views (namely the internal, conceptual and external views), as shown in Fig. 2.8. The lowest level of abstraction contains a description of the actual method of storing the data and is called the internal view, as shown in Fig. 2.8 (c). The second level of abstraction is the conceptual or global view, as shown in Fig. 2.8 (b). The third level is the highest level of abstraction, seen by the user or application program, and is called the external view or user view, as shown in Fig. 2.8 (a). The conceptual view is the sum total of the users' external views of the data.
Fig. 2.7. CUSTOMER record definition

(a) CUSTOMER record

(b) Integrated record definition of CUSTOMER record

Fig. 2.8. Three views of the data


(a) Logical records

(b) Conceptual records


(c) Internal record

From Fig. 2.8, the following explanations can be derived:


 At the internal or physical level, as shown in Fig. 2.8 (c), customers are represented by a stored record type called
STORED-CUST, which is 74 characters (or bytes) long. The CUSTOMER record contains five fields or data items, namely
CUST-ID, CUST-NAME, CUST-STREET, CUST-CITY and CUST-BAL, corresponding to five properties of customers.
 At the conceptual or global level, as shown in Fig. 2.8 (b), the database contains information concerning an entity
type called CUSTOMER. Each individual customer has a CUST-ID (4 digits), CUST-NAME (20 characters), CUST-
STREET (40 characters), CUST-CITY (10 characters) and CUST-BAL (8 digits).
 The user view 1 in Fig. 2.8 (a) has an external schema of the database in which each customer is represented by a
record containing two fields or data items namely CUST-NAME and CUST-CITY. The other three fields are of no
interest to this user and have therefore been omitted.
 The user view 2 in Fig. 2.8 (a) has an external schema of the database in which each customer is represented by a
record containing three fields or data items namely CUST-ID, CUST-NAME and CUST-BAL. The other two fields are
of no interest to this user and have thus been omitted.
 There is only one conceptual schema and one internal schema per database.
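The two external views above can be sketched as views over one conceptual CUSTOMER schema. The five attributes come from Fig. 2.8 (b); the sample row and the view names VIEW1 and VIEW2 are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Conceptual schema: one CUSTOMER entity with the five attributes
# described at the conceptual level of Fig. 2.8 (b).
conn.execute("""CREATE TABLE CUSTOMER (
    CUST_ID INTEGER PRIMARY KEY, CUST_NAME TEXT,
    CUST_STREET TEXT, CUST_CITY TEXT, CUST_BAL REAL)""")
conn.execute("INSERT INTO CUSTOMER VALUES "
             "(1001, 'R. Sharma', 'MG Road', 'Pune', 2500.0)")

# External schema for user view 1: only name and city are visible.
conn.execute("CREATE VIEW VIEW1 AS "
             "SELECT CUST_NAME, CUST_CITY FROM CUSTOMER")
# External schema for user view 2: only id, name and balance are visible.
conn.execute("CREATE VIEW VIEW2 AS "
             "SELECT CUST_ID, CUST_NAME, CUST_BAL FROM CUSTOMER")

view1_row = conn.execute("SELECT * FROM VIEW1").fetchone()
view2_row = conn.execute("SELECT * FROM VIEW2").fetchone()
```

There is still exactly one conceptual schema (the CUSTOMER table); each user sees only the projection defined by his or her external schema.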

2.3.1. Internal Level


The internal level is the physical representation of the database on the computer, and this view is found at the lowest level of abstraction of the database. This level indicates how the data will be stored in the database and describes the data structures, file structures and access methods to be used by the database. It describes the way the DBMS and the operating system perceive the data in the database. Fig. 2.8 (c) shows the internal view record of a database. Just below the internal level is the physical-level data organisation, whose implementation is covered by the internal level to achieve good run-time performance and storage space utilisation. The internal schema defines the internal level (or view). The internal schema contains the definition of the stored record, the method of representing the data fields (or attributes), the indexing and hashing schemes, and the access methods used. The internal level provides coverage of the data structures and file organisations used to store data on storage devices.
Essentially, the internal schema summarises how the relations described in the conceptual schema are actually stored on secondary storage devices such as disks and tapes. It interfaces with the operating system's access methods (also called file management techniques for storing and retrieving data records) to place the data on the storage devices, build the indexes, retrieve the data and so on. The internal level is concerned with the following activities:
 Storage space allocation for data and indexes.
 Record descriptions for storage, with stored sizes for data items.
 Record placement.
 Data compression and data encryption techniques.
The process of arriving at a good internal (or physical) schema is called physical database design. The internal schema is written using SQL or an internal data definition language (internal DDL).

2.3.2. Conceptual Level


The conceptual level is the middle level in the three-tier architecture. At this level of database abstraction, all the database entities and the relationships among them are included. The conceptual level provides the community view of the database and describes what data is stored in the database and the relationships among the data. It contains the logical structure of the entire database as seen by the DBA. One conceptual view represents the entire database of an organisation. It is a complete view of the data requirements of the organisation that is independent of any storage considerations.
The conceptual schema defines the conceptual view. It is also called the logical schema. There is only one conceptual schema per database. Fig. 2.8 (b) shows the conceptual view record of a database. This schema contains the method of deriving the objects in the conceptual view from the objects in the internal view. The conceptual level is concerned with the following activities:
 All entities, their attributes and their relationships.
 Constraints on the data.
 Semantic information about the data.
 Checks to retain data consistency and integrity.
 Security information.

The conceptual level supports each external view, in that any data available to a user must be contained in, or derivable from, the conceptual level. However, this level must not contain any storage-dependent details. For example, the description of an entity should contain only the data types of attributes (for example, integer, real, character and so on) and their length (such as the maximum number of digits or characters), but not any storage considerations, such as the number of bytes occupied. The choice of relations, and the choice of fields (or data items) for each relation, is not always obvious. The process of arriving at a good conceptual schema is called conceptual database design. The conceptual schema is written using a conceptual data definition language (conceptual DDL).

2.3.3. External Level


The external level is the user's view of the database. This level is the highest level of data abstraction, where only those portions of the database of concern to a user or application program are included. In other words, this level describes the part of the database that is relevant to the user. Any number of user views, even identical ones, may exist for a given conceptual or global view of the database. Each user has a view of the "real world" represented in a form that is familiar to that user. The external view includes only those entities, attributes and relationships in the "real world" that the user is interested in. Other entities, attributes and relationships that are not of interest to the user may be represented in the database, but the user will be unaware of them. Fig. 2.8 (a) shows the external or user view record of a database.
At the external level, the different views may have different representations of the same data. For example, one user may view dates in the form day, month, year, while another may view them as year, month, day. Some views might include derived or calculated data, that is, data not stored in the database itself but created when needed. For example, the average age of employees in an organisation may be derived or calculated from the individual ages of all employees stored in the database. External views may also include data combined or derived from several entities.
An external schema describes each external view. The external schema consists of the definitions of the logical records and the relationships in the external view. It also contains the method of deriving the objects (for example, entities, attributes and relationships) in the external view from the objects in the conceptual view. External schemas allow data access to be customised at the level of individual users or groups of users. Any given database has exactly one internal (or physical) schema and one conceptual schema, because it has just one set of stored relations, as shown in Fig. 2.8 (a) and (b). But it may have several external schemas, each tailored to a particular group of users, as shown in Fig. 2.8 (a). The external schema is written using an external data definition language (external DDL).
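The derived-data idea mentioned above (the average age example) can be sketched as a view that computes its value on demand. The EMPLOYEE table and sample ages are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMPLOYEE (EMP_ID INTEGER PRIMARY KEY, "
             "AGE INTEGER)")
conn.executemany("INSERT INTO EMPLOYEE VALUES (?, ?)",
                 [(1, 30), (2, 40), (3, 50)])

# The average age is never stored anywhere in the database; this
# external view derives it from the individual ages when queried.
conn.execute("CREATE VIEW EMP_STATS AS "
             "SELECT AVG(AGE) AS AVG_AGE FROM EMPLOYEE")

avg_age = conn.execute("SELECT AVG_AGE FROM EMP_STATS").fetchone()[0]
```

Because the value is computed at query time, it stays correct automatically as employees are added or removed.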
2.3.4. Advantages of Three-tier Architecture
The main objective of the three-tier database architecture is to isolate each user’s view of the database from the way the
database is physically stored or represented. Following are the advantages of a three-tier database architecture:
 Each user is able to access the same data but has a different customised view of the data as per his or her own needs.
Each user can change the way he or she views the data, and this change does not affect other users of the same
database.
 The user is not concerned about the physical data storage details. The user’s interaction with the database is
independent of physical data storage organisation.
 The internal structure of the database is unaffected by changes to the physical storage organisation, such as a
changeover to a new storage device.
 The database administrator (DBA) is able to change the database storage structures without affecting the user’s
view.
 The DBA is able to change the conceptual structure of the database without affecting all users.

2.3.5. Characteristics of Three-tier Architecture


Table 2.1 shows the degree of abstraction, the characteristics and the type of DBMS used for the three levels.

Table 2.1. Features of three-tier structure

Features ↓ / Abstraction level →   Physical level   Internal level   Conceptual level   External level

Degree of abstraction              Low              Medium           High               Medium

Characteristics                    Hardware and software dependent (at all four levels)

Degree of knowledge required by database designer using:
  Hierarchical DBMS                Attention required about physical-level details (at all four levels)
  Network DBMS                     Attention required about physical-level details (at all four levels)
  Relational DBMS                  No concern about physical-level details (at any level)
2.4. DATA INDEPENDENCE

Data independence is a major objective of implementing a DBMS in an organisation. It may be defined as the immunity of application programs to changes in physical representation and access techniques. Alternatively, data independence is the characteristic of a database system that allows the schema at one level to be changed without having to change the schema at the next higher level. In other words, the application programs do not depend on any one particular physical representation or access technique. This characteristic of the DBMS insulates the application programs from changes in the way the data is structured and stored. Data independence is achieved by the DBMS through the use of the three-tier architecture of data abstraction. There are two types of data independence, as shown in the mapping of the three-tier architecture of Fig. 2.9:

1. Physical data independence.

2. Logical data independence.

Fig. 2.9. Mappings of three-tier architecture


2.4.1. Physical Data Independence

Immunity of the conceptual (or external) schemas to changes in the internal schema is referred to as physical data independence. In physical data independence, the conceptual schema insulates the users from changes in the physical storage of the data. Changes to the internal schema, such as using different file organisations or storage structures, using different storage devices, or modifying indexes or hashing algorithms, must be possible without changing the conceptual or external schemas. In other words, physical data independence indicates that the physical storage structures or devices used for storing the data can be changed without necessitating a change in the conceptual view or any of the external views. The change is absorbed by the conceptual/internal mapping, as discussed in Section 2.5.1.
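A small demonstration: adding or dropping an index is an internal-schema change, and the query text (written against the conceptual schema) is untouched by it. The table and data below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUST_ID INTEGER, CUST_CITY TEXT)")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Pune')")

# An application query written against the conceptual schema.
QUERY = "SELECT CUST_ID FROM CUSTOMER WHERE CUST_CITY = 'Pune'"

before = conn.execute(QUERY).fetchall()
# Internal-schema change: add an index, then drop it again.
# The application query is not modified in any way.
conn.execute("CREATE INDEX idx_city ON CUSTOMER (CUST_CITY)")
with_index = conn.execute(QUERY).fetchall()
conn.execute("DROP INDEX idx_city")
after = conn.execute(QUERY).fetchall()
```

The access path the DBMS uses may change (a scan versus an index lookup), but the result, and the program that asks for it, do not.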

2.4.2. Logical Data Independence

Immunity of the external schemas (or application programs) to changes in the conceptual schema is referred to as logical data independence. In logical data independence, the users are shielded from changes in the logical structure of the data or changes in the choice of relations to be stored. Changes to the conceptual schema, such as the addition and deletion of entities, the addition and deletion of attributes, or the addition and deletion of relationships, must be possible without changing existing external schemas or having to rewrite application programs. Only the view definition and the mapping need be changed in a DBMS that supports logical data independence. It is important that the users for whom the changes have been made are not affected. In other words, the application programs that refer to the external schema constructs must work as before, after the conceptual schema undergoes a logical reorganisation.
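This too can be sketched briefly: an attribute is added to the conceptual schema, and an external view defined earlier (and any program written against it) continues to work unchanged. The names below are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CUSTOMER (CUST_ID INTEGER PRIMARY KEY, "
             "CUST_NAME TEXT)")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'R. Sharma')")

# The external schema names only the attributes this user cares about.
conn.execute("CREATE VIEW CUST_VIEW AS "
             "SELECT CUST_ID, CUST_NAME FROM CUSTOMER")

# Conceptual-schema change: a new attribute is added to the entity.
conn.execute("ALTER TABLE CUSTOMER ADD COLUMN CUST_BAL REAL")

# The existing external view still returns exactly what it did before.
row = conn.execute("SELECT * FROM CUST_VIEW").fetchone()
```

Had the change instead removed or renamed an attribute the view mentions, only the view definition (the external/conceptual mapping) would need updating, not the application programs.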

2.5. MAPPINGS

The three schemas and their levels discussed in Section 2.3 are descriptions of the data that actually exists in the physical database. In the three-schema architecture, each user group refers only to its own external schema. Hence, a user's request specified at the external schema level must be transformed into a request at the conceptual schema level. The transformed request at the conceptual schema level must be further transformed at the internal schema level for final processing of the data in the stored database as per the user's request. The final result from the processed data must then be reformatted to satisfy the user's external view. The process of transforming requests and results between the three levels is called mapping. The database management system (DBMS) is responsible for this mapping between the internal, conceptual and external schemas. The three-tier architecture of the ANSI-SPARC model provides the following two-stage mappings, as shown in Fig. 2.9:
 Conceptual/Internal mapping
 External/Conceptual mapping

2.5.1. Conceptual/Internal Mapping


The conceptual schema is related to the internal schema through the conceptual/internal mapping. The conceptual/internal mapping defines the correspondence between the conceptual view and the stored database. It specifies how conceptual records and fields are represented at the internal level. It enables the DBMS to find the actual record or combination of records in physical storage that constitutes a logical record in the conceptual schema, together with any constraints to be enforced on the operations for that logical record. It also allows any differences in entity names, attribute names, attribute order, data types, and so on, to be resolved. In case of any change in the structure of the stored database, the conceptual/internal mapping is changed accordingly by the DBA, so that the conceptual schema can remain invariant. Therefore, the effects of changes to the database storage structure are isolated below the conceptual level, in order to preserve physical data independence.

2.5.2. External/Conceptual Mapping


Each external schema is related to the conceptual schema by an external/conceptual mapping. The external/conceptual mapping defines the correspondence between a particular external view and the conceptual view. It gives the correspondence among the records and relationships of the external and conceptual views. It enables the DBMS to map names in the user's view onto the relevant part of the conceptual schema. Any number of external views can exist at the same time, any number of users can share a given external view, and different external views can overlap.
There is one mapping between the conceptual and internal levels, and several mappings between the external and conceptual levels. The conceptual/internal mapping is the key to physical data independence, while the external/conceptual mapping is the key to logical data independence. Fig. 2.9 illustrates the three-tier ANSI-SPARC architecture with its mappings.
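The name-mapping role of the external/conceptual mapping can be illustrated with column aliasing in a view: the user's names are resolved onto the conceptual names. The VENDOR_NO/VENDOR names below are hypothetical user-view names, invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SUPPLIER (SUP_ID INTEGER PRIMARY KEY, "
             "SUP_NAME TEXT)")
conn.execute("INSERT INTO SUPPLIER VALUES (1, 'Acme')")

# External/conceptual mapping: the user's names (VENDOR_NO, VENDOR)
# are mapped onto the conceptual names (SUP_ID, SUP_NAME).
conn.execute("""
CREATE VIEW VENDOR_LIST AS
SELECT SUP_ID AS VENDOR_NO, SUP_NAME AS VENDOR FROM SUPPLIER
""")

cols = [d[0] for d in
        conn.execute("SELECT * FROM VENDOR_LIST").description]
```

The user works entirely with VENDOR_NO and VENDOR; the DBMS resolves each reference to the corresponding conceptual attribute through the view definition.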
The information about the mappings among the various schema levels is included in the system catalog of the DBMS. The DBMS uses additional software to accomplish the mappings by referring to the mapping information in the system catalog. When the schema is changed at some level, the schema at the next higher level remains unchanged; only the mapping between the two levels is changed. Thus, data independence is accomplished. The two-stage mapping of the ANSI-SPARC three-tier structure provides greater data independence but less efficient mapping. However, ANSI-SPARC also allows the direct mapping of external schemas onto the internal schema (bypassing the conceptual schema), which is more efficient but provides reduced data independence (it is more data-dependent).

2.6. STRUCTURE, COMPONENTS, AND FUNCTIONS OF DBMS

A database management system (DBMS) is highly complex and sophisticated software that handles access to the
database.
The structure of a DBMS varies greatly from system to system and, therefore, it is difficult to give a single generalised
component structure of a DBMS.

2.6.1. Structure of a DBMS

A typical structure of a DBMS, with its components and the relationships between them, is shown in Fig. 2.10. The DBMS
software is partitioned into several modules, each assigned a specific operation to perform. Some functions of the DBMS
rely on the underlying operating system (OS), which provides basic services on top of which the DBMS is built. The
physical data and the system catalog are stored on a physical disk. Access to the disk is controlled primarily by the OS,
which schedules disk input/output. Therefore, while designing a DBMS, its interface with the OS must be taken into
account.

Fig. 2.10. Structure of DBMS


2.7. DATA MODELS
A model is an abstraction that concentrates on (or highlights) the essential and inherent aspects of the organisation's
applications while ignoring (or hiding) superfluous or accidental details. It is a representation of real-world objects and
events and their associations. A data model (also called a database model) is a mechanism that provides this abstraction for
database applications. It represents the organisation itself. It provides the basic concepts and notations that allow database
designers and end-users to communicate their understanding of the organisational data unambiguously and accurately. Data
modelling is used for representing entities of interest and their relationships in the database. It allows the
conceptualisation of the associations between various entities and their attributes. A data model is a conceptual method of
structuring data. It provides mechanisms to structure data (consisting of a set of rules according to which databases can be
constructed) for the entities being modelled, allows a set of manipulative operations (for example, updating or retrieving
data from the database) to be defined on them, and enforces a set of constraints (or integrity rules) to ensure the accuracy
of data.
To summarise, we can say that a data model is a collection of mathematically well-defined concepts that help an
enterprise to consider and express the static and dynamic properties of data intensive applications. It consists of the
following:
 Static properties, for example, objects, attributes and relationships.
 Integrity rules over objects and operations.
 Dynamic properties, for example, operations or rules defining new database states based on applied state changes.
Data models can be broadly classified into the following three categories:
 Record-based data models
 Object-based data models
 Physical data models
Most commercial DBMSs support a single data model but the data models supported by different DBMSs differ.
2.7.1. Record-based Data Models
Record-based data models are used to specify the overall logical structure of the database. In record-based models,
the database consists of a number of fixed-format records, possibly of different types. Each record type defines a fixed
number of fields, each typically of a fixed length. Data integrity constraints cannot be explicitly specified using record-
based data models. There are three principal types of record-based data models:
 Hierarchical data model.
 Network data model.
 Relational data model.
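The idea of a fixed-format record with fixed-length fields can be made concrete with a small sketch. The following Python example is illustrative only: the record type, its field names and sizes are invented, and the standard `struct` module is used to pack each record into the same fixed number of bytes, as record-based models assume.

```python
import struct

# Invented record type: 4-byte integer id, 20-byte name (space-padded),
# 4-byte integer salary. Every record of this type has the same format.
RECORD_FORMAT = "i20si"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)   # same length for every record

def pack_record(emp_id, name, salary):
    """Serialise one fixed-format record."""
    return struct.pack(RECORD_FORMAT, emp_id, name.ljust(20).encode(), salary)

def unpack_record(raw):
    """Read a record back, stripping the name padding."""
    emp_id, name, salary = struct.unpack(RECORD_FORMAT, raw)
    return emp_id, name.decode().rstrip(), salary

raw = pack_record(7, "Asha", 42000)
assert len(raw) == RECORD_SIZE          # fixed-length, as the model requires
print(unpack_record(raw))               # (7, 'Asha', 42000)
```

Because every record of a type has the same layout, the i-th record of a file can be located by simple arithmetic (`i * RECORD_SIZE`), which is precisely what makes fixed-format storage attractive.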
2.7.2. Object-based Data Models
Object-based data models are used to describe data and its relationships. They use concepts such as entities, attributes and
relationships, whose definitions have already been explained in Chapter 1, Section 1.3.1. They have flexible data-structuring
capabilities, and data integrity constraints can be explicitly specified using object-based data models. Following are the
common types of object-based data models:
 Entity-relationship.
 Semantic.
 Functional.
 Object-oriented.
The entity-relationship (E-R) data model is one of the main techniques for a database design and widely used in practice.
The object-oriented data models extend the definition of an entity to include not only the attributes that describe the
state of the object but also the actions that are associated with the object, that is, its behaviour.
2.7.3. Physical Data Models
Physical data models are used to describe data at the lowest level of abstraction: the storage structures and access
mechanisms. They describe how data is stored in the computer, representing information such as record structures, record
orderings and access paths. It is possible to implement the database at the system level using physical data models. Few
physical data models have been proposed so far. The most common physical data models are as follows:
 Unifying model.
 Frame memory model.

2.7.4. Hierarchical Data Models


The hierarchical data model is represented by an upside-down tree. The user perceives the hierarchical database as a
hierarchy of segments. A segment is the equivalent of a file system’s record type. In a hierarchical data model, the
relationship between the files or records forms a hierarchy. In other words, the hierarchical database is a collection of
records that is perceived as organised to conform to the upside-down tree structure. Fig. 2.12 shows a hierarchical data
model. A tree may be defined as a set of nodes such that there is one specially designated node called the root (node),
which is perceived as the parent (like a family tree having parent-child, or an organisation tree having owner-member,
relationships between record types) of the segments directly beneath it. The remaining nodes are partitioned into disjoint
sets and are perceived as children of the segment above them. Each disjoint set in turn is a tree and a sub-tree of the
root. At the root of the tree is the single parent. A parent can have zero, one or more children. A hierarchical model can
represent a one-to-many relationship between two entities, where the two are respectively parent and child. The nodes of
the tree represent record types. If we define the root record type to be at level-0, then its dependent record types are at
level-1. The dependents of the record types at level-1 are said to be at level-2, and so on.
Fig. 2.12. Hierarchical data model

A hierarchical path, which traces the parent segments to the child segments beginning from the left, defines the tree shown
in Fig. 2.12. For example, the hierarchical path for segment 'E' can be traced as ABDE, tracing all segments from the root
starting at the leftmost segment. This left-traced path is known as the preorder traversal or the hierarchical sequence. As
can be noted from Fig. 2.12, each parent can have many children but each child has only one parent.
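The preorder traversal described above can be sketched directly in code. The following Python example is an assumption-laden illustration: the tree is a small hand-built structure consistent with the idea of Fig. 2.12 (root A with children B and C, B above D, D above E), and each node carries its segment name and a list of child subtrees.

```python
# A toy hierarchy matching the path example above: A -> B -> D -> E, and C.
tree = ("A", [("B", [("D", [("E", [])])]), ("C", [])])

def preorder(node, path=()):
    """Yield each segment with its hierarchical path, visiting the segment
    first and then its child subtrees from left to right (preorder)."""
    name, children = node
    path = path + (name,)
    yield name, "".join(path)
    for child in children:
        yield from preorder(child, path)

for segment, hier_path in preorder(tree):
    print(segment, hier_path)
# segment 'E' is reached along the hierarchical path ABDE, as in the text
```

The left-to-right visiting order is exactly the "hierarchical sequence": here the segments are produced in the order A, B, D, E, C.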
Fig. 2.13 (a) shows a hierarchical data model of a UNIVERSITY tree type consisting of three levels and three record types,
namely DEPARTMENT, FACULTY and COURSE. This tree contains information about university academic departments, along
with data on all faculties in each department and all courses taught by each faculty within a department. Fig. 2.13
(b) shows the defined fields or data types for the department, faculty and course record types. A single department record at
the root level represents one instance of the department record type. Multiple instances of a given record type are used
at lower levels to show that a department may employ many (or no) faculties and that each faculty may teach many (or
no) courses. For example, we have a COMPUTER department at the root level and as many instances of the FACULTY
record type as there are faculties in the computer department. Similarly, there will be as many COURSE record instances for
each FACULTY record as that faculty teaches. Thus, there is a one-to-many (1:m) association among record instances, moving
from the root to the lowest level of the tree. Since there are many departments in the university, there are many instances
of the DEPARTMENT record type, each with its own FACULTY and COURSE record instances connected to it by the appropriate
branches of the tree. The database then consists of a forest of such tree instances: as many instances of the tree type as
there are departments in the university at any given time. Collectively, these comprise a single hierarchical database, and
multiple databases may be online at a time.
Fig. 2.13. Hierarchical data model relationship of university tree type

Suppose we are interested in adding more information about departments to our hierarchical database. For example, since
each department teaches various subjects, we may want to keep a record of the subjects taught in each department of the
university. In that case, we would expand the diagram of Fig. 2.13 to look like that of Fig. 2.14. DEPARTMENT is still related
to FACULTY, which is related to COURSE. DEPARTMENT is also related to SUBJECT, which is related to TOPIC. We see from
this diagram that DEPARTMENT is at the top of a hierarchy from which a large amount of information can be derived.
Fig. 2.14. Hierarchical relationship of department with faculty and subject
The hierarchical database is one of the oldest database models, used by enterprises in the past. Information Management
System (IMS), developed jointly by IBM and the North American Rockwell Company for the mainframe computer platform,
was one of the first hierarchical databases. IMS became the world's leading hierarchical database system in the 1970s and
early 1980s. The hierarchical database model was the first major commercial implementation of a growing pool of database
concepts that were developed to counter the computer file system's inherent shortcomings.

2.7.4.1. Advantages of Hierarchical Data Model

Following are the advantages of hierarchical data model:


 Simplicity: Since the database is based on the hierarchical structure, the relationship between the various layers is
logically (or conceptually) simple and design of a hierarchical database is simple.
 Data sharing: Because all data are held in a common database, data sharing becomes practical.
 Data security: Hierarchical model was the first database model that offered the data security that is provided and
enforced by the DBMS.
 Data independence: The DBMS creates an environment in which data independence can be maintained. This
substantially decreases the programming effort and program maintenance.
 Data integrity: Given the parent/child relationship, there is always a link between a parent segment and the child
segments under it. Because the child segments are automatically referenced to their parent, this model promotes
data integrity.
 Efficiency: The hierarchical data model is very efficient when the database contains a large volume of data in one-
to-many (1:m) relationships and when the users require large numbers of transactions using data whose
relationships are fixed over time.
 Available expertise: Due to the large installed base of mainframe computers, experienced programmers were
available.
 Tried business applications: There was a large number of tried-and-true business applications available within the
mainframe environment.

2.7.4.2. Disadvantages of Hierarchical Data Model


 Implementation complexity: Although the hierarchical database is conceptually simple, easy to design and free of
data-independence problems, it is quite complex to implement. The DBMS requires knowledge of the physical level of
data storage, and database designers should have a very good knowledge of the physical data storage
characteristics. Any change in the database structure, such as the relocation of segments, requires changes in all
application programs that access the database. Therefore, implementation of a database design becomes very
complicated.
 Inflexibility: A hierarchical database lacks flexibility. Adding new relations or segments often yields very
complex system management tasks. A deletion of one segment may lead to the involuntary deletion of all the
segments under it. Such an error could be very costly.
 Database management problems: If you make any changes to the database structure of the hierarchical database,
then you need to make the necessary changes in all the application programs that access the database. Thus,
maintaining the database and the applications can become very difficult.
 Lack of structural independence: Structural independence exists when changes to the database structure do
not affect the DBMS's ability to access data. The hierarchical database is known as a navigational system because
data access requires that the preorder traversal (a physical storage path) be used to navigate to the appropriate
segments. So the application programmer should have a good knowledge of the relevant access paths to access
the data in the database. Modifications or changes in the physical structure can lead to problems with
application programs, which will also have to be modified. Thus, in a hierarchical database system the benefits of
data independence are limited by structural dependence.
 Application programming complexity: Applications programming is very time consuming and complicated. Due to
the structural dependence and the navigational structure, the application programmers and the end-users must
know precisely how the data is distributed physically in the database and how to write lines of control codes in
order to access data. This requires knowledge of complex pointer systems, which is often beyond the grasp of
ordinary users who have little or no programming knowledge.
 Implementation limitation: Many of the common relationships do not conform to the one-to-many relationship
format required by the hierarchical database model. For example, each student enrolled at a university can take
many courses, and each course can have many students. Such many-to-many (n:m) relationships, which are
more common in real life, are very difficult to implement in a hierarchical data model.
 No standards: In a hierarchical data model there is no precise set of standard concepts, nor does the implementation
of the model conform to a specific standard.
 Extensive programming efforts: Use of the hierarchical model requires extensive programming activity, and
therefore it has been called a system created by programmers for programmers. The modern data processing
environment does not accept such concepts.

2.7.5. Network Data Model


The Database Task Group of the Conference on Data System Languages (DBTG/CODASYL) formalized the network data
model in the late 1960s. The network data models were eventually standardised as the CODASYL model. The network data
model is similar to the hierarchical model except that a record can have multiple parents. The network data model has three
basic components, namely record types, data items (or fields) and links. Further, in network-model terminology, a
relationship is called a set, in which each set is composed of at least two record types. The first record type is called the
owner record, which is equivalent to the parent in the hierarchical model. The second record type is called the member
record, which is equivalent to the child in the hierarchical model. The connection between an owner and its member records
is identified by a link, to which database designers assign a set-name. This set-name is used to retrieve and manipulate data.
Just as the branches of a tree in the hierarchical data model represent access paths, the links between owners and their
members indicate access paths in network models and are typically implemented with pointers. In the network data model,
a member can appear in more than one set and thus may have several owners; the model therefore facilitates many-to-many
(n:m) relationships. A set itself represents a one-to-many (1:m) relationship between the owner and the member.
Fig. 2.15 shows a diagram of the network data model. It can be seen in the diagram that member 'B' has only one owner, 'A',
whereas member 'E' has two owners, namely 'B' and 'C'. Fig. 2.16 illustrates an example of implementing the network data
model for a typical sales organisation, in which CUSTOMER, SALES_REPRESENTATIVE, INVOICE, INVOICE_LINE, PRODUCT
and PAYMENT represent record types. It can be seen in Fig. 2.16 that INVOICE_LINE is owned by both PRODUCT and
INVOICE. Similarly, INVOICE has two owners, namely SALES_REPRESENTATIVE and CUSTOMER. In the network data model,
each link between two record types represents a one-to-many (1:m) relationship between them.
Fig. 2.15. Network data model

Fig. 2.16. Network data model for a sales organisation

Unlike the hierarchical data model, the network data model supports multiple paths to the same record, thus avoiding the
data redundancy problem associated with hierarchical systems.
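The owner/member sets of the sales example can be sketched as a small data structure. The following Python example is illustrative only: the set-names (PLACED_BY, SOLD_BY, CONTAINS, APPEARS_ON) and the record identifiers are invented; what it demonstrates is that a member record can participate in several sets and therefore have several owners, as INVOICE and INVOICE_LINE do in Fig. 2.16.

```python
# set-name -> (owner record, list of member records); names are assumptions.
sets = {
    "PLACED_BY":  ("CUSTOMER:C1",  ["INVOICE:I1", "INVOICE:I2"]),
    "SOLD_BY":    ("SALES_REP:S1", ["INVOICE:I1"]),
    "CONTAINS":   ("INVOICE:I1",   ["INVOICE_LINE:L1", "INVOICE_LINE:L2"]),
    "APPEARS_ON": ("PRODUCT:P1",   ["INVOICE_LINE:L1"]),
}

def owners_of(record):
    """Follow the set links to find every owner of a member record."""
    return sorted(owner for owner, members in sets.values() if record in members)

# INVOICE:I1 has two owners, just as INVOICE does in Fig. 2.16:
print(owners_of("INVOICE:I1"))        # ['CUSTOMER:C1', 'SALES_REP:S1']
print(owners_of("INVOICE_LINE:L1"))   # ['INVOICE:I1', 'PRODUCT:P1']
```

Each entry in `sets` is a one-to-many (1:m) link from one owner to its members; the many-to-many (n:m) behaviour emerges because the same member may appear in more than one set.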

2.7.5.1. Advantages of Network Data Model


 Simplicity: Similar to hierarchical data model, network model is also simple and easy to design.
 Facilitating more relationship types: The network model facilitates the handling of one-to-many (1:m) and many-to-
many (n:m) relationships, which helps in modelling real-life situations.
 Superior data access: Data access and flexibility are superior to those found in the hierarchical data model. An
application can access an owner record and all the member records within a set. If a member record in the set has
two or more owners (like a faculty member working for two departments), then one can move from one owner to another.
 Database integrity: Network model enforces database integrity and does not allow a member to exist without an
owner. First of all, the user must define the owner record and then the member.
 Data independence: The network data model provides sufficient data independence by at least partially isolating
the programs from complex physical storage details. Therefore, changes in the data characteristics do not require
changes in the application programs.
 Database standards: Unlike the hierarchical model, the network data model is based on the universal standards
formulated by the DBTG/CODASYL and augmented by ANSI-SPARC. All network data models conform to these
standards, which also include a DDL and a DML.

2.7.5.2. Disadvantages of Network Data Model


 System complexity: Like the hierarchical data model, the network model provides a navigational access mechanism
in which the data are accessed one record at a time. This mechanism makes the system implementation
very complex. Consequently, the DBAs, database designers, programmers and end-users must be familiar with the
internal data structure in order to access the data and take advantage of the system's efficiency. In other words,
network database models are also difficult to design and use properly.
 Absence of structural independence: It is difficult to make changes in a network database. If changes are made to
the database structure, all subschema definitions must be revalidated before any applications programs can access
the database. In other words, although the network model achieves data independence, it does not provide
structural independence.
 Not user-friendly: The network data model is not designed for user-friendly access and is a highly skill-oriented
system.

2.7.6. Relational Data Model

E.F. Codd of IBM Research first introduced the relational data model in a paper in 1970. The relational data model is
implemented using a very sophisticated Relational Database Management System (RDBMS). The RDBMS performs the same
basic functions as the hierarchical and network DBMSs, plus a host of other functions that make the relational data model
easier to understand and implement. The relational data model simplified the user's view of the database by using simple
tables instead of the more complex tree and network structures. It is a collection of tables (also called relations) as shown
in Fig. 2.17 (a) in which data is stored. Each of the tables is a matrix of a series of row and column intersections. Tables are
related to each other by sharing common entity characteristic. For example, a CUSTOMER table might contain an AGENT-
ID that is also contained in the AGENT table, as shown in Fig. 2.17 (a) and (b).
Fig. 2.17. Relational data model
(a) Relational tables
(b) Linkage between relational tables

Even though the customer and agent data are stored in two different tables, the common link between the CUSTOMER
and AGENT tables, which is AGENT-ID, helps in connecting or matching of the customer to its sales agent. Although tables
are completely independent of one another, data between the tables can be easily connected using common links. For
example, the agent of customer “Lions Distributors” of CUSTOMER table can be retrieved as “Greenlay & Co.” from AGENT
table with the help of a common link AGENT-ID, which is AO-9999.
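The CUSTOMER/AGENT link described above can be demonstrated with Python's built-in `sqlite3` module. This is a sketch, not the book's example code: the table and column names follow the text (with hyphens replaced by underscores, since unquoted SQL identifiers cannot contain hyphens), and the schema is a minimal assumption.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE AGENT (AGENT_ID TEXT PRIMARY KEY, AGENT_NAME TEXT)")
con.execute("CREATE TABLE CUSTOMER (CUST_NAME TEXT, AGENT_ID TEXT REFERENCES AGENT)")
con.execute("INSERT INTO AGENT VALUES ('AO-9999', 'Greenlay & Co.')")
con.execute("INSERT INTO CUSTOMER VALUES ('Lions Distributors', 'AO-9999')")

# The common column AGENT_ID connects the two otherwise independent tables:
row = con.execute(
    "SELECT A.AGENT_NAME FROM CUSTOMER C JOIN AGENT A ON C.AGENT_ID = A.AGENT_ID "
    "WHERE C.CUST_NAME = ?", ("Lions Distributors",)
).fetchone()
print(row[0])   # Greenlay & Co.
```

Note that the join is expressed declaratively: the query names the link (AGENT_ID) rather than navigating a stored access path, which is the relational model's key departure from the hierarchical and network models.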
2.7.6.1. Advantages of Relational Data Model
 Simplicity: A relational data model is even simpler than hierarchical and network models. It frees the designers
from the actual physical data storage details, thereby allowing them to concentrate on the logical view of the
database.
 Structural independence: Unlike hierarchical and network models, the relational data model does not depend on
the navigational data access system. Changes in the database structure do not affect the data access.
 Ease of design, implementation, maintenance and uses: The relational model provides both structural
independence and data independence. Therefore, it makes the database design, implementation, maintenance
and usage much easier.
 Flexible and powerful query capability: The relational database model provides very powerful, flexible, and easy-to-
use query facilities. Its structured query language (SQL) capability makes ad hoc queries a reality.

2.7.6.2. Disadvantages of Relational Data Model


 Hardware overheads: The relational data model needs more powerful computing hardware and data storage
devices to perform RDBMS-assigned tasks. Consequently, relational systems tend to be slower than the other database
systems. However, with rapid advancement in computing technology and the development of much more efficient
operating systems, this disadvantage is fading.
 Easy-to-design capability leading to bad design: The easy-to-use features of relational databases result in untrained
people generating queries and reports without much understanding of, or thought to, the need for proper database
design. As the database grows, poor design results in a slower system, degraded performance and data corruption.
2.7.7. Entity-Relationship (E-R) Data Model
An entity-relationship (E-R) model is a logical database model, which gives a logical representation of data for an enterprise
or business establishment. It was introduced by Chen in 1976. The E-R data model views data as a collection of objects of
similar structure called an entity set. The relationship between entity sets is classified by the number of entities from one
entity set that can be associated with the number of entities of another set, such as one-to-one (1:1), one-to-many (1:m), or
many-to-many (n:m) relationships, as explained in Chapter 1, Section 1.3.1.3. The E-R model is represented graphically as an
E-R diagram.
Fig. 2.18 shows building blocks or symbols to represent E-R diagram. The rectangular boxes represent entity, ellipses (or
oval boxes) represent attributes (or properties) and diamonds represent relationship (or association) among entity sets.
There is no industry standard notation for developing E-R diagram. However, the notations or symbols of Fig. 2.18 are
widely used building blocks for E-R diagram.
Fig. 2.18. Building blocks (symbols) of E-R diagram
Fig. 2.19 (a) illustrates a typical E-R diagram for a product sales organisation called M/s ABC & Co.

This organisation manufactures various products, which are sold to the customers against an order.

Fig. 2.19 (b) shows data items and records of entities. According to the E-R diagram of Fig. 2.19 (a), a customer having
identification no. 1001, name Waterhouse Ltd. with address Box 41, Mumbai [as shown in Fig. 2.19 (b)], is an entity since it
uniquely identifies one particular customer. Similarly, a product A1234 with a description Steel almirah and unit cost
of 4000 is an entity since it uniquely identifies one particular product and so on.
Fig. 2.19. E-R diagram for M/s ABC & Co

(a) E-R diagram for a product sales organisation


(b) Attributes (data items) and records of entities
Now the set of all products (all records in the PRODUCT table of Fig. 2.19 (b)) of M/s ABC & Co. is defined as the entity set
PRODUCT. Similarly, the entity set CUSTOMER represents the set of all the customers of M/s ABC & Co., and so on. An
entity set is represented by a set of attributes (called data items or fields). Each rectangular box represents an entity, for
example, PRODUCT, CUSTOMER and ORDER. Each ellipse (or oval shape) represents attributes (or data items or fields). For
example, attributes of entity PRODUCT are PROD-ID, PROD-DESC and UNIT-COST. CUSTOMER entity contains attributes
such as CUST-ID, CUST-NAME and CUST-ADDRESS. Similarly, entity ORDER contains attributes such as ORD-DATE, PROD-ID
and PROD-QTY. There is a set of permitted values for each attribute, called the domain of that attribute, as shown in Fig.
2.19 (b).
The E-R diagram has become a widely accepted data model and is used for the design of relational databases. Further
details on the E-R data model are given in Chapter 6.
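The entity sets of Fig. 2.19 can be translated into simple record structures, which is one way the E-R design maps onto an implementation. The Python sketch below is an assumption: the attribute names follow the text (PROD-ID, CUST-NAME and so on, written in Python style), the sample customer and product values come from the text, and the order's date and quantity are invented.

```python
from dataclasses import dataclass

@dataclass
class Product:              # entity set PRODUCT
    prod_id: str
    prod_desc: str
    unit_cost: int

@dataclass
class Customer:             # entity set CUSTOMER
    cust_id: int
    cust_name: str
    cust_address: str

@dataclass
class Order:                # entity set ORDER (attributes as in the text)
    ord_date: str
    prod_id: str
    prod_qty: int

c = Customer(1001, "Waterhouse Ltd.", "Box 41, Mumbai")
p = Product("A1234", "Steel almirah", 4000)
o = Order("2024-01-15", p.prod_id, 2)   # date and quantity are invented
print(c, p, o, sep="\n")
```

Each class corresponds to one rectangular box of the E-R diagram, each field to one ellipse (attribute), and the shared `prod_id` value carries the ORDERED-FOR style relationship between the order and the product.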
2.7.7.1. Advantages of E-R Data Model
 Straightforward relational representation: Having designed an E-R diagram for a database application, the
relational representation of the database model becomes relatively straightforward.
 Easy conversion for E-R to other data model: Conversion from E-R diagram to a network or hierarchical data model
can easily be accomplished.
 Graphical representation for better understanding: An E-R model gives graphical and diagrammatical
representation of various entities, its attributes and relationships between entities. This in turn helps in the clear
understanding of the data structure and in minimizing redundancy and other problems.

2.7.7.2. Disadvantages of E-R Data Model


 No industry standard for notation: There is no industry standard notation for developing an E-R diagram.
 Limited to high-level design: The E-R data model is suited mainly for high-level (conceptual) database design and does not by itself specify the lower-level details of the database.
2.7.8. Object-oriented Data Model
The object-oriented data model is a logical data model that captures the semantics of objects supported in object-oriented
programming. It is a persistent and sharable collection of defined objects, and it has the ability to model complete solutions.
Object-oriented database models represent entities as objects of a class, where the class describes both the object's
attributes and the behaviour of the entity. For example, a CUSTOMER class will have not only the customer attributes such as
CUST-ID, CUST-NAME, CUST-ADDRESS and so on, but also procedures that imitate actions expected of a customer, such as
update-order.

Instances of the class correspond to individual customers. Within an object, the class attributes take specific
values, which distinguish one customer (object) from another. However, all the objects belonging to the class share the
behaviour pattern of the class. The object-oriented database maintains relationships through logical containment.
The object-oriented database is based on the encapsulation of data and code related to an object into a single unit, whose
contents are not visible to the outside world. Therefore, object-oriented data models emphasise objects (a combination
of data and code), rather than data alone. This is largely due to their heritage from object-oriented
programming languages, where programmers can define new types or classes of objects that may contain their own
internal structures, characteristics and behaviours.

Thus, data is not thought of as existing by itself. Instead, it is closely associated with code (methods of member functions)
that defines what objects of that type can do (their behaviour or available services). The structure of object-oriented data
model is highly variable. Unlike traditional databases (such as hierarchical, network or relational), it has no single inherent
database structure. The structure for any given class or type of object could be anything a programmer finds useful, for
example, a linked list, a set, an array and so forth. Furthermore, an object may contain varying degrees of complexity,
making use of multiple types and multiple structures.

The object-oriented database management system (OODBMS) is among the most recent approaches to database
management. OODBMSs started in the engineering and design domains, and became the favoured system for
financial, telecommunications, and World Wide Web (WWW) applications. They are suited for multimedia applications as
well as for data with complex relationships that are difficult to model and process in a relational DBMS.
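The CUSTOMER class mentioned above (attributes plus an update-order behaviour, encapsulated in one unit) can be sketched in Python. The method bodies and the internal `_orders` structure are assumptions for illustration; the text specifies only that the class carries both state and behaviour.

```python
class Customer:
    """Entity modelled as a class: attributes plus behaviour in one unit."""

    def __init__(self, cust_id, cust_name, cust_address):
        self.cust_id = cust_id
        self.cust_name = cust_name
        self.cust_address = cust_address
        self._orders = {}            # encapsulated state, hidden from outside

    def update_order(self, prod_id, qty):
        """Behaviour expected of a customer: place or revise an order."""
        self._orders[prod_id] = qty

    def total_items(self):
        return sum(self._orders.values())

c = Customer(1001, "Waterhouse Ltd.", "Box 41, Mumbai")
c.update_order("A1234", 2)
c.update_order("A1234", 3)           # revising replaces the earlier quantity
print(c.total_items())               # 3
```

Each instance of the class corresponds to one customer object; all instances share the class's behaviour (`update_order`, `total_items`) while holding their own attribute values, which is the encapsulation the passage describes.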

2.7.8.1. Advantages of Object-oriented Data Model


 Capable of handling a large variety of data types: Unlike traditional databases (such as hierarchical, network or
relational), object-oriented databases are capable of storing different types of data, for example, pictures,
voice and video, in addition to text and numbers.
 Combining object-oriented programming with database technology: Object-oriented data model is capable of
combining object-oriented programming with database technology and thus, providing an integrated application
development system.
 Improved productivity: Object-oriented data models provide powerful features such as inheritance, polymorphism
and dynamic binding that allow the users to compose objects and provide solutions without writing object-specific
code. These features increase the productivity of the database application developers significantly.
 Improved data access: The object-oriented data model represents relationships explicitly, supporting both navigational
and associative access to information. This improves data access performance over relational value-based
relationships.
2.7.8.2. Disadvantages of Object-oriented Data Model
 No precise definition: It is difficult to provide a precise definition of what constitutes an object-oriented DBMS
because the name has been applied to a variety of products and prototypes, some of which differ considerably
from one another.
 Difficult to maintain: The definition of objects needs to be changed periodically, and existing databases must be
migrated to conform to the new object definitions as the organisation's information needs change. Changing object
definitions and migrating databases pose a real challenge.
 Not suited for all applications: Object-oriented data models are used where there is a need to manage complex
relationships among data objects. They are especially suited for specific applications such as engineering, e-
commerce, medicine and so on, and not for all applications. Their performance degrades, and their processing
requirements are high, when they are used for ordinary applications.

2.7.9. Comparison between Data Models

Table 2.2 summarises the characteristics of different data models discussed above.


Table 2.2. Comparison between different data models

Hierarchical: data elements are organised as files and records; relationships are represented by logical proximity in a
linearised tree; identity is record-based; the access language is procedural; data independence: yes; structural
independence: no.

Network: data elements are organised as files and records; relationships are represented by intersecting networks;
identity is record-based; the access language is procedural; data independence: yes; structural independence: no.

Relational: data elements are organised as tables; relationships are represented by embedding identifiers of rows in one
table as attribute values in another table; identity is value-based; the access language is non-procedural; data
independence: yes; structural independence: yes.

E-R: data elements are organised as objects and entity sets; relationships are represented by relational extenders that
support specialised applications; identity is value-based; the access language is non-procedural; data independence: yes;
structural independence: yes.

Object-oriented: data elements are organised as objects; relationships are represented by logical containment (related
objects are found within a given object by recursively examining attributes of the object that are themselves objects);
identity is record-based; the access language is procedural; data independence: yes; structural independence: yes.
2.8. TYPES OF DATABASE SYSTEMS

The classification of a database management system (DBMS) is greatly influenced by the underlying computing system on
which it runs, in particular the computer architecture, such as parallel, networked or distributed. However, a DBMS can also be
classified according to the number of users, the database site locations and the expected type and extent of use.
1. On the basis of the number of users:
1. Single-user DBMS.
2. Multi-user DBMS.
2. On the basis of the site locations:
1. Centralized DBMS.
2. Parallel DBMS.
3. Distributed DBMS.
4. Client/server DBMS.
3. On the basis of the type and the extent of use:
1. Transactional or production DBMS.
2. Decision support DBMS.
3. Data warehouse.
In this section, we will discuss some of the important types of DBMS that are presently in use.

2.8.1. Centralised Database System


The centralised database system consists of a single processor together with its associated data storage devices and other
peripherals. It is physically confined to a single location. The system offers data processing capabilities to users who are
located either at the same site or, through remote terminals, at geographically dispersed sites. The management of the
system and its data is controlled centrally from a single (central) site. Fig. 2.20 illustrates an example of a centralised
database system.

Fig. 2.20. Centralised database system


2.8.1.1. Advantages of Centralised Database System
 Most functions, such as update, backup, query, access control and so on, are easier to accomplish in a
centralised database system.
 The size of the database and the computer on which it resides need not have any bearing on whether the database
is centrally located. For example, a small enterprise with its database on a personal computer (PC) has a centralised
database, and so does a large enterprise whose database is entirely controlled by a mainframe.

2.8.1.2. Disadvantages of Centralised Database System


 When the central site computer or database system goes down, all users are blocked from using the
system until it comes back up.
 Communication costs from the terminals to the central site can be high.
To overcome the disadvantages of centralised database systems, parallel or distributed database systems are used; these
are discussed in Chapters 17 and 18.

2.8.2. Parallel Database System


The parallel database system architecture consists of multiple central processing units (CPUs) and data storage disks
operating in parallel, which improves processing and input/output (I/O) speeds. Parallel database systems are used in
applications that have to query extremely large databases or that have to process an extremely large number of
transactions per second. Several different architectures can be used for parallel database systems, as follows:
 Shared data storage disk
 Shared memory
 Hierarchical
 Independent resources.

Fig. 2.21 illustrates the different architectures of parallel database systems. In the shared data storage disk architecture, all the
processors share a common disk (or set of disks), as shown in Fig. 2.21 (a). In the shared memory architecture, all the processors share
common memory, as shown in Fig. 2.21 (b). In the independent resource architecture, the processors share neither a common
memory nor a common disk; they have their own independent resources, as shown in Fig. 2.21 (c). The hierarchical
architecture is a hybrid of the three earlier architectures, as shown in Fig. 2.21 (d). Further details on parallel database
systems are given in Chapter 17.

Fig. 2.21. Parallel database system architecture


2.8.2.1. Advantages of a Parallel Database System
 Parallel database systems are very useful for applications that have to query extremely large databases (of the
order of terabytes, that is, 10^12 bytes) or that have to process an extremely large number of transactions per
second (of the order of thousands of transactions per second).
 In a parallel database system, the throughput (that is, the number of tasks that can be completed in a given time
interval) and the response time (that is, the amount of time it takes to complete a single task from the time it is
submitted) are very high.
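The speedup obtained from horizontal partitioning can be sketched with Python's standard multiprocessing module. This is a toy illustration of the shared-nothing (independent resource) idea, not the internals of any particular DBMS; the data, the even-value predicate and the worker count are all invented for the example.

```python
from multiprocessing import Pool

def scan_partition(partition):
    # Each worker scans only its own fragment, as in a shared-nothing design
    return sum(x for x in partition if x % 2 == 0)

def parallel_sum_even(data, workers=4):
    # Horizontally partition the data, one fragment per worker
    fragments = [data[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        partials = pool.map(scan_partition, fragments)
    # Combine the partial results, as a parallel aggregate operator would
    return sum(partials)

if __name__ == "__main__":
    data = list(range(1_000))
    assert parallel_sum_even(data) == sum(x for x in data if x % 2 == 0)
```

Note that starting the worker pool has a cost of its own; for small inputs this startup time can exceed the scan time, which is exactly the speedup caveat discussed below.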

2.8.2.2. Disadvantages of a Parallel Database System


 In a parallel database system, there is a startup cost associated with initiating each process, and the startup time
may overshadow the actual processing time, affecting speedup adversely.
 Since processes executing in a parallel system often access shared resources, a slowdown may result from the
interference of each new process as it competes with existing processes for commonly held resources, such as
shared data storage disks, the system bus and so on.

2.8.3. Client/Server Database System


The client/server architecture of a database system has two logical components, namely the client and the server. Clients are generally
personal computers or workstations, whereas the server is a large workstation, mid-range computer system or a mainframe
computer system. The applications and tools of the DBMS run on one or more client platforms, while the DBMS software
resides on the server. The server computer is called the back-end and the client's computer the front-end; these server and
client computers are connected in a network. The applications and tools act as clients of the DBMS, making requests for
its services. The DBMS, in turn, processes these requests and returns the results to the client(s). The client
handles the graphical user interface (GUI) and performs computations and other programming of interest to the end user. The
server handles the parts of the job that are common to many clients, for example, database access and updates.
Fig. 2.22 illustrates the client/server database architecture.

Fig. 2.22. Client/server database architecture


As shown in Fig. 2.22, the client/server database architecture consists of three components namely, client applications, a
DBMS server and a communication network interface. The client applications may be tools, user-written applications or
vendor-written applications. They issue SQL statements for data access. The DBMS server stores the related software,
processes the SQL statements and returns results. The communication network interface enables client applications to
connect to the server, send SQL statements and receive results or error messages or error return codes after the server
has processed the SQL statements. In client/server database architecture, the majority of the DBMS services are
performed on the server.
The client/server architecture is a part of the open systems architecture in which all computing hardware, operating
systems, network protocols and other software are interconnected as a network and work in concert to achieve user
goals. It is well suited for online transaction processing and decision support applications, which tend to generate a
number of relatively short transactions and require a high degree of concurrency.
Further details on the client/server database system are given in Chapter 18, Section 18.3.1.
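The request/response flow described above can be sketched in Python, assuming the built-in sqlite3 module stands in for the back-end DBMS and an ordinary method call stands in for the communication network interface. The emp table and its rows are invented for illustration.

```python
import sqlite3

class DBServer:
    """Back-end: owns the database and processes SQL requests."""
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE emp (id INTEGER PRIMARY KEY, name TEXT)")
        self.conn.executemany("INSERT INTO emp VALUES (?, ?)",
                              [(1, "Asha"), (2, "Ravi")])

    def handle_request(self, sql, params=()):
        # In a real system the SQL statement arrives over the network
        try:
            cur = self.conn.execute(sql, params)
            return {"ok": True, "rows": cur.fetchall()}
        except sqlite3.Error as exc:
            # The server returns an error message/return code to the client
            return {"ok": False, "error": str(exc)}

class DBClient:
    """Front-end: builds SQL statements and presents the results."""
    def __init__(self, server):
        self.server = server  # stands in for the network connection

    def query(self, sql, params=()):
        return self.server.handle_request(sql, params)

client = DBClient(DBServer())
reply = client.query("SELECT name FROM emp WHERE id = ?", (1,))
# reply["rows"] is [("Asha",)]
```

The point of the split is visible in the code: the client never touches the data files; it only sends statements and receives rows or error messages, while all storage and SQL processing stay on the server.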

2.8.3.1. Advantages of Client/server Database System


 Client/server systems run on less expensive platforms, supporting applications that had previously been running only on
large and expensive mini or mainframe computers.
 Clients offer an icon-based, menu-driven interface, which is superior to the traditional command-line, dumb-terminal
interface typical of mini and mainframe computer systems.
 The client/server environment enables users to work more productively and make better use of existing data.
 A client/server database system is more flexible than a centralised system.
 Response time and throughput are high.
 The server (database) machine can be custom-built (tailored) to the DBMS function and thus can provide better
DBMS performance.
 The client (application) machine might be a personal workstation, tailored to the needs of the end users and thus
able to provide better interfaces, high availability, faster responses and overall improved ease of use.
 A single database (on the server) can be shared across several distinct client (application) systems.

2.8.3.2. Disadvantages of Client/Server Database System


 Labour or programming costs are high in client/server environments, particularly in the initial phases.
 There is a lack of management tools for diagnosis, performance monitoring, tuning and security control for the
DBMS, the client, and the operating system and networking environments.
2.8.4. Distributed Database System
Distributed database systems are similar to the client/server architecture in a number of ways. Both typically involve the use
of multiple computer systems and enable users to access data from remote systems. However, a distributed database system
broadens the extent to which data can be shared, well beyond what can be achieved with the client/server system.

Fig. 2.23 shows a diagram of the distributed database architecture.

Fig. 2.23. Distributed database system

As shown in Fig. 2.23, in a distributed database system, data is spread across a variety of different databases. These are
managed by a variety of different DBMS software running on a variety of different computing machines supported by a
variety of different operating systems. These machines are spread (or distributed) geographically and connected together
by a variety of communication networks. In a distributed database system, one application can operate on data that is
spread geographically across different machines. Thus, the enterprise data might be distributed over different computers
in such a way that the data for one portion (or department) of the enterprise is stored in one computer and the data for
another department in another. Each machine can have data and applications of its own. However, the users on one
computer can access data stored in several other computers. Therefore, each machine will act as a server for some users
and a client for others. Further details on distributed database systems are given in Chapter 18.
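The fan-out of one query to several sites can be sketched with two in-memory sqlite3 databases standing in for geographically separate machines. The orders table, its horizontal fragmentation by department and the amounts are all invented for the example; a real distributed DBMS would also handle query optimisation, network transport and failures.

```python
import sqlite3

def make_site(rows):
    # Each site runs its own DBMS and stores one fragment of the data
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, dept TEXT, amount REAL)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn

# Horizontal fragmentation: each department's rows live at their own site
site_a = make_site([(1, "sales", 100.0), (2, "sales", 250.0)])
site_b = make_site([(3, "hr", 80.0)])

def distributed_total(sites):
    # One application query fans out to every site; partial sums are combined
    return sum(conn.execute("SELECT COALESCE(SUM(amount), 0) FROM orders")
                   .fetchone()[0]
               for conn in sites)

total = distributed_total([site_a, site_b])  # 430.0
```

Each site answers only for its own fragment, yet the application sees a single logical total, which is the transparency a distributed database system aims to provide.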
2.8.4.1. Advantages of Distributed Database System
 Distributed database architecture provides greater efficiency and better performance.
 Response time and throughput is high.
 The server (database) machine can be custom-built (tailored) to the DBMS function and thus can provide better
DBMS performance.
 The client (application) machine might be a personal workstation, tailored to the needs of the end users and thus
able to provide better interfaces, high availability, faster responses and overall improved ease of use.
 A single database (on server) can be shared across several distinct client (application) systems.
 As data volumes and transaction rates increase, users can grow the system incrementally.
 It causes less impact on ongoing operations when adding new locations.
 Distributed database system provides local autonomy.

2.8.4.2. Disadvantages of Distributed Database System


 Recovery from failure is more complex in distributed database systems than in centralized systems.
