
Chapter 1

Introduction
• Database Systems
• Concepts of Database Management System (DBMS)
• DBMS Applications
• Historical Roots of DBMS – Flat File Systems
• Historical Development of Database Systems
• Advantages/Purpose of DBMS
• DBMS Architecture
• Data Independence
• Instances and Schemas
• Data Models
• Client/Server Architecture
Database System

• A database system consists of logically related data stored in a single
logical data repository.
• A database system may be physically distributed among multiple
storage facilities.
• A DBMS eliminates most of the file system’s problems.
• The current generation of DBMS stores data structures, relationships between
structures, and access paths. It also defines, stores, and manages all
access paths and components.
Concepts of Database Management System
• A database is a collection of related, persistent data that
contains information relevant to an enterprise. The
database is also called the repository or container for a
collection of data files. For example, a university database
maintains information about students, courses, and
grades in the university.
• A database system is basically just a computerized
record-keeping system. A database system involves four
major components: data, hardware, software, and users.
• The database management system (DBMS) is the
software that handles all access to the database. It is
defined as a collection of interrelated data and a set of
programs to access those data.
Database Management System
• Users of a DBMS can perform the following basic
operations on the database:
 Adding new, empty files to the database.
 Inserting data into existing files.
 Retrieving data from existing files.
 Changing data in existing files.
 Deleting data from existing files.
 Removing existing files from the database.
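The six operations listed above can be sketched as follows. This is only an illustration, assuming Python's built-in sqlite3 module as a stand-in DBMS, with a table playing the role of a "file"; the student table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# 1. Add a new, empty "file" (a table) to the database.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, grade TEXT)")

# 2. Insert data into the existing table.
conn.executemany("INSERT INTO student (id, name, grade) VALUES (?, ?, ?)",
                 [(1, "Asha", "B"), (2, "Bikram", "A")])

# 3. Retrieve data from the table.
rows_before = conn.execute("SELECT id, name, grade FROM student ORDER BY id").fetchall()

# 4. Change data in the table.
conn.execute("UPDATE student SET grade = ? WHERE id = ?", ("A", 1))

# 5. Delete data from the table.
conn.execute("DELETE FROM student WHERE id = ?", (2,))
rows_after = conn.execute("SELECT id, name, grade FROM student ORDER BY id").fetchall()

# 6. Remove the table from the database.
conn.execute("DROP TABLE student")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
```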

Database Applications
• Databases form an essential part of almost all enterprises.
Some database applications are given below:
 Banking: For customer information, accounts, loans, and
banking transactions.
 Airlines: For reservation and schedule information.
 Universities: For student information, course registrations, and
grades.
 Credit card transactions: For purchases on credit cards and
generation of monthly statements.
 Telecommunication: For keeping records of calls made, generating
monthly bills, maintaining balances on prepaid calling cards, and
storing information about the communication networks.
 Finance: For storing information about holdings, sales, and
purchases of financial instruments such as stocks and bonds.
 Sales: For customer, product, and purchase information.
 Manufacturing: For management of the supply chain and for tracking
production of items in factories, inventories of items in
warehouses/stores, and orders for items.
 Human resources: For information about employees, salaries,
payroll taxes and benefits, and for generation of paychecks.
Historical Roots: Files and File Systems
• Manual file systems were typically composed of a collection of file
folders, each tagged and kept in a cabinet.
• The contents of each file folder were logically related.
• Computerized file systems are software that manages the data of the
organization.
• Data processing (DP) specialists developed computerized file
systems.
• Each file used its own application program to store, retrieve, and
modify data.
• Each file was owned by the individual or department that commissioned
its creation.
Flat-file Systems
• In early processing systems, an organization's information
was stored as groups of records in separate files. The file
processing systems consisted of a few data files and many
application programs. Each file, called a flat file, contained
processed information for one specific function, such as
accounting or inventory.
• A flat file system stores data in a plain text file. Each line
of the text file holds one record, with fields separated by
delimiters, such as commas or tabs.
• A File Management System (FMS) accommodates flat files
that have no relation to other files. This type of database is
ideal for simple databases that do not contain a lot of
repeated information. Examples include an Excel spreadsheet
or a Word data list file.
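The one-record-per-line layout described above can be read with a few lines of code. This is a minimal sketch using Python's standard csv module; the file contents and the read_records helper are hypothetical.

```python
import csv
import io

# Hypothetical flat file contents: one record per line, comma-delimited fields.
FLAT_FILE = "101,Asha,Accounting\n102,Bikram,Inventory\n"

def read_records(text, delimiter=","):
    """Split each line of a delimited flat file into a list of fields."""
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))

records = read_records(FLAT_FILE)
```

Note that nothing in the file itself says what each field means; that knowledge lives only in the program that reads it.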
Disadvantages of File Processing Systems
a. Program-data dependence:
File descriptions are stored within each application program that accesses a given
file. As a consequence, any change to a file structure requires changes to the file
descriptions for all programs that access the file.
Suppose it is decided to change the customer address field length in the records in
a file from 30 to 40 characters. The file descriptions in each program that is affected
would have to be modified. It is often difficult to locate all programs affected by
such changes.
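The address-length example can be made concrete with a small sketch. It assumes a fixed-width binary record described with Python's struct module; the layouts, field sizes other than 30 and 40, and the sample record are hypothetical.

```python
import struct

# Hypothetical fixed-width customer record: address field, then a 10-byte phone field.
OLD_LAYOUT = struct.Struct("30s10s")   # description compiled into an existing program
NEW_LAYOUT = struct.Struct("40s10s")   # the file after the address grows to 40 characters

def read_phone(record, layout):
    """Extract the phone field using the layout a program assumes."""
    _address, phone = layout.unpack(record[:layout.size])
    return phone.rstrip(b"\x00 ")

# A record written in the new 40-character format.
record = NEW_LAYOUT.pack(b"12 Long Street Name, Apartment 4B, City", b"5550100")

phone_seen_by_old_program = read_phone(record, OLD_LAYOUT)      # reads address bytes as phone
phone_seen_by_updated_program = read_phone(record, NEW_LAYOUT)  # b"5550100"
```

Every program that still carries the old layout silently misreads the file until its own copy of the description is updated.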

b. Duplication of data:
Because applications are often developed independently in file processing systems,
unplanned duplicate data files are the rule rather than the exception. This
duplication is wasteful because it requires additional storage space and increased
effort to keep all files up to date. Duplicate data files often result in loss of data
integrity because the data formats may be inconsistent or the data values may not
agree. For example, the same data item may have different names in different files.
c. Limited data sharing:
With the traditional file processing approach, each application has its own private
files and users have little opportunity to share data outside their own applications.
It is often frustrating to managers to find that a requested report will require a
major programming effort to obtain data from several incompatible files in separate
systems. Data are scattered in various files, and the files may be in different
formats. Writing a new application program to retrieve data is difficult.
d. Lengthy development times:
With the traditional file processing approach, there is little opportunity to leverage
the previous development efforts. Each new application requires that the developer
essentially start from scratch by designing new file formats and descriptions. The
lengthy development times required are often inconsistent with today’s fast-paced
business environment.
e. Excessive program maintenance:
The preceding factors combine to create a heavy program maintenance load in
organizations that rely on traditional file processing systems. As much as 80% of the total
information systems development budget may be devoted to program maintenance in such
organizations. This leaves little opportunity for developing new applications.
Historical Development of Database Systems
1960’s:
File processing systems were still dominant during this period.
The first database management systems were introduced
during that decade and were used for large and complex
ventures such as the Apollo moon landing project. The first
efforts at standardization were taken up with the formation of
the Data Base Task Group in the late 1960’s.

1970’s:
During this decade the use of database management systems became a commercial reality.
The hierarchical and network database management systems were developed largely to cope
with increasingly complex data structures, such as manufacturing bills of materials, that
were extremely difficult to manage with conventional file processing methods. The network
and hierarchical models are generally called first-generation DBMS.
Major Disadvantages:
1. Difficult access to data, based on navigational record-at-a-time procedures.
2. Very limited data independence, so that programs are not insulated from changes to data
formats.
3. No widely accepted theoretical foundation for either model, unlike the relational data
model.

1980’s:
To overcome the above limitations, E. F. Codd and others developed the relational data
model during the 1970’s. The model, considered second-generation DBMS, received
widespread commercial acceptance and diffusion during the 1980’s. With the relational
model, all data are represented in the form of tables. A relatively simple fourth-generation
language called SQL (Structured Query Language) is used for data retrieval.

1990’s:
This decade started a new era of computing, first with client/server computing and then
with increasingly important Internet applications. To cope with increasingly complex
data, object-oriented databases were introduced during the late 1980’s. Since organizations
must manage a vast amount of both structured and unstructured data, both relational
and object-oriented databases are of great importance today.

2000 and Beyond:
1. The ability to manage increasingly complex data types. These types include
multidimensional data, which has already assumed importance in data warehouse
applications.
2. The continued development of ‘universal servers’ based on object-relational DBMS.
These are database servers that can manage a wide range of data types transparently to
users.
3. Fully distributed databases will become a reality, as an organization will be able to
physically distribute its databases to multiple locations and update them automatically.
4. Content-addressable storage will become more popular. For example, a user can scan a
photograph and have the computer search for the closest match to that photo.
5. Database and other technologies, such as artificial intelligence and television-like
information services, will make database access much easier for untrained users.

Purpose/Advantages of DBMS
• In the early days, database applications were built on top of
file systems. These systems have many drawbacks.
Database systems offer solutions to these drawbacks of
file systems. The benefits of using a DBMS are:
 Redundancy can be reduced: The database is said to be
redundant if the same information is duplicated in several places
(data files). For example, the address and telephone number of a
particular customer may appear in a file that consists of savings-account
records and in a file that consists of checking-account
records. We can reduce redundancy by using a DBMS.
 Inconsistency can be avoided: The database is said to be
inconsistent if various copies of the same data no longer
agree. For example, a changed customer address may be reflected
in the savings-account records but not elsewhere in the system. By using
a DBMS we can avoid inconsistency.
 Data can be shared: The data in the database can be shared
among many users and applications.
 Transaction support can be provided: A transaction involves
several database operations. For example, transferring a cash
amount from account A to account B requires two update
operations.
 Integrity can be maintained: The problem of integrity is the
problem of ensuring that the data in the database is correct. By using
a DBMS, we can maintain integrity.
 Security can be enforced: Not every user of the database system
should be able to access all data.
 Efficient backup and recovery can be provided: The DBMS provides
facilities for recovering from software and hardware failures to
reinstate the database to a previous consistent state.
 Data in the database can be accessed easily.
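The account A to account B transfer mentioned above can be sketched as a single transaction. This is an illustration only, assuming Python's built-in sqlite3 module; the account table, the CHECK constraint, and the transfer helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is an integrity rule: a balance may never go negative.
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, "
             "balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Apply both updates as one transaction: either both happen or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?",
                         (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?",
                         (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

first_ok = transfer(conn, "A", "B", 30)     # succeeds
second_ok = transfer(conn, "A", "B", 500)   # debit would violate the CHECK; credit is undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

In the failed transfer the credit to B has already been applied when the debit is rejected, so the rollback is what keeps the two accounts consistent.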
Database Users
• Database users can be classified into two categories: actors
on the scene and workers behind the scene.
• Actors on the Scene:
 These people’s jobs involve developing, using, and administering the
database. These people are classified into the following categories:
 Database administrators: These people are responsible for
authorizing access to the database, for coordinating and monitoring
its use, acquiring software and hardware resources, controlling its
use, and monitoring the efficiency of operations. Thus the DBA is
responsible for the overall control of the system at the technical level.
Hence, the DBA is responsible for the following tasks:
• Defining Conceptual Schema
• Defining Internal Schema
• Defining Security & Integrity Constraints
• Monitoring performance and responding to changing requirements
• Liaising with users
• Defining Dump and Reload policies
 Database Designers: These people are responsible for defining
the content, the structure, the constraints, and functions or
transactions against the database. They must communicate with the
end-users and understand their needs.
 End Users: They use the database for querying, updating and
generating reports. End-users can be categorized as follows:
• Casual end users: They access the database occasionally when needed.
They use a sophisticated database query language. They are typically middle- or
high-level managers.
• Naive or Parametric end users: They make up a large section of the
end-user population. They use previously well-defined functions in
the form of “canned transactions” against the database. For example,
bank-tellers, reservation clerks.

• Sophisticated end users: They have clear knowledge of database
system facilities and construct complex queries. They include engineers
and scientists, and they make use of most database facilities.
• Stand-alone end users: They make use of personal databases by using
ready-made program packages that provide easy-to-use, menu-based
interfaces.
 System Analysts and Application Programmers:
• System analysts determine the requirements of end users, especially
naive or parametric end users, and develop specifications for canned
transactions that meet these requirements.
• Application programmers implement these specifications as
programs; then they test, debug, document and maintain these canned
transactions.

• Workers Behind the Scene:
 These people are associated with the design, development, and
operation of the DBMS software and system environment. These
people are not actively interested in the database itself. These
people are classified into the following types:
 DBMS System Designers and implementers: These people
design and implement the DBMS modules and interfaces as a
software package.
 Tool Developers: These persons design and implement tools – the
software packages that facilitate database system design and use
and that help improve performance.
 Operators and Maintenance Personnel: These personnel are
responsible for the actual running and maintenance of the hardware
and software environment for the database system.

DBMS Architecture
• The DBMS architecture proposed by ANSI/SPARC (American National
Standards Institute, Standards Planning And Requirements Committee),
known as the ANSI/SPARC architecture, is defined at three levels.
This architecture is also called the three-schema architecture.
• This architecture provides three levels of abstraction
to simplify users’ interaction with the system.
• It provides users with an abstract view of data. The system
hides certain details of how data are stored and maintained.
• The goal of this architecture is to separate the user
applications from the physical database.
• It divides the system into three levels of abstraction:
the internal or physical level, the logical or conceptual level, and
the external or view level.
Nanda Kishor Ray
Fig: The three-schema architecture. End users see External View 1 through External View n (external level); an external/conceptual mapping connects these views to the Conceptual Schema (conceptual level); a conceptual/internal mapping connects it to the Internal Schema (internal level), which describes the stored database.
• Physical Level or Internal Level:
 It is the lowest level of abstraction and describes how the data in the
database are actually stored.
 This level describes complex low-level data structures in detail and is
concerned with the way the data is physically stored.
 Data actually exists only at the physical level.
• Logical Level or Conceptual Level:
 This is the next higher level of abstraction and describes what data are
stored in the database, and what relationships exist among those data.
 It describes the structure of whole database and hides
details of physical storage structure.
 It concentrates on describing entities, data types,
relationships, attributes and constraints.
 All of the views must be derivable from this conceptual schema.
• View Level or External Level:
 It is the highest level of abstraction and is concerned with the way
the data is seen by individual users.
 This level simplifies the users’ interaction with the system.
 It includes a number of user views and hence is guided by the end
user requirement.
 It describes only that part of the database in which the users are
interested and hides the rest from those users. Each user group
refers to its own external schema.
• The DBMS must transform a request specified on an
external schema into a request against the conceptual
schema, and then into a request on the internal schema for
processing over the database. The process of transforming
requests and results between levels is called mapping.
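The mapping from an external schema to the conceptual schema can be illustrated with a view. This is a rough sketch, assuming Python's built-in sqlite3 module; the employee table and the staff_directory view are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the full logical structure (a hypothetical employee table).
conn.execute("CREATE TABLE employee "
             "(id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)",
                 [(1, "Asha", "Sales", 52000), (2, "Bikram", "HR", 48000)])

# External level: one user group's view, which hides the id and salary columns.
conn.execute("CREATE VIEW staff_directory AS SELECT name, dept FROM employee")

# A request against the external schema; the DBMS maps it onto the base table.
cursor = conn.execute("SELECT * FROM staff_directory ORDER BY name")
view_columns = [col[0] for col in cursor.description]
directory = cursor.fetchall()
```

The user of staff_directory never names the employee table or its storage; the DBMS performs that translation on every request.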
Data Abstraction
Data Abstraction in Database Management Systems enables the conceptualization,
encapsulation, and manipulation of data at different levels of abstraction. It
supports modularity, reusability, information hiding, and facilitates improved
development and maintenance of database systems. By abstracting the
complexities of data storage and access, DBMS allows users to interact with the
data using high-level constructs, ensuring efficiency, security, and adaptability in
managing large amounts of structured information.

Here are some key points about data abstraction:
1. Conceptualization: In the realm of DBMS, data abstraction involves
conceptualizing the database system by identifying the essential entities,
attributes, and relationships that need to be represented. This abstraction
allows database designers and administrators to focus on the logical structure
and organization of the data, disregarding the underlying physical storage
details.
2. Encapsulation: Encapsulation in DBMS refers to bundling the data and the
operations that can be performed on that data into a single unit, known as a
table or a relation. Tables encapsulate related data items and define a well-
defined structure (schema) that represents the attributes and their
relationships. The internal implementation details of the table, such as storage
mechanisms, indexing, and access methods, are hidden from the users, who
interact with the table using SQL queries and commands.
3. Modularity and Reusability: Data abstraction in DBMS promotes modularity and
reusability by allowing the use of pre-defined abstract data types, such as tables,
views, and stored procedures. These abstractions provide a way to organize and
encapsulate data and functionality, which can be reused in different parts of the
database or in other database systems. For example, views allow users to define
virtual tables based on existing data, abstracting the underlying complexity and
providing a simplified interface for data retrieval and manipulation.

4. Information Hiding: Data abstraction in DBMS supports information hiding by
allowing users to interact with the database system through a well-defined
interface. Users can perform operations on the database, such as querying,
inserting, updating, and deleting records, without needing to know the
implementation details of how the data is stored and managed internally. This
abstraction protects the integrity and security of the data by controlling the access
and providing a controlled environment for data manipulation.
5. Abstraction Levels: In DBMS, data abstraction can be achieved at different levels.
At the logical level, database designers and administrators work with conceptual
data models, such as entity-relationship (ER) diagrams or relational schemas, to
define the structure and relationships of the data. At the physical level, DBMS
handles the storage, indexing, and access methods to optimize data retrieval and
ensure efficient management of the underlying storage structures. The DBMS
provides a layer of abstraction between these levels, enabling users to interact with
the database at a higher-level language, such as SQL, without concerning
themselves with the low-level implementation details.

6. Improved Development and Maintenance: Data abstraction in DBMS improves
the development and maintenance of database systems. It allows for a clear
separation of concerns between the logical representation of the data and its
physical implementation, facilitating changes and modifications without affecting
the overall database schema. This abstraction also promotes data independence,
where modifications to the physical storage structures or indexing methods do not
require changes to the logical schema or the applications that use the database.
This enhances the flexibility and adaptability of the database system to evolving
requirements.

Data Independence
• The three schema architecture further explains the concept
of data independence, the capacity to change the schema at
one level without having to change the schema at the next
higher level.
 Logical Data Independence
 Physical Data Independence
• Logical Data Independence:
 The capacity to change the conceptual schema without having to
change the external schemas and their associated application
programs.
• Physical Data Independence:
 The capacity to change the internal schema without having to
change the conceptual schema.
• For example, the internal schema may be changed when
certain file structures are reorganized or new indexes are
created to improve database performance
• When a schema at a lower level is changed, only the
mappings between this schema and higher-level schemas
need to be changed in a DBMS that fully supports data
independence. The higher-level schemas themselves are
unchanged.
• Hence, the application programs need not be changed since
they refer to the external schemas.
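Physical data independence, as in the new-index example above, can be sketched as follows. This assumes Python's built-in sqlite3 module and SQLite's EXPLAIN QUERY PLAN statement; the course table, the index name, and the plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT, title TEXT, credits INTEGER)")
conn.executemany("INSERT INTO course VALUES (?, ?, ?)",
                 [("CS101", "Databases", 3), ("CS102", "Networks", 3),
                  ("MA201", "Calculus", 4)])

QUERY = "SELECT title FROM course WHERE code = ?"

def plan(conn):
    """Return SQLite's textual description of how it will run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("CS102",)).fetchall()
    return " ".join(row[-1] for row in rows)

result_before = conn.execute(QUERY, ("CS102",)).fetchall()
plan_before = plan(conn)

# Internal-schema change: add an access path. The table definition and QUERY are untouched.
conn.execute("CREATE INDEX idx_course_code ON course (code)")

result_after = conn.execute(QUERY, ("CS102",)).fetchall()
plan_after = plan(conn)
```

The application's query text and its result stay the same; only the access path chosen by the DBMS changes.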

Instances and Schemas
• Instances:
 Databases change over time as information is inserted and deleted.
The collection of information stored in the database at a particular
moment is called an instance of the database.
 It is also known as database state.
• Schema:
 The overall design of the database which is not expected to change
frequently is called the database schema.
 There are three schemas, partitioned according to the levels of
abstraction. The physical schema describes the database design at
the physical level. The logical schema describes the database design at
the logical level. The schema at the view level is sometimes called
a subschema and describes a view of the database. A database may
have several subschemas.
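The difference between a schema and an instance can be seen by watching both while data changes. This is a minimal sketch, assuming Python's built-in sqlite3 module, where sqlite_master holds the stored table definitions; the grade table and the two helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE grade (student TEXT, course TEXT, mark INTEGER)")

def schema(conn):
    """The stored table definitions (the database schema)."""
    return [row[0] for row in
            conn.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")]

def instance(conn):
    """The rows present right now (the current database state)."""
    return conn.execute(
        "SELECT student, course, mark FROM grade ORDER BY student").fetchall()

schema_at_start = schema(conn)
state_1 = instance(conn)
conn.execute("INSERT INTO grade VALUES ('Asha', 'CS101', 81)")
state_2 = instance(conn)
conn.execute("DELETE FROM grade WHERE student = 'Asha'")
conn.execute("INSERT INTO grade VALUES ('Bikram', 'CS101', 74)")
state_3 = instance(conn)
schema_at_end = schema(conn)
```

Three different instances are observed, while the schema read at the start and at the end is identical.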
Data Models
• The basic structure or design of the database is the data
model. A data model is a collection of conceptual tools for
describing data, data relationships, data semantics, and
consistency constraints. Some data models are given below:
• Entity-Relationship Model:
 The entity-relationship (E-R) model is a high-level data model based
on a perception of the real world as a collection of basic
objects, called entities, and of relationships among these entities.
 An entity is a thing or object in the real world that is distinguishable
from other objects.
 Entities are described in a database by a set of attributes.
 A relationship is an association among several entities.
 The set of all entities of the same type is called an entity set and the
set of all relationships of the same type is called a relationship set.
 The overall logical structure of a database can be expressed graphically
by an E-R diagram. The basic components of this diagram are:
• Rectangles (represent entity sets)
• Ellipses (represent attributes)
• Diamonds (represent relationship sets among entity sets)
• Lines (link attributes to entity sets and entity sets to relationship sets)
 The figure below shows an example of an E-R diagram.

 In addition, the E-R model also represents certain constraints to
which the contents of the database must conform. The constraints
are mapping cardinalities and participation constraints.
(discussed later)
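The E-R vocabulary above (entities, attributes, entity sets, relationship sets) can be mirrored in a few lines of code. This is only an illustrative sketch using Python dataclasses; Student, Course, enrolls, and students_in are hypothetical names, not part of any E-R notation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Student:          # an entity described by its attributes
    student_id: int
    name: str

@dataclass(frozen=True)
class Course:
    code: str
    title: str

# Entity sets: all entities of the same type.
students = {Student(1, "Asha"), Student(2, "Bikram")}
courses = {Course("CS101", "Databases")}

# Relationship set: associations between entities of the two entity sets.
enrolls = {(Student(1, "Asha"), Course("CS101", "Databases")),
           (Student(2, "Bikram"), Course("CS101", "Databases"))}

def students_in(course):
    """Names of students related to the given course through enrolls."""
    return sorted(s.name for s, c in enrolls if c == course)

roster = students_in(Course("CS101", "Databases"))
```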
• Relational Model:
 It is currently the most pervasive model. The relational model is a lower
level model that uses a collection of tables to represent both data and
relationships among those data. Each table has multiple columns,
and each column has a unique name. Each table corresponds to an
entity set or relationship set, and each row represents an instance of
that entity set or relationship set.
 Relationships link rows from two tables by embedding row
identifiers (keys) from one table as attribute values in the other
table.
 Structured query language (SQL) is used to manipulate data stored
in tables.
Fig: A sample relational database
 The relational data model is the most widely used data model, and a
vast majority of current database systems are based on the relational
model. The relational model is at a lower level of abstraction than
the E-R model. Database designs are often carried out in the E-R
model, and then translated to the relational model.
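The idea of linking rows by embedding a key from one table in another can be sketched as follows, assuming Python's built-in sqlite3 module; the department and instructor tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT)")
conn.execute("""CREATE TABLE instructor (
                    instr_id INTEGER PRIMARY KEY,
                    name TEXT,
                    dept_id INTEGER REFERENCES department (dept_id))""")
conn.execute("INSERT INTO department VALUES (10, 'Physics')")
conn.executemany("INSERT INTO instructor VALUES (?, ?, ?)",
                 [(1, "Asha", 10), (2, "Bikram", 10)])

# The embedded key (dept_id) lets a join reconstruct the relationship.
pairs = conn.execute("""SELECT i.name, d.dept_name
                        FROM instructor AS i
                        JOIN department AS d ON i.dept_id = d.dept_id
                        ORDER BY i.name""").fetchall()

# A row pointing at a department that does not exist is rejected.
try:
    conn.execute("INSERT INTO instructor VALUES (3, 'Chandra', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```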
• Object-oriented Model:
 This model represents an entity set as a class. A class represents
both object attributes as well as the behavior of the entity.
 Instances of a class are objects. Within an object, the class attributes
take specific values. However, the behavior patterns of the class are
shared by all the objects belonging to the class.
 Attribute values can be primitive data types usually associated with
databases and programming languages or other objects. The object-
oriented model maintains relationships through ‘logical-
containment’.
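The object-oriented description above maps directly onto classes in a programming language. This is a small sketch in plain Python rather than an object database; Department, Employee, and annual_salary are hypothetical names.

```python
class Department:
    def __init__(self, name):
        self.name = name

class Employee:
    """A class standing in for an entity set: attributes plus shared behavior."""

    def __init__(self, name, monthly_salary, department):
        self.name = name                      # primitive attribute value
        self.monthly_salary = monthly_salary  # primitive attribute value
        self.department = department          # attribute whose value is another object

    def annual_salary(self):                  # behavior shared by every Employee object
        return self.monthly_salary * 12

physics = Department("Physics")
asha = Employee("Asha", 4000, physics)
bikram = Employee("Bikram", 3500, physics)
```

Each object holds its own attribute values, both objects share the annual_salary behavior, and the relationship to a department is kept by containing a reference to the Department object.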
• Object Relational Model:
 This model combines the features of the object-oriented data model
and relational data model.
• Semistructured Model:
 This model permits the specification of data where individual data
items of the same type may have different sets of attributes. The
extensible markup language (XML) is widely used to represent
semistructured data.
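A short XML fragment shows what "different sets of attributes" means in practice. This is a minimal sketch using Python's standard xml.etree.ElementTree module; the document and its element names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: two <person> items with different sets of attributes.
DOC = """
<people>
  <person><name>Asha</name><email>asha@example.org</email></person>
  <person><name>Bikram</name><phone>555-0100</phone><office>B12</office></person>
</people>
"""

root = ET.fromstring(DOC)
fields_per_person = [sorted(child.tag for child in person)
                     for person in root.findall("person")]
```

Both items are of the same type, yet one carries an email and the other a phone and an office, which a fixed-column table could represent only with empty values.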
• Hierarchical Model:
 This model assumes that a tree structure is the most frequently
occurring relationship.
• Network Model:
 The network model replaces the hierarchical tree with a graph, thus
allowing more general connections among the nodes. This model
evolved specifically to handle non-hierarchical relationships.
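The difference between the hierarchical and network models comes down to how many parents a record type may have. This is an illustrative sketch in Python; the record types and both helper functions are hypothetical.

```python
# Hypothetical parent links: each pair is (parent, child).
hierarchical_links = [("Company", "Sales"), ("Company", "HR"), ("Sales", "Order")]
network_links = hierarchical_links + [("Customer", "Order")]  # Order now has two parents

def parents_of(links):
    """Map each child record type to the list of its parent record types."""
    result = {}
    for parent, child in links:
        result.setdefault(child, []).append(parent)
    return result

def fits_hierarchy(links):
    """A tree allows at most one parent per node."""
    return all(len(parents) <= 1 for parents in parents_of(links).values())

tree_ok = fits_hierarchy(hierarchical_links)
graph_ok = fits_hierarchy(network_links)
```

An order that belongs to both a sales office and a customer is the kind of non-hierarchical relationship that a tree cannot hold but a graph can.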
Client/Server Architecture
• In client/server architecture, user interface and application programs are
located on the client side and query and transaction facilities are located
on the server side.
• The server is often called a query server or transaction server because it
provides these two functionalities. It is also called an SQL server since
most servers are based on the SQL language and standard.
• The user interface programs and application programs can run on the
client side.
• When DBMS access is required, the program establishes a connection to
the DBMS which is on the server side.
• Once the connection is created, the client program can communicate
with the DBMS.
• ODBC (Open Database Connectivity) and JDBC (Java Database Connectivity)
are interfaces that provide an application programming interface (API)
through which client-side programs call the DBMS.
• Other variations of clients are also possible. For example, in some
DBMSs, more functionality is transferred to clients including user
interface, data dictionary functions, interactions with programming
language compilers, global query optimization, concurrency control,
and recovery across multiple servers. In such situations the server may
be called the Data Server.
• The architecture described so far is called two-tier architecture
because the software components are distributed over two
systems: client and server.
• Many Web applications use an architecture called the three-tier
architecture, which adds an intermediate layer (middle tier or
application server or Web server) between the client and the database
server.
• This server plays an intermediary role by storing business rules
(procedures or constraints) that are used to access data from the database
server.
• It can also improve database security by checking a client's credentials
before forwarding a request to the database server.
• Clients contain GUI interfaces and some additional application-specific
business rules.
• Advances in encryption and decryption technology make it safer to
transfer sensitive data from server to client in encrypted form, where it
will be decrypted.
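The division of work among the three tiers can be sketched in one file. This is a rough illustration only: an in-memory SQLite database stands in for the database server, a plain function stands in for the application server, and the token table and function name are hypothetical. A real deployment would place each tier on a separate system.

```python
import sqlite3

# Database server tier (stand-in): an in-memory SQLite database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE account (owner TEXT PRIMARY KEY, balance INTEGER)")
db.execute("INSERT INTO account VALUES ('asha', 120)")

# Hypothetical credential store held by the middle tier.
API_TOKENS = {"token-asha": "asha"}

def middle_tier_get_balance(token):
    """Application-server logic: check credentials, then query the database tier."""
    owner = API_TOKENS.get(token)
    if owner is None:
        raise PermissionError("unknown client credentials")
    row = db.execute("SELECT balance FROM account WHERE owner = ?",
                     (owner,)).fetchone()
    return row[0]

# Client tier: only calls the middle tier, never the database directly.
balance = middle_tier_get_balance("token-asha")
try:
    middle_tier_get_balance("token-unknown")
    rejected = False
except PermissionError:
    rejected = True
```

The request with unknown credentials is stopped in the middle tier and never reaches the database server.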

Any Questions?
