You are on page 1of 24

------------------------------------------------------------------------------------------------------------------------------------------------

UNIT-I

----------------------------------------------------------------------------------------------------------------

Database System Applications:


● A Historical Perspective
● File Systems versus a DBMS
● the Data Model
● Levels of Abstraction in a DBMS
● Data Independence
● Structure of a DBMS
Introduction to Database Design:
● Database Design and ER Diagrams,
● Entities, Attributes, and Entity Sets,
● Relationships and Relationship Sets
● Additional Features of the ER Model
● Conceptual Design With the ER Model.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Database System Applications:

DBMS contains information about a particular enterprise

1. Collection of interrelated data


2. Set of programs to access the data
3. An environment that is both convenient and efficient to use
Database Applications:

1. Banking: all transactions


2. Airlines: reservations, schedules
3. Universities: registration, grades
4. Sales: customers, products, purchases
5. Online retailers: order tracking, customized recommendations
6. Manufacturing: production, inventory, orders, supply chain
7. Human resources: employee records, salaries, tax deductions
8. Databases touch all aspects of our lives

Here is the list of some popular DBMS systems:


● MySQL
● Oracle
● PostgreSQL
● dBASE
● FoxPro
● SQLite
● IBM DB2
● LibreOffice Base
● MariaDB
● Microsoft SQL Server

What Is a DBMS?

● A very large, integrated collection of data.

● Models real-world enterprise. Entities (e.g., students, courses) Relationships (e.g.,

Madonna is taking CS564)

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

● A Database Management System (DBMS) is a software package designed to store and

manage databases.

● A database is a collection of data typically describing the activities of one or more related organization

Why Use a DBMS?

● Data independence and efficient access.

● Reduced application development time.

● Data integrity and security.

● Uniform data administration.

● Concurrent access, recovery from crashes.

Why Study Databases??

● Shift from computation to information at the “low end”: scramble to webspace (a

mess!) at the “high end”: scientific applications

● Datasets are increasing in diversity and volume. Digital libraries, interactive video,

Human Genome project, EOS project ... need for DBMS exploding

● DBMS encompasses most of CS OS, languages, theory, AI, multimedia, logic

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Purpose of Database Systems:

In the early days, database applications were built directly on top of file systems

Drawbacks of using file systems to store data:

● Data redundancy and inconsistency

● Multiple file formats, duplication of information in different files

● Difficulty in accessing data

● Need to write a new program to carry out each new task

● Data isolation — multiple files and formats

● Integrity problems

● Hard to add new constraints or change existing ones

● Atomicity of updates

● Failures may leave database in an inconsistent state with partial updates carried out

● Example: Transfer of funds from one account to another should either complete or not

happen at all

● Concurrent access by multiple users

● Concurrent accessed needed for performance

● Uncontrolled concurrent accesses can lead to inconsistencies

● Example: Two people reading a balance and updating it at the same time

● Security problems

● Hard to provide user access to some, but not all, data

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

● Database systems offer solutions to all the above problems

● HISTORY OF DATABASE:

• 1950s and early 1960s:

– Data processing using magnetic tapes for storage

• Tapes provide only sequential access

– Punched cards for input

• Late 1960s and 1970s:

– Hard disks allow direct access to data

– Network and hierarchical data models in widespread use

– Ted Codd defines the relational data model

• Would win the ACM Turing Award for this work

• IBM Research begins System R prototype

• UC Berkeley begins Ingres prototype

– High-performance (for the era) transaction processing

• 1980s:

– Research relational prototypes evolve into commercial systems

• SQL becomes industry standard


Nasra Fatima, Assistant Professor, CSE-Dept
------------------------------------------------------------------------------------------------------------------------------------------------

– Parallel and distributed database systems

– Object-oriented database systems

• 1990s:

– Large decision support and data-mining applications

– Large multi-terabyte data warehouses

– Emergence of Web commerce


• 2000s:

– XML and XQuery standards

– Automated database administration

– Increasing use of highly parallel database systems

– Web-scale distributed data storage systems

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

● File Systems versus a DBMS

DBMS vs. Flat File

DBMS Flat File Management System

Multi-user access It does not support multi-user access

Design to fulfill the need of small and large businesses It is only limited to smaller DBMS systems.

Remove redundancy and Integrity. Redundancy and Integrity issues

Expensive. But in the long term Total Cost of Ownership is It’s cheaper
cheap

Easy to implement complicated transactions No support for complicated transactions

Files vs. DBMS:

● Applications must stage large datasets between main memory and secondary storage

(e.g., buffering, page-oriented access, 32-bit addressing, etc.)

● Special code for different queries

● Must protect data from inconsistency due to multiple concurrent users

● Crash recovery

● Security and access control

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

● Data Models
A Data Model in Database Management System (DBMS) is the concept of tools that are developed to summarize
the description of the database. Data Models provide us with a transparent picture of data which helps us in
creating an actual database. It shows us from the design of the data to its proper implementation of data.

1) Relational Data Model: This type of model designs the data in the form of rows and columns within a table. Thus, a
relational model uses tables for representing data and in-between relationships. Tables are also called relations. This model
was initially described by Edgar F. Codd, in 1969. The relational data model is the widely used model which is primarily used
by commercial data processing applications.

2) Entity-Relationship Data Model: An ER model is the logical representation of data as objects and relationships among
them. These objects are known as entities, and relationship is an association among these entities. This model was designed
by Peter Chen and published in 1976 papers. It was widely used in database designing. A set of attributes describe the
entities. For example, student_name, student_id describes the 'student' entity. A set of the same type of entities is known as an
'Entity set', and the set of the same type of relationships is known as 'relationship set'.

3) Object-based Data Model: An extension of the ER model with notions of functions, encapsulation, and object identity, as
well. This model supports a rich type system that includes structured and collection types. Thus, in the 1980s, various

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

database systems following the object-oriented approach were developed. Here, the objects are nothing but the data carrying
its properties.

4) Semistructured Data Model: This type of data model is different from the other three data models (explained above). The
semistructured data model allows the data specifications at places where the individual data items of the same type may have
different attribute sets. The Extensible Markup Language, also known as XML, is widely used for representing
semistructured data. Although XML was initially designed for including the markup information to the text document, it
gains importance because of its application in the exchange of data.

Advantages of Data Models


1. Data Models help us in representing data accurately.
2. It helps us in finding the missing data and also in minimizing Data Redundancy.
3. Data Model provides data security in a better way.
4. The data model should be detailed enough to be used for building the physical database.
5. The information in the data model can be used for defining the relationship between tables, primary and foreign
keys, and stored procedures.
Disadvantages of Data Models
1. In the case of a vast database, sometimes it becomes difficult to understand the data model.
2. You must have the proper knowledge of SQL to use physical models.
3. Even smaller changes made in structure require modification in the entire application.
4. There is no set data manipulation language in DBMS.
5. To develop a Data model one should know physical data stored characteristics.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Levels of Abstraction In DBMS


Major purpose of the database management system is to provide users with an abstract view of data
i.e. the system hides certain details of how the data are stored and maintained. Since database system
users are not computer trained,developers hide the complexity from users through 3 levels of
abstraction , to simplify user’s interaction with the system.

There are mainly 3 levels of data abstraction:

The main purpose of data abstraction is to achieve data independence in order to save the time and cost required
when the database is modified or altered.

1. Physical or Internal Level:

The physical or internal layer is the lowest level of data abstraction in the database management system. It is the layer that
defines how data is actually stored in the database. It defines methods to access the data in the database. It defines complex
data structures in detail, so it is very complex to understand, which is why it is kept hidden from the end user.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Data Administrators (DBA) decide how to arrange data and where to store data. The Data Administrator (DBA) is the person
whose role is to manage the data in the database at the physical or internal level. There is a data center that securely stores the
raw data in detail on hard drives at this level.

2. Logical or Conceptual Level:

The logical or conceptual level is the intermediate or next level of data abstraction. It explains what data is going to be stored
in the database and what the relationship is between them.

It describes the structure of the entire data in the form of tables. The logical level or conceptual level is less complex than the
physical level. With the help of the logical level, Data Administrators (DBA) abstract data from raw data present at the
physical level.

3. View or External Level:

View or External Level is the highest level of data abstraction. There are different views at this level that define the parts of
the overall data of the database. This level is for the end-user interaction; at this level, end users can access the data based on
their queries.

Advantages of data abstraction in DBMS


o Users can easily access the data based on their queries.
o It provides security to the data stored in the database.
o Database systems work efficiently because of data abstraction.

Example:

1) Physical level of data abstraction:


describes how a record (e.g., customer) is stored.This s the lowest level of abstraction which
describes how data are actually stored.
2) Logical level of data abstraction:
This level hides what data are actually stored in the database and what relations hip exists
among them. describes data stored in the database, and the relationships among the data.
type customer = record
customer_id:string;
customer_name:string;
customer_stree:string;
customer_city : string;

end;

3) View Level of data abstraction:


View provides a security mechanism to prevent users from accessing certain parts of the
Nasra Fatima, Assistant Professor, CSE-Dept
------------------------------------------------------------------------------------------------------------------------------------------------

database. application programs hide details of data types. Views can also hide information
(such as an employee’s salary) for security purposes.

Summary

• DBMS used to maintain, query large datasets.

• Benefits include recovery from system crashes, concurrent access, quick application

development, data integrity and security.

• Levels of abstraction give data independence.

• A DBMS typically has a layered architecture.

• DBAs hold responsible jobs and are well-paid!

• DBMS R&D is one of the broadest, most exciting areas in CS.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Data Independence
Data Independence is mainly defined as a property of DBMS that helps you to change the database schema at one
level of a system without requiring to change the schema at the next level. It helps to keep the data separated from
all programs that make use of it.

We have namely two levels of data independence arising from these levels of abstraction :

Physical level data independence: It refers to the characteristic of being able to modify the physical schema
without any alterations to the conceptual or logical schema, done for optimization purposes, e.g., the Conceptual
structure of the database would not be affected by any change in storage size of the database system server.

Changing from sequential to random access files is one such example. These alterations or modifications to the
physical structure may include:

● Utilizing new storage devices.


● Modifying data structures used for storage.
● Altering indexes or using alternative file organization techniques etc.

Logical level data independence: It refers characteristic of being able to modify the logical schema without
affecting the external schema or application program. The user view of the data would not be affected by any
changes to the conceptual view of the data. These changes may include insertion or deletion of attributes, altering
table structures, entities or relationships to the logical schema, etc.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Difference between Physical and Logical Data Independence

Logical Data Independence Physical Data Independence

Logical Data Independence is


mainly concerned with the structure Mainly concerned with the
or changing the data definition. storage of the data.

It is difficult as the retrieving of


data is mainly dependent on the
It is easy to retrieve.
logical structure of data.

Compared to Logic Physical Compared to Logical


independence it is difficult to Independence it is easy to
achieve logical data independence. achieve physical data
independence.
You need to make changes in the
Application program if new fields A change in the physical level
are added or deleted from the usually does not need change at
database. the Application program level.

Modification at the logical levels is


Modifications made at the
significant whenever the logical
internal levels may or may not
structures of the database are
be needed to improve the
changed.
performance of the structure.
Concerned with conceptual schema
Concerned with internal schema
Example: change in compression
Example: Add/Modify/Delete a
techniques, hashing algorithms,
new attribute
storage devices, etc

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Structure of DBMS
Database Management System is the combination of a Database and all the functionalities to organize and manage the
data. It is software that allows us to efficiently interact with data at various levels of abstraction.

As Database Management System is a complex set of programs, it is necessary to understand all the components of
DBMS so that we can easily manage the database.

Let’s understand the structure of DBMS in detail. The below diagram shows the typical structure of a database
management system. A Database Management System has three major components:

● Query Processor
● Storage Manager
● Disk Storage.

Structure of DBMS
Nasra Fatima, Assistant Professor, CSE-Dept
------------------------------------------------------------------------------------------------------------------------------------------------

Query Processor

The Query Processor receives the queries (requests) from the user and interprets them in the form of instructions. It also
executes the instructions received from the DML Compiler. It has the following four components:

1. DML Compiler: It converts the DML (Data Manipulation Language) Instructions into Machine
Language (low-level language).
2. DDL Interpreter: It interprets the DDL (Data Definition Language) Instructions and stores the
record in a data dictionary (in a table containing meta-data)
3. Query Optimizer: It executes the DML Instructions and picks the lowest cost evaluation plan out
of all the alternatives present.
4. Embedded DML Pre-compiler: It translates the DML statements embedded in Application
Programs into procedural function calls.

Storage Manager

Storage manager acts as the interface between the data stored in the database and the queries received from the end-
user. This component in the Structure of DBMS is responsible for the constraints applied to the data so that it remains
consistent. It also executes the DCL (Data Control Language). It encapsulates the following modules:

1. Authorization and Integrity Manager: It checks the authority of various users who access data and
the Integrity Constraints of the database.
2. Transaction Manager: Its job is to assure the system remains in a proper state during the
transaction process. It also ensures that concurrent transactions are executed without any conflict.
3. File Manager: It manages the space allocation of files in disk and data structures which stores
information in the database.
4. Buffer Manager: It manages the transfer of data between the secondary and main memory. It also
decides what data should be cached in the memory.

Disk Storage

The Disk Storage in the Structure of DBMS represents the space where data is stored. It has the following components:

1. Files: These are responsible for storing the data.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

2. Data Dictionary: It is the repository that maintains the information of the database object and
maintains the metadata.
3. Indices: These are the keys that are used for faster retrieval of data.

Thus, the Structure of DBMS represents the functional modules that are employed to process the queries received from
the user, retrieve the data, maintain the changes in the database and optimize the data retrieval.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Three Parts that make up the Database System are:

o Query Processor
o Storage Manager
o Disk Storage

The explanations for these are provided below:

1. Query Processor
The query processing is handled by the query processor, as the name implies. It executes the user's query, to put it simply. In
this way, the query processor aids the database system in making data access simple and easy. The query processor's primary
Nasra Fatima, Assistant Professor, CSE-Dept
------------------------------------------------------------------------------------------------------------------------------------------------

duty is to successfully execute the query. The Query Processor transforms (or interprets) the user's application program-
provided requests into instructions that a computer can understand.

Components of the Query Processor


o DDL Interpreter:

Data Definition Language is what DDL stands for. As implied by the name, the DDL Interpreter interprets DDL statements
like those used in schema definitions (such as create, remove, etc.). This interpretation yields a set of tables that include the
meta-data (data of data) that is kept in the data dictionary. Metadata may be stored in a data dictionary. In essence, it is a part
of the disc storage that will be covered in a later section of this article.

o DML Compiler:

Compiler for DML Data Manipulation Language is what DML stands for. In keeping with its name, the DML Compiler
converts DML statements like select, update, and delete into low-level instructions or simply machine-readable object code,
to enable execution. The optimization of queries is another function of the DML compiler. Since a single question can
typically be translated into a number of evaluation plans. As a result, some optimization is needed to select the evaluation
plan with the lowest cost out of all the options. This process, known as query optimization, is exclusively carried out by the
DML compiler. Simply put, query optimization determines the most effective technique to carry out a query.

o Embedded DML Pre-compiler:

Before the query evaluation, the embedded DML commands in the application program (such as SELECT, FROM, etc., in
SQL) must be pre-compiled into standard procedural calls (program instructions that the host language can understand).
Therefore, the DML statements which are embedded in an application program must be converted into routine calls by the
Embedded DML Pre-compiler.

o Query Optimizer:

It starts by taking the evaluation plan for the question, runs it, and then returns the result. Simply said, the query evaluation
engine evaluates the SQL commands used to access the database's contents before returning the result of the query. In a
nutshell, it is in charge of analyzing the queries and running the object code that the DML Compiler produces. Apache Drill,
Presto, and other Query Evaluation Engines are a few examples.

2. Storage Manager:
An application called Storage Manager acts as a conduit between the queries made and the data kept in the database. Another
name for it is Database Control System. By applying the restrictions and running the DCL instructions, it keeps the database's
consistency and integrity. It is in charge of retrieving, storing, updating, and removing data from the database.

Components of Storage Manager

Following are the components of Storage Manager:

o Integrity Manager:

Whenever there is any change in the database, the Integrity manager will manage the integrity constraints.

o Authorization Manager:

Authorization manager verifies the user that he is valid and authenticated for the specific query or request.

o File Manager:

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

All the files and data structure of the database are managed by this component.

o Transaction Manager:

It is responsible for making the database consistent before and after the transactions. Concurrent processes are generally
controlled by this component.

o Buffer Manager:

The transfer of data between primary and main memory and managing the cache memory is done by the buffer manager.

3. Disk Storage
A DBMS can use various kinds of Data Structures as a part of physical system implementation in the form of disk storage.

Components of Disk Storage

Following are the components of Disk Manager:

o Data Dictionary:

It contains the metadata (data of data), which means each object of the database has some information about its structure. So,
it creates a repository which contains the details about the structure of the database object.

o Data Files:

This component stores the data in the files.

o Indices:

These indices are used to access and retrieve the data in a very fast and efficient way.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

DBMS Architecture
o The DBMS design depends upon its architecture. The basic client/server architecture is used to deal with a large
number of PCs, web servers, database servers and other components that are connected with networks.
o The client/server architecture consists of many PCs and a workstation which are connected via the network.
o DBMS architecture depends upon how users are connected to the database to get their request done.

Types of DBMS Architecture

Database architecture can be seen as a single tier or multi-tier. But logically, database architecture is of two types like: 2-tier
architecture and 3-tier architecture.

1-Tier Architecture
o In this architecture, the database is directly available to the user. It means the user can directly sit on the DBMS and
use it.
o Any changes done here will directly be done on the database itself. It doesn't provide a handy tool for end users.
o The 1-Tier architecture is used for development of the local application, where programmers can directly
communicate with the database for quick response.

2-Tier Architecture
o The 2-Tier architecture is the same as the basic client-server. In the two-tier architecture, applications on the client
end can directly communicate with the database at the server side. For this interaction, API's like: ODBC, JDBC are
used.
o The user interfaces and application programs are run on the client-side.
o The server side is responsible to provide the functionalities like: query processing and transaction management.
o To communicate with the DBMS, client-side applications establish a connection with the server side.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Fig: 2-tier Architecture

3-Tier Architecture
o The 3-Tier architecture contains another layer between the client and server. In this architecture, client can't directly
communicate with the server.
o The application on the client-end interacts with an application server which further communicates with the database
system.
o End user has no idea about the existence of the database beyond the application server. The database also has no idea
about any other user beyond the application.
o The 3-Tier architecture is used in case of large web applications.

Fig: 3-tier Architecture

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Difference between DBMS Structure and DBMS Architecture


The structure of DBMS refers to the logical representation of all of its modules while the architecture of DBMS refers
to the representation of its design. The structure includes the components and programs which work under DBMS while
architecture includes the various entities present in the DBMS and how user interaction is done with the database.

The following Diagram shows the difference between the Structure and Architecture of the Database. Thus, a DBMS
Architecture can be a tier-1, tier-2, or tier-2 Architecture depending upon the Database Design but the Structure of the
Database is only one standard representation of functional units of DBMS.

Nasra Fatima, Assistant Professor, CSE-Dept


------------------------------------------------------------------------------------------------------------------------------------------------

Introduction to Database Design:


● Database Design and ER Diagrams,
● Entities, Attributes, and Entity Sets,
● Relationships and Relationship Sets
● Additional Features of the ER Model
● Conceptual Design With the ER Model.

Nasra Fatima, Assistant Professor, CSE-Dept

You might also like