21CSC205P - DATABASE MANAGEMENT SYSTEMS Unit - 1

UNIT I INTRODUCTION

Syllabus:
Issues in File Processing System, Need for DBMS, Basic terminologies of Database, Database
system Architecture, Various Data models, ER diagram basics and extensions, Case study:
Construction of Database design using Entity Relationship diagram for an application such as
University Database, Banking System, Information System

DBMS stands for Database Management System. The term can be read as DBMS = Database +
Management System: the database is a collection of data, and the management system is a set of
programs to store and retrieve that data.

Definition:
A DBMS is a collection of inter-related data and a set of programs to store and access that data
in an easy and effective manner.

What is the need for a DBMS?

Database systems are developed primarily to manage large amounts of data. When dealing with
large amounts of data, two things need to be optimized:
1. Storage of data.
2. Retrieval of data.

Storage:
❖ According to the principles of database systems, data is stored in such a way that it occupies
much less space, because redundant (duplicate) data is removed before storage.
❖ A simple example: in a banking system, suppose a customer has two accounts, a savings account
and a salary account.
❖ Suppose the bank stores savings account data in one place and salary account data in another
(these places are called tables; they are covered later). If customer information such as the
customer name and address is stored in both places, storage is wasted (redundancy, i.e.
duplication of data). To organize the data better, the customer information should be stored in
one place and both accounts should be linked to it. This is what a DBMS achieves.

Dr. N. Nithiyanandam/AP/CTECH/SRMIST Page 1

Fast Retrieval of data:
❖ Along with storing the data in an optimized and systematic manner, it is also important that
we retrieve the data quickly when needed.
❖ Database systems ensure that the data is retrieved as quickly as possible.
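
The banking example from the Storage discussion above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names (customer, account), the column names and the sample values are assumptions made for this sketch, not part of the notes. The customer's name and address are stored once, and both accounts link to that single row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT NOT NULL
    );
    CREATE TABLE account (
        account_no   INTEGER PRIMARY KEY,
        account_type TEXT NOT NULL,
        customer_id  INTEGER NOT NULL REFERENCES customer(customer_id)
    );
    -- The customer details are stored exactly once ...
    INSERT INTO customer VALUES (1, 'Asha', '12 Lake Road');
    -- ... and each account only stores a link (customer_id) to them.
    INSERT INTO account VALUES (101, 'savings', 1);
    INSERT INTO account VALUES (102, 'salary', 1);
""")

# Retrieval: join each account to the single customer row it points to.
rows = conn.execute("""
    SELECT a.account_no, a.account_type, c.name, c.address
    FROM account AS a JOIN customer AS c ON c.customer_id = a.customer_id
    ORDER BY a.account_no
""").fetchall()
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
print(rows)
print(customer_count)
```

If the customer's address changes, only one row in customer has to be updated, and both accounts see the new value.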

Why use DBMS?


❖ To develop software applications in less time.
❖ Data independence and efficient use of data.
❖ For uniform data administration.
❖ For data integrity and security.
❖ For concurrent access to data, and data recovery from crashes.
❖ To use a user-friendly declarative query language.

Applications of Database Management Systems:

Domain              Usage of DBMS
Banking             Managing customer information, account activities, payments, deposits,
                    loans, etc.
Transportation      Maintaining and managing the passenger manifest, reservations and
                    schedule information.
Universities        Student information, course registrations, colleges and grades.
Telecommunication   Keeping call records, monthly bills, maintaining balances, etc.
Finance             Storing information about stock, sales, and purchases of financial
                    instruments like stocks and bonds.
Sales               Storing customer details, product details and sales information.
Manufacturing       Managing the supply chain, tracking production of items, and tracking
                    inventory status in warehouses.
Social media        Managing user accounts, security and data access.

DRAWBACKS OF FILE SYSTEM:


❖ In a file system, data is stored directly in a set of files.
❖ It contains flat files that have no relation to other files (a file that stores only one table
is known as a flat file).
Issues in File Processing System -
a) Data redundancy
b) Data inconsistency
c) Limited Data Sharing
d) Data Dependence
e) Lack of Data Integrity
f) Limited Security
g) Concurrency Control
h) Scalability Issues
i) Limited Query Capabilities
a) Data redundancy
Problem: Data is often duplicated across various files. This redundancy can lead to inconsistencies
and increase the likelihood of data errors.
Impact: Wasted storage space, increased maintenance efforts, and the risk of inconsistent data.
b) Data inconsistency
Problem: Changes made to data in one file may not be reflected in other related files. This
inconsistency can lead to inaccurate reporting and decision-making.
Impact: Unreliable information, potential for errors, and difficulties in maintaining data integrity.
c) Limited Data Sharing:
Problem: Difficulty in sharing data between different applications or departments. Each
department might maintain its own set of files, making it challenging to access and integrate
information across the organization.
Impact: Reduced collaboration, increased likelihood of outdated information, and inefficiencies
in information retrieval.
d) Data Dependence:
Problem: Programs are often written to access specific files directly. If the structure of a file
changes, it may require modifying all programs that use that file.
Impact: High maintenance costs, inflexibility in adapting to changing data structures, and
increased risk of errors during updates.
e) Lack of Data Integrity:
Problem: In the absence of constraints and rules enforced by a database management system,
maintaining data integrity becomes a manual task. This increases the risk of entering incorrect or
inconsistent data.
Impact: Lower data quality, increased chances of errors, and difficulties in ensuring the accuracy
of information.
f) Limited Security:
Problem: File systems often have limited security measures. Access controls are typically basic,
and there's a higher risk of unauthorized access.
Impact: Data breaches, unauthorized modifications, and compromised system security.
g) Concurrency Control:
Problem: Ensuring concurrent access to data by multiple users without conflicts is challenging.
File systems may lack mechanisms to handle concurrent updates properly.
Impact: Data corruption, lost updates, and challenges in maintaining data consistency in a multi-
user environment.
h) Scalability Issues:
Problem: As the volume of data grows, file processing systems may struggle to handle large
datasets efficiently. Performance issues can arise.
Impact: Reduced system performance, longer processing times, and challenges in scaling the
system.
i) Limited Query Capabilities:
Problem: File processing systems often lack a query language and sophisticated querying
capabilities. Retrieving specific information can be cumbersome.
Impact: Inefficient data retrieval, increased complexity in generating reports, and challenges in
extracting meaningful insights.
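
Issues (a) and (b) above can be illustrated with a minimal sketch. It assumes two independent flat files, simulated here as in-memory CSV text; the file contents and field names are made up for the illustration. Each file keeps its own copy of the customer's address, so updating only one of them leaves the data inconsistent.

```python
import csv
import io

# Two independent "flat files"; each keeps its own copy of the address (redundancy).
savings_file = io.StringIO("account_no,name,address\n101,Asha,12 Lake Road\n")
salary_file = io.StringIO("account_no,name,address\n102,Asha,12 Lake Road\n")

savings = list(csv.DictReader(savings_file))
salary = list(csv.DictReader(salary_file))

# The customer moves, but only the program that owns the savings file is updated.
savings[0]["address"] = "7 Hill Street"

# The same customer now has two different addresses (inconsistency).
inconsistent = savings[0]["address"] != salary[0]["address"]
print(inconsistent)
```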

Advantages of DBMS over File System:


A database management system has several advantages over a file system. A few of them are
as follows:

1. Reduced redundancy – redundant data is removed through normalization
2. Data consistency and integrity – supported by normalization and integrity constraints
3. Security – each user has a different set of access rights
4. Privacy – limited access
5. Easy access to data
6. Easy recovery
7. Flexibility

Disadvantages of DBMS:
1. DBMS implementation cost is high compared to the file system
2. Complexity: Database systems are complex to understand
3. Performance: Database systems are generic, making them suitable for various applications.
However, this feature affects their performance for some applications

BASIC TERMINOLOGIES OF DATABASE:


1. Database - A collection of organized data that is stored and can be easily accessed, managed,
and updated.

2. DBMS (Database Management System) - Software that enables users to interact with the
database. It provides tools for creating, managing, and querying databases.

3. Table - A collection of data organized in rows and columns. Tables are the basic structure
in a relational database.

4. Row or Record - A single entry in a database table that contains data related to a specific
entity or object.

5. Column or Field - A vertical section in a database table that represents a specific attribute
or property. Columns hold the data for a particular aspect of the entity.

6. Primary Key - A unique identifier for each record in a table. It ensures that each row can
be uniquely identified and retrieved.

7. Foreign Key - A column in a table that refers to the primary key in another table. It
establishes a link between two tables, enforcing referential integrity.

8. Index - A data structure that improves the speed of data retrieval operations on a database
table.

9. Query - A request for data retrieval or manipulation from a database. Queries are typically
written in a query language like SQL (Structured Query Language).

10. Normalization - The process of organizing the data in a database to eliminate redundancy
and improve data integrity.

11. Relational Database - A type of database that organizes data in tables and defines
relationships between those tables.

12. SQL (Structured Query Language) - A programming language used for managing and
manipulating relational databases. It includes commands for querying, updating, and
managing databases.

13. Transaction - A sequence of one or more database operations treated as a single unit.
Transactions ensure the consistency and integrity of a database.

14. ACID (Atomicity, Consistency, Isolation, Durability) - Properties that guarantee the
reliability of database transactions. Atomicity ensures that a transaction is treated as a single
unit that either completes entirely or has no effect, Consistency ensures that the database
remains in a valid state, Isolation ensures that concurrent transactions do not interfere with
each other, and Durability ensures that committed transactions are permanent.

15. Schema - The structure or blueprint that defines the organization of data in a database,
including tables, fields, relationships, and constraints.

16. Data Dictionary - A centralized repository that stores metadata about the database,
including information about tables, columns, data types, and relationships.
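
Several of the terms above (table, row, primary key, foreign key, referential integrity) can be seen together in a small sketch using Python's built-in sqlite3 module. The department and student tables and their contents are assumptions made for this illustration. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,                                -- primary key
        name       TEXT NOT NULL,
        dept_id    INTEGER NOT NULL REFERENCES department(dept_id)     -- foreign key
    )
""")
conn.execute("INSERT INTO department VALUES (10, 'Computing')")
conn.execute("INSERT INTO student VALUES (1, 'Ravi', 10)")

def try_insert(sql):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

# Same student_id as an existing row: rejected by the primary key.
duplicate_key_ok = try_insert("INSERT INTO student VALUES (1, 'Meena', 10)")
# dept_id 99 does not exist in department: rejected by the foreign key.
missing_parent_ok = try_insert("INSERT INTO student VALUES (2, 'Meena', 99)")
# New key and an existing department: accepted.
valid_ok = try_insert("INSERT INTO student VALUES (2, 'Meena', 10)")
print(duplicate_key_ok, missing_parent_ok, valid_ok)
```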

EVOLUTION OF DATA MODELS:


❖ To overcome the limitations of file systems, researchers and software developers designed
and developed various data models.
❖ The important and widely accepted models are:
1) Hierarchical
2) Network
3) Entity relationship
4) Relational
5) Object oriented

1) HIERARCHICAL DATA MODEL:


❖ The earliest DBMS data model.
❖ This model organizes data in a hierarchical tree structure.
❖ This model is easy to understand through real-world examples such as the site map of a website.

Features of a Hierarchical Model

1) One-to-many relationship
2) Parent-Child Relationship
3) Deletion Problem
4) Pointers

Advantages of Hierarchical Model

1) Simple and fast traversal because of the tree structure
2) Changes in a parent node are automatically reflected in its child nodes

Disadvantages of Hierarchical Model

1) Complexity
2) If a parent node is deleted, its child nodes are deleted as well
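
The parent-child structure and the deletion problem can be sketched in plain Python. The tree below (a university with two faculties and their departments) is a made-up example, and the dictionary representation is only one of several possible ways to model a hierarchy.

```python
# Each node has exactly one parent; children are reached through their parent.
tree = {
    "University": ["Engineering", "Science"],
    "Engineering": ["CSE", "ECE"],
    "Science": ["Physics"],
    "CSE": [], "ECE": [], "Physics": [],
}

def delete_subtree(tree, node):
    """Remove a node and, recursively, every node below it."""
    for child in tree.pop(node, []):
        delete_subtree(tree, child)
    # Remove the link from the parent (if any) to the deleted node.
    for children in tree.values():
        if node in children:
            children.remove(node)

# Deleting the parent "Engineering" also deletes its children "CSE" and "ECE".
delete_subtree(tree, "Engineering")
remaining = sorted(tree)
print(remaining)
```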

2) NETWORK MODEL:
❖ The network model is an extension of the hierarchical model.
❖ This model was considered the best choice before the relational model.
❖ It is similar to the hierarchical model; the only difference is that a record in the network
model can have more than one parent.
❖ For example, a student entity can have more than one parent.

Features of a Network Model

1) Manage to Merge more Relationships

2) More paths

3) Circular Linked List

Advantages of Network Model

1) Data access is faster

2) Because of the parent-child relationship, changes in a parent are reflected in its children

Disadvantages of Network Model

1) More complex because of the larger number of relationships

3) ENTITY-RELATIONSHIP MODEL (ER MODEL):

❖ This model is a high-level data model

❖ Represents a real-world problem as a pictorial representation

❖ Makes the specification easy for developers to understand

❖ Acts as a visualization tool for representing a specific database

❖ It contains three components

1. Entities

2. Attributes

3. Relationships

Features of ER Model

1) Graphical representation

2) Visualization

3) Good Database design (Widely used)

Advantages of ER Model

1) Very Simple

2) Better communication

3) Easy to convert to any model

Disadvantage of ER Model

1) No industry standard

2) Hidden information

4) RELATIONAL MODEL:
❖ The most widely used model
❖ Data is represented in rows and columns (a two-dimensional table)
❖ Example: EMP (Employee) Table

Features of Relational Model


1) Records
2) Attributes
Advantages of Relational Model
1) Simple
2) Scalable
3) Structured format
4) Isolation
Disadvantages of Relational Model
1) Hardware overheads

5) OBJECT ORIENTED MODEL:


❖ In the object-oriented data model, real-world problems are represented as objects.
❖ In this model, the data and its relationships are held in a single structure.
❖ Complex data such as images, audio and video can be stored easily.
❖ Objects are connected through links using common attribute(s).
❖ Example: three objects, Faculty, Department and Campus, linked using a common attribute
INSTANCES AND SCHEMAS:


SCHEMA:
The design (structure) of a database is called its schema. There are three types of schema:
1. Physical Schema or Internal Schema
2. Logical Schema or Conceptual Schema
3. View Schema or External Schema
❖ The design of a database at the physical level is called the physical schema. It describes how
the data is stored in blocks of storage.
❖ The design of a database at the logical level is called the logical schema. Programmers and
database administrators work at this level. Here data is described as certain types of data
records stored in data structures; internal details such as the implementation of those data
structures are hidden at this level (they are available at the physical level).
❖ The design of a database at the view level is called the view schema. It generally describes
end-user interaction with the database system.
INSTANCE:
❖ The data stored in a database at a particular moment in time is called an instance of the
database.
❖ The database schema corresponds to the variable declarations for the tables of a particular
database; the values of those variables at a given moment are the instance of that
database.
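
The difference between schema and instance can be sketched with Python's built-in sqlite3 module. The customer table is an assumption made for this illustration. SQLite keeps the CREATE statements in its sqlite_master catalog table; reading it gives the schema, while reading the table itself gives the current instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")

def schema():
    # sqlite_master stores the CREATE statement of each table, i.e. the schema.
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'customer'"
    ).fetchone()[0]

def instance():
    # The rows stored right now, i.e. the instance at this moment.
    return conn.execute("SELECT * FROM customer ORDER BY customer_id").fetchall()

schema_before, instance_before = schema(), instance()
conn.execute("INSERT INTO customer VALUES (1, 'Asha')")
schema_after, instance_after = schema(), instance()

# Inserting a row changes the instance but leaves the schema untouched.
print(schema_before == schema_after)
print(instance_before, instance_after)
```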

DATA ABSTRACTION IN DBMS


❖ Database systems are made up of complex data structures. To ease user interaction with the
database, developers hide irrelevant internal details from users. This process of hiding
irrelevant details from the user is called data abstraction.

Three levels of abstraction:


1) Physical level: This is the lowest level of data abstraction. It describes how data is actually
stored in the database. The details of the complex data structures are available at this level.
2) Logical level: This is the middle level of the three-level data abstraction architecture. It
describes what data is stored in the database.
3) View level: This is the highest level of data abstraction. It describes the user's interaction
with the database system.
Example:
❖ Suppose we are storing customer information in a customer table. At the physical level these
records can be described as blocks of storage (bytes) on a storage device. These details are
often hidden from programmers.
❖ At the logical level these records can be described as fields (attributes) along with their
data types, and the relationships among them can be logically implemented. Programmers
generally work at this level because they are aware of such details of the database system.
❖ At the view level, users simply interact with the system through a GUI and enter details on
the screen. They are not aware of how the data is stored or what data is stored; such details
are hidden from them.
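
The logical and view levels of the customer example can be sketched with Python's built-in sqlite3 module. The column names and the view customer_contact (a hypothetical front-desk screen that may see only names and addresses) are assumptions made for this illustration. The physical level, i.e. how SQLite lays the rows out in pages, stays hidden from the code entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical level: the full record with its fields and data types.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        address     TEXT,
        balance     REAL
    );
    INSERT INTO customer VALUES (1, 'Asha', '12 Lake Road', 2500.0);

    -- View level: this group of users sees only the name and the address.
    CREATE VIEW customer_contact AS SELECT name, address FROM customer;
""")

cursor = conn.execute("SELECT * FROM customer_contact")
visible_columns = [col[0] for col in cursor.description]
visible_rows = cursor.fetchall()
print(visible_columns)
print(visible_rows)
```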

DATA INDEPENDENCE:
❖ Data independence is the ability to change the schema at one level of a database system without
having to change the schema at the next higher level. There are two types of data independence:
1. Logical data independence - the capacity to change the conceptual schema without having
to change external schemas or application programs.
2. Physical data independence - the capacity to change the internal schema without having to
change the conceptual schema.
❖ Applications depend on the logical schema. In general, the interfaces between the various
levels and components should be well defined so that changes in some parts do not seriously
influence others.
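
Both kinds of data independence can be sketched with Python's built-in sqlite3 module. The table, the view customer_names and the query in app_query stand in for an application program; all of them are assumptions made for this illustration. Adding an index is used here as an example of a physical change, and adding a column as an example of a change to the conceptual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    INSERT INTO customer VALUES (1, 'Asha', 'Chennai');
    INSERT INTO customer VALUES (2, 'Ravi', 'Madurai');
    -- External schema used by the application.
    CREATE VIEW customer_names AS SELECT customer_id, name FROM customer;
""")

# The "application program": it only knows the view.
app_query = "SELECT name FROM customer_names ORDER BY customer_id"
before = conn.execute(app_query).fetchall()

# Physical change: a new access path (index); tables and views are unchanged.
conn.execute("CREATE INDEX idx_customer_city ON customer(city)")
after_physical = conn.execute(app_query).fetchall()

# Logical change: a new column in the conceptual schema; the view is unchanged.
conn.execute("ALTER TABLE customer ADD COLUMN phone TEXT")
after_logical = conn.execute(app_query).fetchall()

print(before == after_physical == after_logical)
```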

DBMS LANGUAGES:
Database languages are used to read, update and store data in a database. Several such
languages can be used for this purpose; one of them is SQL (Structured Query
Language).

Types of DBMS languages:

1. Data Definition Language (DDL): DDL is used for specifying the database schema. Taking SQL
as an example, the following statements come under DDL.
• To create a database or its objects (tables, views, indexes) – CREATE
• To alter the structure of a database object – ALTER
• To drop (delete) a database object together with its data – DROP
• To remove all rows from a table while keeping its structure – TRUNCATE
• To rename a database object – RENAME
All these commands specify or update the database schema; that is why they come under the Data
Definition Language.

2. Data Manipulation Language (DML): DML is used for accessing and manipulating data in a
database.
• To read records from table(s) – SELECT
• To insert record(s) into table(s) – INSERT
• To update the data in table(s) – UPDATE
• To delete records from a table (all of them, or only those matching a condition) – DELETE

3. Data Control Language (DCL): DCL is used for granting and revoking user access on a
database.
• To grant access to a user – GRANT
• To revoke access from a user – REVOKE
In practice, the data definition, data manipulation and data control languages are not separate
languages; rather, they are parts of a single database language such as SQL.

4. Transaction Control Language (TCL): TCL statements are used to control transactions in the
database.
• SAVEPOINT
• ROLLBACK
• COMMIT
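
The statement groups above can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions: the account table and its values are made up, and SQLite is used only because it ships with Python. SQLite has no TRUNCATE statement, renames tables through ALTER TABLE ... RENAME TO, and does not implement GRANT or REVOKE, so DCL is not shown.

```python
import sqlite3

# isolation_level=None disables the module's implicit BEGIN, so the
# transaction statements below run exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)

# DDL: define and change the schema.
conn.execute("CREATE TABLE acct (account_no INTEGER PRIMARY KEY, balance REAL)")
conn.execute("ALTER TABLE acct ADD COLUMN owner TEXT")
conn.execute("ALTER TABLE acct RENAME TO account")
columns = [row[1] for row in conn.execute("PRAGMA table_info(account)")]

# DML: insert, update and delete rows.
conn.executemany("INSERT INTO account VALUES (?, ?, ?)",
                 [(101, 1000.0, "Asha"), (102, 500.0, "Ravi"), (103, 0.0, "Meena")])
conn.execute("UPDATE account SET owner = 'Ravi K' WHERE account_no = 102")
conn.execute("DELETE FROM account WHERE account_no = 103")

# TCL: a transfer of 200 from account 101 to account 102 as one transaction.
conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 200 WHERE account_no = 101")
conn.execute("SAVEPOINT after_debit")
conn.execute("UPDATE account SET balance = balance + 999 WHERE account_no = 102")  # wrong amount
conn.execute("ROLLBACK TO after_debit")  # undo only the work done after the savepoint
conn.execute("UPDATE account SET balance = balance + 200 WHERE account_no = 102")
conn.execute("COMMIT")

# DML: read the result.
rows = conn.execute(
    "SELECT account_no, balance, owner FROM account ORDER BY account_no"
).fetchall()
print(columns)
print(rows)

# DDL again: drop the table together with its data.
conn.execute("DROP TABLE account")
```

ROLLBACK TO undoes only the statements issued after the savepoint; the debit made before it is kept and becomes permanent at COMMIT.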

DATABASE SYSTEM ARCHITECTURE:

There are four types of users accessing / managing the database:


✔ Naïve Users
✔ Application Programmers
✔ Sophisticated Users
✔ Database Administrators

The database system is divided into three components:


✔ Query Processor
✔ Storage Manager
✔ Disk Storage.
Query Processor:

❖ It interprets the requests (queries) received from user(s) via an application program or
interface into instructions.

❖ It also executes the user request received from the DML compiler.

❖ The query processor contains the following components –

a) DML Compiler - It translates DML statements into low-level instructions.

b) DDL Interpreter - It processes DDL statements into a set of tables containing metadata.

c) Embedded DML Pre-compiler - It converts DML statements embedded in an application
program into procedural calls.

d) Query Optimizer - It chooses an efficient way to execute the instructions generated by the
DML compiler.

Storage Manager:

❖ It is an interface between the information stored in the database and the requests (queries)
submitted to the system.

❖ It is also known as the Database Control System.

❖ It maintains consistency and integrity.

❖ Its main responsibility is managing data manipulation such as addition, deletion and
modification.

❖ The storage manager contains the following components:

a) Authorization Manager - It ensures role-based access control, i.e., it checks whether a
particular person is privileged to perform the requested operation.

b) Integrity Manager - It checks the integrity constraints when the database is modified.

c) Transaction Manager - It controls concurrent access by scheduling the operations of the
transactions it receives. It thereby ensures that the database remains in a consistent state
before and after the execution of a transaction.

d) File Manager - It manages the file space and the data structures used to represent
information in the database.

e) Buffer Manager - It is responsible for cache memory and the transfer of data between
secondary storage and main memory.

Disk Storage:

❖ Used to store all the information

❖ It contains the following components -

a) Data Files - They store the actual data.

b) Data Dictionary - It contains information about the structure of every database object. It
is the repository that holds the metadata.

c) Indices - They provide faster retrieval of data items.

d) Statistical Data - Contains statistics about the data stored in the database.
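
How an index changes the way a data item is retrieved can be observed with Python's built-in sqlite3 module. This is a sketch specific to SQLite: the customer table, the generated rows and the index name idx_customer_name are assumptions made for the illustration, and EXPLAIN QUERY PLAN is SQLite's way of showing the plan its query processor picked. The exact wording of the plan text differs between SQLite versions, so the code only checks whether the index name appears in it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany("INSERT INTO customer (name, city) VALUES (?, ?)",
                 [(f"name{i}", f"city{i % 10}") for i in range(1000)])

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a readable description.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM customer WHERE name = 'name500'"

plan_without_index = plan(query)   # every row has to be scanned
conn.execute("CREATE INDEX idx_customer_name ON customer(name)")
plan_with_index = plan(query)      # the index is used to locate the row

print(plan_without_index)
print(plan_with_index)
```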

DATABASE USERS AND ADMINISTRATOR:


Database Users
Users are differentiated by the way they expect to interact with the system.
❖ Application programmers – interact with the system through DML calls
❖ Sophisticated users – form requests in a database query language
❖ Specialized users – write specialized database applications that do not fit into the
traditional data processing framework
❖ Naive users – invoke one of the permanent application programs that have been written
previously

Database Administrator:
❖ Coordinates all the activities of the database system.
❖ The database administrator has a good understanding of the enterprise’s information
resources and needs.
❖ Database administrator's duties (Roles) include:
▪ Schema definition
▪ Storage structure and access method definition
▪ Schema and physical organization modification
▪ Granting user authority to access the database
▪ Specifying integrity constraints
▪ Acting as liaison with users
▪ Monitoring performance and responding to changes in requirements

E-R MODEL:
An Entity–Relationship Model (ER model) is a systematic way of describing and defining a
business process. An ER model is typically implemented as a database. The main components of
E-R model are: entity set and relationship set.
Here are the geometric shapes and their meaning in an E-R Diagram –

Rectangles: Entity sets
Ellipses: Attributes
Diamonds: Relationship sets
Lines: Link attributes to entity sets and entity sets to relationship sets
Double Ellipses: Multivalued attributes
Dashed Ellipses: Derived attributes
Double Rectangles: Weak entity sets
Double Lines: Total participation of an entity in a relationship set

A sample E-R Diagram:

An entity is represented by a set of attributes, which are descriptive properties possessed by all
members of an entity set.
Attribute types:
❑ Simple and composite attributes.
❑ Single-valued attributes.
❑ Multi-valued attributes:
An attribute that can hold multiple values is known as a multivalued attribute. It is
represented by double ellipses in an E-R diagram.
E.g. a person can have more than one phone number, so the phone number attribute is
multivalued.
❑ Derived attributes:
A derived attribute is one whose value is not stored but computed from another attribute. It
is represented by dashed ellipses in an E-R diagram.
E.g. a person's age is a derived attribute, as it changes over time and can be derived from
another attribute (date of birth).
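
The attribute types can be sketched with a small Python dataclass. The Person class, its field names and the sample values are assumptions made for this illustration: phone_numbers plays the role of the multivalued attribute, and the age is computed from date_of_birth on demand instead of being stored.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

@dataclass
class Person:
    name: str                                                # simple, single-valued attribute
    date_of_birth: date                                      # stored attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)

p = Person("Asha", date(2000, 6, 15), ["98400-00001", "044-0000002"])
age_day_before_birthday = p.age_on(date(2024, 6, 14))
age_on_birthday = p.age_on(date(2024, 6, 15))
print(age_day_before_birthday, age_on_birthday)
```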

E-R diagram with multivalued and derived attributes:

Total Participation of an Entity set:

E-R Diagram with Composite, Multivalued, and Derived Attributes

Cardinality Constraints:
We express cardinality constraints by drawing either a directed line (→), signifying “one”,
or an undirected line (—), signifying “many”, between the relationship set and the entity set.
One-to-one relationship
• A customer is associated with at most one loan via the relationship borrower
• A loan is associated with at most one customer via borrower

One-To-Many Relationship:
In the one-to-many relationship a loan is associated with at most one customer via
borrower; a customer is associated with several loans via borrower.

Many-To-One Relationships
In the many-to-one relationship a loan is associated with several customers via borrower;
a customer is associated with at most one loan via borrower.

Many-To-Many Relationship
▪ A customer is associated with several loans via borrower
▪ A loan is associated with several customers via borrower
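
One common way to carry these cardinality constraints over into tables is through key constraints on the borrower relationship. The sketch below uses Python's built-in sqlite3 module; the table names borrower_many_to_many and borrower_one_to_many, the columns and the sample rows are assumptions made for this illustration, and other mappings are possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE loan (loan_no INTEGER PRIMARY KEY, amount REAL);

    -- Many-to-many: any (customer, loan) pair may appear, but only once.
    CREATE TABLE borrower_many_to_many (
        customer_id INTEGER REFERENCES customer(customer_id),
        loan_no     INTEGER REFERENCES loan(loan_no),
        PRIMARY KEY (customer_id, loan_no)
    );

    -- One-to-many (customer to loan): each loan may appear at most once,
    -- so a loan has at most one customer, while a customer may have several loans.
    CREATE TABLE borrower_one_to_many (
        customer_id INTEGER REFERENCES customer(customer_id),
        loan_no     INTEGER PRIMARY KEY REFERENCES loan(loan_no)
    );

    INSERT INTO customer VALUES (1, 'Asha');
    INSERT INTO customer VALUES (2, 'Ravi');
    INSERT INTO loan VALUES (501, 10000.0);
    INSERT INTO loan VALUES (502, 20000.0);
""")

def accepted(table, customer_id, loan_no):
    """Return True if the pair is accepted, False if a key constraint rejects it."""
    try:
        conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (customer_id, loan_no))
        return True
    except sqlite3.IntegrityError:
        return False

# Loan 501 shared by two customers, customer 1 holding two loans: all allowed.
m2m = [accepted("borrower_many_to_many", c, l) for c, l in [(1, 501), (2, 501), (1, 502)]]
# Customer 1 may hold loans 501 and 502, but loan 501 cannot get a second customer.
o2m = [accepted("borrower_one_to_many", c, l) for c, l in [(1, 501), (1, 502), (2, 501)]]
print(m2m, o2m)
```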

Specialization
❖ Top-down design process: We designate subgroupings within an entity set that are
distinctive from other entities in the set.
❖ These subgroupings become lower-level entity sets that have attributes or participate in
relationships that do not apply to the higher-level entity set.
❖ Depicted by a triangle component labeled ISA (E.g. customer “is a” person).
❖ Attribute inheritance – a lower-level entity set inherits all the attributes and relationship
participation of the higher-level entity set to which it is linked

Generalization
❖ A bottom-up design process – combine a number of entity sets that share the same
features into a higher-level entity set.
❖ Specialization and generalization are simple inversions of each other; they are represented
in an E-R diagram in the same way.
❖ The terms specialization and generalization are used interchangeably.
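
The ISA relationship and attribute inheritance resemble class inheritance in a programming language, which gives a convenient way to sketch them. The classes and attributes below (Person, Customer, Employee, credit_rating, salary) are assumptions made for this illustration; an E-R design itself does not prescribe any programming-language mapping.

```python
from dataclasses import dataclass, fields

@dataclass
class Person:              # higher-level entity set
    name: str
    address: str

@dataclass
class Customer(Person):    # lower-level entity set: customer "is a" person
    credit_rating: int

@dataclass
class Employee(Person):    # lower-level entity set: employee "is a" person
    salary: float

c = Customer(name="Asha", address="12 Lake Road", credit_rating=7)

# Attribute inheritance: Customer has the attributes of Person plus its own.
customer_attributes = [f.name for f in fields(Customer)]
print(customer_attributes)
print(isinstance(c, Person))
```

Read top-down (Person split into Customer and Employee) this is specialization; read bottom-up (Customer and Employee combined into Person) it is generalization.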
