A data model gives us an idea of how the final system will look after its complete
implementation. It defines the data elements and the relationships between them.
Data models are used to show how data is stored, connected, accessed and
updated in the database management system. Here, we use a set of symbols and text to
represent the information so that members of the organisation can communicate and
understand it. Although many data models are in use today, the Relational
model is the most widely used. Apart from the Relational model, there are many
other types of data models, which we will study in detail in this blog. Some of the
Data Models in DBMS are:
1. Hierarchical Model
2. Network Model
3. Entity-Relationship Model
4. Relational Model
5. Object-Oriented Data Model
6. Object-Relational Data Model
7. Flat Data Model
8. Semi-Structured Data Model
9. Associative Data Model
10. Context Data Model
Hierarchical Model
The Hierarchical Model was the first DBMS model. It organises data in a
hierarchical tree structure. The hierarchy starts from the root, which holds the root data, and
expands as a tree by adding child nodes to parent nodes. This model easily
represents some real-world relationships such as food recipes or the sitemap of a
website. Example: We can represent the relationship between the shoes present on a shopping
website in the following way:
Features of a Hierarchical Model
1. One-to-many relationship: The data is organised in a tree-like structure with
one-to-many relationships between the record types. Also, there is only one
path from the root to any node. Example: In the above example, if we want to go to
the sneakers node we have only one path to reach it, i.e. through the men's shoes
node.
2. Parent-Child Relationship: Each child node has exactly one parent node, but a parent
node can have more than one child node. Multiple parents are not allowed.
3. Deletion Problem: If a parent node is deleted then its child nodes are automatically
deleted.
4. Pointers: Pointers are used to link the parent node with the child node and are used
to navigate between the stored data. Example: In the above example the 'shoes'
node points to two other nodes, the 'women shoes' node and the 'men's shoes' node.
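The features above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Node` class and its method names are hypothetical, and the tree contains just the four nodes named in the shoes example.

```python
class Node:
    """One record in a hierarchical model: a single parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # pointer to the one allowed parent
        self.children = []        # pointers to the child records

    def add_child(self, name):
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def path_from_root(self):
        """Follow parent pointers upward; there is exactly one such path."""
        node, path = self, []
        while node is not None:
            path.append(node.name)
            node = node.parent
        return path[::-1]

    def size(self):
        """Number of records in the subtree rooted at this node."""
        return 1 + sum(child.size() for child in self.children)

    def delete_child(self, name):
        """Deleting a parent drops its whole subtree (the deletion problem)."""
        self.children = [c for c in self.children if c.name != name]


shoes = Node("shoes")
women = shoes.add_child("women shoes")
men = shoes.add_child("men's shoes")
sneakers = men.add_child("sneakers")

path = sneakers.path_from_root()   # the only route to 'sneakers'
size_before = shoes.size()
shoes.delete_child("men's shoes")  # 'sneakers' disappears together with its parent
size_after = shoes.size()
```

Because every node stores a single `parent` pointer, a second parent simply cannot be represented, which is the restriction the network model later removes.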
Network Model
This model is an extension of the hierarchical model. It was the most popular model before
the relational model. It is the same as the hierarchical model except
that a record can have more than one parent. It replaces the hierarchical tree with a
graph. Example: In the example below we can see that the student node has two parents, i.e.
CSE Department and Library. This was not possible in the hierarchical model.
Features of a Network Model
1. Ability to Merge more Relationships: Because this model allows more relationships,
the data is more interconnected. It can represent one-to-one
relationships as well as many-to-many relationships.
2. Many paths: Because there are more relationships, there can be more than one path to
the same record. This can make data access faster and simpler.
3. Circular Linked List: The operations on the network model are done with the help
of a circular linked list. The program maintains a current position, and
this position moves through the records by following the
relationships.
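The "more than one parent" and "many paths" points can be illustrated with a small graph in Python. This is a sketch under assumptions, not how a network DBMS is implemented: the `NetworkDB` class is hypothetical, it uses plain dictionaries rather than circular linked lists, and it assumes the graph has no cycles.

```python
from collections import defaultdict


class NetworkDB:
    """Records connected as a graph: a record may have several parents."""

    def __init__(self):
        self.children = defaultdict(list)  # parent record -> child records
        self.parents = defaultdict(list)   # child record -> parent records

    def link(self, parent, child):
        self.children[parent].append(child)
        self.parents[child].append(parent)

    def paths_to(self, target):
        """Every chain of parents that leads to `target` (assumes no cycles)."""
        owners = self.parents.get(target, [])
        if not owners:
            return [[target]]
        return [path + [target]
                for owner in owners
                for path in self.paths_to(owner)]


db = NetworkDB()
db.link("CSE Department", "Student")
db.link("Library", "Student")       # a second parent, impossible in a tree

paths = db.paths_to("Student")
```

Here `paths` holds two routes to the same record, one through each parent.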
Entity-Relationship Model
The Entity-Relationship Model, or simply ER Model, is a high-level data model. In this
model, we represent the real-world problem in pictorial form to make it easy for the
stakeholders to understand. It is also easy for developers to understand the system
by just looking at the ER diagram. We use the ER diagram as a visual tool to represent an
ER Model. An ER diagram has the following three components: entities, attributes and relationships.
Example:
In the above diagram, the entities are Teacher and Department. The attributes
of the Teacher entity are Teacher_Name, Teacher_id, Age, Salary, Mobile_Number. The
attributes of the Department entity are Dept_id, Dept_name. The two entities are
connected using a relationship. Here, each teacher works for a department.
Features of ER Model
Graphical Representation for Better Understanding: It is very easy and simple to
understand so it can be used by the developers to communicate with the
stakeholders.
ER Diagram: ER diagram is used as a visual tool for representing the model.
Database Design: This model helps the database designers to build the database and
is widely used in database design.
Advantages of ER Model
Simple: Conceptually ER Model is very easy to build. If we know the relationship
between the attributes and the entities we can easily build the ER Diagram for the
model.
Effective Communication Tool : This model is used widely by the database
designers for communicating their ideas.
Easy Conversion to any Model: This model maps well to the relational model and
can be easily converted to it by turning the entities and relationships into tables.
This model can also be converted to other models such as the network model or the
hierarchical model.
Disadvantages of ER Model
No industry standard for notation: There is no industry standard for developing an
ER model. So one developer might use notations which are not understood by other
developers.
Hidden information: Some information might be lost or hidden in the ER model.
Because it is a high-level view, some details
might not be shown.
Relational Model
The Relational Model is the most widely used model. In this model, the data is maintained in
the form of two-dimensional tables. All the information is stored in the form of rows and
columns. The basic structure of the relational model is the table, so tables are also
called relations in the relational model. Example: In this example, we have an Employee
table.
Object-Relational Model
As the name suggests, it is a combination of the relational model and the
object-oriented model. This model was built to fill the gap between the object-oriented
model and the relational model. It offers advanced features, for example building complex data
types from the existing data types according to our requirements. The problem with this
model is that it can get complex and difficult to handle, so a proper understanding of
it is required.
Semi-Structured Model
The semi-structured model is an evolved form of the relational model. We cannot clearly separate
data from schema in this model. Example: Web-based data sources, where we cannot
tell the schema of the website apart from its data. In this model, some entities may
have missing attributes while others may have an extra attribute. This model gives
flexibility in storing the data. It also gives flexibility to the attributes. Example: A
value stored in an attribute can be either an atomic value or a collection
of values.
Associative Data Model
In this model the data is divided into two parts, items and links.
Item: An item contains a name and an identifier (some numeric value).
Link: A link contains an identifier, a source, a verb and a target.
Example: Let us say we have the statement "The world cup is being hosted by London from
30 May 2020". In this data two links need to be stored:
1. The world cup is being hosted by London. The source here is 'the world cup', the
verb is 'is being hosted by' and the target is 'London'.
2. ...from 30 May 2020. The source here is the previous link, the verb is 'from' and the
target is '30 May 2020'.
This is represented using the table as follows:
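As a minimal sketch of how the two links could be stored, the Python below keeps items and links in two small tables. The identifiers (1, 2, 3, 101, 102) and the class and function names are hypothetical, and the first verb is taken to be the full phrase 'is being hosted by' so that source, verb and target together reproduce the sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    id: int
    name: str


@dataclass(frozen=True)
class Link:
    id: int
    source: int   # identifier of an item or of another link
    verb: str
    target: int   # identifier of an item or of another link


items = [Item(1, "The world cup"), Item(2, "London"), Item(3, "30 May 2020")]
links = [
    Link(101, 1, "is being hosted by", 2),
    Link(102, 101, "from", 3),   # the source is the previous link
]

by_id = {record.id: record for record in items + links}


def render(ref):
    """Expand an identifier back into text by following sources and targets."""
    record = by_id[ref]
    if isinstance(record, Item):
        return record.name
    return f"{render(record.source)} {record.verb} {render(record.target)}"


sentence = render(102)
```

Rendering the second link walks through the first one, which shows how a link can itself act as the source of another link.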
Dr. S.P.Khandait
Introduction to DBMS
• Purpose of Database Systems
• View of Data
• Data Models
• Data Definition Language
• Data Manipulation Language
• Transaction Management
• Storage Management
• Database Administrator
• Database Users
• Overall System Structure
Database Management System (DBMS)
Purpose of Database Systems
Database management systems were developed to
handle the following difficulties of typical file-
processing systems supported by conventional
operating systems:
• Data redundancy and inconsistency
• Difficulty in accessing data
• Data isolation – multiple files and formats
• Integrity problems
• Atomicity of updates
• Concurrent access by multiple users
• Security problems
Levels of Abstraction
• Physical level: describes how a record (e.g.
customer) is stored.
• Logical level: describes data stored in database,
and the relationships among the data.
type customer = record
    name: string;
    street: string;
    city: integer;
end;
• View level: application programs hide details of
data types. Views can also hide information (e.g.
salary) for security purposes.
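The view level can be illustrated with an SQL view that hides a salary column. This is a sketch using Python's built-in sqlite3 module; the table name `employee`, the view name `employee_public` and the salary value are assumptions made only for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical level: the full record, including the sensitive column.
    CREATE TABLE employee (name TEXT, street TEXT, city TEXT, salary INTEGER);
    INSERT INTO employee VALUES ('Johnson', 'Alma', 'Palo Alto', 5000);

    -- View level: applications that query the view never see salary.
    CREATE VIEW employee_public AS
        SELECT name, street, city FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_public")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

How the rows are laid out on disk (the physical level) is left entirely to SQLite and is not visible in either the table or the view definition.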
Objectives of Three-Level Architecture
View of Data
An architecture for a database system, from top to bottom: view level, logical level, physical level.
ANSI-SPARC Three-Level Architecture
• External Level
– Users’ view of the database.
– Describes that part of database that is relevant
to a particular user.
– The way perceived by end users.
• Conceptual Level
– Community view of the database.
– Describes what data is stored in database and
relationships among the data.
– The way perceived by the DBA &
programmers.
• Internal Level
– Physical representation of the database on the
computer.
– Describes how the data is stored in the
database.
– The way perceived by the DBMS & OS.
Differences between Three Levels
Instances and Schemas
• Similar to types and variables in programming
languages
• Schema – the logical structure of the database
(e.g., set of customers and accounts and the
relationship between them)
• Instance – the actual content of the database at
a particular point in time
In RDBMS context: Schema – table names,
attribute names with their data types for each
table and constraints etc.
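The difference between schema and instance can be seen in a short sqlite3 session. This is a sketch under assumptions: the `account` table and its two columns are chosen for illustration, and `PRAGMA table_info` is SQLite's own way of listing a table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
)


def schema(conn):
    # Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]


def instance(conn):
    return conn.execute("SELECT * FROM account ORDER BY account_number").fetchall()


schema_before = schema(conn)
instance_t0 = instance(conn)            # content at one point in time

conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.execute("INSERT INTO account VALUES ('A-215', 700)")

instance_t1 = instance(conn)            # content at a later point in time
schema_after = schema(conn)             # the schema has not changed
```

The instance changes with every insert, while the schema stays the same until someone alters the table definition.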
Schemas versus Instances
• Database Schema: The description of the database. It rarely changes.
– Includes descriptions of the database structure, data types, and the
constraints on the database.
Schemas, Mappings, and Instances
• Mapping is the process of transforming requests and results between
the Internal, Conceptual & External levels.
Data Independence
• Ability to modify a schema definition in one
level without affecting a schema definition in
the other levels.
• The interfaces between the various levels and
components should be well defined so that
changes in some parts do not seriously
influence others.
• Two levels of data independence
– Physical data independence
– Logical data independence
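Physical data independence can be illustrated with sqlite3: adding an index changes how the data is accessed at the physical level, but the query text and its result stay the same. This is a sketch, and the table, index name and query are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-201", 900), ("A-215", 700), ("A-217", 750)],
)

# The application only knows this logical-level query.
QUERY = "SELECT account_number FROM account WHERE balance > 600 ORDER BY account_number"

before = conn.execute(QUERY).fetchall()

# Physical-level change: a new access path is added.
conn.execute("CREATE INDEX idx_account_balance ON account (balance)")

after = conn.execute(QUERY).fetchall()
```

Logical data independence would be the analogous property one level up, for example adding a column to the table without breaking applications that read through an existing view.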
Data Independence and the ANSI-SPARC
Three-Level Architecture
Database Languages
Data Definition Language (DDL)
• Specification notation for defining the database
schema
• DDL compiler generates a set of tables stored in a
data dictionary
• Data dictionary contains metadata (data about data)
• Data storage and definition language – special type
of DDL in which the storage structure and access
methods used by the database system are specified
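As one concrete illustration of DDL and the data dictionary, SQLite records every object created by a DDL statement in a catalog table named `sqlite_master`. The sketch below uses Python's sqlite3 module; the `customer` table and the index name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statements define the schema.
conn.execute("CREATE TABLE customer (name TEXT, street TEXT, city TEXT)")
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

# The catalog holds metadata (data about data), not customer rows.
catalog = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()
```

Other systems expose the same idea under different names, so the catalog query itself is specific to SQLite.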
Data Manipulation Language (DML)
Database Languages
• DBMS have a facility for embedding DDL & DML
(sub-languages) in a High-Level Language
(COBOL, C, C++ or Java), which in this case is
considered a host language
[Figure: an application program written in a host language (C, C++, Lisp, ...) keeps local variables in memory and issues calls to the DBMS.]
Data Model
• Integrated collection of concepts for describing
data, relationships between data, and constraints
on the data in an organization.
– To represent data in an understandable way.
Data Models
• A collection of tools for describing:
– Data
– Data relationships
– Data semantics
– Data constraints
• Object-based logical models
– Entity-relationship model
– Object-oriented model
– Semantic model
– Functional model
• Record-based logical models
– Relational model (e.g., SQL/DS, DB2)
– Network model
– Hierarchical model (e.g., IMS)
Categories of Data Models
Data Models
• Conceptual data models (Object-based):
– Entity-Relationship
– Semantic
– Functional
– Object-Oriented
• Logical data models (Record-based):
– Relational Data Model
– Network Data Model
– Hierarchical Data Model
• Physical data models:
– Unifying model
– Frame-based memory model
E/R (Entity/Relationship) Model
A conceptual level data model.
Provides the concepts of entities, relationships and attributes.
The University Database Context
Entities: student, faculty member, course, department etc.
Relationships: enrollment relationship between student and course,
employment relationship between faculty member and department etc.
Attributes: name, rollNumber, address etc. of the student entity;
name, empNumber, phoneNumber etc. of the faculty entity.
Banking database: with customer and account entities.
Entity-Relationship Model
Example of entity-relationship model (diagram): customer with attributes social-security, customer-name, customer-street and customer-city; account with attributes account-number and balance.
Representational Level Data Model
Relational Model : Provides the concept of a relation.
In the context of university database:
[Figure: a relation named student; the column headings are its attributes and each row is a tuple.]
Relation scheme: Attribute names of the relation. Relation
data/instance: set of data tuples.
More details will be given in Relational Data Model Module.
Record-Based Data Models
Relational Data Model
Relational Model
Example of tabular data in the relational model:

name     ssn          street  city       account-number
Johnson  192-83-7465  Alma    Palo Alto  A-101
Smith    019-28-3746  North   Rye        A-215
Johnson  192-83-7465  Alma    Palo Alto  A-201
Jones    321-12-3123  Main    Harrison   A-217
Smith    019-28-3746  North   Rye        A-201

account-number  balance
A-101           500
A-201           900
A-215           700
A-217           750
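The two tables above are connected only through the shared account-number values. As a sketch, the Python below loads the same rows into SQLite and joins them; the table names `customer` and `account` and the underscore spelling of `account_number` are assumptions, since the slide does not name the relations.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (name TEXT, ssn TEXT, street TEXT, city TEXT, account_number TEXT);
    CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER);
""")
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?, ?)", [
    ("Johnson", "192-83-7465", "Alma", "Palo Alto", "A-101"),
    ("Smith", "019-28-3746", "North", "Rye", "A-215"),
    ("Johnson", "192-83-7465", "Alma", "Palo Alto", "A-201"),
    ("Jones", "321-12-3123", "Main", "Harrison", "A-217"),
    ("Smith", "019-28-3746", "North", "Rye", "A-201"),
])
conn.executemany("INSERT INTO account VALUES (?, ?)", [
    ("A-101", 500), ("A-201", 900), ("A-215", 700), ("A-217", 750),
])

# Join the relations on the shared attribute and total the balances per customer.
totals = conn.execute("""
    SELECT c.name, SUM(a.balance)
    FROM customer AS c
    JOIN account AS a ON a.account_number = c.account_number
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

Account A-201 appears for both Johnson and Smith, so its balance is counted once for each of them.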
Functions of a DBMS
• A User-Accessible Catalog.
• Transaction Support.
• Recovery Services.
• Authorization Services.
• Integrity Services.
• Utility Services.
Architecture of DBMS - Overall System Structure
Architecture Details (1/3)
Disk Storage:
Meta-data – schema: table definitions, view definitions, mappings
Data – relation instances, index structures, statistics about data
Log – record of database update operations, essential for failure recovery
Architecture Details (2/3)
Query compiler:
Compiles ad hoc SQL queries and update / delete commands
Query optimizer:
Selects a near-optimal plan for executing a query
- relation properties and index structures are utilized
Architecture Details (3/3)
RDBMS Run Time System:
Executes compiled queries and compiled application programs
Interacts with Transaction Manager, Buffer Manager
Transaction Manager:
Keeps track of start, end of each transaction
Enforces concurrency control protocols
Buffer Manager:
Manages disk space
Implements paging mechanism
Recovery Manager:
Takes control at restart after a failure
Brings the system to a consistent state before it can be resumed
Roles for people in an Info System Management (1/2)
Application Programmers
•Embed SQL in a high-level language and develop programs to
handle functional requirements of an IS
•Should thoroughly understand the logical schema or relevant
views
•Meticulous testing of programs - necessary
Roles for people in an Info System management (2/2)
Sophisticated user / data analyst:
Uses SQL to generate answers for complex queries
Transaction Management
• A transaction is a collection of operations that
performs a single logical function in a database
application.
• Transaction-management component ensures that
the database remains in a consistent (correct) state
despite system failures (e.g. power failures and
operating system crashes) and transaction failures.
• Concurrency-control manager controls the
interaction among the concurrent transactions, to
ensure the consistency of the database.
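A fund transfer is the usual example of a transaction: both updates must happen, or neither. The sketch below uses Python's sqlite3 module, where a connection used as a context manager commits on success and rolls back if an exception escapes. The `account` table, the `transfer` function and the non-negative balance rule are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-201", 900)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction; True if it committed."""
    try:
        with conn:  # commit on success, roll back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above is undone.
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A-101", "A-201", 200)
failed = transfer(conn, "A-101", "A-201", 1000)   # would overdraw A-101
balances = conn.execute("SELECT * FROM account ORDER BY account_number").fetchall()
```

The credit is deliberately issued before the debit so that the failure happens halfway through the transaction; the rollback leaves no trace of the partial update.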
Storage Management
• A storage manager is a program module that
provides the interface between the low-level data
stored in the database and the application
programs and queries submitted to the system.
• The storage manager is responsible for the
following tasks:
– Interaction with the file manager
– Efficient storing, retrieving, and updating of data
Database Administrator
• Coordinates all the activities of the database system; the
database administrator has a good understanding of the
enterprise’s information resources and needs.
• Database administrator’s duties include:
– Schema definition
– Storage structure and access method definition
– Schema and physical organization modification
– Granting user authority to access the database
– Specifying integrity constraints
– Acting as liaison with users
– Monitoring performance and responding to changes in
requirements
Purpose of Database Systems
A Database Management System (DBMS) is defined as a software system that
allows the user to define, create and maintain the database and provides controlled
access to the data.
It is a collection of programs used for managing data, and it simultaneously supports
different types of users who create, manage, retrieve, update and store information.
Purpose
The purpose of DBMS is to transform the following −
Data into information.
Information into knowledge.
Knowledge into action.
The diagram given below explains the process as to how the transformation of data
to information to knowledge to action happens respectively in the DBMS −
Previously, the database applications were built directly on top of the file system.
Drawbacks in File System
Using the file system directly has many drawbacks. These are mentioned below −
Data redundancy and inconsistency − Different file formats, duplication of
information in different files.
Difficulty in accessing data − To carry out a new task we need to write a new
program.
Data isolation − Different files and formats.
Integrity problems.
Atomicity of updates − Failures leave the database in an inconsistent state.
For example, a fund transfer from one account to another may be left
incomplete.
Concurrent access by multiple users.
Security problems.
Database systems offer solutions to all these problems.
Uses of DBMS
The main uses of DBMS are as follows −
Data independence and efficient access of data.
Application Development time reduces.
Security and data integrity.
Uniform data administration.
Concurrent access and recovery from crashes.
Applications of DBMS
The different applications of DBMS are as follows −
Railway Reservation System − A database is used to keep records of ticket bookings,
train departures and arrival status, and to give updates to the
passengers.
Library Management System − A library holds so many books that
it is very hard to keep a record of all of them in a register or a
copy. So, a DBMS is necessary to keep track of all the book records, issue
dates, names of the books and authors.
Banking − We carry out many transactions daily without going to the
bank in person. This is possible because banks use databases to manage all the data
of their customers.
Educational Institutions − Examinations and student data are
maintained online with the help of a database
management system. It contains the registration details of students, results,
grades and available courses. All this work can be done online without
visiting an institution.
Social Media Websites − By filling in the required details we are able to access
social media platforms. Many users sign up daily for social websites such as
Facebook, Pinterest and Instagram. All the information related to the users is
stored and maintained with the help of a DBMS.
• Data isolation: Because data are scattered in various files, and files may
be in different formats, writing new application programs to retrieve the
appropriate data is difficult.
• Integrity problems: The data values stored in the database must satisfy
certain types of consistency constraints. Suppose the university maintains
an account for each department, and records the balance amount in each
account. Suppose also that the university requires that the account balance of
a department may never fall below zero. Developers enforce these constraints
in the system by adding appropriate code in the various application programs.
However, when new constraints are added, it is difficult to change
the programs to enforce them. The problem is compounded when constraints
involve several data items from different files.
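In a DBMS the same rule can be declared once, next to the data, instead of being repeated in every application program. The sketch below uses Python's sqlite3 module and a CHECK constraint; the table name `dept_account`, the `withdraw` function and the 'Physics' row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE dept_account ("
    " dept_name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"   # the consistency constraint
)
conn.execute("INSERT INTO dept_account VALUES ('Physics', 100)")
conn.commit()


def withdraw(conn, dept, amount):
    """Try to reduce a department's balance; the database rejects overdrafts."""
    try:
        with conn:
            conn.execute(
                "UPDATE dept_account SET balance = balance - ? WHERE dept_name = ?",
                (amount, dept),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = withdraw(conn, "Physics", 40)
rejected = withdraw(conn, "Physics", 100)   # would leave the balance at -40
balance = conn.execute(
    "SELECT balance FROM dept_account WHERE dept_name = 'Physics'"
).fetchone()[0]
```

Any other program that updates this table is held to the same rule without containing any checking code of its own.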
• Attributes of higher-level entity-sets are inherited
by lower-level entity-sets
• Relationships involving higher-level entity-sets
are also inherited by lower-level entity-sets!
– A lower-level entity-set can participate in its own
relationship-sets, too
• Usually, entity-sets inherit from one superclass
– Entity-sets form a hierarchy
• Can also inherit from multiple superclasses
– Entity-sets form a lattice
– Introduces many subtle issues, of course
[Diagram: account (acct_id, balance) with an ISA specialization, marked disjoint, into checking (overdraft_limit) and savings (min_balance).]
• “An account must be a checking account or a
savings account.”
• Every entity in higher-level entity-set must also
be a member of at least one lower-level entity-
set
– Called total specialization
• If entities in higher-level entity-set aren’t required
to be members of lower-level entity-sets:
– Called partial specialization
• account specialization is a total specialization
• Default constraint is partial specialization
• Specify total specialization constraint with
a double line on superclass side
• Updated bank account diagram:
[Diagram: account (acct_id, balance), joined by a double line to a disjoint ISA specialization into checking (overdraft_limit) and savings (min_balance).]
• Our bank schema so far:
[Diagram: account (acct_id, balance) with a total, disjoint ISA specialization into checking (overdraft_limit) and savings (min_balance).]
• Schemas:
account(acct_id, acct_type, balance)
checking(acct_id, overdraft_limit)
savings(acct_id, min_balance)
– Could use CHECK constraints on SQL tables for
membership constraints and other constraints
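The three schemas above could be declared roughly as follows. This is a sketch using Python's sqlite3 module, not the course's own SQL: the column types, the choice of acct_id as primary key, and the allowed acct_type values are assumptions, and the sketch does not enforce that an account's acct_type matches the subclass table it appears in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE account (
        acct_id   INTEGER PRIMARY KEY,
        acct_type TEXT NOT NULL CHECK (acct_type IN ('checking', 'savings')),
        balance   INTEGER NOT NULL
    );
    CREATE TABLE checking (
        acct_id         INTEGER PRIMARY KEY REFERENCES account(acct_id),
        overdraft_limit INTEGER NOT NULL
    );
    CREATE TABLE savings (
        acct_id     INTEGER PRIMARY KEY REFERENCES account(acct_id),
        min_balance INTEGER NOT NULL
    );
""")


def accepted(sql):
    """Run one statement; report whether the constraints allowed it."""
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    accepted("INSERT INTO account VALUES (1, 'checking', 500)"),
    accepted("INSERT INTO checking VALUES (1, 200)"),
    accepted("INSERT INTO account VALUES (2, 'loan', 0)"),   # fails the CHECK
    accepted("INSERT INTO savings VALUES (99, 50)"),         # no such account
]
```

The foreign keys guarantee that every checking or savings row belongs to a row in account, which is how the subclass tables pick up the inherited attributes.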
• If specialization is disjoint and complete, can
convert only lower-level entity-sets to relational
schemas
– Every entity in higher-level entity-set also appears in
lower-level entity-sets
– Every entity is a member of exactly one lower-level
entity-set
• Each lower-level entity-set has its own relation
schema
– All attributes of superclass entity-set are included on
each subclass entity-set
– No relation schema for superclass entity-set
[Diagram: account (acct_id, acct_type, balance) with a disjoint ISA specialization into checking (overdraft_limit) and savings (min_balance).]
• Alternative schemas:
checking(acct_id, acct_type, balance, overdraft_limit)
savings(acct_id, acct_type, balance, min_balance)
• Problems?
– Enforcing uniqueness of account IDs!
– Representing relationships involving general accounts
• Can solve by creating a simple relation:
account(acct_id)
– Contains all valid account IDs
– Relationships involving accounts can use account
– Need foreign key constraints again…
• Generating primary key values is actually the easy part
• Most databases provide sequences
– A source of INTEGER or BIGINT values
– Perfect for primary key values
– Multiple tables can use a sequence for their primary keys
• PostgreSQL example:
CREATE SEQUENCE acct_seq;
• Another option is to treat the works_on
relationship as an aggregate
– Build a relationship against the aggregate
– manages implicitly includes the set of entities
participating in a works_on relationship
instance
– Jobs can also have
no manager
[Diagram: employee, branch and job participate in works_on; manager is connected by manages to the works_on aggregate.]
• Mapping for aggregation is straightforward
• For entity-sets and relationship-set being used
as an aggregate, mapping is unchanged
• Relationship-set against the aggregate:
– Includes primary keys of participating entity-sets
– Includes all primary key attributes of aggregated
relationship-set
– Also includes any descriptive attributes
– Primary key of relationship-set includes all the above
primary key attributes
– Foreign key against aggregated relationship-set, as
well as participating entity-sets
• Job schemas:
employee(emp_id, emp_name)
job(title, level)
branch(branch_name, branch_city, assets)
works_on(emp_id, branch_name, title)
• Manager schemas:
manager(mgr_id, mgr_name)
manages(mgr_id, emp_id, branch_name, title)
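The mapping rule for aggregation can be sketched in SQLite through Python's sqlite3 module. Which attributes are primary keys is an assumption here (the underlining from the slides is lost), as are the column types and the sample rows. The point of interest is the composite foreign key from manages to works_on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, emp_name TEXT);
    CREATE TABLE job      (title TEXT PRIMARY KEY, level INTEGER);
    CREATE TABLE branch   (branch_name TEXT PRIMARY KEY, branch_city TEXT, assets INTEGER);
    CREATE TABLE manager  (mgr_id INTEGER PRIMARY KEY, mgr_name TEXT);

    CREATE TABLE works_on (
        emp_id      INTEGER REFERENCES employee(emp_id),
        branch_name TEXT    REFERENCES branch(branch_name),
        title       TEXT    REFERENCES job(title),
        PRIMARY KEY (emp_id, branch_name, title)
    );

    -- manages refers to the aggregated relationship as a whole.
    CREATE TABLE manages (
        mgr_id      INTEGER REFERENCES manager(mgr_id),
        emp_id      INTEGER,
        branch_name TEXT,
        title       TEXT,
        PRIMARY KEY (mgr_id, emp_id, branch_name, title),
        FOREIGN KEY (emp_id, branch_name, title)
            REFERENCES works_on(emp_id, branch_name, title)
    );

    INSERT INTO employee VALUES (1, 'Ann');
    INSERT INTO job      VALUES ('teller', 1);
    INSERT INTO branch   VALUES ('Downtown', 'Brooklyn', 1000);
    INSERT INTO manager  VALUES (10, 'Bob');
    INSERT INTO works_on VALUES (1, 'Downtown', 'teller');
""")


def accepted(sql):
    try:
        with conn:
            conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


existing_job = accepted("INSERT INTO manages VALUES (10, 1, 'Downtown', 'teller')")
missing_job = accepted("INSERT INTO manages VALUES (10, 1, 'Downtown', 'auditor')")
```

A manager can only be attached to a combination of employee, branch and job that already exists in works_on, and a works_on row with no matching manages row is a job without a manager.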
• Differences between version with aggregation,
and version with quaternary relationship?
• Biggest difference:
– Quaternary relationship’s schema derives primary
and foreign key constraints from participating entities
– Relationship using aggregation derives primary and
foreign key constraints from aggregate relationship
• A subtle difference
– Doesn’t have any significant practical impact
• Covered two extensions to E-R model
– Higher level abstractions
• Generalization and specialization
– Can specify constraints:
• Membership constraints
• Completeness constraints
• Disjointness constraints
• Aggregation
– Can build relationships that include other
relationships
• Straightforward mappings to relational model
• Next time: normal forms!
Database System Structure & architecture
Database Management System (DBMS) is software that allows access to data stored in a
database and provides an easy and effective method of –
Defining the information.
Storing the information.
Manipulating the information.
Protecting the information from system crashes or data theft.
Differentiating access permissions for different users.
Used by the computer system on which the database system runs.
• Database systems can be centralized, or client-server.
• A database system is divided into modules that deal with different responsibilities of
the overall system.
• Primary goal:- retrieving information from and storing new information into the
database.
3-tier Architecture
A 3-tier architecture separates its tiers from each other based on the complexity of the users
and how they use the data present in the database. It is the most widely used architecture to
design a DBMS.
Database (Data) Tier − At this tier, the database resides along with its query
processing languages. We also have the relations that define the data and their
constraints at this level.
Application (Middle) Tier − At this tier reside the application server and the
programs that access the database. For a user, this application tier presents an
abstracted view of the database. End-users are unaware of any existence of the
database beyond the application. At the other end, the database tier is not aware of any
other user beyond the application tier. Hence, the application layer sits in the middle
and acts as a mediator between the end-user and the database.
User (Presentation) Tier − End-users operate on this tier and they know nothing
about any existence of the database beyond this layer. At this layer, multiple views of
the database can be provided by the application. All views are generated by
applications that reside in the application tier.
Multiple-tier database architecture is highly modifiable, as almost all its components are
independent and can be changed independently.
Integrity Manager: It checks the integrity constraints when the database is modified.
File Manager: It manages the file space and the data structure used to represent
information in the database.
Buffer Manager: It is responsible for cache memory and the transfer of data between
the secondary storage and main memory.
Data Dictionary: It contains the information about the structure of any database object.
It is the repository of information that governs the metadata.
Indices: They provide faster retrieval of data items.
SYSTEM USERS
1. Application programmers
• Computer professionals who write application programs.
• They interact with the system through DML calls.
• DML calls are embedded in a program written in a host language (for example PHP,
Python, C).
• These programs are commonly referred to as application programs.
Query Processor:
• A query processor helps the database system simplify and facilitate data access.
• System users are not required to know physical details of the implementation of the
system.
• Quick processing of updates.
• Queries are written in a nonprocedural language, at the logical level.
• Queries are translated into an efficient sequence of operations at the physical level.
Storage Manager:
• A storage manager is a program module that provides the interface between the low-
level data stored in the database and the application programs and queries submitted to the
system.
• The storage manager is responsible for the interaction with the file manager.
• The raw data are stored on the disk using the file system, which is usually provided by
a conventional operating system.
• The storage manager translates the various DML statements into low-level file system
commands.
• The storage manager is responsible for storing, retrieving, and updating data in the
database.
• A large amount of storage space is required for storing corporate databases (which may
range from hundreds of gigabytes to terabytes of data), and a storage manager is required
to manage it.
• Data must be moved between disk storage and main memory as required, because
the main memory of the computer cannot store this much information.
Disk Storage
• A DBMS can use several kinds of data structures as a part of physical system
implementation in the form of disk storage.
• Each structure has its own importance.
• Following are some common data structures.
• Disk storage is the central repository for storing all kinds of data in the database.
(i) Data
• It stores the database itself on the disk in the Data files.
(ii) Data Dictionary
• Information relating to the structure and usage of data contained in the database, the
metadata, is maintained in a data dictionary.
• The data dictionary is itself a database that documents the data.
• Each database user can consult the data dictionary to find out what each piece of data
and the various synonyms of the data fields mean.
• In a system where the data dictionary is part of the DBMS (an integrated system), the data
dictionary stores information concerning the source of each data-field value, the frequency of
its use, and an audit trail (verification of account) concerning updates, including the who and
when of each update.
• Currently, data dictionary systems are also available as add-ons to the DBMS.
(iii) Indices
• Indices provide fast access to data items.
• A database index provides pointers to those data items that hold a particular value.
• Hashing is an alternative to indexing that is faster in some but not all cases.
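The trade-off between hashing and an ordered index can be sketched in plain Python. This is an in-memory illustration only, not how a DBMS lays out its indices on disk; the record values and function names are assumptions made for the example.

```python
import bisect

# The "data file": records stored in arrival order.
records = [("A-101", 500), ("A-201", 900), ("A-215", 700), ("A-217", 750)]

# Hash-style index on account number: key -> position. Equality lookups only.
hash_index = {key: pos for pos, (key, _) in enumerate(records)}

# Ordered index on balance: sorted (value, position) pairs. Supports range scans.
balance_index = sorted((balance, pos) for pos, (_, balance) in enumerate(records))


def lookup(account_number):
    """Find one record by exact key through the hash index."""
    return records[hash_index[account_number]]


def balance_between(low, high):
    """Find all records with low <= balance <= high through the ordered index."""
    start = bisect.bisect_left(balance_index, (low, -1))
    end = bisect.bisect_right(balance_index, (high, len(records)))
    return [records[pos] for _, pos in balance_index[start:end]]


one = lookup("A-201")
in_range = balance_between(700, 800)
```

The dictionary answers an exact-match lookup directly but has no notion of order, which is one reason hashing is faster in some cases and unusable in others, such as range queries.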