

Department of ECE
B.Tech IV Year I Semester (R18 Syllabus)
(Professional Elective-IV)

DATABASE MANAGEMENT SYSTEMS


Applicable From 2018-19 Admitted Batch

Faculty Name: Syed Abdul Khaliq Asst. Prof. (CSE Dept)



UNIT–I (Syllabus)

Database System Applications:

A Historical Perspective, File Systems versus a DBMS, the Data Model, Levels of

Abstraction in a DBMS, Data Independence, Structure of a DBMS

Introduction to Database Design:

Database Design and ER Diagrams, Entities, Attributes, and Entity Sets,

Relationships and Relationship Sets, Additional Features of the ER Model,

Conceptual Design with the ER Model



UNIT-I

Database System Applications


A Historical Perspective:
The concept of a database existed long before computers. In those times, data was stored
in journals, in libraries, and in hundreds of filing cabinets (storage units with doors and
shelves). Everything was recorded on paper, which meant that records took up space,
were hard to find, and were difficult to back up.

➢ Data was stored as paper records
➢ A large amount of manpower was involved
➢ Time was wasted, for example when searching for a particular record
➢ Overall it was time consuming, space consuming, effort intensive, tedious and
inefficient.

Databases have developed alongside computers and have changed accordingly since their
inception. The database is a foundational element: we interact with databases even
without knowing it, for example when we

❖ Purchase something online
❖ Log in to a particular service, such as paying electricity bills online
❖ Access our bank accounts, and so on.

From the earliest days of computers, storing and manipulation of data have been a major
application focus. The initial computer applications focused on clerical tasks.


For example:

1. Employee’s payroll calculation


2. Work scheduling of a manufacturing industry
3. Order and entry processing and so on

The history of database systems can be divided decade by decade:
1. 1950-1960:

In this decade, magnetic tapes were developed for data storage. Data processing tasks
such as payroll were automated, with data stored on tapes. Processing of data consisted
of reading data from one or more tapes and writing data to a new tape. Punched cards
were used for input, and the printer was used for output.

Data on tapes could be read only in sequential order. Access to the database was through
low-level pointer operations. Storage details depended on the type of data to be stored. A
user needed to know the physical structure of the database in order to query for
information.

o Data processing using magnetic tapes for storage

o Tapes provided only sequential access

o Punched cards were used for input

2. 1960-1970:

During this decade, hard disks came into use for data storage and changed the nature of
data processing, since hard disks allowed direct access to data: any location on disk could
be accessed. The first database concept was introduced in this decade by Charles
Bachman, who developed the Integrated Data Store (IDS), which was based on the
network data model.

Later, IBM (International Business Machines Corporation) developed the Information
Management System (IMS), which became a widely used database system and was
based on the hierarchical database model.

Both IDS and IMS were 'navigational databases'.

Navigational databases required users to navigate through the database to find
the required information.

Navigational databases followed one of two main models:

➢ The hierarchical model

➢ The network model

The hierarchical model:

The hierarchical model was developed by IBM. In it, data is organized like a family tree.
Every record except the root record has a parent record. The hierarchical model is
considered navigational because it is necessary to navigate from the top down to reach
the required information.

The network model:

In the network model, navigation can go both from the top down and from the bottom up
to reach the required information. The network model was standardized by the
Conference on Data Systems Languages (CODASYL). It differed from the hierarchical
model in that it allowed a record to have more than one parent record as well as more
than one child record.

3. 1970-1980:

During this decade the relational database model was developed by E. F. Codd (Edgar
Frank Codd). In 1970 he published his paper “A Relational Model of Data for Large
Shared Data Banks”. This paper introduced the idea of the relational database, which
started the development of a new way to store and access data.

Many of the database systems we use today are based on the relational model, which
came to be considered the standard database model. The relational model uses
“declarative” techniques, in which you ask the system for what you want instead of
specifying how to navigate to it.
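
The difference between navigating to a record and declaring what is wanted can be sketched in a few lines. This is only an illustration, assuming Python with its built-in sqlite3 module; the employee table, its sample rows and the two function names are hypothetical and are not part of the original notes.

```python
import sqlite3

# Hypothetical sample data: (empno, name, salary).
records = [(101, "xyz", 20000), (102, "Abc", 40000), (103, "Sam", 35000)]


def navigational_lookup(rows, empno):
    """Navigational style: the program spells out HOW to reach the record,
    visiting records one by one until the wanted one is found."""
    for row in rows:
        if row[0] == empno:
            return row
    return None


def declarative_lookup(conn, empno):
    """Declarative style: the query states WHAT is wanted and the DBMS
    decides how to locate it."""
    cur = conn.execute(
        "SELECT empno, name, salary FROM employee WHERE empno = ?", (empno,)
    )
    return cur.fetchone()


# An in-memory database stands in for a real relational DBMS.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empno INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)", records)

nav_result = navigational_lookup(records, 102)
decl_result = declarative_lookup(conn, 102)
```

Both lookups return the same record; only the second leaves the access path to the system.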

Two major relational database system prototypes were created during this decade:

o INGRES (Interactive Graphics and Retrieval System) at the University of California,
Berkeley, and
o SYSTEM R by IBM Corporation

INGRES was developed by Michael Stonebraker and Eugene Wong at the University of
California, Berkeley. INGRES, which stands for Interactive Graphics and Retrieval
System, was a relational database system. It used a query language called QUEL and
later led to the creation of systems such as

❖ MS SQL Server
❖ Sybase

System R was a relational database management system built as a research project at
IBM's San Jose research laboratory. Its query language was SEQUEL, which was later
renamed SQL (Structured Query Language). System R was the first implementation of
SQL, which has since become the standard relational database query language, and it
provided good transaction processing performance.

In this decade, Peter Chen introduced a new database model known as the ER Model
(Entity-Relationship Model). The ER model is intended for designing data applications.

4. 1980-1990:

During this decade, SQL was adopted as a standard query language by ANSI (American
National Standards Institute) and ISO (International Organization for Standardization),
and it is still in use today.

Another noteworthy event in the history of databases was the development and coming
into use of object-oriented database management systems (OODBMS). Object databases
view data as ‘objects’ and work with programming languages that support the
‘object-oriented’ approach.

5. 1990-2000:

In this decade, a number of application development tools came into use. These included
PowerBuilder, an object-oriented tool for client/server applications (now owned by SAP);
VB (Visual Basic) from Microsoft, used for graphical user interface applications; and
Oracle Developer from Oracle Corporation, used for creating client/server and Web
applications that run on the Oracle database. ODBC (Open Database Connectivity), an
open standard Application Programming Interface (API) for accessing databases such as
Microsoft Access, Microsoft Visual FoxPro and Microsoft SQL Server, also appeared.

The World Wide Web (WWW) came into widespread use in this decade. It allowed remote
access to database systems, and users began to use client-server databases.

Online business increased, resulting in a rise in demand for internet database connectors
and tools such as Active Server Pages (ASP), Java Servlets, FrontPage, Dreamweaver,
Enterprise JavaBeans, etc.

MySQL, an open source RDBMS (Relational Database Management System), was created
in 1995. It was originally developed by MySQL AB and is now owned by Oracle. MySQL is
still used by many organizations today. It is based on SQL (Structured Query Language)
and runs on Linux, UNIX and Windows. MySQL is used for a wide range of purposes,
including data warehousing and e-commerce, and by many popular websites, including
Facebook, Flickr, Twitter and YouTube.

6. 2000-2010:

The term NoSQL (“not only SQL”) came into use. It refers to databases that use means
other than SQL to store and retrieve data.

NoSQL databases are useful for unstructured data, and they grew in popularity in the
2000s.

NoSQL allowed for faster processing of larger and more varied datasets. NoSQL
databases are more flexible than the traditional relational databases.

7. 2010 onwards: (Big data, Distributed databases and cyber security were
introduced)

In this period, big data came to prominence, often handled with non-relational
databases. Big data is a combination of structured, semi-structured and unstructured
data collected by organizations that can be mined for information and used in machine
learning and other applications. Big data comes in a variety of forms. The term also
refers to data that is so large or complex that it is difficult or impossible to process
using traditional methods.

Distributed databases: A distributed database is a database that is not limited to one
system; it is spread over different sites, i.e., on multiple computers or over a network of
computers. Such databases store data across multiple physical locations rather than in
one place.

Cyber security also became a major concern for systems involving computers or computer
networks (such as the Internet). Cyber security is the practice of defending computers,
servers, mobile devices, electronic systems, networks, and data from malicious attacks.
It is also known as information technology security or electronic information security.

File System versus a DBMS

Storing and managing data is an important task for an individual as well as for a large
organization. There are various methods to store and manage data. Two of them are by
using the

❖ File system
❖ DBMS.

File System: The file system was an early attempt to computerize the manual filing
system. It is basically a collection of application programs that perform services for the
end users. Each program within a file system defines and manages its own data. In this
system, a number of files are needed to perform various tasks. A file system is software
that manages the data files in a computer system.

Some of the important characteristics of file system are:

➢ Each file is independent of the others
➢ Each file is called a flat file
➢ Each file contains and processes information for one specific task, such as
accounting or inventory
➢ Files are handled by programs written in programming languages such as C and
C++

DBMS: A DBMS helps to easily store, retrieve and manipulate data in a database. A DBMS
is software to create and manage databases. A DBMS provides more advantages than a
file system.

S.No. File System DBMS


1. File System: A file system is software that manages data files; the user has to write
   the programs for managing the data. Handling a file system is easier than handling a
   DBMS.
   Examples: DOS (Disk Operating System) follows the FAT (File Allocation Table) system;
   Windows follows NTFS (New Technology File System); UNIX follows the UNIX file
   system; Linux follows the Extended File System; big data storage may use the
   Zettabyte File System (ZFS).

   DBMS: A DBMS is software to create and manage databases, and it provides more
   advantages than a file system. A database is a collection of data. In a DBMS the user is
   not required to write the programs for managing the data.
   Examples: Oracle, MySQL, Microsoft SQL Server, PostgreSQL, IBM DB2, SQLite,
   Microsoft Access, MariaDB and Microsoft Azure SQL Database.
   As of June 2021, the most popular RDBMS in the world was Oracle.

2. File System: Data is stored in an unstructured format.
   Example: in a Notepad file, data is stored like
      1. Abc 20000 manager hyderabad
      2. Xyz 4000 director pune

   DBMS: Data is stored in tabular form, i.e., a structured format.
   Example: data stored in tabular form
      Empno   Name    Salary
      101     xyz     20000
      102     Abc     40000
      103     Sam     35000
      104     Smith   45000

3. File System: The file system comes with the operating system and is managed by the
   OS. Data is stored along the path
      File System (NTFS) - Operating System - Hard disk

   DBMS: In a DBMS the path is
      DBMS - File System (NTFS) - OS - Hard disk

4. File System: To access a data file, you need to know where it exists on the hard disk:
   the drive (C:, D: or E:), the main directory name, the subdirectory name and the file
   name. You also need to write a program in C/C++/Java that mentions the location of
   the file, compile the program and, if there are no errors, run it to access the data file.

   DBMS: Data is accessed through a simple query.

5. File System: Accessing one particular piece of information is not possible; instead,
   the complete file has to be accessed unnecessarily. For example, a file of size 1055 MB
   has to be accessed for a small piece of information.

   DBMS: Only the required information is accessed, instead of the complete file, through
   a simple query. For example, a small piece of information, i.e., 25 KB out of 5 MB, can
   be accessed through a simple query.

6. File System: Data redundancy and data inconsistency exist in a file system. Data
   redundancy means that the same piece of information is stored in multiple locations,
   possibly in many different files. Redundancy is not controlled in a file system, so it is
   greater than in a DBMS. Redundancy leads to inconsistency: when the same data exists
   in multiple files and is modified in one file but not in the others, the copies no longer
   agree. A file processing system does not provide data consistency.
   Example: if someone's address is stored in many files and it is changed in only one of
   them, the other files still hold the old address, so the data is inconsistent.

   DBMS: Related data resides in the same storage location and the same information
   need not be duplicated, which minimizes data redundancy and reduces data
   inconsistency. Because the database is centralized, the problems of redundancy and
   inconsistency are controlled, and both are low in a DBMS. A DBMS supports data
   consistency through normalization.

7. File System: Data integrity is a problem. Data integrity refers to the accuracy of
   data. Here, a user can mistakenly insert string data in place of an integer.

   DBMS: A DBMS enforces data integrity.
   Example of data integrity: a user cannot insert string data into an integer column.

8. File System: Concurrent access has many problems, such as one user reading the file
   while another is deleting or updating some information, since there is no locking
   system in the file system.

   DBMS: A DBMS takes care of concurrent access to data by using some form of locking.
   For example, given two transactions T1 and T2 and a data item x, if T1 has locked x,
   then T2 cannot access x until T1 has finished.

9. File System: The file system offers less security.

   DBMS: A database management system offers high security. Database security
   includes protecting the database and the applications that access it. Organizations
   must secure databases from cyber security threats such as malicious attacks, as well
   as from misuse of data and databases by those who can access them.

10. File System: Data is isolated in a file system (stored on a standalone machine).
    Example: data is stored on a single machine and accessed by a single user at a time,
    i.e., single user, single machine. A file processing system provides less flexibility in
    accessing data.

    DBMS: Data can be shared (multiple users can access data from a single machine). A
    DBMS provides multiple user interfaces. A multiuser database system must allow
    multiple users access to the database at the same time. As a result, a multiuser DBMS
    must have concurrency control strategies to ensure that several users can access the
    same data item at the same time in such a manner that the data will always be
    correct, i.e., data integrity is preserved. A DBMS has more flexibility in accessing data.

11. File System: Unauthorized access is not restricted in a file system.

    DBMS: Unauthorized access is restricted in a DBMS.

12. File System: It does not offer backup and recovery of data if it is lost. A file system
    does not have a crash recovery mechanism, i.e., if the system crashes while some data
    is being entered, the content of the file will be lost.

    DBMS: A DBMS provides backup and recovery of data if it is lost. It provides a crash
    recovery mechanism, i.e., the DBMS protects the user from system failure.

13. File System: There is no efficient query processing in the file system.

    DBMS: You can easily query data in a database using the SQL language.

14. File System: Centralization is hard to achieve in a file management system.

    DBMS: Centralization is easy to achieve in a DBMS.

15. File System: Multiple user access to the information is difficult to provide.
    DBMS: Multiple users can access the database at the same time.

16. File System: It is cheaper to design.
    DBMS: It is relatively expensive to design.

17. File System: It has a simple structure.
    DBMS: Its structure is relatively complex.

18. File System: It is very difficult to protect a file under the file system.
    DBMS: A DBMS provides a good protection mechanism.
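
Two rows of the comparison above (access through a program versus through a query, and data integrity) can be illustrated with a small sketch. It assumes Python with the built-in sqlite3 module standing in for a DBMS and an in-memory text buffer standing in for a flat file; the function names are hypothetical, and the sample records reuse the Empno/Name/Salary example from the table.

```python
import io
import sqlite3

# Stand-in for a flat file: one employee per line, fields separated by spaces.
flat_file = io.StringIO(
    "101 xyz 20000\n102 Abc 40000\n103 Sam 35000\n104 Smith 45000\n"
)


def salary_from_file(f, empno):
    """File-system style: the program reads the file line by line and
    parses every line itself until it finds the wanted record."""
    f.seek(0)
    for line in f:
        number, _name, salary = line.split()
        if int(number) == empno:
            return int(salary)
    return None


# DBMS style: the table definition carries the integrity rule.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " empno INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary INTEGER CHECK (typeof(salary) = 'integer'))"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "xyz", 20000), (102, "Abc", 40000),
     (103, "Sam", 35000), (104, "Smith", 45000)],
)


def salary_from_db(conn, empno):
    """A simple query fetches only the required value."""
    row = conn.execute(
        "SELECT salary FROM employee WHERE empno = ?", (empno,)
    ).fetchone()
    return row[0] if row else None


def bad_insert_is_rejected(conn):
    """Return True if the DBMS refuses a non-numeric string as a salary."""
    try:
        conn.execute("INSERT INTO employee VALUES (105, 'Bad', 'abc')")
    except sqlite3.IntegrityError:
        return True
    return False
```

The CHECK constraint is needed here because SQLite, unlike most relational systems, does not reject mistyped values by default; the flat file has no comparable rule at all.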


The Data Model:

A data model gives a shape to the data: it describes how the stored data is organized,
which makes the meaning of the data easier to understand.

In simple words, a data model is “a collection of high-level data description constructs
that hide many low-level storage details”. A data model can also be defined as “a
collection of conceptual tools for describing data, data relationships and consistency
constraints”. A DBMS allows a user to define the stored data in terms of a data model.

A data model in a Database Management System (DBMS) is a set of concepts and tools
used to describe the database. A database model is a specification describing how a
database is structured and used. A data model is sometimes referred to as a data
structure, especially in the context of programming languages.

There are various types of data model but the relational model is the most widely used
model.

The different Data Models in DBMS are:

1. Hierarchical Model
2. Network Model
3. Relational Model
4. Object-Oriented Model
5. Object-Relational Model
6. Entity-Relationship Model

1. Hierarchical Model
The hierarchical model was the first DBMS model and is one of the oldest database
models. The general shape of this model is like an organizational chart (Example-2). A
node on the chart represents a particular entity. The terms parent and child are used in
describing a hierarchical model. This model organizes the data in a hierarchical tree
structure. The hierarchy starts from the root, which holds the root data, and then
expands in the form of a tree by adding child nodes to parent nodes.

This model uses the tree as its basic structure. A tree is a data structure that consists of
a hierarchy of nodes, with a single node, called the root, at the highest level. A node
represents a particular entity. A node may have any number of children, but each child
node may have only one parent node. This kind of structure is often referred to as an
“inverted tree” (it grows from the top downwards). In this model the parent-to-child
relationship is one-to-many, while each child is linked to exactly one parent.

In a hierarchical data model, records are arranged in a top-down structure. The nodes of
the tree represent data records. The relationships are represented as links or pointers
between nodes. Example: To locate a particular record in a hierarchical database, you
have to start at the top of the tree with a parent record and trace down the tree to the child.
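
The top-down search described above can be sketched as follows. This is a minimal illustration in Python, not how any particular hierarchical DBMS is implemented; the Node class and find_path function are hypothetical, the node names follow the College/Department example used in these notes, and the second child of the root ("Administration") is an assumption, since the figure is not reproduced here.

```python
class Node:
    """A record in a hierarchical model: at most one parent, any number of children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def find_path(root, target):
    """Search from the root downwards; return the list of node names on the
    path from the root to the target, or None if the target is absent."""
    if root.name == target:
        return [root.name]
    for child in root.children:
        sub_path = find_path(child, target)
        if sub_path is not None:
            return [root.name] + sub_path
    return None


college = Node("College")                         # level 0 (root)
department = Node("Department", college)          # level 1
administration = Node("Administration", college)  # level 1 (assumed name)
course = Node("Course", department)               # level 2
faculty = Node("Faculty", department)             # level 2
student = Node("Student", department)             # level 2

path_to_faculty = find_path(college, "Faculty")
```

Because every node has a single parent, the path found for any record is the only path to it.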

Syntax

Example: Organization Chart

A hierarchical database represents data in a tree-like form. The relationship between
records is one-to-many: one parent node can have many child nodes. In a hierarchical
database model, data is stored as records that are linked in a tree-like structure through
parent and child levels. Each record has only one parent. The first record of the data
model is the root record.

In the above example, College is the root node, and the root node has two children. The
root record is always on Level 0 and is the first element to be traversed in the data
model. The children of the root record are on Level 1 and have the root as their parent.
The next level is Level 2, and so on.


Example:

The hierarchical model is based on the parent-child relationship. In this model, one
parent entity has several child entities. At the top, there is only one entity, which is
called the root.

For example, an organization is the parent entity, called the root, and it has several
child entities such as clerk, officer, and many more.

Features of a Hierarchical Model

1. One-to-many relationship: The data is organized in a tree-like structure in which
the relationship between the data types is one-to-many. Also, there can be only one
path from the root to any node. Example: in Example-2, if we want to go to the
Faculty node, there is only one path to reach it, i.e., through the Department node.


2. Parent-Child Relationship: Each child node has a parent node but a parent
node can have more than one child node. Multiple parents are not allowed.

3. Deletion Problem: If a parent node is deleted, then its child nodes are
automatically deleted.

4. Pointers: Pointers are used to link the parent node with the child node and are
used to navigate between the stored data. Example: in Example-2 the
'Department' node points to three other nodes: the 'Course', 'Faculty' and
'Student' nodes.

Advantages of Hierarchical Model

• It is very simple and fast to traverse a tree-like structure.

• Any change in the parent node is automatically reflected in the child node, so the
integrity of data is maintained.

Disadvantages of Hierarchical Model

• Complex relationships are not supported.

• The model does not allow a child node to have more than one parent, so a
relationship in which a child node needs two parent nodes cannot be represented
using this model.

• If a parent node is deleted then the child node is automatically deleted.
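
The deletion problem can be seen in a small sketch. It assumes a tree stored as a Python dictionary that maps each node to the list of its children; the node names follow the earlier example and the function name is hypothetical.

```python
# Hypothetical hierarchy stored as parent -> list of children.
tree = {
    "College": ["Department"],
    "Department": ["Course", "Faculty", "Student"],
    "Course": [],
    "Faculty": [],
    "Student": [],
}


def delete_subtree(tree, node):
    """Delete a node and, recursively, every descendant below it.
    Returns the names of all records that were removed."""
    removed = []
    # Iterate over a copy: the child lists are modified during deletion.
    for child in list(tree.get(node, [])):
        removed.extend(delete_subtree(tree, child))
    tree.pop(node, None)
    # Detach the node from its parent's list of children.
    for children in tree.values():
        if node in children:
            children.remove(node)
    removed.append(node)
    return removed


removed = delete_subtree(tree, "Department")
```

Deleting the Department record also removes the Course, Faculty and Student records, even though nobody asked for them to be deleted.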

2. Network Model
This model is an extension of the hierarchical model. It was the most popular model
before the relational model. It is similar to the hierarchical model; the main difference
is that a record can have more than one parent. The network model was developed to
overcome the limited scope of the hierarchical model. It replaces the hierarchical tree
with a graph.

In Network Model, multiple parent-child relationships are used. The network model uses
a network structure, which is a data structure of nodes and branches.

Unlike the hierarchical model, this model makes no strict distinction between parent and
child nodes. Each node may be related to more than one node. In this model, directed
graphs are used instead of a tree structure to represent the structure of the database.

The main difference between the network model and the hierarchical model is that the
network model permits a child node to have more than one parent node, whereas the
hierarchical model does not allow a child node to have multiple parent nodes.

Example-1:

Example-2:

The network model for the ‘UNIVERSITY’ system is shown in the above figure. The
‘Mathematics Department’ node is associated with the ‘Computer Department’ node.
Similarly, the ‘Computer Lab’ and ‘Library’ nodes are associated with both the
‘Mathematics Department’ and ‘Computer Department’ nodes.
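
A record with more than one parent can be sketched with a small graph. This is an illustration in Python only: the dictionary layout and the function name are hypothetical, and treating ‘University’ as the single root above both departments is an assumption, since the figure is not reproduced here.

```python
# Each node lists its parent (owner) nodes; a node may have several.
parents = {
    "University": [],
    "Mathematics Department": ["University"],
    "Computer Department": ["University"],
    "Computer Lab": ["Mathematics Department", "Computer Department"],
    "Library": ["Mathematics Department", "Computer Department"],
}


def paths_from_root(node, parents):
    """Return every path that leads from a root (a node without parents)
    down to the given node."""
    if not parents[node]:
        return [[node]]
    paths = []
    for parent in parents[node]:
        for path in paths_from_root(parent, parents):
            paths.append(path + [node])
    return paths


library_paths = paths_from_root("Library", parents)
```

Here the Library record can be reached through either department, which a strict tree would not allow.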

Features of a Network Model

1. Ability to manage more relationships: In this model there are more relationships,
so the data is more closely related. This model can manage one-to-one
relationships as well as many-to-many relationships.

2. Many paths: As there are more relationships so there can be more than one path
to the same record. This makes data access fast and simple.

3. Circular Linked List: The operations on the network model are done with the help
of the circular linked list. The current position is maintained with the help of a
program and this position navigates through the records according to the
relationship.

Advantages of Network Model



• The data can be accessed faster as compared to the hierarchical model. This is
because the data is more related in the network model and there can be more
than one path to reach a particular node. So the data can be accessed in many
ways.

• As there is a parent-child relationship, data integrity is present. Any change in the
parent record is reflected in the child record.

Disadvantages of Network Model

• As more and more relationships need to be handled, the system may become
complex. A user must therefore have detailed knowledge of the model in order to
work with it.

• Any change like updating, deleting, inserting is very complex.

3. Relational Model

Relational Model is the most widely used model. In this model, the data is maintained in
the form of a two-dimensional table. All the information is stored in the form of rows and
columns. The basic structure of a relational model is tables. So, the tables are also
called relations in the relational model.

The most popular data model in DBMS is the relational model. This model was first
described by E. F. (Edgar Frank) Codd in 1970. The relational data model is widely used,
primarily by commercial data processing applications, and is considered one of the most
important developments in database technology.

In the relational model, data is organized in terms of rows and columns in a table known
as a relation. Each table consists of rows, also known as tuples. A tuple represents a
collection of information that describes a person, place or thing, for example student roll
number, student name, student course, etc. The columns are also known as attributes.
An attribute represents a characteristic of a person, place or thing, for example the
Salary attribute in the example given below.

The number of tuples in a relation determines its cardinality and the number of attributes
in a relation determines its degree.
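
Cardinality and degree can be read off a table directly. The sketch below assumes Python with the built-in sqlite3 module; the employee table and its four rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empno INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "xyz", 20000), (102, "Abc", 40000),
     (103, "Sam", 35000), (104, "Smith", 45000)],
)

# Cardinality: the number of tuples (rows) in the relation.
cardinality = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

# Degree: the number of attributes (columns) in the relation.
# cursor.description holds one entry per result column.
cursor = conn.execute("SELECT * FROM employee")
degree = len(cursor.description)
```

For this sample relation the cardinality is 4 (four tuples) and the degree is 3 (three attributes).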

The relational database relates or connects data in different tables through the use of a
common field or attribute.
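
Connecting two tables through a common attribute can be sketched as a join. The example assumes Python with the built-in sqlite3 module; the department and employee tables, their rows and the shared dept_id column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE department (
        dept_id   INTEGER PRIMARY KEY,
        dept_name TEXT
    );
    CREATE TABLE employee (
        empno   INTEGER PRIMARY KEY,
        name    TEXT,
        dept_id INTEGER REFERENCES department(dept_id)  -- the common attribute
    );
    INSERT INTO department VALUES (1, 'Accounts');
    INSERT INTO department VALUES (2, 'Sales');
    INSERT INTO employee VALUES (101, 'xyz', 1);
    INSERT INTO employee VALUES (102, 'Abc', 2);
    INSERT INTO employee VALUES (103, 'Sam', 1);
    """
)

# The two relations are connected wherever their dept_id values match.
rows = conn.execute(
    "SELECT e.name, d.dept_name"
    " FROM employee AS e JOIN department AS d ON e.dept_id = d.dept_id"
    " ORDER BY e.empno"
).fetchall()
```

Each employee row is paired with the department row that carries the same dept_id value.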

Example: In the given example, we have an Employee table.

The most popular and extensively used data model is the relational data model. It allows
data to be stored in tables called relations. Relations can be normalized, and the values
stored in normalized relations are atomic values. Each row in a relation is called a tuple,
and each tuple is unique. The attributes are the columns, and the values in each column
come from the same domain.

Popular examples of standard relational databases include Microsoft SQL Server, Oracle
Database, MySQL and IBM DB2.

The main highlights of this model are:

• Data is stored in tables called relations.

• Relations can be normalized.

• In normalized relations, values saved are atomic values.

• Each row in a relation is unique.

• Each column in a relation contains values from the same domain.


Features of Relational Model

• Tuples: Each row in the table is called tuple. A row contains all the information
about any instance of the object. In the above example, each row has all the
information about any specific individual like the first row has information about
John.

• Attribute or field: Attributes are the properties that define the table or relation.
The values of an attribute should come from the same domain. In the above
example, the employee has different attributes such as Salary, Mobile_no, etc.

Advantages of Relational Model

• Simple: This model is simpler than the network and hierarchical models.

• Scalable: This model can be easily scaled, as we can add as many rows and
columns as we want.

• Structural Independence: We can make changes to the database structure without
changing the way the data is accessed. When changes to the database structure do
not affect the ability of the DBMS to access the data, structural independence has
been achieved.
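
Structural independence can be illustrated by changing a table's structure and re-running an unchanged query. This sketch assumes Python with the built-in sqlite3 module; the table layout and the sample row for John are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (empno INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (101, 'John', 20000)")

# The application's query, written once and never changed.
QUERY = "SELECT name, salary FROM employee WHERE empno = 101"

before = conn.execute(QUERY).fetchone()

# Structural change: a new attribute is added to the relation.
conn.execute("ALTER TABLE employee ADD COLUMN mobile_no TEXT")

after = conn.execute(QUERY).fetchone()
```

The query returns the same result before and after the column is added, because it names the attributes it needs rather than relying on how the rows are laid out.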

Disadvantages of Relational Model

• Hardware Overheads: To hide the complexities and make things easier for the
user, this model requires more powerful computers and data storage devices.

• Bad Design: The relational model is very easy to design and use, and users do not
need to know how the data is stored in order to access it. This ease of design can
lead to a poorly designed database, which slows down as the database grows.

4. Object-Oriented Model

The object-oriented model is based on a collection of objects. An object contains values
stored in variables within the object. An object also contains code that operates on the
object. This code is called methods. Objects that contain the same types of values and
the same methods are grouped together into classes. A class may be viewed as a
definition for objects.

The only way in which one object can access the data of another object is by invoking a
method of that other object. This action is called sending a message to the object. The
object-oriented data model is well suited to data such as video, graphical files and
audio.

In this model, both the data and the relationships are present in a single structure known
as an object. We can store audio, video, images, etc. in the database, which is harder to
do in the relational model. Although you can store audio and video in a relational
database, it is generally advised not to. In this model, two or more objects are connected
through links. We use a link to relate one object to other objects. This can be understood
from the example given below.

In the above example, we have two objects Employee and Department. All the data and
relationships of each object are contained as a single unit. The attributes like Name,
Job_title of the employee and the methods which will be performed by that object are
stored as a single object. The two objects are connected through a common attribute i.e
the Department_id and the communication between these two will be done with the help
of this common id.
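
The Employee and Department objects described above can be sketched as two classes. This is an illustration in Python, not the interface of any object database: the class layout, method names and sample values are hypothetical, and the link is held as an object reference alongside the common Department_id.

```python
class Department:
    """An object bundles its data (attributes) with the code that operates on it."""

    def __init__(self, department_id, name):
        self.department_id = department_id
        self.name = name

    def describe(self):
        return f"{self.name} (id {self.department_id})"


class Employee:
    def __init__(self, name, job_title, department):
        self.name = name
        self.job_title = job_title
        # Link to another object; the common attribute is its department_id.
        self.department = department

    def department_id(self):
        return self.department.department_id

    def department_label(self):
        # "Sending a message": the employee does not read the department's
        # data directly but invokes a method of the linked object.
        return self.department.describe()


sales = Department(10, "Sales")
emp = Employee("John", "Manager", sales)
label = emp.department_label()
```

The Employee object reaches the department's data only through the Department object's own method.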

5. Object-Relational Model
An object-relational model is a combination of an object-oriented model and a relational
model. It supports objects, classes, inheritance, etc., just like object-oriented models,
and has support for data types, tabular structures, etc., like the relational data model.

One of the major goals of the object-relational data model is to close the gap between
relational databases and the object-oriented approach used in many programming
languages such as C++, C#, Java, etc.

This model offers many advanced features; for example, we can build complex data types
according to our requirements from the existing data types. The problem with this model
is that it can become complex and difficult to handle.

An object–relational database (ORD), or object–relational database management system
(ORDBMS), is a database management system (DBMS) similar to a relational database,
but with an object-oriented database model: objects, classes and inheritance are
directly supported in database schemas and in the query language.

Example: ORDBMSs include PostgreSQL and Oracle

6. Entity-Relationship Model
The Entity-Relationship Model, or simply ER Model, is a high-level data model. In this
model, we represent the real-world problem in pictorial form to make it easy to
understand. Developers can also understand the system easily just by
looking at the ER diagram, which is the visual tool used to represent an ER
model.

An ER model is the logical representation of data as objects and the relationships among
them. These objects are known as entities, and a relationship is an association among
these entities. A set of attributes describes each entity. The ER model is widely used in
database design and is best suited to the conceptual design of a database.

The following ER diagram syntax has the three components:

1. Entity
2. Attribute
3. Relationship

Syntax:

Example:

In the above diagram, the entities are Teacher and Department. The attributes
of the Teacher entity are Teacher_Name, Teacher_id, Age, Salary, Mobile_Number. The
attributes of the Department entity are Dept_id, Dept_name. The two entities are
connected using a relationship: each teacher works for a department.
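
As a rough sketch of how the Teacher and Department example could later be mapped to relational tables, the following Python code uses the standard sqlite3 module. The column types, the foreign key used for "works for" and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# One table per entity; the "works for" relationship becomes a foreign key.
conn.executescript("""
CREATE TABLE Department (
    Dept_id   INTEGER PRIMARY KEY,
    Dept_name TEXT NOT NULL
);
CREATE TABLE Teacher (
    Teacher_id    INTEGER PRIMARY KEY,
    Teacher_Name  TEXT NOT NULL,
    Age           INTEGER,
    Salary        REAL,
    Mobile_Number TEXT,
    Dept_id       INTEGER NOT NULL REFERENCES Department(Dept_id)
);
""")

# Hypothetical sample data
conn.execute("INSERT INTO Department VALUES (1, 'ECE')")
conn.execute("INSERT INTO Teacher VALUES (100, 'Ravi', 40, 50000, '9999999999', 1)")

rows = conn.execute("""
    SELECT t.Teacher_Name, d.Dept_name
    FROM Teacher t JOIN Department d ON t.Dept_id = d.Dept_id
""").fetchall()
print(rows)
```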

Features of ER Model

• Graphical Representation for better Understanding: It is very easy and simple to
understand, so it can be used by the developers to communicate with the
employees or customers.

• ER diagram is used as a visual tool for representing the model.

• This model helps the database designers to build the database and is widely used
in database design.

Advantages of ER Model

• Simple: Conceptually, an ER model is very easy to build. If we know the relationships
between the attributes and the entities, we can easily build the ER diagram for
the model.

• Effective Communication Tool: This model is widely used by database
designers for communicating their ideas.

• Easy Conversion to any Model: This model maps well to the relational model
and can easily be converted to it by turning the ER model into tables.
It can also be converted to other models such as the network model or the
hierarchical model.

Disadvantages of ER Model

• No industry standard for notation: There is no industry standard for developing
an ER model, so one developer might use notations that are not understood by
other developers.

• Hidden information: Because the ER model is a high-level view, some details
may be lost or hidden.

Levels of Abstraction in a DBMS:
There are three levels of data abstraction in DBMS which reduce the complexity of
the database. They are

1. Internal Level or Physical Schema

2. Logical Level or Conceptual Schema

3. View Level or External Schema

The pictorial representation of three levels of data abstraction is:



1. Internal Level:

The physical schema or internal level is the lowest level of abstraction in a DBMS. It
describes how the data is actually stored, including the complex low-level data structures
and access methods used by the database, and it covers storage for the whole database
system. It is a low-level representation of the database.

The internal level contains the database storage files and binary files that make up the
actual storage of the database system. It depends on the hardware and operating system
of the machine.

The entire database is described in detail at this level, which makes it the most complex
level to understand. For example, customer information appears as tables at the logical
level, but at this level it is stored as blocks of storage whose sizes are measured in
bytes, kilobytes, megabytes, gigabytes and so on.

The database developer decides how the data is to be stored in the database. If indices are
to be created over the data, that is also decided at this level by the database developer.

2. Logical Level:

The logical level or conceptual schema is the intermediate level, one level above the
internal level. It describes what data is stored and the relationships that exist among
the stored data.

It describes the entire data: which tables are to be created and what the links between
these tables are. This level is less complicated than the physical level, although some
complexity remains. It is used by database administrators and developers.

In short, the logical level contains fields and attributes along with their data types and the
relationships among the attributes, which can be logically implemented.

Example: Suppose we use the relational model for storing the
data. To store the data of a student, the columns in the student table will be
student_name, age, mail_id, roll_no etc. We have to define all these at this level while
we are creating the database. Although the data itself is stored in the database, the structure
of tables such as the Student table or Employee table is defined here at the
conceptual (logical) level. How the tables are related to each other is also defined
here.

So, overall, the logical level contains tables (fields and attributes) and the
relationships among table attributes.

Example: Take the example of the university database. We need to store data about
Faculty and students.

At the logical level, we will define the Faculty table, which
contains FACULTY_ID, NAME, SALARY, and the Student table, which
contains STUDENT_ID, NAME, COURSE, PROJECT_NAME, PROJECT_GUIDE.

Here we define the structure of the database and relationships among the data.

3. View Level:
The view level or external schema is the third and highest level of data abstraction. There
can be many different views at this level, and each view defines only a part of the entire
data. This level interacts with the user, since it provides multiple views of the same
database. The view level can be used by all the users of the database. It is the least
complex of the three levels and the easiest to understand. It describes the user's
interaction with the database system and tells the application how the data should be
shown to the user.

Example: If a student has a login-id and password in a university system, then the
student can view marks, attendance, fee structure, etc. But the faculty of the
university will have a different view, with options like salary, edit marks of a
student, enter attendance of the students, etc. So, the student and the faculty have
different views.

By doing so, the security of the system also increases. In this example, the student can't
edit his marks but the faculty who is authorized to edit the marks can edit the student's
marks.

Similarly, the dean of the college will have some more authorization and he will access
his view. So, different users will have a different view according to the authorization they
have.
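
The idea of several views over the same stored data can be sketched with SQL views, here run through Python's sqlite3 module. The table, the two view names and the sample rows are hypothetical; a real system would also use authorization (for example GRANT and REVOKE), which SQLite does not provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student_record (
    roll_no    INTEGER PRIMARY KEY,
    name       TEXT,
    marks      INTEGER,
    attendance INTEGER,
    fee_due    REAL
);
INSERT INTO student_record VALUES (1, 'Anil', 82, 90, 0), (2, 'Bina', 91, 95, 1500);

-- Each view exposes only part of the same underlying table.
CREATE VIEW student_portal_view AS
    SELECT name, marks, attendance, fee_due FROM student_record;
CREATE VIEW faculty_view AS
    SELECT roll_no, name, marks, attendance FROM student_record;
""")


def columns(view_name):
    """Return the column names that a view exposes."""
    cur = conn.execute(f"SELECT * FROM {view_name}")
    return [d[0] for d in cur.description]


student_cols = columns("student_portal_view")
faculty_cols = columns("faculty_view")
print(student_cols)
print(faculty_cols)
```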

Data independence
Data independence is the ability to modify the schema (the design of the database) at one
level of the database system without having to change the schema at the next higher level.
For example, the internal schema can be changed without affecting the conceptual schema,
and the conceptual schema can be changed without affecting the external schemas.

Data independence is an important characteristic of a DBMS because it allows the structure
of the database to be changed without changing the application programs that use
the database.

Data independence is one of the greatest advantages of a database. It means that we can
change the structure of the database at one level without affecting the data required by
users and programs at the levels above it. This feature was not available in the
file-oriented approach.

There are two types of data independence:

1. Physical data independence

2. Logical data independence

1. Physical data independence:

Physical data independence is the ability to change the physical schema (internal schema)
without changing the logical schema and without causing application programs to be
rewritten. Modifications at the physical level are occasionally necessary to improve
performance.

For example:

A change to the internal level, such as using a different storage device, should be possible
without having to change the logical level or the view level.

More generally, physical data independence means that the file organization, storage
structures, storage devices or indexing strategy can be changed without affecting the
logical (conceptual) level.
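
A small sketch of physical data independence, using Python's sqlite3 module: an index is added (a change at the physical level) and the same query returns the same answer. The table, index name and sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(101, "Asha", "Hyderabad"), (102, "Ravi", "Chennai"), (103, "Meena", "Hyderabad")],
)

# The application's query is written once, against the logical schema only.
query = "SELECT name FROM customer WHERE city = ? ORDER BY name"
before = conn.execute(query, ("Hyderabad",)).fetchall()

# Physical-level change: add an index. The query text is not touched.
conn.execute("CREATE INDEX idx_customer_city ON customer(city)")
after = conn.execute(query, ("Hyderabad",)).fetchall()

print(before == after)
```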

2. Logical data independence:



Logical data independence is the ability to change the logical level (conceptual schema)
without changing the view level (external schema) and without causing application
programs to be rewritten. Modifications at the logical level are necessary whenever the
logical structure of the database is altered.

Logical data independence is more difficult to achieve than physical data independence
because application programs depend heavily on the logical structure of the data that
they access.

Logical data independence means that if we add new columns to a table, or remove columns
that a view or program does not use, the user views and programs should not have to change.

For example: Consider two users A and B. Both select the fields "Employee
Number" and "Employee Name". If user B adds a new column (e.g. salary) to the table, it
will not affect the external view for user A, even though the conceptual schema of the
database has changed for both users.
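
The example of users A and B can be sketched with Python's sqlite3 module: user A works through a view, a salary column is then added to the base table, and the view's result is unchanged. The table, view name and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, emp_name TEXT);
INSERT INTO employee VALUES (1, 'Asha'), (2, 'Ravi');

-- External view used by user A
CREATE VIEW user_a_view AS SELECT emp_no, emp_name FROM employee;
""")

before = conn.execute("SELECT * FROM user_a_view ORDER BY emp_no").fetchall()

# Logical-level change made for user B: a new column is added to the table.
conn.execute("ALTER TABLE employee ADD COLUMN salary REAL")

after = conn.execute("SELECT * FROM user_a_view ORDER BY emp_no").fetchall()
print(before == after)
```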

The changes in the logical level may include:

1. Changing the data definition.

2. Adding, deleting, or updating an attribute, entity or relationship in the database.

Examples of changes under Logical Data Independence

Due to logical data independence, none of the changes below affects the external level.

1. Adding, modifying or deleting an attribute, entity or relationship is possible without
   rewriting existing application programs.
2. Merging two records into one.

Difference between Physical and Logical Data Independence

1. Logical: Mainly concerned with the structure of the data or with changing the data
   definition.
   Physical: Mainly concerned with the storage of the data.

2. Logical: It is concerned with the structure of the data.
   Physical: It is concerned with the storage of the data.

3. Logical: It is very difficult, as the retrieval of data mainly depends on the logical
   structure of the data.
   Physical: It is easy to retrieve.

4. Logical: Ideally, application programs need not be changed if new fields are added to or
   deleted from the database.
   Physical: Physical data independence is concerned with a change of the storage device.

5. Logical: Compared to physical independence, it is difficult to achieve logical data
   independence.
   Physical: Compared to logical independence, it is easy to achieve physical data
   independence.

6. Logical: In practice, application programs that use an added or deleted field may
   still need changes.
   Physical: A change in the physical level usually does not need a change at the
   application program level.

7. Logical: Modification at the logical level is significant whenever the logical structures
   of the database are changed.
   Physical: Modifications made at the internal level may or may not be needed to improve
   the performance of the structure.

8. Logical: Concerned with the conceptual schema.
   Physical: Concerned with the internal schema.

9. Logical: Example: Add/Modify/Delete a new attribute.
   Physical: Example: change in compression techniques, hashing algorithms, storage
   devices, etc.

Structure of a DBMS:

[Figure: System structure]

DBMS (Database Management System) acts as an interface between the user and the
database. The user requests the DBMS to perform various operations such as insert,
delete, update and access on the database.

The components of DBMS perform these requested operations on the database and
provide necessary data to the users.

The Structure of DBMS can be classified into four components. They are:

1. DBMS Users

2. Query processor

3. Storage manager

4. Disk storage

1. DBMS users:

According to the structure of a DBMS, the database users are:

a) Naïve users

b) Application programmers

c) Sophisticated users

d) Database administrator

a) Naive users or end users (tellers, agents, web users):

Naive users are unsophisticated users who interact with the system by using permanent
application programs.

For example: Users of an ATM (automated teller machine).

For example: A bank teller who needs to transfer Rs. 5000 from account A to account
B invokes a program called transfer. This program asks the teller for the amount of money
to be transferred, the account from which the money is to be transferred, and the account
to which the money is to be transferred.

Naive users use application interfaces:

An application programming interface (API) is a software interface that connects
computers or computer programs. It is an interface that makes information accessible to
programs, such as weather-forecast data.

b) Application programmers:
Application programmers are computer professionals who write application programs.
Application programmers can choose software tools to develop user interfaces.

Application programmers are the developers who interact with the database by means
of DML queries. These DML queries are embedded in application programs written in
languages such as C, C++, Java, Pascal, etc.

Application programmers write application programs:

An application program is designed to carry out a specific task for end users; it
performs a particular function directly for the user.

There are many different types of Application programs such as:

• Games
• Accounting software
• Graphics software
• Media players, etc.

c) Sophisticated users:
Sophisticated users interact with the system without writing program. Instead, they form
their requests in a database query language. They submit each such query to a query
processor, whose function is to break down DML statements into instructions that the
storage manager understands. Analysts who submit queries to explore data in the
database fall in this category.

Example: Specialized users, business analyst and scientists

Sophisticated users use the query tools of systems such as:

1. MySQL

2. Oracle RDBMS

d) Database administrator:

A database administrator (DBA) can be an individual or a group of people. The DBA is in
complete charge of the database and has control over both the application and the database.
The DBA is responsible for everything related to the database, including implementing the
database system within an organization. The DBA has all the system privileges allowed by
the DBMS and can assign (grant) and remove (revoke) levels of access (privileges) to and
from other users. The DBA is also responsible for the evaluation, selection and
implementation of the DBMS package.

2. Query processor:

The query processor translates statements in a query language into low-level instructions
that the database manager understands. (It may also attempt to find an equivalent but more
efficient form.) The query processor simplifies and facilitates access to data. It interprets
the requests (queries) received from end users via an application program into instructions,
and it executes the user request received from the DML compiler.

The query processor includes the following components:

1. Application program object code
2. Compiler and linker
3. DML queries
4. DML compiler
5. DDL interpreter
6. Query evaluation engine

1. Application program and object code:


An application program is designed to carry out a specific task and to be used by end
users. Word processors, media players, and accounting software are examples. An
application program is a comprehensive, self-contained program that performs a particular
function directly for the user. Among many others, application programs include:

• Games
• Accounting software
• Graphics software
• Media players

Object code:

Object code is a set of instruction codes that is understood by a computer at the lowest hardware
level. Object code is usually produced by a compiler that reads some higher level computer
language source instructions and translates them into equivalent machine language instructions.

2. Compiler and linker:

Compiler: The compiler converts code written in a human-readable programming
language into a machine-code representation that is understood by the processor. This
step creates object files: a compiler generates object code files (machine language) from
source code.

Linker: The linker is a program that combines the object modules of a program into a
single executable file. Linkers are also called link editors. Linking is the process of
collecting and combining pieces of code and data into a single file.

3. DML queries:

Data Manipulation Language (DML) queries are used to manipulate the data itself. DML
commands modify or manipulate the data records present in the database tables. The basic
DML operations are data insertion (INSERT), data update (UPDATE), data removal
(DELETE) and data querying (SELECT).

The following DML commands can be used for DML queries:

1. SELECT: Command to fetch data or values from the database.

2. INSERT: Command to add new values to the database.

3. UPDATE: Command to change existing data in the database to a newer value.

4. DELETE: Command to remove data from a table in the database.

5. MERGE: Command to insert, update or delete rows in a target table based on the rows of
   a source table, in a single statement.
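
The four basic DML commands can be tried out with Python's built-in sqlite3 module. This is a minimal sketch; the student table and its rows are hypothetical, and MERGE is left out because SQLite does not support it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# INSERT: add new rows
conn.execute("INSERT INTO student VALUES (1, 'Anil', 20)")
conn.execute("INSERT INTO student VALUES (2, 'Bina', 21)")

# UPDATE: change an existing value
conn.execute("UPDATE student SET age = 22 WHERE roll_no = 2")

# DELETE: remove a row
conn.execute("DELETE FROM student WHERE roll_no = 1")

# SELECT: fetch what is left
rows = conn.execute("SELECT roll_no, name, age FROM student").fetchall()
print(rows)
```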

4. DML Compiler and linker:

DML compiler: The DML (Data Manipulation Language) compiler translates DML
statements written in a query language into low-level instructions that the query
evaluation engine understands.

Linker: A linker can also link a particular module into a system library. It takes object
modules from the assembler as input and forms an executable file as output for the loader.

5. DDL interpreter:
The DDL interpreter interprets DDL statements and records the definitions in the data
dictionary. (By contrast, the DML compiler translates DML statements in a query language
into an evaluation plan consisting of low-level instructions that the query evaluation
engine understands.)

6. Query evaluation engine:

The query evaluation engine executes the low-level instructions generated by the DML
compiler. In this way it carries out the requests (queries) that end users submit through
an application program.

3. Storage Manager:

The storage manager is important because databases typically require a large amount of
storage space. A storage manager is a program module that provides the interface
between the low-level data stored in the database and the application programs and
queries submitted to the system. The storage manager is responsible for the interaction
with the file manager. The raw data are stored on the disk using the file system, which is
usually provided by a conventional operating system. The storage manager translates the
various DML statements into low-level file-system commands.

The storage manager components include the following:

1. Buffer Manager

2. File Manager

3. Authorization and Integrity Manager

4. Transaction Manager

1. Buffer Manager:

The buffer manager is responsible for fetching data from disk storage (secondary storage)
into main memory and for deciding what data to cache in main memory. It is a critical
part of the database system, since it enables the database to handle data sizes that are
much larger than the size of main memory.

2. File manager:

The file manager manages the allocation of space on disk storage and the data structures
used to represent information stored on disk.

3. Authorization and Integrity Manager:


The authorization and integrity manager checks the authority of users to access data and
tests for the satisfaction of integrity constraints whenever the database is modified.

4. Transaction Manager:

A transaction is a collection of operations that performs a single logical function in a
database application. Each transaction is a unit of both atomicity and consistency. Thus,
we require that transactions do not violate any database-consistency constraints. That is,
if the database was consistent when a transaction started, the database must be consistent
when the transaction successfully terminates.

The transaction manager ensures that the database remains in a consistent (correct)
state despite system failures (e.g., power failures and operating system crashes), and that
concurrent transaction executions proceed without conflicting.
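
Atomicity can be sketched with a money transfer in Python's sqlite3 module: either both updates happen or neither does. The account table, the CHECK constraint standing in for a consistency rule, and the amounts are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (acc_id TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # consistency rule: no negative balance
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 6000), ("B", 1000)])
conn.commit()


def transfer(src, dst, amount):
    """Move money as one transaction; return True if it committed."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances():
    return dict(conn.execute("SELECT acc_id, balance FROM account"))


first = transfer("A", "B", 5000)   # enough money in A
second = transfer("A", "B", 5000)  # would make A negative, so it is rolled back
final = balances()
print(first, second, final)
```

In the second call the credit to B is executed first and then undone, because the debit from A violates the constraint and the whole transaction is rolled back.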

4. Disk storage:

Disk storage holds the data files, indices, data dictionary and statistical data.

Disk storage includes the following components:

1. Data dictionary
2. Indices
3. Data files
4. Statistical data

1. Data dictionary:
It contains all the information about the database. As the name suggests, it is
the dictionary of all the data items. It contains a description of all
the tables, views, materialized views, constraints, indexes, triggers, etc., that is,
information about the structure of every database object.
It is the repository of metadata, that is, data about the data.
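
Many systems let us query the data dictionary like an ordinary table. As a sketch, SQLite keeps its catalog in a table called sqlite_master, which can be read through Python's sqlite3 module. The objects created here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT);
CREATE INDEX idx_student_name ON student(name);
CREATE VIEW student_names AS SELECT name FROM student;
""")

# The catalog records every object that the DDL statements defined.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()
print(catalog)
```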

2. Indices:
Indices provide faster retrieval of data items.
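
Whether a query actually uses an index can be checked by asking the DBMS for its plan. The sketch below uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module; the table and index names are hypothetical, and the exact wording of the plan text differs between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)", [(1, "Anil"), (2, "Bina"), (3, "Chitra")]
)
conn.execute("CREATE INDEX idx_student_name ON student(name)")

plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT roll_no FROM student WHERE name = ?", ("Bina",)
).fetchall()

# The last column of each plan row is a human-readable description of the step.
plan_text = " ".join(str(row[-1]) for row in plan_rows)
print(plan_text)
```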

3. Data files:
Data files store the real data of the database. They can be kept on magnetic tapes,
magnetic disks, or optical disks.

4. Statistical data:
Statistical data stores statistical information about the data in the database, such as
the number of records in a table. The query processor can use this information to choose
an efficient way of executing a query.

INTRODUCTION TO DATABASE DESIGN

Database Design and ER Diagrams:

Database Design:
Database design is the process of constructing a stable database structure from an analysis
of user requirements. It is considered the most important task when adopting the database
approach. Database design groups the fields into different files and then establishes
meaningful associations between those files in a way that minimizes the response time
when the database is accessed and manipulated. To accomplish this, one
should look at the user requirements and find the correct means of logically representing
them. Once the basic data needs are identified, the conceptual data model (logical level)
can be created.

The bad database design may lead to:

❖ Repetition of information
❖ Inability to represent certain information

The design goals of relational database design are:

❖ Avoid redundant data.
❖ Ensure that relationships among attributes are represented.
❖ Facilitate the checking of updates for violation of database integrity constraints.

The database design process can be divided into six steps. They are

1. Requirement Analysis
2. Conceptual Design
3. Logical Design
4. Schema Refinement
5. Physical Design
6. Security Design.

1. Requirement Analysis:

The very first step in designing a database application is to understand what data is to be
stored in the database, what applications must be built on top of it, and what operations
are most frequent and subject to performance requirements.

The information gathered in the requirements analysis step is used to develop a
high-level description of the data to be stored in the database, along with constraints.

2. Conceptual Database Design:

This step is carried out using the Entity-Relationship (ER) model. The goal of this step is
to create a simple description of the data that closely matches how users and developers
think of the data. This facilitates discussion among all the people involved in the design
process, even those who have no technical background. At the same time the initial design
must be sufficiently precise to enable a straightforward translation into a data model
supported by a commercial database system, which in our case is the relational model.

3. Logical Database Design:

In this step, we must choose a DBMS to implement our database design and convert the
conceptual database design into a database schema in the data model of the chosen
DBMS. We will consider only relational DBMS and therefore, the task in the logical design
step is to convert an ER schema into relational database schema.

4. Schema Refinement:

The fourth step in database design is to analyze the collection of relations in the
relational database schema to identify potential problems and to refine it.

A schema can be defined as a complete description of the database. The specifications for
the database schema are provided during database design, and this schema does not
change frequently.

5. Physical Database Design:


This step may simply involve building indexes on some tables and clustering some tables,
or it may involve a substantial redesign of parts of the database schema obtained from the
earlier design steps.

6. Security Design:

Security is an important issue in database management because information stored in a
database is very valuable. So the data in a database management system needs to be
protected from unauthorized access and updates.

Entity-Relationship (ER) Diagrams:


An ER diagram (Entity Relationship Diagram, also known as ERD) is a diagram that
displays the relationships of the entity sets stored in a database. In other words, ER
diagrams help to explain the logical structure of databases. ER diagrams are created based
on three basic concepts: entities, attributes and relationships.

ER diagrams use rectangles to represent entities, ovals to define attributes and diamond
shapes to represent relationships.

At first look, an ER diagram looks very similar to a flowchart. However, an ER diagram
includes many specialized symbols, and their meanings make this model unique. The
purpose of an ER diagram is to represent the entity framework infrastructure.

Following are the main components of ER diagrams and their symbols:

➢ Rectangles: represent entity types
➢ Ellipses: represent attributes
➢ Diamonds: represent relationship types
➢ Lines: link attributes to entity types and entity types to relationship types
➢ Underlined attributes: represent primary keys
➢ Double ellipses: represent multi-valued attributes

ER Diagram Examples

For example, in a University database, we might have entities for Students, Courses, and
Lecturers. Students entity can have attributes like Rollno, Name, and DeptID. They might
have relationships with Courses and Lecturers.

Entities:
An entity is a real-world object that can be distinguished from all other objects. In an ER
diagram an entity is represented by a rectangle, called the entity box, with the name of
the entity written in the centre of the rectangle. When an ER diagram is mapped to the
relational model, an entity is mapped to a relational table in which each row represents an
entity instance.

Example: Professors, Students, Courses, Departments, etc. are some of the entities of
a College Management System.

For example, each person in an enterprise is an entity. An entity has a set of properties,
and the values of some of these properties may uniquely identify the entity.

For example, the value 101 of the customer-id property uniquely identifies one particular
customer.

An entity in an ER model is a real-world thing having properties called attributes. An
entity can be a place, person, object, event or concept about which data is stored in the
database. Every attribute is defined by its set of permitted values, called its domain.
For example, in a school database, a student is considered an entity. A student has
various attributes like name, age, class, etc. Examples of an entity are a single person, a
single product, or a single organization.

Any particular row (a record) in a relation (table) is known as an entity, for example the
record of student ‘xyz’.

Examples of entity:

• Person: Employee, Student, Patient
• Place: Store, Building
• Object: Machine, Product, Car
• Event: Sale, Registration, Renewal
• Concept: Account, Course

For example, in a College database, the entities can be Professors, Students, Courses, etc.
Entities have attributes, which can be considered as properties describing them. For
example, for the Professor entity, the attributes are Professor_Name, Professor_Address,
Professor_Salary, etc. The attribute values get stored in the database.

Example of Entities:
A university may have some departments. All these departments employ various lecturers
and offer several programs.

Some courses make up each program. Students register in a particular program and enroll
in various courses. A lecturer from the specific department takes each course, and each
lecturer teaches a different group of students.

Example: Professors, Students, Courses, Buildings, Departments, etc are some of the
entities of a College Management System.

Attributes:
An entity is represented by a set of attributes. Attributes are descriptive properties of an
entity. An attribute is defined as a property that describes a characteristic feature of
a particular entity. It can also be defined as a qualifier that provides additional
information about the entity. Generally, an attribute is an atomic value or unit of
information associated with an entity, and some attributes help in uniquely identifying it.

Attributes are the properties which define the entity type. For example, Roll_No, Name, DOB,
Age, Address, Mobile_No are the attributes which define the entity type Student.

In Entity-Relationship diagrams, attributes are represented by ellipse and the name of


attributes written inside the ellipse. Each of the attribute is linked with the respective
entity. For example, the attributes associated with the ‘Employee’ entity include
‘Empcode’, ‘Empname’, ‘Phone-no’, ‘Address’ and ‘Age’.

For each attribute associated with an entity set, we must identify a domain of possible
values. For example, the domain associated with the attribute name of ‘Employee’ might
be the set of 20-character strings. As another example, the domain associated with
‘Employee number’ might consist of the integers 1 through 10.
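The idea of a domain can be sketched in Python as a membership check. This is only an illustration: the function names are hypothetical, and reading "20-character strings" as "strings of at most 20 characters" is an assumption.

```python
# Hypothetical domains based on the two examples above.
MAX_NAME_LENGTH = 20                 # assumption: names hold at most 20 characters
EMP_NUMBER_DOMAIN = range(1, 11)     # the integers 1 through 10


def in_name_domain(value):
    """True if value belongs to the domain of the attribute 'name'."""
    return isinstance(value, str) and len(value) <= MAX_NAME_LENGTH


def in_emp_number_domain(value):
    """True if value belongs to the domain of the attribute 'Employee number'."""
    return isinstance(value, int) and value in EMP_NUMBER_DOMAIN
```

A value outside the domain, such as the employee number 11, would be rejected before it is stored.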

Example: stno, stna, branch-name and branch-code are attributes of the entity Student.

Types of Attributes:
1. Simple attribute:
   A simple attribute is composed of a single component with an independent existence
   and cannot be subdivided further. Its value is also called an atomic value.
   Examples: Roll-no, Age, a student’s contact number, the id number of an employee.

2. Single-valued attribute:
   A single-valued attribute holds a single value for each entity instance at a
   particular instant of time. For example, a person cannot have more than one age
   value, so the age of a person is a single-valued attribute.
   Examples: Room-no, Customer-id, Social_Security_Number, the age of a student.

3. Composite attribute:
   A composite attribute is composed of multiple components, each with an independent
   existence, so it can be split into parts.
   Examples:
   1. Name, which is composed of First name, Middle name and Last name.
   2. Address, which is composed of house number, street, city, state, country and
      pincode.

4. Derived attribute:
   A derived attribute represents a value that is derivable from the value of a related
   attribute or set of attributes of the same entity type. It is not stored in the
   physical database; its value is computed from other attributes present in the
   database. For example, the age of an employee should not be stored directly. Instead,
   it should be derived from the DOB of that employee. In an ER diagram, a derived
   attribute is represented by a dashed oval.
   Examples: Age (derived from DOB), total and average marks of a student.

5. Multi-valued attribute:
   A multi-valued attribute holds more than one value for a single entity instance. In an
   ER diagram, a multi-valued attribute is represented by a double oval.
   Examples: the hobbies of a student (reading, music and painting), the phone numbers
   of a student (landline and mobile), email addresses.
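The attribute types listed above can be illustrated with a small Python sketch. The class and field names are hypothetical and chosen only for this example; the point is that the composite attribute is a nested structure, the multi-valued attribute is a list, and the derived attribute is computed rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class FullName:
    """Composite attribute: made of independent components."""
    first: str
    middle: str
    last: str


@dataclass
class Student:
    roll_no: int                                        # simple, single-valued
    name: FullName                                      # composite
    dob: date                                           # stored attribute
    phone_numbers: list = field(default_factory=list)   # multi-valued

    def age(self, today):
        """Derived attribute: computed from dob, never stored."""
        years = today.year - self.dob.year
        # Subtract one year if the birthday has not yet occurred this year.
        if (today.month, today.day) < (self.dob.month, self.dob.day):
            years -= 1
        return years
```

Passing `today` explicitly keeps the calculation repeatable; in practice one might pass `date.today()`.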

Entity Sets:
An entity set is a set of entities of the same type that share the same properties or
attributes. The set of all customers at a given bank can be defined as the entity set
customer.

An entity set is defined as a group of entities that have the same type and attributes. For
example, the employees working in an organization are entities E1, E2, E3, E4, ...,
which have the same attributes, defined under a specific entity type called ‘Employee’.
Here, the group of entities {E1, E2, E3, E4, E5, ...} is referred to as an entity set.

If an entity set contains enough attributes for creating a primary key, then it is termed as
‘Strong entity set’. On the other hand, if an entity set does not contain enough attributes
for creating a primary key then it is termed as ‘Weak entity set’.

An entity set is a group of entities of a similar kind. All the rows of a relation (table) in
an RDBMS together form an entity set. For example, a student name and student ID describe one
‘Student’ entity, and the set of all such entities of the same type is the ‘Student’ entity set.

Entity sets can be divided into two types: a) Strong entity set b) Weak entity set

Strong Entity Set | Weak Entity Set
It always has a primary key. | It does not have enough attributes to build a primary key.
It is represented by a rectangle symbol. | It is represented by a double rectangle symbol.
It contains a primary key, represented by an underline. | It contains a partial key, represented by a dashed underline.
A member of a strong entity set is called a dominant entity. | A member of a weak entity set is called a subordinate entity.
Its primary key is one of its attributes, which helps to identify its members. | Its primary key is a combination of the primary key of the strong entity set and its own partial key.
In the ER diagram, the relationship between two strong entity sets is shown by a diamond symbol. | The relationship between a strong and a weak entity set is shown by a double diamond symbol.
The line connecting a strong entity set with the relationship is single. | The line connecting the weak entity set with the identifying relationship is double.

Relationships and Relationship sets:

Relationships:
A relationship defines an association among two or more entities. Consider two entities,
Student and Class. They can be associated as: Student “studies” in a Class. Here, “studies”
is a relationship between the entities Student and Class. Similarly, in Student “Enrolled”
in a Course, “Enrolled” is a relationship between the entities Student and Course.

Example:

A logical association among entities is called a relationship. A relationship tells how two
or more entities are related. Example: a Professor works for a Department.

A relationship is an association among two or more entities. E.g., Tom works
in the Chemistry department. Entities take part in relationships. We can often identify
relationships with verbs or verb phrases.

For example:
• You are attending this lecture
• I am giving the lecture

Looking only at the entities involved, we can classify relationships according to
relationship types:
• A student attends a lecture
• A lecturer gives a lecture

Example:

In the above diagram, the entities are Teacher and Department. The attributes
of the Teacher entity are Teacher_Name, Teacher_id, Age, Salary and Mobile_Number. The
attributes of the Department entity are Dept_id and Dept_name. The two entities are
connected using the relationship. Here, each teacher works for a department.

Relationship sets:
A relationship set is a set of relationships of the same type. As with entities, we may wish
to collect a set of similar relationships into a relationship set. A relationship set can be
thought of as a set of n-tuples:

{(e1, ..., en) | e1 ∈ E1, ..., en ∈ En}

Each n-tuple denotes a relationship involving n entities e1 through en, where entity ei is
in entity set Ei.
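The n-tuple view can be sketched directly with Python sets. The entity names and the helper function below are hypothetical; the check simply confirms that the i-th component of every tuple comes from the i-th entity set.

```python
# Hypothetical entity sets.
students = {"S1", "S2", "S3"}
courses = {"DBMS", "OS"}

# A binary relationship set "Enrolled": a set of (student, course) tuples.
enrolled = {("S1", "DBMS"), ("S2", "DBMS"), ("S2", "OS")}


def is_valid_relationship_set(relationships, *entity_sets):
    """True if every tuple has one entity from each entity set, in order."""
    return all(
        len(t) == len(entity_sets)
        and all(e in es for e, es in zip(t, entity_sets))
        for t in relationships
    )
```

Here `is_valid_relationship_set(enrolled, students, courses)` holds, while a tuple that mentions an unknown student would fail the check.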

In an ER diagram, a relationship set is denoted by a rhombus (diamond).

Features of ER Diagrams:

• Graphical Representation for Better Understanding: An ER diagram is easy to
understand, so developers can use it to communicate with the stakeholders.
• ER Diagram: The ER diagram is used as a visual tool for representing the model.
• Database Design: This model helps database designers to build the database
and is widely used in database design.

Advantages of ER Diagrams:

• Simple: Conceptually, the ER model is very easy to build. If we know the relationships
between the entities and their attributes, we can easily build the ER diagram for
the model.
• Effective Communication Tool: This model is widely used by database
designers for communicating their ideas.
• Easy Conversion to any Model: This model maps well to the relational model
and can easily be converted to it by turning the ER model into tables. It can also be
converted to other models such as the network model and the hierarchical model.

Disadvantages of ER Diagrams:

• No industry standard for notation: There is no industry standard for developing
an ER diagram, so one developer might use notations which are not understood
by other developers.
• Hidden information: Some information might be lost or hidden in the ER model. As it is a
high-level view, some details might not be shown.

Additional Features of the ER Model:


The ER model has some additional features for expressing common properties of
the data. They are

1. Key Constraints
2. Participation Constraints
3. Weak Entities
4. Class Hierarchies
5. Aggregation


1. Key Constraint:

Key constraints are rules that define what data values are allowed in certain data
columns. They are an important database concept and are part of a database's schema
definition.

In the ER model, certain restrictions must be placed on the number of relationships in which
an entity may take part. These restrictions are called key constraints.

In a database table, constraints are the rules that must be followed while entering data into
the columns. They ensure that the data entered by the user meets the specified condition,
for example, keeping only unique IDs in the employee table or allowing only ages under 18
in the student table.

Example: Key constraint on Manages.

Consider a relationship set called Manages between the Employees and Departments
entity sets such that each department has at most one manager, although a single
employee is allowed to manage more than one department. The restriction that each
department has at most one manager is an example of a key constraint, and it implies that
each Departments entity appears in at most one Manages relationship in any allowable
instance of Manages. This restriction is indicated in the ER diagram by using an arrow
from Departments to Manages.

Example:

A student can study in at most one college at a time. This restriction is called a key
constraint.
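One possible way to enforce the key constraint on Manages is to make the department identifier the key of the Manages table. The sketch below uses Python's built-in sqlite3 module; the table layout, column names (eid, did) and sample rows are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employees (eid INTEGER PRIMARY KEY, ename TEXT);
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT);
-- did is the key of Manages, so each department appears at most once.
CREATE TABLE Manages (
    did INTEGER PRIMARY KEY REFERENCES Departments(did),
    eid INTEGER NOT NULL REFERENCES Employees(eid)
);
""")
conn.executemany("INSERT INTO Employees VALUES (?, ?)", [(1, "Asha"), (2, "Ravi")])
conn.executemany("INSERT INTO Departments VALUES (?, ?)", [(10, "ECE"), (20, "CSE")])


def assign_manager(did, eid):
    """Return True if accepted, False if the key constraint is violated."""
    try:
        conn.execute("INSERT INTO Manages VALUES (?, ?)", (did, eid))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this layout, employee 1 can be assigned to departments 10 and 20, but a second manager for department 10 is rejected.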

2. Participation constraints:
Entities can participate in a relationship either totally or partially. Participation
constraints can be classified into two types.

a) Total Participation
b) Partial Participation

a) Total Participation:

The participation of an entity set E in a relationship set R is said to be total if every entity
in E participates in at least one relationship in R. Otherwise, the participation is said to
be partial.

For example, every student studies in the college (total participation), but only a few of
them participate in other activities such as Games (partial participation).

Diagram:

b) Partial Participation:
If only some entities in E participate in relationships R, the participation of entity set E in
relationship R is said to be partial.

Partial participation can be represented as

Diagram:
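The difference between total and partial participation can be checked on sample data. In this sketch the data and the function name are hypothetical; relationship sets are written as Python sets of tuples.

```python
def participation(entity_set, relationship_set, position):
    """Return 'total' if every entity appears at `position` in some tuple."""
    participating = {t[position] for t in relationship_set}
    return "total" if entity_set <= participating else "partial"


# Hypothetical data: every student studies in a college, only one plays a game.
students = {"S1", "S2", "S3"}
studies_in = {("S1", "C1"), ("S2", "C1"), ("S3", "C1")}
plays = {("S1", "Cricket")}
```

For this data, the participation of students in `studies_in` is total and in `plays` is partial.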

3. Weak Entities:
A weak entity is one that depends on another entity for its existence.
If the existence of entity ‘x’ depends on the existence of entity ‘y’ then ‘x’ is said to be
existence dependent on ‘y’. If ‘y’ is deleted, then ‘x’ is deleted as well. Entity ‘y’ is
said to be a dominant entity, and ‘x’ is said to be a subordinate entity.

Double rectangle represents weak entity set.


Diagram:
A double diamond represents the relationship between a weak entity set and its owner.

A weak entity is an entity that depends on the existence of another entity. In more
technical terms, it can be defined as an entity that cannot be identified by its own
attributes alone. It uses a foreign key combined with its own attributes to form the primary key.

An entity like order item is a good example for this. The order item will be meaningless
without an order so it depends on the existence of order.

A weak entity cannot be identified uniquely on its own, as it does not have sufficient
attributes to form a primary key. It can be made uniquely identifiable by associating it with
another entity set, called the identifying or owner entity set. The owner entity set is said
to own the weak entity set.

The relationship between the two entity sets is called the identifying relationship. The
identifying relationship is always many-to-one from the weak entity set to the owner entity
set, and the participation of the weak entity set in it is total. A weak entity set does not
have a primary key of its own.

Example:

Consider the entity set ‘Loan’ and the entity set ‘Payment’, which keeps information about all
the payments that were made in connection with a loan.
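A weak entity set such as Payment could be mapped to a table as in the sketch below, written with Python's sqlite3 module. The column names, the sample rows and the choice of ON DELETE CASCADE are assumptions for illustration; the notes do not specify a schema. The primary key combines the owner's key (loan_no) with the partial key (payment_no).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE Loan (loan_no INTEGER PRIMARY KEY, amount REAL);
-- Payment is weak: payment_no alone is only a partial key.
CREATE TABLE Payment (
    loan_no INTEGER NOT NULL REFERENCES Loan(loan_no) ON DELETE CASCADE,
    payment_no INTEGER NOT NULL,
    payment_amount REAL,
    PRIMARY KEY (loan_no, payment_no)
);
""")
conn.executemany("INSERT INTO Loan VALUES (?, ?)", [(1, 5000.0), (2, 8000.0)])
conn.executemany(
    "INSERT INTO Payment VALUES (?, ?, ?)",
    [(1, 1, 500.0), (1, 2, 500.0), (2, 1, 800.0)],  # payment_no 1 repeats across loans
)


def payment_count():
    """Number of Payment rows currently stored."""
    return conn.execute("SELECT COUNT(*) FROM Payment").fetchone()[0]


def delete_loan(loan_no):
    """Delete a loan; its dependent payments are removed by the cascade."""
    conn.execute("DELETE FROM Loan WHERE loan_no = ?", (loan_no,))
```

Deleting a loan removes its payments as well, which mirrors the existence dependency of the weak entity on its owner.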

4. Class Hierarchies:
A class hierarchy is a method of classifying entities into sub-classes, i.e., entities can be
derived from a parent class. The entities that represent the subclasses inherit the attributes
of the parent class entity and can also have their own attributes.

For example:
Consider a “Person” entity set as the parent entity with attributes name, address, and age. The
two sub-classes of this entity set are “Student” and “Faculty”. The attributes of “Student” include
attributes of “Person” plus Course and the attributes of “Faculty” include attributes of “Person”
plus Lecture.

Therefore, it can be said that the attributes of “Person” are inherited by “Student” and “Faculty”
and that each of these sub-classes “ISA” Person. It is even possible to classify the entity set
“Person” by a different criterion, for example senior_person, simply by adding a second “ISA”
node to the “Person” entity set.

The subclass-superclass relationship is one of inheritance and is often called an “ISA” relationship
because a member of the subclass “ISA” member of the superclass. The relationship between
superclass and subclass is represented using class hierarchies.
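The ISA relationship resembles class inheritance in a programming language. The following Python sketch mirrors the Person, Student and Faculty example; the field types are assumptions, and the analogy is only illustrative since an ER class hierarchy is a design concept rather than code.

```python
from dataclasses import dataclass, fields


@dataclass
class Person:
    name: str
    address: str
    age: int


@dataclass
class Student(Person):   # Student ISA Person
    course: str


@dataclass
class Faculty(Person):   # Faculty ISA Person
    lecture: str


# Attribute names of Student: those inherited from Person plus its own.
student_attributes = [f.name for f in fields(Student)]
```

Every Student object is also a Person, and `student_attributes` lists name, address and age ahead of course.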

Class Hierarchy Representation:

The following are the two ways of representing a class hierarchy. They are

1. Specialization

2. Generalization

1. Specialization:
Specialization is the process of designating subgroupings within an entity set. Specialization is
a top-down process. It is needed because the entities within an entity set do not all share the
same attributes.

For example:

In a college database, faculty and student entities have some common attributes like name, street
and city. In addition, each has extra attributes: faculty has empid and salary, and student has
studentid and marks.

An entity Person can be defined with the attributes name, street and city, and can further be
subdivided into Student and Faculty. This subgrouping is known as specialization.

It is a process of identifying the subsets of an entity set, each of which has different
characteristic features. In this process, the superclass is defined first, followed by the
definition of the subclasses. After that, the attributes and relationships associated with
these subclasses are defined.


Thus, specialization is the process of defining a set of subclasses of an entity type. This entity
type is called the superclass of the specialization. The set of subclasses forming a
specialization is defined on the basis of some distinguishing characteristic of the entities in
the superclass.


2. Generalization:
Generalization is the reverse of specialization. The design approach may be top-down or
bottom-up. In the top-down approach (specialization), a high-level entity is identified and then
subdivided. In contrast, in the bottom-up approach (generalization), low-level entities are
grouped to form a high-level entity.

For example:

A designer may first identify the attributes of Student and Faculty and then group the common
attributes into a higher-level entity. This is known as generalization. The high-level entity is
called the superclass and a low-level entity is called a subclass.

In this example, the Person entity is a superclass of the Employee and Customer subclass
entities. We can say that the attributes of the Person entity are inherited by the Employee and
Customer entities and that Employee ISA Person.

Example:

Example:


5. Aggregation
It is an abstraction in which relationship sets are treated as higher level entity sets and can
participate in relationships. Aggregation allows us to indicate that a relationship set participates in
another relationship set.

Aggregation can be used to simplify a design by replacing a ternary relationship with
binary relationships. A ternary relationship is a relationship among three entities.
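Aggregation can be pictured as a relationship whose participant is itself a relationship. In the Python sketch below, the entity sets and the relationship names Sponsors and Monitors are hypothetical examples, not taken from these notes; each Monitors tuple pairs an employee with a whole Sponsors tuple.

```python
# Hypothetical entity sets.
departments = {"D1", "D2"}
projects = {"P1", "P2"}
employees = {"E1", "E2"}

# Sponsors: a binary relationship set between departments and projects.
sponsors = {("D1", "P1"), ("D2", "P2")}

# Monitors: relates an employee to an entire Sponsors relationship,
# so Sponsors is treated as a higher-level entity set.
monitors = {("E1", ("D1", "P1")), ("E2", ("D1", "P1"))}


def is_valid_aggregation(outer, entity_set, inner):
    """True if every outer tuple links a known entity to an existing inner relationship."""
    return all(e in entity_set and rel in inner for e, rel in outer)
```

A Monitors tuple that refers to a sponsorship that does not exist, such as ("D2", "P1"), fails the check.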

Aggregation is shown in the image below −


Conceptual Design with the ER Model:


1. Entity versus Attribute:

The main difference between Entity and Attribute is that an entity is a real-world object that
represents data in RDBMS while an attribute is a property that describes an entity.
Relational Database Management System (RDBMS) is a type of database management system
based on the relational model. It helps to store and manage data efficiently to access them easily.
RDBMS stores data in tables or relations. Each table consists of columns and rows. Before
creating a database, it is essential to design a database. An ER diagram helps to accomplish that
task. Entity and Attribute are two concepts related to ER diagrams.

Entity and attribute are among the most common terms in DBMS. The fundamental difference between
them is that an entity is an object that exists in the real world and can be easily
distinguished from all other real-world objects, whereas attributes define the
characteristics or properties of an entity, on the basis of which it can be distinguished
from other entities.

In a relational database, we collect the data in the form of tables. The rows of a table
represent entities of the same type, and the columns of the table represent the attributes of
those entities.
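The row and column view can be seen in a small sqlite3 sketch. The Student table, its columns and its two rows are hypothetical sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read column names from each row

# Columns correspond to attributes of the Student entity type.
conn.execute("CREATE TABLE Student (roll_no INTEGER PRIMARY KEY, name TEXT, age INTEGER)")

# Each inserted row corresponds to one Student entity.
conn.executemany(
    "INSERT INTO Student VALUES (?, ?, ?)",
    [(1, "Asha", 20), (2, "Ravi", 21)],
)

rows = conn.execute("SELECT * FROM Student ORDER BY roll_no").fetchall()
attribute_names = rows[0].keys()
```

Here `rows` holds two entities and `attribute_names` holds the three attributes they share.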

2. Entity versus Relationship:

The main difference between entity and relationship in DBMS is that the entity is a real-world
object while the relationship is an association between the entities. Also, in the ER diagram,
a rectangle represents an entity while a rhombus or diamond represents a relationship.

A Database Management System (DBMS) is a software program that stores, retrieves and
manipulates data in databases. A DBMS can hold multiple databases, and each database
consists of multiple tables. An entity set is stored as a table in a DBMS, and each of its
entities represents a real-world object. The tables are related to each other using
relationships.

3. Binary versus Ternary Relationships:


A binary relationship is one in which two entities participate; it is the most common
relationship degree. A unary relationship is one in which both participants are the same entity.
For example: Subjects may be prerequisites for other subjects.

A ternary relationship is an association among three entities. The ternary relationship construct
is a single diamond connected to three entities. Sometimes a relationship is mistakenly modeled
as ternary when it could be decomposed into two or three equivalent binary relationships.
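Whether a particular set of ternary relationships can be rebuilt from binary ones can be tested on sample data. In the sketch below, the supplier, part and project values and the function names are hypothetical. The ternary set is projected onto its three binary pairs and joined back; if extra tuples appear, the binary relationships do not carry the same information for that data. This checks a single instance only and does not prove that a design is equivalent in general.

```python
def binary_projections(ternary):
    """Project a set of (a, b, c) tuples onto its three binary pairs."""
    ab = {(a, b) for a, b, c in ternary}
    bc = {(b, c) for a, b, c in ternary}
    ac = {(a, c) for a, b, c in ternary}
    return ab, bc, ac


def rejoin(ab, bc, ac):
    """Rebuild (a, b, c) tuples that are consistent with all three pair sets."""
    return {
        (a, b, c)
        for a, b in ab
        for b2, c in bc
        if b == b2 and (a, c) in ac
    }


def is_decomposable(ternary):
    """True if the binary projections reproduce exactly the original tuples."""
    return rejoin(*binary_projections(ternary)) == ternary


# Hypothetical (supplier, part, project) data.
genuinely_ternary = {("S1", "P1", "J2"), ("S1", "P2", "J1"), ("S2", "P1", "J1")}
binary_enough = {("S1", "P1", "J1"), ("S1", "P1", "J2")}
```

For `genuinely_ternary`, the rejoin produces the spurious tuple ("S1", "P1", "J1"), so the ternary relationship is needed; for `binary_enough`, the binary relationships suffice.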

********************************UNIT-1 NOTES COMPLETED*******************************
