
CHAPTER 2

ENTITY RELATIONSHIP MODEL

I. Important terminologies
1.1. Database: A database is a collection of inter-related data, organized in the
form of tables, views, schemas, reports, etc., that supports efficient retrieval,
insertion, and deletion of data. For example, a university database organizes the
data about students, faculty, and admin staff.
1.2. Data Definition Language (DDL): DDL deals with database schemas and
descriptions of how the data should reside in the database.

• CREATE: creates a database and its objects (tables, indexes, views, stored
procedures, functions, and triggers)
• ALTER: alters the structure of the existing database
• DROP: deletes objects from the database
• TRUNCATE: removes all records from a table, including all space allocated for
those records
• COMMENT: adds comments to the data dictionary
• RENAME: renames an object
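
As a minimal sketch of the DDL statements listed above, the following Python script uses the standard-library sqlite3 module with an in-memory database. The table and column names (student, learner, std_id, std_name, email) and the two helper functions are illustrative assumptions. SQLite covers only part of the list: it has no TRUNCATE or COMMENT statement, so those two are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# CREATE: define a new table (hypothetical schema)
conn.execute(
    "CREATE TABLE student (std_id INTEGER PRIMARY KEY, std_name TEXT NOT NULL)"
)

# ALTER: change the structure of an existing table
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")

# RENAME: in SQLite, renaming a table is a form of ALTER TABLE
conn.execute("ALTER TABLE student RENAME TO learner")


def table_names(connection):
    """Return the names of all tables currently defined in the database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def column_names(connection, table):
    """Return the column names of a table, read from PRAGMA table_info."""
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


tables_after_rename = table_names(conn)
columns_after_alter = column_names(conn, "learner")

# DROP: remove the table definition together with its data
conn.execute("DROP TABLE learner")
tables_after_drop = table_names(conn)
```

Note that none of these statements touches individual rows; they only change what the database is able to store.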
1.3. Data Manipulation Language (DML): DML deals with data manipulation and
includes the most common SQL statements, such as SELECT, INSERT, UPDATE, and
DELETE. It is used to store, retrieve, modify, and delete data in a database.

• SELECT: retrieves data from a database
• INSERT: inserts data into a table
• UPDATE: updates existing data within a table
• DELETE: deletes records from a table (all records, or only those that match a
condition)
• MERGE: performs an UPSERT operation (insert or update)
• CALL: calls a PL/SQL or Java subprogram
• EXPLAIN PLAN: shows the data access path chosen for a statement
• LOCK TABLE: controls concurrency
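
The four most common DML statements can be tried out with a short sketch, again using Python's standard-library sqlite3 module. The student table and its rows are made-up illustration data, not part of any real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (std_id INTEGER PRIMARY KEY, std_name TEXT, city TEXT)"
)

# INSERT: add new rows to the table
conn.executemany(
    "INSERT INTO student (std_id, std_name, city) VALUES (?, ?, ?)",
    [(1, "John", "Brighton"), (2, "Jill", "Brighton"), (3, "Bob", "Dakota")],
)

# UPDATE: modify rows that already exist
conn.execute("UPDATE student SET city = ? WHERE std_id = ?", ("Dakota", 2))

# DELETE: remove only the rows that match the WHERE condition
conn.execute("DELETE FROM student WHERE std_id = ?", (3,))

# SELECT: read the data back
remaining = conn.execute(
    "SELECT std_id, std_name, city FROM student ORDER BY std_id"
).fetchall()
```

A DELETE without a WHERE clause would remove every row, which is why the condition matters.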

1.4. Database Management System: The software used to manage a database is
called a Database Management System (DBMS). For example, MySQL and Oracle are
popular DBMSs used in different applications. A DBMS lets users perform the
following tasks:

• Data Definition: creation, modification, and removal of the definitions that
define the organization of data in the database.
• Data Update: insertion, modification, and deletion of the actual data in the
database.
• Data Retrieval: retrieval of data from the database, which applications can use
for various purposes.
• User Administration: registering and monitoring users, enforcing data security,
monitoring performance, maintaining data integrity, dealing with concurrency
control, and recovering information corrupted by unexpected failure.
1.5. Paradigm Shift from File System to DBMS
- A file system manages data using files on a hard disk. Users are allowed to create,
delete, and update the files according to their requirements. Consider the example
of a file-based University Management System. Student data is available to the
respective Departments, Academics Section, Result Section, Accounts Section,
Hostel Office, etc. Some of the data is common to all sections, such as the Roll No,
Name, Father Name, Address, and Phone number of students, but some data is
available to one section only, such as the hostel allotment number, which belongs
to the Hostel Office.
- Let us discuss the issues with this system:

• Redundancy of data: Data is said to be redundant if the same data is copied in
many places. If a student wants to change a phone number, it has to be updated in
every section. Similarly, old records must be deleted from all sections that hold
data about that student.

• Inconsistency of data: Data is said to be inconsistent if multiple copies of the
same data do not match each other. If the phone number is different in the
Accounts Section and the Academics Section, the data is inconsistent.
Inconsistency may arise from typing errors or from not updating all copies of the
same data.

• Difficult data access: A user must know the exact location of a file to access
data, so the process is cumbersome and tedious. Consider how difficult it is to
find the hostel allotment number of one student among 10000 unsorted student
records.

• Unauthorized access: A file system may allow unauthorized access to data. If a
student gets access to the file containing his marks, he can change them without
authorization.

• No concurrent access: Access to the same data by multiple users at the same time
is known as concurrency. A file-based system does not support concurrency, as
data can be accessed by only one user at a time.

• No backup and recovery: A file system does not incorporate any backup and
recovery of data if a file is lost or corrupted.
- The DBMS 3-tier (three-level) architecture divides the complete system into three
inter-related but independent levels:

• Physical Level: The physical level keeps the information about the location of
database objects in the data store. The users of the DBMS are unaware of the
locations of these objects. In simple terms, the physical level of a database
describes how the data is stored on secondary storage devices such as disks and
tapes, together with additional storage details.

• Conceptual Level: At the conceptual level, data is represented in the form of
database tables. For example, a STUDENT database may contain STUDENT and COURSE
tables, which are visible to users, but users are unaware of how they are stored.
Also referred to as the logical schema, this level describes what kind of data is
to be stored in the database.

• External Level: The external level specifies a view of the data in terms of
conceptual-level tables. Each external-level view caters to the needs of a
particular category of users. For example, the FACULTY of a university are
interested in the course details of students, while STUDENTS are interested in all
details related to academics, accounts, courses, and the hostel. So, different
views can be generated for different users. The focus of the external level is
data abstraction.
1.6. Data Independence
Data independence means that a change at one level should not affect another
level. Two types of data independence are present in this architecture:

• Physical Data Independence: Any change in the physical location of tables and
indexes should not affect the conceptual level or the external view of data. This
kind of data independence is easy to achieve and is implemented by most DBMSs.

• Conceptual Data Independence: The conceptual-level schema and the
external-level schema must be independent, meaning that a change in the conceptual
schema should not affect the external schema. For example, adding or deleting
attributes of a table should not affect the user’s view of the table. This type of
independence is more difficult to achieve than physical data independence, because
changes in the conceptual schema tend to be reflected in the user’s view.
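
One common way to approximate conceptual data independence is to let users work through a view instead of the base table. The sketch below, using Python's standard-library sqlite3 module, assumes a hypothetical student table and a student_directory view. It shows only the easy case, where an attribute is added that the view does not use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (std_id INTEGER PRIMARY KEY, std_name TEXT, phone TEXT)"
)
conn.execute("INSERT INTO student VALUES (1, 'John', '123')")

# External level: a view that exposes only the columns one group of users needs
conn.execute(
    "CREATE VIEW student_directory AS SELECT std_id, std_name FROM student"
)
view_before = conn.execute("SELECT * FROM student_directory").fetchall()

# Conceptual-level change: a new attribute is added to the base table
conn.execute("ALTER TABLE student ADD COLUMN hostel_no TEXT")

# The user's view of the data is unchanged by the new column
view_after = conn.execute("SELECT * FROM student_directory").fetchall()
```

Deleting or renaming a column that the view depends on would break the view, which illustrates why this kind of independence is harder to guarantee.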
1.7. Phases of database design
Designing a database for a real-world application runs from capturing the
requirements to physical implementation using DBMS software, and consists of the
following steps:

• Conceptual Design: The requirements of the database are captured using a
high-level conceptual data model. For example, the ER model is used for the
conceptual design of the database.
• Logical Design: Logical design represents data in the form of the relational
model. The ER diagram produced in the conceptual design phase is converted into
the relational model.
• Physical Design: In physical design, the data in the relational model is
implemented using a commercial DBMS such as Oracle or DB2.
1.8. Advantages of DBMS
A DBMS helps in the efficient organization of data in a database, which has the
following advantages over a typical file system:
• Minimized redundancy and data inconsistency: Data is normalized in a DBMS to
minimize redundancy, which helps in keeping data consistent. For example,
student information can be kept in one place and accessed by different users.
Primary keys and foreign keys make this possible: other tables refer to the single
stored copy instead of duplicating it.
• Simplified data access: A user needs only the name of the relation, not its exact
location, to access data, so the process is very simple.
• Multiple data views: Different views of the same data can be created to cater to
the needs of different users. For example, faculty salary information can be
hidden from the student view of the data but shown in the admin view.

• Data security: Only authorized users are allowed to access the data in a DBMS.
Data can also be encrypted by the DBMS, which makes it more secure.
• Concurrent access to data: Data can be accessed concurrently by different users
at the same time.
• Backup and recovery mechanism: The DBMS backup and recovery mechanism helps to
avoid data loss and data inconsistency in case of catastrophic failures.
1.9. DBMS functions
A Database Management System is system software for easy, efficient, and
reliable data processing and management. It can be used for:
• Creation of a database.
• Retrieval of information from the database.
• Updating the database.
• Managing a database.
It provides many functionalities and is more advantageous than the traditional
file system in the ways listed below:

• Processing queries and object management:
In traditional file systems, we cannot store data in the form of objects. In
real-world applications, data is handled as objects and not as files, so with a
file system some application software must map the data stored in files to
objects before it can be used further.
In a database management system, we can store data directly in the form of
objects. With a file system, application-level code needs to be written to handle,
store, and scan through the data, whereas a DBMS gives us the ability to query the
database.
• Controlling redundancy and inconsistency:
Redundancy refers to repeated instances of the same data. A database system
provides redundancy control, whereas in a file system the same data may be stored
multiple times. For example, if a student is enrolled in two different educational
programs in the same college, say Engineering and History, then his information,
such as the phone number and address, may be stored twice, once in the Engineering
department and once in the History department. This increases the time taken to
store and access data, and may also lead to inconsistent data states between the
two places. A DBMS uses data normalization to avoid redundancy and duplicates.
• Efficient memory management and indexing:
A DBMS makes complex memory management easy to handle. In file systems, files are
indexed rather than objects, so query operations require entire file scans,
whereas in a DBMS, objects are indexed efficiently through the database schema on
any attribute or property of the data. This helps in fast retrieval of data based
on the indexed attribute.
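
The effect of an index can be observed with a small sketch that uses Python's standard-library sqlite3 module. The student table, the index name idx_student_name, and the generated rows are assumptions made for illustration. SQLite's EXPLAIN QUERY PLAN statement is used to check whether the index is consulted; the exact wording of its output differs between SQLite versions, so the script only looks for the index name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (std_id INTEGER PRIMARY KEY, std_name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO student (std_name, city) VALUES (?, ?)",
    [(f"student{i}", "Brighton") for i in range(1000)],
)

QUERY = "SELECT std_id, city FROM student WHERE std_name = ?"


def query_plan(connection):
    """Return the plan SQLite reports for QUERY as a single string."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, ("student42",)
    ).fetchall()
    return " ".join(str(row[-1]) for row in rows)  # last column is the detail text


plan_without_index = query_plan(conn)

# Index the attribute used in the WHERE clause
conn.execute("CREATE INDEX idx_student_name ON student (std_name)")
plan_with_index = query_plan(conn)

# The query text and its result are the same with or without the index
result = conn.execute(QUERY, ("student42",)).fetchall()
```

The query itself never mentions the index; the DBMS decides on its own to use it, which is also a small example of physical data independence.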
• Concurrency control and transaction management:
Several applications allow users to access data simultaneously. This may lead to
inconsistency in the data if files are used. Consider two withdrawal transactions
X and Y, in which amounts of 100 and 200 are withdrawn from an account A initially
containing 1000. Since these transactions take place simultaneously, they may
update the account differently. X reads 1000, debits 100, and updates account A to
900, whereas Y also reads 1000, debits 200, and updates A to 800. In both cases
account A holds wrong information, since the correct balance after both
withdrawals is 700. This results in data inconsistency. A DBMS provides mechanisms
to deal with this kind of data inconsistency while allowing users to access data
concurrently.
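
The withdrawal example can be replayed in a short sketch using Python's standard-library sqlite3 module. The interleaving of X and Y is simulated by hand in a single thread, so this is only an illustration of the lost update, not of how a DBMS schedules real concurrent transactions. The account table and the helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 1000)")


def read_balance():
    """Read the current balance of account A."""
    return conn.execute(
        "SELECT balance FROM account WHERE name = 'A'"
    ).fetchone()[0]


def write_balance(value):
    """Overwrite the balance of account A with a value computed by the caller."""
    conn.execute("UPDATE account SET balance = ? WHERE name = 'A'", (value,))


# Lost update: X and Y both read 1000 before either of them writes
x_seen = read_balance()
y_seen = read_balance()
write_balance(x_seen - 100)  # X writes 900
write_balance(y_seen - 200)  # Y writes 800 and overwrites X's update
lost_update_balance = read_balance()

# Reset, then let the database apply each withdrawal as one atomic statement
write_balance(1000)
for amount in (100, 200):
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE name = 'A'",
            (amount,),
        )
correct_balance = read_balance()
```

In the second variant each withdrawal reads and writes the balance inside a single statement, so neither update can be computed from a stale value.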
• Access control and ease in accessing data:
A DBMS can grant access to various users and determine which part and how much of
the data each of them can access, without duplicating the data. In a file system,
by contrast, separate files must be created for each user containing the data
that they may access. Moreover, if a user must extract specific data from a file
system, he needs code or an application to perform that task. For example,
suppose a manager needs a list of all employees having a salary greater than X.
If the data is stored in files, we need to write business logic for this. A DBMS
provides easy access to the data through queries (e.g., SELECT queries), so the
whole logic need not be rewritten. Users can specify exactly what they want to
extract from the data.
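
The manager's request for all employees with a salary greater than X can be sketched as a single parameterized SELECT, here with Python's standard-library sqlite3 module. The employee table, the salary figures, and the function name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (code TEXT PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("001", "John", 4000), ("002", "Jill", 2500), ("003", "Bull", 3500)],
)


def employees_earning_more_than(threshold):
    """Return the names of employees whose salary exceeds the threshold."""
    rows = conn.execute(
        "SELECT name FROM employee WHERE salary > ? ORDER BY name",
        (threshold,),
    ).fetchall()
    return [name for (name,) in rows]


high_earners = employees_earning_more_than(3000)
```

Changing the question, for example to a different threshold or a different column, only changes the query text, not the program logic that scans the data.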
• Integrity constraints: Data stored in databases must satisfy integrity
constraints. For example, consider a database schema consisting of the various
educational programs offered by a university, such as B.Tech/M.Tech/B.Sc/M.Sc,
and a schema of the students enrolled in these programs. A DBMS ensures that a
student can be enrolled only in one of the programs present in the programs
schema, and not in anything else. Hence, database integrity is preserved. Apart
from the features mentioned above, a database management system also provides
the following:

• Multiple user interfaces

• Data scalability, expandability, and flexibility

• Security
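
The integrity constraint from the programs example can be sketched with a foreign key, using Python's standard-library sqlite3 module. The table names, column names, and sample rows are assumptions. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.execute("CREATE TABLE program (name TEXT PRIMARY KEY)")
conn.execute(
    """CREATE TABLE student (
           std_id   INTEGER PRIMARY KEY,
           std_name TEXT NOT NULL,
           program  TEXT NOT NULL REFERENCES program (name)
       )"""
)
conn.executemany(
    "INSERT INTO program VALUES (?)",
    [("B.Tech",), ("M.Tech",), ("B.Sc",), ("M.Sc",)],
)

# Enrolling in an offered program is accepted
conn.execute("INSERT INTO student VALUES (1, 'John', 'M.Sc')")

# Enrolling in a program that is not offered violates the constraint
try:
    conn.execute("INSERT INTO student VALUES (2, 'Jill', 'Astrology')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

enrolled = conn.execute("SELECT std_name, program FROM student").fetchall()
```

The check lives in the schema, so every application that writes to the student table is held to the same rule.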
1.10. Data Abstraction and Data Independence
Database systems comprise complex data structures. To make the system efficient
in terms of data retrieval, and to reduce complexity for its users, developers use
abstraction, i.e., they hide irrelevant details from the users. This approach
simplifies database design.

- There are mainly 3 levels of data abstraction:

• Physical: This is the lowest level of data abstraction. It tells us how the data
is stored in memory, including the access methods used (such as sequential or
random access) and the file organization methods used (such as B+ trees and
hashing). Usability, the size of memory, and the number of times the records are
accessed are factors that we need to know while designing the database.
Suppose we need to store the details of an employee. The blocks of storage and the
amount of memory used for this purpose are kept hidden from the user.
• Logical: This level comprises the information that is stored in the database in
the form of tables. It also stores the relationships among the data entities in
relatively simple structures. At this level, the information available to the user
at the view level is unknown.
For the employee example, we can store the various attributes of an employee, and
relationships, e.g., with the manager, can also be stored.
• View: This is the highest level of abstraction. Only a part of the actual database
is viewed by the users. This level exists to ease access to the database by an
individual user. Users view data in the form of rows and columns. Tables and
relations are used to store data. Multiple views of the same database may exist.
Users can view the data and interact with the database, while storage and
implementation details are hidden from them.

[Figure: the three levels of data abstraction. View 1, View 2, and View 3 sit on
top of the logical level, which sits on top of the physical level.]

The main purpose of data abstraction is to achieve data independence, which saves
the time and cost required when the database is modified or altered.

- Two levels of data independence arise from these levels of abstraction:

(1). Physical level data independence: This is the ability to modify the physical
schema without any alterations to the conceptual or logical schema, usually for
optimization purposes. For example, the conceptual structure of the database would
not be affected by a change in the storage size of the database system server.
Changing from sequential to random access files is another such example. These
alterations or modifications to the physical structure may include:
• Utilizing new storage devices.
• Modifying data structures used for storage.
• Altering indexes or using alternative file organization techniques, etc.
(2). Logical level data independence: This is the ability to modify the logical
schema without affecting the external schema or application programs. The user
view of the data would not be affected by changes to the conceptual view of the
data. These changes may include inserting or deleting attributes, or altering the
table structures, entities, or relationships of the logical schema.
II. The entity relationship model

The ER Model is used to model the logical view of the system from a data
perspective. It consists of these components:
2.1. Entity
An Entity may be an object with a physical existence – a particular person, car,
house, or employee – or it may be an object with a conceptual existence – a
company, a job, or a university course. There are many definitions and descriptions
of an entity. Here are a few; some are quite informal; some are very precise:

• An entity is something of interest.

• An entity is a category of things that are important for a business, about which
information must be kept.

• An entity is something you can make a list of; it is a class or type of thing.

• An entity is a named thing, usually a noun.

Two important aspects of an entity are that it has instances and that the
instances of the entity somehow are of interest to the business.

2.2. Attribute(s):
An attribute is a piece of information that in some way describes an entity. An
attribute is a property of the entity, a small detail about the entity. Attributes are
the properties that define the entity type. For example, StdID, StdName, Date
of birth, Address, Tel, and Email are the attributes that define the entity type Student.

- Key Attribute
The attribute that uniquely identifies each entity in the entity set is called the
key attribute. For example, StdID is unique for each student.
- Composite Attribute
An attribute composed of several other attributes is called a composite attribute.
For example, the Address attribute of the Student entity type consists of Street,
City, State, and Country. In an ER diagram, a composite attribute is represented by
an oval comprising other ovals.

- Multivalued Attribute
An attribute that can hold more than one value for a given entity is called a
multivalued attribute. For example, Phone_No (a given student can have more than
one). In an ER diagram, a multivalued attribute is represented by a double oval.

- Derived Attribute
An attribute that can be derived from other attributes of the entity type is
known as a derived attribute (e.g., Age can be derived from DOB).
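
The four kinds of attribute can be made concrete with a small Python sketch based on dataclasses. This is only an illustration of the concepts, not an ER notation; the class names, field names, and sample values are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Address:
    """Composite attribute: made up of several simpler attributes."""
    street: str
    city: str
    state: str
    country: str


@dataclass
class Student:
    std_id: str            # key attribute: unique for each student
    std_name: str
    date_of_birth: date
    address: Address       # composite attribute
    phone_numbers: List[str] = field(default_factory=list)  # multivalued attribute

    def age_on(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth instead of being stored."""
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month,
            self.date_of_birth.day,
        )
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)


student = Student(
    std_id="S001",
    std_name="John",
    date_of_birth=date(2000, 5, 20),
    address=Address("1 Main St", "Brighton", "East Sussex", "UK"),
    phone_numbers=["123-000", "456-000"],
)
age_before_birthday = student.age_on(date(2024, 5, 19))
age_on_birthday = student.age_on(date(2024, 5, 20))
```

Because the age is computed on demand, it never goes stale the way a stored Age value would.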

The complete entity type Student with its attributes can be represented as:

2.3. Instance
An entity has instances, and the instances of the entity are somehow of interest to
the business. An instance is the particular information about one object in the
real world.
2.4. Entity type
The group of attributes that describe an entity makes up its entity type. All
real-world objects that are instances of the same entity have the same entity type.
Example:
Objects: The people who work in the Moonlight Coffee shop (such as Mr. John, Ms. Jill,
etc.)

Entity: EMPLOYEE

Attributes (column headings) and instances (rows):

Code   Name    Job          Addr.      Tel
001    John    Accountant   Brighton   123-xxx.xxx
002    Jill    Cleaner      Brighton   456-xxx.xxx
003    Bull    Cook         Dakota     789-xxx.xxx
004    Smith   Driver       Dakota     965-xxx.xxxx
005    Bob     Cleaner      Suanse     521-xxx.xxxx

Entity type: EMPLOYEE (Code, Name, Job, Addr., Tel)

2.5. Relationships
• Represent something of significance to the business
• Express how entities are mutually related
• Always exist between two entities (or one entity twice)
• Always have two perspectives
• Are named at both ends
• Are drawn as a line with a symbol at both ends

A particular relationship can be worded in many ways: An EMPLOYEE has a JOB,
or an EMPLOYEE performs a JOB, or an EMPLOYEE holds a JOB.
Based on what you know about instances of the entities, you can decide on four
questions:
(1). Must every employee have a job? (yes)
In other words, is this a MANDATORY or OPTIONAL relationship for an employee?
(2). Can an employee have only one job? (yes)
This confirms that an EMPLOYEE has exactly one job.
(3). Must every job be done by an employee? (no)
In other words, is this a MANDATORY or OPTIONAL relationship for a job?
(4). Can a job be done by more than one employee? (yes)
This confirms that a JOB can be done by many employees.

Relationships can be mandatory or optional, in the same way as attributes.
Mandatory relationships are drawn as a solid line, optional relationships as a
dotted line.

Relationship Representation
Relationships are represented by a line, connecting the entities. The name of the
relationship, from either perspective, is printed near the starting point of the

relationship line.
The shape of the end of the relationship line represents the degree of the
relationship.
This is either one or many. One means exactly one; many means one or more.
In the above example, it is assumed that JOBS are held by one or more
EMPLOYEES.
This is shown by the tripod (or crowsfoot), at EMPLOYEE.
An EMPLOYEE, on the other hand, is assumed here to have exactly one JOB. This
is represented by the single line at JOB.
There are four combinations of mandatory and optional ends. In each diagram the
relationship is named "have" from the EMPLOYEE side and "be held" from the JOB side:

(1). An EMPLOYEE must have exactly one JOB.
A JOB must be held by one or more EMPLOYEES.

(2). An EMPLOYEE must have one JOB.
A JOB may be held by many EMPLOYEES.

(3). An EMPLOYEE may have one JOB.
A JOB must be held by many EMPLOYEES.

(4). An EMPLOYEE may have one JOB.
A JOB may be held by many EMPLOYEES.
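
When such a relationship is later implemented in tables, the mandatory or optional nature of the EMPLOYEE end is often expressed through the foreign key column: NOT NULL for "must have", nullable for "may have". The sketch below, using Python's standard-library sqlite3 module, assumes hypothetical tables employee_mandatory and employee_optional to contrast the two. The JOB end ("must be held by one or more EMPLOYEES") cannot be expressed with a simple column constraint and is not covered here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE job (job_id INTEGER PRIMARY KEY, title TEXT NOT NULL)")

# "An EMPLOYEE must have exactly one JOB": the reference is mandatory (NOT NULL)
conn.execute(
    """CREATE TABLE employee_mandatory (
           emp_id INTEGER PRIMARY KEY,
           name   TEXT NOT NULL,
           job_id INTEGER NOT NULL REFERENCES job (job_id)
       )"""
)

# "An EMPLOYEE may have one JOB": the reference is optional (NULL is allowed)
conn.execute(
    """CREATE TABLE employee_optional (
           emp_id INTEGER PRIMARY KEY,
           name   TEXT NOT NULL,
           job_id INTEGER REFERENCES job (job_id)
       )"""
)
conn.execute("INSERT INTO job VALUES (1, 'Cook')")


def can_insert_without_job(table):
    """Try to store an employee with no job and report whether it was accepted."""
    try:
        conn.execute(
            f"INSERT INTO {table} (emp_id, name, job_id) VALUES (1, 'Bull', NULL)"
        )
        return True
    except sqlite3.IntegrityError:
        return False


mandatory_accepts_missing_job = can_insert_without_job("employee_mandatory")
optional_accepts_missing_job = can_insert_without_job("employee_optional")
```

In both tables an employee row holds at most one job_id, which corresponds to the single line at the JOB end of the relationship.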
2.6. Entity Relationship Models and Diagrams
An Entity Relationship Model (ER Model) is a list of all entities and attributes as
well as all relationships between the entities that are of importance. The model also
provides background information such as entity descriptions, data types and
constraints. The model does not necessarily include a picture, but usually a diagram
of the model is very valuable.
An Entity Relationship Diagram (ER Diagram) is a picture, a representation of the
model or of a part of the model. Usually, one model is represented in several
diagrams, showing different business perspectives. The diagrams use separate
symbols for:
• ENTITY
• ATTRIBUTE
• RELATIONSHIP

2.7. Other relationship types

• Relationship 1:1

• Relationship m:m
The various types of m:m relationships are common in a first version of an ER model.
In later stages of the model most m:m relationships, and possibly all, will disappear.

[Figure: m:m relationship between DRIVER and CAR]

CAR
CarID   Type     Color   DriverID
001     Toyota   Red     01
002     Nissan   White   01
003     Kia      Black   03
004     Toyota   White   05
004     Toyota   White   03

DRIVER
DriverID   Name    Age   CarID
01         Bull    25    001
03         Bob     33    003
05         Steve   27    004
01         Bull    25    002
03         Bob     33    004

This type of relationship must be normalized into 1:m and m:1 relationships by
adding an intermediary entity, for example SHIFT between DRIVER and CAR:

DRIVER (1:m) SHIFT (m:1) CAR
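
The DRIVER, SHIFT, and CAR structure can be sketched as three tables using Python's standard-library sqlite3 module. The rows follow the CAR and DRIVER data shown earlier; the column names and the composite primary key on shift are assumptions made for the example. Each driver and each car is now stored once, and every driver/car pairing becomes one row in shift.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE driver (driver_id TEXT PRIMARY KEY, name TEXT NOT NULL, age INTEGER);
    CREATE TABLE car (car_id TEXT PRIMARY KEY, type TEXT NOT NULL, color TEXT);
    -- Intermediary entity: one row per (driver, car) pairing
    CREATE TABLE shift (
        driver_id TEXT NOT NULL REFERENCES driver (driver_id),
        car_id    TEXT NOT NULL REFERENCES car (car_id),
        PRIMARY KEY (driver_id, car_id)
    );
    """
)
conn.executemany(
    "INSERT INTO driver VALUES (?, ?, ?)",
    [("01", "Bull", 25), ("03", "Bob", 33), ("05", "Steve", 27)],
)
conn.executemany(
    "INSERT INTO car VALUES (?, ?, ?)",
    [("001", "Toyota", "Red"), ("002", "Nissan", "White"),
     ("003", "Kia", "Black"), ("004", "Toyota", "White")],
)
conn.executemany(
    "INSERT INTO shift VALUES (?, ?)",
    [("01", "001"), ("01", "002"), ("03", "003"), ("05", "004"), ("03", "004")],
)

# One driver, many cars
cars_of_bob = [car_id for (car_id,) in conn.execute(
    """SELECT car.car_id FROM car
       JOIN shift ON shift.car_id = car.car_id
       JOIN driver ON driver.driver_id = shift.driver_id
       WHERE driver.name = 'Bob' ORDER BY car.car_id"""
)]

# One car, many drivers
drivers_of_car_004 = [name for (name,) in conn.execute(
    """SELECT driver.name FROM driver
       JOIN shift ON shift.driver_id = driver.driver_id
       WHERE shift.car_id = '004' ORDER BY driver.name"""
)]
```

Compared with the flat tables, no driver or car row is repeated, so an update to a name or a color touches exactly one row.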

• Recursive Relationships in ER diagrams

A relationship between two entities of the same entity type is called a
recursive relationship. Here the same entity type participates more than once in a
relationship type, with a different role each time. Relationships usually connect
occurrences of two different entities; in a recursive relationship, both ends
connect to occurrences of the same entity.
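
A typical recursive relationship is "an EMPLOYEE manages other EMPLOYEES". The sketch below, using Python's standard-library sqlite3 module, assumes a hypothetical employee table whose manager_id column refers back to the same table. The two roles (worker and manager) appear as two aliases of that one table in the query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """CREATE TABLE employee (
           emp_id     INTEGER PRIMARY KEY,
           name       TEXT NOT NULL,
           manager_id INTEGER REFERENCES employee (emp_id)  -- same entity, second role
       )"""
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "John", None), (2, "Jill", 1), (3, "Bob", 1)],
)

# Join the table to itself: one alias per role in the relationship
reports = conn.execute(
    """SELECT worker.name, manager.name
       FROM employee AS worker
       JOIN employee AS manager ON worker.manager_id = manager.emp_id
       ORDER BY worker.name"""
).fetchall()
```

John has no manager in this data, so the relationship is optional at that end and manager_id is left NULL.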

CASE STUDY

What Information is Available?


The illustration shows a piece of a weather forecast torn from a European
newspaper, showing various types of information. What are the types of
information? Among the first things you will see are, for example, “København” and
“Bremen”. These are cities, or more precisely, names of cities. The little drawings
represent the type of weather; these drawings are icons. The next columns are
temperatures, probably maximum and minimum; the arrows indicate wind direction,
and the number next to each arrow is the wind force. Then there is a date on top,
which is the forecast date. Therefore, we have:
• Name of the city (such as “København”)
• Weather type (such as “cloudy with rain”)
• Icon of the weather type
• Minimum temperature
• Maximum temperature
• Wind direction arrow
• Wind force
• Forecast date
Now, you can find out even more information. To do this you need some
“business” knowledge. In this case it is geographical knowledge.

You may notice that the cities in the weather forecast are not printed in a random
order. The German cities (Bremen, Berlin and Munchen) are grouped together, just
as the French cities are. Moreover, the cities are not ordered alphabetically by name
but seem to be ordered North-South. Apparently, this report “knows” something to
facilitate the grouping and sorting. This could be:
• Country of the city
• Geographical position of the city
and maybe even
• Geographical position of the country
Try to identify which of the above types of information is probably an entity, which
is an attribute, and which is a relationship.
City and Country are easy. These are entities, both with at least the attributes Name
and Geographical Position. Weather Type could also be an entity, as there is an
attribute available: Icon. For the same reason there could be an entity Wind
Direction. Now, where does this leave the temperatures and the forecast date? These
cannot be attributes of City, as the forecast date is not a single value for a city:
there can be many forecast dates for a city. This is how you discover that there is
still one entity missing, such as Forecast, with attributes Date, Minimum and
Maximum Temperature, and Wind Force. There are likely to be relationships between:
• COUNTRY and CITY
• CITY and FORECAST
• FORECAST and WEATHER TYPE
• FORECAST and WIND DIRECTION.

In this entity relationship diagram some assumptions are made about the
relationships:
• Every FORECAST must be about one CITY, and
not all CITIES must be in a FORECAST, but a CITY may be in many
• Every CITY is located in a COUNTRY, and every COUNTRY has one or
more CITIES
• A FORECAST need not always contain a WEATHER TYPE, and
not all WEATHER TYPES are in a FORECAST, but a WEATHER TYPE may be in many
• A FORECAST need not always contain a WIND DIRECTION, and
not all WIND DIRECTIONS are in a FORECAST, but a WIND DIRECTION may be in many
The rationale behind these assumptions is that we consider an incomplete
FORECAST still to be a FORECAST, unless we do not know the date or the
CITY the FORECAST refers to.
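
One possible translation of these assumptions into tables is sketched below with Python's standard-library sqlite3 module. The table names, column names, the (city, forecast_date) primary key, and the sample date are all assumptions made for illustration; the case study itself stops at the ER level. Mandatory ends become NOT NULL columns and optional ends stay nullable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE country (name TEXT PRIMARY KEY, geo_position TEXT);
    CREATE TABLE city (
        name         TEXT PRIMARY KEY,
        geo_position TEXT,
        country      TEXT NOT NULL REFERENCES country (name)  -- every CITY is in a COUNTRY
    );
    CREATE TABLE weather_type (name TEXT PRIMARY KEY, icon TEXT);
    CREATE TABLE wind_direction (name TEXT PRIMARY KEY, icon TEXT);
    CREATE TABLE forecast (
        city           TEXT NOT NULL REFERENCES city (name),  -- mandatory
        forecast_date  TEXT NOT NULL,                         -- mandatory
        min_temp       INTEGER,
        max_temp       INTEGER,
        wind_force     INTEGER,
        weather_type   TEXT REFERENCES weather_type (name),   -- optional
        wind_direction TEXT REFERENCES wind_direction (name), -- optional
        PRIMARY KEY (city, forecast_date)
    );
    """
)
conn.execute("INSERT INTO country VALUES ('Germany', NULL)")
conn.execute("INSERT INTO city VALUES ('Bremen', NULL, 'Germany')")

# An incomplete FORECAST (no weather type, no wind direction) is still accepted
conn.execute(
    "INSERT INTO forecast (city, forecast_date) VALUES ('Bremen', '2024-01-01')"
)

# A FORECAST without a CITY is rejected
try:
    conn.execute(
        "INSERT INTO forecast (city, forecast_date) VALUES (NULL, '2024-01-02')"
    )
    accepted_without_city = True
except sqlite3.IntegrityError:
    accepted_without_city = False

forecast_count = conn.execute("SELECT COUNT(*) FROM forecast").fetchone()[0]
```

The rule that every COUNTRY has one or more CITIES is not enforced by this schema; a constraint of that kind needs more than column-level declarations.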

EXERCISES

1. Identify the list of attributes from the following shop list:

2. Find the attributes appropriate to each entity, then draw a line between the attribute
and the entity or entities it describes.

3. List which of the following concepts you think is an Entity, Attribute, or Instance. If you
mark one as an entity, then give an example instance. If you mark one as an attribute or
instance, give an entity. For the last three rows, find a concept that fits.

4. Make a list of about 15 different entities that you think are important for
Moonlight Coffees. Use your imagination and common sense and, of course, use
what you find in the summary that is printed below.

5. Use the information from the receipt and make a list of entities and attributes.

6. Use the schedule that is used in one of the shops in Amsterdam as a basis for an entity
relationship model. The schedule shows, for example, that in the week of 12 to 18 October
Annet B is scheduled for the first shift on Monday, Friday, and Saturday.

7. Analyze the example page from Ralph’s famous Raving Recipes book and list as
many different types of information as you can find that seem important.

a. Group the various types of information into entities and attributes


b. Name the relationships you discover and draw a diagram

8. Which text corresponds to the diagram?

9. Choose entities, attributes, and relationships for the following templates:

10. Given the following ENTITIES:

READER BOOK

FACULTY MAJOR

a). Find attributes and define relationships for the entities
b). Normalize the relationships if necessary
