2
DATABASE DESIGN AND ER DIAGRAMS
CONTENTS
2.0 Aims and Objectives
2.1 Introduction
2.2 Database design
2.3 Entity, Attributes and Entity Sets
2.4 Relationships and Relationship Set
2.5 E-R Diagrams
2.6 Additional Features of ER Diagrams
2.7 Conceptual Database Design with the ER Model
2.8 Let us Sum up
2.9 Lesson end Activities
2.10 Keywords
2.11 Questions for Discussion
2.12 Suggested Readings
2.1 INTRODUCTION
Constructing a database to manage a data application requires a good deal of intuition and
experience. An ill-designed database can render useless the entire effort put into developing
a database system. Database development is concerned with identifying
the items for which data needs to be stored and then organizing that data into a
form suitable for data management. This lesson revolves around the means and
tools used in the database development activity.
Entity
An entity is an item - a person or thing - that can be identified in the miniworld under
consideration. For instance, while analyzing a banking system, we come across several
entities such as Account holder, Branch manager, Account books, Cheque books and the
like. Each of these identifiable things is an entity as far as the banking miniworld is concerned.
Attribute
An attribute is a characteristic property of an entity. Each entity of the system under
consideration has a number of attributes, each highlighting a different aspect of the
entity. Thus, Name, Address, Date of birth, Date of opening account etc., and Name,
Address, Employee number, Date of joining etc., are a few attributes of the Account
holder and Branch manager entities respectively. Similarly, Account number, Branch
code, etc. are a few attributes of the cheque book entity.
An entity is represented by the set of attributes it has. Every entity is described by a set
of (attribute, data value) pairs. Thus, the Account holder and Branch manager entities can
be represented as follows:
AccountHolder(Name="Eknath", Address="A-45, Tagore Garden", DateOfBirth="13/11/1966",
DateOfOpeningAccount="12/10/2005")
BranchManager(Name="Vibhor Kasliwal", Address="B-46, Naraina", EmployeeNumber="BM008",
DateOfJoining="01/07/2006")
An attribute may be restricted to take values from a set of values called the domain of the attribute.
Attributes can be classified in a number of ways. According to atomicity, an attribute
can be Simple or Composite. A simple attribute is atomic and cannot be broken down further
into simpler components, whereas a composite attribute is composed of simpler values. For
instance, while age is a simple attribute of a student entity, date of birth is a composite
attribute because it can be broken into three simpler values - day, month and year.
On the basis of whether an attribute must take a value for every instance of a given entity,
attributes can be either Null-valued or Non-Null-valued. A null-valued attribute may be left
without a value, in which case it is said to have a missing value. A non-null attribute must
always have a value. Note that a missing value is not the same as zero. Thus, the PhoneNumber
attribute is null-valued because an entity may or may not have a phone number.
On the basis of the multiplicity of values taken, attributes can be classified as either
Single-valued or Multi-valued. A single-valued attribute takes only one value for a given entity
while a multi-valued attribute can take more than one value for a given entity. For instance,
for the ongoing banking system, the attribute BranchManagerId is single-valued while
PhoneNumber is multi-valued - i.e., one instance of a branch manager will have
a single value for BranchManagerId but can have more than one phone number.
An attribute can be Derivable or Non-Derivable. A derivable attribute can be obtained by
subjecting other attributes to computation. For instance, Age is a derived attribute of an
AccountHolder entity because its value can be obtained by computing the difference between
the current date and the DateOfBirth attribute. Since the value of a derived
attribute can be computed from other attributes, it need not be stored in the database.
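The attribute classes described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class layout, the use of an empty list to stand for a missing phone number, and the phone numbers themselves are hypothetical and not part of the banking example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class AccountHolder:
    name: str            # simple, non-null attribute
    date_of_birth: date  # composite attribute: day, month and year
    # Multi-valued and null-valued: an empty list stands for "no phone number".
    phone_numbers: List[str] = field(default_factory=list)

    def age(self, today: date) -> int:
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not yet occurred.
        if (today.month, today.day) < (self.date_of_birth.month, self.date_of_birth.day):
            years -= 1
        return years


holder = AccountHolder(name="Eknath", date_of_birth=date(1966, 11, 13))
# Hypothetical phone numbers, added only to show a multi-valued attribute.
holder.phone_numbers.extend(["011-5550101", "011-5550102"])
```

Calling `holder.age(date(2005, 10, 12))` computes the age on the day the account was opened from the stored date of birth, which is why Age itself need not be kept in the database.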
Entity Set
A collection of entities of the same type is referred to as an entity set.
E = {x | x is an entity}
Entity sets may or may not be disjoint from each other. For example, the entity set
BranchManagers of all branch managers and the entity set AccountHolders of all account
holders of a particular bank may have common members and are therefore not necessarily disjoint.
An attribute acts like a function mapping an entity set into the domain of the attribute:
attribute: entity set → domain
There is one (attribute, data value) pair for each attribute of the entity set. An entity set
also corresponds to a data type (such as a structure) of a computer programming language,
and an individual entity corresponds to a variable of that type.
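The correspondence with programming-language types can be made concrete with a minimal Python sketch. It assumes, purely for illustration, that the branch manager of the running example also holds an account at the bank; the class name `Person` and the helper `name_of` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    """Entity type; frozen so that instances are hashable and can be put in a set."""
    name: str
    address: str


eknath = Person("Eknath", "A-45, Tagore Garden")
vibhor = Person("Vibhor Kasliwal", "B-46, Naraina")

# Entity sets: collections of entities of the same type.
account_holders = {eknath, vibhor}
branch_managers = {vibhor}

# The two sets share a member, so they are not disjoint.
common = account_holders & branch_managers


def name_of(entity: Person) -> str:
    """An attribute viewed as a function from an entity set into a domain (here, strings)."""
    return entity.name
```

Here the dataclass plays the role of the data type, each variable holds one entity, and ordinary set intersection shows whether two entity sets overlap.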
Relational Database Management System
2.4 RELATIONSHIPS AND RELATIONSHIP SET
Entities do not exist in isolation from each other. More often than not they are related
to each other in some way. A relationship is an association between entities. Let there be
n entity sets given by:
E1, E2, E3, ..., En
A relationship set over them is then a set of tuples (e1, e2, ..., en) in which each ei is an
entity drawn from Ei.
In fact, for a relationship such as Owns, running from AccountHolder to Account, another
relationship may also exist in the opposite direction. In our case the
relationship - Belongs To - can be thought of as a relationship from the Account entity to
the AccountHolder entity:
Account (AccountNumber = "1345", OpeningBalance=50000) Belongs To AccountHolder(Name="Eknath",
Address="A-45, Tagore Garden", DateOfBirth="13/11/1966", DateOfOpeningAccount="12/10/2005")
Thus, the two relationships "Owns" and "Belongs To" are inverses of each other.
A relationship that exists between two entity sets is called a binary relationship. But a
relationship may exist among more than two entity sets as well. A relationship among
three entity sets is called a ternary relationship, and so on. The number of entity sets forming a
relationship is known as the degree of the relationship. Therefore, a binary relationship is of
degree 2, a ternary relationship of degree 3 and a quinary relationship of degree 5.
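Relationship sets, inverse relationships and degree can be illustrated with plain Python sets of tuples. This is a sketch only: entities are reduced to identifying strings, and the second account number and the branch code are made-up values.

```python
# Entities are identified here by simple strings for brevity.
account_holders = {"Eknath", "Vibhor Kasliwal"}
accounts = {"1345", "2207"}  # "2207" is a made-up account number

# A binary relationship set: each tuple relates one holder to one account.
owns = {("Eknath", "1345"), ("Vibhor Kasliwal", "2207")}

# The inverse relationship swaps the roles of the two entity sets.
belongs_to = {(account, holder) for (holder, account) in owns}


def degree(relationship_set):
    """Degree = number of entity sets taking part, i.e. the common tuple length."""
    lengths = {len(t) for t in relationship_set}
    if len(lengths) != 1:
        raise ValueError("relationship set must be non-empty with tuples of equal length")
    return lengths.pop()


# A ternary relationship among holder, account and branch (branch code is made up).
held_at = {("Eknath", "1345", "BR01")}
```

With these definitions `degree(owns)` is 2 and `degree(held_at)` is 3, matching the binary and ternary cases described above.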
The objective of the E-R designing activity is to obtain these components of the E-R model.
Entity
Single-valued attribute
Relationship
Connector
Derived attribute
Multi-valued attribute
Existence dependency
Identifying relationship
This list is not complete. Other symbols will be covered later. Using these symbols one can
draw an entity-relationship diagram for a database. The ER diagram for the entity
AccountHolder is shown in Figure 2.1 and that for the entity Account in Figure 2.2.
Figure 2.1: ER diagram for the entity: Account Holder (Name, Address, Date of Birth, Date of
Opening Account)
Figure 2.2: ER diagram for the entity: Account (Account Number, Opening Balance)
Cardinality Constraints
Constraints are domain rules that must be followed by the ER components for which
they are defined. A common constraint placed on a relationship is that of cardinality. A cardinality
constraint specifies how many instances of one entity may be related to instances of another by a
relationship.
The relationship between entity sets may be many-to-many (M:N), one-to-many (1:M),
many-to-one (M:1) or one-to-one (1:1). A 1:1 relationship between entity sets E1 and
E2 indicates that for each entity in either set there is at most one entity in the other set
that is associated with it. A 1:M relationship from entity set E1 to E2 indicates that for
an occurrence of an entity from E1, there could be zero, one, or more entities
from E2 associated with it, while each entity in E2 is associated with at most one
entity in E1. In an M:N relationship between entity sets E1 and E2, there
is no restriction on the number of entities in one set associated with an entity in the other
set. The database structure employing the E-R model is usually shown pictorially using an
entity-relationship (E-R) diagram. The cardinality of a relationship may be represented by a
set of pictorial notations called Crow's Foot notation, listed in Table 2.2. These symbols
are paired to form the desired cardinality.
Table 2.2: Crow's Foot notation
Symbol    Meaning
0         Zero
1         Exactly one
M         Many
Paired Notation
The "Crow's Foot" notation represents relationships with connecting lines between entities,
and pairs of symbols at the ends of those lines to represent the cardinality of the
relationship.
To illustrate these different types of relationships consider the following entity sets:
DEPARTMENT, MANAGER, EMPLOYEE, and PROJECT.
The relationship between a DEPARTMENT and a MANAGER is usually one-to-one;
there is only one manager per department and a manager manages only one department.
This relationship between entities is shown in Figure 2.4. Each entity is represented by a
rectangle and the relationship between them is indicated by a direct line. The relationships
from MANAGER to DEPARTMENT and from DEPARTMENT to MANAGER are both
1:1. Note that a one-to-one relationship between two entity sets does not imply that for
an occurrence of an entity from one set there must at all times be an occurrence of an
entity in the other set. In an organization, there could be times when a
department is without a manager or when an employee who is classified as a manager
is without a department to manage. Figure 2.5 shows some instances of one-to-one
relationships between the entities DEPARTMENT and MANAGER.
A one-to-many relationship exists from the entity MANAGER to the entity EMPLOYEE
because several employees report to one manager. As we just pointed out,
there could be an occurrence of the entity type MANAGER having zero occurrences of
the entity type EMPLOYEE reporting to him or her. The reverse relationship, from
EMPLOYEE to MANAGER, is many-to-one, since many employees may be
supervised by a single manager. However, given an instance of the entity set EMPLOYEE,
there could be only one instance of the entity set MANAGER to whom that employee
reports (assuming that no employee reports to more than one manager). The relationship
between these entities is illustrated in Figures 2.12 and 2.13.
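The four cardinality classes can be checked mechanically against sample instances. The sketch below is an illustration, not a design tool: the function `cardinality` and all identifiers such as "M1" or "E1" are hypothetical. A cardinality constraint is a rule about every possible database state, so a sample can only show which class the current instances are consistent with.

```python
from collections import Counter


def cardinality(pairs):
    """Classify a binary relationship set of (left, right) pairs as
    '1:1', '1:M', 'M:1' or 'M:N' from the instances it currently holds."""
    pairs = list(pairs)
    left_counts = Counter(left for left, _ in pairs)
    right_counts = Counter(right for _, right in pairs)
    left_repeats = any(n > 1 for n in left_counts.values())    # one left, many rights
    right_repeats = any(n > 1 for n in right_counts.values())  # one right, many lefts
    if left_repeats and right_repeats:
        return "M:N"
    if left_repeats:
        return "1:M"
    if right_repeats:
        return "M:1"
    return "1:1"


# Made-up instances for the DEPARTMENT, MANAGER, EMPLOYEE and PROJECT entity sets.
manages = {("M1", "Sales"), ("M2", "Accounts")}               # MANAGER to DEPARTMENT
supervises = {("M1", "E1"), ("M1", "E2"), ("M2", "E3")}       # MANAGER to EMPLOYEE
reports_to = {(emp, mgr) for (mgr, emp) in supervises}        # EMPLOYEE to MANAGER
works_on = {("E1", "P1"), ("E1", "P2"), ("E2", "P1")}         # EMPLOYEE to PROJECT
```

A manager with no employees simply contributes no pairs, which is consistent with the "zero, one, or more" reading of the many side.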
Key Constraints
A key is a value (or set of values) that serves to identify instances of an entity or
relationship. As mentioned earlier, entities and relationships can
have attributes, and each entity and relationship can have many instances.
An attribute (or a minimal set of attributes) that distinguishes each instance from every other
is called a key. The choice of key need not be unique: an entity (or
relationship) may have more than one such attribute set. Each of these possibilities is a
candidate key; one of them is designated as the primary key, and the rest are sometimes
called alternate keys. A set of attributes containing
a candidate key is called a superkey.
In an ER diagram a primary key is indicated by underlining the key attribute(s). For
instance, in our banking example, the AccountNumber attribute can be taken as the primary
key for the Account entity because this attribute is capable of identifying
each instance of the Account entity uniquely. This is indicated by underlining the attribute
in the ER diagram (Figure 2.16).
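The idea of an attribute set that identifies every instance uniquely can be tested on sample data. The following is a minimal sketch; the helper names, the extra account rows and the branch codes are hypothetical. Because a key is a property of the design rather than of one particular set of rows, sample data can rule a key out but cannot prove it.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Minimal superkeys, found by trying attribute subsets smallest first."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller key, so it is not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


# Made-up Account instances; only account 1345 appears in the running example.
accounts = [
    {"AccountNumber": "1345", "BranchCode": "BR01", "OpeningBalance": 50000},
    {"AccountNumber": "1346", "BranchCode": "BR01", "OpeningBalance": 50000},
    {"AccountNumber": "1347", "BranchCode": "BR02", "OpeningBalance": 50000},
]
```

On these rows only AccountNumber survives as a minimal identifying set, while any larger set that contains it, such as (AccountNumber, BranchCode), is a superkey.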
Figure: Generalization and specialization of EMPLOYEE (attributes Employee_No, Name, Date_of_Hire) into FULL_TIME_EMPLOYEE (Salary) and PART_TIME_EMPLOYEE (Type) through IS_A links, with further IS_A subtypes Faculty, Staff, Teaching and Casual.
Aggregation
Aggregation is the process of compiling information on an object, thereby abstracting a
higher-level object. In this manner, the entity person is derived by aggregating the
characteristics name, address, and IDnumber. Another form of aggregation is
abstracting a relationship between objects and viewing the relationship as an object. As
such, the ENROLMENT relationship between the entities Student and Course could be
viewed as the entity REGISTRATION. Examples of aggregation are shown in Figure 2.20.
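Treating a relationship as an object can be sketched in Python as follows. All names and values here are hypothetical: the student, the course, the `term` attribute and the fee amount are invented for illustration and do not come from the lesson.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Student:
    id_number: str
    name: str


@dataclass(frozen=True)
class Course:
    code: str
    title: str


@dataclass(frozen=True)
class Registration:
    """The ENROLMENT relationship between a Student and a Course, viewed as an entity."""
    student: Student
    course: Course
    term: str  # hypothetical attribute that belongs to the pairing, not to either side


asha = Student("S001", "Asha")
dbms = Course("CS204", "Database Systems")
registration = Registration(asha, dbms, term="2006-07")

# Because a registration is now an object, it can take part in further relationships.
fee_paid_for = {(registration, 1500)}
```

The point of the sketch is that `Registration` can carry its own attributes and appear in other relationships, which a bare (student, course) pair could not do as conveniently.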
Figure 2.20: Example of Aggregation
2.7 CONCEPTUAL DATABASE DESIGN WITH THE ER MODEL
Database design is a long, arduous process that becomes even more involved for
complex user requirements. The process encompasses all that it takes to fit a database
solution to the users' requirements. Often, users' requirements are not captured
correctly, either because the user is unable to specify the requirements clearly or because the
developer understands them differently. Therefore, the database design process is usually carried
out in an iterative manner. In any case, the database design process goes through the following
broad developmental phases:
Conceptual design: In this phase data models are developed independent of any physical
considerations.
Logical design: In this phase the conceptual design is refined and mapped
onto some database model. The database models to choose from are the relational model,
object-oriented model, network model, hierarchical model etc.
Physical design: In this phase the logical model obtained thus far is mapped onto a
specific Database Management System. Here one has many choices, ranging from
the simple MS-Access to the massive Oracle.
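The three phases can be traced for a single entity in a short Python sketch. Several things here are assumptions made for illustration: the dictionary layout used for the conceptual and logical descriptions, the helper names `to_relation` and `to_sqlite_ddl`, the column types, and the choice of SQLite (through Python's standard `sqlite3` module) as the target system in place of the products named above.

```python
import sqlite3

# Conceptual design: entities and attributes, free of any storage concerns.
conceptual = {
    "Account": {"attributes": ["AccountNumber", "OpeningBalance"], "key": "AccountNumber"},
}


# Logical design: map each entity onto a relation schema of the relational model.
def to_relation(name, entity):
    return {"table": name, "columns": list(entity["attributes"]), "primary_key": entity["key"]}


logical = [to_relation(name, entity) for name, entity in conceptual.items()]


# Physical design: realise the relation in one specific DBMS (SQLite here).
def to_sqlite_ddl(relation, column_types):
    columns = []
    for column in relation["columns"]:
        definition = f"{column} {column_types[column]}"
        if column == relation["primary_key"]:
            definition += " PRIMARY KEY"
        columns.append(definition)
    return f"CREATE TABLE {relation['table']} ({', '.join(columns)})"


ddl = to_sqlite_ddl(logical[0], {"AccountNumber": "TEXT", "OpeningBalance": "INTEGER"})

connection = sqlite3.connect(":memory:")  # throwaway in-memory database
connection.execute(ddl)
connection.execute("INSERT INTO Account VALUES (?, ?)", ("1345", 50000))
rows = connection.execute("SELECT AccountNumber, OpeningBalance FROM Account").fetchall()
connection.close()
```

Only the last step mentions a particular product; the conceptual and logical descriptions would stay the same if a different database management system were chosen.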
Note that in some cases, where a good understanding of the domain data is available
upfront, some of these phases may be skipped and tables and other objects created
right away in a specific database management system. However, real-world situations
are usually far more complex than that. Therefore, for real-world applications, one is
advised to go through all three phases of development. This helps ensure that the database
operates as desired.
Conceptual design is an activity of domain data modelling wherein the data
objects or entities and their relationships are identified and documented. ER diagrams
are extremely helpful in arriving at the conceptual design of a database. Separate ER
diagrams of entities and relationships are laid down to form the complete conceptual
design.
Figure: ER diagram relating the entities EMPLOYEE, DEPARTMENT, PROJECT and DEPENDENT through the relationships Manages, WorksOn, Controls, Supervisor and Depends On, with (min, max) annotations such as (0,1), (1,1), (0,N) and (1,N) on the connectors. Attribute labels include FNam, LNam, Salary, Address, EMPN, Startdate, Hours, DeptN, Name, Location, Relationship and BDate.
2.10 KEYWORDS
Attribute: A property of an entity whose data is to be stored.
Data: A representation of some measurable aspect of some fact.
Database: A repository of well-organized data.
Database administrator: The database user who is responsible for making decisions
and controlling the database.
Database management system: A computerized system that manages and controls an
electronic database.
Data model: A collection of concepts that describes the structure of a database such as
tables etc.
Entity-Relationship model: A data modeling tool that treats each object of the miniworld
as a separate unit having individual attributes and being related to one another.
ER diagram: A graphical representation of entities, attributes and relationships in a
miniworld.
Information: A higher-level meaningful deduction obtained by processing data.
Key attribute: An attribute of an entity that uniquely determines each instance of an
entity.
Metadata: Data that describe data stored in a database.
Miniworld: (Also called Universe of Discourse) The small part of the world whose
related data are stored in a particular database.
Process: The set of activities that derives information from the given data.
Schema: The description of the data models in a database.
Weak entity: An entity type that does not have a key attribute of its own.