You are on page 1of 51

Database Concepts EGS 2207

Database design
Lecture outline

1. Database design stages


2. Entity Relationship Model
3. Drawing Entity Relationship Diagrams
4. Class Exercise
Database design process

1. Requirements analysis
2. Conceptual database(db) design
3. Choice of the DBMS
4. Data model mapping(Logical design)
5. Physical design
6. Implementation
 feedback loops exist, i.e. may need to revisit earlier

stages during a later stage


 Conceptual design and logical design are DBMS

independent while physical design is DBMS specific.


Requirements collection and
analysis
 Document the data requirements of the users
 Functional requirements are the operations that can be
applied to the database including queries and updates.
 Specification used as the basis of the design.
 Typical activities:
 Identification of application areas and user groups

 Analysis of existing documentation of application areas

eg policy documents, forms,reports,organisational


charts
 Analysis of current operating environments and the

planned use of the information eg information flow ,


types of transactions
 Responses to user questionnaires are analysed.
Cont

 Start from a description of the requirements which is:


 poorly structured,
 heterogeneous
 informal
 And use a technique to transform that into a specification
of the database requirements which is:
 formal
 homogeneous
 consistent
 complete
Conceptual design

 To produce a conceptual schema of the db


 Two sub tasks:
 Schema design
 Transaction design
 Schema design:
 Expressed using concepts of the high level data model
 No implementation details(should be non-technical so
users may understand)
 But should be detailed in terms of the objects of the
domain the db will represent.
 Independent of the DBMS
cont

 Transaction design:
 examines the database applications whose
requirements were analysed in phase 1 and produces
high level specifications for these transactions

 The goal is to achieve understanding of database


structure, semantics ,interrelationships and constraints.
cont

 Need to be expressed in a “language” which offers:


 expressiveness: able to distinguish between different

types of data,relationships and constraints


 simplicity: easy for non-specialist users to understand

and use concepts


 minimality: small number of basic concepts that are

distinct and do not overlap


 diagrammatic representation: for ease of presentation;

it should therefore be easy to interpret


 formality: must represent a formal, unambiguous

specification of the data


 Some of these requirements sometimes conflict
 Most popular data model used is the Entity-Relationship
(ER) model
Conceptual schema design

 Purpose: to produce a conceptual schema of the database


 Expressed using concepts of the high level data model
 Not including implementation details(has to be
understood by non-technical users)
 But detailed in terms of the ‘objects’ of the domain the
database will represent.
 Independent of the DBMS to be used
 Cannot be used directly to implement the database
 Design is made in terms of a semantic or conceptual
model
Transaction Design

 Purpose: to produce a design of the transactions, that will


run on the database
1. retrieval: retrieve data for display or as part of a report
2. update: enter new data or amend existing data
3. mixed: more complex applications may do both retrieval
and update
Why?
 need to be sure to include in the conceptual schema all

information required by transactions


 relative importance and frequency of use of

transactions will influence physical database design


Choosing a DBMS

 Purpose: establish which is the best framework for


implementing the produced schema.
 Choice made on the basis of technical factors
 the DBMS has to support the required tasks

 Of economic factors
 software acquisition/maintenance, hardware

acquisition,creation/conversion, training of staff


 And of organisational factors:
 platforms supported, availability of vendor services
Logical design

 To transform the generic , DBMS independent conceptual


schema into the data model of the chosen DBMS
 Two stages:

1. system independent mapping: no consideration of any


specific characteristics that may apply to the specific
DBMS package
2. Tailoring to DBMS: different DBMSs may implement the
same data model in slightly different ways
 Result is a set of DDL statements in the language of the

chosen DBMS.
 Some CASE tools generate DDL statements from a

conceptual design
Physical design

 To choose the specific storage structures and access


paths for the db files.
 Attention is placed on performances
 Response time: minimise data access time for data
items referenced by frequently used transactions.
 Space utilisation: less frequently used data and queries
may be archived.
 Transaction throughput: average number of
transactions that can be processed per minute.
Implementation

 To create the db
 Compile and execute DDL statements
 Populate the db
 Manually or automatically(may need to convert data
from a previous format)
 Implement application programs(Transactions).
 Programs are written with embedded DML statements.
 Operational phase may begin
PART II
ENTITY RELATIONSHIP MODEL
• Its a tool commonly used to:
–Translate different views of data among end users
and fits it in a common framework
–Define data processing and constraint
requirements to help meet different views
–Help implement the database
Database model using ER modelling
• System requirements analysis
–The first step in developing a database model using ER
modelling is to consider the requirements of the system.
–The requirements are typically gathered from a scope
document
ERD Development Process

 Identify the entities


 Determine the attributes for each entity
 Select the primary key for each entity
 Establish the relationships between the entities
 Draw an entity model
 Test the relationships and the keys
A Simple Example

 STUDENTs attend COURSEs that consist of many


SUBJECTs.
 A single SUBJECT (i.e. English) can be studied in many
different COURSEs.
 Each STUDENT may only attend one COURSE.
Cont....
• Identifying entities in ER modeling
–Entities are objects or things that can be described by
their characteristics.
–An entity represent a real world object or concept eg
student, courses
–Its a set and not a single occurrence
• Include :
–Weak
–Recursive
–strong
WEAK ENTITY
• Its existence dependent
–Cant exist without an entity with which it is related
• Has a primary key that is partially or totally derived
from the parent entity in the relationship.
• Represented by a double rectangle
RECURSIVE ENTITY
• Its where a relationship exists between
occurrences of the same entity type.
• The condition is found in a unary relationship
Eg a course is a prerequisite of many courses
An employee manages many employees
STRONG ENTITY
• Not existence dependent
• As we identify entities, we list the attributes that describe
the entity
ATTRIBUTES
• Identifies or describes an entity
• Has a domain that is the set of possible values
• May share a domain
• Types of attributes include:
– Key attributes
– Simple attributes
– Composite attributes
– Single valued attributes
– Multi valued attributes
– Derived attributes
Attributes

 An attribute is a property or characteristic of an entity type, for


example the entity EMPLOYEE may have attributes
Employee_Name and Employee_Address.
 In ER diagrams place attributes name in an ellipse with a line
connecting it to its associated entity
 Attributes may also be associated with relationships
 An attribute is associated with exactly one entity or
relationship
Simple versus composite
attributes
 Some attributes can be broken down into meaningful
component parts, such as Address, which can be broken
down into Street_Address, City..etc.
 The component attributes may appear above or below the
composite attribute on an ER diagram
 Provide flexibility to users, as can refer to it as a single unit or
to the individual components
 A simple (atomic) attribute is one that cannot be broken down
into smaller components
A composite attribute
Single-Valued versus Multivalued
Attribute
 It frequently happens that there is an attribute that may have
more than one value for a given instance, e.g. EMPLOYEE
may have more than one Skill.
 A multivalued attribute is one that may take on more than one
value – it is represented by an ellipse with double lines
Entity with a multivalued attribute (Skill)
and derived attribute (Years_Employed)
Stored versus Derived Attributes

 Some attribute values can be calculated or derived from


others
 e.g., if Years_Employed needs to be calculated for
EMPLOYEE, it can be calculated using Date_Employed and
Today's_Date
 A derived attribute is one whose value can be calculated from
related attribute values (plus possibly other data not in the
database)
 A derived attribute is signified by an ellipse with a dashed line
Key Attributes

 Certain attributes identify particular facts within an entity,


these are known as KEY attributes.

 The different types of KEY attribute are:


 Primary Key
 Composite Primary Key

 Foreign Key
Key Definitions
 Primary Key:
 One attribute whose value can uniquely identify a
complete record (one row of data) within an entity.
 Composite Primary Key
 A primary key that consists of two or more attribute
within an entity.
 Foreign Key
 A copy of a primary key that exists in another entity
for the purpose of forming a relationship between
the entities involved.
Identifier attribute

 Identifier attribute or Key is an attribute (or combination of


attributes) that uniquely identifies individual instances of an
entity type, such as Student_ID
 To be a candidate identifier, each entity instance must have a
single value for the attribute, and the attribute must be
associated with each entity
 The identifier attribute is underlined, such as Student_ID
Simple and composite key attributes
(a) Simple key attribute
Composite Identifier

 A Composite Identifier is when there is no single (or atomic)


that can serve as an identifier
 Flight_ID is a composite identifier that has component
attributes Flight_Number and Date – this combination is
required to uniquely identify individual occurrences of Flight
 Flight_ID is underlined, whilst its components are not
(b) Composite key attribute
Criteria for selecting identifiers

Some entities have more than one candidate identifier, so the


following criteria should be used:
Choose identifier that will not change in value over the life of each
instance of the entity type
Choose identifier that is guaranteed to have valid values and Will
not be null (or unknown). If composite, make sure all parts will
have valid values
Criteria for selecting identifiers

Avoid the use of intelligent identifiers whose structure indicates


classifications, locations or people that might change. e.g. the
first two digits of an identifier may indicate a warehouse
location, but such codes are often changed as conditions
change, which renders them invalid.
Consider substituting new, simple identifiers for long, composite
ones, e.g. an attribute called Game_Number could be used for
the entity type GAME instead of Home_Team and Away_Team
RELATIONSHIPS
• Identifying relationships in ER modeling
• Its an association between entities
• They have names in the form of verbs
• Its degree indicates the number of associated
entities or participants e.g
Unary
Binary
Ternary
Recursive
Cont....
• Cardinality refers to the three possible relationships
between two entities
• relationships can exist between as many entities as
there are in the model
Unary relationship

 Is between the instances of a single entity type (also called


recursive relationships)
 ‘Is_Married_To’ is a one-to-one relationship between instances
of the PERSON entity type
 ‘Manages’ is a one-to-many relationship between instances of
the EMPLOYEE entity type
Binary relationships

 Between the instances of two entity types, and is the most


common type of relationship encountered in data modelling.
e.g. (one-to-one) an EMPLOYEE is assigned one
PARKING_PLACE, and each PARKING_PLACE is assigned
to one EMPLOYEE
 e.g. (one to many) a PRODUCT_LINE may contain many
PRODUCTS, and each PRODUCT belongs to only one
PRODUCT_LINE
 e.g. (many-to-many) a STUDENT may register for more than
one COURSE, and each COURSE may have many
STUDENTS
Ternary relationships

 A ternary relationship is a simultaneous relationship among


the instances of 3 entity types
 It is the most common relationship encountered in data
modelling
Ternary relationships

 Here, vendors can supply various parts to warehouses


 The relationship ‘Supplies’ is used to record the specific
PARTs supplied by a given VENDOR to a particular
WAREHOUSE
 There are two attributes on the relationship ‘Supplies’,
Shipping_Mode and Unit_Cost
 e.g. one instance of ‘Supplies might record that VENDOR X
can ship PART C to WAREHOUSE Y, that the Shipping_Mode
is ‘next_day_air’ and the Unit_Cost is £5-00 per unit
One-to-one

• represented by a line without any annotations that


joins two entities.
• One-to-one means that for the two entities
connected by the line, there is exactly one instance
of the first entity for each one instance of the
second entity.
.
One-to-many (or many-to-one)

• A one-to-many relationship is represented by a line


annotated with a 1 and an M (or a 1 and an N).
• One-to-many means that for the two entities
connected by the line, there are one or more
instances of the second entity for each one instance
of the first entity.
• Are the most common relationships between
• entities.
Many-to-many

• A many-to-many relationship is represented by a


line annotated with an M and an N.
• Many-to-many means that for the two entities
connected by the line, each instance of the first
entity is related to one or more instances of the
second entity
E-R MODEL COMPONENTS
• E-R diagram
–Represent the conceptual database as viewed by
end users
–Diagram depicts the model’s main components:
• Entities
• Attributes
• Relationships
E-R Diagram features
• Rectangles
–Represent entities—that is, objects being modeled.
–Each entity is labeled with a meaningful title.
• Diamonds
–Represent relationships between entities;
–a relationship is labeled with a descriptive title that
represents how the entities interact.
CONT....
• Ellipses
–Represent attributes that describe an entity.
• Lines
–Connect entities to relationships.
–Lines may be without any annotation, be annotated with
an M and an N, or annotated with an M and a 1 (or an N
and a 1).
–Annotations indicate the cardinality of the relationship
–Connect attributes to entities. These lines are never
labeled
E-R Modeling Tools

You might also like