
Chapter Three

Requirement Analysis

1
Overview
• This chapter covers the activities that are usually
carried out during the Requirements Gathering
step of the Analysis Phase of the SDLC.
• It covers some tools and techniques that can be
used to produce the deliverables from the phase.
The next section will cover the techniques used
to structure the requirements.

2
SDLC Analysis Phase
• The purpose of the Analysis phase is as follows:
– To determine how the current information system in
the organization functions
– To assess what users would like to see in the new
system i.e. gather requirements
– To structure the requirements so that they can be
translated into technical system specifications

3
SDLC Analysis Phase
• In order to achieve this purpose, the following
activities are carried out:
– Analysis of the current system – in order to
understand the system and the problems in it
– Gathering of requirements - from users, for the new
system
– Requirements Definition for the new system
– Data Definition – based on the Requirements
– Define Business Rules – based on the Requirements

4
SDLC Analysis Phase
• The deliverables from the phase are:
– Detailed requirements for the new system
– Description of the current system – that clearly describes the
current system, whether it is computer-based or not.
– High-level, initial design for the new system – based on the
Detailed Requirements and on the Description of the Current
System
– System specification, comprising:
• Revised Project Goal and Project Plan
• Project Development Plan
• Test Plan

5
Requirements Analysis
• Requirements Analysis consists of two steps:
– The first step is to gather the requirements for the
new system and to understand the current system (if
there is one)
– The second step is to structure them in order to
develop an initial design and a system specification
for the new system.

6
Gathering Requirements
• The process of gathering requirements involves the
Systems Analyst going out and talking to users. There
are some traditional methods, such as interviewing,
and some more modern methods, such as Joint
Application Development (JAD).
•  As a result of this process, the Systems Analyst should
understand:
– Business objectives that drive what work is done and how it
is done
– The information users need to do their jobs
– The data that is handled within the organization
– When, how and why the data are moved, transformed &
stored (the data flows & processing logic)
– The sequence of different data-handling activities and
dependencies between them
– The rules for how data are handled and processed (business
rules)
– Policies and guidelines that describe the nature of the
business, the market and environment in which it operates
– Key events affecting data values and when these events occur

8
Traditional Methods/Techniques
• The traditional methods for Requirements Gathering
involve asking questions and observing users at work.
These methods are still in use, but they are sometimes
supplemented with additional modern techniques.
Asking Questions - Interviewing
• Interviewing is very commonly used during
Requirements Analysis. It is necessary to:
– Interview users about their work, what information they
use to do their work, where they find problems in the
system.
– Interview managers in order to get an understanding of the
policies and strategy of the organization and also to understand
what expectations managers have of their staff who use the
system.
• Interviewing is essentially about gathering facts, opinions
and speculation. It is also about observing people – as
body language and emotions may provide more
information about what they want and need and how
they assess the current system. Some people may be
interviewed more than once, in order to review or clarify
information and/or to obtain more information.
10
Traditional Methods/Techniques
• Interviews should be planned and structured in
order to gain the most from them. It is not
necessary to interview every person involved
with the current or new system – a cross-section
of people should be interviewed (all should be
people who are within the Scope of the system).
• A typical interview structure is shown in the figure
below.

11
Preliminaries
– Introduction (who you are, and what project you are working on)
– Define your goal for the interview (list of points)
– Define procedures (e.g. permission to take notes, how long it will take)

Body of the interview
– Define what you believe to be true and confirm it (in relation to the
person’s job, use of the system etc)
– Follow up points in more detail (get more info, make sure you understand)
– Define new areas of interest

Conclusion
– Summarize findings and verify them

12
Traditional Methods/Techniques
• The questions you intend to ask can also be written out
before the interview – but remember that you can ask more
questions based on responses received during the interview.
•  There are different types of question that can be asked.
• Open-ended questions – when there are many possible
answers or when you do not know the precise question to
ask. These are general questions and can be used to establish
the person’s viewpoint on the subject. For example:
–  “What do you think is the best thing about the information system you
use in your work?”
– “How relevant is the monthly sales forecast report for your work?”

13
Traditional Methods/Techniques
• Closed-ended questions – where there tends to be a specific
answer; sometimes a range of answers can be provided from which
to choose one.
• For example:
– “Where do you send your summary of daily orders?”
– “Which of the following would you say is the one best thing about the
information system you use in your work (pick only one)?
• having easy access to all of the data that you need
• the system’s response times
• the ability to run the system concurrently with other applications”
• The possible answers could be True/False, multiple choice (choosing
one or more), rating on a scale, ranking in order of importance.

14
Guidelines for conducting interviews:
• Open questions should not be ‘leading’ – that is, they
should not suggest an answer to the interviewee.
• All questions should be phrased in a way that does not
suggest that there is a right and a wrong answer – the
purpose is to find out what the users think and how they
use the system - they should not have to give an answer
that they think someone else wants to hear.
• Use a combination of question types – e.g. use an open
question to find out what the user does and then ask a
set of closed questions based on that answer.
• Closed questions are good to use if the person being
interviewed appears to be uncooperative or not willing to
speak.
• Listen carefully and take notes during the interview.
• Type up your notes (e.g. in MS Word) as soon afterwards as
possible – so that you do not forget any of the information you
gathered.
• Conducting interviews is a time-consuming and expensive
process – as the analyst needs to spend time with all the
relevant people. However, interviews are an effective way to
communicate and to obtain important information.
16
Asking Questions - Questionnaires
• Questionnaires are a less time-consuming way to obtain
information from people in the organization.
• A questionnaire can be used to get information from many
people in a relatively short period of time.
• However, they can be less effective at extracting the
relevant information.
• A questionnaire is a list of printed questions which must
be filled in by the respondents to the questionnaire. The
questions seek to find out the information required by the
systems analyst.
• The questionnaires are sent out to the selected users
and the replies must then be analysed by the systems
analyst.
• A different questionnaire can be sent to different
groups of people e.g. one to sales people, one to data
entry clerks, one to managers. Those to whom the
questionnaire is sent should be representative of all
users – it is not necessary to send to all users. Criteria
by which to select those to whom to send can be one
or more of the following:

– Those who are convenient to sample – people at a local site,
people who are willing to be surveyed, those who have most
motivation to respond
– A random group – randomly select people from a list of all users
(e.g. by selecting every nth person on the list)
– A selected sample – specify only people who meet certain criteria
e.g. people who have used the current system for more than 2
years or those who use it most often
– A stratified sample – use where there are several categories of
people you want to include e.g. users, managers, customers who
use the system – and choose a random group from each category
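The sampling criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the user list, the category names and the function names are all hypothetical, and the "random group" follows the slide's own example of taking every nth person on the list.

```python
import random

# Hypothetical user list: (name, category, years using the current system).
users = [
    ("Abebe", "manager", 5), ("Sara", "clerk", 1), ("Tom", "clerk", 3),
    ("Lily", "sales", 4), ("Omar", "sales", 1), ("Hana", "manager", 2),
    ("Musa", "clerk", 6), ("Ruth", "sales", 3),
]

def every_nth(people, n):
    """Random group as in the slide's example: every nth person on the list."""
    return people[n - 1::n]

def selected_sample(people, min_years):
    """Selected sample: keep only people who meet a criterion."""
    return [p for p in people if p[2] > min_years]

def stratified_sample(people, per_category, seed=0):
    """Stratified sample: a random group drawn from each category."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_category = {}
    for person in people:
        by_category.setdefault(person[1], []).append(person)
    sample = []
    for members in by_category.values():
        sample.extend(rng.sample(members, min(per_category, len(members))))
    return sample

print([p[0] for p in every_nth(users, 3)])        # every 3rd person
print([p[0] for p in selected_sample(users, 2)])  # more than 2 years' use
print([p[1] for p in stratified_sample(users, 1)])
```

A convenience sample is not shown because it has no selection rule to code: it is simply whoever is easy to reach.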

19
• When questionnaires are returned, they should be
checked for ‘nonresponse bias’ – this means that the
people who chose to respond may be different to those
who chose not to respond. In that case, you may be
missing vital information from the non-respondents –
e.g. a lack of interest in the system because they do not find it
useful at all.
•  A questionnaire is more suited to questions that have
quantifiable answers e.g. “how many orders a day do
you process?” i.e. closed-ended questions.
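One simple first check for nonresponse bias is to compare response rates across user groups. The sketch below assumes hypothetical counts and an arbitrary 50% threshold; a low rate in one group does not prove bias, it only flags where the analyst may need to follow up.

```python
# Hypothetical counts: questionnaires sent and returned per user group.
sent = {"sales": 40, "clerks": 50, "managers": 10}
returned = {"sales": 30, "clerks": 10, "managers": 8}

def response_rates(sent, returned):
    """Return the fraction of questionnaires returned by each group."""
    return {group: returned.get(group, 0) / sent[group] for group in sent}

def under_represented(sent, returned, threshold=0.5):
    """Groups whose response rate falls below the (assumed) threshold."""
    rates = response_rates(sent, returned)
    return sorted(group for group, rate in rates.items() if rate < threshold)

print(response_rates(sent, returned))
print(under_represented(sent, returned))
```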
• Questionnaires are usually administered by sending a
paper copy out to all desired respondents. However,
they can be completed by other methods, such as
the following:
– An interviewer asks the questions and fills in the answer –
either in person or over the phone
– A soft copy is sent to respondents by email or on a flash drive and
they send back the completed document
– The questionnaire is produced as a web-based form that
can be filled in on an intranet or internet site
21
Observation
• It is also possible to gather information about a system by watching
the users of the system at work.
• This method can bring more objective information – as the analyst
can see what the person does (behaviour), rather than what they
say they do.
• This can be used to supplement or confirm the information
obtained by asking questions.
•  Observation can take place in two ways:
– by the analyst participating in the user's work e.g. becoming a member of
the team for a week
– by watching the user(s) at work – this can be done in person or by video
camera

22
• The advantage of observation over asking questions is
that the analyst can see what the user's behaviour and
interactions with the system actually are, not what they
say they are.
• There are also disadvantages to observation - while
being observed, people may not behave normally (if
they know they are being observed) – they may change
their behaviour. Also, the time at which the observation
takes place may mean that the analyst sees only a small
subset of the work done by the user.
23
Analysing Documents (Procedures, Manuals etc)

•  This can provide information about:


– Problems with existing systems (e.g. missing information,
redundant steps)
– Possible new features that can be added to existing
systems, if certain new information is now available (e.g.
analysis of sales based on customer type)
– Special circumstances that occur irregularly but may not be
identified by any other requirements gathering technique
– Data definitions, rules for processing data (business rules)
in the system

24
• Relevant types of documentation include:
– Written work procedures for an individual role or a work
group – a work procedure describes how a particular job or
task is carried out; it includes the data and information
used and created in the process.
– Business forms – forms are used for many business
functions e.g. in a bank, there are forms for making
deposits/lodgements, withdrawals, transfers etc. Forms
can provide good understanding of a system because they
explicitly show what data flows in and out of a system

– A printed form can correspond to a screen in the
proposed or existing system, on which a user can
view, enter or edit data. Forms that already contain
actual data are even more useful, as they show the
types of data in the system.
– Reports generated by current systems – it is possible
to work back from the information on the report to
understand what data was used to generate the
report.

– Analysis of reports can determine what data needs to be
captured over time and over what time periods, and how
the data needs to be manipulated or transformed.
– Documentation about current computer systems – if the
team that analysed, designed and built the existing system
also produced good documentation about the system
(specifications, test plans, user manuals etc) then this can
be a good source of information for the new system.
–  All of the above, but especially work procedures, provide
information about the formal system. The formal system is
the one that has been documented by the organization.

27
• All those methods (interviewing, questionnaires,
observation, analysis of documentation) can be used to
gather requirements and build up information about
the current system.
• Interviewing and observing can help to gather
knowledge about the informal system – what people
actually do in their work. Analysing documentation and
questionnaires can help to gather knowledge about
the formal system – what is expected of the system.

28
Comparison of Methods
• In a given situation, the choice may depend on the
available resources and on the nature of the project
• Richness of information
– Interviews: High
– Questionnaires: Medium to low
– Observation: High
– Document Analysis: Low (passive) and old
• Time required
– Interviews: Can require a lot of time
– Questionnaires: Not as much time required
– Observation: Can require a lot of time
– Document Analysis: Not as much time required
• Expense
– Interviews: Can be high – time & travel
– Questionnaires: Some, not as expensive
– Observation: Can be high
– Document Analysis: Not expensive
• Chance to follow-up and probe for more information
– Interviews: Good – can ask more questions during the interview
– Questionnaires: Limited – can possibly do after answers received back
– Observation: Good – can ask questions during & after
– Document Analysis: Limited – can only do if original author is available
• Confidentiality
– Interviews: Interviewer knows the interviewee
– Questionnaires: Respondents can be unknown – may be more likely to give
honest feedback if anonymous
– Observation: Observed person is known to observer; may change behaviour
– Document Analysis: Depends on the document
• Involvement of subject
– Interviews: Interviewee is involved and committed (if interviewer takes
right approach)
– Questionnaires: Respondent is passive – no clear commitment
– Observation: Observed person may or may not be involved & committed –
depends on whether they know if they are being observed
– Document Analysis: None
• Potential Audience
– Interviews: Limited numbers (due to time & cost constraints) but do get
complete responses
– Questionnaires: Large but response from all is not guaranteed, and this
can bias the results
– Observation: Limited numbers and limited time for each
– Document Analysis: Potentially biased – some documents kept, others not;
document not created for this purpose
29
Modern Methods/Techniques
• This section covers some modern techniques used for
Requirements Analysis. These are often used to
supplement the traditional techniques and they support
the next step of structuring requirements.
Joint Application Design (JAD)
• JAD is a technique that is used to speed up the process of
Requirements Analysis.
• It involves gathering together all the key people and, in
one session, attempting to identify what needs to be
done.
30
Person/People – Role
• Sponsor – A senior person in the organisation who is supporting and
providing the budget for the development; usually attends
only at start and end of session
• Session leader – To organize and run the JAD – should be trained in
facilitation and group management and is usually a systems
analyst also. Sets the agenda, resolves conflicts and
disagreements, keeps the session on track and gets all ideas
from participants
• Users – Key users of the existing and new system
• Managers of user groups – To provide information about the organization's
direction and the motivations and impacts of the systems; also to
support the requirements determined during the JAD.
• Systems analysts – To learn from users and managers; may participate and
advise on IS issues.
• Other IS staff (programmers, database designers etc) – To learn from the
discussion; may also contribute ideas on technical feasibility or
limitations of systems.
• Scribe – To take notes during the sessions, and record the outcomes
of the meeting (on a laptop or PC if possible)

31
Business Process Re-engineering (BPR)

• BPR is a process in which existing methods of doing
business are replaced with new and updated methods.
• In many businesses, the systems in place consist of
programs that were written some time ago and still use
old methods and technologies. They often support
only one business unit.
• These are called legacy systems. These systems were
often designed around the processes that took place in
the business unit.

32
• In some organizations, the management is looking for new
ways to perform current tasks, i.e. to improve or change the
way things are done by changing the business
processes. This is called Business Process Re-engineering.
• The new ways of doing things may be very different from
the old ways, but the benefits may be big because
processes become more efficient.
•  Because the legacy systems were built around the
business processes, a business re-engineering process
often requires changes to the information systems also.

33
Prototyping
• Prototyping can mean a number of things – testing a new
idea, a first step in development, or a technique
for Requirements Analysis.
• The aim of prototyping in Requirements Analysis is to get
users more involved in the requirements definition. After
asking questions and analysing documentation, the basic
requirements can be converted into a limited working
version of the system.
• This can be used to demonstrate the proposed new
system to users, and to get their feedback on it.
• Prototyping is often used to show users what the screens
and/or forms in a computer system will look like. The back-
end functionality does not have to exist – just the front-end or
the screens, so the user can see the layout, the information
and the navigation menus. This is called interface prototyping.
A sequence of screens that reflect the business process can be
put together – this is known as a storyboard.
• This technique is often used in web-site development – as
screens can be quickly and easily built using HTML – it is not
necessary to build or write the code to carry out other
functions that take place behind the scenes.

35
• Prototyping is most useful in the following
circumstances:
– User requirements are not clear or well understood
e.g. for a totally new system
– Only one or a few users involved
– Possible designs are complex
– Communication problems have existed in the past,
between users and analysts

36
• It also has some disadvantages:
– Analysts may avoid creating formal documentation of
the system
– The prototype may be very influenced by the initial
user who reviews it – and thus might not be
adaptable to other users
– Often built as a stand-alone system, which may result
in interfaces with other systems being ignored during
this phase

37
Structuring Requirements

38
Overview
• This part covers some tools and techniques that
can be used to structure the requirements
gathered during the Requirements Analysis phase.
• It describes Process Modelling and Data
Modelling, and the structured analysis techniques
used for each - Data Flow Diagram (DFD) and E-R
(Entity Relationship) analysis, respectively. DFDs
are looked at in detail whilst E-R analysis will be
covered in a subsequent chapter.
39
•  It was stated that the purpose of the Analysis phase is as
follows:
– To determine how the current information system in the
organization functions
– To assess what users would like to see in the new system i.e. gather
requirements
– To structure the requirements so that they can be translated into
technical system specifications.
•  This part covers some tools and techniques that can be used
for step 3 – to structure the requirements gathered. This is
the second part of the process called 'Requirements Analysis'.

40
Structuring System Requirements
• The information gathered during the Requirements
Gathering process needs to be organized into a form that
is a meaningful representation of the existing system and
of the requirements for the new system.
• This is done by producing models – of the processing
elements and data transformations and then of the
structure of the data.
• This entire process is structuring of requirements.
• These models can then be used as a starting point of the
Design phase of the SDLC.
41
• There are two stages in the structuring process:
– Process Modelling
– Conceptual Data Modelling.
• Process Modelling involves graphically representing
the processes, or actions, that capture, manipulate,
store and distribute data between a system and its
environment and between components within the
system.
– A data flow diagram (DFD) is a commonly used form of
a process model, and is the technique we will look at.

42
• Conceptual Data Modelling involves representing
the data in a system or organization, to show the
overall structure of the data. A data model should be
independent of any DBMS & other implementation
considerations.
– Entity-relationship (E-R) data models are commonly
used diagrams that show how data is organised in a
system.
•  Normally, Process Modelling and Data Modelling
take place at the same time, with different teams
working on each.

43
Process Modelling – Using Data Flow Diagrams
(DFDs)
• A DFD is a diagram that shows the movement of
data between external entities and the processes
and data stores within a system.
• As can be seen from the deliverables in the table,
the diagrams evolve from the general to the
specific – they get more detailed as we move
from deliverable 1 to deliverable number 4.

44
 Symbols
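• There are two sets of standard data flow diagram
symbols (DeMarco and Yourdon; Gane and Sarson). Each set
consists of four symbols, representing processes, data flows,
data stores and external entities (sources/sinks):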

45
Context Diagram

• Example Context Diagram for the Food Ordering System
of a restaurant. This provides a general overview of
the system.
• Things to note:
– The Context Diagram shows only one process – this
represents the entire system.
– There are three External Entities:_______, ________
and________. These entities represent the boundary
between the system and its environment.

46
– There are data flows between the External
Entities and the process. The data in the system
must come from somewhere, and if data goes out
of the system, it must go somewhere.
– These data flows represent the interfaces between
the system and its environment.
– There are no data stores on a Context diagram –
this is because the data stores are conceptually
inside the process – so they are not shown until
we start decomposing the 'Food Ordering System'
process.

47
• The next step is to consider the context diagram,
and think about what processes are inside the single
process. This will produce the next diagram – the
level-0 diagram.

48
Level-0 Diagram

• The context diagram provides a general overview
of the system. Further DFDs are used to focus on
the details of the context diagram, starting with a
level-0 diagram.
• A level-0 diagram represents the major processes,
data flows and data stores of the system, at a high
level of detail. The level-0 diagram for this example
is shown in Figure below.
 
49
• Things to consider when drawing level-0 diagram:
– The same external entities will appear on this diagram –
they provide the sources and sinks for system inputs and
outputs.
– In this diagram, the 0 process is decomposed into further
processes – these are numbered 1.0, 2.0 and so on – to
identify them as belonging to the level-0 diagram.
– To think about how to decompose the 0 process, look for
functions that do the following actions:
• Capturing data from different sources
• Maintaining data stores
• Producing and distributing data to different sinks
• High-level descriptions of data transformation operations

50
51
• Things to note about this diagram:
– It does not show timings e.g. for a data flow, it does
not show when or how frequently the data is
transferred, or the volume of data that is transferred.
The DFD hides physical characteristics about data
and processes.
– Some processes are coupled – this means that when
a process, process A, produces a data flow that
becomes input to another process, process B, then
process B must be ready to accept the data.

52
– On the diagram in Figure above, process _____ and
process _____ are coupled, also process _____ and
process _____.
– Some are decoupled – for example, if the data flows
into a data store and then into a further process. In
this case, the second process can accept the input at
any time. Process ____ and process ____ are
decoupled in the diagram above; also process _____
and process _____.
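The coupled/decoupled distinction can be checked mechanically once a diagram is written down as a list of flows. The sketch below uses a hypothetical level-0 diagram, not the figure referred to above; the process numbers, store names and function names are assumptions made for illustration.

```python
# Hypothetical level-0 diagram, written as (source, target) data flows.
processes = {"1.0", "2.0", "3.0", "4.0"}
stores = {"D1", "D2"}
flows = [
    ("Customer", "1.0"),
    ("1.0", "2.0"),   # process feeds process directly
    ("1.0", "3.0"),
    ("2.0", "D1"),    # process writes to a data store
    ("3.0", "D2"),
    ("D1", "4.0"),    # a later process reads from the store
    ("D2", "4.0"),
    ("4.0", "Manager"),
]

def coupled_pairs(flows, processes):
    """Pairs joined by a direct flow: the receiver must be ready to accept it."""
    return sorted((s, t) for s, t in flows if s in processes and t in processes)

def decoupled_pairs(flows, processes, stores):
    """Pairs joined only through a data store: the reader can run at any time."""
    pairs = set()
    for writer, store in flows:
        if writer in processes and store in stores:
            for source, reader in flows:
                if source == store and reader in processes:
                    pairs.add((writer, reader))
    return sorted(pairs)

print(coupled_pairs(flows, processes))
print(decoupled_pairs(flows, processes, stores))
```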

53
Decomposition – Level-n Diagrams

• The act of breaking the single system into separate
processes is called functional decomposition. The
above diagram can be further decomposed, to break
each of the 4 processes into further processes.
• This results in a set of hierarchically related
diagrams, in which one process on a given diagram
is explained in greater detail on another diagram.
Processes are broken down into sub-processes.

54
• Decomposition can continue until no sub-process
can be logically broken down any further.
• The lowest level of a DFD is called a primitive DFD.

• This hierarchy is illustrated in the figure below. The
levels can continue to level-3, level-4 and so on, until
primitive DFDs are reached.
• As a general rule, no DFD should have more than
about 7 processes in it, as this would make the
diagram too crowded and difficult to understand.

55
Hierarchy of DFDs

56
Class Exercise
• Consider a bank accounting system. The exercise is to
produce DFDs for the personal banking system. The personal
banking system is the system that allows customers of the
bank, who are individuals, to open accounts, deposit into
accounts and withdraw from accounts.
• Draw the Context Diagram for the Personal Banking
System. This should contain at least one external entity
(the customer) and more if you think it appropriate.
There will also be data flows between the external
entities and the process that represents the system.

57
• Draw the Level-0 diagram for the Personal
Banking System. This diagram will decompose
the single process in the Context Diagram into a
number of sub-processes. Do not try to break the
process down too much – this can be done on
the higher-numbered (more detailed) diagrams.
• Draw a Level-1 diagram for each process on
the Level-0 diagram that needs further decomposition, if any.

58
Reading Assignment
• Rules & Guidelines for Data Flow Diagramming
• What is “Balancing DFDs”? and what rules and
guidelines are there in relation to Balancing DFDs

59
Rules & Guidelines for Data Flow
Diagramming
General guidelines
• Objects should have unique names
• Use meaningful names
• The inputs to a process are different from the outputs of
that process; if the same data flows both in and out, the
process must also produce other new data flows

60
Rules
Process
• No process can have only outputs. If an object has only
outputs, then it must be a source/external entity.
• No process can have only inputs. If it has only inputs,
then it must be a sink/external entity.
• A process has a verb phrase label, to indicate the action
carried out by the process. The verb used is often the
same as the verb that might be used in a computer
programming language – e.g. print, read, write, sort

61
Data Store
• Data cannot move directly from one data store to
another – data must be moved by a process
• Data cannot move directly to/from an outside sink
or source (external entity) from/to a data store. Data
must be moved by a process.
• A data store has a noun phrase label that indicates
what is stored in the data store
External Entity
• Data cannot move directly from a source to a
sink.
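The Process, Data Store and External Entity rules above lend themselves to an automated check. The following is a minimal sketch, assuming a diagram is described by a dictionary of element kinds and a list of (source, target) flows; the element names and the `check_rules` function are hypothetical.

```python
# Hypothetical diagram: every name below is invented for illustration.
kinds = {
    "Customer": "entity",
    "Take order": "process",
    "Print report": "process",
    "Orders": "store",
    "Archive": "store",
}
flows = [
    ("Customer", "Take order"),
    ("Take order", "Orders"),
    ("Orders", "Archive"),       # breaks a rule: store to store
    ("Customer", "Orders"),      # breaks a rule: entity straight to store
    ("Orders", "Print report"),  # Print report ends up with inputs only
]

def check_rules(kinds, flows):
    """Return a list of rule violations found in the diagram."""
    problems = []
    for src, dst in flows:
        pair = (kinds[src], kinds[dst])
        if pair == ("store", "store"):
            problems.append(f"{src} -> {dst}: store to store needs a process")
        elif "store" in pair and "entity" in pair:
            problems.append(f"{src} -> {dst}: entity and store need a process between them")
        elif pair == ("entity", "entity"):
            problems.append(f"{src} -> {dst}: source to sink needs a process")
    for name, kind in kinds.items():
        if kind != "process":
            continue
        has_input = any(dst == name for _, dst in flows)
        has_output = any(src == name for src, _ in flows)
        if has_output and not has_input:
            problems.append(f"{name}: process has only outputs")
        if has_input and not has_output:
            problems.append(f"{name}: process has only inputs")
    return problems

for problem in check_rules(kinds, flows):
    print(problem)
```

The labelling rules (verb phrases for processes, noun phrases for data stores) are left out, since they need human judgement rather than a structural check.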

62
Data Flow
• A data flow has only one direction of flow between
symbols.
• A fork or split in a data flow means that exactly the
same data is going from one common location to
two or more different processes, data stores or
external entities.
• A join in a data flow means that exactly the same
data come from two or more different processes,
data stores or external entities, to go to a common
location.

63
• A data flow cannot go directly back to the same process
it leaves. There must be at least one other process that
handles the data flow, produces some other data flow and
returns the original data flow to the originating process.
• A data flow to a data store means that data is being
inserted, updated (changed) or deleted
• A data flow from a data store means that data is being
retrieved or used.
• A data flow has a noun phrase label that indicates what
the data contains. More than one noun phrase label
can appear on a single arrow, as long as all of the flows
on the same arrow move together as one package.

64
Balancing DFDs

• When decomposing a DFD from one level to the next,
inputs and outputs to a process need to be conserved in
the next level. This is called balancing.
• As the system is decomposed, the successive levels of
DFD should be balanced. If extra outputs or inputs appear
on a level-n diagram, the analyst should review the level-
(n-1) diagram and, if necessary, add in the extra data
flows to balance the diagrams.
• This principle of balancing leads to some more rules that
can be applied to data flow diagramming.
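Balancing can be expressed as a comparison of two sets: the flows into and out of the parent process, and the flows that cross the boundary of its child diagram. The sketch below assumes flows are (source, target, label) triples and that the caller lists the elements internal to the child diagram; all names are hypothetical.

```python
def boundary_flows(flows, internal):
    """Labels of flows that cross into and out of the child diagram."""
    inputs = {label for src, dst, label in flows
              if src not in internal and dst in internal}
    outputs = {label for src, dst, label in flows
               if src in internal and dst not in internal}
    return inputs, outputs

def balance_report(parent_inputs, parent_outputs, child_flows, internal):
    """Compare the parent process's flows with the child diagram's boundary."""
    child_in, child_out = boundary_flows(child_flows, internal)
    return {
        "missing_inputs": sorted(set(parent_inputs) - child_in),
        "extra_inputs": sorted(child_in - set(parent_inputs)),
        "missing_outputs": sorted(set(parent_outputs) - child_out),
        "extra_outputs": sorted(child_out - set(parent_outputs)),
    }

# Hypothetical parent process 1.0 and its level-1 decomposition.
parent_inputs = {"Customer order"}
parent_outputs = {"Receipt", "Kitchen order"}
child_flows = [
    ("Customer", "1.1", "Customer order"),
    ("1.1", "1.2", "Valid order"),          # internal flow, not on the boundary
    ("1.2", "Customer", "Receipt"),
    ("1.2", "Kitchen", "Kitchen order"),
    ("1.2", "Manager", "Sales summary"),    # extra output, not on the parent
]
report = balance_report(parent_inputs, parent_outputs, child_flows, {"1.1", "1.2"})
print(report)
```

Here the report flags "Sales summary" as an extra output, which is the case where the analyst would go back and add the flow to the higher-level diagram.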
65
Rules to ensure Balanced DFDs
• A composite data flow on one level can be split into
component data flows at the next level – but no new data can
be added and all data in the composite must be accounted for
in one or more sub flows. The names of the data flows should
indicate what was split.
• The input to a process must be sufficient to produce the
outputs (including data placed in data stores) from the
process. Therefore all outputs can be produced and all data in
inputs move somewhere, either to another process or to a
data store outside the process or on a more detailed DFD
showing a decomposition of that process.

66
• At the lowest level (primitive) DFDs, new data flows
may be added to represent data that are transmitted
under exceptional conditions. These usually
represent error messages from the system (e.g.
'Customer not known; do you want to create a new
customer?') or confirmation notices (e.g. 'Do you
want to delete this record?')
• To avoid data flow lines crossing each other, a data
store or external entity can be repeated on a DFD. If
doing this, an extra symbol should be used to
indicate repeated items.
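The rule on splitting a composite data flow (no new data added, all data accounted for) can be sketched as a set comparison. The data element names and the `check_split` function below are hypothetical and only illustrate the idea.

```python
def check_split(composite_parts, sub_flows):
    """Compare a composite flow's data elements with those of its sub flows.

    composite_parts: set of data elements carried by the composite flow.
    sub_flows: dict mapping each sub flow name to its set of data elements.
    """
    covered = set().union(*sub_flows.values())
    return {
        "unaccounted": sorted(set(composite_parts) - covered),  # lost in the split
        "added": sorted(covered - set(composite_parts)),        # new data sneaked in
    }

# Hypothetical composite flow "Order" split at the next level down.
order = {"customer name", "items", "payment"}
split = {
    "Order items": {"items"},
    "Payment details": {"payment", "tip"},
}
result = check_split(order, split)
print(result)
```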

67
Further guidelines
• Completeness – the DFDs should include all of the
components necessary for the system being
modelled. If a DFD contains data flows that do not
go anywhere, or data stores, processes or external
entities that are not connected to anything else –
then the DFD is not complete.
• In addition, all elements of the DFDs should be
defined in the project dictionary.
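A completeness check along these lines could look as follows. It is a sketch under assumptions: flows are (source, target, label) triples in which a missing end is written as `None`, and the project dictionary is modelled as a plain set of defined names; none of this comes from a particular tool.

```python
def completeness_problems(elements, flows, dictionary):
    """List unconnected elements, dangling flows and undefined names."""
    connected = {end for src, dst, _ in flows
                 for end in (src, dst) if end is not None}
    flow_labels = {label for _, _, label in flows}
    return {
        "unconnected": sorted(set(elements) - connected),
        "dangling_flows": sorted(label for src, dst, label in flows
                                 if src is None or dst is None),
        "undefined": sorted((set(elements) | flow_labels) - set(dictionary)),
    }

# Hypothetical diagram and project dictionary.
elements = {"Customer", "1.0 Take order", "D1 Orders", "Supplier"}
flows = [
    ("Customer", "1.0 Take order", "Order"),
    ("1.0 Take order", "D1 Orders", "Order record"),
    ("1.0 Take order", None, "Receipt"),   # flow that does not go anywhere
]
dictionary = {"Customer", "1.0 Take order", "D1 Orders", "Supplier",
              "Order", "Order record"}
problems = completeness_problems(elements, flows, dictionary)
print(problems)
```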

68
• Consistency – the representation of the system at the
various levels of DFD should be consistent with each
other, from the Context diagram to the primitive
DFDs. Some examples of violations of consistency:
– a level-1 diagram that has no corresponding level-0
diagram
– a data flow that appears on a higher level DFD but not
on lower levels (e.g. on level-1 but not on level-2) –
this would be a violation of balancing
– a data flow attached to one object on a lower-level
diagram but attached to a different object at a higher
level

69
• Timing – DFDs do not show the timings of data flows
and other events. Therefore, it is not necessary to
consider time when modelling the system using
DFDs.
• Iterative Development – the process of producing
the DFDs is an iterative one – which means that
each diagram will normally be drawn a few times
before it accurately models the system.
• Primitive DFDs – or, when to stop decomposing into
further DFDs. Some ways of knowing when to stop
are as follows:

70

– When each process has been reduced to a single decision,
calculation or database operation
– When each data store represents data about a single entity
(such as a customer, an employee, an order)
– When the system users do not need to see any more detail, or
when the analysts feel that sufficient detail has been reached
– When no data flow needs to be split further to
show that different data are handled in various ways
– When it appears that each business form or transaction,
computer screen and report has been shown as a single data
flow (this often means that each screen and report title
corresponds to the name of an individual data flow)
– When it appears that there is a separate process for each
choice on all the lowest-level menu options in the system

71
• The primitive DFDs will be very detailed, as they are
the lowest level.
• DFDs should be used to model the current physical
system and the new logical system.
• Usually, the current physical system is modelled first
and then the new logical system.
• However, some experts believe that analysts should
not spend too much time on the current physical
system – the higher levels can be done, but the
lower, more detailed levels can be omitted. More
time and effort should be put into producing the
DFDs for the new logical system.
72
Physical & Logical DFDs

• When drawing DFDs, the diagrams can model the physical
functions of the system and/or the logical functions of
the system. This means that there are two types of DFD
that can be produced: physical and logical.
•  Logical DFDs show what happens in a system, without
showing how it occurs.
•  Physical DFDs show how things happen – by showing the
physical components of the system (e.g. the person who
carries out a particular job, or what computer program is
used).
73
Logical vs Physical DFD
Logical DFD
• Logical DFD depicts how the business operates.
• The processes represent the business activities.
• The data stores represent the collection of data regardless of how the data are
stored.
• It shows business controls.
Physical DFD
• Physical DFD depicts how the system will be implemented (or how the current
system operates).
• The processes represent the programs, program modules, and manual
procedures.
• The data stores represent the physical files and databases, manual files.
• It shows controls for validating input data, for obtaining a record, for ensuring
successful completion of a process, and for system security.

74
Example
The logical DFD describes what the system does by including the essential
sequence of business activities. It models the business data and activities
instead of actual forms, locations and roles.

75
Data Modeling & Initial Design

78
Overview
• This section covers the technique of Entity-Relationship
analysis, used for the Data Modelling part of structuring
system requirements.
•  Conceptual Data Modelling involves representing data in
a system or organization, to show the overall structure of
the data. A data model should be independent of any
DBMS & other implementation considerations.
• Entity-relationship (E-R) data models are commonly used
diagrams that show how data is organised in a system.

79
• DFDs show how, where and when data are used or
changed in a system. They do not show the
definition, structure and relationships within the
data – this is the aim of E-R analysis.
• To summarise, DFDs show data in motion while E-R
diagrams show relationships among data objects.
• There is a correspondence between the process
model and the data model. Some of the links
between the two are as follows:
– Data elements included in the data flows on DFDs
also appear in the data model and vice versa.

80
– All raw data captured and retained in data stores
must be included in the data model.
– The data model includes manual and automated
data stores – because it is a general picture of the
data in the organization, not just of computerised
systems.
– Data stores included on DFDs must correspond to
business objects (called data entities) in the data
model.
• The deliverables for conceptual data modelling are
shown in the table below

81
Introduction
• An E-R diagram, as the name suggests, shows entities (objects)
and the relationships between them. It also shows the
attributes of the entities and relationships.
• Take for example a personnel system in a company, in which
employees belong to departments and employees are
assigned to projects:
• Some of the Entities in this system are:
– Department
– Employee
– Project
82
• Some of the Relationships in this system are:
– A Department has Employees
– An Employee manages projects
– Employees work on projects
• Some of the Attributes are:
– A Department has a name and a location
– An Employee has a name, a phone number and a
salary
•  Note that Entities are described by nouns and
Relationships are described by verbs.
• Entities, Attributes and Relationships are described in
more detail below.
83
Entities
• An entity is a person, place, object, event or concept
in the system.
• An entity has its own identity that distinguishes it
from every other entity. For example, each Employee
has a unique ID that distinguishes it from every
other Employee.
•  An entity type is a collection of entities that share
common properties or characteristics.

84
• Each entity type in an E-R model is given a name.
The name is a singular noun. Thus, in our example, we
have an entity type called 'Employee' – it is not
called 'Employees'. This is because the name
represents a set of entities.
• In an E-R diagram, an entity type is represented by a
rectangle, and its name is written in capital letters.
• An entity instance is a single occurrence of an entity
type. An entity type is described just one time in the
data model, but many instances of that entity type
may be represented by data stored in the database.

85
• For example, there may be hundreds or thousands
of employees in an organization – each one is an
instance of the Employee entity type.
• In data modelling, the term ‘entity’ is often used to
refer to an entity type. The term ‘entity instance’ is
usually used to indicate an instance of the entity.

entity type representation
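The difference between an entity type and its instances can be sketched in code. This is only an analogy, assuming a Python dataclass stands in for the entity type; the sample employees are made up.

```python
from dataclasses import dataclass, fields


@dataclass
class Employee:
    """The entity type: described once in the model."""
    employee_id: int
    employee_name: str
    salary: float


# Entity instances: many occurrences of the one entity type (sample data).
staff = [
    Employee(1, "Sara", 5000.0),
    Employee(2, "Abel", 6200.0),
]

attribute_names = [f.name for f in fields(Employee)]
print(attribute_names)  # the attributes defined by the entity type
print(len(staff))       # the number of instances currently stored
```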

86
Attributes
• An entity type has a set of attributes – properties or
characteristics – associated with it.
•  So, for example, the EMPLOYEE entity has attributes of
Employee_ID, Employee_Name, Salary.
•  In E-R diagrams, attributes are named with an initial
capital letter followed by lowercase letters. An attribute
can be represented by an ellipse (oval) shape with a line
connecting it to the associated entity. Attributes can also
be represented by listing them within the entity
rectangle, under the entity name. For example:
EMPLOYEE
Employee_ID
Employee_Name
Salary
Address
87
Candidate Keys and Identifiers

• Every entity type should have an attribute or set of
attributes that distinguishes one instance from other
instances of the same type i.e. that uniquely
identifies each instance. This is called a candidate
key. For example, a candidate key for Employee
would be Employee_ID.
• In some cases, more than one attribute is required
to identify a unique entity. This is a composite
candidate key.
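The idea of a candidate key can be explored on sample data. The sketch below is illustrative only: the function names and the sample rows are assumptions. It searches for the smallest attribute sets whose values are unique across the rows. Uniqueness in a sample is only a hint, since a real candidate key must hold for every possible instance and is confirmed with the users, not derived from data.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """True if no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Minimal attribute sets that are unique across the sample rows."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            if any(set(key) <= set(attrs) for key in found):
                continue  # contains a smaller key, so it is not minimal
            if is_unique(rows, attrs):
                found.append(attrs)
    return found


# Hypothetical Employee instances.
employees = [
    {"Employee_ID": 1, "Employee_Name": "Sara", "Address": "12 Main St"},
    {"Employee_ID": 2, "Employee_Name": "Sara", "Address": "9 Hill Rd"},
    {"Employee_ID": 3, "Employee_Name": "Abel", "Address": "12 Main St"},
]
keys = candidate_keys(employees, ["Employee_ID", "Employee_Name", "Address"])
print(keys)
```

In this sample, Employee_ID alone is unique, and so is the composite of Employee_Name and Address.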

88
• Some entities might have more than one candidate
key. For example, another possible candidate key for
Employee is the combination of Employee_Name
and Address.
• If there is more than one candidate key, it is necessary
to choose one. The selected candidate key is known as
an identifier. An identifier is indicated on an E-R
diagram by underlining the attribute name.
• Identifiers are critical to data integrity in a database,
so when selecting identifiers, you should be careful
and follow these guidelines:

89
Relationships
• Relationships connect the various components of an
E-R model. A relationship is an association between
the instances of one or more entity types, that is of
interest to the organization or the system.
• Usually, an association means that an event has
occurred, or there is some natural link between entity
instances. This is why relationships are labelled with
verb phrases. For example, a Department has
Employees, an Employee manages a Department.

92
• This is indicated on an E-R diagram by a diamond
shape with the relationship verb in it.
• The 'Manages' relationship is one-to-one, the 'Has'
relationship is one-to-many – the connecting lines
also indicate this – a Department can have many
Employees, so the connector from the Has
relationship has two extra lines connecting to the
Employee entity. The connector for the many side of
the relationship is sometimes called a “crow’s foot”.

93
Weak Entities (Multi-valued Attributes)

• An attribute that can have more than one value
for each entity instance is called a multi-valued
attribute. Take for example a system used to
track maintenance of computers. One entity in
this system is a Computer. One attribute of a
Computer is the operating system. So we might
draw this as below.
COMPUTER
ComputerID
Name
Operating_System

94
• However, we know also that a Computer can have more
than one Operating System. This means that
Operating_System is a multi-valued attribute. To model
this in an E-R diagram, it is extracted out from the
Computer entity type and put into a separate entity
type. This separate entity is called a weak entity or an
attributive entity. The weak entity is then linked to the
regular entity by a one-to-many relationship connector.
• Note that this relationship does not have a name – this
indicates that it is a weak entity. Also, we can say that, in
this system, an Operating System does not really exist
(or at least, is not relevant to the system) unless it is
associated with one or more Computers. In this sense, it
is dependent – so this is also called a dependent entity.
95
• In some diagramming notations, the weak entity may also be
indicated with a double border around the entity
type rectangle or by a second line across the top of
the rectangle – as shown here.

• The identifier for a weak entity is a composite of the
identifier of the parent entity (in this case, ComputerID)
and an attribute that uniquely identifies the weak
entity within the parent. In this case, the identifier is
a composite of ComputerID and OperatingSystemID.
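The extraction of a multi-valued attribute into a weak entity can be sketched as follows. This is a rough illustration, assuming Python dataclasses stand in for entity types; the class layout, the method name and the sample values are assumptions.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class OperatingSystem:
    """Weak (dependent) entity: only exists for a given Computer."""
    computer_id: int          # identifier of the parent entity
    operating_system_id: int  # unique only within the parent
    name: str

    @property
    def identifier(self):
        # Composite identifier: parent identifier plus local identifier.
        return (self.computer_id, self.operating_system_id)


@dataclass
class Computer:
    """Regular entity on the 'one' side of the one-to-many link."""
    computer_id: int
    name: str
    operating_systems: list = field(default_factory=list)

    def add_operating_system(self, operating_system_id, name):
        entry = OperatingSystem(self.computer_id, operating_system_id, name)
        self.operating_systems.append(entry)
        return entry


pc = Computer(7, "Lab PC")
pc.add_operating_system(1, "Linux")
pc.add_operating_system(2, "Windows")
identifiers = [os_entry.identifier for os_entry in pc.operating_systems]
print(identifiers)
```

Creating the weak entity only through its parent mirrors the dependency: an Operating System instance cannot be made without a Computer.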
96
Relationship Degree
• E-R diagrams can convey more information about
the relationships between entities.
• The degree of a relationship is the number of entity
types that participate in the relationship.
•  If there is one entity type involved in a relationship,
it has degree 1; also called a unary or recursive
relationship. For example, an Employee is managed
by another Employee (the relationship here is
'manages').

97
• If there are two entity types in the relationship, it
has degree 2; also called a binary relationship.
• If there are three entity types in the relationship, it
has degree 3; called a ternary relationship.
• Higher degree relationships are also possible (N-ary)
but these rarely occur in practice.
• Also, this makes the relationships more complex, so
it is better to try to reduce the relationship to a
ternary or binary relationship if possible.

98
Relationship Cardinality
• The cardinality of a relationship specifies the number
of relationships in which a given entity instance can
appear. An entity instance can appear in:
•  one (1) relationship between its entity type and the
other entity type or
• any variable number (N) of relationships between its
entity type and the other entity type.
• Take, for example, the 'Has' relationship between
Employee and Department.
99
• An Employee can be in only one Department – so an
instance of Employee can appear in only one
relationship with a Department instance.
• A Department can have many Employees – so an
instance of Department can appear in many (N)
relationships with Employee instances.
• There are two ways to show the cardinality of a
relationship. One is to put extra lines on the connector
to the related entity type when the cardinality is many
(N). This is shown below.
• A normal connector indicates a cardinality of 1. In this
diagram, one Employee manages one Department – so the
cardinality for both sides of the relationship is 1.
100
• The other way is to put the character 1 or N next
to the connector. This is shown below.
• The crow's foot notation is clearer as it is obvious
which is the many side of the relationship.

101
• In some relationships, the participation of an entity
type may be optional.
• For example, suppose we add another entity type to the
diagrams above: the Project entity.
•  Employees work on projects – an employee can
work on many projects, but does not have to work
on any project. This means that the minimum
cardinality for Employee in this relationship is 0,
while the maximum is N. A minimum cardinality of 0
means participation of entity instances in the
relationship is optional.

102
• Each project must have at least one Employee working
on it. This means that the minimum cardinality for
Project in this relationship is 1, while the maximum is
N. A minimum cardinality of 1 means that participation
of entity instances in the relationship is mandatory.
• Optional participation is shown by putting a 0 over
the connector – to indicate the value of the minimum
cardinality. If the maximum cardinality is a specific
number (> 1), then it can be shown next to the
connector also e.g. in this diagram, we show that the
maximum number of Employees working on a Project
is 10.
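Minimum and maximum cardinality can be read as limits on how many times each instance appears in a relationship. The sketch below is illustrative, assuming the 'works on' relationship is stored as a list of (employee, project) pairs; the function name and the sample data are assumptions.

```python
from collections import Counter


def cardinality_violations(instances, pairs, position, minimum, maximum=None):
    """Return (instance, count) for each instance outside its cardinality limits.

    pairs    -- relationship instances, e.g. (employee, project) tuples
    position -- index of this entity type within each pair
    maximum  -- None means 'many' (N), i.e. no upper limit
    """
    counts = Counter(pair[position] for pair in pairs)
    violations = []
    for instance in instances:
        n = counts[instance]  # 0 if the instance never appears
        if n < minimum or (maximum is not None and n > maximum):
            violations.append((instance, n))
    return violations


# Hypothetical data: E3 works on nothing, P2 has nobody working on it.
employees = ["E1", "E2", "E3"]
projects = ["P1", "P2"]
works_on = [("E1", "P1"), ("E2", "P1")]

# Employee side: minimum 0 (optional), maximum N.
employee_problems = cardinality_violations(employees, works_on, 0, minimum=0)
# Project side: minimum 1 (mandatory), maximum 10.
project_problems = cardinality_violations(projects, works_on, 1, minimum=1, maximum=10)
print(employee_problems)
print(project_problems)
```

E3 is allowed because participation is optional for Employee, while P2 is reported because every Project must have at least one Employee.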

103
• Mandatory participation is shown by putting a 1
over the connector (which looks like a perpendicular
line on the connector) – to indicate the value of the
minimum cardinality. If the maximum cardinality is also
1, then the relationship is 1-to-1 or 1-to-many –
this is indicated by two perpendicular lines over the
connector. An Employee can be in one and only one
Department, so the connector to Department for
the 'Has' relationship has two lines over it.

104
Associative Entities
• In some cases, a relationship also has attributes.
Consider a system for tracking maintenance of
computers. The operation of maintaining a computer
has some attributes in which we are interested – for
example, the date of the maintenance and what was
done.
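A relationship that carries its own attributes can be sketched as a record in its own right. The example below is an assumption-heavy illustration: the slide does not say what a Computer is related to in a maintenance operation, so the Technician side (`technician_id`) is hypothetical, as are the field names and the sample data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Maintenance:
    """Relationship with attributes, modelled as its own record."""
    computer_id: int        # identifier of the Computer that was maintained
    technician_id: int      # hypothetical second participant
    maintenance_date: date  # attribute of the relationship itself
    work_done: str          # attribute of the relationship itself


# Sample maintenance operations (made up for illustration).
log = [
    Maintenance(7, 3, date(2024, 1, 15), "Replaced fan"),
    Maintenance(7, 4, date(2024, 3, 2), "Upgraded memory"),
    Maintenance(9, 3, date(2024, 3, 5), "Reinstalled operating system"),
]

work_on_computer_7 = [m.work_done for m in log if m.computer_id == 7]
print(work_on_computer_7)
```

The date and the description belong to neither participant alone, which is why they are held on the relationship record.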

105
Chapter end

106
