Week 8
Data related issues and
model coding
✔Data are central to the development and
use of simulation models.
may be available.
Category A
Category B
Category C
SCI-UON, June/July 2011, E. Opiyo
Obtaining Data
Category A data
➔Are known or have been collected earlier;
➔Collected for some other reasons;
➔Automatically collected electronically.
Examples
●Physical layout of a manufacturing plant;
●The cycle times of the machines;
●Service times and arrival rates in a bank from
a survey of staffing levels;
●Transaction data at service points.
Obtaining Data
Category B data
Data that are not yet available but can be collected.
Examples
➔Service times;
➔Arrival patterns;
➔Machine failure rates and repair times;
➔Nature of human decision-making.
Obtaining Data
Category B data - Data collection
➢Direct observations if necessary;
➢Use questionnaires or interviews with
subject matter experts such as staff,
equipment suppliers or customers.
Challenge-limitation
Category B data
➔Use adequate sample size;
➔Ensure that the data collection staff have no
vested interest in the data;
➔Put the mechanisms in place to monitor and
avoid inaccuracies creeping into the data
collection;
➔For critical data arrange for two sets of
observations and cross-checking.
Data format
Data should be accurate and it should also be in
the right format for the simulation.
Understand how the computer model, particularly the
Example
Interpreting the time between component failures. This
can mean either the time from the start of one
breakdown to the start of the next breakdown, or the
time from the end of one breakdown to the start of the
next breakdown.
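The two interpretations give different figures for the same record of breakdowns. A minimal sketch, assuming a hypothetical breakdown log of (start, end) times in minutes; the values and function names are illustrative and not taken from the lecture:

```python
# Hypothetical breakdown log: (start, end) of each breakdown, in minutes.
breakdowns = [(10.0, 15.0), (40.0, 42.0), (90.0, 100.0)]


def start_to_start(log):
    """Time from the start of one breakdown to the start of the next."""
    return [nxt[0] - cur[0] for cur, nxt in zip(log, log[1:])]


def end_to_start(log):
    """Time from the end of one breakdown to the start of the next."""
    return [nxt[0] - cur[1] for cur, nxt in zip(log, log[1:])]


print(start_to_start(breakdowns))  # [30.0, 50.0]
print(end_to_start(breakdowns))    # [25.0, 48.0]
```

Whichever interpretation the simulation model expects, the data supplied to it should use the same one.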
Data format
✔Know the format of the data that are
being supplied or collected and ensure that
these are appropriate for the simulation
model.
✗Traces;
✗Empirical distributions;
✗Statistical distributions.
arrival event)
●the nature of the fault (machine breakdown event).
Representing Unpredictable Variability
Traces
➔The trace is read from a file by the
simulation as it runs, and the events are
recreated in the model as described by
the trace.
➔Traces can be obtained by collecting
data from the real system, for example
through an automatic monitoring system.
Representing Unpredictable Variability
Traces
Call arrival time (minutes)    Call type
0.09                           1
0.54                           1
0.99                           3
1.01                           2
1.25                           1
1.92                           2
2.14                           2
2.92                           3
3.66                           3
5.46                           2
Example of a trace [Robinson 2004, p. 101]
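Reading a trace and recreating its events can be sketched as follows. This is only an illustration: the rows are those of the trace table above, held in an in-memory string as a stand-in for a trace file, and the function names `read_trace` and `replay` are hypothetical.

```python
import csv
import io

# Rows from the trace table above: arrival time (minutes), call type.
# An in-memory string stands in for the trace file (an assumption).
TRACE_TEXT = """0.09,1
0.54,1
0.99,3
1.01,2
1.25,1
1.92,2
2.14,2
2.92,3
3.66,3
5.46,2
"""


def read_trace(stream):
    """Yield (arrival_time, call_type) events in the order recorded."""
    for time_str, type_str in csv.reader(stream):
        yield float(time_str), int(type_str)


def replay(events):
    """Recreate each event at its recorded time; count calls per type."""
    clock = 0.0
    counts = {}
    for arrival_time, call_type in events:
        if arrival_time < clock:
            raise ValueError("trace must be in time order")
        clock = arrival_time  # advance the simulation clock to the event
        counts[call_type] = counts.get(call_type, 0) + 1
    return clock, counts


final_clock, calls_by_type = replay(read_trace(io.StringIO(TRACE_TEXT)))
print(final_clock, calls_by_type)
```

A real model would schedule an arrival of the given call type at each recorded time instead of just counting.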
Empirical distributions
✔Show the frequency with which data values,
or ranges of data values, occur;
[Figure: histogram of frequency against inter-arrival time (minutes), in ranges 0-1, 1-2, ..., 7-8]
Example of an Empirical Distribution: Call Arrivals
at a Call Centre.
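One way to sample from an empirical distribution is to pick a range with probability proportional to its frequency, then pick a value within that range. The sketch below assumes hypothetical frequencies (the figure's actual values are not reproduced here) and assumes values are spread evenly within each range:

```python
import random

# Hypothetical frequencies for each inter-arrival range (minutes).
# These are illustrative values, not the data behind the figure.
ranges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8)]
frequencies = [40, 25, 15, 8, 5, 4, 2, 1]


def sample_inter_arrival(rng):
    """Pick a range in proportion to its frequency, then a value inside it."""
    low, high = rng.choices(ranges, weights=frequencies, k=1)[0]
    return rng.uniform(low, high)  # assumes an even spread within the range


rng = random.Random(42)  # fixed seed so that runs are repeatable
samples = [sample_inter_arrival(rng) for _ in range(1000)]
share_under_one = sum(s < 1 for s in samples) / len(samples)
print(round(share_under_one, 2))  # should be close to 40/100
```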
Statistical distributions
Usually defined by a mathematical function, the
probability density function (PDF).
There are many, but the best known is the
normal distribution (ND).
It is specified by two parameters: the mean (its
location) and the standard deviation (its spread).
Uses: modeling errors in weight or dimension that
occur in manufacturing components.
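As an illustration of this use, the sketch below samples component lengths from a normal distribution. The nominal length and standard deviation are hypothetical values chosen only for the example:

```python
import random
import statistics

NOMINAL_MM = 50.0  # hypothetical nominal component length (mean)
SD_MM = 0.1        # hypothetical standard deviation (spread)

rng = random.Random(1)  # fixed seed so that runs are repeatable
lengths = [rng.gauss(NOMINAL_MM, SD_MM) for _ in range(10_000)]

# The sample mean and standard deviation should be close to the parameters.
print(round(statistics.mean(lengths), 2), round(statistics.stdev(lengths), 2))
```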
Statistical distributions
The Normal Distribution
[Figure: Example of a normal distribution, Normal (mean = 2, SD = 1), plotted for x from -2 to 6]
Statistical distributions
ND is however limited:-
Continuous distributions
Discrete distributions
Approximate distributions
Statistical distributions
The general categories of standard statistical
distributions
Continuous distributions
For sampling data that can take any value
across a range or an interval.
Approximate distributions
➔Used in the absence of data;
➔They do not have strong theoretical
underpinnings.
Example
Uniform distribution
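A uniform distribution needs only a minimum and a maximum, which is why it can be used when no data exist. A minimal sketch, assuming a hypothetical expert estimate that a service time lies between 2 and 6 minutes:

```python
import random

# Hypothetical expert estimates, used because no data are available.
MIN_MINUTES, MAX_MINUTES = 2.0, 6.0

rng = random.Random(7)  # fixed seed so that runs are repeatable
service_times = [rng.uniform(MIN_MINUTES, MAX_MINUTES) for _ in range(5)]
print([round(t, 2) for t in service_times])
```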
Bootstrapping
➔It involves re-sampling data at random,
with replacement, from an original trace.
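A minimal sketch of bootstrapping, using the arrival times from the example trace [Robinson 2004, p. 101] to build a set of inter-arrival times and then re-sampling from it with replacement. The function name `bootstrap` and the sample size of 20 are assumptions made for the example:

```python
import random

# Arrival times (minutes) from the example trace.
arrival_times = [0.09, 0.54, 0.99, 1.01, 1.25, 1.92, 2.14, 2.92, 3.66, 5.46]

# Inter-arrival times derived from the trace (rounded to 2 decimal places).
inter_arrivals = [
    round(later - earlier, 2)
    for earlier, later in zip(arrival_times, arrival_times[1:])
]


def bootstrap(data, n, rng):
    """Draw n values at random, with replacement, from the original data."""
    return rng.choices(data, k=n)


rng = random.Random(0)  # fixed seed so that runs are repeatable
resampled = bootstrap(inter_arrivals, 20, rng)
print(resampled)
```

Because sampling is with replacement, the re-sampled stream can be longer than the original trace, but it can only contain values that appear in it.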
modeled;
By fitting a distribution to empirical data.
[Figure: histogram of frequency against repair time (minutes), in ranges 0-3, 3-6, ..., 27-30]
Graphically
[Figure: empirical repair-time frequencies compared with Erlang (7.62, 1), Erlang (7.62, 3) and Erlang (7.62, 5), in ranges 0-3, 3-6, ..., 27-30, >30 minutes]
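The expected frequencies for a candidate Erlang distribution, which are what gets compared with the empirical histogram, can be sketched as below. Two assumptions are made: that "Erlang (7.62, k)" means a mean of 7.62 minutes with shape parameter k, and a hypothetical sample size of 100. The function names are illustrative.

```python
import math


def erlang_cdf(x, mean, k):
    """P(X <= x) for an Erlang distribution with given mean and integer shape k."""
    rate = k / mean
    tail = sum(
        math.exp(-rate * x) * (rate * x) ** n / math.factorial(n)
        for n in range(k)
    )
    return 1.0 - tail


EDGES = list(range(0, 31, 3))  # range boundaries 0, 3, ..., 30 minutes


def expected_frequencies(mean, k, sample_size):
    """Expected count in each range 0-3, ..., 27-30, plus the >30 range."""
    cdf = [erlang_cdf(edge, mean, k) for edge in EDGES]
    counts = [sample_size * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]
    counts.append(sample_size * (1.0 - cdf[-1]))  # the ">30" range
    return counts


SAMPLE_SIZE = 100  # hypothetical number of observed repair times
for k in (1, 3, 5):
    row = expected_frequencies(7.62, k, SAMPLE_SIZE)
    print(k, [round(count, 1) for count in row])
```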
Test the goodness-of-fit- graphical method
Limitations
➢Inspecting histograms is unreliable if only a
small amount of data is available, say fewer
than 30 samples;
➢With so few samples, the shape of the
histogram is unlikely to be smooth.
➢Graphical approaches based on cumulative
probabilities can, however, overcome these
limitations.
The chi-square test is probably the best known
goodness-of-fit test.
Test the goodness-of-fit
The chi-square test
Is probably the best known goodness-of-fit test.
The chi-square value is calculated as follows:
$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$
where: $\chi^2$ = chi-square value;
$O_i$ = observed frequency in the $i$th range (empirical distribution);
$E_i$ = expected frequency in the $i$th range (proposed distribution);
$k$ = total number of ranges.
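The calculation can be sketched in a few lines. The observed and expected frequencies below are hypothetical values chosen only to illustrate the arithmetic:

```python
def chi_square(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over the k ranges."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of ranges")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies for k = 5 ranges (illustrative only).
observed = [18, 22, 25, 15, 20]
expected = [20, 20, 20, 20, 20]

# (4 + 4 + 25 + 25 + 0) / 20 = 2.9
chi_square_value = chi_square(observed, expected)
print(round(chi_square_value, 2))
```

The resulting value is then compared with a critical value from chi-square tables for the chosen level of significance.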
Test the goodness-of-fit
The chi-square test
Two other factors are needed:
➔The level of significance, typically 5%;
➔The value of k.
Use:
✔A programming language;
✔A specialized simulation software
package (most expected).