10 - Chapter 6 Statistical Analysis of Output From Terminating Simulations

ESI 4523
INDUSTRIAL SYSTEMS SIMULATION

Chapter 6: Statistical Analysis of Output from Terminating Simulations
Cristina Galloway
E-mail: rivech@shands.ufl.edu
Chapter 6 Content
• Output Analysis of Terminating Simulation

• Confidence Interval
• Number of Replications
• Statistical Issues with Simulation Output
• Comparing Two Scenarios via Output Analyzer
• Comparing Multiple Scenarios via Process Analyzer (PAN)
Time Frame of Simulations
• Terminating: Specific starting and stopping conditions

• Run length will be well-defined (and finite)
• Steady-state: Long-run (technically forever)
• Theoretically, initial conditions don’t matter
• Why differentiate these two systems?
The way we obtain statistics is different!!
Terminating simulation (1)
• Modeling a specific interval of a system

• Specific starting and stopping conditions
• Examples: restaurant, service shops, etc.
• Empty and idle at the beginning of simulation
• Warm-up period (or transient state) is part of the system's characteristics
• Warm-up period: the time the system takes to reach its steady state from its initial state
Terminating simulation (2)
• Two ways to specify stopping conditions
• Replication time
• E.g., the office closes at 5 o’clock every day
• Counters of entities
• # of arrivals or # of departures
• E.g., I am interested in how the store serves its first 100 customers
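The two kinds of stopping condition can be sketched in a few lines of Python. This is only an illustration, not how Arena implements them: the function name `simulate_arrivals`, the exponential interarrival times, the 5-minute mean, and the 480-minute day are all assumptions made for the example.

```python
import random

def simulate_arrivals(rng, mean_interarrival, stop_time=None, max_arrivals=None):
    """Generate arrival times until a stopping condition is met.

    stop_time:    stop at a fixed clock time (e.g. closing time).
    max_arrivals: stop once this many entities have arrived.
    """
    if stop_time is None and max_arrivals is None:
        raise ValueError("a terminating run needs at least one stopping condition")
    clock = 0.0
    arrivals = []
    while True:
        # expovariate takes a rate, so pass 1 / mean
        clock += rng.expovariate(1.0 / mean_interarrival)
        if stop_time is not None and clock > stop_time:
            break  # replication-time condition: the office has closed
        arrivals.append(clock)
        if max_arrivals is not None and len(arrivals) >= max_arrivals:
            break  # entity-count condition: enough customers observed
    return arrivals

rng = random.Random(42)  # arbitrary seed, for reproducibility only
by_time = simulate_arrivals(rng, mean_interarrival=5.0, stop_time=480.0)
by_count = simulate_arrivals(rng, mean_interarrival=5.0, max_arrivals=100)
print(len(by_time), "arrivals before closing time")
print(len(by_count), "arrivals, last one at time", round(by_count[-1], 2))
```

With a time-based condition the number of entities is random; with a count-based condition the run length is random.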
Steady-State Simulation
• Quantities to be estimated are defined in the long run

• Stable system: resources are not busy all the time
• The initial conditions (empty and idle state) should not be
a part of simulation results
• E.g., Analyze consumer behavior during peak hours
Output analysis for Terminating simulation (1)
• Output data collection in simulation

• Make independent and identically distributed (IID) replications => avoid bias with random statistical samples
• Collect multiple outputs

• Run > Setup > Replication Parameters: Number of Replications
field
• Create data files (“XXX.dat”) for the output analysis using
“Output analyzer”
• Output Analysis
• Point estimate: summarize the sample by a single number
that is an estimate of the population parameter
• For n samples of X,
• Sample mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$
• Sample variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$
• Output Analysis (Cont’d)

• Interval estimate: a range of values within which the true parameter lies with high probability
• Interval estimate using $\bar{X}$ or $S^2$ gives you a better idea
• Provides more information than point estimates
Confidence intervals for terminating systems (1)
• Confidence interval: a range within which we can have a certain level of confidence that the true mean falls
$\bar{X} - t_{n-1,1-\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1,1-\alpha/2}\frac{S}{\sqrt{n}}$
• Half width: the distance from $\bar{X}$ to either endpoint
$t_{n-1,1-\alpha/2}\frac{S}{\sqrt{n}}$
Half-width
• Example: obtain a 95% confidence interval for the expected time E(T) to produce 2,000 parts
Results from 10 replications
T1     T2     T3     T4     T5     T6     T7     T8     T9     T10
32.62  32.57  33.51  33.29  32.10  34.24  32.70  33.49  33.36  34.61
$\bar{T} = \frac{\sum_{i=1}^{10} T_i}{N} \cong 33.25$
$S^2 = \frac{\sum_{i=1}^{10} (T_i - \bar{T})^2}{N-1} \cong 0.606$
Half width $= t_{9,1-0.05/2}\sqrt{\frac{0.606}{10}} \cong 0.56$, where $t_{9,1-0.05/2} = 2.262$
Confidence interval: $33.25 - 0.56 \le \mu \le 33.25 + 0.56$
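The arithmetic in this example can be reproduced with the Python standard library. This is a minimal sketch: the critical value 2.262 is copied from the slide rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Times to produce 2,000 parts, one value per replication (from the slide)
times = [32.62, 32.57, 33.51, 33.29, 32.10, 34.24, 32.70, 33.49, 33.36, 34.61]
T_CRIT = 2.262  # t_{9, 1-0.05/2}, the value quoted in the example

n = len(times)
mean = statistics.mean(times)
var = statistics.variance(times)          # sample variance, divides by n - 1
half_width = T_CRIT * math.sqrt(var / n)  # t * S / sqrt(n)
lower, upper = mean - half_width, mean + half_width

print(f"mean={mean:.2f}  S^2={var:.3f}  half width={half_width:.2f}")
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

The printed values match the slide: mean 33.25, variance 0.606, half width 0.56, and the interval [32.69, 33.81].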
• Count how many of the observations (Ti) fall inside the confidence interval
Confidence interval: 32.69 ≤ μ ≤ 33.81
Results from 10 replications
T1     T2     T3     T4     T5     T6     T7     T8     T9     T10
32.62  32.57  33.51  33.29  32.10  34.24  32.70  33.49  33.36  34.61
• Does that count have anything to do with 95%?
No
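A quick count makes the point concrete. The sketch below uses the interval endpoints and the ten results from the slide; only the variable names are invented.

```python
times = [32.62, 32.57, 33.51, 33.29, 32.10, 34.24, 32.70, 33.49, 33.36, 34.61]
lower, upper = 32.69, 33.81  # 95% CI endpoints from the slide

# Keep only the replication results that lie inside the interval
inside = [t for t in times if lower <= t <= upper]
print(len(inside), "of", len(times), "observations are inside the CI")
```

Only 5 of the 10 observations land inside the interval, far from 95%, which is consistent with the "No" above.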
• Confidence interval (CI) is not an interval in which 95% of the average measures from replications will fall
This is in fact called a prediction interval!!
• Interpretation of CI
If we make a lot of sets of 10 replications, about 95% of these intervals would cover μ
[Figure: confidence intervals from Set 1 through Set 7, drawn against the true mean μ]
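This interpretation can be checked by brute force. The sketch below assumes normally distributed replication results with a made-up true mean and standard deviation (`MU`, `SIGMA`); in a real study μ is unknown, so this experiment is only possible on synthetic data.

```python
import math
import random
import statistics

MU, SIGMA = 33.0, 0.8   # hypothetical "true" mean and standard deviation
N_REPS = 10             # replications per set
N_SETS = 2000           # number of independent sets of replications
T_CRIT = 2.262          # t_{9, 1-0.05/2}

rng = random.Random(2024)  # arbitrary seed
covered = 0
for _ in range(N_SETS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N_REPS)]
    mean = statistics.mean(sample)
    half_width = T_CRIT * statistics.stdev(sample) / math.sqrt(N_REPS)
    if mean - half_width <= MU <= mean + half_width:
        covered += 1  # this set's interval covers the true mean

coverage = covered / N_SETS
print(f"{coverage:.1%} of the intervals cover mu")
```

The fraction printed should be close to 95%, varying slightly with the seed.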
• Confidence interval will shrink to a point as n increases
• But, prediction interval won’t since it needs to allow for the variation in future replications
• As n goes to infinity, the t distribution converges to the standard normal distribution
• The 95% critical value then becomes 1.96, so the interval is $\bar{X} \pm 1.96\,S/\sqrt{n}$
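The contrast between the two intervals shows up clearly in numbers. The sketch below assumes the sample standard deviation stays fixed at a made-up value `S` as n grows, and uses the normal critical value in place of t for every n (a simplification that is only accurate for large n). The prediction-interval half width uses the standard form $t\,S\sqrt{1 + 1/n}$.

```python
import math
from statistics import NormalDist

S = 0.78                         # assumed sample standard deviation (held fixed)
Z = NormalDist().inv_cdf(0.975)  # about 1.96; stands in for t at large n

def ci_half_width(n):
    """Half width of the confidence interval for the mean."""
    return Z * S / math.sqrt(n)

def pi_half_width(n):
    """Half width of the prediction interval for one future replication."""
    return Z * S * math.sqrt(1 + 1 / n)

for n in (10, 100, 1000, 10000):
    print(n, round(ci_half_width(n), 4), round(pi_half_width(n), 4))
```

The first column of half widths heads toward zero, while the second levels off near Z × S.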
• Prefer smaller confidence intervals – precision
• Want the half width $t_{n-1,1-\alpha/2}\frac{S}{\sqrt{n}}$ to be “small”
• However, we cannot control t or S
• Must increase n – how much?
• Solve for n, where h is the target half width:
$n = t_{n-1,1-\alpha/2}^{2}\,\frac{S^{2}}{h^{2}}$
• Approximation
• Replace t by Z, corresponding normal critical value
• Pretend that current S will hold for larger samples
• Get $n \cong Z_{1-\alpha/2}^{2}\,\frac{S^{2}}{h^{2}}$, where S = sample standard deviation from “initial” number of replications (n0)
• Easier but different approximation
$n \cong n_0\,\frac{h_0^{2}}{h^{2}}$, where h0 = half width from “initial” number of replications (n0)
• Application to the simple call center model (Model 6-1)
• From initial 10 replications, 95% half-width on Total Cost was ±812.76 (3.8% of $\bar{X}$ = 21,618.33)
• Let’s get this down to ±250 or less
• First formula: $n \cong 1.96^{2}\,(1{,}136.24^{2} / 250^{2}) = 79.4 \cong 80$
• Second formula: $n \cong 10\,(812.76^{2} / 250^{2}) \cong 106$
• Changed Number of Replications to 110
• Result: 22,175.19 ± 369.54
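Both approximations are one-liners. The sketch below plugs in the call center numbers from the slide; the function names are invented for illustration, and results are rounded up because a fractional replication is not possible.

```python
import math

def reps_from_std(z, s, h):
    """First formula: n ~= z^2 * S^2 / h^2, rounded up."""
    return math.ceil((z * s / h) ** 2)

def reps_from_half_width(n0, h0, h):
    """Second formula: n ~= n0 * h0^2 / h^2, rounded up."""
    return math.ceil(n0 * (h0 / h) ** 2)

n1 = reps_from_std(z=1.96, s=1136.24, h=250)
n2 = reps_from_half_width(n0=10, h0=812.76, h=250)
print(n1, n2)  # 80 106
```

The second formula gives the larger answer here because it implicitly keeps the t critical value for 9 degrees of freedom (2.262) instead of replacing it with 1.96.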
Replication
• Why do we increase the number of replications?

• Randomness!
• Confidence intervals: https://youtu.be/jvoxEYmQHNM
• The narrower the confidence interval, the more confident we are that the samples are representative of the actual system
Replication
• How many replications are enough?
• Depends on the type of distribution
• The closer a distribution is to a normal one, the fewer you’ll need
• If approximately symmetrical and unimodal, 10-20 samples are enough
• Even for a very strange distribution, 100 samples are enough
Statistical Issues with Simulation Output (1)
• Example of simulation outputs
Replication (i) | Within-run observations ($y_{ij}$) | Average $\bar{y}_i$
1 | $y_{11}, y_{12}, y_{13}, \dots, y_{1,m-1}, y_{1m}$ | $\bar{y}_1$
2 | $y_{21}, y_{22}, y_{23}, \dots, y_{2,m-1}, y_{2m}$ | $\bar{y}_2$
3 | $y_{31}, y_{32}, y_{33}, \dots, y_{3,m-1}, y_{3m}$ | $\bar{y}_3$
4 | $y_{41}, y_{42}, y_{43}, \dots, y_{4,m-1}, y_{4m}$ | $\bar{y}_4$
… | … | …
n | $y_{n1}, y_{n2}, y_{n3}, \dots, y_{n,m-1}, y_{nm}$ | $\bar{y}_n$
i: the replication from which the observation came
j: the index of each entity within the replication
• Underlying assumptions for constructing the confidence intervals, which Arena also makes:
• Observations must be independent so that no correlation
exists between consecutive observations
• Observations are identically distributed throughout the
entire duration of the process (i.e. they are time invariant)
• Observations are normally distributed

• Observations must be independent so that no correlation
exists between consecutive observations
• Within a run, observations are not independent
• If the waiting time observed for one entity is long, it is highly likely that the waiting time for the next entity observed is going to be long
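The dependence between consecutive waiting times is easy to see in a toy model. The sketch below is an assumption-laden illustration, not one of the course models: it simulates a single-server FIFO queue with exponential interarrival and service times (means 1.0 and 0.8, both made up) and estimates the lag-1 autocorrelation of the waiting times within one run.

```python
import random
import statistics

def waiting_times(rng, n_customers, mean_interarrival, mean_service):
    """Waiting times in queue for a single-server FIFO queue (Lindley recursion)."""
    waits = [0.0]  # the first customer finds the system empty and idle
    for _ in range(n_customers - 1):
        service = rng.expovariate(1.0 / mean_service)
        interarrival = rng.expovariate(1.0 / mean_interarrival)
        # next wait = previous wait + previous service - time until next arrival
        waits.append(max(0.0, waits[-1] + service - interarrival))
    return waits

def lag1_autocorrelation(xs):
    """Sample correlation between each observation and the one that follows it."""
    mean = statistics.mean(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((a - mean) ** 2 for a in xs)
    return num / den

rng = random.Random(7)  # arbitrary seed
w = waiting_times(rng, n_customers=5000, mean_interarrival=1.0, mean_service=0.8)
r1 = lag1_autocorrelation(w)
print(f"lag-1 autocorrelation of waiting times: {r1:.2f}")
```

A value near 0 would indicate independence; for a busy queue like this one the estimate is strongly positive.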

• Observations are identically distributed throughout the
entire duration of the process (i.e. they are time invariant)
• Within-run observations for a particular replication are non-stationary
• They do not follow the same (identical) distribution throughout the simulation run

• Observations are normally distributed
• Within-run observations for a particular replication do not follow a normal distribution
We cannot use the confidence interval based on n*m data!!
• Solution: Focus on the average $\bar{y}_i$ (a replication)
• First, the values of $\bar{y}_1, \bar{y}_2, \dots, \bar{y}_n$ are independent of each other (different random seed)
• Second, the values of $\bar{y}_1, \bar{y}_2, \dots, \bar{y}_n$ are identically distributed
• Third, the values of $\bar{y}_1, \bar{y}_2, \dots, \bar{y}_n$ are not necessarily normally distributed
• But we can use the central limit theorem because the $\bar{y}_i$ are IID random values
• We must compute a confidence interval based on n data, instead of n*m data
Replication (i) | Within-run observations ($y_{ij}$) | Average $\bar{y}_i$
1 | $y_{11}, y_{12}, y_{13}, \dots, y_{1,m-1}, y_{1m}$ | $\bar{y}_1$
2 | $y_{21}, y_{22}, y_{23}, \dots, y_{2,m-1}, y_{2m}$ | $\bar{y}_2$
3 | $y_{31}, y_{32}, y_{33}, \dots, y_{3,m-1}, y_{3m}$ | $\bar{y}_3$
4 | $y_{41}, y_{42}, y_{43}, \dots, y_{4,m-1}, y_{4m}$ | $\bar{y}_4$
… | … | …
n | $y_{n1}, y_{n2}, y_{n3}, \dots, y_{n,m-1}, y_{nm}$ | $\bar{y}_n$
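In code, the rule amounts to collapsing each row of the table to a single number before building the interval. The sketch below generates the within-run data from a hypothetical single-server queue (means 1.0 and 0.8, 200 observations per run, all assumptions); only the last few lines, which work on the n averages, reflect the method on the slide.

```python
import math
import random
import statistics

N_REPS, M_OBS = 10, 200  # n replications, m within-run observations (made-up sizes)
T_CRIT = 2.262           # t_{9, 1-0.05/2}, matches n = 10

def one_replication(seed):
    """Within-run waiting times y_i1..y_im from a hypothetical single-server queue."""
    rng = random.Random(seed)  # a different seed for each replication
    y, wait = [], 0.0
    for _ in range(M_OBS):
        y.append(wait)
        wait = max(0.0, wait + rng.expovariate(1 / 0.8) - rng.expovariate(1 / 1.0))
    return y

runs = [one_replication(seed) for seed in range(N_REPS)]

# Collapse each replication to its average, then treat the n averages as the sample
y_bar = [statistics.mean(y) for y in runs]
grand_mean = statistics.mean(y_bar)
half_width = T_CRIT * statistics.stdev(y_bar) / math.sqrt(N_REPS)  # n, not n*m

print(f"{grand_mean:.2f} +/- {half_width:.2f} (from {len(y_bar)} replication averages)")
```

Pooling all n*m observations instead would understate the half width, because the within-run observations are correlated.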
Comparing Two Scenarios (1)
• Usually compare alternative system scenarios, configurations, layouts, sensitivity analysis
• Two scenarios in the simple call center model
• Base case – Model 6-4
• More-resources case – Add 3 trunk lines (29), 3 each of New Sales,
New Tech 1, New Tech 2, New Tech 3, and New Tech All
• Comparing outputs of two scenarios

• Total cost and percent rejected
• Save output statistics to files for each replication
Running Arena Without Animation
• Make CIs on expected outputs from each alternative, see if they overlap; look at Total Cost
• Base case: 22,175.19 ± 369.54, or [21,805.65, 22,544.73]
• More-resources case: 24,542.82 ± 329.11, or [24,213.71, 24,871.93]
• No overlap
But this approach doesn’t allow for a precise, efficient statistical conclusion
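The overlap check itself is mechanical, as the sketch below shows with the two Total Cost intervals from the slide (the helper names are invented for illustration).

```python
def interval(center, half_width):
    """Return (lower, upper) endpoints for center +/- half_width."""
    return (center - half_width, center + half_width)

def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

base = interval(22175.19, 369.54)            # base case
more_resources = interval(24542.82, 329.11)  # more-resources case
print(overlaps(base, more_resources))  # False
```

A check like this gives only a yes/no answer; it does not quantify the difference between the two scenarios.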
• Paired-t test
• Evaluate whether means vary over two different scenarios
• H0: The mean difference between paired observations is zero => Both scenarios have the same characteristics
• Let $d_i = y_{1,i} - y_{2,i}$ be the difference between the two observations in each pair
• 95% CI for the true mean difference: $\bar{d} \pm t_{N-1,\,0.05/2} \times \frac{s_d}{\sqrt{N}}$
• N: the number of samples
• $s_d$: sample standard deviation of the differences
• $\bar{d}$: sample mean of the differences
• Example
• Evaluate whether means vary over two different scenarios
Replication  Y1,i  Y2,i  di
1            18    22    -4
2            21    20     1
3            20    23    -3
4            25    26    -1
5            23    22     1
$\bar{d} = \frac{-6}{5} = -1.2$
$s_d = \sqrt{\frac{20.8}{4}} \cong 2.28$
$t_{4,\,0.05/2} \cong 2.78$
• 95% CI for the true mean difference
$-1.2 + 2.78 \times \frac{2.28}{\sqrt{5}} = 1.63$
$-1.2 - 2.78 \times \frac{2.28}{\sqrt{5}} = -4.03$
$-4.03 \le d \le 1.63$, where $d$ is the true mean difference
Fail to reject H0 since the CI includes zero
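The paired-t interval for this example can be recomputed as follows. This is a minimal sketch using only the standard library; the critical value is hard-coded as 2.776 (the usual table value, which the slide rounds to 2.78) rather than computed.

```python
import math
import statistics

y1 = [18, 21, 20, 25, 23]  # scenario 1, one value per replication
y2 = [22, 20, 23, 26, 22]  # scenario 2, paired by replication
T_CRIT = 2.776             # t_{4, 0.05/2}; shown as 2.78 on the slide

d = [a - b for a, b in zip(y1, y2)]  # paired differences d_i
d_bar = statistics.mean(d)
s_d = statistics.stdev(d)            # sample standard deviation of the differences
half_width = T_CRIT * s_d / math.sqrt(len(d))
lower, upper = d_bar - half_width, d_bar + half_width

print(d, d_bar, round(s_d, 2))
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
print("CI contains zero:", lower <= 0 <= upper)
```

Because the interval [-4.03, 1.63] contains zero, the data do not show a statistically significant difference between the two scenarios.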
Compare Mean via Output Analyzer (1)
• Output Analyzer
• Separate application that operates on .dat files produced
by Arena
• To save output values (Expressions) of entries in Statistic
data module (Type = Output)
• Did for both Total Cost and Percent Rejected
• Will overwrite these file names next time
• .dat files are binary … can only be read by Output
Analyzer
• Start Output Analyzer, open a new data group

• Basically, a list of .dat files of current interest
• Can save data group for later use – .dgr file extension
• Add button to select (Open) .dat files for the data group
• Analyze > Compare Means menu option

• Add data files … “A” and “B” for the two alternatives
• Select “Lumped” for Replications field
• Title, confidence level, accept Paired-t Test, do not Scale Display
since two output performance measures have different units
• Results:
• Both CIs miss 0
• Conclude that there is a (statistically) significant difference on both output performance measures
• Reject H0
Evaluating Many Scenarios with the Process Analyzer (1)
• With (many) more than two scenarios to compare, two
problems are
• Simple mechanics of making many parameter changes,
making many runs, keeping track of many output files
• Statistical methods for drawing reliable, useful
conclusions
Process Analyzer (PAN) addresses these
Evaluating Many Scenarios with the Process Analyzer (2)
• PAN operates on program (.p) files
– produced when .doe file is run
(or just checked)
• Start PAN from Arena (Tools >
Process Analyzer) or via Windows
• PAN runs on its own, separate
from Arena
Evaluating Many Scenarios with the Process Analyzer (3)
• Create a scenario in PAN
• Load a program (.p) file
• Set of input controls that
you choose
• Chosen from Variables and
Resource capacities – think
ahead
• You fill in specific numerical
values
Evaluating Many Scenarios with the Process Analyzer (4)
• Create a scenario in PAN (Cont’d)
• Set of output responses that
you choose
• Chosen from automatic Arena
outputs or your own Variables
• Values initially empty … to be
filled in after run(s)
• Duplicate (right-click,
Duplicate) scenarios, then edit
for a new one
Evaluating Many Scenarios with the Process Analyzer (5)
• PAN Projects and Runs
• Program files can be the same .p file, or .p files from
different model .doe files
• Controls, responses can be the same or differ across
scenarios in a project – usually will be mostly the same
• Think of a project as a collection of scenario rows – a table
• Can save as a PAN (.pan extension) file
Evaluating Many Scenarios with the Process Analyzer (6)
• Select scenarios in project to run (maybe all)
• PAN runs selected models with specified controls
• PAN fills in output-response values in table
• Equivalent to setting up, running them all “by hand” but
much easier, faster, less error-prone
Evaluating Many Scenarios with the Process Analyzer (7)
• Create 7 scenarios
• Controls
• Trunk line
• New Tech 1, 2, and 3
• New Tech All
• New Sales
• Responses
• Total Cost
• Percent Rejected
Evaluating Many Scenarios with the Process Analyzer (8)
• Create a chart in PAN
• Select Total Cost column, Insert > Chart (or right-click on column, then Insert Chart)
• Chart Type: Box and Whisker
• Next, Total Cost; Next defaults
• Next, Identify Best Scenarios
• Smaller is Better, Error
Tolerance = 0 (not the default)
• Show Best Scenarios; Finish
Evaluating Many Scenarios with the Process Analyzer (9)
• Vertical boxes: 95% confidence intervals
• Red scenarios statistically significantly better than blues
• More precisely, red scenarios are 95% sure to contain the best one
• Narrow down red set – more replications, or Error Tolerance > 0
• Numerical values (including C.I. half widths) in chart – right click on chart, Chart Options, Data
• So which scenario is “best”? (Criteria disagree)
• Reduce risk
• Combine them somehow?
