William A. Menner
INTRODUCTION
Constructing abstractions of systems (models) to facilitate experimentation and assessment (simulation) is both art and science. The technique is particularly useful in solving problems of complex systems where easier solutions do not present themselves. Modeling and simulation methods also allow experimentation that otherwise would be cumbersome or impossible. For example, computer simulations have been responsible for many advancements in such fields as biology, meteorology, cosmology, population dynamics, and military effectiveness. Without simulation, study of these subjects can be inhibited by the lack of accessibility to the real system, the need to study the system over long time periods, the difficulty of recruiting human subjects for experiments, or all of these factors. Because the technique offers solutions to these problems, it has become a tremendously powerful tool for examining the intricacies of today's increasingly complex world.

As powerful as modeling and simulation methods can be, applying them haphazardly can lead to erroneous conclusions. This article presents a structured set of guidelines to help the practitioner avoid the pitfalls and successfully apply modeling and simulation methodology.

Guidelines are all that can be offered, however. Despite a firm foundation in mathematics, computer science, probability, and statistics, the discipline remains intuitive. For example, the issues most relevant in a cardiology study may be quite different from those most significant in a military effectiveness study. Therefore, this article offers few strict rules; instead, it
Problem Formulation
[Figure: process diagram annotated "Validate each step."]

Formulating problems includes writing a problem statement. The problem statement describes the system under study and lists input, system parameters, and output. Generating problem statements begins with an examination of the system, which can be a formidable
JOHNS HOPKINS APL TECHNICAL DIGEST, VOLUME 16, NUMBER 1 (1995)
lagged-Fibonacci, generalized feedback shift register, and Tausworthe.11 Because a number of poor generators are also on the market,13 the modeling and simulation community has developed a general distrust of the library generators that come packaged with many software products. Therefore, analysts should choose generators for which descriptions and favorable test results have been published in reputable literature. When this strategy is not possible, output from a chosen generator should be tested for independence, correlation, and goodness of fit using statistical hypothesis testing procedures.

Random Variate Generation

The inverse-transform method for generating random variates relies on finding the inverse of the cumulative distribution function (CDF) that represents the desired distribution of variates. The CDF, F(x), is either an empirical distribution or a theoretical distribution that has been fitted to trace data. Empirical CDFs are formed by sorting trace data into increasing order and forming a continuous, piecewise-linear interpolation function through the sorted data. With either theoretical or empirical distributions, a variate Y having the CDF F(x) is generated as Y = F^-1(U), where U is a random number uniformly distributed on the interval (0, 1) generated according to the criteria just described.

The composition method is used when the CDF F(x) can be expressed as

F(x) = Σ_{j=1}^J p_j F_j(x),   (2)

where {p_j} represents a discrete probability distribution, and each F_j is a CDF. A positive random integer k, where 1 ≤ k ≤ J, is generated from the {p_j} distribution, and the final random variate Y is generated from the distribution F_k.

Sometimes random variates can be generated by taking advantage of special relationships between probability distributions. For example, if Y is normally distributed with a mean of zero and a variance of one, then Y^2 has a chi-squared distribution with 1 degree of freedom. Several other advanced techniques are also available for random variate generation, including acceptance-rejection14 and the alias method.15,16

the answers coming. Modeling is not that simple, however. It is a big mistake simply to run the model once, compute averages for output data, and conclude that the averages are the correct set of values with which to operate the system. To support accurate and useful decisions, data must be collected and characterized with much greater fidelity. Effective design of experiments begins this process.

The experimentation step involves building a computerized shell to drive the computerized model and collect data in a manner that supports the study objectives. Thus, in designing the experiments, the analyst should look back at the study objectives and look forward to the methods for analyzing output data. Typically the shell manages such things as pseudorandom number seeds and streams, warm-up periods, sample sizes, precision of results, comparison of alternatives, collection of data, variance reduction techniques, and sensitivity analysis. Variance reduction techniques and sensitivity analysis are treated in the following paragraphs; the other items either were discussed in the preceding sections or are considered later in the article.

Variance Reduction Techniques

In some cases, the precision of results produced by simulation is unacceptably poor. The most common remedy for poor precision is to collect more data during additional simulation runs, even though computational costs can limit both the number of simulation runs and the amount of data that can be gathered. Regardless of circumstances, variance reduction techniques often can improve simulation efficiency, allowing higher-precision results from the same amount of simulation. Two of the most common variance reduction techniques, common random numbers and antithetic variates, are based on the following statistical definitions17:

var[aX + bY] = a^2 var[X] + b^2 var[Y] + 2ab cov[X, Y],   (3)

where

var[X] = E[(X - E[X])^2],   (4)

cov[U, V] = E[(U - E[U])(V - E[V])].   (5)
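Returning to the inverse-transform method described above: sampling from an empirical, piecewise-linear CDF is simple enough to sketch in a few lines. The following Python fragment is an illustrative sketch only, not code from the article; the function name `empirical_variate` and the trace values are hypothetical.

```python
import random

def empirical_variate(trace, rng):
    """Inverse-transform sampling, Y = F^-1(U), from the empirical CDF
    formed by sorting trace data and interpolating piecewise-linearly."""
    xs = sorted(trace)          # sort trace data into increasing order
    n = len(xs)
    u = rng.random()            # U uniformly distributed on (0, 1)
    pos = u * (n - 1)           # fractional index into the sorted data
    i = int(pos)
    if i >= n - 1:              # guard the upper endpoint
        return xs[-1]
    frac = pos - i
    # Piecewise-linear interpolation between adjacent sorted points.
    return xs[i] + frac * (xs[i + 1] - xs[i])

rng = random.Random(42)
trace = [2.0, 5.0, 3.0, 8.0, 4.0]   # hypothetical trace data
samples = [empirical_variate(trace, rng) for _ in range(1000)]
```

Because an empirical CDF interpolates between the smallest and largest observations, every generated variate stays inside the observed range; a fitted theoretical distribution would be needed to produce values outside it.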
the same output variable. A point estimate for the difference between corresponding output from the two models is

X - Y = (1/n) Σ_{i=1}^n (X_i - Y_i).   (6)

[...]

(X + Y)/2 = (1/(2n)) Σ_{i=1}^n (X_i + Y_i).   (8)

The width of the confidence interval for this point estimate depends on var[(X + Y)/2], which, using Eq. 3 with a = 1/2 and b = 1/2, is

var[(X + Y)/2] = (1/4) var[X] + (1/4) var[Y] + (1/2) cov[X, Y].   (9)

[...]

E[X] = ∫_{all x} x f(x, θ) dx,   (10)

and, using the usual continuity assumptions regarding the interchange of integration and differentiation, it follows that [...] For a nonterminating system, the analyst is interested in characterizing the distributions of output processes once they have become invariant to the passage of time.
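Equations 3 and 5 explain why antithetic variates work: pairing each uniform U with 1 - U makes the paired outputs negatively correlated for a monotone response, so the 2ab cov[X, Y] term reduces var[(X + Y)/2]. The following Python sketch is illustrative only; the quadratic response g and all sample sizes are hypothetical choices, not values from the article.

```python
import random
import statistics

def estimate_mean(n_pairs, rng, antithetic):
    """Estimate E[g(U)] from 2*n_pairs uniforms. With antithetic sampling,
    each U is paired with 1 - U; for a monotone g this makes
    cov[g(U), g(1 - U)] negative, shrinking var[(X + Y)/2] per Eq. 3."""
    def g(u):
        return u * u            # hypothetical monotone output function
    total = 0.0
    for _ in range(n_pairs):
        u = rng.random()
        v = 1.0 - u if antithetic else rng.random()
        total += g(u) + g(v)
    return total / (2 * n_pairs)

rng = random.Random(1)
plain = [estimate_mean(50, rng, antithetic=False) for _ in range(200)]
anti = [estimate_mean(50, rng, antithetic=True) for _ in range(200)]
# Both estimate E[U^2] = 1/3; the antithetic replications vary far less.
var_plain = statistics.variance(plain)
var_anti = statistics.variance(anti)
```

Here g(U) and g(1 - U) move in opposite directions as U grows, so their covariance is negative; that is exactly the 2ab cov[X, Y] term of Eq. 3 working in the analyst's favor.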
unlikely to be independent. This problem, called initialization bias, is typically solved by ignoring data collected during the warm-up period.

Many procedures exist for determining when a simulation has achieved a steady state.20 One of the most effective involves plotting moving averages of each frequently sampled output variable during many independent replications.21,22 The time at which steady state occurs is chosen as the time beyond which the moving average sequences converge, provided the plots are reasonably smooth. Plot smoothness is sensitive to moving average window size and may need to be adjusted, a process similar to finding acceptable interval widths for histograms.

If a good estimate of steady-state time is available, nonterminating systems can be treated much like terminating systems by using a method called independent replications with truncation. In this method, independent pseudorandom number streams are used for each simulation run, and output data collection is truncated during the warm-up period. The moving average values of each output variable are recorded as point estimates once steady state is achieved. This method produces IID data, but it has the disadvantage of "throwing away" the warm-up period during each run. If simulation runs are costly, one-long-run methods may be preferred, but these methods force the analyst to deal with the problem of autocorrelation.

Autocorrelation means that consecutively collected data points exhibit correlation with one another; they are not independent. For example, two cars arriving consecutively at a car wash both must wait for all cars ahead of them. Compounding this problem is the tendency for simulation output to be positively autocorrelated. Change occurs more slowly under such conditions, resulting in confidence intervals that underestimate the amount of error in output data.

Autocorrelation effects can be defeated in many ways. The batch means method is perhaps the most common. With this method, data collected during steady state are grouped into batches and the sample mean is computed for each batch. For large enough batch sizes, the batch means are sufficiently uncorrelated that they can be treated as IID data for computation of a confidence interval. Methods for determining batch size generally use one of the many hypothesis testing procedures for independence, such as the runs test, the flat spectrum test, and correlation tests.23

Other methods for output data analysis include the autoregressive method, the regenerative method, the spectral estimation method, and the standardized time series method.24

Most output data analysis methods can be approached in two different ways: by specifying fixed sample sizes or by specifying precision. The fixed sample size approach involves collecting a prespecified number of data points from which a confidence interval is computed. The precision of the confidence interval thus depends on the sample size. The precision approach increases sample size until a prespecified precision is achieved. Choosing the best approach for a given study depends on factors such as the desired accuracy and the cost of simulation runs.

In either approach, the analyst must remember that confidence interval analysis is based on the central limit theorem,17 which states that the means of random samples of any parent population having finite mean and variance will approach a normal distribution as the sample size tends to infinity. Sample size must be large enough to justify the assumption that collected output data are approximately normally distributed. The size of a large-enough sample is proportional to the skewness (i.e., nonsymmetry) of the parent distribution for an output process. If the parent distribution is not skewed, small sample sizes (as few as 10 data points) may suffice. For severely skewed output processes, on the other hand, sample sizes of 50 may not be large enough. The nature of some systems may suggest a certain level of skewness in output processes, but in general, parent output distributions are unknown quantities. Therefore, when in doubt, collect more data.

VALIDATING THE MODEL

Because modeling and simulation involve imitating an existing or proposed system using a model that is simpler than the system itself, questions regarding the accuracy of the imitation are never far away. The process of answering these questions is broadly termed validation. More specifically, the term validation is used to describe the concern over accuracy in modeling. The term verification describes efforts to ensure that the computerized model and its implementation are "correct." The terms accreditation, credibility, or acceptability describe efforts to impart enough confidence to users of the model or its results to guarantee the implementation of study conclusions. Validation, verification, and accreditation (known collectively as VV&A) cover all steps in the modeling and simulation process, implying that accuracy is of great concern throughout the study.

Performing VV&A on simulation models has been likened to the general problem of validating any scientific theory. Thus, borrowing from the philosophy of science, three historical approaches to VV&A have evolved: rationalism, empiricism, and positive economics. Rationalism involves establishing indisputable axioms regarding the model. These axioms form the basis for logical deductions from which the model is developed. Empiricism refuses to accept any axiom,
deduction, assumption, or outcome that cannot be substantiated by experiment or the analysis of empirical data. Positive economics is concerned only with the model's predictive ability. It imposes no restrictions on the model's structure or derivation.

Practical efforts to validate simulation models should attempt to include all three historical VV&A approaches. Including all approaches can be difficult, because situations do exist in which high predictability goes hand-in-hand with illogical assumptions, as in the case of studying celestial systems or amoebae aggregation using continuum models.25 Empirically validating a model of a nonexistent system is also a problem. Despite these difficulties, some subset of available validation techniques must be systematically applied to every modeling and simulation study if conclusions are to be accepted and implemented.

The modeling and simulation team must first decide who will conduct VV&A. The team can perform this task itself or can use an independent party of experts. If the team performs VV&A, few new perspectives are generally brought to bear on the study. The use of independent peer assessment can be costly, however, and can involve repetition of previous work. In either case, the VV&A team should be familiar with modeling and simulation methods and their potential pitfalls. The VV&A team's first step should be comparing the methods used by the modeling and simulation team to the accepted methods outlined in any of the standard texts on the subject.

Validation techniques typically are categorized as statistical or subjective. Statistical techniques are used in studies of existing systems from which data can be collected. They are particularly useful for assessing the validity of input data models, analyzing output data, comparing alternative models, and comparing model and system output. Statistical techniques include, but are not limited to, analysis of variance, confidence intervals, goodness-of-fit tests, hypothesis tests, regression analysis, sensitivity analysis, and time series analysis. In contrast, subjective techniques generally are used during proof-of-concept studies or to validate systems that cannot be observed and from which data cannot be collected. Subjective techniques include, but are not limited to, event validation, field tests, hypothesis validation, predictive validation, submodel testing, and Turing tests.26

Verifying the correctness of the computerized model and its implementation has two main components: technical reviews and software testing. Technical reviews include structured walk-throughs of the software requirements, design, code, and test plan. Software requirements are assessed for their ability to support the objectives of the modeling and simulation study. The software design is reviewed in two phases. Phase 1, a preliminary design review, evaluates the transformation of requirements into data and code structures; phase 2, a critical design review, walks through the details of the design, assessing such things as algorithm correctness and error handling. A structured walk-through of the software code checks for such things as coding errors and adherence to coding standards and language conventions.

Software testing is designed to uncover errors in the computerized implementation and includes white- and black-box testing. White-box tests ensure that all coded statements have been exercised and that all possible states of the model have been assessed. Black-box tests explore the functionality of software components through the design of scenarios that exhaustively examine the cause-and-effect relationships between inputs and outputs. All of these software verification techniques are applied with respect to the list of software quality factors given in the Constructing System Models section.

Most software systems provide tools to assist the VV&A process. Among the more important are graphics and animation, interactive debuggers, and the automated production of logic traces. Graphics and animation involve the visual display of model status and the flow of entities through the model. These tools are extremely useful, particularly for spotting gross errors in computerized model output and for enhancing model credibility through demonstrations. Interactive debuggers help pinpoint problems by stopping the simulation at any desired time to allow the value of certain program variables to be examined or changed. A logic trace is a chronological, textual record of changes in system states and program variables. It can be checked by hand to determine the specific nature of errors. Typically, the logic trace can be tuned to focus on a particular area of the model.

Ultimately, for each step of modeling and simulation, the goal of VV&A is to avoid committing three types of errors. A type I error (called the model builder's risk) results when study conclusions are rejected despite being sufficiently credible. A type II error (called the model user's risk) results when study conclusions are accepted despite being insufficiently credible. A type III error (called solving the wrong problem) results when significant aspects of the system under study are left unmodeled. Many methods besides the techniques described here are available for avoiding these errors and producing a model that is accepted as reasonable by all people associated with the system under study. These methods include the use of existing theory, comparisons with similar models and systems, conversations with system experts, and the active involvement of analysts, managers, and sponsors in all steps of the process.
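As an illustration of the statistical validation techniques listed above, a confidence interval can be placed on the difference between mean system output and mean model output; if the interval covers zero, the observed data do not contradict the model. The following Python sketch is hypothetical (the data arrays are invented), and with samples this small a Student-t multiplier would be more defensible than the large-sample 1.96 used here.

```python
import statistics

def diff_ci(system_out, model_out, z=1.96):
    """Large-sample confidence interval for the difference in means
    between observed system output and simulated model output."""
    d = statistics.fmean(system_out) - statistics.fmean(model_out)
    # Standard error of the difference of two independent sample means.
    se = (statistics.variance(system_out) / len(system_out)
          + statistics.variance(model_out) / len(model_out)) ** 0.5
    return d - z * se, d + z * se   # interval covering 0 => no contradiction

system_out = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 10.0, 9.7]   # hypothetical
model_out = [10.0, 10.4, 9.9, 10.2, 9.8, 10.1, 10.3, 9.6]    # hypothetical
low, high = diff_ci(system_out, model_out)
```

An interval that excludes zero would flag a discrepancy worth investigating, though, as with all hypothesis tests, it cannot by itself prove the model valid.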
Also, over the course of the study, the modeling and simulation team should systematically communicate all major decisions to every participant. This step helps ensure that the model retains credibility from the start of the work through the final stages of the study.

Unfortunately, one short article cannot present all issues relevant to modeling and simulation. Readers interested in further study of the issues presented here are encouraged to consult the many excellent modeling and simulation textbooks available, such as Banks and Carson27; Bratley, Fox, and Schrage28; Fishman23; Law and Kelton10; Lewis and Orav29; and Pritsker.30

REFERENCES

1. Haberman, R., Mathematical Models, Prentice-Hall, Englewood Cliffs, NJ (1977).
2. Maki, D. P., and Thompson, M., Mathematical Models and Applications, Prentice-Hall, Englewood Cliffs, NJ (1973).
3. Polya, G., How To Solve It, 2nd Ed., Princeton University Press, Princeton, NJ (1957).
4. Saaty, T. L., and Alexander, J. M., Thinking with Models, Pergamon Press, New York (1981).
5. Starfield, A. M., Smith, K. A., and Bleloch, A. L., How To Model It: Problem Solving for the Computer Age, McGraw-Hill, New York (1990).
6. Sommerville, I., Software Engineering, 3rd Ed., Addison-Wesley, Reading, MA (1989).
7. Pressman, R. S., Software Engineering, 3rd Ed., McGraw-Hill, New York (1992).
8. Banks, J., "Software for Simulation," in 1987 Winter Simul. Conf. Proc., Thesen, A., Grant, H., and Kelton, W. D. (eds.), pp. 24-33 (1987).
9. Savage, S. L., "Monte Carlo Simulation," in IEEE Videoconference on Engineering Applications for Monte Carlo and Other Simulation Analyses, IEEE, Inc., New York (1993).
10. Law, A. M., and Kelton, W. D., Simulation Modeling and Analysis, 2nd Ed., McGraw-Hill, New York (1991).
11. L'Ecuyer, P., "A Tutorial on Uniform Variate Generation," in 1989 Winter Simul. Conf. Proc., MacNair, E. A., Musselman, K. J., and Heidelberger, P. (eds.), pp. 40-49 (1989).
12. Morgan, B. J. T., Elements of Simulation, Chapman and Hall, London (1984).
13. Park, S. K., and Miller, K. W., "Random Number Generators: Good Ones Are Hard to Find," Commun. ACM 31, 1192-1201 (1988).
14. Schmeiser, B. W., and Shalaby, M. A., "Acceptance/Rejection Methods for Beta Variate Generation," J. Am. Stat. Assoc. 75, 673-678 (1980).
15. Walker, A. J., "An Efficient Method of Generating Discrete Random Variables with General Distributions," ACM Trans. Math. Software 3, 253-256 (1977).
16. Kronmal, R. A., and Peterson, A. V., "On the Alias Method for Generating Random Variables from a Discrete Distribution," Am. Stat. 33, 214-218 (1979).
17. Harnett, D. L., Statistical Methods, 3rd Ed., Addison-Wesley, Reading, MA (1982).
18. Nelson, B. L., "Variance Reduction for Simulation Practitioners," in 1987 Winter Simul. Conf. Proc., Thesen, A., Grant, H., and Kelton, W. D. (eds.), pp. 43-51 (1987).
19. Strickland, S. G., "Gradient/Sensitivity Estimation in Discrete-Event Simulation," in 1993 Winter Simul. Conf. Proc., Evans, G. W., Mollaghasemi, M., Russell, E. C., and Biles, W. E. (eds.), pp. 97-105 (1993).
20. Schruben, L. W., "Detecting Initialization Bias in Simulation Output," Oper. Res. 30, 569-590 (1982).
21. Welch, P. D., On the Problem of the Initial Transient in Steady State Simulations, Technical Report, IBM Watson Research Center, Yorktown Heights, NY (1981).
22. Welch, P. D., "The Statistical Analysis of Simulation Results," in The Computer Performance Modeling Handbook, Lavenberg, S. (ed.), pp. 268-328, Academic Press, New York (1983).
23. Fishman, G. S., Principles of Discrete Event Simulation, John Wiley & Sons, New York (1978).
24. Alexopoulos, C., "Advanced Simulation Output Analysis for a Single System," in 1993 Winter Simul. Conf. Proc., Evans, G. W., Mollaghasemi, M., Russell, E. C., and Biles, W. E. (eds.), pp. 89-96 (1993).
25. Lin, C. C., and Segel, L. A., Mathematics Applied to Deterministic Problems in the Natural Sciences, Macmillan, New York, pp. 3-31 (1974).
26. Sargent, R. G., "Simulation Model Verification and Validation," in 1991 Winter Simul. Conf. Proc., Nelson, B. L., Kelton, W. D., and Clark, G. M. (eds.), pp. 37-47 (1991).
27. Banks, J., and Carson, J. S., Discrete-Event System Simulation, Prentice-Hall, Englewood Cliffs, NJ (1984).
28. Bratley, P., Fox, B. L., and Schrage, L. E., A Guide to Simulation, 2nd Ed., Springer-Verlag, New York (1987).
29. Lewis, P. A. W., and Orav, E. J., Simulation Methodology for Statisticians, Operations Analysts, and Engineers, Vol. I, Wadsworth, Pacific Grove, CA (1989).
30. Pritsker, A. A. B., Introduction to Simulation and SLAM II, John Wiley & Sons, New York (1986).
THE AUTHOR