
Book Reviews

Stochastic Simulation
Brian D. Ripley
New York: John Wiley and Sons, 1987. xiv + 237 pp.
One fifth (4 of 20) of the research articles published in the Journal of
Educational Statistics in 1988 include simulation studies that justify or illus-
trate the authors' conclusions. A similar fraction (6 of 33) of the articles in
the 1988 volume of Psychometrika include simulations; comparable propor-
tions could be expected in other journals at the boundary of theoretical
statistics and social/psychological applications. Due in part to the complex-
ity of the problems tackled today and in part to the availability of cheap,
powerful computing—by no means independent influences—simulation
and Monte Carlo methods have become both necessary and practical tools
for statisticians and applied workers in quantitative areas of education and
psychology.
Simulation has become popular—not only in the quantitative social sci-
ences, but in all of the mathematical sciences from physics to operations
research to number theory—because it is almost always easy to do. This
ease of use makes the simulation experimenter vulnerable to two common
pitfalls. Selection of the basic source of "random numbers" is often passive:
Whatever is available in the computer's standard subroutine library is used.
However, the fact that a pseudo-random number generator appears in a
popular software package or operating system is hardly reason to trust it,
as is shown by the infamous RANDU generator, once popular on IBM
mainframes and PDP mini-computers, and by the generators burned into
ROM on today's PCs. Simulation design and reporting also deserve special
care. Some attempt must be made to assess the accuracy of the simulation
estimates: One should accurately estimate and report $SE(\hat{\theta})$ as well as $\hat{\theta}$. In
addition, enough detail should be reported that the interested reader can
replicate the study and check the results, just as with other experiments.
Yet these considerations are also easy to overlook.
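To make the reporting point concrete, here is a minimal sketch (my own, not an example from the book) of the practice this paragraph calls for: a simulation estimate of a tail probability, reported together with its Monte Carlo standard error and the seed needed to replicate the run.

import math
import random

# Sketch of the reporting practice described above: estimate
# theta = P(|Z| > 1.96) for Z ~ N(0, 1) by simulation and report the
# Monte Carlo standard error alongside the point estimate.
random.seed(12345)          # record the seed so the study can be replicated
n = 100_000
hits = sum(abs(random.gauss(0.0, 1.0)) > 1.96 for _ in range(n))

theta_hat = hits / n                                    # simulation estimate
se_hat = math.sqrt(theta_hat * (1.0 - theta_hat) / n)   # binomial SE of the estimate
print(f"theta_hat = {theta_hat:.4f}, SE = {se_hat:.4f}")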
Brian D. Ripley's Stochastic Simulation is a short, yet ambitious, survey
of modern simulation techniques. Three themes run throughout the book.
First, one should not take basic simulation subroutines for granted, es-
pecially on mini- or microcomputers where they tend to be poor implemen-
tations, implementations of poor algorithms, or both. Second, design of
experiments, or variance reduction as it is known in this field, deserves
greater consideration. Third, modern methods make it possible to simulate
and analyze processes that are dependent over time, and using such pro-
cesses opens the door to new simulation techniques, such as simulated
annealing in optimization.
Ripley intends this book to be a "comprehensive guide," and it is indeed
most accurately described as a researcher's handbook with examples and
exercises. Each of Ripley's seven chapters treats a separate topic in simu-
lation, usually illustrated with interesting examples, and ends with a smat-
tering of exercises. The chapters are largely independent of one another,
except for chapter 7, which depends on everything else. A guide such as this
stands or falls on the topics chosen: Does it cover the topic you are inter-
ested in for your work? Hence, I will spend most of this review summarizing
the topics Ripley has chosen. At the end of the review I will indicate how
I believe the book can be used.
The first four chapters of Stochastic Simulation deal with general issues
of pseudo-random number generation and nonuniform random variate
generation. Chapter 5 considers variance reduction and other experimental
design issues. Chapter 6 treats some special topics in the simulation and
analysis of dependent data, such as queueing systems, which would be of
particular interest to operations researchers. Finally, chapter 7 presents a
brief survey of some modern applications of simulation in statistical infer-
ence, optimization, and numerical analysis. The end matter includes an
extensive (approximately 350 references) bibliography, a description of the
machines and generators used for examples in the book, a list of sub-
routines in Fortran 77 to accompany chapters 2 and 3, and a short index.
The subroutines are not available on disk or tape.
Stochastic Simulation begins with a very engaging chapter that lays out
the basic issues of generating adequate pseudo-random numbers, choosing
between simulation and analysis in modeling, treating simulation as experi-
mentation, and using simulation in statistical inference. The last substantive
section of the chapter presents three examples intended to whet the read-
er's appetite for clever simulation: some simple uses of simulation to check
asymptotic distribution theory, an illustration of variance reduction for
comparing estimators in a location family, and a queueing problem. The
examples are given in uneven detail; in the variance reduction example,
Ripley charges into calculations without explaining what the goal is. On the
other hand, it was nice to see two of the three examples illustrated using a
BBC microcomputer, roughly equivalent to an Apple IIe. Indeed, through-
out the text Ripley repeatedly brings PCs—mostly IBM PCs and BBC
micros—into the discussion.
Chapter 2 treats the construction and testing of pseudo-random number
generators. After a brief discussion of general principles and some special
cases (including the random number generators of several microcomputer
BASICs), both of the currently popular general techniques, linear congru-
ential generators and shift-register generators, are discussed in detail. The
emphasis is on lattice structure—the tendency of successive pairs, triples,
and so forth, from the generator to lie on a few hyperplanes and hence ruin
independence assumptions—and the theoretical assessment of these gener-
ators. Empirical tests, and shuffling of generator output to increase appar-

83
Book Reviews
ent randomness, are also briefly mentioned. Some of the harder math
needed for the theoretical treatment is confined to a special "proofs"
section. The algorithms presented in this chapter for developing, operating,
and testing these generators are reproduced as Fortran 77 subroutines in
Appendices B.1 through B.4. Finally, comparisons and recommendations
are made among several generators in current use.
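The lattice problem is easy to demonstrate for oneself. The short sketch below (mine, not one of Ripley's appendix subroutines) reproduces the classic failure of RANDU: successive triples satisfy an exact planar relation modulo 2^31 and therefore fall on only a handful of parallel planes in the unit cube.

MOD = 2 ** 31    # RANDU: x_{n+1} = 65539 * x_n (mod 2^31), odd seed

def randu(seed, n):
    """Return n successive RANDU integers starting from an odd seed."""
    x, out = seed, []
    for _ in range(n):
        x = (65539 * x) % MOD
        out.append(x)
    return out

xs = randu(1, 1000)
triples = list(zip(xs, xs[1:], xs[2:]))

# Every triple satisfies x_{n+2} - 6 x_{n+1} + 9 x_n = 0 (mod 2^31) exactly,
# so all triples fall on at most 15 parallel planes in the unit cube.
assert all((x2 - 6 * x1 + 9 * x0) % MOD == 0 for x0, x1, x2 in triples)
print(f"all {len(triples)} successive triples lie on the RANDU lattice planes")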
As is characteristic of the entire book, Ripley shines when he explains the
general ideas in this chapter, but he often lapses into a hurried, telegraphic
style when he must do mathematics. Proofs tend to be short sketches, and
unusual nomenclature is often used without comment. For example, to my
ear a nonsingular matrix is a square one with an inverse; Ripley uses
nonsingular for any matrix, square or not, of full rank. By itself, this is not
serious, but continued as a habit it is distracting in the already terse math-
ematical discussions. A more leisurely discussion of linear congruential and
shift-register generators can be found in chapter 6 of Kennedy and Gentle
(1980).
In chapter 3 Ripley turns to the transformation of uniform random vari-
ables (which pseudo-random number generators are intended to generate)
into independent observations from other univariate distributions. There
are two morals here. First, transformation algorithms should be correct and
easy to check: This suggests short, simple codes for which speed is a close
second, not a first, consideration. Second, the transformations used should
not unduly accentuate deficiencies of the pseudo-random number gener-
ator used as raw material.
To make the point that transformations can seriously amplify flaws in the
underlying generator, Ripley plots successive pairs of variates $(X_i, X_{i+1})$
from the Box-Muller polar transformation algorithm, intended to produce
pairs of uncorrelated normals. The pictures are shocking: If poor gener-
ators are used for input, spirals, pinwheels, and wagon wheels result—plots
that resemble anything but the expected bivariate $N(0, I)$ data cloud. Other
transformations intended to produce normals produce bows, paintbrush
spatters, and floral patterns—one of which stands, presumably as a warn-
ing, on the dustcover of the book!
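The experiment is easy to repeat. The sketch below (my own illustration, using the basic trigonometric form of the Box-Muller transform and a deliberately tiny-modulus linear congruential generator standing in for a "poor" one) produces pairs that, when plotted, show exactly the kind of structured pattern Ripley warns about, where a trustworthy generator gives a featureless normal cloud.

import math

def bad_lcg(seed, n, a=65, c=1, m=2048):
    """A deliberately poor linear congruential generator, mapped to (0, 1)."""
    x, out = seed, []
    for _ in range(n):
        x = (a * x + c) % m
        out.append((x + 0.5) / m)
    return out

u = bad_lcg(1, 2000)
pairs = []
for u1, u2 in zip(u[0::2], u[1::2]):
    r = math.sqrt(-2.0 * math.log(u1))       # Box-Muller (trigonometric form)
    pairs.append((r * math.cos(2.0 * math.pi * u2),
                  r * math.sin(2.0 * math.pi * u2)))

# Plotting `pairs` (e.g., with matplotlib) reveals a structured, spiral-like
# pattern; the same transform driven by a trustworthy generator shows only
# the featureless bivariate normal cloud.
print(pairs[:3])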
A useful collection of general techniques and particular algorithms for
producing nonuniform variates is presented in the latter part of the chapter.
The now-standard tricks of inverse probability transform, acceptance/
rejection, mixture composition, ratio of uniforms, and squeezing difficult-
to-evaluate functions between polynomial envelopes are all well repre-
sented. These ideas are then specialized to produce specific algorithms for
many common discrete and continuous distributions. Some of the algo-
rithms are reproduced as Fortran 77 subroutines in Appendices B.5 through
B.8. Separate recommendations are made for mainframe and PC use.
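As a flavor of these general techniques, the following sketch (mine; the book's appendix routines are in Fortran 77 and considerably more refined) implements simple acceptance/rejection: candidates from a Uniform(0, 1) envelope are accepted with probability proportional to the Beta(2, 2) density $f(x) = 6x(1 - x)$, which is bounded by $M = 1.5$.

import random

def beta22():
    """Acceptance/rejection sampler for the Beta(2, 2) density f(x) = 6x(1 - x)."""
    M = 1.5                                  # bound on f over (0, 1)
    while True:
        x = random.random()                  # candidate from the Uniform(0, 1) envelope
        u = random.random()                  # uniform for the accept/reject decision
        if u * M <= 6.0 * x * (1.0 - x):     # accept with probability f(x) / M
            return x

random.seed(1)
sample = [beta22() for _ in range(10_000)]
print(sum(sample) / len(sample))             # close to the Beta(2, 2) mean of 0.5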
The general recommendations at the end of chapters 2 and 3 are quite
helpful, but one point that is left implicit in these two chapters deserves
explicit mention: There is no universally good generator, whether of uni-
form or nonuniform variates. Practical considerations such as machine
speed and word size tend to conflict with multivariate independence, period
length, and other theoretical considerations. A particular simulation prob-
lem will, in general, require only a few properties of randomness; one
should test the generator(s)/transformation(s) one intends to use thor-
oughly for these properties before running the full simulation, and one
should be aware that properties not empirically verified may well not hold.
Chapter 4 treats the generation of dependent variates. The choices of
models to simulate tend to be time-series rather than simple multivariate
distributions. Although this will be very appealing to operations research-
ers, it is less so to education and psychology researchers, whose data tend
to be independent and identically distributed observations of vectors from
(discrete or continuous) multivariate distributions. A short section does
treat the multivariate normal distribution, as well as simple multivariate
discrete cases in which the joint distribution is known. Nevertheless, the
techniques in this chapter are useful in modern simulation applications such
as Monte Carlo optimization and Monte Carlo solution of linear systems.
The bulk of the chapter is devoted to Poisson processes with variable rates,
Markov processes, and Gaussian processes. Two final sections treat the
approximation of any discrete multivariate distribution as the equilibrium
distribution of a point process, and the simulation of Markov random fields.
Variance reduction is the subject of chapter 5. The goal is to reduce,
through careful design of the simulation, the sampling variance of the
estimates produced by the simulation. The context of the discussion is
Monte Carlo integration, in which an integral such as
$$\theta = E_f[g(X)] = \int g(x)\,f(x)\,dx \qquad (1)$$
is approximated by the unbiased estimator
$$\hat{\theta}_{f,g} = \frac{1}{n}\sum_{i=1}^{n} g(x_i), \qquad x_i \sim \text{i.i.d. } f(\cdot). \qquad (2)$$
Four methods are discussed. Importance sampling seeks to reduce $\operatorname{Var}\hat{\theta}_{f,g}$
by observing that $E_f[g(X)] = E_h[k(Y)]$, where $h(y)$ is any new density and
$k(y) = g(y)f(y)/h(y)$. Thus $\hat{\theta}_{h,k}$ is also unbiased for $\theta$, and $\operatorname{Var}\hat{\theta}_{h,k} \le
\operatorname{Var}\hat{\theta}_{f,g}$ as long as $h$ mimics $g \cdot f$; $h$ may also be easier to simulate from than
$f$. The methods of control variates and antithetic variates are strategies for
pairing an existing unbiased estimator $\hat{\theta}$ with a negatively correlated un-
biased estimator $V$ of zero such that the new estimator $\tilde{\theta} = \hat{\theta} + V$ has a lower
variance. Conditioning exploits the well-known fact that for any covariate
$W$, $\operatorname{Var} E[\hat{\theta} \mid W] \le \operatorname{Var}\hat{\theta}$; the trick is to find $W$ such that $\tilde{\theta} = E[\hat{\theta} \mid W]$ is easy
to calculate. A final section sketches relevant methods from the conven-
tional design of experiments literature. Variance reduction is worthwhile to
consider, especially when X is multivariate, and probably often overlooked
in practice. As a digest of some of the more successful techniques, chapter
5 should promote a wider awareness of experimental design in simulation.
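A tiny numerical illustration of the importance-sampling identity above may help (the example is my own, not one of Ripley's): with $g(x) = 1\{x > 3\}$ and $f$ the $N(0, 1)$ density, drawing from a shifted exponential $h$ that mimics $g \cdot f$ in the tail estimates the same tail probability with far smaller variance than the naive estimator (2).

import math
import random

def phi(x):
    """Standard normal density f."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

random.seed(2)
n = 10_000

# Naive estimator (2): average g(X) = 1{X > 3} over draws from f = N(0, 1).
naive = sum(random.gauss(0.0, 1.0) > 3.0 for _ in range(n)) / n

# Importance-sampling estimator: average k(Y) = g(Y) f(Y) / h(Y) over draws
# from h(y) = exp(-(y - 3)), y > 3, which mimics g * f in the tail.
total = 0.0
for _ in range(n):
    y = 3.0 + random.expovariate(1.0)        # draw from h
    total += phi(y) / math.exp(-(y - 3.0))   # k(y); g(y) = 1 since y > 3
importance = total / n

print(f"naive: {naive:.5f}  importance sampling: {importance:.5f}  true: 0.00135")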
In chapter 6 Ripley returns to dependent processes, again concentrating
on time series and queueing processes. Typically, a parameter $\theta$ of the
steady-state distribution of a process $X_1, X_2, \ldots$ is to be estimated; a biased
estimator $Z_m = \psi(X_{T(m)}, \ldots, X_{T(m+1)-1})$ is assumed, and the bias and variance
of a new estimator based on the $Z_m$ are considered. Among the issues addressed
are choice of $d$ so that $Z_{d+1}, Z_{d+2}, \ldots$ is approximately stationary; splitting
$Z_1, Z_2, \ldots$ up into batches to obtain approximately independent batch
means; and fitting AR and ARMA models to the $Z_m$'s. The last general
strategy considered is choosing $T(m)$ so that the "tours" $X_{T(m)}, \ldots,$
$X_{T(m+1)-1}$ are genuinely independent, for different $m$'s. This technique is
called regenerative simulation and works particularly well in queueing pro-
cesses, which have independent tours between "visits" to an empty queue.
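For readers outside operations research, the batch-means idea is worth a concrete sketch (my own, with an AR(1) series standing in for the simulation output process): the naive i.i.d. variance formula badly understates the uncertainty in the time average, while means of long batches are nearly independent and give a usable standard error.

import math
import random

random.seed(3)
d, m = 1_000, 20_000                 # burn-in length and retained sample size
x, series = 0.0, []
for t in range(d + m):
    x = 0.9 * x + random.gauss(0.0, 1.0)     # AR(1) output process
    if t >= d:                               # discard the transient
        series.append(x)

theta_hat = sum(series) / m                  # estimate of the steady-state mean

b = 50                                       # number of batches
size = m // b
batch_means = [sum(series[i * size:(i + 1) * size]) / size for i in range(b)]
grand = sum(batch_means) / b
se_batch = math.sqrt(sum((z - grand) ** 2 for z in batch_means) / (b * (b - 1)))

# The naive i.i.d. formula ignores the serial correlation and is far too small.
se_naive = math.sqrt(sum((z - theta_hat) ** 2 for z in series) / (m - 1) / m)
print(f"theta_hat = {theta_hat:.3f}  batch SE = {se_batch:.3f}  naive SE = {se_naive:.3f}")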
Finally, chapter 7 surveys some novel uses of simulation. Readers learn
about Monte Carlo hypothesis testing, Monte Carlo confidence intervals,
and the bootstrap as general techniques in statistical inference. Several
simulation methods in optimization are discussed, including an engaging
introduction to simulated annealing in the context of image reconstruction.
Markov methods for solving linear systems of equations are also consid-
ered, as well as quasi-Monte Carlo integration, in which nonindependent
samples are used in (2) to create maximal variance reduction. (A charming
historical discussion of variance reduction as applied to Buffon's needle
problem seems out of place at the end of chapter 7; a shorter version would
be better suited to chapter 1 or chapter 5.) Although none of the topics in
chapter 7 are considered in detail, the sketches Ripley provides are crisp
and informative. Armed with these sketches, the interested reader can
consult the original sources to learn more.
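The Monte Carlo test, at least, is simple enough to sketch here (a toy example of my own, not the book's): simulate the test statistic under the null hypothesis and read the p-value off the rank of the observed statistic among the simulated ones.

import random

def statistic(data):
    """Test statistic: the sample maximum (an arbitrary choice for illustration)."""
    return max(data)

observed = [0.31, 0.97, 0.55, 0.92, 0.99, 0.74, 0.88, 0.95, 0.61, 0.99]
t_obs = statistic(observed)

random.seed(4)
N = 999                                      # number of simulations under H0
t_null = [statistic([random.random() for _ in observed]) for _ in range(N)]

# Under H0 (i.i.d. Uniform(0, 1) data) the observed statistic is exchangeable
# with the simulated ones, so this rank-based p-value is exact.
p_value = (1 + sum(t >= t_obs for t in t_null)) / (N + 1)
print(f"Monte Carlo p-value = {p_value:.3f}")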
Who can use Stochastic Simulation? The book is not really aimed at
classroom use: Ripley's preface directs it to "all those who use simulation
in their work." A better choice for the classroom might be Rubinstein's
(1981) less ambitious but more detailed text. Stochastic Simulation can be
used as a handbook for algorithms and an entrée into the general simula-
tion literature; such a user needs to be familiar with computing and with the
language of statistical inference and simple stochastic processes. However,
researchers who intend to use this book to help them adapt simulation
techniques to their own fields need a solid background in undergraduate
mathematics and an understanding of statistics at the level of Bickel and
Doksum's (1977) graduate text.
Stochastic Simulation is intended to be a comprehensive guide. In most
areas it serves as an authoritative introduction, but it falls short of in-depth
coverage. For example, although chapters 3 and 4 on nonuniform random
variate generation are quite good, much more is presented in the encyclo-
pedic work of Devroye (1986). In the book's later chapters, Ripley's style
is to give the reader a general feel for each technique followed by indica-
tions of original sources. As an initial reference from which to get basic
ideas, literature references, and a well-developed point of view on the
current state of Monte Carlo methods, Stochastic Simulation is a valuable
resource.

References
Bickel, P. J., & Doksum, K. A. (1977). Mathematical statistics: Basic ideas and
selected topics. Oakland: Holden-Day.
Devroye, L. (1986). Non-uniform random variate generation. New York: Springer-
Verlag.
Kennedy, W. J., Jr., & Gentle, J. E. (1980). Statistical computing. New York:
Marcel Dekker.
Rubinstein, R. Y. (1981). Simulation and the Monte Carlo method. New York:
Wiley.

BRIAN W. JUNKER
University of Illinois at Urbana-Champaign

