
JOURNAL OF SPACECRAFT AND ROCKETS

Vol. 49, No. 1, January–February 2012

Applying Monte Carlo Simulation to Launch Vehicle Design and Requirements Verification

John M. Hanson∗
NASA Marshall Space Flight Center, Huntsville, Alabama 35812
and
Bernard B. Beard†
ARES Corporation, Huntsville, Alabama 35805
DOI: 10.2514/1.52910
This paper is focused on applying a Monte Carlo simulation to probabilistic launch vehicle design and
requirements verification. The approaches developed in this paper can be applied to other complex design efforts as
well. Typically, the verification must show that requirement “x” is met for at least “y%” of cases, with, say, 10%
consumer risk or 90% confidence. Two aspects of making these runs will be explored in this paper. First, there are
several types of uncertainties that should be handled in different ways, depending on when they become known (or
not). The paper describes how to handle different types of uncertainties and how to develop vehicle models that can be
used to examine their characteristics. This includes items that are not known exactly during the design phase, but will
be known for each assembled vehicle; other items that become known before or on flight day; and items that remain
unknown on flight day. Second, this paper explains a method (order statistics) for determining whether certain
probabilistic requirements are met and enables the user to determine how many Monte Carlo samples are required.
The methods also apply to determining the design values of parameters of interest in driving the vehicle design.

Presented as Paper 2010-8433 at the AIAA Guidance, Navigation, and Control Conference, Toronto, 2–5 August 2010; received 15 November 2010; revision received 14 April 2011; accepted for publication 23 April 2011. This material is declared a work of the U.S. Government and is not subject to copyright protection in the United States. Copies of this paper may be made for personal or internal use, on condition that the copier pay the $10.00 per-copy fee to the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923; include the code 0022-4650/12 and $10.00 in correspondence with the CCC.
∗Aerospace Engineer, Flight Mechanics and Analysis Division, MSFC/EV40; john.m.hanson@nasa.gov. Senior Member AIAA.
†Senior Consultant, Tennessee Valley Office, 200 Sparkman Drive; bbeard@arescorporation.com.

Nomenclature

F_BIN = cumulative binomial probability
G(x)  = cumulative distribution function
g(x)  = probability density function of a continuous variable x for which the cumulative distribution function is G(x)
k     = number of failures
m     = order statistic of interest, mth order statistic
N     = number of Monte Carlo samples
n     = sigma level of the Δz vehicle target value
p     = actual failure probability
p̂     = maximum likelihood estimate of p from experimental data
P_BIN = binomial probability density function
p_A   = allowable or acceptable failure probability (when only a consumer value is specified)
p_C   = required failure probability for the consumer
p_P   = allowed failure probability for the producer
x_i   = input parameters that are allowed to vary for a vehicle model
Δz    = desired target value of a vehicle parameter (such as payload or maximum dynamic pressure)
σ     = standard deviation
σ_z   = standard deviation of z

Introduction

Monte Carlo simulation is used extensively to analyze launch vehicle performance and to verify that the launch vehicle behavior satisfies requirements. Typically, these requirements state that the simulation results must achieve a very high probability of success. The purpose of this paper is to describe two aspects of using Monte Carlo analysis to support vehicle design and requirements verification. The first is how to analyze four different types of uncertainties that must be handled for launch vehicle development. Some of this analysis was briefly introduced in [1], but no full description of these issues has been published previously. The second purpose of this paper is to describe the order statistics approach to verifying requirements and to determining design values of important parameters. The approach is not new [2,3], but it is not well known in the aerospace community. The development here shows how to derive the appropriate numbers for how many Monte Carlo samples are needed and how many failures can be allowed in order to meet a specified requirement. It is essentially the same process that is used in statistical quality control. Then, order statistics is used to show how to derive values for parameters that are needed for vehicle design. Other complex aerospace systems have similar needs, so the work in this paper may be applied to these other systems as well.

Overview

Monte Carlo simulation, where (in this application) uncertain inputs are randomly varied and many samples are taken in order to obtain statistical results, is widely used in simulating complex space systems. In this paper, random choices within a Monte Carlo simulation will be called samples, and a Monte Carlo simulation of many samples will be called a run. There are many things to get right in setting up and making Monte Carlo runs that are not the focus of this paper, including making sure all the uncertainty values used are appropriate (with the right level of conservatism), taking care in modeling the random inputs correctly, checking all the models to make sure they correctly model reality, verifying the simulation, and others. The paper briefly touches on model verification.

This paper is also not focused on alternative approaches to Monte Carlo and alternative ways of generating the sample cases (different from randomization using the presumed distribution for each uncertain parameter). There are alternative approaches that have been designed to focus in the tails of output probability distributions. The tails are of primary interest when high levels of success are needed. One of these alternative approaches is called importance
sampling, and it involves choosing samples in the tail of the most important distributions (the ones that affect the output the most) [4]. This method is particularly useful when one or two input parameters are clearly most important and there is a single output parameter of interest (e.g., stage separation clearance). This paper is focused on a problem where there are 100s of input parameters and 100s of output parameters of interest, and where the mapping of the most important inputs to outputs is not always clear and is different depending on the output in question. So, importance sampling does not apply here.

There are a number of alternative sampling methods, including various methods falling under the category of Latin hypercube sampling [5]. In these methods, deterministic sampling (or in some cases, random sampling within deterministic regions of the probability space) spreads the input points over the probabilistic space rather than randomly choosing the points as in simple Monte Carlo sampling. Points in the tails are sampled more often than with simple random sampling. The alternative sampling approaches, in many cases, lead to much faster convergence for the mean and standard deviation of the output as compared with random sampling. However, it is not clear how to appropriately sample more complicated dispersion models (that include correlations and nonlinear effects) with these procedures. (Ares I rocket models in this category included the winds model, first stage propulsion model, and engine transient model.) It is very difficult to know, for example, what statistical level the various wind profiles represent and how they will affect the various outcomes of interest such as flight control errors and engine gimbal angles. If the more complicated models were sampled randomly and other models were sampled using Latin hypercube sampling, it would be difficult to understand the statistics of the output. In contrast, the statistical results are well understood for random sampling, as explained in this paper. Examining equivalent methods for alternate sampling procedures could be a topic for future study.

This paper is focused on applying Monte Carlo simulation to launch vehicle design and requirements verification. Requirements statements typically take the form of "The vehicle shall do x." An example might be that the vehicle delivers a required mass to a specified orbit, that certain stability margins are met, or that navigation errors do not exceed certain limits. Then, the verification requirement statement says, for example, "x shall be verified through analysis. Analysis shall use Monte Carlo simulation and show that x is met for at least y% of cases, with 10% consumer risk (CR)." CR is the risk that the verification (based on a finite number of samples) says the requirement is met when it is actually not met. (Statisticians sometimes call this a type II error.) Ten percent CR is the same as 90% confidence that the requirement is in fact met if the analysis indicates it is met. Sometimes, a higher fidelity test might be used to validate that the Monte Carlo simulation truly reflects reality. Besides verifying that a requirement is met, during vehicle design, the desire is also to identify probabilistic design cases, e.g., the 99.865% worst structural conditions, heat rates, actuator angles, propellant usage, etc., typically also with 10% CR. Some structural analyses have used 50% CR (50% confidence).

The reason that there is a CR statement is that if we ran the Monte Carlo run a second time with different random seeds, we would get a different answer. A 10% CR means that if we ran, say, 10,000 Monte Carlo runs of 2000 samples each to measure success in meeting the requirement, up to 10% of those runs could indicate the requirement is met when it actually is not. Without a CR (or confidence) requirement, it would be possible to run a very small number of Monte Carlo samples and declare success. The converse of CR is producer risk (PR), the risk that we think a product does not meet the requirement (so we reject it) when in fact it does.

To keep the CR low, the system is typically overdesigned. One way to reduce the impact of the overconservatism is to generate a very large number of Monte Carlo samples (assuming the system and all uncertainties are modeled correctly; otherwise, we are just playing with incorrect statistics). But this option is limited by computer power, particularly when a large number of Monte Carlo runs (each with many samples) are already necessary in order to capture a variety of scenarios (different missions, different launch months, and different assumptions). It is important to run some cases with a larger number of Monte Carlo samples (or run some cases with different random numbers) in order to see how much the design is driven by this numerical conservatism.

Two particular aspects of making Monte Carlo runs for requirements verification will be explored in this paper. The first discussion concerns four types of uncertainties that should be handled in different ways. Of these four types, the first captures items that are not known during the vehicle design phases but that will be known for a specific vehicle (before flight) and can be used in the trajectory design and in understanding margin or adjusting payload. Examples include design uncertainties for engines, uncertainty in axial force that will be reduced by the time the vehicle flies (through wind-tunnel tests and maybe test flights), and uncertainty related to the overall ensemble of engines that will be known for each specific engine (because the engine is put on a test stand and fired before flight). Components will be weighed. Of course, there will be no improved information about the assembled vehicle if these preflight tests do not occur. Some of the separation of models into what is known before launch and what is still unknown on launch day is introduced in [1]. The current paper discusses the methods in more detail. Another type of uncertainty is the mixture ratio variation for each engine, which is also known before flight (part of the total variation is known before flight, based on the engine test, and part is still unknown on flight day) but does not affect the trajectory design or nominal payload capability. It does affect how much of each propellant will remain, and this knowledge should be used in the flight day go/no go decision. A third type of uncertainty is variations in temperature (affecting solid motor performance) and winds that are known for the flight day go/no go decision and for day-of-launch trajectory design but would not be used to adjust payload, since the payload is decided well before flight day. Finally, there are items that are unknown on flight day. It is also possible to break these items up into two components: those for which the true value randomly varies from flight to flight (e.g., the variation in winds between a measured wind profile that is used to design the flight day trajectory and the actual wind profile that the vehicle experiences) and those for which there is a single value that does not vary much from flight to flight but is unknown (e.g., the pitch moment coefficient). This paper will discuss ways to model these different uncertainties. Included will be a discussion of vehicle models that are based on the preflight-known variations and ways to develop these models.

The second major topic in this paper is the understanding of the statistical results of Monte Carlo simulations. The Monte Carlo method necessarily produces outputs that are statistical in nature. That is, a Monte Carlo run consisting of N sample trajectories enables an approximate estimate of the properties of the distributions of various output parameters. Most engineers are familiar with the usual measures of central tendency, such as mean and median, as well as measures of dispersion, such as the standard deviation. The mean of the sample provides an unbiased estimate of the mean of the modeled population, with an uncertainty that is given by the sample standard deviation. What is less well known among engineers is how to deal with the statistics of the tails of the distributions of output parameters, which are often more important than the means and standard deviations when verifying vehicle requirements.

Consider a parameter that has a Gaussian distribution. The Gaussian distribution is unbounded. However, any finite set of N samples will have a largest sample. In fact, the maximum of the N samples will have a distribution that is related to the underlying Gaussian but is not itself Gaussian. The next largest sample will have a slightly different distribution, and so on. The branch of statistics that deals with these distributions is called order statistics [2], and it is closely related to the field known as sampling theory.
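To make this concrete, the short Python sketch below (an illustration added here, not part of the original analysis; the run count, sample size, and seed are arbitrary choices) draws repeated runs of N = 2000 Gaussian samples and compares the run-to-run scatter of the largest order statistics with that of the sample mean:

import numpy as np

# Illustration only: scatter of extreme order statistics vs. the mean
# across repeated Monte Carlo runs of N samples each.
rng = np.random.default_rng(1)      # arbitrary seed
N, runs = 2000, 2000                # samples per run, number of runs

samples = rng.standard_normal((runs, N))
ordered = np.sort(samples, axis=1)  # each row sorted: the order statistics

means = samples.mean(axis=1)        # sample mean of each run
maxima = ordered[:, -1]             # m = N (largest value in each run)
third_largest = ordered[:, -3]      # m = N - 2

for name, vals in [("mean", means), ("maximum", maxima),
                   ("3rd largest", third_largest)]:
    print(f"{name:12s} across runs: {vals.mean():6.3f} +/- {vals.std():5.3f}")

The run-to-run spread of the extreme order statistics is far larger than that of the mean, which is why tail requirements need the treatment developed in the remainder of the paper.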
One of the first decisions faced by an analyst setting up Monte Carlo runs is the number of samples that are required in order to achieve a desired level of accuracy in the output parameters. Some measures, such as the sample mean, are known to improve in accuracy as the inverse of the square root of N (at least for parameters that obey the central limit theorem). However, since the standard deviation (the numerator in the standard error) is itself a complicated function of the input parameters, it is generally not possible to predict how many samples are needed to reach a desired accuracy in advance of the simulation. It would seem at first glance that the situation in the tails of the distribution is even less tractable. However, it turns out that it is possible to relate the order statistics of the sample to the underlying distribution in a way that allows one to pick a sample size a priori. The sample size is chosen to provide an estimate of the value of an output parameter that bounds a required fraction of the underlying distribution, such as 99.865%, while meeting a required confidence level, such as 90%, i.e., 10% CR.

This paper explains this process and shows how to generate the information needed for verifying requirements success. Much of this development is the same as acceptance sampling in statistical quality control, and the development in this paper is not new in this area [3]. This approach is less well known in aerospace simulation arenas. This same procedure can be used to determine parameters needed for creating the vehicle design, e.g., the 99.865% (with 10% CR) highest acceleration.

Sometimes it is desirable to derive a distribution rather than to use order statistics, e.g., when the required failure percentage is extremely low (e.g., 1e-6) and the Monte Carlo results are not near that threshold (e.g., the Monte Carlo run consists of 2000 samples, so none of the samples fall close to the 1e-6 level). Without a huge number of Monte Carlo samples (likely requiring more than the time available to make the runs), order statistics could not be used. In this case, it makes sense to fit a distribution to the data (or to the tail of the data) and to estimate where the 1e-6 level is reached. There will, of course, be significant uncertainty in this estimate due to the size of the sample.

The good thing about order statistics is that they are nonparametric, i.e., they do not depend on an assumption or model of the underlying probability distribution. The downside is that most of the information bought with the computer time is not used (because the interest is primarily in the tail), and consequently there is an undesirable tradeoff between conservatism and CR (a lower CR means more conservatism is necessary). The nice thing about fitted distributions (parametric statistics) is that they use basically all the information from the generated data; of course, the bad thing is that the uncertainty associated with the choice of distribution is difficult to control. One can test for normality (Gaussian behavior), although there will necessarily be a certain number of false negatives in those tests, where a distribution is accepted as being normal when it is really not. If the data are not Gaussian, it may be more appropriate to fit a distribution to the tail (say, the outlying 10%) of the sampled data rather than fitting the whole distribution [6]. This paper will not focus on fitting distributions to data, but rather it will explain the order statistics approach along with the vehicle modeling approach as a function of the type of uncertainty that was discussed earlier.

Setting up Vehicle Models and Monte Carlo Runs

Vehicle Models

For a launch vehicle design, propulsion parameters are not well known at the start of the design effort. They will be better known by the time the vehicle is launched, but there will still be uncertainty on launch day. The design uncertainties will be gone by the time of launch, and what remains is flight day uncertainty. Also, variations in the propulsion parameters for the ensemble of engines may be better known for a particular engine if it is tested before flight. The trajectory will be designed for the known variations. The flight performance reserve (the extra propellant needed on board to cover for uncertainties) should be designed to cover for flight day uncertainties, not for all the uncertainty that exists early in design. Likewise, the accuracy of the orbit insertion, the structural and aerothermal loads, and other parameters will be variations from the trajectory that include the uncertainties remaining at the time of flight and not the full set of early design uncertainties.

The parameters that are better known for a specific vehicle on flight day have a true value for the vehicle that will fly, but the value is unknown during design or before choosing a specific vehicle. (This is an epistemic uncertainty in statistics, as opposed to an aleatory uncertainty that just randomly varies.) A way to handle this is to devise multiple system models. For example, choose sluggish (heavy and slow) and sporty (light and fast) launch vehicle models [1] covering the range of the design uncertainties (and other uncertainties that are better known for a particular launch vehicle before it flies). Next, design trajectories for how these would be flown. Finally, run the Monte Carlo simulation for each using the estimated uncertainties remaining on flight day. If the Monte Carlo simulation had all uncertainties included as if they were all unknown, then a 99.865% value (with 10% CR) from this simulation would not cover the 99.865% value for the sluggish or sporty vehicle models; their success might be much lower.

As an example, for the Ares I launch vehicle design, when a particular vehicle is assembled for launch, a small motor test estimates the first stage solid rocket motor burn rate. The J-2X engine that powers the upper stage is run on a test stand before flight. Various vehicle elements are weighed. The axial force coefficient is better known by the time of first flight than it is in the early design phases. All these things influence how sporty or sluggish the vehicle will be. Assuming that it is necessary to meet requirements throughout the year, a sluggish vehicle model must meet the payload requirements in the coldest month. Likewise, the vehicle structure and thermal protection must be sufficient for a sporty vehicle model flying in the summer (when the solid motor burn rate is highest).

There are additional parameters that will likely be better known before flight than they are during design and that primarily affect flight control, such as the pitch moment coefficient. However, since the flight control must be able to fly any combination of the various aerodynamic parameters with any combination of the vehicle models, a more robust design (that does not cause an explosion of the number of Monte Carlo runs that must be made) simply lumps the entire uncertainty for these types of parameters into the Monte Carlo runs for the varying vehicle and launch month cases. These parameters do not affect whether a vehicle model is sporty or sluggish, but they affect the flight control. Putting the full variation into the simulation means that the sporty and sluggish models will have to be successful with the full range of flight control parameter variations. Separate sensitivity studies should also be performed. More will be said about epistemic parameters and flight control later.

For Ares I design, vehicle models were used for heavy/slow (worst payload performance and lowest thrust/weight ratio at liftoff), light/fast (highest dynamic pressure, highest heat rate, highest acceleration, and highest liftoff thrust/weight ratio that leads to the fastest umbilical release needs), hybrid (sluggish first stage and sporty upper stage, with high upper stage engine thrust, leading to highest acceleration in upper stage flight), a low liquid oxygen remaining model, and a low liquid hydrogen remaining model. For all models, except for the low propellant remaining cases, a nominal mixture ratio for the engine is assumed. This is because an off-nominal mixture ratio does not affect the trajectory design, the nominal payload performance, or the induced environments (maximum dynamic pressure, acceleration, etc.) to the first order. The mixture ratio variation only affects the propellant remaining. So the off-nominal mixture ratio value, seen on the engine test stand, can be incorporated into this same vehicle modeling process for the cases where the objective is low propellant remaining. Note that it is assumed here that the vehicle is assembled with components that are chosen randomly; that is, particular engines and other components are not chosen to assemble a vehicle with specific characteristics. Cherry-picking vehicle components from a warehouse in order to obtain desired vehicle characteristics would change the statistics.

There are three steps to the vehicle model design process. First, calculate the partial derivatives of the desired varying outputs with respect to each of the parameters that can vary but that will be known for a given assembled vehicle, e.g., the partial of dynamic pressure with respect to burn rate. Next, determine the target values of each output, e.g., the target dynamic pressure, heat rate, acceleration, and liftoff thrust/weight ratio. Finally, derive the vehicle model that uses the total set of parameters and maximizes the statistical chance of ending up with a vehicle with those parameters that match the desired output targets. Each of these steps is described in more detail below; a brief numerical sketch of the first step follows.
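As a preview of step 1 (the subsections below give the full description), each partial can be approximated by a one-sided finite difference about a re-optimized trajectory. The sketch below is illustrative only: optimize_trajectory is a hypothetical stand-in for whatever trajectory design tool is actually used, and the parameter and output names are notional.

from typing import Callable, Dict, List

def partials_by_finite_difference(
    optimize_trajectory: Callable[[Dict[str, float]], Dict[str, float]],
    nominal: Dict[str, float],
    outputs: List[str],
    rel_step: float = 0.01,
) -> Dict[str, Dict[str, float]]:
    """One-sided finite-difference partials of trajectory outputs
    (payload, max dynamic pressure, ...) with respect to the inputs that
    will be known for an assembled vehicle (burn rate, dry mass, axial
    force coefficient, ...).  optimize_trajectory is assumed to return a
    dict of optimized output values for a given input set."""
    base = optimize_trajectory(nominal)
    partials = {out: {} for out in outputs}
    for name, value in nominal.items():
        step = rel_step * abs(value) if value != 0.0 else rel_step
        perturbed = dict(nominal, **{name: value + step})
        result = optimize_trajectory(perturbed)   # re-optimize each time
        for out in outputs:
            partials[out][name] = (result[out] - base[out]) / step
    return partials

For cases such as dynamic pressure versus upper stage mass, the same routine would be run with payload held fixed, as noted in the Partial Derivatives subsection.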
Partial Derivatives

The partial derivatives are derived by generating an optimized trajectory (using whatever trajectory design procedure is being used for the vehicle trajectories) with a slightly modified input of the parameter in question. For payload, the partials are the partial of payload with respect to each input parameter change (burn rate, axial force coefficient, first stage dry mass, etc.). For remaining oxygen, the partials are the partial of oxygen remaining with respect to each input parameter change. All other parameters are held constant. The trajectory optimization in each case is typically performed to maximize payload to orbit. However, if the desired partial is, say, dynamic pressure with respect to upper stage mass, the mass trades one for one with payload with no change in dynamic pressure. The partial in this case must be done with fixed payload, instead maximizing remaining propellant.

Generating the Target Values for the Parameters

Once the partial derivatives are known, and assuming the 1-σ variations of the different input parameters are also known (regardless of distribution type), assume that the effect of the different parameters is independent and generate the root-sum-square combination

\sigma_z = \sqrt{ \sum_{i=1}^{N} \left( \frac{\partial z}{\partial x_i}\,\sigma_i \right)^2 }     (1)

where z is the desired output variable (payload, maximum acceleration, etc.), the x_i are the input parameters to be chosen, and the σ_i are the 1-σ variations of the input parameters. To get the target value of z, choose the sigma level corresponding to the desired probability level (assuming the combination of the various parameters behaves as a Gaussian). As an example, for a 99.73% high value, the sigma level would be 2.782. So, the target value of z is then

\Delta z = n\,\sigma_z     (2)

where n is the necessary sigma level. The process may be extended to correlated parameters, but that will not be pursued here.

Determining the Vehicle Model

The problem to be solved here is to maximize the probabilistic likelihood of the vehicle model given the constraints Δz (target parameters, two in the heavy/slow model and four in the light/fast model in the example above). This assumes the number of targeted parameter values is fewer than the number of vehicle parameters that can be varied, so that the vehicle model can generate the target Δz values for a range of possible input parameters. The optimization with constraints can be done numerically (see, for example, [7]) by choosing candidate values for the x_i, determining the probability density for each x_i (no matter what its distribution type), multiplying these to get the joint probability density, and then numerically varying the x_i to maximize the joint probability density. Instead of multiplying the probability density functions (PDFs), it is more numerically stable to take the natural logs of the PDF values and add the logs. Given the 1-σ impacts on z, as shown in parentheses in Eq. (1), and assuming independence of the impacts from the x_i, the value of z for each set of candidate x_i and for each target parameter may be determined. Typically this will not satisfy the constraints (the desired values of the Δz targets). Satisfying the constraints is part of the numerical optimization. One way to do this is to introduce a penalty function into the objective function such that the optimization procedure will force the constraints to be approximately satisfied. The user can control how closely the constraints are satisfied by varying the penalty for each constraint.

Note that, in the case of a uniform distribution, all values are equally likely. Thus, if the desire is a worst payload case, choosing the maximum aerodynamic axial force coefficient gives the biggest payload impact from this uniformly distributed parameter, so that other (Gaussian) parameters do not have to contribute as much to the impact. So, other things being equal, uniformly distributed parameters should be chosen at their extreme values in order to maximize the overall vehicle likelihood. This assumes, though, that the results are feasible with the uniformly distributed parameter at its endpoint. If there are multiple parameter targets, it is possible that the solution requires an intermediate value for the uniformly distributed variable (which would be determined as part of the optimization process).

Parameters Known and Unknown on Flight Day

Using this process yields a vehicle model that results from using a known engine, weighed components, and a specific value of the axial force coefficient at each flight condition. In design and for requirements verification, models like the heavy/slow and light/fast models cover the range of cases. The trajectory is designed with the model in mind, and the payload can be adjusted, if desired. On flight day, the winds are normally measured to help with the go/no go decision. The temperature is also known (particularly important for solid rocket motor performance). So, most of the wind variation and temperature variation become known quantities and are no longer flight day unknowns. These variations can be used for the trajectory design if day-of-launch trajectory design is used. Evaluation of this set of parameters allows for a more informed decision as to whether the vehicle meets all its requirements at the time of launch. The flight performance reserve, that part of the propellant set aside to cover for the flight day uncertainties, can be reduced since these variations are no longer unknown.

The remaining flight day uncertainties are randomly chosen in the Monte Carlo simulation when doing the vehicle design work for worst performance, structural loads, aerothermal heating, abort situations, subsystem design, and all other integrated design areas. While the vehicle is still in the design phases, the practice (for payload performance studies) is to postulate a sluggish vehicle, then to determine the expected performance for the vehicle with challenging winds and temperatures, and finally to show through the Monte Carlo simulation that the flight performance reserve is sufficient to achieve orbit. The sluggish vehicle model choice represents the manifesting likelihood (probability of being able to launch a vehicle that has been assembled). The challenging winds and temperatures represent the ability to launch in a certain percentage of natural environments. The result of the Monte Carlo simulation represents the probability of being able to take the sluggish vehicle on the challenging day and get it to orbit. The percentage of success needed for this last item is probably the highest, because it represents whether the vehicle, once launched, can reach orbit, whereas the earlier analyses represent cases where the choice of not launching is still available.

As mentioned earlier, aerodynamic parameters, such as pitching moment, should be better known when the vehicle flies than they are early in design. Also, there is an actual value of these parameters that likely does not change much from flight to flight. Other parameters that primarily affect flight control and have this same characteristic include the vibrational modes of the vehicle and the fuel slosh parameters. All of these are epistemic parameters as opposed to the truly randomly varying aleatory parameters. Since most of the parameters that have a significant effect on flight control are epistemic, these can be left in the Monte Carlo simulation, and various combinations of these parameters are investigated for their effect on the flight control. For flight control, it becomes a simulation of mostly epistemic parameters. This is important because the flight control impacts of various parameters are typically very nonlinear. (It may not be clear how to select the most stressing combinations.) It would also be advisable to test variations in these parameters individually in order to determine which ones have the largest impact on flight control. If parameters to which the vehicle is particularly sensitive were to end up at bad values, it is important that the vehicle design still works. A Monte Carlo simulation for the remaining parameters (with the sensitive parameter set to the bad value) may be advisable. All critical models (including those discussed in this paragraph) should be validated through appropriate tests, which could potentially modify the results obtained before the testing. It may also be advisable to add testing for parameters with particular sensitivity, so that improved estimates become available. This approach may reduce the epistemic uncertainty, leading to more confidence in the results.
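Returning to the vehicle model derivation, steps 2 and 3 (the root-sum-square target of Eqs. (1) and (2) and the penalized joint-likelihood maximization) can be sketched as follows. The numbers, parameter set, independent-Gaussian assumption, and use of SciPy's general-purpose minimizer are illustrative choices, not the actual Ares I implementation.

import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

# Step 2: root-sum-square target value, Eqs. (1) and (2).  Hypothetical
# partials of one output z (say, max dynamic pressure) with respect to
# parameters known for an assembled vehicle, and their 1-sigma values.
dz_dx = np.array([40.0, -15.0, 8.0])     # dz/dx_i (made-up numbers)
sigma = np.array([0.010, 0.200, 0.50])   # 1-sigma of each x_i
sigma_z = np.sqrt(np.sum((dz_dx * sigma) ** 2))        # Eq. (1)
n = norm.ppf(0.9973)                     # one-tailed 99.73% level, ~2.78
delta_z_target = n * sigma_z                           # Eq. (2)

# Step 3: pick the x_i that hit the target while maximizing the joint
# likelihood (sum of log-PDFs), with a penalty on missing the target.
def neg_log_likelihood_with_penalty(x, weight=1.0e3):
    log_pdf = norm.logpdf(x, loc=0.0, scale=sigma).sum()  # independent Gaussians
    achieved = np.dot(dz_dx, x)          # linearized effect on z
    penalty = weight * (achieved - delta_z_target) ** 2
    return -log_pdf + penalty

result = minimize(neg_log_likelihood_with_penalty,
                  x0=np.zeros(3), method="Nelder-Mead")
print("target  delta-z:", round(delta_z_target, 2))
print("model parameters:", np.round(result.x, 4))
print("achieved delta-z:", round(float(np.dot(dz_dx, result.x)), 2))

Increasing the penalty weight forces the target to be met more tightly, mirroring the discussion above; uniformly distributed parameters would be pinned at an extreme value (or left free when several targets compete), as the text notes.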
Summary of Procedure for Performing Monte Carlo Runs

The first step in setting up Monte Carlo runs is to generate vehicle models that represent the desired level of coverage for a vehicle that might be assembled. This is a manifesting model; that is, it determines the likelihood that it will be possible to use the randomly chosen combination of parts for an actual launch during a particular time of the year. Next, choose the Monte Carlo cases that will be needed. These are combinations of the vehicle models, launch months, mission destinations, and any other choices that we may end up launching, in order to find the worst cases of desired end results such as worst structural loads and worst fuel usage. If the time of launch, within a month, can be a driver in the launch go/no go decision (e.g., a low predicted propellant temperature with a headwind impacting payload capability), then run a Monte Carlo simulation where only these parameters are varied and the trajectories are designed as would be done on launch day, and choose the case for the desired percentage of coverage (e.g., a cold temperature and a headwind). Finally, run the Monte Carlo simulations with the flight day unknowns randomly varying in order to determine the worst-case and probabilistic values of interest. For Ares I, there are 100s of output values of interest.

Number of Monte Carlo Samples and How to Obtain Required Percentage Numbers

Consumer and Producer Risks

Imagine a Monte Carlo run that is used to estimate the failure rate of some component or system. To be concrete, suppose N = 2000 components are subjected to a stressful duty cycle and five of them fail. What can be said of the failure rate of the components? The immediate temptations are to say that the empirical failure rate is 5/2000 or 0.25% and to assume that this sample is typical. But in a second Monte Carlo simulation, a different number might randomly fail. What confidence interval can be put on this estimate? To be 90% confident that a quoted failure rate is an upper bound on the actual failure rate, what rate should be quoted?

The rigorous approach, called binomial failure analysis, assumes that there is some actual failure probability p and that the 2000 components form one Monte Carlo run out of an infinite number. For any given p, the probability of seeing exactly k failures in a run of size N is given by the binomial probability formula [8]

P_{\rm BIN}(k \mid p, N) = \binom{N}{k} p^k (1 - p)^{N-k}     (3)

The binomial distribution is based on the probability of failure p and is not tied to any underlying distribution such as the normal distribution.

Given the parameters, it is easy to plot this probability. Figure 1 shows the results for three failure rates near the intuitive result. Notice that any of these actual failure rates gives nearly the same result, around P_BIN = 18%, which means that the expectation is that about 18 out of every 100 runs of 2000 samples each will exhibit exactly five failures. Also notice that values of k in the interval from about two to eight have nonnegligible probabilities P_BIN. Conversely, this means that it would be mildly coincidental if the actual failure rate were really 0.25%, and that if the actual failure rate were really 0.25%, one should bet against getting exactly five failures out of 2000 samples.

Fig. 1 Probabilities P_BIN(k | p, N) for different k for three different actual failure rates (N = 2000): a) p = 0.24%, b) p = 0.25%, and c) p = 0.26%.

So, what does this mean in terms of a confidence interval on p? There is a range of actual failure rates consistent with the observed failure fraction k/N. The standard approach to binomial failure analysis is to compute, for a range of possible failure rates p, the probability of getting k or fewer failures, and define the confidence in terms of the resulting function of p. For example, for k = 5, then for each of the three subplots in Fig. 1, imagine adding up the heights of the columns for 0, 1, ..., 5 failures. The result is the cumulative binomial probability and decreases with increasing p. The cumulative probability of seeing k or fewer failures is given by [9]

F_{\rm BIN}(k \mid p, N) = \sum_{j=0}^{k} \binom{N}{j} p^j (1 - p)^{N-j}     (4)
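Equations (3) and (4) are one-line calls to SciPy's binomial distribution. The sketch below (added for illustration) reproduces the numbers quoted for this five-failures-in-2000 example: the roughly 18% probabilities, the ~62% acceptance chance at p = 0.25%, and the ~0.463% failure rate at which the acceptance chance drops to 10%.

from scipy.stats import binom

N, k = 2000, 5

# Eq. (3): probability of exactly k failures for several assumed rates p.
for p in (0.0024, 0.0025, 0.0026):
    print(f"p = {p:.4f}:  P_BIN({k}) = {binom.pmf(k, N, p):.3f}")   # each ~0.18

# Eq. (4): cumulative probability of k or fewer failures.
print("F_BIN(5 | p = 0.0025, N = 2000) =", round(binom.cdf(k, N, 0.0025), 3))  # ~0.62

# Failure rate at which seeing five or fewer failures becomes a <10% event
# (simple bisection on p; F_BIN decreases as p increases).
lo, hi = 0.0025, 0.02
for _ in range(60):
    mid = 0.5 * (lo + hi)
    lo, hi = (mid, hi) if binom.cdf(k, N, mid) > 0.10 else (lo, mid)
print("p where F_BIN drops to 10%:", round(hi, 5))   # ~0.00463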
If this function of p is plotted for a given k and N, the result is the curve shown in Fig. 2. The curve is called the operating characteristic (OC) for the sampling plan (k, N). The terminology comes from the discipline of quality control. So, for a very low actual probability of failure, there is nearly a 100% probability of seeing k or fewer failures. As the actual probability of failure increases, the chance that the number of failures will be k or fewer decreases until it is nearly zero, and the chance of seeing more than k failures is near 100%.

Fig. 2 Cumulative probability F(k | p, N) as a function of actual failure rate if maximum accepted k = 5 when N = 2000. This is the probability of accepting the result for each actual failure probability.

Note that if the actual failure rate were 0.25%, the plot shows that about 62% of 2000-sample runs exhibit five or fewer failures. If there were a strict requirement for the component failure rate not to exceed 0.25%, then a single run of 2000 samples with five failures would not give much confidence that the components meet the requirement. There is still a 58% chance of acceptance if p = 0.26%, about a 50% chance if p = 0.28%, and so on. In fact, the probability of getting five or fewer failures in 2000 samples drops below 10% only when p increases to 0.463%. (Note that this is equivalent to saying that p = 0.463% is the solution to the equation F_BIN(k | p, N) = 10% for the given k and N.) This suggests that a (5, 2000) sampling plan is likely to be unsatisfactory for verifying a requirement like p < 0.25%. This risk of accepting an invalid design is called CR: believing the requirement is met when in fact it is not.

The traditional approach to choosing a sampling plan exploits the fact that two independent parameters (k and N) can be specified. This approach also reflects the fact that the producer and the consumer of the components have different concerns. The consumer has a requirement that the components have a specified not-to-exceed failure rate. The producer, however, would like not to have good products rejected because there is an insufficient number of samples (leading to a higher chance that the verification will indicate the consumer-specified failure rate is not met). So, the producer would typically specify an allowed failure rate for design that makes it likely the product will pass the test. To satisfy both parties, the producer-allowable failure rate would have to be less than the consumer-specified requirement.

In this context, the CR for a given sampling plan (k, N) is the upper bound on the probability that components will be accepted that have greater than the consumer-required failure rate. The PR for a given sampling plan (k, N) is the upper bound on the probability that components will be rejected that have less than the producer-allowable failure rate. CR is the risk of accepting a bad product, and PR is the risk of rejecting a good product.

Figure 3 shows the traditional setup of an OC that provides 10% CR and 10% PR for consumer and producer required/allowable failure rates of p_C = 0.250% and p_P = 0.125%, respectively. The parameters k and N, in this case (13, 7579), are selected to be the minimum values that meet the goals of the sampling plan (10% risk for both the consumer and producer with the previously specified failure rates). Note that if the actual failure rate is less than p_P, there is a finite probability that the component will be rejected, and if the actual failure rate is greater than p_C, there is a finite probability that the component will be accepted. Figure 3 illustrates that the producer must design the component (or system) to the lower failure rate in order to be confident that it will pass the test for the consumer.

Fig. 3 Example of an OC for a sampling plan devised to provide 10% CR and 10% PR if maximum accepted k = 13 when N = 7579.

However, without a separate PR requirement, the PR is the complement of the CR, PR = 1 − CR. Typical launch vehicle requirement statements specify a consumer value only. Failure to specify or to account for PR typically means the design will be overconservative, since a smaller number of samples with no failures might satisfy the consumer, but that means the system in reality is much better than required. Nonetheless, sampling plans accounting only for CR can be designed. If the conservatism penalty is not large, accounting only for CR is reasonable.

So, what is the best way to choose a sampling plan for a given consumer-required failure rate? The essential process involves the following steps. First, specify an acceptable failure rate (a not-to-exceed requirement) p_A. Next, choose a sampling plan (k, N), in which a design is accepted if, in a run of N samples, no more than k failures are encountered. Finally, compute the upper bound of the probability of accepting the design if the actual failure rate exceeds the requirement, i.e., if p > p_A. The probability in the last step is the CR. If the CR is unacceptable for a given (k, N), it is necessary to revise the sampling plan (go to a smaller k or a larger N). Typically, requirements are written with CR limited to 10% or less; that is, the verification of various aspects of system reliability and performance is to allow no more than a 10% chance that an unsatisfactory design will be accepted because of statistical errors.

For any given acceptable failure rate p_A and number of samples N, specifying an upper bound on CR of 10% is equivalent to putting an upper bound on the allowable number of failures k. Conversely, for any given acceptable failure rate p_A and allowable number of failures k, specifying CR is equivalent to putting a lower bound on the number of samples N. Using Eq. (4), Table 1 shows some typical results for p_A = 0.27% and p_A = 0.135%. Notice how, in order to capture the required CR, the allowable failure fraction k/N is much smaller than the target failure fraction (0.0027 and 0.00135). The allowable number of failures is noticeably less than what one gets using an intuitive rule like k = N·p_A. Equivalent results may be obtained using the development in [3].

Table 1  Combinations of number of allowable failures and required number of samples

    N      k    p_A, %   CR, %    k/N
   852     0    0.27      10      0
  1440     1    0.27      10      0.000694
  1970     2    0.27      10      0.001015
  2473     3    0.27      10      0.001213
  1705     0    0.135     10      0
  2880     1    0.135     10      0.000347
  3941     2    0.135     10      0.000507
 11410    10    0.135     10      0.000876
   514     0    0.135     50      0
  1243     1    0.135     50      0.000805
  1981     2    0.135     50      0.001010

Consider the OC for a sampling plan based on this table. Figure 4 is a typical result. (For this figure, N = 1970 has been rounded up to N = 2000.) This figure clarifies the meanings of type I and type II errors (PR and CR). Designs actually meet or fail to meet the requirement if they reside to the left or to the right of the vertical line at p = p_A = 0.27%. However, they are accepted or rejected according to whether the results of the Monte Carlo simulation lie below or above the OC. The shaded area to the right and below the OC represents CR, i.e., the acceptance of a design that actually fails to meet requirements. (Note the upper bound on CR is 10%, as promised.) The shaded area to the left and above the OC represents the PR for this situation where there is no separate specification for the producer-allowable failure rate. That is, this shaded area represents the region where the design exceeds the requirement and fails the Monte Carlo test.

Fig. 4 OC for the (2, 2000) sampling plan if maximum accepted k = 2 when N = 2000.
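For reference, the (k, N) pairs in Table 1 can be regenerated with a short script. This is a sketch using SciPy's binomial CDF; the brute-force scan over N is an assumed method chosen for clarity, not necessarily how the original table was produced.

from scipy.stats import binom

def smallest_n(k: int, p_allow: float, cr: float) -> int:
    """Smallest run size N such that accepting up to k failures keeps the
    consumer risk F_BIN(k | p_allow, N) at or below cr."""
    n = k + 1
    while binom.cdf(k, n, p_allow) > cr:
        n += 1
    return n

for p_allow, cr, ks in [(0.0027, 0.10, (0, 1, 2, 3)),
                        (0.00135, 0.10, (0, 1, 2, 10)),
                        (0.00135, 0.50, (0, 1, 2))]:
    for k in ks:
        n = smallest_n(k, p_allow, cr)
        print(f"N = {n:5d}  k = {k:2d}  pA = {100*p_allow:5.3f}%  "
              f"CR = {100*cr:2.0f}%  k/N = {k/n:.6f}")

The printed k/N values match the final column of Table 1 and are visibly smaller than p_A, as noted above.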
Note that PR is not specified in Table 1, so any of these sampling plans has 90% PR at the specified failure rate p_A. However, the advantage of increasing N is that the PR at failure rates less than the specified failure rate becomes significantly less. Suppose, hypothetically, that the producer set a maximum allowable failure rate of p_P = 0.20% (less than the consumer-specified rate p_C = 0.27%). For the (2, 2000) sampling plan, the PR at 0.20% is around 75%. But the OC shifts as N is increased. Figure 5 shows the (44, 20,000) and (509, 200,000) sampling plans, for which k was selected so that CR is 10%. For the (44, 20,000) sampling plan, the PR at 0.20% is down to about 25%. For the (509, 200,000) sampling plan, the PR at 0.20% is essentially zero. As the run size N is increased, the test aligns more and more closely with one specific failure rate. This is the value of taking more samples. The design does not have to be as conservative to ensure that it passes the test. These sampling plans may be applied to Monte Carlo trajectory simulations to determine whether probabilistic success requirements are satisfied. So, one generates N sample trajectories and uses these procedures to determine whether the proper success fraction (1 − failure fraction) was achieved with the required CR.

Fig. 5 OCs for sampling plans a) (44, 20,000) and b) (509, 200,000).

From Binomial Failures to Order Statistics

Besides determining whether requirements are met, trajectory dispersions are frequently used to establish extreme values for launch vehicle operating parameters, e.g., the 99.73% value of maximum dynamic pressure, maximum pitch attitude error, maximum axial acceleration, low propellant remaining, etc. Simply using the percentile function from Excel® (or equivalent) to compute the 99.865 percentile value from a run of 2000 trajectories will usually underestimate the actual 99.865% value, however, and will involve considerable CR.

There is a close relationship between binomial failure analysis and order statistics. This enables a nonparametric estimate of population extremes, i.e., an estimate that does not depend on the specifics of the underlying distribution. This is important because the tails of trajectory dispersion distributions are often not Gaussian. It also enables accounting for CR.

To see the connection, suppose the PDF of some continuous variable x of interest (e.g., dynamic pressure, remaining fuel, acceleration, etc.) is g(x), with cumulative distribution function (CDF)

G(x) = \int_{-\infty}^{x} g(x')\,dx'     (5)

Consider a run of N independent samples taken from g(x). Order the run from least to greatest x value, numbering them 1, 2, ..., N. Then, the mth-order statistic is defined to be the mth smallest value in the list, and it is a random variable with a computable PDF and CDF. (Note that m = 1 corresponds to the minimum, and m = N corresponds to the maximum.) The mth-order statistic is denoted x_m. The CDF for x_m is given by

F_m(x \mid N) = \sum_{j=m}^{N} \binom{N}{j} G(x)^j [1 - G(x)]^{N-j}     (6)

This is a basic result from order statistics; see [2], for example. Assuming a high value in the g(x) tail is of interest, then for a particular high value of x (e.g., dynamic pressure), this equation gives the cumulative probability that the remaining tail values from m to N fall below x (show up in the Monte Carlo run as successes), given that the success probability of those values is G(x). Since the vehicle design values are covering for values below the particular high value of x, higher values than that may be viewed as failures. From the binomial theorem (the sum of binomial probabilities for all possible outcomes is equal to 1),

\sum_{j=0}^{N} \binom{N}{j} G(x)^j [1 - G(x)]^{N-j} = 1

so that

\sum_{j=m}^{N} \binom{N}{j} G(x)^j [1 - G(x)]^{N-j} = 1 - \sum_{j=0}^{m-1} \binom{N}{j} G(x)^j [1 - G(x)]^{N-j}     (7)

The last summation is simply the cumulative binomial probability for p = G(x), k = m − 1. That is, if G(x) is the probability of success, the last summation is the cumulative probability of success for zero to m − 1 successes. So F_m(x | N) is the cumulative probability of success for m to N successes and varies with x, the parameter of interest. Using Eqs. (6) and (7) leads to

F_m(x \mid N) = 1 - F_{\rm BIN}(m - 1 \mid G(x), N)     (8)

Because of the symmetries of the binomial formula, this is equivalent to

F_m(x \mid N) = F_{\rm BIN}(N - m \mid 1 - G(x), N)     (9)

which is more computationally friendly than the other form when N is large and N − m is small. Excel has a convenient worksheet function, BINOMDIST. This expression is equivalent to

= BINOMDIST(N − m, N, 1 − G, TRUE)

where N, m, and G refer to the appropriate cells. However, large N or large N − m can produce #NUM! errors. For reference, the PDF for the mth-order statistic is the derivative of the CDF and is given by

f_m(x \mid N) = N \binom{N-1}{m-1} G(x)^{m-1} [1 - G(x)]^{N-m} g(x)     (10)
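In code, Eqs. (8) and (9) reduce to a single call to the binomial CDF (the SciPy equivalent of the BINOMDIST formula above). The sketch below is an added illustration; the N = 2000, m = 1997 example reproduces the 64.7% consumer risk discussed with Fig. 6 in the next section, and the other two calls check the Table 1 plans at the 99.865% benchmark.

from scipy.stats import binom

def order_statistic_cdf(m: int, N: int, G: float) -> float:
    """Eq. (9): probability that the mth-order statistic of N samples
    falls below the value x whose underlying CDF level is G = G(x).
    Equivalent to 1 - F_BIN(m - 1 | G, N) from Eq. (8)."""
    return binom.cdf(N - m, N, 1.0 - G)

# Third-largest of 2000 samples at the 99.85% benchmark: CR ~ 64.7%.
print(round(order_statistic_cdf(1997, 2000, 0.9985), 3))

# The (0, 1705) and (1, 2880) plans at the 99.865% benchmark give ~10% CR,
# consistent with Table 1.
print(round(order_statistic_cdf(1705, 1705, 0.99865), 3))
print(round(order_statistic_cdf(2879, 2880, 0.99865), 3))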
The CDF F_m for the mth-order statistic is a function of the CDF G(x) of the underlying distribution, and it is not tied to x directly. This means that order statistics are nonparametric; that is, they do not depend on the parameters or type of underlying distribution. This is because the statistics are tied to the binomial calculations of success and failure, and not to the distribution that leads to the successes and failures. Figure 6 shows a CDF and illustrates the CR as a function of the probability level desired when examining point 1997 out of 2000 samples, irrespective of the underlying probability distribution. The resulting curve is valid for any underlying probability density. Notice that there is a 64.7% CR for the 1997/2000 = 0.9985 value, so a more conservative success percentage is necessary in order to claim 10% CR.

Fig. 6 The CDF for an order statistic does not depend directly on the underlying distribution, only on the cumulative G(x). This graph is for N = 2000, m = 1997, and CR = 64.7% for 99.85% success.

It is desired to choose order statistics that satisfy a given benchmark success rate, such as G(x) = 99.865%, while maintaining a 10% upper bound on CR. For example, what is the 99.865% highest acceleration with 10% risk (90% confidence)? It is necessary to choose N and m that satisfy the inequality

F_m = F_{\rm BIN}(N - m \mid 1 - 99.865\%, N) < 10\%     (11)

These combinations are already provided in the calculations leading to a table equivalent to Table 1, e.g., F_BIN(k | 0.135%, N) < 10%. The equivalent order statistic is thus m = N − k. For example, the entries in Table 1 for p_A = 0.135% show that the sampling plans (k, N) = (0, 1705), (1, 2880), (2, 3941), and so on, all satisfy the CR ≤ 10% constraint. Figure 7 shows the CDFs for the first two of these cases, illustrating how they provide 90% confidence that the 99.865% level will be estimated. One may read the graph as the probability that success will be seen for the particular order statistic m. If there is only a 10% chance of success, that means if success is seen for that order statistic, then there is 90% confidence there will actually be success.

To examine PR using the same approach, one may compute the success rate for which there is a 10% chance that the order statistic shows failure, i.e., the value of G(x) at which the CDF F_m is equal to 90%. In Fig. 7, these levels are illustrated along with the 10% CR levels. The 10% PR levels are 99.994 and 99.981% for the N = 1705 and N = 2880 sampling plans, respectively. Another way of saying this is that the PR for the (1705, 1705) plan is 10% if the acceptable success rate is 99.994%.

Fig. 7 CDF for the order statistics for a) (m, N) = (1705, 1705) and b) (2879, 2880), showing that they both provide CR ≤ 10% for 99.865% success. The 90% level indicated is where PR is 10%.

A less mathematical way to estimate the impact of conservatism on the design is to measure the difference between the naïve setting with that many samples (e.g., 99.865% with 2880 samples is point 2876.1) and the value with 10% CR (point 2879). If the output changes significantly for these two cases, or if the difference is costly to vehicle design, then this is an indicator that more Monte Carlo samples would be beneficial in order to reduce the necessary conservatism. This method is an indicator, but it is of course not completely reliable, since the difference between these outlier data points has randomness.

Finally, the sampling plans in tables like Table 1 are spaced closely enough that interpolation between entries is feasible. A Monte Carlo simulation might produce a run of N = 2000 Monte Carlo samples, and a 10% CR estimator is needed for some parameter such as maximum dynamic pressure. Figure 8 illustrates the philosophy behind the interpolation. The figure shows 10% CR contours versus N for constant k = N − m values. For every N and k, the 10% CR point is computed, using the relations developed above. Thus, the vertical axis is the 10% CR value from the CDF G(x) for the underlying distribution (the quantity on the horizontal axis of Figs. 6 and 7). The horizontal lines are the benchmark levels 99.73 and 99.865% success, the success levels corresponding to two- and one-tailed 3-σ levels for a standard Gaussian. Values from computed binomial failure tables are marked on the plot, e.g., (k, N) = (1, 1440).

Fig. 8 Ten percent CR contours for various sampling plans.

The interpolation problem can be posed as follows: suppose a run of N = 2000 samples is generated. It is desired to use high-end order statistics to estimate the 99.865% value of maximum dynamic pressure, with no more than 10% CR. This is ostensibly the point X in Fig. 8, which lies on the line AB at N = 2000 connecting the k = 0 and k = 1 contours. How should the m = 1999 and m = 2000 data points (corresponding to N = 2000 and k = 1, 0), denoted x_1999 and x_2000, be used to formulate this estimate? The answer is that BXC and AXD are almost triangles, and they are nearly similar, which means that

\frac{AX}{AB} = \frac{DX}{DC} = \frac{2880 - 2000}{2880 - 1705} = 0.749     (12)

Thus, one can use

\hat{x}_{99.865\%} = x_{1999} + 0.749\,[x_{2000} - x_{1999}]     (13)

as an estimator for the 99.865% high value of the parameter x (maximum dynamic pressure in this example).
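The plan selection of Eq. (11) and the interpolation of Eqs. (12) and (13) can be scripted as below. This is a sketch, not the authors' tool: the synthetic Gaussian array stands in for an actual dispersion run, the simple scan over run sizes is an assumed implementation, and the case where the run is too small for even the k = 0 plan is not handled.

import numpy as np
from scipy.stats import binom

def smallest_n(k: int, p_exceed: float, cr: float = 0.10) -> int:
    """Smallest run size for which accepting k exceedances keeps CR <= cr."""
    n = k + 1
    while binom.cdf(k, n, p_exceed) > cr:
        n += 1
    return n

def interpolated_high_estimate(x, success: float, cr: float = 0.10) -> float:
    """Eqs. (12) and (13): estimate the `success` (e.g., 0.99865) high value
    of the dispersed parameter x with consumer risk cr, interpolating
    between the two sampling plans that bracket the run size N = len(x)."""
    xs = np.sort(np.asarray(x))
    N = len(xs)
    p = 1.0 - success
    k = 0
    while smallest_n(k + 1, p, cr) <= N:   # largest k whose plan fits in N
        k += 1
    n_lo, n_hi = smallest_n(k, p, cr), smallest_n(k + 1, p, cr)
    frac = (n_hi - N) / (n_hi - n_lo)      # Eq. (12): 0.749 for N = 2000
    # Eq. (13): between the order statistics m = N - k - 1 and m = N - k
    return xs[N - k - 2] + frac * (xs[N - k - 1] - xs[N - k - 2])

# Synthetic stand-in for a 2000-sample dispersion of, say, max dynamic pressure.
rng = np.random.default_rng(0)
q_max = rng.normal(800.0, 25.0, size=2000)   # made-up units and spread
print(round(interpolated_high_estimate(q_max, 0.99865), 1))

With the 99.73% benchmark and the same 2000-sample run, the bracketing plans become (2, 1970) and (3, 2473), and the routine reduces to Eq. (14) derived in the text below.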
In this case, explicit computation of the ordinates at points A and B shows that there is less than 0.05% error in the coefficient 0.749 associated with the similar triangle approximation. Similar interpolations can be done for other success values; for example, one can interpolate between (k, N) = (2, 1970) and (3, 2473) (i.e., between x_1997 and x_1998) for the 99.73% value for N = 2000 to obtain

\hat{x}_{99.73\%} = x_{1997} + 0.94\,[x_{1998} - x_{1997}]     (14)

In the case where a minimum value is desired (such as the 99.73% low propellant remaining case), the negative of the desired quantity may be ordered and the same procedures used.

Conclusions

This paper covers two specific aspects of performing a Monte Carlo simulation for launch vehicle design and requirements verification. The first is how to handle different types of uncertainties: those that will become known when a particular vehicle is assembled, those that will become known on flight day, and those that will remain unknown on flight day. The method involves defining challenging vehicle models that must meet the requirements, then defining challenging conditions for when they may need to launch, and then running the Monte Carlo simulations for the remaining flight day uncertainties.

The second topic discussed in this paper involves determining how many Monte Carlo samples are necessary when trying to meet a required value of success along with a required confidence level. If results of each Monte Carlo sample are viewed as being successes or failures, then order statistics provides a method of determining how many samples are needed and how many may be allowed to fail. This procedure can also be used to determine appropriate percentile settings for various parameters of interest in performing the vehicle design. The techniques presented in this paper should be applicable to other complex system designs.

References

[1] Hanson, J. M., and Hall, C. E., "Learning About Ares I from Monte Carlo Simulation," AIAA Paper 2008-6622, Aug. 2008.
[2] David, H. A., and Nagaraja, H. N., Order Statistics, 3rd ed., Wiley-Interscience, New York, 2003, pp. 9–11.
[3] White, K. P., Johnson, K. L., and Creasey, R. R., "Attribute Acceptance Sampling as a Tool for Verifying Requirements Using Monte Carlo Simulation," Quality Engineering, Vol. 21, No. 2, April 2009, pp. 203–214. doi:10.1080/08982110902723511
[4] Hanson, J. M., and Beard, B. B., "Applying Monte Carlo Simulation to Launch Vehicle Design and Requirements Analysis," NASA TP-2010-216447, Sept. 2010, http://ntrs.nasa.gov/archive/nasa/casi.ntrs.nasa.gov/20100038453_2010042045.pdf [retrieved July 2011].
[5] Robinson, D., and Atcitty, C., "Comparison of Quasi- and Pseudo-Monte Carlo Sampling for Reliability and Uncertainty Analysis," AIAA Paper 1999-1589, April 1999.
[6] Coles, S., An Introduction to Statistical Modeling of Extreme Values, Springer–Verlag, London, 2001, pp. 74–91.
[7] Bertsekas, D. P., Constrained Optimization and Lagrange Multiplier Methods, Athena Scientific, Nashua, NH, 1996, pp. 76–93.
[8] Feller, W., An Introduction to Probability Theory and Its Applications, 3rd ed., Vol. 1, Wiley, New York, 1968, p. 148.
[9] Birolini, A., Reliability Engineering: Theory and Practice, 4th ed., Springer–Verlag, Berlin, 2004, pp. 240–242.

P. Gage
Associate Editor