
Ch. 4 Review of Basic Probability and Statistics
4.1 Introduction
[Figure: The role of probability and statistics in simulation — model a probabilistic system, choose the input probability distributions, generate random samples from the input distributions, validate the simulation model, design the simulation experiments, and perform statistical analyses of the simulation output data.]
4.2 Random variables and their properties
• An experiment is a process whose outcome is not known with certainty.
• The sample space (S) is the set of all possible outcomes of an experiment.
• Sample points are the outcomes themselves.
• A random variable (denoted X, Y, Z) is a function that assigns a real number to each point in the sample space S.
• The values that random variables take on are denoted by lowercase letters x, y, z.
Examples
• flipping a coin
S={H, T}
• tossing a die
S={1,2,…,6}
• flipping two coins
S={(H,H), (H,T), (T,H), (T,T)}
X: the number of heads that occurs
• rolling a pair of dice
S={(1,1), (1,2), …, (6,6)}
X: the sum of the two dice
Distribution (cumulative) function

$F(x) = P(X \le x)$ for $-\infty < x < \infty$

$P(X \le x)$: the probability associated with the event $\{X \le x\}$

Properties:

1. $0 \le F(x) \le 1$ for all x.

2. F(x) is nondecreasing [i.e., if $x_1 < x_2$, then $F(x_1) \le F(x_2)$].

3. $\lim_{x \to \infty} F(x) = 1$ and $\lim_{x \to -\infty} F(x) = 0$.
Discrete random variable
A random variable X is said to be discrete if it can take on at most a
countable number of values.

The probability that X takes on the value $x_i$ is

$p(x_i) = P(X = x_i)$ for $i = 1, 2, \ldots$

Then

$\sum_{i=1}^{\infty} p(x_i) = 1$

$p(x_i)$ is called the probability mass function. On $I = [a, b]$,

$P(X \in I) = \sum_{a \le x_i \le b} p(x_i)$

$F(x) = \sum_{x_i \le x} p(x_i)$ for all $-\infty < x < \infty$
Examples

[Figure: p(x) for the demand-size random variable X, with p(1) = 1/6, p(2) = 1/3, p(3) = 1/3, p(4) = 1/6.]

$P(2 \le X \le 3) = p(2) + p(3) = \tfrac{1}{3} + \tfrac{1}{3} = \tfrac{2}{3}$
[Figure: F(x) for the demand-size random variable X — a step function rising from 0 to 1 with jumps at x = 1, 2, 3, 4.]
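As a quick check, the pmf values read off the figures above can be verified in a few lines of Python (a sketch, not part of the original example):

```python
# Sketch: pmf of the demand-size random variable, read off the figures above.
pmf = {1: 1/6, 2: 1/3, 3: 1/3, 4: 1/6}

# P(2 <= X <= 3) = p(2) + p(3) = 2/3
print(sum(p for xi, p in pmf.items() if 2 <= xi <= 3))

# F(x) = sum of p(x_i) over x_i <= x: gives 0, 1/6, 1/2, 5/6, 1 at x = 0..4
def F(x):
    return sum(p for xi, p in pmf.items() if xi <= x)

print([round(F(x), 4) for x in range(5)])
```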


Continuous random variables
A random variable X is said to be continuous if there exists a nonnegative function f(x) such that for any set of real numbers B,

$P(X \in B) = \int_B f(x)\,dx$ and $\int_{-\infty}^{\infty} f(x)\,dx = 1$

f(x) is called the probability density function.

$P(X = x) = P(X \in [x, x]) = \int_x^x f(y)\,dy = 0$

$P(X \in [x, x + \Delta x]) = \int_x^{x+\Delta x} f(y)\,dy$

$F(x) = P(X \in (-\infty, x]) = \int_{-\infty}^{x} f(y)\,dy$ for all $-\infty < x < \infty$

$f(x) = F'(x)$

$P(X \in I) = \int_a^b f(y)\,dy = F(b) - F(a)$ for $I = [a, b]$
[Figure: Interpretation of the probability density function — $P(X \in [x, x + \Delta x])$ is the area under f over that interval, so intervals of equal width can carry different probabilities depending on where f is large.]


Uniform random variable on the interval [0,1]

$f(x) = \begin{cases} 1 & \text{if } 0 \le x \le 1 \\ 0 & \text{otherwise} \end{cases}$

If $0 \le x \le 1$, then

$F(x) = \int_0^x f(y)\,dy = \int_0^x 1\,dy = x$

[Figure: f(x) and F(x) for a uniform random variable on [0,1].]

For $0 \le x < x + \Delta x \le 1$,

$P(X \in [x, x + \Delta x]) = \int_x^{x+\Delta x} f(y)\,dy = F(x + \Delta x) - F(x) = \Delta x$

so the probability of falling in a subinterval depends only on its length $\Delta x$, not on its location.
Exponential random variable

[Figure: f(x) and F(x) for an exponential random variable with mean $\beta$: $f(x) = (1/\beta)e^{-x/\beta}$ and $F(x) = 1 - e^{-x/\beta}$ for $x \ge 0$.]
Joint probability mass function
If X and Y are discrete random variables, then let

$p(x, y) = P(X = x, Y = y)$ for all x, y

where p(x, y) is called the joint probability mass function of X and Y.

X and Y are independent if

$p(x, y) = p_X(x)\,p_Y(y)$ for all x, y

where

$p_X(x) = \sum_{\text{all } y} p(x, y)$

$p_Y(y) = \sum_{\text{all } x} p(x, y)$

are the (marginal) probability mass functions of X and Y.
Example 4.9
Suppose that X and Y are jointly discrete random variables with

$p(x, y) = \begin{cases} \frac{xy}{27} & \text{for } x = 1, 2 \text{ and } y = 2, 3, 4 \\ 0 & \text{otherwise} \end{cases}$

Then

$p_X(x) = \sum_{y=2}^{4} \frac{xy}{27} = \frac{x}{3}$ for x = 1, 2

$p_Y(y) = \sum_{x=1}^{2} \frac{xy}{27} = \frac{y}{9}$ for y = 2, 3, 4

Since $p(x, y) = \frac{xy}{27} = p_X(x)\,p_Y(y)$ for all x, y, the random variables X and Y are independent.
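The marginals and the independence check can be verified exactly with Python's fractions module (a sketch):

```python
from fractions import Fraction

# Joint pmf of Example 4.9: p(x, y) = xy/27 for x in {1,2}, y in {2,3,4}.
p = {(x, y): Fraction(x * y, 27) for x in (1, 2) for y in (2, 3, 4)}

# Marginals: p_X(x) sums over y, p_Y(y) sums over x.
pX = {x: sum(p[(x, y)] for y in (2, 3, 4)) for x in (1, 2)}
pY = {y: sum(p[(x, y)] for x in (1, 2)) for y in (2, 3, 4)}
print(pX)  # x/3: {1: Fraction(1, 3), 2: Fraction(2, 3)}
print(pY)  # y/9: {2: Fraction(2, 9), 3: Fraction(1, 3), 4: Fraction(4, 9)}

# Independence: p(x, y) == p_X(x) * p_Y(y) for every (x, y)?
print(all(p[(x, y)] == pX[x] * pY[y] for x in (1, 2) for y in (2, 3, 4)))  # True
```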
Joint probability density function
The random variables X and Y are jointly continuous if there exists a nonnegative function f(x, y) such that for all sets of real numbers A and B,

$P(X \in A, Y \in B) = \int_B \int_A f(x, y)\,dx\,dy$

X and Y are independent if

$f(x, y) = f_X(x)\,f_Y(y)$ for all x and y

where

$f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy$

$f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx$

are the (marginal) probability density functions of X and Y, respectively.
Example 4.11
Suppose that X and Y are jointly continuous random variables with

$f(x, y) = \begin{cases} 24xy & \text{for } x \ge 0,\ y \ge 0,\ \text{and } x + y \le 1 \\ 0 & \text{otherwise} \end{cases}$

Then

$f_X(x) = \int_0^{1-x} 24xy\,dy = 12xy^2 \Big|_0^{1-x} = 12x(1 - x)^2$ for $0 \le x \le 1$

$f_Y(y) = \int_0^{1-y} 24xy\,dx = 12yx^2 \Big|_0^{1-y} = 12y(1 - y)^2$ for $0 \le y \le 1$

Since

$f\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = 6 \ne \tfrac{3}{2} \cdot \tfrac{3}{2} = f_X\left(\tfrac{1}{2}\right) f_Y\left(\tfrac{1}{2}\right)$

X and Y are not independent.
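A short symbolic check with SymPy (a sketch; SymPy is one convenient choice, not part of the original example):

```python
import sympy as sp

x, y = sp.symbols('x y', nonnegative=True)
f = 24 * x * y  # joint density on the triangle x >= 0, y >= 0, x + y <= 1

# Marginal densities by integrating out the other variable over the triangle.
fX = sp.integrate(f, (y, 0, 1 - x))   # 12*x*(1 - x)**2
fY = sp.integrate(f, (x, 0, 1 - y))   # 12*y*(1 - y)**2
print(sp.factor(fX), sp.factor(fY))

# Independence fails at a single point: f(1/2, 1/2) != f_X(1/2) * f_Y(1/2).
half = sp.Rational(1, 2)
print(f.subs({x: half, y: half}))             # 6
print(fX.subs(x, half) * fY.subs(y, half))    # 9/4
```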


Mean or expected value

$E(X_i) = \mu_i = \begin{cases} \sum_{j=1}^{\infty} x_j\,p_{X_i}(x_j) & \text{if } X_i \text{ is discrete} \\ \int_{-\infty}^{\infty} x\,f_{X_i}(x)\,dx & \text{if } X_i \text{ is continuous} \end{cases}$

The mean is one measure of central tendency in the sense that it is the center of gravity of the distribution.
Examples 4.12-4.13
For the demand-size random variable, the mean is given by

$\mu = 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{3}\right) + 3\left(\tfrac{1}{3}\right) + 4\left(\tfrac{1}{6}\right) = \tfrac{5}{2}$

For the uniform random variable on [0,1], the mean is given by

$\mu = \int_0^1 x f(x)\,dx = \int_0^1 x\,dx = \tfrac{1}{2}$
Properties of means

1. $E(cX) = cE(X)$

2. $E\left(\sum_{i=1}^{n} c_i X_i\right) = \sum_{i=1}^{n} c_i E(X_i)$, even if the $X_i$'s are dependent.
Median $x_{0.5}$
The median $x_{0.5}$ of the random variable $X_i$ is defined to be the smallest value of x such that $F_{X_i}(x) \ge 0.5$.

[Figure: The median $x_{0.5}$ for a continuous random variable — the area under $f_{X_i}(x)$ to the left of $x_{0.5}$ is 0.5.]
Example 4.14
1. Consider a discrete random variable X that takes on each of the values 1, 2, 3, 4, and 5 with probability 0.2. Clearly, the mean and the median of X are 3.

2. Now consider a random variable Y that takes on each of the values 1, 2, 3, 4, and 100 with probability 0.2. The mean and the median of Y are 22 and 3, respectively.

Note that the median is insensitive to this change in the distribution; the median may therefore be a better measure of central tendency than the mean.
Variance

$\mathrm{Var}(X_i) = \sigma_i^2 = E[(X_i - \mu_i)^2] = E(X_i^2) - \mu_i^2 = E(X_i^2) - [E(X_i)]^2$

For the demand-size random variable,

$E(X^2) = 1^2\left(\tfrac{1}{6}\right) + 2^2\left(\tfrac{1}{3}\right) + 3^2\left(\tfrac{1}{3}\right) + 4^2\left(\tfrac{1}{6}\right) = \tfrac{43}{6}$

$\mathrm{Var}(X) = E(X^2) - \mu^2 = \tfrac{43}{6} - \left(\tfrac{5}{2}\right)^2 = \tfrac{11}{12}$

For the uniform random variable on [0,1],

$E(X^2) = \int_0^1 x^2 f(x)\,dx = \int_0^1 x^2\,dx = \tfrac{1}{3}$

$\mathrm{Var}(X) = E(X^2) - \mu^2 = \tfrac{1}{3} - \left(\tfrac{1}{2}\right)^2 = \tfrac{1}{12}$

[Figure: Density functions for continuous random variables with large and small variances.]
Properties of the variance

1. $\mathrm{Var}(X) \ge 0$

2. $\mathrm{Var}(cX) = c^2\,\mathrm{Var}(X)$

3. $\mathrm{Var}\left(\sum_{i=1}^{n} X_i\right) = \sum_{i=1}^{n} \mathrm{Var}(X_i)$ if the $X_i$'s are independent (or uncorrelated).
Standard deviation

$\sigma_i = \sqrt{\sigma_i^2}$

If $X_i$ is normally distributed, the probability that $X_i$ is between $\mu_i - 1.96\sigma_i$ and $\mu_i + 1.96\sigma_i$ is 0.95.

Covariance

$\mathrm{Cov}(X_i, X_j) = C_{ij} = E[(X_i - \mu_i)(X_j - \mu_j)] = E(X_i X_j) - \mu_i \mu_j$

The covariance between the random variables $X_i$ and $X_j$ is a measure of their dependence.

$C_{ij} = C_{ji}$, and $C_{ij} = \sigma_i^2$ if i = j.
Example 4.17
For the jointly continuous random variables X and Y in Example 4.11,

$E(XY) = \int_0^1 \int_0^{1-x} xy\,f(x, y)\,dy\,dx = \int_0^1 \int_0^{1-x} 24x^2 y^2\,dy\,dx = \int_0^1 8x^2(1 - x)^3\,dx = \tfrac{2}{15}$

$E(X) = \int_0^1 x f_X(x)\,dx = \int_0^1 12x^2(1 - x)^2\,dx = \tfrac{2}{5}$

$E(Y) = \int_0^1 y f_Y(y)\,dy = \int_0^1 12y^2(1 - y)^2\,dy = \tfrac{2}{5}$

$\mathrm{Cov}(X, Y) = E(XY) - E(X)E(Y) = \tfrac{2}{15} - \left(\tfrac{2}{5}\right)\left(\tfrac{2}{5}\right) = -\tfrac{2}{75}$
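The same SymPy setup as before verifies these integrals (a sketch):

```python
import sympy as sp

x, y = sp.symbols('x y', nonnegative=True)
f = 24 * x * y  # joint density of Example 4.11

# Double integrals over the triangle x >= 0, y >= 0, x + y <= 1
# (inner integral over y first, then over x).
EXY = sp.integrate(x * y * f, (y, 0, 1 - x), (x, 0, 1))
EX = sp.integrate(x * f, (y, 0, 1 - x), (x, 0, 1))
EY = sp.integrate(y * f, (y, 0, 1 - x), (x, 0, 1))
print(EXY, EX, EY)       # 2/15, 2/5, 2/5
print(EXY - EX * EY)     # Cov(X, Y) = -2/75
```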
If $X_i$ and $X_j$ are independent random variables, then $C_{ij} = 0$, i.e., $X_i$ and $X_j$ are uncorrelated.

Generally, the converse is not true.


Correlated
If $C_{ij} > 0$, then $X_i$ and $X_j$ are said to be positively correlated:

$X_i > \mu_i$ and $X_j > \mu_j$ tend to occur together

$X_i < \mu_i$ and $X_j < \mu_j$ tend to occur together

If $C_{ij} < 0$, then $X_i$ and $X_j$ are said to be negatively correlated:

$X_i > \mu_i$ and $X_j < \mu_j$ tend to occur together

$X_i < \mu_i$ and $X_j > \mu_j$ tend to occur together


Correlation

$\rho_{ij} = \frac{C_{ij}}{\sqrt{\sigma_i^2 \sigma_j^2}}$ for $i = 1, 2, \ldots, n$ and $j = 1, 2, \ldots, n$

$-1 \le \rho_{ij} \le 1$

If $\rho_{ij}$ is close to +1, then $X_i$ and $X_j$ are highly positively correlated.

If $\rho_{ij}$ is close to -1, then $X_i$ and $X_j$ are highly negatively correlated.

For the random variables in Example 4.11,

$\mathrm{Var}(X) = \mathrm{Var}(Y) = \tfrac{1}{25}$, $\mathrm{Cov}(X, Y) = -\tfrac{2}{75}$

$\mathrm{Cor}(X, Y) = \frac{\mathrm{Cov}(X, Y)}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(Y)}} = \frac{-2/75}{1/25} = -\tfrac{2}{3}$
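Continuing the SymPy sketch, the variance and correlation follow directly:

```python
import sympy as sp

x, y = sp.symbols('x y', nonnegative=True)
f = 24 * x * y

EX = sp.integrate(x * f, (y, 0, 1 - x), (x, 0, 1))       # 2/5
EX2 = sp.integrate(x**2 * f, (y, 0, 1 - x), (x, 0, 1))   # 1/5
varX = EX2 - EX**2
print(varX)                        # 1/25 (Var(Y) is the same by symmetry)

cov = sp.Rational(-2, 75)          # from Example 4.17
print(cov / sp.sqrt(varX * varX))  # Cor(X, Y) = -2/3
```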
4.3 Simulation output data and stochastic processes
A stochastic process is a collection of "similar" random variables ordered over time, which are all defined on a common sample space.

The state space is the set of all possible values that these random variables can take on.

Discrete-time stochastic process: $X_1, X_2, \ldots$

Continuous-time stochastic process: $\{X(t), t \ge 0\}$
Example 4.19
Consider an M/M/1 queue with IID interarrival times $A_1, A_2, \ldots$, IID service times $S_1, S_2, \ldots$, and FIFO service.

Define the discrete-time stochastic process of delays in queue $D_1, D_2, \ldots$ by

$D_1 = 0$

$D_{i+1} = \max\{D_i + S_i - A_{i+1},\ 0\}$ for $i = 1, 2, \ldots$

$D_i$ and $D_{i+1}$ are positively correlated.

The input random variables (the $A_i$'s and $S_i$'s) drive the simulation, which produces the output stochastic process $D_1, D_2, \ldots$

The state space is the set of nonnegative real numbers.
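A minimal simulation of this recursion (a sketch; the arrival rate λ = 1 and service rate ω = 10/9, giving ρ = 0.9, are assumptions for illustration, not values from the text):

```python
import numpy as np

# Sketch: delays in an M/M/1 queue via the Lindley recursion above.
rng = np.random.default_rng(1)
n = 100_000
lam, omega = 1.0, 10 / 9           # assumed rates; rho = lam/omega = 0.9

A = rng.exponential(1 / lam, n)    # interarrival times
S = rng.exponential(1 / omega, n)  # service times

D = np.zeros(n)                    # D[0] = D_1 = 0
for i in range(n - 1):
    D[i + 1] = max(D[i] + S[i] - A[i + 1], 0.0)

# Successive delays are strongly positively correlated:
print(np.corrcoef(D[:-1], D[1:])[0, 1])  # close to +1 for rho = 0.9
```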


Example 4.20
For the queueing system of Example 4.19, let $Q(t)$ be the number of customers in the queue at time t. Then $\{Q(t), t \ge 0\}$ is a continuous-time stochastic process with state space $\{0, 1, 2, \ldots\}$.
Covariance-stationary
Assumptions about the stochastic process are necessary to draw inferences in practice.

A discrete-time stochastic process $X_1, X_2, \ldots$ is said to be covariance-stationary if

$\mu_i = \mu$ for $i = 1, 2, \ldots$ and $-\infty < \mu < \infty$

$\sigma_i^2 = \sigma^2$ for $i = 1, 2, \ldots$ and $\sigma^2 < \infty$

and $C_{i,i+j} = \mathrm{Cov}(X_i, X_{i+j})$ is independent of i for $j = 1, 2, \ldots$.
Covariance-stationary process
For a covariance-stationary process, the mean and variance are stationary over time, and the covariance between two observations $X_i$ and $X_{i+j}$ depends only on the separation j and not on the actual time values i and i + j.

We denote the covariance and correlation between $X_i$ and $X_{i+j}$ by $C_j$ and $\rho_j$, respectively, where

$\rho_j = \frac{C_{i,i+j}}{\sqrt{\sigma_i^2 \sigma_{i+j}^2}} = \frac{C_j}{\sigma^2} = \frac{C_j}{C_0}$ for $j = 1, 2, \ldots$
Example 4.22
Consider the output process of delays $D_1, D_2, \ldots$ for a covariance-stationary M/M/1 queue with $\rho = \lambda/\omega < 1$.
Warmup period
In general, output processes for queueing systems are positively correlated.

If $X_1, X_2, \ldots$ is a stochastic process beginning at time 0 in a simulation, then it is quite likely not to be covariance-stationary.

However, for some simulations, $X_{k+1}, X_{k+2}, \ldots$ will be approximately covariance-stationary if k is large enough, where k is the length of the warmup period.
4.4 Estimation of means, variances, and correlations
Suppose $X_1, X_2, \ldots, X_n$ are IID random variables with finite population mean $\mu$ and finite population variance $\sigma^2$.

Unbiased estimators:

Sample mean: $\bar{X}(n) = \frac{\sum_{i=1}^{n} X_i}{n}$, with $E[\bar{X}(n)] = \mu$

Sample variance: $S^2(n) = \frac{\sum_{i=1}^{n} [X_i - \bar{X}(n)]^2}{n - 1}$, with $E[S^2(n)] = \sigma^2$
How close X n  is to  ?

Density function for X (n )

X X

First observation Second observation
of X (n ) of X (n )
How close X n  is to 
to construct a confidence interval
n
XnVarn1
Var  Xi 
i1
n
 1
Var X i  (because the X i ’s are independent)
n2
i1
n
 1
n2
 VarX i 
i1
2
 1
n2  n
n2
An unbiased estimator of $\mathrm{Var}[\bar{X}(n)]$ is

$\widehat{\mathrm{Var}}[\bar{X}(n)] = \frac{S^2(n)}{n} = \frac{\sum_{i=1}^{n} [X_i - \bar{X}(n)]^2}{n(n - 1)}$
[Figure: Distributions of $\bar{X}(n)$ for small and large n — the density tightens around $\mu$ as n grows.]


Estimating the variance of the sample mean, $\mathrm{Var}[\bar{X}(n)]$:

If the $X_i$'s are independent, they are uncorrelated, i.e., $\rho_j = 0$.

However, simulation output data are almost always correlated. Suppose $X_1, X_2, \ldots, X_n$ are from a covariance-stationary stochastic process. Then $\bar{X}(n)$ is still an unbiased estimator of $\mu$; however, $S^2(n)$ is no longer an unbiased estimator of $\sigma^2$, since

$E[S^2(n)] = \sigma^2 \left[1 - \frac{2\sum_{j=1}^{n-1}(1 - j/n)\rho_j}{n - 1}\right]$   (1)

If $\rho_j > 0$, then $E[S^2(n)] < \sigma^2$.

For a covariance-stationary process:

$\mathrm{Var}[\bar{X}(n)] = \frac{\sigma^2}{n}\left[1 + 2\sum_{j=1}^{n-1}(1 - j/n)\rho_j\right]$   (2)

If one estimates $\mathrm{Var}[\bar{X}(n)]$ by $S^2(n)/n$ (correct in the IID case), there are two sources of error:

• the bias in $S^2(n)$ as an estimator of $\sigma^2$;

• the neglect of the correlation terms in Eq. (2).

Solution: combine Eq. (1) and Eq. (2):

$E\left[\frac{S^2(n)}{n}\right] = \frac{n/a(n) - 1}{n - 1}\,\mathrm{Var}[\bar{X}(n)]$   (3)

where

$a(n) = 1 + 2\sum_{j=1}^{n-1}(1 - j/n)\rho_j$

If $\rho_j > 0$, then $a(n) > 1$ and $E[S^2(n)/n] < \mathrm{Var}[\bar{X}(n)]$.
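A numeric sketch of how severe the underestimate can be, assuming (purely for illustration) geometrically decaying correlations $\rho_j = 0.9^j$:

```python
import numpy as np

# Sketch: the bias factor in Eq. (3) under an assumed correlation structure.
def a(n, rho):
    """a(n) = 1 + 2 * sum_{j=1}^{n-1} (1 - j/n) * rho_j."""
    j = np.arange(1, n)
    return 1.0 + 2.0 * np.sum((1.0 - j / n) * rho(j))

rho = lambda j: 0.9 ** j  # assumed rho_j, for illustration only

for n in (10, 100, 1000):
    ratio = (n / a(n, rho) - 1) / (n - 1)  # E[S^2(n)/n] / Var[Xbar(n)], Eq. (3)
    print(n, round(a(n, rho), 2), round(ratio, 4))
# The ratio is far below 1, so S^2(n)/n badly underestimates Var[Xbar(n)]
# whenever the rho_j are positive.
```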
Example 4.24
Consider $D_1, D_2, \ldots, D_{10}$ from the process of delays for a covariance-stationary M/M/1 queue with $\rho = 0.9$. From Eqs. (1) and (3),

$E[S^2(10)] = 0.0328\,\sigma^2$

$E\left[\frac{S^2(10)}{10}\right] = 0.0034\,\mathrm{Var}[\bar{D}(10)]$

where

$\bar{D}(10) = \frac{\sum_{i=1}^{10} D_i}{10}$ and $S^2(10) = \frac{\sum_{i=1}^{10} [D_i - \bar{D}(10)]^2}{9}$

Thus, $S^2(10)/10$ is a gross underestimate of $\mathrm{Var}[\bar{D}(10)]$, and we are likely to be overly optimistic about the closeness of $\bar{D}(10)$ to $\mu = E(D_i)$.
Estimate j .
nj

X i X n
X ij X n

Ĉj

j  , Ĉj  i1
S n
2 n j

In general "good" estimates of the j 's will be difficult to


obtain unless n is very large and j is small relative to n.
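A direct implementation of these estimators (a sketch; the AR(1)-style test data are an assumption for illustration, not from the text):

```python
import numpy as np

def rho_hat(x, j):
    """Estimate rho_j = C_j / sigma^2 from x_1, ..., x_n, as defined above."""
    x = np.asarray(x, dtype=float)
    n, xbar = len(x), np.mean(x)
    s2 = np.sum((x - xbar) ** 2) / (n - 1)                       # S^2(n)
    cj = np.sum((x[: n - j] - xbar) * (x[j:] - xbar)) / (n - j)  # C_hat_j
    return cj / s2

# Test on AR(1)-style data, whose true correlations are rho_j = 0.8**j.
rng = np.random.default_rng(0)
x = np.zeros(10_000)
for i in range(1, len(x)):
    x[i] = 0.8 * x[i - 1] + rng.normal()
print([round(rho_hat(x, j), 2) for j in (1, 2, 5)])  # ~[0.8, 0.64, 0.33]
```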
4.5.1 Confidence Intervals
Let

$Z_n = \frac{\bar{X}(n) - \mu}{\sqrt{\sigma^2/n}}$ and $F_n(z) = P(Z_n \le z)$
Central Limit Theorem: $F_n(z) \to \Phi(z)$ as $n \to \infty$, where $\Phi(z)$, the distribution function of a normal random variable with $\mu = 0$ and $\sigma^2 = 1$, is given by

$\Phi(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-y^2/2}\,dy$ for $-\infty < z < \infty$

If n is "sufficiently large", the random variable $Z_n$ will be approximately distributed as a standard normal random variable, regardless of the underlying distribution of the $X_i$'s. For large n, the sample mean $\bar{X}(n)$ is approximately distributed as a normal random variable with mean $\mu$ and variance $\sigma^2/n$.
Replacing $\sigma^2$ by $S^2(n)$ in $Z_n = [\bar{X}(n) - \mu]/\sqrt{\sigma^2/n}$ gives

$t_n = \frac{\bar{X}(n) - \mu}{\sqrt{S^2(n)/n}}$

$P\left(-z_{1-\alpha/2} \le \frac{\bar{X}(n) - \mu}{\sqrt{S^2(n)/n}} \le z_{1-\alpha/2}\right) \approx 1 - \alpha$

$P\left(\bar{X}(n) - z_{1-\alpha/2}\sqrt{\frac{S^2(n)}{n}} \le \mu \le \bar{X}(n) + z_{1-\alpha/2}\sqrt{\frac{S^2(n)}{n}}\right) \approx 1 - \alpha$

where $z_{1-\alpha/2}$ ($0 < \alpha < 1$) is the upper $1 - \alpha/2$ critical point for a standard normal random variable.
[Figure: Standard normal density f(x); the shaded area between $-z_{1-\alpha/2}$ and $z_{1-\alpha/2}$ equals $1 - \alpha$.]
If n is sufficiently large, an approximate $100(1 - \alpha)$ percent confidence interval for $\mu$ is given by

$\bar{X}(n) \pm z_{1-\alpha/2}\sqrt{\frac{S^2(n)}{n}}$
Interpretation I: If one constructs a very large number of independent $100(1 - \alpha)$ percent confidence intervals, each based on n observations, where n is sufficiently large, the proportion of these confidence intervals that contain (cover) $\mu$ should be $1 - \alpha$.

Interpretation II: If the $X_i$'s are normal random variables, the random variable $t_n = [\bar{X}(n) - \mu]/\sqrt{S^2(n)/n}$ has a t distribution with n - 1 degrees of freedom (df), and an exact (for any $n \ge 2$) $100(1 - \alpha)$ percent confidence interval for $\mu$ is given by

$\bar{X}(n) \pm t_{n-1,1-\alpha/2}\sqrt{\frac{S^2(n)}{n}}$

where $t_{n-1,1-\alpha/2}$ is the upper $1 - \alpha/2$ critical point for the t distribution with n - 1 df.
[Figure 4.16: Density functions for the t distribution with 4 df and for the standard normal distribution.]
Example 4.26
Suppose the 10 observations 1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09 are from a normal distribution, and we want to construct a 90% confidence interval for $\mu$.

$\bar{X}(10) = 1.34$, $S^2(10) = 0.17$

$\bar{X}(10) \pm t_{9,0.95}\sqrt{\frac{S^2(10)}{10}} = 1.34 \pm 1.83\sqrt{\frac{0.17}{10}} = 1.34 \pm 0.24$
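The same interval computed with NumPy/SciPy (a sketch reproducing the numbers above):

```python
import numpy as np
from scipy import stats

x = np.array([1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09])
n = len(x)
xbar = x.mean()                       # ~1.34
s2 = x.var(ddof=1)                    # sample variance S^2(10), ~0.17
t_crit = stats.t.ppf(0.95, df=n - 1)  # t_{9,0.95} ~ 1.83

hw = t_crit * np.sqrt(s2 / n)         # half-width, ~0.24
print(xbar - hw, xbar + hw)           # ~[1.10, 1.58]
```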
Table 4.1 Estimated coverages based on 500 experiments
Distribution Skewness ν n=5 n=10 n=20 n=40
Normal 0.00 0.910 0.902 0.898 0.900

Exponential 2.00 0.854 0.878 0.870 0.890

Chi Square 2.83 0.810 0.830 0.848 0.890

Lognormal 6.18 0.758 0.768 0.842 0.852

Hyperexponential 6.43 0.584 0.586 0.682 0.774


EX 3
v  v 

2 3/2
4.5.2 Hypothesis tests for the mean

$H_0: \mu = \mu_0$

If $|\bar{X}(n) - \mu_0|$ is large, $H_0$ is not likely to be true.

If $H_0$ is true, the statistic $t_n = \frac{\bar{X}(n) - \mu_0}{\sqrt{S^2(n)/n}}$ has a t distribution with n - 1 df.

If $|t_n| > t_{n-1,1-\alpha/2}$, reject $H_0$; otherwise, "accept" $H_0$.
Example 4.27
For Example 4.26, test the null hypothesis $H_0: \mu = 1$ at level $\alpha = 0.10$:

$t_{10} = \frac{\bar{X}(10) - 1}{\sqrt{S^2(10)/10}} = \frac{0.34}{\sqrt{0.17/10}} = 2.65 > 1.83 = t_{9,0.95}$

We reject $H_0$.
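The same test in Python (a sketch; scipy.stats.ttest_1samp is an equivalent built-in):

```python
import numpy as np
from scipy import stats

x = np.array([1.20, 1.50, 1.68, 1.89, 0.95, 1.49, 1.58, 1.55, 0.50, 1.09])
n, mu0, alpha = len(x), 1.0, 0.10

tn = (x.mean() - mu0) / np.sqrt(x.var(ddof=1) / n)
crit = stats.t.ppf(1 - alpha / 2, df=n - 1)  # t_{9,0.95}
print(tn, crit, abs(tn) > crit)              # ~2.65 > ~1.83 -> reject H0

# scipy's built-in one-sample t test gives the same statistic:
print(stats.ttest_1samp(x, popmean=mu0))
```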
4.6 The Strong Law of Large Numbers

Theorem 4.2: $\bar{X}(n) \to \mu$ w.p. 1 as $n \to \infty$.
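A quick illustration of the theorem (a sketch; the exponential distribution with mean 2 is an arbitrary choice):

```python
import numpy as np

# Sketch of Theorem 4.2: the running sample mean converges to mu.
rng = np.random.default_rng(7)
x = rng.exponential(2.0, 1_000_000)
running_mean = np.cumsum(x) / np.arange(1, len(x) + 1)
for n in (10, 1_000, 100_000, 1_000_000):
    print(n, running_mean[n - 1])  # approaches mu = 2
```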
Example 4.29
Homework II: Theory of probability and statistics

A.M. Law and W.D. Kelton, Simulation Modeling and Analysis, 3rd edition, pp. 261-263.
Problems 4.1, 4.2, 4.4, 4.7, 4.9, 4.10, 4.13, 4.20, 4.21, 4.23,
4.24, 4.25, 4.26
