Lecture # 08
Introduction to Modeling

Disclaimer: This lecture note was edited from various sources solely for teaching and learning purposes. It may contain copyrighted materials from their respective
owners; therefore, apart from teaching and learning purposes, this lecture note may not be reproduced, stored, or transmitted in any form or by any means.
Introduction
Modeling and simulation may be defined as the
discipline of understanding and evaluating the
interaction of parts of a real or theoretical system by:
Designing its representation (model) and
Executing or running the model including the time
and space dimension (simulation).
Modeling is the process of developing a
mathematical representation of any (typically
three-dimensional) surface of an object, either
inanimate or living, via specialized software.
For example, a 3D model can be displayed as
a two-dimensional image through a process called
3D rendering or used in a computer simulation of
physical phenomena.
Simulation is the imitation of the operation of a
real-world process or system over time.
The act of simulating something first requires that a
model be developed; this model represents the key
characteristics or behaviors/functions of the
selected physical or abstract system or process.
The model represents the system itself, whereas the
simulation represents the operation of the system
over time.
System is a unit or process, which exists and
operates in time and space through the interaction
of its parts.
Model is a simplified representation of a real or
theoretical system at some particular point in time
or space intended to provide understanding of the
system.
http://www.silentthundermodels.com/ship_models2/images/ships/ENTERPRISECVN65_430.jpg

http://www.flightglobal.com/blogs/wp-content/uploads/mt/flightglobalweb/blogs/hyperbola/2012/05/10/HMS-Prins-of-Wales-Queen-Elizabeth-class-aircraft-carrier.jpg
Simulation is used in many contexts, such as
simulation of technology for performance
optimization, safety engineering, testing, training,
education, and video games.
Often, computer experiments are used to study
simulation models.
Simulation is also used with scientific modelling of
natural systems or human systems to gain insight into
their functioning.
Model detail
Whether a model is good or not depends on the
extent to which it provides understanding.
All models are simplifications of reality: an exact
copy of reality can only be reality itself.
There is always a trade-off as to what level of
detail is included in the model:
Too little detail: risk of missing relevant
interactions.
Too much detail: the model becomes overly
complicated to understand.
Simulation is thus the manipulation of a model in
such a way that it operates in time or space to
summarize it.
Why use simulation?
To understand and/or control complex stochastic
systems.
Such systems are often too complex to be
understood and/or controlled using analytic or
numerical methods.
Analytical Methods
can examine many decision points at once
but limited to simple models
Numerical Methods
can handle more complex models but still limited
often have to repeat computation for each
decision point
Simulation
can handle very complex and realistic systems
but has to be repeated for each decision point
Modeling the system
A good model should
Facilitate a good understanding of the system,
either analytically, numerically, or through simulation
Capture the salient details of the system, omitting
factors that are not relevant
The steps in modeling
Identify the system, including the features and state
variables that need to be included and those that
should be excluded
Make necessary probabilistic assumptions about
state variables and features
Test these assumptions
Identify the inputs and initial values of the state
variables
Identify the performance measure that we wish to
obtain from the system
Identify the mathematical relationship between the
terminal state variables and the performance
measure
Solve the model either analytically, numerically or
using simulation
Issues in simulation
What distribution do the random variables have?
How do we generate these random variables for
the simulation?
How do we analyze the output of the simulation?
How many simulation runs do we need?
How do we improve the efficiency of the simulation?
Simulation with respect to time
Pure Continuous Simulation


Pure Discrete Simulation
Event-oriented
Activity-oriented
Process-oriented
Combined Discrete / Continuous Simulation
Examples of both models
Continuous Time and Discrete Time Models:
CPU scheduling model vs. number of students
attending the class.
[Figure: (a) continuous time: CPU usage vs. time; (b) discrete time: number of students in a course vs. time (Fridays).]
Continuous State and Discrete State Models:
Example: Time spent by students in a weekly class
vs. number of jobs in the queue.
[Figure: (a) continuous state: time spent by students vs. time (Fridays); (b) discrete state: number of jobs in the queue vs. time.]
Other types of models
Static and Dynamic Models:
E = mc^2 (static) vs. CPU scheduling model (dynamic)
Deterministic and Probabilistic Models:
[Figure: output vs. input for a deterministic model and for a probabilistic model.]
Simulation with respect to results
Deterministic: established or decided beyond
dispute or doubt
Stochastic: randomly determined; having a random
probability distribution or pattern that may be
analyzed statistically but may not be predicted
precisely.
Deterministic simulation
A model that does not contain probability.
Every run gives the same result.
A single run is enough to evaluate the result.
Stochastic simulation
A model that contains probability.
Units, processes, events or their parameters are
initiated randomly using random numbers.
If different runs are initiated with different random
number seeds, every run will give a different result.
Multiple runs are required to evaluate the results.
Statistics such as averages, standard deviations
are used for evaluation.
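The role of the seed can be illustrated with a short MATLAB sketch. It assumes the rng function (available in current MATLAB releases) for seeding; the toy "model" (the mean of 1000 uniform random numbers), the seed values, and the variable names are illustrative choices only.

```matlab
% Toy stochastic "model": the mean of 1000 uniform random numbers.
% (Illustrative only; the seeds 1 and 2 are arbitrary choices.)
rng(1);                       % seed the generator
run_a = mean(rand(1000, 1));
rng(1);                       % same seed again
run_b = mean(rand(1000, 1));  % identical to run_a
rng(2);                       % different seed
run_c = mean(rand(1000, 1));  % differs from run_a

% Several independent runs, summarized by average and standard deviation
Nrun = 50;
results = zeros(Nrun, 1);
for k = 1:Nrun
    rng(k);                   % a different seed for every run
    results(k) = mean(rand(1000, 1));
end
fprintf('mean = %.4f, std = %.4f\n', mean(results), std(results));
```

Runs with the same seed reproduce each other exactly, while the spread across differently seeded runs is what the average and standard deviation summarize.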
Stochastic vs. Deterministic
[Figure: systems (deterministic or stochastic) on one side and models (deterministic or stochastic) on the other, linked by four numbered system-to-model pairings, 1 to 4.]
Stochastic Modeling is a method in which one or
more variables within the model are random.
Monte Carlo methods can be used to study both
deterministic and stochastic problems.
For a stochastic model, it is often natural and easy
to come up with a stochastic simulation strategy due
to the stochastic nature of the model, but depending
on the question asked a deterministic method may
be used.
The use of a stochastic method is often motivated by
the fact that a deterministic method to answer the
same question is not available, that it is too
complicated to be practically useful, or that it is
computationally intractable, which is often the case
if the problem is high dimensional.
On the other hand, when a deterministic method is
applicable, it is often preferable due to the very
slow convergence of Monte Carlo methods.
No matter the reason for using Monte Carlo
methods, they will inevitably require many random
numbers.
The quality of the pseudorandom number generator is
crucial to the correctness of the results computed
with a stochastic algorithm, and the speed with
which the numbers are generated is vital to the
performance of the stochastic method.
Stochastic Methods
Introduction
A stochastic process, also widely known as a random
process, is a collection of random variables used to
represent the evolution of some random value, or
system, over time.
This is the probabilistic counterpart to a
deterministic process or deterministic system.
In a deterministic system, a process can only evolve in
one way (as in the case, for example, of solutions of
an ordinary differential equation).
In a stochastic or random process there is some
indeterminacy: even if the initial condition (or
starting point) is known, there are several (often
infinitely many) directions in which the process may
evolve.
A deterministic model predicts a single outcome
from a given set of circumstances.
A stochastic model predicts a set of possible
outcomes weighted by their likelihoods, or
probabilities.
A coin flipped into the air will surely return to earth
somewhere; whether it lands heads or tails is
random.
For a "fair" coin we consider these alternatives
equally likely and assign to each the probability 1/2.
In the simple case of discrete time, a stochastic
process amounts to a sequence of random variables
known as a time series (for example, a Markov chain).
Another basic type of a stochastic process is a
random field, whose domain is a region of space, in
other words, a random function whose arguments
are drawn from a range of continuously changing
values.
Examples of processes modeled as stochastic time
series include stock market and exchange rate
fluctuations, signals such as speech, audio and
video, blood pressure or temperature, and random
movement such as Brownian motion or random walks
in a diffusion process.
Examples of random fields include static images,
random terrain (landscapes), wind waves,
composition variations of a heterogeneous material,
and manufacturing processes.
However, phenomena are not in and of themselves
inherently stochastic or deterministic.
To model a phenomenon as stochastic or
deterministic is the choice of the observer.
The choice would depend on the observer's
purpose; but the criterion for judging the choice is
usefulness.
Scientific modeling has three components:
A natural phenomenon under study,
A logical system for deducing implications about the
phenomenon, and
A connection linking the elements of the natural
system under study to the logical system used to
model it.
Classification of Models
Prescriptive/Descriptive
Prescriptive models are used to formulate and
optimize a system
Descriptive models are used to help understand the
behavior of a system
Discrete/Continuous
Continuous models have real-valued variables
Discrete models do not
Stochastic/Deterministic
Stochastic (probabilistic) models include random events
Deterministic models do not (they usually work with
expected values)
Static/Dynamic
Static models have variables that do not change
over time (a snapshot of the system in steady state)
Example: evaluation of the physical layout of a factory
Try several configurations
Evaluate the performance of each
Dynamic models include time-dependent variables
Example: queuing analysis of a bank during a whole
day
Arrival rates change
Number of tellers changes
The basic steps of stochastic modeling
Identifying the sample space;
Assigning probabilities to the elements of the
sample space;
Identifying the events of interest;
Computing the desired probabilities.
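The four steps can be traced on a small example. The sketch below is an illustration rather than part of the original notes: it assumes the experiment is a roll of two fair dice and the event of interest is "the sum equals 7"; the variable names are arbitrary.

```matlab
% Step 1: sample space, all 36 ordered outcomes of two dice
[d1, d2] = meshgrid(1:6, 1:6);

% Step 2: assign probabilities (fair dice: every outcome has 1/36)
p = ones(6, 6) / 36;

% Step 3: event of interest, here "the sum of the two dice is 7"
event = (d1 + d2 == 7);

% Step 4: desired probability = sum of the probabilities over the event
P = sum(p(event));
fprintf('P(sum = 7) = %.4f\n', P);
```

Six of the 36 outcomes belong to the event, so the computed probability is 6/36.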
Stochastic simulation in MATLAB
Underlying every stochastic simulation is a random
number generator: MATLAB supplies two, and from
these you can create random numbers satisfying
particular specifications.
They are rand, which makes uniformly distributed
random numbers, and randn, which makes normally
distributed random numbers.
[Figure: histograms (frequency of x vs. x) of 100,000 random numbers produced by rand(100000,1) (left) and randn(100000,1) (right).]
The numbers produced by rand have a flat
histogram, indicating that all values are equally
likely.
They are distributed between a minimum value of 0
and a maximum value of 1.0.
The ones produced by randn, on the other hand,
are distributed along a classic bell-shaped curve.
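These properties can be checked numerically without drawing the histograms. The following is a minimal sketch; the bin count of 10 and the variable names are arbitrary choices, and hist is called with output arguments so that it returns the counts instead of plotting.

```matlab
N = 100000;
u = rand(N, 1);    % uniformly distributed between 0 and 1
g = randn(N, 1);   % normally distributed, mean 0 and standard deviation 1

% Bin counts without plotting: 10 equal-width bins for the uniform sample
[counts, centers] = hist(u, 10);
disp([centers(:), counts(:)]);   % each bin should hold roughly N/10 values

fprintf('rand:  min = %.4f, max = %.4f, mean = %.4f\n', min(u), max(u), mean(u));
fprintf('randn: mean = %.4f, std = %.4f\n', mean(g), std(g));
```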
To produce a single random number, type

>> x = rand
x=
0.3557
To make arrays of random numbers:
>> x = rand(1,4)
x=
0.0806 0.2473 0.3669 0.7020
>> x = rand(2,3)
x=
0.1725 0.7048 0.9343
0.4748 0.1282 0.3119
A coin-tossing simulation
By inspecting the histogram of the uniformly
distributed random numbers, observe that half of
the values are between 0 and 0.5, and the other
half are between 0.5 and 1.0.
That is, $P(0 \le x < 0.5) = P(0.5 \le x < 1.0) = 0.5$.
We can use this to simulate a coin toss:
x = rand
if (x < 0.5),
toss=1 % Head
else
toss=0 % Tail
end
The expression x < 0.5 evaluates to 1 if true and 0
if false.
Therefore, we can express the coin toss more
compactly like this:
>>x = rand
toss = (x < 0.5)
If we want to simulate many coin tosses, we
can do it with almost the same code.
Generate a vector of random numbers, then, the
expression x < 0.5 evaluates to a vector of 1s and
0s.
Ntoss = 100;
x = rand(1, Ntoss);
toss = (x < 0.5);
To count the tails (0) and heads (1):
>> n=numel(find(toss==0))
n=
55

>> n=numel(find(toss==1))
n=
45
You can make the coin toss biased.
Suppose you want P(Head) = 0.6; just change the
expression for toss to

toss = (x < 0.6);


Finally, we can create a function that will do a
prescribed number of tosses, with a coin having
P(Head) = p.
Open the MATLAB editor, write this function, and save it as
coin.m
function toss = coin(Ntoss, p)
% Simulate Ntoss tosses of a coin with P(Head) = p.
% toss is a 1-by-Ntoss logical vector: 1 = head, 0 = tail.
x = rand(1, Ntoss);
toss = (x < p);
end
To run this coin function, for example to see the
result of tossing a coin 5000 times, type the following
in the MATLAB command window:
>> y = coin(5000, 0.5);
>> h = numel(find(y==1)) % heads
>> t = numel(find(y==0)) % tails
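As a quick check of the function's behavior, the fraction of heads should approach p as the number of tosses grows. The sketch below is self-contained: it uses a hypothetical anonymous-function stand-in for coin.m, and the list of toss counts and p = 0.6 are arbitrary choices.

```matlab
% Hypothetical inline stand-in for coin.m, so the snippet is self-contained
coin = @(Ntoss, p) (rand(1, Ntoss) < p);

p = 0.6;                          % P(Head) for a biased coin
Nlist = [10 100 1000 10000 100000 1000000];
frac = zeros(size(Nlist));
for k = 1:numel(Nlist)
    toss = coin(Nlist(k), p);
    frac(k) = mean(toss);         % fraction of heads in this experiment
    fprintf('N = %7d   fraction of heads = %.4f\n', Nlist(k), frac(k));
end
```

For small N the fraction can be far from p; for large N it settles close to p, up to random fluctuations.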
A die-rolling simulation
Die rolling is also easy to do: create random
integers selected from the set {1, 2, 3, 4, 5, 6}, with
identical probabilities.
You can get random numbers between 0 and 6 by
scaling:
Nroll = 100000; % no thousands separator in MATLAB numbers
x = 6*rand(1, Nroll);
hist(x, 6), xlabel('x'), ylabel('Frequency of x');
h = findobj(gca, 'Type', 'patch');
set(h, 'FaceColor', 'w', 'EdgeColor', 'b')
[Figure: histogram of x (frequency of x vs. x, for x from 0 to 6) for Nroll = 100,000.]
You can reduce these to integers by truncation
(throw away the fractional parts of the numbers).

x = fix(6*rand(1, Nroll)); % fix rounds toward zero


[h, xbin] = hist(x, 0:5);
stem(xbin, h), xlabel('x'), ylabel('Frequency of x');
% Makes a stem plot
[Figure: stem plot of the frequency of x for x = 0, 1, ..., 5.]
Now just add 1 to shift the numbers to the correct
range.
Nroll=10000;
x = fix(6*rand(1, Nroll)) + 1;
[h, xbin] = hist(x, 1:6);
stem(xbin, h) , xlabel('x'), ylabel('Frequency of x');
x1=numel(find(x==1))
[Figure: stem plot of the frequency of x for x = 1, 2, ..., 6.]
Simulating risks and rewards
Many engineering decisions are analyzed in terms
of risks vs. benefits.
When there is a random component to the process
being studied, a simulation can be helpful.
As usual, games of chance provide simple
examples.
Example
Suppose a coin is tossed 5000 times. Each time heads
occurs, we win a dollar, otherwise we lose a dollar. Let
S(n) be our accumulated winnings after n tosses. Let us
consider how many times during the 5000 tosses S(n)
will go from a positive balance to a negative balance,
or vice versa.
Solution
The text's solution tosses the coins one at a time, and
either increments or decrements the score
depending on the outcome of each toss.
We can adapt the coin toss procedure developed
earlier.
For demonstration purposes, we'll use just 6 tosses.
Ntoss = 6;
u = rand(1, Ntoss)
u=
0.7382 0.1763 0.4057 0.9355 0.9169 0.4103
We want to register a score of -1 for tosses that
result in tails and +1 for tosses that result in heads.
An easy way to do this is to begin with a vector of
-1s:
s = -1 * ones(1, Ntoss)
s=
-1 -1 -1 -1 -1 -1
Now, we can use the find command to identify those
turns where the toss is heads.

h = find(u<0.5)
h=
2 3 6
Change the corresponding elements of the score
vector to 1s.

s(h) = 1
s=
-1 1 1 -1 -1 1
Finally, we can use the cumsum command to
calculate our total winnings at each turn.

winnings = cumsum(s)
winnings =
-1 0 1 0 -1 0
Here is the code for 5000 tosses, plus a graph of
the accumulated winnings for two different games.

Ntoss = 5000;
s = -1 * ones(1, Ntoss);
u = rand(1, Ntoss);
h = find(u < 0.5);
s(h) = 1;
% The two lines above can be combined: s(u < 0.5) = 1
winnings = cumsum(s);
plot(1:Ntoss, winnings)
xlabel('Toss')
ylabel('S(n)')
winnings(Ntoss)
[Figure: accumulated winnings S(n) vs. toss number for two different games of 5000 tosses.]
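The example asks how often S(n) goes from a positive to a negative balance or back. One way to count this, sketched below under stated assumptions, is to drop the tosses where S(n) is exactly zero and count the sign changes in what remains. The helper name count_crossings is hypothetical, and the score vector is built in a single step instead of with find.

```matlab
% Hypothetical helper: number of sign changes in w, ignoring exact zeros
count_crossings = @(w) sum(diff(sign(w(w ~= 0))) ~= 0);

Ntoss = 5000;
u = rand(1, Ntoss);
s = 2*(u < 0.5) - 1;          % +1 for heads, -1 for tails
winnings = cumsum(s);         % S(n) after each toss
crossings = count_crossings(winnings);
fprintf('S(n) changed sign %d times in %d tosses\n', crossings, Ntoss);
```

Ignoring the zeros means that a path such as +1, 0, -1 counts as one crossing, while +1, 0, +1 counts as none.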
Monte Carlo Methods
Introduction
Monte Carlo simulation is a simulation of a random
process using a large number of computer-generated
samples.
It is based on statistical sampling; analyzing the
outputs gives an estimate of a quantity of interest.
Monte Carlo methods provide approximate
solutions to a variety of mathematical problems by
performing statistical sampling experiments.
Thus, they can be loosely defined as statistical
simulation methods, where statistical simulation is
defined in quite general terms to be any method
that utilizes sequences of random numbers to
perform the simulation.
A Monte Carlo process involves performing many
simulations using random numbers and probability
to get an approximation of the answer to the
problem.
The defining characteristic of Monte Carlo methods
is their use of random numbers in simulations.
In fact, these methods derive their collective name
from Monte Carlo, the district of Monaco known for
its casinos; casino roulette wheels are a good
example of a random number generator.
Key to the Monte Carlo method is the generation of
sequences of random numbers.
Example: generate ten random integers in the range [1,100],
as a column vector
>> randi(100,10,1)
ans =
82
91
13
92
64
10
28
55
96
97
>> randi(100,10,1)
ans =
16
98
96
49
81
15
43
92
80
96
Generate ten random integers in the range [1,100],
as a row vector
>> randi(100, 1, 10)
ans =
82 91 13 92 64 10 28 55 96 97

>> randi(100, 1, 10)


ans =
66 4 85 94 68 76 75 40 66 18
Simulating the throw of a die
Throwing a die is a random process with the
outcomes 1, 2, 3, 4, 5 and 6 occurring with equal
probability.
This can be simulated as follows:
>> randi(6,1,1)
ans =
5
>> randi(6,1,1)
ans =
6
>> randi(6,1,1)
ans =
1
60,000 simulated die throws stored in a vector K:
>> K = randi(6,60000,1);
>> hist(K,[1.0:0.1:6.0])
[Figure: histogram of the 60,000 simulated throws in K.]
Continuous random outcomes
Simulations often require continuous random
outcomes, for example:
>> X = rand(100000,1);
>> hist(X,100)
100,000 random outcomes are stored in a vector X.
X is distributed uniformly (except for random
fluctuations) between 0 and 1.
Theoretically, any outcome between 0 and 1 is
equally likely.
[Figure: histogram of X with 100 bins on the interval from 0 to 1.]
Using random numbers to estimate $\pi$
Populate a square with n
randomly placed points
[the blue and red points]
$-1.0 \le x < +1.0$
$-1.0 \le y < +1.0$
Count the number of points m that lie inside a circle
of unit radius [the blue points]
Then
$m/n \approx \pi/4$
$\pi \approx 4m/n$
n = 100000;
x = 2*rand(n,1)-1;
y = 2*rand(n,1)-1;
m = sum(x.^2+y.^2<1);
disp(4*m/n)

>> mc_pi
3.14836
clear xc yc;
n = 10000;
x = 2*rand(n,1)-1;
y = 2*rand(n,1)-1;
j=0;
for i=1:n
if x(i)^2 + y(i)^2 < 1
j=j+1;
xc(j)=x(i);
yc(j)=y(i);
end
end
plot(x, y, 'r.') % Red dots
hold on % More plotting to come
plot(xc,yc,'b.') % Blue dots
hold off % Finished plotting
Generally, a factor of 100 increase in n yields
a factor of 10 improvement in accuracy.
This Monte Carlo method works, but requires a very
large number of samples to obtain good accuracy.
[Figure: scatter plot of the n random points in the square from -1 to 1 on both axes; points inside the unit circle are blue, the rest red.]
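The slow convergence can be observed directly by repeating the estimate for increasing n. The sketch below is illustrative; the list of sample sizes and the variable names are arbitrary. Because the estimate is 4 times a sample proportion, its typical error shrinks roughly like $1/\sqrt{n}$, which is where the "100 times more points for 10 times better accuracy" rule comes from.

```matlab
% Estimate pi for increasing sample sizes and report the absolute error.
nlist = 10.^(2:6);
est = zeros(size(nlist));
for k = 1:numel(nlist)
    n = nlist(k);
    x = 2*rand(n,1) - 1;
    y = 2*rand(n,1) - 1;
    m = sum(x.^2 + y.^2 < 1);     % points inside the unit circle
    est(k) = 4*m/n;
    fprintf('n = %7d   estimate = %.5f   error = %.5f\n', ...
            n, est(k), abs(est(k) - pi));
end
```

Individual runs fluctuate, so the error does not decrease monotonically; only the overall trend follows the square-root rule.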
Example 1
Find the maximum of the following function in the
range $0 \le x < \pi$:

$f(x) = x \left(0.5 + e^{-x} \sin(x^3)\right)^2$
Solution
Generate a large number of random values x in the
range $0 \le x < \pi$,
Evaluate f(x) at each point, and
Record the position of the largest value.
n = 100000;
x = pi*rand(n,1); % 0 <= x < pi
f = x.*(0.5+exp(-x).*sin(x.^3)).^2;
[y, i] = max(f);
fprintf('f(%f) = %f\n', x(i),y)

f(2.990116) = 0.905360
[Figure: plot of x (0.5 + exp(-x) sin(x^3))^2 against x, for x from 0 to 3.5.]
Convergence
n       result
10^1    f(2.461499) = 0.774204
10^2    f(3.006002) = 0.890796
10^3    f(2.991390) = 0.905266
10^4    f(2.989944) = 0.905358
10^5    f(2.990159) = 0.905360
10^6    f(2.990135) = 0.905360
10^7    f(2.990134) = 0.905360
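For a one-dimensional problem like this one, the random search can be cross-checked with a deterministic search on an evenly spaced grid. This is a minimal sketch, not part of the original solution; the grid size of one million points and the names xg, fg, ymax, imax are arbitrary choices.

```matlab
% Deterministic cross-check: evaluate f on a fine, evenly spaced grid.
% (The grid size of 1e6 points is an arbitrary choice.)
xg = linspace(0, pi, 1000000);
fg = xg .* (0.5 + exp(-xg) .* sin(xg.^3)).^2;
[ymax, imax] = max(fg);
fprintf('grid search: f(%f) = %f\n', xg(imax), ymax);
```

The grid result should land close to the converged Monte Carlo values in the table. In one dimension the grid is cheap; the random approach becomes more attractive as the number of variables grows.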
Optimization
Similar to the idea of obtaining the maximum or
minimum of a function, we sometimes wish to
optimize a system; i.e. maximize or minimize a
target quantity.
Example 2
We have a rectangular sheep enclosure (sides a and
b) constructed from 200 meters of fencing. We want
to maximize the area of the enclosure, i.e. find the
values of a and b such that the area $A = ab$ is
maximized under the constraint $2a + 2b = 200$
meters ($a + b = 100$).
Solution
One solution to this problem is to search a =
0:0.1:100 and use max(a.*b) to find the optimal
value of a.
Alternatively, in the Monte Carlo approach the vector a
is a set of random values, $0 < a < 100$:
n = 100000;
a = 100*rand(n,1); % 0 < a < 100
b = 100 - a; % b = 100 - a
[A, i] = max(a.*b);
fprintf('a=%f, area=%f\n', a(i), A)

a=49.999780, area=2500.000000
Convergence
n       result
10^1    a=47.952550, area=2495.807948
10^2    a=49.578598, area=2499.822420
10^3    a=50.035648, area=2499.998729
10^4    a=50.010014, area=2499.999900
10^5    a=49.999222, area=2499.999999
10^6    a=50.000015, area=2500.000000
10^7    a=50.000008, area=2500.000000
10^8    a=50.000000, area=2500.000000
[Figure: the optimal enclosure is a square with a = 50 and b = 50, giving area A = ab = 2500.]
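The grid search mentioned in the solution can be written in a few lines. This is a sketch of that deterministic alternative, using the step of 0.1 m quoted in the text; printing b alongside a is an addition for illustration.

```matlab
a = 0:0.1:100;          % candidate side lengths, in meters
b = 100 - a;            % constraint a + b = 100
[A, i] = max(a.*b);     % element-wise product gives the area for each a
fprintf('a=%f, b=%f, area=%f\n', a(i), b(i), A);
```

The grid contains a = 50, so the search finds the square enclosure directly; the Monte Carlo version only approaches it as n grows.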
Example 3
Two coins and one die are thrown. What is the
probability of obtaining a Tail, a Head and a 6 in
any order: P(T, H, 6)?
Solution
Throw n times and count the number of times m that
we get the outcome.
Then $m/n \to P(T, H, 6)$ as $n \to \infty$. (The experimental
definition of probability requires many trials to
obtain good accuracy.)
n = 100000;
D = randi(6,n,1); % outcomes 1,2,3,4,5,6
C1 = randi(2,n,1); % outcomes 1=Tail, 2=Head
C2 = randi(2,n,1); % outcomes 1=Tail, 2=Head
m= sum( D==6 & C1~=C2 );
fprintf('1/%f = %f\n', m/n, n/m)

1/0.083390 = 11.991846
Convergence
n       output (1/(m/n) = n/m)
10^1    1/0.100000 = 10.000000
10^2    1/0.110000 = 9.090909
10^3    1/0.070000 = 14.285714
10^4    1/0.082500 = 12.121212
10^5    1/0.082070 = 12.184720
10^6    1/0.083394 = 11.991270
10^7    1/0.083366 = 11.995341
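Because the sample space here is tiny, the Monte Carlo estimate can be compared with an exact count. The sketch below enumerates all outcomes with ndgrid, reusing the encoding from the code above (1 = Tail, 2 = Head); the variable name favorable is an arbitrary choice.

```matlab
% Enumerate all 6*2*2 = 24 equally likely outcomes
[D, C1, C2] = ndgrid(1:6, 1:2, 1:2);   % die, coin 1, coin 2 (1=Tail, 2=Head)
favorable = (D == 6) & (C1 ~= C2);     % a 6 together with one Tail and one Head
P = sum(favorable(:)) / numel(D);
fprintf('P(T, H, 6) = %f = 1/%f\n', P, 1/P);
```

Two of the 24 outcomes are favorable, so the exact value is 1/12, which the simulated values of n/m approach.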
Exercise
Bacteria are grown in culture dishes in a laboratory.
Experience tells us that on average in this lab 20% of
the dishes become contaminated by unwanted
bacteria (thus spoiling the culture). If the lab is
growing bacteria in ten dishes, what is the probability
that more than half of the dishes will become
contaminated? Use Monte Carlo method to solve this
problem!
Solution
We have ten dishes, P(contamination of each dish)
= 0.2
Use a Monte Carlo experiment to test each dish
against the probability of 0.2.
Repeat this n times and count the number of times m
where more than 5 dishes become contaminated.
Then $m/n \to P(\text{more than 5 dishes are contaminated})$ as $n \to \infty$
n=10000; m= 0;
for i=1:n
k = sum(rand(10,1)<0.2);
if k > 5 % greater than 5 dishes spoiled
m= m+1;
end
end
disp(m/n)
Convergence
n       m/n
10^1    0
10^2    0
10^3    0.0050000
10^4    0.0059000
10^5    0.0061300
10^6    0.0063660
10^7    0.0063631
Analytical calculation gives 0.637% (binomial distribution).
This method works, but requires a very large number of
samples to obtain good accuracy.
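The analytical value quoted above follows from summing the binomial probabilities for 6 to 10 contaminated dishes. A minimal sketch of that calculation, using only the base function nchoosek (the names Ndish, p and P are arbitrary):

```matlab
Ndish = 10;      % number of dishes
p = 0.2;         % probability that a single dish is contaminated
P = 0;
for k = 6:Ndish  % "more than half" means 6, 7, 8, 9 or 10 dishes
    P = P + nchoosek(Ndish, k) * p^k * (1 - p)^(Ndish - k);
end
fprintf('P(more than 5 contaminated) = %.7f (%.3f%%)\n', P, 100*P);
```

The sum evaluates to about 0.00637, i.e. the 0.637% stated above.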
References
E.W. Hansen, Using MATLAB for Stochastic Simulation
H.M. Zhu, Monte Carlo Simulations, University of New York
A. Taylan Cemgil, Monte Carlo Methods, Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey