Subject - 02 (Two Weeks)

Topics/weeks (1 to 16)
• Introduction of S&M *
• Introduction to probability and statistics ** ***
• Generation of random numbers and Monte Carlo
• Markov Simulation and Discrete Event Modelling
• Beer Game
• System Dynamics Modelling
• System Dynamics Modelling
• Agent Based Modelling
Milestones: First Review, Second Review, Dissertation, Final Delivery review
*Quiz_1 (Concepts)
**Workshop_2 Introduction to Simulation
***Workshop_3 Probability
Note: It's important in these classes that you share knowledge with classmates and attend.
Outline
Introduction to probability and statistics
• Conceptualisation: Randomness and probability.
• Markov Models
• Applications
Introduction to probability and statistics
Conceptualisation: Randomness and Probability
Remember: Simulation is a numerical technique to reproduce phenomena in different scenarios.
Example: A schoolyard riddle. An animal falls into a 30-metre-deep well at 6 a.m.
It climbs 3 metres during the day but slips back 2 metres during the night (6 p.m. to 6 a.m.). When does it reach the surface?
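The riddle can be reproduced numerically, which is the idea of simulation stated above. The sketch below is only one possible reading: it assumes the animal starts at the bottom (height 0), completes its 3-metre climb by the end of each day, and is out as soon as it reaches 30 metres. The variable names are illustrative.

```r
depth  <- 30   # metres to climb
climb  <- 3    # metres gained during the day
slip   <- 2    # metres lost during the night
height <- 0    # assumed starting point: bottom of the well
day    <- 0

repeat {
  day    <- day + 1
  height <- height + climb        # daytime climb
  if (height >= depth) break      # out of the well before night falls
  height <- height - slip         # night-time slip
}

day   # day on which the animal reaches the surface
```

Under these assumptions the animal gets out on day 28, not day 30, because the last climb is not followed by a slip.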
Workshop_1
The trip from city A to city B takes 2 hours, while the trip in the opposite direction takes 2.5 hours. A service must leave each city every hour, on the hour (from 6:00 to 16:00). The driver of the vehicle must rest for half an hour after finishing a trip.
Use simulation to obtain the minimum number of vehicles needed to meet this plan.
Vehicle  Departs A  Arrives B  Departs B  Arrives A  Available at (site)  Available from (hour)
1        6          8                                B                    8.5
2                              6          8.5        A                    9
3        7          9                                B                    9.5
4                              7          9.5        A                    10
5        8          10                               B                    10.5
6                              8          10.5       A                    11
2        9          11                               B                    11.5
1                              9          11.5       A                    12
4        10         12                               B                    12.5
3                              10         12.5       A                    13
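The hand-built table above can be reproduced with a short loop. This is a sketch under assumptions, not the course's official solution: it dispatches whichever rested vehicle has been waiting longest at the departure city, and it adds a new vehicle only when none is available. All names (`travel`, `site`, `free_at`) are illustrative.

```r
travel  <- c(A = 2, B = 2.5)   # trip duration in hours, indexed by departure city
rest    <- 0.5                 # driver rest after each trip
site    <- character(0)        # where each vehicle is (or is heading)
free_at <- numeric(0)          # hour from which each vehicle is available there

for (h in 6:16) {
  for (origin in c("A", "B")) {
    dest  <- if (origin == "A") "B" else "A"
    ready <- which(site == origin & free_at <= h)
    if (length(ready) == 0) {
      # no rested vehicle at this city: put a new one into service
      site    <- c(site, origin)
      free_at <- c(free_at, h)
      v <- length(site)
    } else {
      v <- ready[which.min(free_at[ready])]   # longest-waiting vehicle
    }
    site[v]    <- dest
    free_at[v] <- h + travel[[origin]] + rest
  }
}

n_vehicles <- length(site)
n_vehicles
```

With the data of this workshop the loop ends with 6 vehicles, the same fleet that appears in the table.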
In simulation experiments, values of random variables must be generated so that they follow a given probability distribution. Random numbers should have the following characteristics:
• Uniformly distributed
• Statistically independent
• Reproducible
• Long period (no repeats within a given length of the sequence)
• Generated through a fast method
• Generated through a method that does not require much storage capacity
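One of the characteristics listed above is that a sequence can be regenerated on demand. In R this is usually done by fixing the seed with `set.seed()`. The seed value 2024 below is arbitrary and chosen only for illustration.

```r
set.seed(2024)            # arbitrary seed, illustrative only
first_run <- runif(5)

set.seed(2024)            # same seed again
second_run <- runif(5)

identical(first_run, second_run)   # TRUE: the sequence is reproduced
```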
Additional Material:
https://www.youtube.com/watch?v=5Lwf1BWgMow
Random Number Generation Methods
1. Middle-square method
2. Congruential methods
3. Lagged (shift-register) method
[Seed - Algorithm - Validation]
P1: Obtain a seed (initial value)
P2: Apply a recursive algorithm
P3: Validate the generated data set (randomness tests)
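Step P3 can be sketched with two simple checks. This is a minimal illustration and not a full battery of randomness tests: uniformity is checked with a Kolmogorov-Smirnov test against U(0,1), and independence is checked roughly through the lag-1 correlation. R's built-in `runif()` stands in for the generator under study, and the seed is arbitrary.

```r
set.seed(1)                # arbitrary seed, illustrative only
u <- runif(1000)           # stand-in for the numbers to be validated

# Uniformity: compare the sample with the U(0,1) distribution function
ks_u <- ks.test(u, "punif")

# Independence (rough check): correlation between consecutive values
lag1 <- cor(u[-1], u[-length(u)])

ks_u$p.value   # a large p-value gives no evidence against uniformity
lag1           # values near 0 are consistent with independence
```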
1. Middle-square method
Example: seed 445
X      X^2           Middle digits   Random number
445    1|9802|5      9802            0.9802
9802   96|0792|04    0792            0.0792
792    6|2726|4      2726            0.2726
2726   ...           ...             ...
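The procedure in the table above (usually called the middle-square method in English) can be sketched as a small function. The function name `middle_square` and the padding rule are assumptions: when the four middle digits cannot be centred in the square, the sketch adds a leading zero, which is one common convention and may differ from the one used in class.

```r
middle_square <- function(seed, n, digits = 4) {
  x <- seed
  u <- numeric(n)
  for (i in seq_len(n)) {
    s <- sprintf("%.0f", x^2)              # square written without exponent
    # assumed rule: pad with leading zeros until the middle can be centred
    while (nchar(s) < digits || (nchar(s) - digits) %% 2 != 0) {
      s <- paste0("0", s)
    }
    start <- (nchar(s) - digits) / 2 + 1
    x <- as.numeric(substr(s, start, start + digits - 1))
    u[i] <- x / 10^digits                  # scale to the interval [0, 1)
  }
  u
}

middle_square(445, 3)   # reproduces the three rows of the table: 0.9802 0.0792 0.2726
```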
Workshop_2
Generate random numbers with the following seeds using the middle-square method:
a. 3892
b. 8364
c. 2749
d. 5357
e. 7600
f. 4583
g. 5555
2. Congruential Methods
Example in Excel
The parameters are as follows:
X_{n+1} = (a X_n + b) mod m
U_n = X_n / m
a: multiplier
b: increment
m: modulus
X_0: seed (initial value)
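The recurrence above translates almost literally into R. The sketch below uses a hypothetical helper named `lcg` and, as input, the first parameter set of Workshop_3 (seed 22, a = 33, b = 9, m = 444); only the first three values are generated.

```r
lcg <- function(seed, a, b, m, n) {
  x    <- numeric(n)
  prev <- seed
  for (i in seq_len(n)) {
    prev <- (a * prev + b) %% m   # X_{n+1} = (a X_n + b) mod m
    x[i] <- prev
  }
  data.frame(x = x, u = x / m)    # U_n = X_n / m
}

out <- lcg(seed = 22, a = 33, b = 9, m = 444, n = 3)
out   # x = 291, 288, 189
```

Since every X_n is smaller than m, the values U_n always fall in [0, 1).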
Workshop_3
Generate random numbers with the following seed using the congruential method:
1. Seed: 22
a: 33
b: 9
m: 444
2. Seed: 4364
a: 45
b: 34
m: 3146
Digital Random Generation
sample(1:30, 10, replace = FALSE)
[1] 4 27 3 19 29 9 21 13 23 20
runif(5, min = 3, max = 4)
[1] 3.858063 3.425381 3.454964 3.719711 3.287005
Types of random variables
• Discrete
• Continuous
Functions describing the behaviour of a random variable
• Cumulative (distribution function)
• Pointwise (density function)
Distributions
• Empirical
• Theoretical
Continuous Distribution
• Uniform
• Exponential
• Gamma
• Weibull
• Normal
• Log-normal
• Beta
• Triangular
Discrete Distribution
• Poisson
• Bernoulli
• Binomial
• Discrete Uniform
Continuous Distribution
• Uniform: unif
• Exponential: exp
• Gamma: gamma
• Weibull: weibull
• Normal: norm
• T-Student: t
Discrete Distribution
• Poisson: pois
• Binomial: binom
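In R the distribution names listed above are combined with a one-letter prefix: `d` for the density or probability function, `p` for the cumulative distribution function, `q` for quantiles and `r` for random generation. A brief sketch follows; the parameter values are taken from the examples in these notes where possible, and the exponential rate and the seed are arbitrary.

```r
d <- dbinom(6, size = 10, prob = 0.6)     # P(X = 6) for a B(10, 0.6)
p <- pbinom(6, size = 10, prob = 0.6)     # P(X <= 6) for a B(10, 0.6)
q <- qnorm(0.975, mean = 170, sd = 12)    # 97.5% quantile of a N(170, 12)

set.seed(1)                               # arbitrary seed
r <- rexp(3, rate = 2)                    # three exponential draws, rate chosen arbitrarily

c(d = d, p = p, q = q)
```

Note that `nbinom` also exists in R but refers to the negative binomial distribution, not the binomial.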
rbinom(15,10,0.6)
plot(dbinom(0:10,10,0.6),type="h",xlab="k",ylab="P(X=k)",main="Función de Probabilidad B(10,0.6)")
X=rnorm(10000, 170, 12)
hist(X,freq=FALSE,col="lightsalmon",main="Histograma",sub="Datos simulados de una N(170,12)")
curve(dnorm(x,170,12),xlim=c(110,220),col="blue",lwd=2,add=TRUE)
Conceptualisation: Goodness-of-fit-tests
Goodness-of-fit tests compare the observed frequency with the expected frequency in each class.
H0: f(x) = f0(x)
The data fit the considered distribution
H1: f(x) ≠ f0(x)
The data do not fit the considered distribution

Continuous Distribution
• Kolmogorov Smirnov
• Anderson Darling
• Shapiro Wilk
Discrete Distribution
• Chi-square

Conceptualisation: Goodness-of-fit-tests – Kolmogorov Smirnov
You need to check whether it is correct to assume that the fill volume (in ounces) of a juice vending machine follows a normal distribution, so 25 bottles are drawn at random. The fill volume data obtained from the sample are stored in the volumen vector.
volumen <- c(8.39,12.14,11.80,12.04,7.34,12.62,11.51,12.47,11.08,14.32,
             11.33,11.56,12.79,11.72,12.84,11.73,12.14,11.88,11.95,10.84,
             11.79,13.21,12.56,12.55,12.80)
H0: the filling volume follows a normal distribution.
H1: the filling volume does not follow a normal distribution.
Significance level: 0.05 (Hypothetical).

Kolmogorov Smirnov
require(car)
par(mfrow=c(1,3))
hist(volumen, xlab = "Volumen de llenado", ylab = "Frecuencia", las=1, main = "", col = "gray")
plot(density(volumen), xlab = "Volumen de llenado", ylab = "Densidad", las=1, main = "")
qqPlot(volumen, xlab="Cuantiles teóricos", ylab="Cuantiles muestrales", las=1,main="")
Steps
1. Load package (car)
2. Verify histogram
3. Plot density
4. Plot quantiles
Kolmogorov Smirnov
require(MASS)
Ajusten<-fitdistr(volumen, "normal")  # Calculate hypothetical distribution
Ajusten
mean sd
11.8160000 1.3755959
( 0.2751192) ( 0.1945386)
Ksn<-ks.test(volumen, "pnorm", mean =Ajusten$estimate[1], sd= Ajusten$estimate[2])

Ksn
One-sample Kolmogorov-Smirnov test
data: volumen
D = 0.21198, p-value = 0.2112
alternative hypothesis: two-sided
Conceptualisation: Goodness-of-fit-tests – Anderson-Darling
require(MASS)
Ajusten<-fitdistr(volumen, "normal")  # Calculate hypothetical distribution
Ajusten
mean sd
11.8160000 1.3755959
( 0.2751192) ( 0.1945386)
require(goftest)
Adn<-ad.test(volumen, "pnorm", mean =Ajusten$estimate[1], sd= Ajusten$estimate[2])
Adn
Anderson-Darling test of goodness-of-fit
Null hypothesis: Normal distribution
with parameters mean = 11.816, sd = 1.3755958708865
Parameters assumed to be fixed
data: volumen
An = 1.6222, p-value = 0.1501
Conceptualisation: Goodness-of-fit-tests – Shapiro-Wilk
require(MASS)
Ajusten<-fitdistr(volumen, "normal")  # Calculate hypothetical distribution
Ajusten
mean sd
11.8160000 1.3755959
( 0.2751192) ( 0.1945386)
Swn<-shapiro.test(volumen)
Swn
Shapiro-Wilk normality test
data: volumen
W = 0.81611, p-value = 0.0004272
Conclusion:
According to the p-values of the Kolmogorov Smirnov (0.2112) and Anderson Darling (0.1501) tests, the hypothesis that the filling volume follows a normal distribution is not rejected, since these values are greater than the significance level used (0.05). However, the p-value of the Shapiro-Wilk test (0.00043) is lower than the significance level. With this test the null hypothesis would be rejected and it would be concluded, with a confidence level of 95%, that the data do not follow a normal distribution, which agrees with what was observed in the exploratory analysis.
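The decision rule used in the conclusion (reject H0 when the p-value is below the significance level) can be written out explicitly. The sketch below uses only base R, so the Anderson-Darling test from goftest is left out. It recomputes the mean and the maximum-likelihood standard deviation by hand instead of calling fitdistr, and the names `alpha`, `p_values` and `decision` are illustrative.

```r
volumen <- c(8.39,12.14,11.80,12.04,7.34,12.62,11.51,12.47,11.08,14.32,
             11.33,11.56,12.79,11.72,12.84,11.73,12.14,11.88,11.95,10.84,
             11.79,13.21,12.56,12.55,12.80)
alpha <- 0.05

mu    <- mean(volumen)
sigma <- sqrt(mean((volumen - mu)^2))   # ML estimate (divides by n, not n - 1)

p_values <- c(
  # the sample contains a tie, so ks.test may warn; the warning is silenced here
  ks = suppressWarnings(ks.test(volumen, "pnorm", mean = mu, sd = sigma))$p.value,
  sw = shapiro.test(volumen)$p.value
)

decision <- ifelse(p_values < alpha, "reject H0", "do not reject H0")
decision
```

The two tests disagree on this sample, which is why the exploratory plots are worth checking before settling on a distribution.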
Workshop_4
To take action on the current filling process, the quality department takes 15 samples, each of size 8, and records the number of bottles that do not meet the volume specifications stipulated by the department. These data are given in the rechazadas vector. However, the department finds it more useful to analyse the proportion of bottles that do not meet specifications, and wants to verify that this variable does indeed follow a normal distribution using a significance level of 0.01.
A hypothesis test can be formulated as follows:
H0: the proportion of bottles rejected follows a normal distribution.
H1: the proportion of bottles rejected does not follow a normal distribution.
rechazadas<-c(5,1,3,3,1,4,2,2,6,4,2,3,4,3,5)
proporcion<-rechazadas/8
proporcion
Conceptualisation: Goodness-of-fit-tests – Chi-square
In a bank, the number of people entering per hour was recorded over 20 hours. The data are stored in the usuarios vector.
usuarios<-c(50,42,60,39,44,54,48,44,43,50,66,62,43,50,45,43,47,46,52,55)
H0: The number of people entering the bank per hour follows a Poisson distribution.
H1: The number of people entering the bank per hour does not follow a Poisson distribution.
par(mfrow=c(1,2))
hist(usuarios, xlab = "Usuarios", ylab = "Frecuencia", las=1, main = "", col = "gray")
plot(density(usuarios), xlab = "Usuarios", ylab = "Densidad", las=1, main = "")
Steps
1. Load package (vcd)
2. Call goodfit function
require(vcd)
gf<-goodfit(usuarios, type = "poisson", method = "MinChisq")
gf$par
summary(gf)
Results:
$lambda
[1] 49.63808
Goodness-of-fit test for poisson distribution
        X^2      df P(> X^2)
Pearson 25.89737 65 0.9999964
Conclusion:
Arrival rate: 49.64 persons per hour
Since 0.999 > 0.05, the hypothesis that the data follow a Poisson distribution is not rejected.
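As a quick cross-check on the fitted rate, the maximum-likelihood estimate of a Poisson rate is simply the sample mean, and a Poisson variable has equal mean and variance. The sketch below computes both with base R; the names `lambda_ml` and `dispersion` are illustrative. The value reported by goodfit was obtained with the minimum chi-square method, so it does not have to coincide with the sample mean.

```r
usuarios <- c(50,42,60,39,44,54,48,44,43,50,66,62,43,50,45,43,47,46,52,55)

lambda_ml  <- mean(usuarios)              # ML estimate of the arrival rate
dispersion <- var(usuarios) / lambda_ml   # close to 1 is consistent with a Poisson model

lambda_ml    # 49.15 persons per hour
dispersion
```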
Workshop_5
To improve customer service, the bank's staff also measured the time from when a user enters the bank until he or she leaves, with the aim of making service times more efficient and reducing the queues that form in the bank. To do this, 20 people were randomly selected. The times (in minutes) associated with each one are stored in the tiempo vector.
tiempo<-c(9.2,10.5,10.8,12.3,14.8,15.6,16.1,19.5,22.4,24.7,25.3,26.5,29.9,30.3,30.7,35.6,46.5,50.4,68.2,90.6)
Verify the following hypotheses and determine which distribution function has the better fit.
H0: the time a user spends in the bank follows an exponential distribution.
H1: the time a user spends in the bank does not follow an exponential distribution.
H0: the time a user spends in the bank follows a Weibull distribution.
H1: the time a user spends in the bank does not follow a Weibull distribution.
Workshop Final Project
Markov Chains
A Markov chain is a probabilistic model describing a system that changes from state to state, in which the probability of the system being in a certain state at a certain time step depends only on the state at the preceding time step. The probability that state j is the next state of the chain, given that the current state is state i, is called the transition probability from i to j.
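The definition above can be illustrated with a small transition matrix. The two states and the probabilities below are hypothetical, chosen only for the example. Row i of `P` holds the transition probabilities from state i, so each row adds up to 1, and `P %*% P` gives the two-step transition probabilities.

```r
states <- c("A", "B")                       # hypothetical states
P <- matrix(c(0.9, 0.1,
              0.5, 0.5),
            nrow = 2, byrow = TRUE,
            dimnames = list(states, states))

rowSums(P)        # each row is a probability distribution
P2 <- P %*% P     # two-step transition probabilities

# Simulate a short path: the next state depends only on the current one
set.seed(1)       # arbitrary seed
n_steps <- 10
path <- character(n_steps)
path[1] <- "A"
for (t in 2:n_steps) {
  path[t] <- sample(states, 1, prob = P[path[t - 1], ])
}
path
```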
Additional Reading:
https://ciencias.medellin.unal.edu.co/cursos/algebra-lineal/clases/8-clases/25-clase-23-aplicaciones-cadenas-de-markov.html
Thanks!
