Topics/weeks (weeks 1 to 16)

• Introduction of S&M *
• Introduction to probability and statistics ** ***
• Generation of random numbers and Monte Carlo
• Markov Simulation and Discrete Event Modelling
• Beer Game
• System Dynamics Modelling
• System Dynamics Modelling
• Agent Based Modelling

Milestones: First Review, Second Review, Dissertation, Delivery, Final review

*Quiz_1 (Concepts)
**Workshop_2 Introduction to Simulation
***Workshop_3 Probability

Note: It is important in these classes that you attend and share knowledge with classmates.

Outline

Introduction to probability and statistics

• Conceptualisation: Randomness and probability
• Markov Models
• Applications
Introduction to probability and statistics
Conceptualisation: Randomness and Probability

Remember: Simulation is a numerical technique to reproduce phenomena in different scenarios.

Example: A school riddle about an animal that falls into a 30-metre-deep well at 6 a.m. It climbs 3 metres during the day, but at night (6 p.m. – 6 a.m.) it slips back 2 metres. When does it reach the surface?
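
The riddle can be worked through as a small step-by-step simulation. The sketch below is only an illustration in R; the variable names are assumptions and are not part of the original slides. It shows why the naive answer of 30 days (a net gain of 1 metre per day) is wrong: the animal gets out during a daytime climb, before the next slip.

```r
# Well riddle: 30 m deep, climbs 3 m by day, slips back 2 m by night.
depth <- 30   # metres
climb <- 3    # metres gained between 6 a.m. and 6 p.m.
slip  <- 2    # metres lost between 6 p.m. and 6 a.m.

height <- 0   # metres climbed so far
day    <- 0
repeat {
  day    <- day + 1
  height <- height + climb        # daytime climb
  if (height >= depth) break      # out of the well before the night-time slip
  height <- height - slip         # night-time slip
}
day   # 28
```

The animal reaches the surface during the daytime climb of day 28.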

Workshop_1

The trip from city A to city B takes 2 hours, while the trip in the opposite direction takes 2.5 hours. A service must depart from each city every hour, on the hour (from 6:00 to 16:00). The driver of the vehicle must rest for half an hour after finishing a trip.

Use simulation to obtain the minimum number of vehicles needed to meet this plan.

Vehicle   Departs A   Arrives B   Departs B   Arrives A   Available at (site)   Available at (time)
1         6           8           -           -           B                     8.5
2         -           -           6           8.5         A                     9
3         7           9           -           -           B                     9.5
4         -           -           7           9.5         A                     10
5         8           10          -           -           B                     10.5
6         -           -           8           10.5        A                     11
2         9           11          -           -           B                     11.5
1         -           -           9           11.5        A                     12
4         10          12          -           -           B                     12.5
3         -           -           10          12.5        A                     13
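
The manual table can also be produced by a short simulation. The following R sketch is one possible approach, not the official workshop solution. It assumes a simple rule: at each departure, reuse a vehicle that is already rested at that city, and add a new vehicle only when none is available. All variable names are illustrative.

```r
trip_time <- c(A = 2, B = 2.5)   # hours of travel, indexed by city of departure
rest      <- 0.5                 # hours of rest after each trip

site    <- character(0)   # city where each vehicle will next be available
free_at <- numeric(0)     # time at which each vehicle is available again

for (hour in 6:16) {                       # one departure per hour, on the hour
  for (origin in c("A", "B")) {            # ... from each city
    destination <- if (origin == "A") "B" else "A"
    ready <- which(site == origin & free_at <= hour)
    if (length(ready) == 0) {
      # No rested vehicle at this city: a new vehicle is needed
      site    <- c(site, destination)
      free_at <- c(free_at, hour + trip_time[[origin]] + rest)
    } else {
      # Reuse the first rested vehicle waiting at this city
      v          <- ready[1]
      site[v]    <- destination
      free_at[v] <- hour + trip_time[[origin]] + rest
    }
  }
}

n_vehicles <- length(site)
n_vehicles   # 6
```

Under these assumptions the plan needs six vehicles, which agrees with the table: vehicles 1 to 6 are introduced in the first three hours and are reused from 9:00 onwards.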

In simulation experiments, values of random variables must be generated so that they follow a given probability distribution. The random numbers used for this should have the following characteristics:

• Uniformly distributed
• Statistically independent
• Reproducible
• Long period (no repeats within a given length of the sequence)
• Generated by a fast method
• Generated by a method that does not require much storage capacity

Additional Material:

https://www.youtube.com/watch?v=5Lwf1BWgMow

Random Number Generation Methods

1. Middle-square method
2. Congruential methods
3. Lagged (shift) register method

[Seed - Algorithm - Validation]

P1: Get the seed (initial value)
P2: Apply the recursive algorithm
P3: Validate the generated data set (randomness tests)
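
As an illustration of the validation step (P3), the sketch below checks a sequence for uniformity and independence. It is a minimal example rather than a complete test battery. The seed, the sample size and the choice of ten classes are assumptions, and R's built-in runif() stands in for whichever generator is being validated.

```r
set.seed(2024)        # hypothetical seed, used only to make the run repeatable
u <- runif(1000)      # stand-in for the sequence produced in steps P1 and P2

# Uniformity: chi-square test on 10 equal-width classes
classes    <- cut(u, breaks = seq(0, 1, by = 0.1), include.lowest = TRUE)
observed   <- table(classes)
uniformity <- chisq.test(observed)   # equal expected frequencies by default

# Independence: the lag-1 autocorrelation should be close to zero
lag1 <- cor(u[-length(u)], u[-1])

uniformity$p.value
lag1
```

A large p-value and a lag-1 correlation near zero give no evidence against uniformity and independence. They do not prove that the generator is good.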

1. Middle-square method

Example: seed 445. Square the current value and take the middle four digits as the next value.

X       X²            Random number
445     1|9802|5      0.9802
9802    96|0792|04    0.0792
792     6|2726|4      0.2726
2726    ...           ...
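
The example can be reproduced with a few lines of R. This is a sketch under one stated assumption: when the square has an odd number of digits, a leading zero is added so that a middle block of four digits exists. The slides do not say how that case is handled, and the function name middle_square is illustrative.

```r
middle_square <- function(seed, n) {
  x <- seed
  u <- numeric(n)
  for (i in seq_len(n)) {
    sq <- sprintf("%.0f", x^2)                      # square as a digit string
    if (nchar(sq) %% 2 == 1) sq <- paste0("0", sq)  # assumption: pad odd lengths
    start <- (nchar(sq) - 4) / 2 + 1                # first of the 4 middle digits
    x <- as.numeric(substr(sq, start, start + 3))   # next value of the sequence
    u[i] <- x / 10000                               # scale to [0, 1)
  }
  u
}

middle_square(445, 3)   # 0.9802 0.0792 0.2726
```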

Workshop_2

Generate random numbers with the following seeds using the middle-square method:

a. 3892
b. 8364
c. 2749
d. 5357
e. 7600
f. 4583
g. 5555

2. Congruential Methods
Example in Excel

The recurrence is:

$X_{n+1} = (a X_n + b) \bmod m$, with $U_n = X_n / m$

where:

a: multiplier
b: increment
m: modulus
X_0: seed (initial value)
U_n: n-th random number, in [0, 1)
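
The same recurrence, X_{n+1} = (a X_n + b) mod m with U_n = X_n / m, can be sketched in R instead of Excel. The function name lcg and the small parameters below (seed 7, a = 5, b = 3, m = 16) are hypothetical and were chosen only so the arithmetic is easy to check by hand.

```r
lcg <- function(seed, a, b, m, n) {
  x    <- numeric(n)
  prev <- seed
  for (i in seq_len(n)) {
    prev <- (a * prev + b) %% m   # X_{n+1} = (a * X_n + b) mod m
    x[i] <- prev
  }
  list(x = x, u = x / m)          # u holds the numbers scaled to [0, 1)
}

out <- lcg(seed = 7, a = 5, b = 3, m = 16, n = 4)
out$x   # 6 1 8 11
out$u   # 0.3750 0.0625 0.5000 0.6875
```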

Workshop_3

Generate random numbers with the following seed using the congruential method:

1. Seed: 22
a: 33
b: 9
m: 444
2. Seed: 4364
a: 45
b: 34
m: 3146

Digital Random Generation

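# Draw 10 values from 1 to 30 without replacement
sample(1:30, 10, replace = FALSE)
[1]  4 27  3 19 29  9 21 13 23 20

# Draw 5 values from a uniform distribution on [3, 4]
runif(5, min = 3, max = 4)
[1] 3.858063 3.425381 3.454964 3.719711 3.287005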
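
Random numbers used in a simulation should be reproducible. In R this is usually done by fixing the seed with set.seed() before drawing. The short sketch below uses an arbitrary seed value (123), which is an assumption, and shows that the same seed gives the same sequence.

```r
set.seed(123)                          # arbitrary seed, chosen for illustration
first_run <- runif(5, min = 3, max = 4)

set.seed(123)                          # resetting the seed restarts the sequence
second_run <- runif(5, min = 3, max = 4)

identical(first_run, second_run)       # TRUE
```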

Types of random variables:
• Continuous
• Discrete

Functions of the random behaviour of a random variable:
• Cumulative (distribution function)
• Point (density function)

Distributions:
• Empirical
• Theoretical
Continuous distributions:
• Uniform
• Exponential
• Gamma
• Weibull
• Normal
• Log-normal
• Beta
• Triangular

Discrete distributions:
• Poisson
• Bernoulli
• Binomial
• Discrete uniform

R names of the distributions:

Continuous distributions:
• Uniform: unif
• Exponential: exp
• Gamma: gamma
• Weibull: weibull
• Normal: norm
• Student's t: t

Discrete distributions:
• Poisson: pois
• Binomial: binom (nbinom is the negative binomial)
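
In R, each of these names is combined with a one-letter prefix: d for the density or probability function, p for the cumulative distribution function, q for the quantile function, and r for random generation. The values below are illustrative choices, not taken from the slides.

```r
prob_6  <- dbinom(6, size = 10, prob = 0.6)     # P(X = 6) for X ~ B(10, 0.6)
cum_6   <- pbinom(6, size = 10, prob = 0.6)     # P(X <= 6)
q_975   <- qnorm(0.975, mean = 170, sd = 12)    # 97.5th percentile of N(170, 12)

set.seed(1)                                     # arbitrary seed for repeatability
draws   <- rpois(5, lambda = 3)                 # 5 random Poisson(3) values

# The p function is the running sum of the d function for discrete variables
all.equal(cum_6, sum(dbinom(0:6, size = 10, prob = 0.6)))
```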

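# 15 random draws from a binomial B(10, 0.6)
rbinom(15, 10, 0.6)
# Probability function, with k = 0, ..., 10 on the horizontal axis
plot(0:10, dbinom(0:10, 10, 0.6), type = "h", xlab = "k", ylab = "P(X=k)", main = "Probability function B(10, 0.6)")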

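# Simulate 10000 values from a normal with mean 170 and standard deviation 12
X <- rnorm(10000, 170, 12)
# Density histogram with the theoretical density curve overlaid
hist(X, freq = FALSE, col = "lightsalmon", main = "Histogram", sub = "Simulated data from a N(170, 12)")
curve(dnorm(x, 170, 12), xlim = c(110, 220), col = "blue", lwd = 2, add = TRUE)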
Conceptualisation: Goodness-of-fit-tests

Goodness-of-fit tests compare the observed frequency with the expected frequency in each class.

H0: f(x) = f0(x)

The data fit the considered distribution.

H1: f(x) ≠ f0(x)

The data do not fit the considered distribution.



Continuous distributions:
• Kolmogorov-Smirnov
• Anderson-Darling
• Shapiro-Wilk

Discrete distributions:
• Chi-square


Conceptualisation: Goodness-of-fit-tests – Kolmogorov Smirnov

You need to check whether it is reasonable to assume that the fill volume (in ounces) of a juice vending machine follows a normal distribution, so 25 bottles are drawn at random. The fill volumes obtained from the sample are stored in the volumen vector.

volumen <- c(8.39, 12.14, 11.80, 12.04, 7.34, 12.62, 11.51, 12.47, 11.08, 14.32,
             11.33, 11.56, 12.79, 11.72, 12.84, 11.73, 12.14, 11.88, 11.95, 10.84,
             11.79, 13.21, 12.56, 12.55, 12.80)

H0: the filling volume follows a normal distribution.

H1: the filling volume does not follow a normal distribution.

Significance level: 0.05 (hypothetical).


library(car)  # provides qqPlot()
par(mfrow = c(1, 3))
hist(volumen, xlab = "Fill volume", ylab = "Frequency", las = 1, main = "", col = "gray")
plot(density(volumen), xlab = "Fill volume", ylab = "Density", las = 1, main = "")
qqPlot(volumen, xlab = "Theoretical quantiles", ylab = "Sample quantiles", las = 1, main = "")

Steps

1. Load the package (car)
2. Check the histogram
3. Plot the density
4. Plot the quantiles

library(MASS)  # provides fitdistr()
# Estimate the parameters of the hypothesised distribution
Ajusten <- fitdistr(volumen, "normal")
Ajusten

      mean          sd
  11.8160000    1.3755959
 ( 0.2751192)  ( 0.1945386)

Ksn <- ks.test(volumen, "pnorm", mean = Ajusten$estimate[1], sd = Ajusten$estimate[2])
Ksn

        One-sample Kolmogorov-Smirnov test

data:  volumen
D = 0.21198, p-value = 0.2112
alternative hypothesis: two-sided
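
To see where the statistic D comes from, it can be computed by hand in base R as the largest vertical distance between the empirical distribution function of the sample and the hypothesised normal distribution function. The sketch below is illustrative. It assumes the parameters are the maximum-likelihood estimates (standard deviation with divisor n), which is what fitdistr reports for the normal distribution.

```r
volumen <- c(8.39, 12.14, 11.80, 12.04, 7.34, 12.62, 11.51, 12.47, 11.08, 14.32,
             11.33, 11.56, 12.79, 11.72, 12.84, 11.73, 12.14, 11.88, 11.95, 10.84,
             11.79, 13.21, 12.56, 12.55, 12.80)

x     <- sort(volumen)
n     <- length(x)
mu    <- mean(x)
sigma <- sqrt(mean((x - mu)^2))        # maximum-likelihood sd (divides by n)

Fx <- pnorm(x, mean = mu, sd = sigma)  # hypothesised distribution function

D_plus  <- max(seq_len(n) / n - Fx)        # empirical step above the curve
D_minus <- max(Fx - (seq_len(n) - 1) / n)  # curve above the empirical step
D <- max(D_plus, D_minus)
D
```

The value of D obtained this way should agree with the statistic reported by ks.test for the same mean and standard deviation.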
Conceptualisation: Goodness-of-fit-tests – Anderson-Darling

library(MASS)  # provides fitdistr()
# Estimate the parameters of the hypothesised distribution
Ajusten <- fitdistr(volumen, "normal")
Ajusten

      mean          sd
  11.8160000    1.3755959
 ( 0.2751192)  ( 0.1945386)

library(goftest)  # provides ad.test()
Adn <- ad.test(volumen, "pnorm", mean = Ajusten$estimate[1], sd = Ajusten$estimate[2])
Adn

        Anderson-Darling test of goodness-of-fit
        Null hypothesis: Normal distribution
        with parameters mean = 11.816, sd = 1.3755958708865
        Parameters assumed to be fixed

data:  volumen
An = 1.6222, p-value = 0.1501
Conceptualisation: Goodness-of-fit-tests – Shapiro-Wilk

library(MASS)  # provides fitdistr()
# Estimate the parameters of the hypothesised distribution
Ajusten <- fitdistr(volumen, "normal")
Ajusten

      mean          sd
  11.8160000    1.3755959
 ( 0.2751192)  ( 0.1945386)

Swn <- shapiro.test(volumen)
Swn

        Shapiro-Wilk normality test

data:  volumen
W = 0.81611, p-value = 0.0004272

Conclusion:

According to the p-values of the Kolmogorov-Smirnov (0.2112) and Anderson-Darling (0.1501) tests, the hypothesis that the filling volume follows a normal distribution is not rejected, since these values are greater than the significance level used (0.05). However, for the Shapiro-Wilk test the p-value (0.00043) is lower than the significance level, so the null hypothesis would be rejected and it would be concluded, at a 95% confidence level, that the data do not follow a normal distribution. This agrees with what was observed in the exploratory analysis.
Conceptualisation: Goodness-of-fit-tests
Workshop_4

On the other hand, to take action on the current filling process, the quality department takes 15 samples, each of size 8, and records the number of bottles that do not meet the volume specifications stipulated by the department. These data are given in the rechazadas vector. However, the department finds it more useful to analyse the proportion of bottles that do not meet specifications and wants to verify that this variable does indeed follow a normal distribution, using a significance level of 0.01.

A hypothesis test can be formulated as follows:

H0: the proportion of bottles rejected follows a normal distribution.

H1: the proportion of bottles rejected does not follow a normal distribution.

rechazadas <- c(5, 1, 3, 3, 1, 4, 2, 2, 6, 4, 2, 3, 4, 3, 5)
proporcion <- rechazadas / 8
proporcion
Conceptualisation: Goodness-of-fit-tests – Chi-square

In a bank, 20 hours were analysed, and the number of people who entered the bank in each hour is stored in the usuarios vector.

usuarios<-c(50,42,60,39,44,54,48,44,43,50,66,62,43,50,45,43,47,46,52,55)

H0: The number of people entering the bank per hour follows a Poisson distribution.

H1: The number of people entering the bank per hour does not follow a Poisson distribution.

par(mfrow = c(1, 2))
hist(usuarios, xlab = "Users", ylab = "Frequency", las = 1, main = "", col = "gray")
plot(density(usuarios), xlab = "Users", ylab = "Density", las = 1, main = "")

Steps

1. Load the package (vcd)
2. Call the goodfit function

library(vcd)  # provides goodfit()
gf <- goodfit(usuarios, type = "poisson", method = "MinChisq")
gf$par
summary(gf)

Results:

$lambda
[1] 49.63808

         Goodness-of-fit test for poisson distribution

             X^2 df  P(> X^2)
Pearson 25.89737 65 0.9999964

Conclusion:

Arrival rate: 49.64 persons per hour

Since 0.999 > 0.05, the null hypothesis is not rejected: the data are consistent with a Poisson distribution.
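
The logic of the chi-square test (observed versus expected frequency in each class) can also be followed by hand in base R. The sketch below is illustrative and is not the goodfit procedure. It assumes four classes chosen by hand (up to 43, 44 to 47, 48 to 52, 53 or more) and estimates the rate with the sample mean, so its statistic, degrees of freedom and p-value will differ from the goodfit output.

```r
usuarios <- c(50, 42, 60, 39, 44, 54, 48, 44, 43, 50,
              66, 62, 43, 50, 45, 43, 47, 46, 52, 55)

lambda <- mean(usuarios)                   # estimated arrivals per hour
breaks <- c(-Inf, 43, 47, 52, Inf)         # assumption: hand-picked classes

observed <- as.vector(table(cut(usuarios, breaks)))   # counts per class
p_class  <- diff(ppois(breaks, lambda))               # Poisson class probabilities
expected <- length(usuarios) * p_class

X2      <- sum((observed - expected)^2 / expected)    # Pearson statistic
df      <- length(observed) - 1 - 1                   # one estimated parameter
p_value <- pchisq(X2, df, lower.tail = FALSE)

observed   # 5 5 5 5
p_value
```

If p_value is above the significance level, the Poisson hypothesis is not rejected for this choice of classes.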
Conceptualisation: Goodness-of-fit-tests
Workshop_5
To improve customer service, the bank's customer service staff also measured the time from when a user enters the bank until he or she leaves, with the aim of improving the efficiency of service times and reducing the queues that form in the bank. To do this, 20 people were randomly selected, and the times (in minutes) associated with each one are stored in the tiempo vector.

tiempo<-c(9.2,10.5,10.8,12.3,14.8,15.6,16.1,19.5,22.4,24.7,25.3,26.5,29.9,30.3,30.7,35.6,46.5,50.4,68.2,90.6)

Verify the following hypotheses and determine which distribution function gives the better fit.

H0: the time a user spends in the bank follows an exponential distribution.
H1: the time a user spends in the bank does not follow an exponential distribution.

H0: the time a user spends in the bank follows a Weibull distribution.
H1: the time a user spends in the bank does not follow a Weibull distribution.
Workshop Final Project
Markov Chains

A Markov chain is a probabilistic model describing a system that changes from state to state, in which the probability of the system being in a certain state at a certain time step depends only on the state at the preceding time step. The probability that state j is the next state of the chain, given that the current state is state i, is called the transition probability from i to j.
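
A small numerical example may help. The R sketch below uses a hypothetical two-state chain: the state labels, the transition probabilities and the seed are all assumptions made for illustration. Row i of the matrix holds the transition probabilities from state i, so each row sums to 1.

```r
states <- c("A", "B")                        # hypothetical state labels
P <- matrix(c(0.9, 0.1,
              0.5, 0.5),
            nrow = 2, byrow = TRUE,
            dimnames = list(states, states)) # P[i, j] = probability of i -> j

# Two-step transition probabilities are given by the matrix product
P2 <- P %*% P
P2["A", "A"]   # 0.9 * 0.9 + 0.1 * 0.5 = 0.86

# Simulate 10 steps starting from state "A"
set.seed(42)                                 # arbitrary seed for repeatability
path    <- character(10)
current <- "A"
for (t in seq_along(path)) {
  # The next state depends only on the current state
  current <- sample(states, size = 1, prob = P[current, ])
  path[t] <- current
}
path
```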

Additional Reading:

https://ciencias.medellin.unal.edu.co/cursos/algebra-lineal/clases/8-clases/25-clase-23-aplicaciones-cadenas-de-markov.html
Thanks!
