
Probability distributions

Random outcomes (variables)


• Random outcomes (variables) are results of
trials or observations that have a numeric
value, but take only one value at a time.
• During each trial or observation a different
random outcome can occur based on
random factors.
Possible outcomes:
– Discrete: certain numbers
– Continuous: any value in an interval
Discrete and continuous random
outcomes
• A discrete random outcome is one for which
the number of possible outcomes can be
counted; each possible outcome has a
measurable probability.
• A continuous random outcome is one for
which the number of possible outcomes is
infinite, even if bounds exist.
Statistics for random outcomes

Mean, variance, moments, skewness, kurtosis
Expected value
• Expected value is the average of all possible
random outcomes.
• The expected value of a random variable is
the long-run average value of repetitions of
the experiment.

EX  M  X 
or or

[mŷː]
(Greek alphabet)
Expected value for discrete random
outcomes
• Sum of the products of all xᵢ and pᵢ

$E(X) = \mu = \sum_{i=1}^{n} x_i p_i$
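As a short sketch (not from the slides), the discrete expected value is just the sum of outcome-probability products; a fair six-sided die is used as a hypothetical example:

```python
# Expected value of a discrete random outcome: E(X) = sum of x_i * p_i.
# A fair six-sided die is a hypothetical illustration.

def expected_value(outcomes, probs):
    return sum(x * p for x, p in zip(outcomes, probs))

die_faces = [1, 2, 3, 4, 5, 6]
die_probs = [1 / 6] * 6
print(expected_value(die_faces, die_probs))  # ≈ 3.5
```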
Expected value for continuous random
outcomes

$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Median
• Me divides the distribution in half.
• The median is the random outcome value X
for which the probability of the outcome
being lower is equal to the probability of
the outcome being larger.
• The median is the midpoint of all
observations when they are arranged in
ascending or descending order.
Median for discrete random outcomes

• For discrete random outcomes Me = xₖ, where
k is the first index at which the cumulative
sum $\sum_{i=1}^{k} p_i$ reaches 0.5.


Median for continuous random
outcomes

$\int_{-\infty}^{Me} f(x)\,dx = \int_{Me}^{\infty} f(x)\,dx = 0.5$
Mode
• Mo is the most probable value of outcome X –
the outcome with the highest probability.
• For discrete random outcomes Mo = xᵢ,
where pᵢ = max.
• For continuous random outcomes Mo = xᵢ,
if f(xᵢ) = max.
Variance
• Variance of a distribution D(X) measures the
average deviation of X from M(X) in squared
units of X.

$Var(X) = E[(X - \mu)^2]$
Variance
• Discrete distributions

$Var(X) = \sum_{i=1}^{n} (x_i - \mu)^2 p_i$

• Continuous distributions

$Var(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
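A minimal sketch of the discrete variance formula above, reusing the fair-die illustration (hypothetical numbers, not from the slides):

```python
# Variance of a discrete distribution: Var(X) = sum of (x_i - mu)^2 * p_i.

def variance(outcomes, probs):
    mu = sum(x * p for x, p in zip(outcomes, probs))  # E(X)
    return sum((x - mu) ** 2 * p for x, p in zip(outcomes, probs))

# Fair six-sided die: Var(X) = 35/12.
print(variance([1, 2, 3, 4, 5, 6], [1 / 6] * 6))  # ≈ 2.9167
```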
Standard deviation
• σ(X) is the square root of the variance – the
average deviation of X from M(X) in the
units of X.

$\sigma(X) = \sqrt{Var(X)}$

σ – “sigma” (Greek alphabet)
Moments
• Moments

$M_k = \sum_{i=1}^{n} x_i^k p_i$

• Central moments

$m_k = \sum_{i=1}^{n} (x_i - M(X))^k p_i$
Skewness and kurtosis statistics
• Structural skewness statistic (approximate)

$A = \dfrac{\mu - Me}{\mu - Mo}$

• Skewness coefficient (more precise)

$K_3 = \dfrac{m_3}{\sigma^3(X)}$
Skewness and kurtosis statistics
• K3 = 0 – symmetric distribution
• K3 < 0 – negative skew
• K3 > 0 – positive skew
Skewness and kurtosis statistics
• Kurtosis coefficient – concentration of
outcomes near the average.

$E = \dfrac{m_4}{\sigma^4(X)} - 3$
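The central-moment formulas above translate directly into code; this sketch (an illustration, not from the slides) computes the skewness and kurtosis coefficients for any discrete distribution:

```python
# Central moments m_k, skewness K3 = m3/sigma^3, kurtosis E = m4/sigma^4 - 3.
import math

def central_moment(outcomes, probs, k):
    mu = sum(x * p for x, p in zip(outcomes, probs))
    return sum((x - mu) ** k * p for x, p in zip(outcomes, probs))

def skewness(outcomes, probs):
    sigma = math.sqrt(central_moment(outcomes, probs, 2))
    return central_moment(outcomes, probs, 3) / sigma ** 3

def kurtosis(outcomes, probs):
    sigma = math.sqrt(central_moment(outcomes, probs, 2))
    return central_moment(outcomes, probs, 4) / sigma ** 4 - 3

# A symmetric distribution has K3 = 0:
print(skewness([1, 2, 3], [0.25, 0.5, 0.25]))  # 0.0
```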
Common probability
distributions
Discrete probability distributions

Uniform distribution
Binomial distribution
Geometric distribution
Poisson distribution
Uniform distribution
• Random outcome x can take values from 1 to n
and is distributed uniformly, if

$P(m) = \dfrac{1}{n}$
Geometric distribution
• A geometric distribution is observed when
repeating independent trials until event A
(with constant probability p) occurs. The
random outcome is the number of trials.

$P_n(k) = q^{k-1} p$
Geometric distribution
• Cumulative function

Pn k   1  q k
Geometric distribution
• Expected value

$E(X) = \dfrac{1}{p}$

• Variance

$D(X) = \dfrac{q}{p^2}$
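The geometric formulas above can be checked with a few lines of code; p = 0.5 here is a hypothetical value chosen for clean numbers:

```python
# Geometric distribution: P(k) = q^(k-1) * p, F(k) = 1 - q^k,
# E(X) = 1/p, D(X) = q/p^2. p = 0.5 is a hypothetical value.
p = 0.5
q = 1 - p

def geom_pmf(k):
    return q ** (k - 1) * p

def geom_cdf(k):
    return 1 - q ** k

print(geom_pmf(1), geom_pmf(2), geom_pmf(3))  # 0.5 0.25 0.125
print(geom_cdf(3))                            # 0.875
print(1 / p, q / p ** 2)                      # E(X) = 2.0, D(X) = 2.0
```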
Binomial distribution
• Binomial distribution is the distribution of
the number m of events occurring during n
trials, based on the Bernoulli formula.

$P_n(m) = C_n^m p^m q^{n-m}$
Binomial distribution
• Expected value

$E(X) = np$

• Variance

$D(X) = npq$
Binomial distribution
• Skewness coefficient

$K_3 = \dfrac{q - p}{\sqrt{npq}}$

• Kurtosis coefficient

$E = \dfrac{1 - 6pq}{npq}$
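A sketch of the Bernoulli formula above using the standard-library binomial coefficient; n and p are illustrative values, not from the slides:

```python
# Binomial probability P_n(m) = C(n,m) * p^m * q^(n-m) via math.comb,
# with E(X) = np and D(X) = npq. n and p are illustrative.
import math

def binom_pmf(n, m, p):
    return math.comb(n, m) * p ** m * (1 - p) ** (n - m)

n, p = 10, 0.5
print(binom_pmf(n, 5, p))                             # 0.24609375
print(n * p, n * p * (1 - p))                         # E(X) = 5.0, D(X) = 2.5
print(sum(binom_pmf(n, m, p) for m in range(n + 1)))  # ≈ 1.0
```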
Poisson distribution
• If the number of trials is large and the
probability is small, then the probability of
m events occurring during n trials can be
calculated by the Poisson distribution
formula.

$n \to \infty, \quad p \to 0$
Poisson distribution formula

$\lambda = np$

λ – “lambda” (Greek alphabet)
“Lambda” – expected value
Poisson distribution formula
$P_n(m) = \dfrac{\lambda^m}{m!}\, e^{-\lambda}$
Calculated values in table
“Poisson distribution probabilities”
(estudijas.lu.lv)
Poisson distribution
• Expected value

$E(X) = \lambda$

• Variance

$D(X) = \lambda$
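The slide's claim that the Poisson formula approximates the binomial for large n and small p can be sketched directly; n = 1000 and p = 0.002 are illustrative numbers:

```python
# Poisson pmf P(m) = lambda^m / m! * e^(-lambda), lambda = n*p, compared
# with the exact binomial for large n and small p (illustrative numbers).
import math

def poisson_pmf(m, lam):
    return lam ** m / math.factorial(m) * math.exp(-lam)

def binom_pmf(n, m, p):
    return math.comb(n, m) * p ** m * (1 - p) ** (n - m)

n, p = 1000, 0.002
lam = n * p  # lambda = 2.0
for m in range(4):
    print(m, round(binom_pmf(n, m, p), 5), round(poisson_pmf(m, lam), 5))
```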
Poisson distribution
• Skewness coefficient

$K_3 = \dfrac{1}{\sqrt{\lambda}}$

• Kurtosis coefficient

$E = \dfrac{1}{\lambda}$
Continuous distributions

Uniform distribution
Normal distribution
Laplace formula
Log-normal distribution
Uniform distribution
• Continuous random outcome X is uniformly
distributed in a range [a, b], if the differential
distribution function f(x) is constant in this
range.

$f(x) = \begin{cases} 0, & x < a \\ \dfrac{1}{b-a}, & a \le x \le b \\ 0, & x > b \end{cases}$
Uniform distribution
• Probability of the random outcome being in
the range [a, b] is:

$P(a \le X \le b) = \int_a^b f(x)\,dx = 1$
Uniform distribution
• Integral distribution function F(x) in the
range [a, b] is:

$F(x) = \begin{cases} 0, & x < a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ 1, & x > b \end{cases}$
Uniform distribution
• Probability that a random outcome will fall
in the subrange [α, β] is:

$P(\alpha \le X \le \beta) = \dfrac{\beta - \alpha}{b - a}$
Uniform distribution
• Statistics
– Expected value and median

$E(X) = Me = \dfrac{a+b}{2}$
Uniform distribution
– Variance

$D(X) = \dfrac{(b-a)^2}{12}$

– Standard deviation

$\sigma(X) = \dfrac{b-a}{2\sqrt{3}}$
Uniform distribution
– 4th central moment

$m_4 = \dfrac{(b-a)^4}{80}$

– Skewness coefficient

$K_3 = 0$

– Kurtosis coefficient

$E = -1.2$
Uniform distribution
• Application
– Random number generation
• Gambling
• Monte-Carlo and other simulations
• Quality control
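The Monte-Carlo application mentioned above can be sketched with uniformly distributed random numbers; estimating π from points in the unit square is a classic textbook illustration, not an example from the slides:

```python
# Monte-Carlo use of the continuous uniform distribution: estimate pi from
# uniformly random points in the unit square (fraction inside the quarter
# circle is pi/4).
import random

random.seed(42)  # fixed seed so the run is reproducible
n = 100_000
inside = sum(
    1 for _ in range(n)
    if random.uniform(0, 1) ** 2 + random.uniform(0, 1) ** 2 <= 1
)
print(4 * inside / n)  # ≈ 3.14
```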
Normal (Gaussian) distribution
Examples of approximately normal distributions (figures):
• Wine consumption per capita at legal drinking age, 2013
• Income distribution in Canada, China and France (log scale)
• Income distribution in Australian states, 2011
• S&P 500 monthly price change distribution, 1950–2009
• Temperature anomaly distribution relative to average temperature, 1951–2011
Normal distribution
• Differential function f(x) for random
outcome X:

$f(x) = \dfrac{1}{\sigma(X)\sqrt{2\pi}}\, e^{-\dfrac{(x - M(X))^2}{2\sigma^2(X)}}$
Normal distribution
• Since all possible outcomes of a random
event are unknown, the normal distribution
parameters M(x) and σ(x) can’t be assessed
precisely. They are substituted with results
of trials or observations, using the sample
average x̄ and sample standard deviation s.
Normal distribution
• Standardized values of random outcome X

$t = \dfrac{x - \bar{x}}{s}$
Normal distribution
• Point probability – differential function of
the normal distribution: P(t) = “table” / s
• Interval probability – integral function of
the normal distribution: P(t) = “table” or
“three-sigma rule”
Differential function of the normal
distribution
• Table value divided by the standard deviation:

$f(t) = \dfrac{1}{s\sqrt{2\pi}}\, e^{-\dfrac{t^2}{2}}$
Differential function of the normal
distribution
• Some properties:

$f(t) = f(-t)$
$Me = Mo = \bar{x}$
Parameters of the normal distribution

• If the average changes, the form of the curve
stays the same, but it moves along the x axis.
• If the standard deviation changes, the form
of the curve changes. Higher s means a
flatter curve.
Integral function of the normal
distribution
$F(t)$
Integral function of the normal
distribution
$\Psi(t)$
Ψ – “psi” (Greek alphabet)
Integral function of the normal
distribution
$\Phi(t)$
Φ – “phi” (Greek alphabet)
Integral function of the normal
distribution
$\Psi(-t) = -\Psi(t)$
$\Phi(t) = 2\Psi(t)$
$F(-t) = 1 - F(t)$
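As a sketch (using the standard-library error function rather than printed tables), the integral function of the standard normal distribution and its symmetry property can be computed directly:

```python
# Integral (cumulative) function of the standard normal distribution via
# math.erf, illustrating the symmetry F(-t) = 1 - F(t).
import math

def norm_cdf(t):
    """F(t) = P(T <= t) for a standard normal random outcome T."""
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

print(round(norm_cdf(0), 4))                      # 0.5
print(round(norm_cdf(1.96), 4))                   # 0.975
print(round(norm_cdf(-1.5) + norm_cdf(1.5), 10))  # 1.0
```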
Local Laplace formula
• If event A has a constant probability during
independent trials, probability Pₙ(m) can be
calculated with the Bernoulli formula.

$M(X) = np$
$\sigma(X) = \sqrt{npq}$
Local Laplace formula

$t = \dfrac{m - np}{\sqrt{npq}}$
Local Laplace formula
• Integral form, for a range of outcomes
[m₁, m₂]:

$t_1 = \dfrac{m_1 - np}{\sqrt{npq}} \qquad t_2 = \dfrac{m_2 - np}{\sqrt{npq}}$
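The integral Laplace formula above amounts to approximating a binomial probability by the normal distribution; this sketch compares the approximation with the exact binomial sum (n, p, m₁, m₂ are illustrative values):

```python
# Integral Laplace approximation: P(m1 <= m <= m2) for a binomial outcome,
# using t = (m - np)/sqrt(npq) and the normal integral function.
import math

def norm_cdf(t):
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

n, p = 400, 0.5
q = 1 - p
m1, m2 = 190, 210
sigma = math.sqrt(n * p * q)  # sqrt(npq) = 10
t1 = (m1 - n * p) / sigma     # -1.0
t2 = (m2 - n * p) / sigma     # 1.0
approx = norm_cdf(t2) - norm_cdf(t1)

# Exact binomial sum for comparison:
exact = sum(math.comb(n, m) * p ** m * q ** (n - m) for m in range(m1, m2 + 1))
print(round(approx, 3), round(exact, 3))  # the two values are close
```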
Log-normal distribution
• Many real-life events are distributed similarly
to the normal distribution, but with a strong
skewness. Applying normal distribution laws
to them results in errors.
• Errors can be lessened by using the
log-normal distribution.
Log-normal distribution
$f(x) = \dfrac{1}{x\, S_{\ln x}\sqrt{2\pi}}\, e^{-\dfrac{(\ln x - \overline{\ln x})^2}{2 S_{\ln x}^2}}$
Log-normal distribution

(Figure: log-normal density curves for S = 0.2, S = 0.5 and S = 1.)
Other distributions
• Chi-squared
• Gamma distribution
• Pareto distribution
• Exponential distribution
• etc.
Law of large numbers
• A large number of random factors leads to
results that are almost independent of
randomness.
Law of large numbers
• Chebyshev theorem:
With a sufficiently large sample there will
be a very high probability that the average
of the observations will be close to the
expected value.
Law of large numbers
• Bernoulli theorem:
If during each of n independent trials the
probability of event A is constant, then at a
large number of trials (n→∞) there is a very
high probability that the relative frequency
will be close to the probability.
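The two theorems above can be illustrated by simulation: the running average of die rolls drifts toward the expected value 3.5 as the number of trials grows (an illustrative sketch, not from the slides):

```python
# Law of large numbers: the average of fair-die rolls approaches E(X) = 3.5
# as the number of trials grows.
import random

random.seed(0)  # fixed seed for a reproducible run
for n in (100, 10_000, 1_000_000):
    rolls = [random.randint(1, 6) for _ in range(n)]
    print(n, sum(rolls) / n)  # drifts toward 3.5
```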
• 2.34. A manufacturing plant produces components with a normally distributed
size. The ordered size is 10 mm; the standard deviation of the plant is 0.1 mm.
Find the probability that a random component will have a size:
1) between 9.95 mm and 10.05 mm;
2) between 9.94 mm and 10.02 mm;
3) between 10.01 mm and 10.08 mm;
4) between 9.93 mm and 9.99 mm.
• 2.39. A quality control department is doing test measurements. The error of
measurement x is normally distributed. Previous measurements indicate that
the standard deviation of measurement errors is 10 mm and the device is
usually correct. What is the probability that the measurement will be done
with an error:
1) between 5 mm and 10 mm;
2) of more than 15 mm?
• 2.41. The average harvest of wheat in a region is 135 cnt/ha, S = 20 cnt/ha.
The harvest is normally distributed. What is the probability that a random
farm has a harvest:
1) precisely 135 cnt/ha;
2) precisely 200 cnt/ha;
3) between 120 and 145 cnt/ha;
4) between 135 and 150 cnt/ha;
5) between 100 and 120 cnt/ha.
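A sketch of how part 1 of exercise 2.34 could be solved in code, using the standardized value t = (x − mean)/s and the normal integral function via `math.erf` (instead of a printed table):

```python
# Exercise 2.34, part 1: X ~ N(10, 0.1); find P(9.95 <= X <= 10.05).
import math

def norm_cdf(t):
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

mean, s = 10.0, 0.1
t1 = (9.95 - mean) / s   # -0.5
t2 = (10.05 - mean) / s  # 0.5
print(round(norm_cdf(t2) - norm_cdf(t1), 4))  # 0.3829
```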
