SB 2024 Lecture5
Distributions
Statistics for Business
Dr. Le Anh Tuan
1
Contents
►Binomial distributions
►Hypergeometric distributions
►Poisson distributions
2
Introduction
► Discrete data
► Values that are whole numbers
► There is space on the number line between any two
possible values
► Examples: number of books in a room, number of correct
answers, number of TVs in a classroom, difference in scores
between sports teams A and B (which can be negative, e.g. −2).
► Continuous data
► Data that can take any value (within a range); there are no
gaps
► Examples: a person's height, a dog's weight, temperature
3
Probability Distributions
4
Random variable
► For a given sample space S of some experiment, a random
variable is a rule that associates a number with each outcome
in the sample space S.
► Notation
► Random variables - usually denoted by uppercase
letters near the end of our alphabet (e.g. X, Y).
► Particular value - use lowercase letters, such as x,
which correspond to the random variable X
5
Types of random variables
►A discrete random variable
►Has outcomes that take on whole numbers
►Takes a finite number of values
6
Discrete Probability Distributions Rules
7
Discrete Probability Distributions Example
4 possible outcomes: TT, TH, HT, HH
Probability Distribution
x Value   Probability
0         1/4 = .25
1         2/4 = .50
2         1/4 = .25
[Bar chart: Probability versus x for x = 0, 1, 2]
8
Discrete Probability Distributions Example
► Our classroom has 6 computers.
► Let X denote the number of these computers that are in use
during the weekend, with possible values {0, 1, 2, …, 6}.
► Suppose that the probability distribution of X is as given in the
following table:
xi    p(xi)
0     0.05
1     0.10
2     0.15
3     0.25
4     0.20
5     0.15
6     0.10
[Bar chart: probability p(x) versus X for X = 0 to 6]
9
What is a PDF or CDF?
► A probability distribution function (PDF) is a mathematical
function that shows the probability of each X-value.
► A cumulative distribution function (CDF) is a mathematical
function that shows the cumulative sum of probabilities, adding
from the smallest to the largest X-value, gradually approaching
unity.
[Charts: PDF, P(X = x), and CDF, P(X ≤ x), each plotted as probability against the value of X from 0 to 6]
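The relationship between the two functions can be sketched in a few lines of Python. This is only an illustration using the computer-usage probabilities from the earlier table; the variable names `pdf` and `cdf` are chosen here and are not part of the lecture.

```python
from itertools import accumulate

# PDF from the computer-usage table: P(X = x) for x = 0..6
pdf = [0.05, 0.10, 0.15, 0.25, 0.20, 0.15, 0.10]

# CDF: running total of the PDF, so cdf[x] = P(X <= x)
cdf = list(accumulate(pdf))

for x, (p, c) in enumerate(zip(pdf, cdf)):
    print(f"x = {x}: P(X = x) = {p:.2f}, P(X <= x) = {c:.2f}")
```

The last CDF value should come out at (approximately) 1, which is the "approaching unity" behaviour described above.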
10
Discrete Probability Distributions Rules
11
The Mean of Discrete Probability
Distributions
► The mean or expected value E(X) of a discrete random variable
is the sum of all X-values weighted by their respective
probabilities.
► E(X) is a measure of central tendency.
► If there are N distinct values of X, then
$E(X) = \mu = \sum_{i=1}^{N} x_i P(x_i)$
12
Calculate mean
► Let X denote the number of these computers that are in use
during the weekend, with possible values {0, 1, 2, …, 6}.
xi      P(xi)   xi*P(xi)
0       0.05
1       0.10
2       0.15
3       0.25
4       0.20
5       0.15
6       0.10
Total   1
[Bar chart: probability p(x) versus X for X = 0 to 6]
The mean (expected) number of computers that are in use during
the weekend, $\sum_{i=1}^{N} x_i P(x_i)$, is ______
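One way to check a hand calculation of the xi*P(xi) column is a short Python sketch like the one below. It assumes only the values in the table above; the names `x_values`, `probs`, and `terms` are illustrative.

```python
# Values and probabilities from the computer-usage table
x_values = [0, 1, 2, 3, 4, 5, 6]
probs = [0.05, 0.10, 0.15, 0.25, 0.20, 0.15, 0.10]

# Each term of the sum is x_i * P(x_i)
terms = [x * p for x, p in zip(x_values, probs)]
mean = sum(terms)

print("x_i * P(x_i):", [round(t, 2) for t in terms])
print("E(X) =", round(mean, 2))
```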
13
The Variances of Discrete Probability
Distributions
► The variance is a measure of the spread of the individual values
around the mean of a data set.
► Variance of a discrete random variable X
$\sigma^2 = V(X) = \sum_{i=1}^{N} (x_i - \mu)^2 P(x_i)$
15
The Variances of Discrete Probability
Distributions
► An equivalent shortcut formula for the variance:
$\sigma^2 = \sum_{i=1}^{N} x_i^2 P(x_i) - \mu^2$
$\sigma = \sqrt{\sigma^2}$
16
Calculate variance
► Let X denote the number of these computers that are in use
during the weekend, with possible values {0, 1, 2, …, 6}.
► μ = 3.3
xi      P(xi)   xi*P(xi)   xi−μ   (xi−μ)^2   (xi−μ)^2 P(xi)
0       0.05    0.00
1       0.10    0.10
2       0.15    0.30
3       0.25    0.75
4       0.20    0.80
5       0.15    0.75
6       0.10    0.60
Total           3.3
The variance, $\sum_{i=1}^{N} (x_i - \mu)^2 P(x_i)$, is 2.61 computers squared
➔ the SD = $\sqrt{2.61}$ = 1.62 computers
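The remaining table columns can be checked with a small Python sketch. It computes the variance both from the definition and from the shortcut formula, assuming the same table values; all variable names are illustrative.

```python
import math

x_values = [0, 1, 2, 3, 4, 5, 6]
probs = [0.05, 0.10, 0.15, 0.25, 0.20, 0.15, 0.10]

mu = sum(x * p for x, p in zip(x_values, probs))

# Definition: probability-weighted squared deviations from the mean
var_definition = sum((x - mu) ** 2 * p for x, p in zip(x_values, probs))

# Shortcut: sum of x^2 * P(x), minus mu^2
var_shortcut = sum(x ** 2 * p for x, p in zip(x_values, probs)) - mu ** 2

sd = math.sqrt(var_definition)
print(round(var_definition, 2), round(var_shortcut, 2), round(sd, 2))
```

Both routes should agree up to floating-point rounding, which is a useful sanity check when filling in the table by hand.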
17
Expected Money Value
► The expected money value (EMV) is the mean of a discrete
probability distribution when the discrete random variable is
expressed in terms of dollars.
► The EMV represents a long-term average, as if outcomes
from the distribution occurred many times.
► Calculate the EMV for the profits from facemasks.
Status               Profit     Probability
Covid-19 Increase    $10,000    0.20
Normal               $4,000     0.50
Covid-19 Decrease    $1,000     0.30
Total
► EMV for the profits from facemasks is
………………………………….………………………………….
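The EMV calculation is the same weighted sum as the mean, applied to dollar amounts. A minimal Python sketch, assuming the three scenarios in the table above (the tuple layout and names are illustrative choices):

```python
# (status, profit in dollars, probability) from the facemask table
scenarios = [
    ("Covid-19 Increase", 10_000, 0.20),
    ("Normal", 4_000, 0.50),
    ("Covid-19 Decrease", 1_000, 0.30),
]

# Probabilities must sum to 1 for a valid distribution
total_prob = sum(prob for _, _, prob in scenarios)

# EMV = sum of profit * probability over all scenarios
emv = sum(profit * prob for _, profit, prob in scenarios)
print(f"EMV = ${emv:,.0f}")
```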
19
Probability Distributions
20
Probability Distributions
► Discrete Probability Distributions: Binomial, Hypergeometric, Poisson
► Continuous Probability Distributions: Uniform, Normal, Exponential
21
Bernoulli Experiments
► A random experiment with only two outcomes is a
Bernoulli experiment.
► Consider only two outcomes: “success” or “failure”
► Let a denote the probability of success
► Let 1 – a be the probability of failure
► Define random variable X:
x = 1 if success, x = 0 if failure
22
Bernoulli Experiments
► The mean is µ = a
$\mu = E(X) = \sum_{x} x\,P(x) = 0\,(1 - a) + 1 \cdot a = a$
► The variance is σ² = a(1 − a)
$\sigma^2 = E[(X - \mu)^2] = \sum_{x} (x - \mu)^2 P(x) = (0 - a)^2 (1 - a) + (1 - a)^2 a = a(1 - a)$
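The two results can be confirmed numerically for any particular success probability. The sketch below uses a = 0.3, a hypothetical value chosen only for illustration.

```python
a = 0.3  # hypothetical probability of success, chosen for illustration

# Bernoulli distribution: P(X = 0) = 1 - a, P(X = 1) = a
dist = {0: 1 - a, 1: a}

mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())

print(mean, variance, a * (1 - a))
```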
23
Binomial Distributions
24
Binomial Distributions
25
Binomial Distributions
26
Binomial Distributions
►When a student is randomly selected (with
replacement), there is a 0.75 probability that this
student knows how to use ChatGPT. Assume that we
want to find the probability that exactly three of four
randomly selected students know how to use ChatGPT.
a. Does this survey result in a binomial
distribution?
b. If yes, identify the values of n, x, p and q
27
Binomial Distributions
►When a student is randomly selected (with
replacement), there is a 0.75 probability that this
student knows how to use ChatGPT. Assume that we
want to find the probability that exactly three of four
randomly selected students know how to use ChatGPT.
❶ The number of trials is fixed (4 observations).
❷ The 4 trials are independent (the answer of one student
is not affected by the answers of the other
students).
❸ Each trial must have all outcomes classified into exactly
two categories: KNOW or DO NOT KNOW.
❹ The probability of a success remains the same in all
trials (0.75)
28
Binomial Distributions
►When a student is randomly selected (with
replacement), there is a 0.75 probability that this
student knows how to use ChatGPT. Assume that we
want to find the probability that exactly three of four
randomly selected students know how to use ChatGPT.
❶ The number of trials is fixed (4 observations).
❷ The 4 trials are independent (the answer of one student
is not affected by the answers of the other
students).
❸ Each trial must have all outcomes classified into exactly
two categories: YES or NO.
❹ The probability of a success remains the same in all
trials (0.75)
29
Binomial Distributions
►When a student is randomly selected (with
replacement), there is a 0.75 probability that this
student knows how to use ChatGPT. Assume that we
want to find the probability that exactly two of four
randomly selected students know how to use ChatGPT.
30
Binomial Distributions
$P(x) = C(n, x)\, p^x q^{n-x} = \frac{n!}{(n-x)!\,x!}\, p^x q^{n-x}$
for x = 0, 1, 2, …, n
n = number of trials
x = number of successes among the n trials
p = probability of success in any one of the n trials
q = 1 − p = probability of failure in any one of the n trials
P(x) = probability of getting exactly x successes
among the n trials
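As a rough illustration, the binomial formula translates directly into Python using `math.comb` for C(n, x). The function name `binomial_pmf` is a label chosen here; the inputs are the n = 4, p = 0.75 values from the ChatGPT example.

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(x) = C(n, x) * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

# ChatGPT example: n = 4 students, p = 0.75
p_two = binomial_pmf(2, 4, 0.75)     # exactly two of four
p_three = binomial_pmf(3, 4, 0.75)   # exactly three of four
print(p_two, p_three)
```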
32
Excel and Megastat
► Excel:
=BINOM.DIST(3,4,0.75,FALSE)
33
Tables
34
Tables
Examples:
n = 10, x = 3, p = 0.35: P(x = 3 | n = 10, p = 0.35) = .2522
n = 10, x = 8, p = 0.45: P(x = 8 | n = 10, p = 0.45) = .0229
35
Questions
►If X is binomially distributed with 6 trials and a
probability of success equal to 0.25 at each attempt,
what is the probability of:
(a) exactly 4 successes
(b) at least one success
(c) fewer than two successes
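One possible way to set up these three questions in Python is sketched below. It is meant as a check on hand or table calculations, not a substitute for them; note how "at least one" is handled through the complement of "none". The helper name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 0.25

exactly_four = binomial_pmf(4, n, p)                              # (a) P(X = 4)
at_least_one = 1 - binomial_pmf(0, n, p)                          # (b) 1 - P(X = 0)
fewer_than_two = binomial_pmf(0, n, p) + binomial_pmf(1, n, p)    # (c) P(X = 0) + P(X = 1)

print(round(exactly_four, 4), round(at_least_one, 4), round(fewer_than_two, 4))
```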
36
Binomial Distribution Mean and Variance
► For binomial distributions:
►Mean: $\mu = np$
►Variance: $\sigma^2 = np(1 - p)$
►SD: $\sigma = \sqrt{npq}$
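These formulas should agree with the general definitions of mean and variance applied to the binomial probabilities. A small sketch, assuming n = 4 and p = 0.75 as in the ChatGPT example:

```python
from math import comb, sqrt

n, p = 4, 0.75   # values from the ChatGPT example
q = 1 - p

pmf = [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]

# From the general definitions for a discrete distribution
mean_from_pmf = sum(x * px for x, px in enumerate(pmf))
var_from_pmf = sum((x - mean_from_pmf) ** 2 * px for x, px in enumerate(pmf))

# From the binomial formulas
mean = n * p
variance = n * p * q
sd = sqrt(variance)

print(mean_from_pmf, mean, var_from_pmf, variance, round(sd, 3))
```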
38
Binomial Distribution
39
Binomial Distribution Shape
► A binomial distribution
► skewed right if p < 0.50
► skewed left if p > 0.50
► symmetric only if p = 0.50
► However, skewness decreases as n increases, regardless of the
value of p.
► Notice that p = 0.20 and p = 0.80 have the same shape, except
reversed from left to right.
► This is true for any values of p and q=1 − p.
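The mirror-image property can be illustrated with a quick check for n = 6. The helper `binomial_dist` is a name introduced here for the sketch.

```python
from math import comb

def binomial_dist(n, p):
    """List of P(X = x) for x = 0..n."""
    return [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

low = binomial_dist(6, 0.20)
high = binomial_dist(6, 0.80)

# Reversing the p = 0.80 distribution should reproduce the p = 0.20 one
mirrored = all(abs(a - b) < 1e-12 for a, b in zip(low, reversed(high)))
print(mirrored)
```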
40
Binomial Distribution Shape
[Charts: binomial distributions for n = 6 with p = 0.1, p = 0.2, and p = 0.5 (symmetric), each plotting P(X) against X = 0 to 6]
Probability Distributions
► Discrete Probability Distributions: Binomial, Hypergeometric, Poisson
42
Hypergeometric distributions
43
Hypergeometric distributions
$P(x) = \frac{C(S, x)\, C(N - S, n - x)}{C(N, n)}$
Where
N = population size
S = number of successes in the population
N – S = number of failures in the population
n = sample size
x = number of successes in the sample
n – x = number of failures in the sample
44
Hypergeometric distributions
45
Hypergeometric distributions
46
Hypergeometric distributions
► What is the probability that 0, 1, or 2 of the 3 selected iPods are
damaged?
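$P(x = 0) = \frac{C(2, 0)\, C(8, 3)}{C(10, 3)} = \frac{\frac{2!}{0!\,2!} \cdot \frac{8!}{3!\,5!}}{\frac{10!}{3!\,7!}} = 0.467$
$P(x = 1) = \frac{C(2, 1)\, C(8, 2)}{C(10, 3)} = \frac{\frac{2!}{1!\,1!} \cdot \frac{8!}{2!\,6!}}{\frac{10!}{3!\,7!}} = 0.467$
$P(x = 2) = \frac{C(2, 2)\, C(8, 1)}{C(10, 3)} = \frac{\frac{2!}{2!\,0!} \cdot \frac{8!}{1!\,7!}}{\frac{10!}{3!\,7!}} = 0.067$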
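The same three probabilities can be reproduced with a short Python sketch. The values N = 10, S = 2, and n = 3 are inferred from the factorials in the worked calculation, and `hypergeom_pmf` is a name chosen here for illustration.

```python
from math import comb

def hypergeom_pmf(x, N, S, n):
    """P(x) = C(S, x) * C(N - S, n - x) / C(N, n)."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

# Inferred from the worked calculation: N = 10, S = 2, n = 3
probs = [hypergeom_pmf(x, N=10, S=2, n=3) for x in range(3)]
print([round(p, 3) for p in probs])
```

Because x can only be 0, 1, or 2 here, the three probabilities should add up to 1.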
47
Excel and Megastat
► Excel
► P(X=2)
48
Megastat
$P(B_1 \mid A) = \frac{P(A \mid B_1)\, P(B_1)}{P(A \mid B_1)\, P(B_1) + P(A \mid B_2)\, P(B_2)}$
49
Hypergeometric distributions
► The mean (expected value): $\mu = np$, where $p = \frac{S}{N}$
► The standard deviation: $\sigma = \sqrt{np(1 - p)\left(\frac{N - n}{N - 1}\right)}$
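As a sketch, the two formulas can be evaluated for the iPod example and cross-checked against the distribution itself. The values N = 10, S = 2, n = 3 are inferred from the earlier worked calculation; variable names are illustrative.

```python
from math import comb, sqrt

N, S, n = 10, 2, 3   # inferred from the iPod worked calculation
p = S / N

mean = n * p
sd = sqrt(n * p * (1 - p) * (N - n) / (N - 1))

# Cross-check against the hypergeometric probabilities themselves
pmf = [comb(S, x) * comb(N - S, n - x) / comb(N, n) for x in range(min(n, S) + 1)]
mean_check = sum(x * px for x, px in enumerate(pmf))
var_check = sum((x - mean_check) ** 2 * px for x, px in enumerate(pmf))

print(round(mean, 3), round(sd, 3), round(mean_check, 3), round(sqrt(var_check), 3))
```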
50
How to recognize a hypergeometric
situation?
► Look for a finite population (N) containing a known number of
successes (S)
► Sampling is without replacement (n items in the sample), so
the probability of success is not constant for each sample item
drawn.
►Both the binomial and hypergeometric involve samples of
size n and treat X as the number of successes.
►The binomial sample is with replacement while the
hypergeometric sample is without replacement.
►If n/N < 0.05, it is safe to use the binomial approximation
to the hypergeometric, using sample size n and success
probability p = S/N.
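To get a feel for how close the binomial approximation is when n/N < 0.05, the sketch below compares the two distributions. The numbers N = 200, S = 40, n = 8 are hypothetical and chosen only so that n/N = 0.04.

```python
from math import comb

N, S, n = 200, 40, 8        # hypothetical values; n / N = 0.04 < 0.05
p = S / N                   # success probability used by the binomial

hyper = [comb(S, x) * comb(N - S, n - x) / comb(N, n) for x in range(n + 1)]
binom = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]

# Largest absolute difference between the two probability lists
max_gap = max(abs(h - b) for h, b in zip(hyper, binom))
print(f"largest difference: {max_gap:.4f}")
```

Rerunning the sketch with a larger sampling fraction (say n = 80) is one way to see the approximation degrade.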
51
Poisson Distribution
52
Poisson Distribution
►The Poisson distribution describes the number of
occurrences of some events over a specified interval.
►The random variable x is the number of occurrences of the
event in an interval. The interval can be time, distance,
area, volume,…
►The probability of the event occurring x times over an
interval is given by:
$P(x) = \frac{\lambda^x e^{-\lambda}}{x!}$
where e = 2.71828, the base of the natural logarithm system
x = number of occurrences in an interval
λ = expected number of occurrences in an interval
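The Poisson formula can be written as a small Python function. In this sketch `poisson_pmf` and `lam` are names chosen for illustration, and the mean of 2.0 is a hypothetical value.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(x) = lam^x * e^(-lam) / x!"""
    return lam ** x * exp(-lam) / factorial(x)

lam = 2.0   # hypothetical mean, for illustration only
probs = [poisson_pmf(x, lam) for x in range(6)]
print([round(p, 4) for p in probs])
```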
53
Poisson Distribution Characteristics
►The mean is λ
►The standard deviation is $\sigma = \sqrt{\lambda}$
►A particular Poisson distribution is determined only by
the mean λ
►Unlike the binomial, X has no obvious limit, that is, the
number of events that can occur in a given unit of time
is not bounded. It is 0, 1, 2,…with no upper limit.
54
Poisson Distribution Characteristics
►The number of industrial injuries per working week in a
particular factory is known to follow a Poisson
distribution with mean λ = 0.5.
►Find the probability that
►(a) in a particular week there will be:
►(i) fewer than 2 accidents,
►(ii) more than 2 accidents;
►(b) in a three-week period there will be no
accidents.
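One way to set up this exercise in Python is sketched below. Part (b) assumes that weeks are independent, so the three-week count is Poisson with mean 3 × 0.5; that assumption and the variable names are not stated in the slide.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    return lam ** x * exp(-lam) / factorial(x)

lam_week = 0.5

# (a)(i) fewer than 2 accidents: P(0) + P(1)
fewer_than_two = poisson_pmf(0, lam_week) + poisson_pmf(1, lam_week)

# (a)(ii) more than 2 accidents: 1 - [P(0) + P(1) + P(2)]
more_than_two = 1 - sum(poisson_pmf(x, lam_week) for x in range(3))

# (b) no accidents in three weeks, assuming the three-week mean is 3 * 0.5
lam_three_weeks = 3 * lam_week
none_in_three_weeks = poisson_pmf(0, lam_three_weeks)

print(round(fewer_than_two, 4), round(more_than_two, 4), round(none_in_three_weeks, 4))
```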
55
How to recognize Poisson Applications
57
Excel and Megastat
58
Poisson Table
59
Use the Poisson approximation to the
binomial
►The Poisson distribution may be used to approximate a
binomial by setting λ = np. This approximation is helpful
when the binomial calculation is difficult (e.g., when n is
large).
►The general rule for a good approximation is that n should
be “large” and p should be “small.”
►A common rule of thumb says the approximation is
adequate if n ≥ 20 and p ≤ 0.05.
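A quick comparison shows how the approximation behaves when the rule of thumb is met. The values n = 50 and p = 0.02 below are hypothetical and were picked only because they satisfy n ≥ 20 and p ≤ 0.05.

```python
from math import comb, exp, factorial

n, p = 50, 0.02          # hypothetical values satisfying n >= 20 and p <= 0.05
lam = n * p              # Poisson mean used for the approximation

binom = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(6)]
poisson = [lam ** x * exp(-lam) / factorial(x) for x in range(6)]

for x, (b, q) in enumerate(zip(binom, poisson)):
    print(f"x = {x}: binomial = {b:.4f}, Poisson = {q:.4f}")

# Largest absolute difference over the values compared
max_gap = max(abs(b - q) for b, q in zip(binom, poisson))
```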
60
Exercise
► Next week: one online session for exercises
► Homework
► Mid-term exam
► 50 Multiple-choice questions, 90 minutes.
► MYISB schedule
► Closed-book exam; an equation sheet is provided
► Calculators are allowed, but laptops and other
electronic devices are not permitted.
► Prepare a printed version (without any notes) of Appendix
A - Binomial Probabilities and Appendix B - Poisson
Probabilities.
61