7COM1073
➢Sample space:
➢E: getting two heads
➢Size of the sample space
➢Number of times E occurs
➢𝑃(𝐸) = (number of times E occurs) / (size of the sample space)
Discrete Random Variables
Informally, a random variable is a map from the outcome space (𝛺) to
the real numbers.
▪ For example, if 𝑋 is the sum of two dice: 𝑋(1,1) = 2, 𝑋(2,3) = 5
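The map can be written out explicitly. A minimal sketch, assuming 𝑋 is the sum of two dice (consistent with 𝑋(1,1) = 2 and 𝑋(2,3) = 5):

```python
from itertools import product

# A random variable as an explicit map from the outcome space (two dice)
# to the real numbers: X(outcome) is the sum of the two faces.
X = {outcome: sum(outcome) for outcome in product(range(1, 7), repeat=2)}

print(X[(1, 1)])  # 2
print(X[(2, 3)])  # 5
```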
Discrete Random Variables
Event defined by random variables
The probability mass function: 𝑓𝑋(𝑥𝑘) = 𝑃(𝑋 = 𝑥𝑘)

𝜎𝑋² = Σ𝑘 (𝑥𝑘 − 𝜇𝑋)² 𝑓𝑋(𝑥𝑘)   (𝑋: discrete)
𝜎𝑋² = ∫₋∞^∞ (𝑥 − 𝜇𝑋)² 𝑓𝑋(𝑥) 𝑑𝑥   (𝑋: continuous)

The variance should be regarded as the average of the squared difference of the actual values from the average.
Example
Three products are selected at random from 9 products, of which 2 are
defective. The sample space consists of the distinct, equally likely
samples of size 3. Let 𝑋 be the random variable which counts the
number of defective items in a sample. The values of 𝑋 are 0, 1, and 2.
What is the expected value of defective products in a sample of size 3?
𝐸(𝑋) = Σ𝑘 𝑥𝑘 𝑓𝑋(𝑥𝑘)   (𝑋: discrete)

𝐶(𝑛, 𝑘) = (𝑛 choose 𝑘) = 𝑛! / (𝑘! (𝑛 − 𝑘)!)
Example (continued)
• Solution:
➢ The number of ways of choosing 𝑥𝑖 defectives from 2 defectives and choosing 3 − 𝑥𝑖 nondefectives from 7 nondefectives is 𝐶(2, 𝑥𝑖) 𝐶(7, 3 − 𝑥𝑖)
➢ The total number of possible outcomes is 𝐶(9, 3)
➢ The probability of the value 𝑥𝑖 of 𝑋 is
𝑝𝑖 = 𝐶(2, 𝑥𝑖) 𝐶(7, 3 − 𝑥𝑖) / 𝐶(9, 3)   (𝑥𝑖 = 0, 1, 2)
➢ 𝐸(𝑋) = Σ𝑖 𝑥𝑖 𝑝𝑖   (𝑋: discrete)
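The steps above can be carried out directly in Python; a minimal sketch using the standard library:

```python
from math import comb

# Probability of x defectives in a sample of 3 from 9 products (2 defective):
# p_x = C(2, x) * C(7, 3 - x) / C(9, 3)
p = {x: comb(2, x) * comb(7, 3 - x) / comb(9, 3) for x in range(3)}

# Expected number of defectives: E(X) = sum over x of x * p_x
expected = sum(x * p_x for x, p_x in p.items())
print(expected)  # 2/3, i.e. about 0.6667
```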
Example
Let 𝑋 be a random variable whose probability density function on the interval [0, 1] is
𝑓𝑋(𝑥) = 1 if 0 ≤ 𝑥 ≤ 1, and 𝑓𝑋(𝑥) = 0 if 𝑥 < 0 or 𝑥 > 1
Compute 𝐸 𝑋 .
Solution
• 𝐸(𝑋) = ∫₋∞^∞ 𝑥 𝑓𝑋(𝑥) 𝑑𝑥
= ∫₋∞^0 𝑥 · 0 𝑑𝑥 + ∫₀^1 𝑥 · 1 𝑑𝑥 + ∫₁^∞ 𝑥 · 0 𝑑𝑥
= 0 + [𝑥²/2]₀¹ + 0
= 1/2
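A quick Monte Carlo check of this result: the sample mean of many Uniform(0, 1) draws should approach 𝐸(𝑋) = 1/2.

```python
import random

# Monte Carlo estimate of E(X) for X ~ Uniform(0, 1).
random.seed(0)
n = 100_000
sample_mean = sum(random.random() for _ in range(n)) / n
print(sample_mean)  # close to 0.5
```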
The Discrete Uniform Distribution
Interval [𝑎, 𝑏], 𝑛 = 𝑏 − 𝑎 + 1

𝑓𝑋(𝑥) = 1/𝑛 if 𝑎 ≤ 𝑥 ≤ 𝑏, and 0 otherwise

𝜇𝑋 = 𝐸(𝑋) = (𝑎 + 𝑏)/2
𝜎𝑋² = 𝑉𝑎𝑟(𝑋) = (𝑛² − 1)/12
Example: Suppose we throw a die. Let 𝑋 be the random variable denoting the obtained number.
(For these special distributions, you just need to know how to use the formulas to get the mean and the variance; you do not need to derive them yourself.)
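For the die example the formulas give 𝜇𝑋 = 3.5 and 𝜎𝑋² = 35/12, which can be checked against direct enumeration:

```python
# Die example: X uniform on {1, ..., 6}, so a = 1, b = 6, n = b - a + 1 = 6.
a, b = 1, 6
n = b - a + 1

mean_formula = (a + b) / 2      # (a + b) / 2 = 3.5
var_formula = (n**2 - 1) / 12   # (n^2 - 1) / 12 = 35/12

# Direct enumeration over the six equally likely faces.
values = range(a, b + 1)
mean_direct = sum(values) / n
var_direct = sum((x - mean_direct) ** 2 for x in values) / n
print(mean_formula, var_formula)  # 3.5 2.9166...
```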
The Bernoulli Distribution
• If 𝑋 is a random variable with this distribution
𝑃 𝑋=1 =𝑝
𝑃(𝑋 = 0) = 1 − 𝑝
The probability mass function 𝑓 of this distribution is:
𝑓(𝑥; 𝑝) = 𝑝 if 𝑥 = 1, and 𝑓(𝑥; 𝑝) = 𝑞 = 1 − 𝑝 if 𝑥 = 0
Credit: http://cmdlinetips.com/2018/03/probability-distributions-in-python/
The Bernoulli Distribution
𝜇𝑋 = 𝐸 𝑋 = 𝑝
𝜎𝑋2 = 𝑉𝑎𝑟 𝑋 = 𝑝(1 − 𝑝)
For the sum of 𝑛 independent Bernoulli(𝑝) variables, 𝑋 = 𝑋₁ + ⋯ + 𝑋𝑛:
𝑃(𝑋 = 𝑘) = 𝐶(𝑛, 𝑘) 𝑝^𝑘 (1 − 𝑝)^(𝑛−𝑘)
𝜇𝑋 = 𝐸(𝑋) = 𝐸(𝑋₁ + ⋯ + 𝑋𝑛) = 𝑝 + ⋯ + 𝑝 = 𝑛𝑝
Examples:
• The number of defective/non-defective
products in a production run.
• Yes/No survey
• The number of successful sales calls.
The Binomial Distribution
For example: run 20 independent experiments, each
having a Bernoulli distribution with parameter p=0.6.
Repeat the whole procedure 10000 times.
𝑃(𝑋 = 𝑘) = 𝐶(𝑛, 𝑘) 𝑝^𝑘 (1 − 𝑝)^(𝑛−𝑘)
𝜇𝑋 = 𝐸(𝑋) = 𝑛𝑝
For example, for 𝑋 = 12, that is, we set 𝑘 to 12, then we have:
𝑃(𝑋 = 12) = (20! / (12! (20 − 12)!)) × 0.6¹² (1 − 0.6)²⁰⁻¹² ≈ 0.1797
Example
• 90% of all students pass the module
• A sample of 10 new students is selected
• Find the probability that exactly seven will pass
𝑃(𝑋 = 7) = (10! / (7! (10 − 7)!)) × 0.90⁷ (1 − 0.90)¹⁰⁻⁷ ≈ 5.74%
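Both binomial examples above can be checked with a small helper; a minimal sketch:

```python
from math import comb

def binom_pmf(k, n, p):
    # P(X = k) = C(n, k) p^k (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# 20 Bernoulli(0.6) trials, exactly 12 successes:
print(round(binom_pmf(12, 20, 0.6), 4))  # ≈ 0.1797
# 10 students, 90% pass rate, exactly 7 pass:
print(round(binom_pmf(7, 10, 0.9), 4))   # ≈ 0.0574
```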
The Poisson Distribution
𝑃(𝑋 = 𝑘 events in an interval) = 𝑒^(−𝜆) 𝜆^𝑘 / 𝑘!,   𝑘 = 0, 1, ⋯
𝜆 > 0 is the average number of events per interval
𝑒 ≈ 2.71828
𝜇𝑋 = 𝐸 𝑋 = 𝜆
𝜎𝑥2 = 𝑉𝑎𝑟 𝑋 = 𝜆
Applications:
➢The number of customers entering a supermarket during various intervals of time.
➢The number of misprints on a page of a document.
The Poisson Distribution
𝑃(𝑋 = 𝑘 events in an interval) = 𝑒^(−𝜆) 𝜆^𝑘 / 𝑘!,   𝑘 = 0, 1, ⋯
Generate 20,000 random numbers following the Poisson distribution with 𝜆 = 0.4
For example, 𝑃(𝑋 ≥ 1) = 1 − 𝑃(𝑋 = 0) = 1 − 𝑒^(−0.4) (0.4⁰ / 0!) ≈ 0.3297
Example
A doctor was able to see 3 patients an hour on average. Find the probability that she can see 5 patients the next hour.
𝑃(𝑘 = 5) = 𝑒^(−3) (3⁵ / 5!) ≈ 0.1008
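The two Poisson calculations above follow directly from the pmf; a minimal sketch:

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    # P(X = k) = e^(-lam) * lam^k / k!
    return exp(-lam) * lam**k / factorial(k)

# P(X >= 1) for lam = 0.4:
print(round(1 - poisson_pmf(0, 0.4), 4))  # ≈ 0.3297
# Doctor example: P(X = 5) for lam = 3:
print(round(poisson_pmf(5, 3), 4))        # ≈ 0.1008
```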
The Law of Large Numbers
The law of large numbers states that if we repeat a procedure over and
over, the relative frequency probability will approach the actual
probability.
➢Sample space:
➢E: getting two heads
➢Size of the sample space
➢Number of times E occurs
➢𝑃(𝐸) = (number of times E occurs) / (size of the sample space)
Credit:
http://cmdlinetips.com/2018/03/probability-distributions-in-python/
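A simulation of the two-heads experiment illustrates the law of large numbers; a minimal sketch:

```python
import random

# Flip two fair coins repeatedly and track the relative frequency of
# E = "getting two heads"; it should approach the actual P(E) = 1/4.
random.seed(1)

def two_heads():
    first = random.random() < 0.5
    second = random.random() < 0.5
    return first and second

n = 50_000
hits = sum(two_heads() for _ in range(n))
print(hits / n)  # close to 0.25
```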
The Continuous Uniform Distribution
Example: A student's height can take any value within a range.
Credit: http://resources.esri.com/help/9.3/arcgisengine/java/gp_toolref/process_simulations_sensitivity_analysis_and_error_analysis_modeling/distributions_for_assigning_random_values.htm
The Gaussian or Normal Distribution
A random variable 𝑋 is distributed normally with mean 𝜇 and variance 𝜎² if its density is
𝑓𝑋(𝑥) = (1 / (𝜎√(2𝜋))) 𝑒^(−(𝑥 − 𝜇)² / (2𝜎²))
We write 𝑋 ~ 𝑁(𝜇, 𝜎²)
𝜇𝑋 = 𝐸(𝑋) = 𝜇
𝜎𝑋² = 𝑉𝑎𝑟(𝑋) = 𝜎²
Examples: https://studiousguy.com/real-life-examples-normal-distribution/
The Gaussian or Normal Distribution
Credit: https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html
The Normal Distribution
Credit:
https://www.mathsisfun.com/data/standard-
normal-distribution.html
Z-scores
• Measure how far away a single data point is from the mean
𝑧 = (𝑥 − 𝑥̄)/𝑠   or   𝑧 = (𝑥 − 𝜇)/𝜎
➢𝑥 is the data point
➢𝑥̄ is the sample mean; 𝜇 is the population mean
➢𝑠 is the sample standard deviation; 𝜎 is the population standard deviation
• Table: areas under the standard normal curve to the left of Z.
Credit: http://www.z-table.com/
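A z-score and the corresponding table lookup can be computed with the standard library; a minimal sketch with made-up numbers (population mean 100, standard deviation 15, data point 130):

```python
from statistics import NormalDist

# Hypothetical population parameters and data point.
mu, sigma, x = 100, 15, 130

z = (x - mu) / sigma             # how many standard deviations x is above the mean
area_left = NormalDist().cdf(z)  # area under the standard normal curve to the left of z

print(z)                    # 2.0
print(round(area_left, 4))  # ≈ 0.9772
```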
Example
Cumulative distribution function: gives us the area under the probability density function from negative infinity to 𝑥.
We can see: 1) the probability of 𝑋 taking on values less than −2.5 is nearly 0;
2) the values sampled from 𝑋 will mostly be less than 2.5.
Central Limit Theorem [3]
Let 𝑋₁, ⋯, 𝑋𝑛 be 𝑛 independent random variables, each of which has
mean 𝜇 and standard deviation 𝜎. Let 𝑌 = (𝑋₁ + ⋯ + 𝑋𝑛)/𝑛 be the
average; thus, 𝑌 has mean 𝜇 and standard deviation 𝜎/√𝑛. If 𝑛 is large,
then the cumulative distribution of 𝑌 is very nearly equal to the
cumulative distribution of the Gaussian with mean 𝜇 and standard
deviation 𝜎/√𝑛.
Applications:
https://en.wikipedia.org/wiki/Central_limit_theorem#Applications_and_examples
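The 𝜎/√𝑛 prediction can be checked by simulation; a minimal sketch averaging Uniform(0, 1) variables, for which 𝜇 = 1/2 and 𝜎 = √(1/12):

```python
import random
from math import sqrt
from statistics import pstdev

# Y = average of n Uniform(0, 1) variables; the CLT predicts that Y has
# standard deviation sigma / sqrt(n), with sigma = sqrt(1/12).
random.seed(2)
n, trials = 30, 5_000
averages = [sum(random.random() for _ in range(n)) / n for _ in range(trials)]

predicted = sqrt(1 / 12) / sqrt(n)
observed = pstdev(averages)
print(predicted, observed)  # the two values should be close
```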
Compound Events
• The probability that either 𝐴 or 𝐵 occurs:
𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴 or 𝐵)
[Venn diagram: events 𝐴 and 𝐵 with overlap 𝐴 ∩ 𝐵]
• 𝑃(𝐴 ∩ 𝐵) = ?
• 𝑃(𝐴 ∪ 𝐵) = ?
Example [1]
• 100 people who showed up for a new test for cancer
• Event A: people actually have cancer
• Event B: people’s test result was positive (it claimed that they had cancer)
• The number of people in event A: 25
• The number of people in event B: 30
• Among the 30 people whose test results were positive, 20 actually have cancer
[Venn diagram: sample space with events 𝐴, 𝐵 and overlap 𝐴 ∩ 𝐵]
• 𝑃(𝐴 ∩ 𝐵) = 20/100 = 20%
• 𝑃(𝐴 ∪ 𝐵) = 35/100 = 35%
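The 35% figure follows from the addition rule applied to the counts above; a minimal sketch:

```python
# Counts from the cancer-screening example (sample of 100 people).
total = 100
n_A = 25         # actually have cancer
n_B = 30         # tested positive
n_A_and_B = 20   # tested positive AND have cancer

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_union = (n_A + n_B - n_A_and_B) / total
print(p_union)  # 0.35
```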
The Rules of Probability - The Addition Rule
• The probability that someone either has cancer or has a positive test result:
𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵) − 𝑃(𝐴 ∩ 𝐵)
When 𝐴 and 𝐵 are mutually exclusive, 𝑃(𝐴 ∩ 𝐵) = 0, so 𝑃(𝐴 ∪ 𝐵) = 𝑃(𝐴) + 𝑃(𝐵)
• Example: Considering a module, "students pass the module" and "students fail the module" are mutually exclusive events.
Conditional Probability
𝑃(𝐵|𝐴) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐴),   𝑃(𝐴) > 0
Problem: A math teacher gave her class two tests. 25% of the
class passed both tests and 42% of the class passed the first
test. What percent of those who passed the first test also
passed the second test?
Credit: https://www.mathgoodies.com/lessons/vol6/conditional
See: http://setosa.io/ev/conditional-probability/
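The math-teacher problem is a direct application of the conditional-probability formula, with 𝐴 = "passed the first test" and 𝐵 = "passed the second test":

```python
# P(passed first) = 0.42, P(passed both) = 0.25.
# P(second | first) = P(first and second) / P(first)
p_first = 0.42
p_both = 0.25

p_second_given_first = p_both / p_first
print(round(p_second_given_first, 3))  # ≈ 0.595, i.e. about 60%
```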
Exercise
• Flipping a fair coin three times.
• Let 𝐴 be the event that the first flip is a head.
• Let 𝐵 be the event of getting exactly two heads.
Compute 𝑃(𝐵|𝐴)
Solution
Let 𝛺 be the sample space of all eight outcomes of flipping a coin three times, and let 𝑃(𝑋) = 1/8 for each outcome 𝑋.
𝐴 ∩ 𝐵 = {𝐻𝐻𝑇, 𝐻𝑇𝐻}
𝑃(𝐵|𝐴) = (2/8)/(4/8) = 1/2
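The same answer falls out of enumerating all eight equally likely outcomes; a minimal sketch:

```python
from itertools import product

# All 8 equally likely outcomes of three coin flips.
omega = list(product("HT", repeat=3))

A = [w for w in omega if w[0] == "H"]        # first flip is a head
B = [w for w in omega if w.count("H") == 2]  # exactly two heads
A_and_B = [w for w in A if w in B]

p_B_given_A = len(A_and_B) / len(A)
print(p_B_given_A)  # 0.5
```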
Exercise
Sarah has 2 children. You learn that she has a son, Mark. What is the
probability that Mark’s sibling is a brother?
Conditional probability: 𝑃(𝐵|𝐴) = 𝑃(𝐴 ∩ 𝐵)/𝑃(𝐴)
➢What is the sample space?
➢What is the event A?
➢What is the event B?
References
[1] Chapters 5-6 in Principles of Data Science, Sinan Ozdemir, 2016
Packt Publishing
[2] Probability Theory, Fundamentals of Machine Learning (Part 1) by
William Fleshman
[3] Linear Algebra and Probability for Computer Science Applications by
Ernest Davis, 2012.
[4] Schaum’s Outlines Probability, Random Variables, & Random
Processes by Hwei Hsu, 1997