Professional Documents
Culture Documents
PROBABILITY
Lecture #1
RANDOM VARIABLES AND PROBABILITY
DISTRIBUTION
JANICE A. BULAO
Nambaran Agro-Industrial National High School
Quote of the day..
2. Recall all your first and second quarter grades in all
the different subjects, what is your assurance and your
probability that you will not be promoted to Grade 12?
Example: Consider there is a drawer containing 100 socks: 30 red, 20 blue and
50 black socks.
We can use probability to answer questions about the selection of a
random sample of these socks.
PQ1. What is the probability that we draw two blue socks or two red socks from
the drawer?
PQ2. What is the probability that we pull out three socks or have matching pair?
PQ3. What is the probability that we draw five socks and they are all black?
SQ1: A random sample of 10 socks from the drawer produced one blue, four red, five
black socks. What is the total population of black, blue or red socks in the drawer?
SQ2: We randomly sample 10 socks, and write down the number of black socks and
then return the socks to the drawer. The process is done for five times. The mean
number of socks for each of these trial is 7. What is the true number of black socks in
the drawer?
etc.
In general, when all the outcomes are equally likely, the
probability of a particular event occurs is
Probability of an event = Number of outcomes in which the event occurs
total number of possible outcomes
Values (𝒙) of
Random Variable (𝑿)
Random Variable 𝑿
The number of boys in a family of 3 children 0, 1, 2, or 3
The temperature in Baguio City on a particular day From 8.5 oC to 29 oC
The percentage of SHS students infected by COVID
From 0 % to 100 %
19 in CAR
The age of a child selected at random From 1 to 17 years
old
The number of phone calls received by a food
0, 1, 2, 3, …, 𝑛
delivery service crew in a day
CS 40003: Data Analytics 12
Activity: Say yes if it is a random variable and
no, if it’s not.
Situation Answer
1. The color of the next car to go past
my house
2. The height of a school building
selected at random
3. The number of attempts a student
needs to pass his or her NC II
assessment
4. The brand of COVID-19 vaccines
authorized and recommended by the
Food and Drug Administration (FDA)
of the Philippines
5. The price of a particular face mask
Sample Space
𝑆 = {𝐻𝑒𝑎𝑑, 𝑇𝑎𝑖𝑙} or 𝑆 = {𝐻, 𝑇}
Random Variables
(below are two examples from tossing a coin)
Continuous RV
Discrete RV
Continuous RV
Continuous RV
Discrete RV
Probability distribution
𝑿 𝑷(𝒙)
0 1/4
1 2/4 or ½
2 ¼
CS 40003: Data Analytics 26
Computing the probability of each
outcome
1. Sum of frequencies of outcomes is 4.
2. Probability of each outcome
𝑿 𝑷(𝒙)
1 0.30
2 0.25
3 0.25
4 0.05
5 0.15
The probability that the random variable takes a given value can be computed
using the rules governing probability.
For example, the probability that means either mother or father but not both has had
measles is . Symbolically, it is denoted as P(X=1) = 0.32
Example 4.3: Given that is the probability that a person (in the ages between
17 and 35) has had childhood measles. Then the probability distribution is given
by
X Probability
?
0 0.64
1 0.32
2 0.04
0.64
Example: Measles Study
0 1 2
0.64 0.32 0.04 0.32
f(x)
0.04
The use of simulation studies can often eliminate the need of costly experiments and
is also often used to study problems where actual experimentation is impossible.
Examples 4.4:
1) A study involving testing the effectiveness of a new drug, the number of cured
patients among all the patients who use such a drug approximately follows a
binomial distribution.
The function for computing the probability for the binomial probability
distribution is given by
for x = 0, 1, 2, …., n
Here, where denotes “the number of success” and denotes the number of
success in trials.
Thus,
15 38 68 39 49 54 19 79 38 14
If the value of the digit is 0 or 1, the outcome is “had childhood measles”, otherwise,
(digits 2 to 9), the outcome is “did not”.
For example, in the first pair (i.e., 15), representing a couple and for this couple, x = 1. The
frequency distribution, for this sample is
x 0 1 2
f(x)=P(X=x) 0.7 0.3 0.0
If a given trial can result in the k outcomes with probabilities then the
probability distribution of the random variables representing the number of
occurrences for in n independent trials is
where =
and
Example 4.8:
Probability of observing three red cards in 5 draws from an ordinary deck of 52
playing cards.
You draw one card, note the result and then returned to the deck of cards
Reshuffled the deck well before the next drawing is made
• The hypergeometric distribution does not require independence and is based on the
sampling done without replacement.
with
Example 4.9:
Number of clients visiting a ticket selling counter in a metro station.
1. [ is the mean ]
2. [ is the variance ]
and
2. Hypergeometric distribution
The hypergeometric distribution function is characterized with the size of a sample ,
the number of items and labelled success. Then
f(x)
x1 x2 x3 x4
X=x
Discrete Probability distribution
f(x)
X=x
Continuous Probability Distribution
When the random variable of interest can take any value in an interval, it is
called continuous random variable.
Every continuous random variable has an infinite, uncountable number of possible
values (i.e., any value in an interval)
1.
f(x)
a b
X=x
f(x)
c
A B
X=x
Note:
a)
b) )= where both and are in the interval (A,B)
f(x)
σ2
µ1
µ1 µ2
µ2 µ1 = µ2
Normal curves with µ1< µ2 and σ1 = σ2 Normal curves with µ1 = µ2 and σ1< σ2
σ1
σ2
µ1 µ2
CS 40003: Data Analytics Normal curves with µ1<µ2 and σ1<σ2 62
Properties of Normal Distribution
The curve is symmetric about a vertical axis through the mean
The random variable can take any value from
The most frequently used descriptive parameter s define the curve itself.
The mode, which is the point on the horizontal axis where the curve is a
maximum occurs at .
The total area under the curve and above the horizontal axis is equal to .
[Z-transformation]
0.09
0.4
0.08 σ σ=1
0.07
0.3
0.06
0.05
0.2
0.04
0.03
0.02 0.1
0.01
0.00 0.0
-5 0 5 10 15 20 25 -3 -2 -1 0 1 2 3
x=µ µ=0
f(x: µ, σ) f(z: 0, 1)
for
Further,
Note:
[An important property]
The continuous random variable has a gamma distribution with parameters and
such that:
1.0
σ=1, β=1
0.8
0.6
f(x)
0.4
σ=2, β=1
0.2
σ=4, β=1
0.0
0 2 4 6 8 10 12
x
Note:
1) The mean and variance of gamma distribution are
and
0.7
µ=0, σ=1
0.6
0.5
0.4
f(x)
0.3
0.2
µ=1, σ=1
0.1
0.0
0 5 10 15 20
x
The continuous random variable has a Weibull distribution with parameter and
such that.
where and
1.0
ß=1
The mean and variance of Weibull
0.8
distribution are:
0.6
f(x)
0.4
ß=2
0.2
ß=3.5
0.0
0 2 4 6 8 10 12
x