
STAT 130

Nonparametric Statistics

Lecture 1
Probability Distribution and
Special Discrete PDs

LIZA T. BILLONES, MSc

Department of Physical Sciences and Mathematics
College of Arts and Sciences
University of the Philippines Manila
Population, Experiment, Outcomes, Sample Space

Population: The set of all subjects under consideration


Example: Students in this class: Carlos, Chloe, Crissa, Cris

Experiment: A procedure carried out under a certain set of conditions that can be repeated under the same set of conditions
Outcomes: Results of the experiment
Sample Space (S): Set of all the possible outcomes

Examples:
Experiment 1: Selecting a student from this class (observe sex)
Sample Space: {M, F}
Experiment 2: Selecting 2 students from this class (observe sex)
Sample Space: {MM, MF, FM, FF}
Experiment 3: Tossing a coin
Sample Space: {H, T}
Experiment 4: Tossing two coins
Sample Space: {HH, HT, TH, TT}
Experiment 5: Rolling a die
Sample Space: {1, 2, 3, 4, 5, 6}
Experiment 6: Rolling 2 dice
Sample Space: {11, 12, 13, 14, 15, 16, 21, 22, …, 31, 32, …, 66}
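The sample spaces for the repeated experiments above can be enumerated mechanically. The following is a minimal sketch, assuming each repeated experiment is treated as an ordered pair of single-trial outcomes; the variable names are illustrative only.

```python
from itertools import product

# Single-trial sample spaces
coin = ["H", "T"]
die = [1, 2, 3, 4, 5, 6]

# Two trials: every ordered pair of single-trial outcomes
two_coins = ["".join(pair) for pair in product(coin, repeat=2)]
two_dice = [f"{a}{b}" for a, b in product(die, repeat=2)]

print(two_coins)      # ['HH', 'HT', 'TH', 'TT']
print(len(two_dice))  # 36
print(two_dice[:7])   # ['11', '12', '13', '14', '15', '16', '21']
```

The size of a two-trial sample space is the square of the single-trial size: 2 × 2 = 4 for coins and 6 × 6 = 36 for dice.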
Sample Space, Random Variable, Probability
Experiment: Selecting a student from a class of 10 students, observe sex
Random variable (X): A function that maps the outcomes to numerical quantities (real numbers)
Define: X = number of males
Probability, P(x): A function that maps the values of X to [0, 1]

S    X    P(X)
M    1    0.1
M    1    0.1
F    0    0.1
F    0    0.1
M    1    0.1
M    1    0.1
M    1    0.1
F    0    0.1
M    1    0.1
F    0    0.1
Probability Distribution: Example
Probability distribution: Listing of all possible values of X together with the corresponding probabilities

S    X    P(X)
M    1    0.1
M    1    0.1
F    0    0.1
F    0    0.1
M    1    0.1
M    1    0.1
M    1    0.1
F    0    0.1
M    1    0.1
F    0    0.1

The probability distribution of the random variable X:

S    X    P(X)
M    1    0.6
F    0    0.4
Probability Distribution: Example
Experiment: Flip a coin 2 times (or flip 2 coins)
S: {HH, HT, TH, TT}
X: Number of tails
x: 0, 1, 2

S     X    P(X)
HH    0    1/4
HT    1    1/4
TH    1    1/4
TT    2    1/4

The probability distribution of the random variable X:

x    P(X = x)
0    1/4
1    1/2
2    1/4
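The step from the sample space to the probability distribution of X can be sketched in code. This is an illustration only, assuming a fair coin so that all four outcomes are equally likely; the helper name `num_tails` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two flips (fair coin assumed: outcomes equally likely)
sample_space = ["".join(flips) for flips in product("HT", repeat=2)]

def num_tails(outcome):
    """Random variable X: maps an outcome to its number of tails."""
    return outcome.count("T")

# Collect the outcomes that share the same value of X
counts = Counter(num_tails(outcome) for outcome in sample_space)
distribution = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, prob in distribution.items():
    print(x, prob)  # 0 1/4, then 1 1/2, then 2 1/4
```

The outcomes HT and TH both map to x = 1, which is why P(X = 1) is 1/4 + 1/4 = 1/2.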
Probability Distributions
Discrete: Probability mass function (pmf)
Assigns probabilities (masses) to individual outcomes

Continuous: Probability density function (pdf)
Assigns probabilities to ranges of outcomes (density)

Cumulative Distribution Function (cdf) or distribution function:
Total probability assigned to all values less than or equal to a given value
PMF vs CDF
PMF vs CDF: Example 1
Experiment: Rolling 2 Dice (Red/Green)
Y = sum of the up faces
y = 2, 3, 4, ..., 12
PMF vs CDF: Example 2
Experiment: Rolling 2 Dice

PMF: p(y)
CDF: F(y)
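The pmf and cdf for the sum of two dice can be tabulated directly. The sketch below assumes two fair dice, so that all 36 (red, green) pairs are equally likely, and takes p(y) = P(Y = y) and F(y) = P(Y ≤ y).

```python
from fractions import Fraction
from itertools import accumulate, product

faces = range(1, 7)
outcomes = list(product(faces, repeat=2))  # 36 equally likely (red, green) pairs

# PMF: p(y) = P(Y = y), where Y is the sum of the up faces
pmf = {y: Fraction(0) for y in range(2, 13)}
for red, green in outcomes:
    pmf[red + green] += Fraction(1, len(outcomes))

# CDF: F(y) = P(Y <= y), the running total of the pmf
cdf = dict(zip(pmf, accumulate(pmf.values())))

print("y   p(y)   F(y)")
for y in pmf:
    print(y, pmf[y], cdf[y])
```

For instance, p(7) = 6/36 = 1/6 while F(7) = 21/36 = 7/12: the pmf gives the mass at a single value, and the cdf accumulates all masses up to and including that value.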
Mean, Variance, and SD
Mean (Expected Value): Average value of an RV (or function of RV)

Variance: Average squared deviation between a realization of an RV (or function of RV) and its mean

Standard Deviation: Positive square root of the Variance

Notations:
Mean: E(Y) = μ
Variance: V(Y) = σ²
Standard Deviation: σ
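For a discrete RV these definitions become probability-weighted sums. The following is a minimal sketch, assuming the pmf is stored as a dictionary of value and probability pairs; it uses the distribution of the number of tails in two fair coin flips as input, and the function names are illustrative.

```python
from fractions import Fraction
from math import sqrt

def mean(pmf):
    """E(Y): sum of y * p(y) over all values y."""
    return sum(y * p for y, p in pmf.items())

def variance(pmf):
    """V(Y): sum of (y - mu)^2 * p(y) over all values y."""
    mu = mean(pmf)
    return sum((y - mu) ** 2 * p for y, p in pmf.items())

# Number of tails in two fair coin flips
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mu = mean(pmf)       # 1
var = variance(pmf)  # 1/2
sd = sqrt(var)       # positive square root of the variance

print(mu, var, round(sd, 4))
```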
Review!
1. What is an outcome?
2. What is a sample space?
3. What is a random variable?
4. What is a probability distribution?
5. Differentiate between a probability function and a distribution function.
6. Relate the frequency function to any one of these concepts.
7. Differentiate between pmf and pdf.
8. Differentiate between f(y) and F(y).
9. Prove each of the properties of the mean and variance.
10. What do mean and variance represent?
Discrete Probability Distribution
• probability distribution for a discrete random variable
• discrete RVs are those that take a discrete set of values

- generally take a finite set of values
  Examples: number of heads (or tails)
            number of cellphone numbers

- can also take a countably infinite set of values (0, 1, 2, 3, …)
  Examples: population size
            number of patients
Special Discrete PDs
Uniform Discrete
Experiment:
• single trial
• observe the outcome
Random Variable: X = x1, x2, …, xk (code/label of the outcome)
                   = 1, 2, ..., k
Parameter: k = number of possible outcomes
Assumption: Outcomes are equally likely: p(xi) = 1/k
Examples:
• Tossing a fair coin and observing the face up
  X = 1, 2 (1 for a Head and 2 for a Tail): k = 2
• Tossing a fair die and observing the face up
  X = 1, 2, ..., 6: k = 6
Example 1.
A box contains 2 white balls, 2 blue balls, and 2
red balls. Draw a ball at random then observe the
color.

Define the random variable:
X = 1 if the ball drawn is white;
X = 2 if the ball drawn is blue; and
X = 3 if the ball drawn is red.

Find the pmf, cdf, mean, and variance of X.


Solution:
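One way to work Example 1 is to build the pmf from the ball counts and compute everything else from it. This is a sketch, not the slide's original solution; the closed forms (k + 1)/2 and (k² − 1)/12 used as a check are the standard results for a discrete uniform RV on 1, 2, ..., k and are stated here as an assumption, since the slide's own formulas are not shown.

```python
from fractions import Fraction
from itertools import accumulate

# Codes: 1 = white, 2 = blue, 3 = red; two balls of each color
balls = {1: 2, 2: 2, 3: 2}
k = len(balls)
total = sum(balls.values())

pmf = {x: Fraction(count, total) for x, count in balls.items()}  # each 1/3
cdf = dict(zip(pmf, accumulate(pmf.values())))                   # 1/3, 2/3, 1

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

# Check against the standard discrete uniform closed forms
assert mean == Fraction(k + 1, 2)
assert variance == Fraction(k**2 - 1, 12)

print(pmf, cdf, mean, variance)
```

Because each color is equally represented, p(x) = 1/3 for x = 1, 2, 3, which gives a mean of 2 and a variance of 2/3.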
Special Discrete PDs
Bernoulli
Experiment:
• single trial
• classify outcome as “success” or “failure”

Random Variable: X = 0, if “failure”
                   = 1, if “success”
                 (code/label of the outcome)

Parameter: p = probability of success

Assumption: There are only two outcomes (can be classified as Success or Failure).
Example:
• Tossing a fair coin: consider Head as “success”
  X = 0, if outcome is a Tail
    = 1, if outcome is a Head
  p = 0.5 (Why?)

• Tossing a fair die: consider 1 or 2 as “success”
  X = 0, if outcome is a 3, 4, 5, or 6
    = 1, if outcome is a 1 or a 2
  p = 1/3 (Why?)
Example 2.
A box contains 7 red balls and 11 white balls.
Draw one ball at random from the box. Observe if
the color is red.
Let:
X = 0, if the ball drawn is white
  = 1, if the ball drawn is red.
Give the pmf, cdf, mean, and variance of X.
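A possible way to work Example 2 numerically is sketched below. It assumes every ball is equally likely to be drawn, so that p = 7/18, and checks the results against the standard Bernoulli results E(X) = p and V(X) = p(1 − p).

```python
from fractions import Fraction

red, white = 7, 11
p = Fraction(red, red + white)  # probability of "success" (red ball)

pmf = {0: 1 - p, 1: p}                   # p(0) = 11/18, p(1) = 7/18
cdf = {0: pmf[0], 1: pmf[0] + pmf[1]}    # F(0) = 11/18, F(1) = 1

mean = sum(x * prob for x, prob in pmf.items())
variance = sum((x - mean) ** 2 * prob for x, prob in pmf.items())

# Standard Bernoulli results
assert mean == p
assert variance == p * (1 - p)

print(pmf, cdf, mean, variance)
```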
Special Discrete PDs
Binomial
Experiment:
• n identical trials, n ≥ 1
• classify outcome in each trial as “success” or “failure”
• series of Bernoulli trials

Random Variable:
X = number of successes in n trials
  = 0, 1, 2, ..., n
Parameters: n = number of trials
            p = probability of success

Assumptions:
• p is constant for all trials: p1 = p2 = … = pn = p
• trials are independent: P(1st and 2nd) = P(1st) · P(2nd)

Examples:

• Tossing a fair coin thrice
  X = number of heads
    = 0, 1, 2, 3
  p = 1/2 (Why?)

• Tossing a fair die and observing the face up
  X = number of 2’s
  p = ?
Example 3.
In a certain city, the need for money to buy drugs
is stated as the reason for 75% of all thefts.
Find the probability that among the next 5 theft
cases reported in this district,
a. exactly 2 resulted from the need for money to
buy drugs,
b. at most 3 resulted from the need for money to
buy drugs, and also
c. give the mean and variance.
Solution: Let X = number of thefts due to the need for money to buy drugs, so n = 5 and p = 0.75.

a. P(X = 2)

b. P(X ≤ 3)

c. E(X) and V(X)
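The three parts of Example 3 can be evaluated with a short script. This is a sketch that assumes the standard binomial pmf, p(x) = C(n, x) p^x (1 − p)^(n − x), with mean np and variance np(1 − p); the helper name `binom_pmf` is illustrative.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial RV with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 0.75

p_exactly_2 = binom_pmf(2, n, p)                          # part a
p_at_most_3 = sum(binom_pmf(x, n, p) for x in range(4))   # part b: x = 0, 1, 2, 3
mean = n * p                                              # part c
variance = n * p * (1 - p)

print(round(p_exactly_2, 4))  # 0.0879
print(round(p_at_most_3, 4))  # 0.3672
print(mean, variance)         # 3.75 0.9375
```

Part b can also be found from the complement, 1 − P(X = 4) − P(X = 5), which needs two terms instead of four.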
Special Discrete PDs
Hypergeometric vs Binomial
• Binomial: n trials or n samples with replacement
  (to satisfy the requirement of “independence among samples”)
  Hypergeometric: n samples without replacement
  (“independence” is not required)

• Binomial: each trial/sample can fall into 2 categories, “success” or “failure”
  Hypergeometric: each sample can fall into 2 categories, “success” or “failure”
Special Discrete PDs
Hypergeometric
Experiment:
• selecting n random samples from a population of size N
• each sample falls into one of two categories

Random Variable:
X = number of successes in the sample
  = max(0, n − (N − k)), ..., min(k, n)

Parameters: n = sample size
            N = population size (finite)
            k = number of successes in the population

Assumption: sampling is done without replacement


Example: Drawing 2 balls without replacement (WOR) from an urn containing 5 white balls and 3 red balls and observing the number of white balls included in the sample

$p(x) = \dfrac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$

where:
N − k = number of failures in the population


Example 4.

The members of a committee of size 3 were randomly selected from 4 doctors and 2 nurses.

1. Write a formula for the probability distribution of the random variable representing the number of doctors on the committee and

2. Find .
Solution: Let X = number of doctors on the committee

a.

b.
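The distribution asked for in part 1 of Example 4 can be tabulated as follows. This is a sketch assuming the standard hypergeometric pmf, p(x) = C(k, x) C(N − k, n − x) / C(N, n), with N = 6, k = 4 doctors, and n = 3. The quantity requested in part 2 is not legible in the source, so the full table is printed and any specific probability can be read from it.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, k, n):
    """P(X = x): x successes in a sample of n drawn without replacement."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

N, k, n = 6, 4, 3  # 6 people, 4 doctors ("successes"), committee of 3

# With only 2 nurses, a committee of 3 must contain at least 1 doctor
support = range(max(0, n - (N - k)), min(k, n) + 1)
pmf = {x: hypergeom_pmf(x, N, k, n) for x in support}

for x, prob in pmf.items():
    print(x, prob)  # 1 1/5, then 2 3/5, then 3 1/5
```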
Special Discrete PDs
Binomial Approximation to Hypergeometric

When to Use:
When a random sample of size n is selected from a finite
population of size N, where N is large.
The population size is considered large if the sampling fraction
n/N is quite small (say, below 0.10).
How to Use:
If k ≥ n, use n trials with p = k/N.
If k < n, interchange the roles of k and n: use k trials with p = n/N.

Example 5.

It is estimated that 4,000 of the 10,000 voting residents of a town are against a new sales tax. If 15 eligible voters are selected at random and asked about their opinion, what is the probability that at most 7 favor the new tax?
Solution:
Let X= number of voters who favor …

Since N is large (n/N = 15/10,000 = 0.0015 < 0.10) and k = 6,000 ≥ n = 15, p = k/N = 6,000/10,000 = 0.6.

By binomial approximation to hypergeometric,
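The approximation in Example 5 can be checked numerically against the exact hypergeometric probability. This is a sketch under the assumption that X counts voters in favor (k = 6,000 of N = 10,000) in a sample of n = 15; the helper names are illustrative.

```python
from fractions import Fraction
from math import comb

N, k, n = 10_000, 6_000, 15  # population, voters in favor, sample size
p = k / N                    # success probability used in the approximation

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def hypergeom_pmf(x, N, k, n):
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

# P(X <= 7): sum over x = 0, 1, ..., 7
approx = sum(binom_pmf(x, n, p) for x in range(8))
exact = float(sum(hypergeom_pmf(x, N, k, n) for x in range(8)))

print(round(approx, 4))  # 0.2131
print(round(exact, 4))
```

The two values agree closely because the sampling fraction n/N is far below 0.10, so drawing without replacement barely changes the success probability from one draw to the next.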


Special Discrete PDs
Poisson
Experiment:
• observing the number of occurrences of some characteristic (successes) in time or space

Random Variable:
X = number of occurrences in a given time interval (or region of space)
  = 0, 1, 2, ...

Parameter: λ = mean number of occurrences

Assumptions:
• the numbers of occurrences in any two disjoint time intervals (or regions of space) are independent

• P(one occurrence during a very short time interval or small region of space) is proportional to the length of the interval or the size of the region

• P(more than 1 occurrence in a very short time interval or small region of space) is negligible
Examples:

• Number of patients arriving at a certain clinic during a 1-hr interval
• Arrivals of customers in a queue
• Number of flaws in a roll of fabric
• Number of typos per page of text


Example 6.
Hospital administrators in large cities anguish about
problems with traffic in emergency rooms in hospitals. For
a particular hospital in a large city, the staff on hand
cannot accommodate the patient traffic if there are more
than 10 emergency cases in a given hour. On the average,
5 emergencies arrive per hour.

a. What is the probability that in a given hour the staff can no longer accommodate the traffic?

b. What is the probability that more than 20 emergencies arrive during a 3-hour shift of personnel?
Solution:

Let X = number of emergency cases arriving in a given hour; then X has a Poisson distribution with λ = 5.
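Both parts of Example 6 can be evaluated with the sketch below. It assumes the standard Poisson pmf, p(x) = e^(−λ) λ^x / x!, and that the mean scales with the length of the interval, so a 3-hour shift has λ = 3 × 5 = 15; the helper names are illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson RV with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x): sum of the pmf over 0, 1, ..., x."""
    return sum(poisson_pmf(i, lam) for i in range(x + 1))

# a. Staff cannot cope if more than 10 cases arrive in one hour (lambda = 5)
p_overload = 1 - poisson_cdf(10, 5)

# b. More than 20 cases in a 3-hour shift (lambda = 3 * 5 = 15)
p_shift = 1 - poisson_cdf(20, 15)

print(round(p_overload, 4))  # 0.0137
print(round(p_shift, 4))     # 0.083
```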


Special Discrete PDs
Poisson Approximation to Binomial

When to use:
When n is very large and p is very small (far
from 0.5), in such a way that
np = λ remains constant.

How to use:

Calculate λ = np, then use the Poisson distribution with mean λ.



Example 7.

The probability that a person will die from a certain respiratory infection is 0.002. Find the probability that fewer than 5 of the 2,000 so infected will die.
Solution:

Let X = number of persons who die from the respiratory infection.
Since p = 0.002 is very close to zero and n =
2,000 is quite large, we shall approximate with the
Poisson distribution using

λ = (2,000)(0.002) = 4.
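The quality of the approximation in Example 7 can be checked by computing P(X < 5) = P(X ≤ 4) both ways. This is a sketch assuming the standard Poisson and binomial pmfs; the helper names are illustrative.

```python
from math import comb, exp, factorial

n, p = 2_000, 0.002
lam = n * p  # 4.0

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# "Fewer than 5" means x = 0, 1, 2, 3, 4
approx = sum(poisson_pmf(x, lam) for x in range(5))
exact = sum(binom_pmf(x, n, p) for x in range(5))

print(round(approx, 4))  # 0.6288
print(round(exact, 4))
```

With n this large and p this small, the exact binomial value differs from the Poisson value only in the later decimal places.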
Special Discrete PDs
Geometric vs Binomial
• Binomial: independent trials
  Geometric: independent trials

• Binomial: each trial can result in “success” or “failure”
  Geometric: each trial can result in “success” or “failure”

• Binomial: probability of success is constant
  Geometric: probability of success is constant

• Binomial: n trials, x are “successes”
  (successes occur at random from trial 1 to trial n)
  Geometric: trials are repeated until the first success occurs
  (the only success occurs at the last trial)
Special Discrete PDs
Geometric
Experiment:
• observing at which trial the “first success” occurs
• each trial is classified as success or failure

Random Variable:
X = the trial at which the first success occurs
  = 1, 2, 3, ...

Parameter: p = probability of success

Assumption: p is constant from one trial to another

Example: Tossing a fair coin and observing on which trial the first Tail occurs
Example 8.

According to a study published by a group of University of Massachusetts sociologists, about two-thirds of the 20 million persons in this country who take Valium are women.

Assuming this figure to be a valid estimate, find the probability that on a given day the fifth prescription written by a doctor for Valium is the first prescribing Valium for a woman.
Solution:

Let X = the prescription (trial) at which the first Valium prescription for a woman occurs, with p = 2/3.
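Example 8 can be evaluated as follows. This is a sketch assuming the standard geometric pmf, p(x) = p (1 − p)^(x − 1), with independent prescriptions: the first four go to men and the fifth goes to a woman.

```python
from fractions import Fraction

def geometric_pmf(x, p):
    """P(first success occurs on trial x): x - 1 failures, then one success."""
    return p * (1 - p) ** (x - 1)

p = Fraction(2, 3)  # probability that a prescription is for a woman

prob = geometric_pmf(5, p)

print(prob)                   # 2/243
print(round(float(prob), 4))  # 0.0082
```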


Special Discrete PDs
Geometric vs Negative Binomial

• Geometric: number of trials until the “1st success”
  Negative Binomial: number of trials until the “rth success”

Negative Binomial is an extension of Geometric

Negative Binomial is widely used to model count data when the Poisson model does not fit well due to over-dispersion: V(Y) > E(Y).
Special Discrete PDs
Negative Binomial
Experiment:
• observing at which trial the “rth success” occurs
• each trial is classified as success or failure

Random Variable:
X = the trial at which the rth success occurs
  = r, r+1, r+2, r+3, ...

Parameters: r = number of successes required
            p = probability of success

Assumption: p is constant from one trial to another

Example: Tossing a fair coin until the 5th Tail occurs


$p(x) = \binom{x-1}{r-1} p^{r} (1-p)^{x-r}$, x = r, r+1, r+2, r+3, ...

$E(X) = \dfrac{r}{p}$, $V(X) = \dfrac{r(1-p)}{p^{2}}$

Example 9.

According to a study published by a group of University of Massachusetts sociologists, about two-thirds of the 20 million persons in this country who take Valium are women.

Assuming this figure to be a valid estimate, find the probability that on a given day the fifth prescription written by a doctor for Valium is the third prescribing Valium for a woman.
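Example 9 can be worked with the negative binomial pmf, p(x) = C(x − 1, r − 1) p^r (1 − p)^(x − r). The sketch below assumes independent prescriptions with p = 2/3, r = 3, and x = 5; the helper name `neg_binom_pmf` is illustrative.

```python
from fractions import Fraction
from math import comb

def neg_binom_pmf(x, r, p):
    """P(the r-th success occurs on trial x)."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

p = Fraction(2, 3)  # probability that a prescription is for a woman
r, x = 3, 5         # third success on the fifth trial

prob = neg_binom_pmf(x, r, p)

print(prob)                   # 16/81
print(round(float(prob), 4))  # 0.1975

# Mean and variance of the trial number of the r-th success
print(r / p, r * (1 - p) / p**2)  # 9/2 9/4
```

Setting r = 1 reduces the formula to the geometric pmf, which is the sense in which the negative binomial extends the geometric.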