
# STAT 130

Nonparametric Statistics

Lecture 1
Probability Distribution and
Special Discrete PDs

## Department of Physical Sciences and Mathematics

College of Arts and Sciences
University of the Philippines Manila
Population, Experiment, Outcomes, Sample Space

Population: The set of all subjects under consideration
Example: Students in this class: Carlos, Chloe, Crissa, Cris

Experiment: A procedure carried out under a certain set of conditions that can be repeated under the same set of conditions
Outcomes: Results of the experiment
Sample Space (S): Set of all the possible outcomes

Examples:
Experiment 1: Selecting a student from this class (observe sex)
Sample Space: {M, F}
Experiment 2: Selecting 2 students from this class (observe sex)
Sample Space: {MM, MF, FM, FF}
Experiment 3: Tossing a coin
Sample Space: {H, T}
Experiment 4: Tossing two coins
Sample Space: {HH, HT, TH, TT}
Experiment 5: Rolling a die
Sample Space: {1, 2, 3, 4, 5, 6}
Experiment 6: Rolling 2 dice
Sample Space: {11, 12, 13, 14, 15, 16, 21, 22, …, 31, 32, …, 66}
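
The sample spaces for repeated trials can be listed mechanically. The sketch below is only an illustration, assuming each trial has the same set of faces and that order matters (so HT and TH are distinct outcomes); the helper name `sample_space` is hypothetical and not part of the lecture.

```python
from itertools import product

def sample_space(faces, repeats):
    """List every ordered outcome of `repeats` trials, each with the given faces."""
    return ["".join(str(face) for face in outcome)
            for outcome in product(faces, repeat=repeats)]

two_coins = sample_space("HT", 2)          # Experiment 4
two_dice = sample_space(range(1, 7), 2)    # Experiment 6

print(two_coins)
print(len(two_dice), two_dice[:3], two_dice[-1])
```

The two-dice listing has 6 × 6 = 36 outcomes, running from 11 to 66.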
Sample Space, Random Variable, Probability
Experiment. Selecting a student from a class of 10 students, observe sex
Random variable (X). A function that maps the outcomes to numerical
quantities (real numbers)
Define: X = number of males
Probability, P(x). A function that maps the values of X to [0, 1]

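| S | X | Probability in [0, 1] |
|---|---|---|
| M | 1 | 0.1 |
| M | 1 | 0.1 |
| F | 0 | 0.1 |
| F | 0 | 0.1 |
| M | 1 | 0.1 |
| M | 1 | 0.1 |
| M | 1 | 0.1 |
| F | 0 | 0.1 |
| M | 1 | 0.1 |
| F | 0 | 0.1 |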
Probability Distribution: Example
Probability distribution. Listing of all possible values of X together
with the corresponding probabilities

The probability distribution of the random variable X, obtained by adding the probabilities of the outcomes that share the same value of X:

| S | X | P(X) |
|---|---|---|
| M | 1 | 0.6 |
| F | 0 | 0.4 |
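
The step from the ten equally likely outcomes to the distribution of X can be sketched in code. This is a minimal illustration, assuming the ten students listed in the example (6 males, 4 females) are each selected with probability 1/10; the function names are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Sex of the ten students, in the order listed in the example.
students = ["M", "M", "F", "F", "M", "M", "M", "F", "M", "F"]

def number_of_males(outcome):
    """Random variable X: 1 if the selected student is male, 0 otherwise."""
    return 1 if outcome == "M" else 0

def distribution(outcomes, rv):
    """Add up the probability of equally likely outcomes for each value of rv."""
    prob_each = Fraction(1, len(outcomes))
    dist = defaultdict(Fraction)
    for outcome in outcomes:
        dist[rv(outcome)] += prob_each
    return dict(dist)

dist = distribution(students, number_of_males)
print(dist)
```

Exact fractions are used so that the probabilities add up to exactly 1.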
Probability Distribution: Example
Experiment: Flip a coin 2 times (or flipping 2 coins)
S: {HH, HT, TH, TT}
X: Number of tails
x: 0, 1, 2

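| S | X | P(X) |
|---|---|---|
| HH | 0 | 1/4 |
| HT | 1 | 1/4 |
| TH | 1 | 1/4 |
| TT | 2 | 1/4 |

The probability distribution of the random variable X:

| x | 0 | 1 | 2 |
|---|---|---|---|
| P(X = x) | 1/4 | 1/2 | 1/4 |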
Probability Distributions
Discrete: Probability mass function or pmf
Assigns probabilities (masses) to individual outcomes

Continuous: Probability density function or pdf
Assigns probabilities to ranges of outcomes (density)

Cumulative Distribution Function (cdf) or distribution function:
Total probability assigned to all values less than or equal to a given value: F(y) = P(Y ≤ y)
PMF vs CDF: Example 1
Experiment: Rolling 2 Dice (Red/Green)
Y = sum of the up faces
Possible values of Y: {2, 3, 4, ..., 12}
PMF vs CDF: Example 2
Experiment: Rolling 2 Dice

[Figure: PMF, p(y)]

[Figure: CDF, F(y)]
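
The pmf and cdf of Y can also be tabulated by enumeration. The sketch below assumes two fair six-sided dice, so that each of the 36 (red, green) pairs has probability 1/36; the function names are illustrative only.

```python
from fractions import Fraction
from itertools import product

def pmf_of_sum():
    """pmf of Y = red + green for two fair dice."""
    pmf = {}
    for red, green in product(range(1, 7), repeat=2):
        y = red + green
        pmf[y] = pmf.get(y, Fraction(0)) + Fraction(1, 36)
    return dict(sorted(pmf.items()))

def cdf_from_pmf(pmf):
    """cdf F(y) = P(Y <= y), built as a running total of the pmf."""
    cdf, running = {}, Fraction(0)
    for y in sorted(pmf):
        running += pmf[y]
        cdf[y] = running
    return cdf

pmf = pmf_of_sum()
cdf = cdf_from_pmf(pmf)
for y in pmf:
    print(y, pmf[y], cdf[y])
```

The pmf rises to its peak at y = 7 and falls again, while the cdf only increases and reaches 1 at y = 12.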
Mean, Variance, and SD
Mean (Expected Value): Average value of an RV (or function of an RV)

Variance: Average squared deviation between a realization of an RV (or function of an RV) and its mean

Standard Deviation: Positive square root of the Variance

Notations:
Mean: E(Y) = μ
Variance: V(Y) = σ²
Standard Deviation: σ
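
For a discrete RV these definitions become weighted sums, E(Y) = Σ y·p(y) and V(Y) = Σ (y − μ)²·p(y). The sketch below applies them to the number of tails in two coin flips; the function names are hypothetical and the pmf is the one from the earlier two-coin example.

```python
from fractions import Fraction
from math import sqrt

def mean(pmf):
    """E(Y): sum of y * p(y) over all values y."""
    return sum(y * p for y, p in pmf.items())

def variance(pmf):
    """V(Y): sum of (y - mu)^2 * p(y) over all values y."""
    mu = mean(pmf)
    return sum((y - mu) ** 2 * p for y, p in pmf.items())

# Number of tails in two flips of a fair coin.
tails = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mu = mean(tails)
var = variance(tails)
sd = sqrt(var)
print(mu, var, sd)
```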
Review!
1. What is an outcome?
2. What is a sample space?
3. What is a random variable?
4. What is a probability distribution?
5. Differentiate probability function and distribution function.
6. Relate frequency function with any one of these concepts.
7. Differentiate pmf and pdf.
8. Differentiate f(y) and F(y).
9. Prove each of the properties of the mean and variance.
10. What do mean and variance represent?
Discrete Probability Distribution
• probability distribution for a discrete random variable
• discrete RVs are those that take a discrete set of values
- generally take a finite set of values
Examples: number of heads (or tails)
number of cellphone numbers
- can also take a countably infinite set of values (0, 1, 2, 3, …)
Examples: population size
number of patients
Special Discrete PDs
Uniform Discrete
Experiment:
• single trial
• observe the outcome
Random Variable: X = x1, x2 , … , xk
= code/label/outcome
= 1, 2, ... , k
Parameter: k = number of possible outcomes
Assumption: Outcomes are equally likely: p(xi) = 1/k
Examples:
• Tossing a fair coin and observing the face up
X = 1, 2 (1 for a Head and 2 for a Tail): k = 2
• Tossing a fair die and observing the face up
X = 1, 2, ... , 6: k = 6
Special Discrete PDs
Uniform Discrete
Example 1.
A box contains 2 white balls, 2 blue balls, and 2
red balls. Draw a ball at random then observe the
color.

Define the random variable:
X=1 if the ball drawn is white;
X=2 if the ball drawn is blue; and
X=3 if the ball drawn is red.

Find the pmf, cdf, mean, and variance of X.

Special Discrete PDs
Uniform Discrete
Solution:
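
One way to work through Example 1 is sketched below. It assumes every ball is equally likely to be drawn, so each color has probability 2/6 = 1/3 and X is discrete uniform with k = 3; the variable names are illustrative only.

```python
from fractions import Fraction

# Ball counts from Example 1 and the coding of X given there.
balls = {"white": 2, "blue": 2, "red": 2}
code = {"white": 1, "blue": 2, "red": 3}

total = sum(balls.values())
pmf = {code[color]: Fraction(count, total) for color, count in balls.items()}

# cdf: running total of the pmf over increasing x
cdf, running = {}, Fraction(0)
for x in sorted(pmf):
    running += pmf[x]
    cdf[x] = running

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())

print(pmf, cdf, mean, variance)
```

For X = 1, 2, ..., k with equally likely values, the mean and variance agree with the closed forms (k + 1)/2 and (k² − 1)/12.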
Special Discrete PDs
Bernoulli
Experiment:
• single trial
• classify outcome as “success” or “failure”

Random Variable: X = 0, 1
= 0, if “failure” and
= 1, if “success”
= code/label/outcome

Assumption: There are only two outcomes (can be classified as Success or Failure).
Special Discrete PDs
Bernoulli
Examples:
• Tossing a fair coin: consider Head as “success”
X = 0, if outcome is a Tail
= 1, if outcome is a Head
p = 0.5 (Why?)

• Tossing a fair die: consider 1 or 2 as “success”
X = 0, if outcome is a 3, 4, 5, or 6
= 1, if outcome is a 1 or a 2
p = 1/3 (Why?)
Special Discrete PDs
Bernoulli
Example 2.
A box contains 7 red balls and 11 white balls.
Draw one ball at random from the box. Observe if
the color is red.
Let:
X= 0, if the ball drawn is white
= 1, if the ball drawn is red.
Give the pmf, cdf, mean, and variance of X.
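
A possible way to work Example 2 is sketched below, assuming each of the 18 balls is equally likely to be drawn, so that p = P(red) = 7/18. The mean p and variance p(1 − p) are the standard Bernoulli results; the variable names are illustrative.

```python
from fractions import Fraction

red, white = 7, 11
p = Fraction(red, red + white)      # probability of "success" (red)

pmf = {0: 1 - p, 1: p}
cdf = {0: pmf[0], 1: pmf[0] + pmf[1]}
mean = p                            # E(X) = p
variance = p * (1 - p)              # V(X) = p(1 - p)

print(pmf, cdf, mean, variance)
```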
Special Discrete PDs
Binomial
Experiment:
• n identical trials, n≥1
• classify outcome in each trial as “success” or
“failure”
• series of Bernoulli trials

Random Variable:
X = number of successes in n trials
= 0, 1, 2, ... , n
Parameters: n = number of trials
p = probability of success

Assumptions:
• p is constant for all trials: p1 = p2 = …
• trials are independent: P(1st and 2nd) = P(1st)·P(2nd)
Special Discrete PDs
Binomial

Examples:

= 0, 1, 2, 3

p = 1/2 (Why?)
• Tossing a fair die and observing the face up
X = number of 2’s
p = ?
Special Discrete PDs
Binomial
Example 3.
In a certain city, the need for money to buy drugs
is stated as the reason for 75% of all thefts.
Find the probability that among the next 5 theft cases reported in this district,
a. exactly 2 resulted from the need for money to buy drugs;
b. at most 3 resulted from the need for money to buy drugs;
c. give the mean and variance
Special Discrete PDs
Binomial
Solution: Let X = number of thefts due to the need for money to buy drugs

a.

b.

c.
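
The three parts of Example 3 can be checked numerically. The sketch below assumes X is binomial with n = 5 and p = 0.75, uses the standard binomial pmf C(n, x) pˣ (1 − p)ⁿ⁻ˣ, and the standard results E(X) = np and V(X) = np(1 − p); the helper name `binom_pmf` is hypothetical.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 5, 0.75

exactly_two = binom_pmf(2, n, p)                                # part a
at_most_three = sum(binom_pmf(x, n, p) for x in range(0, 4))    # part b
mean = n * p                                                    # part c
variance = n * p * (1 - p)

print(exactly_two, at_most_three, mean, variance)
```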
Special Discrete PDs
Hypergeometric vs Binomial
• Binomial: n trials or n samples with replacement
(to satisfy the requirement of “independence among samples”)
Hypergeometric: n samples without replacement
(“independence” is not required)

• Binomial: each trial/sample can fall into 2 categories, “success” or “failure”
Hypergeometric: each sample can fall into 2 categories, “success” or “failure”
Special Discrete PDs
Hypergeometric
Experiment:
• selecting n random samples from a population of size N
• each sample falls into one of two categories

Random Variable:
X = number of successes in the sample
= 0, 1, 2, ... , min(k, n)

Parameters: n = sample size
N = population size (finite)
k = number of successes in the population

Assumption: sampling is done without replacement

Special Discrete PDs
Hypergeometric
Example: Drawing 2 balls without replacement (WOR) from an urn containing 5 white balls and 3 red balls and observing the number of white balls included in the sample

$$p(x) = \frac{\binom{k}{x}\binom{N-k}{n-x}}{\binom{N}{n}}$$

where:
N − k = number of failures in the population

Special Discrete PDs
Hypergeometric
Example 4.

The members of a committee of size 3 were randomly selected from 4 doctors and 2 nurses.

1. Write a formula for the probability distribution of the random variable representing the number of doctors on the committee and

2. Find .
Special Discrete PDs
Hypergeometric
Solution: Let X = number of doctors on the committee

a.

b.
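
The distribution asked for in the first part of Example 4 can be tabulated as a check. The sketch below assumes the standard hypergeometric pmf C(k, x) C(N − k, n − x) / C(N, n) with N = 6, k = 4 doctors, and n = 3; the helper name `hypergeom_pmf` is hypothetical, and only the first part of the example is covered.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(x, N, k, n):
    """P(X = x) when n items are drawn without replacement from N, k of them successes."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))

N, k, n = 6, 4, 3   # 6 people, 4 doctors, committee of 3
pmf = {x: hypergeom_pmf(x, N, k, n) for x in range(0, min(k, n) + 1)}

for x, prob in pmf.items():
    print(x, prob)
```

With only 2 nurses available, a committee of 3 must include at least one doctor, so the formula assigns probability 0 to x = 0.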
Special Discrete PDs
Binomial Approximation to Hypergeometric

When to Use:
Use when a random sample of size n is selected from a finite population of size N, where N is large.
The population size is considered large if the sampling fraction n/N is quite small (say, below 0.10).
How to Use:
If k ≥ n, use p = k/N.
If k < n, use p = n/N.
Special Discrete PDs
Binomial Approximation to Hypergeometric

Example 5.

It is estimated that 4,000 of the 10,000 voting residents of a town are against a new sales tax. If 15 eligible voters are selected at random, what is the probability that at most 7 favor the new tax?
Special Discrete PDs
Binomial Approximation to Hypergeometric
Solution:
Let X = number of voters who favor …

By binomial approximation to hypergeometric,
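
The approximation in Example 5 can be compared with the exact hypergeometric value. The sketch below assumes that the 6,000 residents who are not against the tax all favor it, so k = 6,000 and p = k/N = 0.6; that reading and the helper names are assumptions made for illustration.

```python
from math import comb

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(c + 1))

def hypergeom_cdf(c, N, k, n):
    """P(X <= c) when n items are drawn without replacement from N, k of them successes."""
    return sum(comb(k, x) * comb(N - k, n - x) for x in range(c + 1)) / comb(N, n)

N, k, n = 10_000, 6_000, 15     # assumed: 6,000 voters favor the tax
approx = binom_cdf(7, n, k / N)
exact = hypergeom_cdf(7, N, k, n)

print(round(approx, 4), round(exact, 4))
```

The sampling fraction here is 15/10,000 = 0.0015, well below 0.10, so the two values should be close.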

Special Discrete PDs
Poisson
Experiment:
• observing the number of occurrences of some characteristic (successes) in time or space

Random Variable:
X = number of occurrences in a given time interval (or region of space)
= 0, 1, 2, ...
Parameter: λ = mean number of occurrences
Special Discrete PDs
Poisson
Assumptions:
• the numbers of occurrences are independent for any two disjoint time intervals (or regions of space)
• P(one outcome during a certain time interval/space) is proportional to the length of the time interval/space
• P(more than 1 outcome in a very short time interval) is negligible
Special Discrete PDs
Poisson
Examples:

1-hr interval

• Number of typos per page of text

Special Discrete PDs
Poisson
Example 6.
problems with traffic in emergency rooms in hospitals. For
a particular hospital in a large city, the staff on hand
cannot accommodate the patient traffic if there are more
than 10 emergency cases in a given hour. On the average,
5 emergencies arrive per hour.

a. What is the probability that in a given hour the staff can no longer accommodate the traffic?

b. What is the probability that more than 20 emergencies arrive during a 3-hour shift of personnel?
Special Discrete PDs
Poisson
Solution:

Let X = number of patient arrivals, then
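
Both parts of Example 6 can be checked numerically. The sketch below assumes the standard Poisson pmf e^(−λ) λˣ / x!, with λ = 5 for one hour and λ = 3 × 5 = 15 for a 3-hour shift; the helper names are hypothetical.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(c, lam):
    """P(X <= c) for X ~ Poisson(lam)."""
    return sum(poisson_pmf(x, lam) for x in range(c + 1))

# a. more than 10 emergencies in one hour (lambda = 5)
part_a = 1 - poisson_cdf(10, 5)

# b. more than 20 emergencies in a 3-hour shift (lambda = 3 * 5 = 15)
part_b = 1 - poisson_cdf(20, 15)

print(round(part_a, 4), round(part_b, 4))
```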

Special Discrete PDs
Poisson Approximation to Binomial

When to use:
When n is very large, particularly when p
deviates markedly from 0.5, in such a way that
np = λ is a constant.

How to use:
Calculate λ = np.

Special Discrete PDs
Poisson Approximation to Binomial

Example 7.

The probability that a person will die from a certain respiratory infection is 0.002. Find the probability that fewer than 5 of the 2,000 so infected will die.
Solution:
Let X = number of persons who die from the respiratory infection, then
Special Discrete PDs
Poisson Approximation to Binomial
Since p = 0.002 is very close to zero and n = 2,000 is quite large, we shall approximate with the Poisson distribution using

λ = (2,000)(0.002) = 4.
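
The approximation in Example 7 can be compared with the exact binomial value. The sketch below treats "fewer than 5" as P(X ≤ 4) and uses the standard Poisson and binomial pmfs; the helper names are hypothetical.

```python
from math import comb, exp, factorial

def poisson_cdf(c, lam):
    """P(X <= c) for X ~ Poisson(lam)."""
    return sum(exp(-lam) * lam**x / factorial(x) for x in range(c + 1))

def binom_cdf(c, n, p):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(c + 1))

n, p = 2000, 0.002
lam = n * p                       # lambda = np = 4

approx = poisson_cdf(4, lam)      # fewer than 5 means X <= 4
exact = binom_cdf(4, n, p)

print(round(approx, 4), round(exact, 4))
```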
Special Discrete PDs
Geometric vs Binomial
• Binomial: independent trials
Geometric: independent trials

• Binomial: each trial can result in “success” or “failure”
Geometric: each trial can result in “success” or “failure”

• Binomial: probability of success is constant
Geometric: probability of success is constant

• Binomial: n trials, x are “successes”
(successes occur at random from trial 1 to trial n)
Geometric: trials are repeated until the first success occurs (the only success occurs at the last trial)
Special Discrete PDs
Geometric
Experiment:
• observing at which trial the “first success” occurs
• each trial is classified as success or failure

Random Variable:
X = the trial at which the first success occurs
= 1, 2, 3, ...

Example: Tossing a fair coin and observing on which trial the first Tail occurred
Special Discrete PDs
Geometric
Example 8.

According to a study published by a group of University of Massachusetts sociologists, about two-thirds of the 20 million persons in this country who take Valium are women.

Assuming this figure to be a valid estimate, find the probability that on a given day the fifth prescription written by a doctor for Valium is the first prescribing Valium for a woman.
Special Discrete PDs
Geometric
Solution:

Let X = number of prescriptions written up to and including the first one for a woman, then
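
Example 8 can be checked with the standard geometric pmf p(x) = p(1 − p)ˣ⁻¹: four prescriptions not for a woman followed by one for a woman. The sketch assumes prescriptions are independent with p = 2/3; the helper name `geometric_pmf` is hypothetical.

```python
from fractions import Fraction

def geometric_pmf(x, p):
    """P(first success occurs on trial x) = p * (1 - p)**(x - 1), for x = 1, 2, ..."""
    return p * (1 - p) ** (x - 1)

p_woman = Fraction(2, 3)
answer = geometric_pmf(5, p_woman)

print(answer, float(answer))
```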

Special Discrete PDs
Geometric vs Negative Binomial

• Geometric: number of trials until the “1st success”
Negative Binomial: number of trials until the “rth success”

Negative Binomial is widely used to model count data when the Poisson model does not fit well due to over-dispersion: V(Y) > E(Y).
Special Discrete PDs
Negative Binomial
Experiment:
• observing at which trial the “rth success” occurs
• each trial is classified as success or failure

Random Variable:
X = the trial at which the rth success occurs
= r, r+1, r+2, r+3, ...

Example: Tossing a fair coin until the 5th Tail occurred

Special Discrete PDs
Negative Binomial
$$p(x) = \binom{x-1}{r-1} p^{r} (1-p)^{x-r}, \quad x = r, r+1, r+2, r+3, \ldots$$

$$E(X) = \frac{r}{p}, \qquad V(X) = \frac{r(1-p)}{p^{2}}$$

Example 9.

According to a study published by a group of University of Massachusetts sociologists, about two-thirds of the 20 million persons in this country who take Valium are women.

Assuming this figure to be a valid estimate, find the probability that on a given day the fifth prescription written by a doctor for Valium is the third prescribing Valium for a woman.
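
Example 9 can be checked with the negative binomial pmf, taking x = 5, r = 3, and p = 2/3. The sketch assumes prescriptions are independent; the helper name `neg_binomial_pmf` is hypothetical, and the mean and variance lines use E(X) = r/p and V(X) = r(1 − p)/p².

```python
from fractions import Fraction
from math import comb

def neg_binomial_pmf(x, r, p):
    """P(the r-th success occurs on trial x), for x = r, r+1, ..."""
    return comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

p_woman = Fraction(2, 3)
r = 3

answer = neg_binomial_pmf(5, r, p_woman)
mean = r / p_woman                            # E(X) = r / p
variance = r * (1 - p_woman) / p_woman**2     # V(X) = r(1 - p) / p^2

print(answer, float(answer), mean, variance)
```

Setting r = 1 reduces the pmf to the geometric case.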