
Discrete Probability Distributions

Because they arise in a large number of applications, three
discrete probability distributions serve as standard models.
The binomial and the Poisson probability distributions are
the two that we will discuss.
* What are the definitions of those models?
* What are their probability distributions?
* What are their mean and variance?
* Applications of those models
Properties of the binomial distribution:
1. The experiment consists of n identical trials
2. Each trial has only two outcomes
3. The probability of one outcome is p and the
other is q=1-p
4. The trials are independent
5. We are interested in x, the number of successes
observed during the n trials.
Example we worked on in class
A survey of the French found that a proportion of 0.43
had an upbeat view of the US.

We survey 3, 5, or 100 people. What is the
probability that ____ of them have an
upbeat view of the US?
Binomial distribution:

$$P(x) = C_x^n \, p^x q^{n-x}, \qquad C_x^n = \frac{n!}{x!\,(n-x)!}$$

$$\mu = np, \qquad \sigma^2 = npq, \qquad \sigma = \sqrt{npq}$$

Intuition
Problem: basketball shooting
One player stands at the foul line and shoots 10 free
throws. Suppose the probability that he makes each
shot is 0.5.
What is the probability that he makes 6 out of 10?
What are the mean and variance?
Step 1 Does this meet the criteria of a
binomial distribution?
Step 2 Define variables and use formulas
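A minimal sketch of Step 2 for the free-throw problem, assuming the shots are independent with the same success probability (the binomial criteria from Step 1). The function name `binomial_pmf` is a hypothetical helper, not something from the course materials.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.5                  # 10 free throws, each made with probability 0.5
prob_six = binomial_pmf(6, n, p)
mean = n * p                    # mu = np
variance = n * p * (1 - p)      # sigma^2 = npq

print(f"P(x = 6) = {prob_six:.4f}")
print(f"mean = {mean}, variance = {variance}")
```

The probability comes out to 210/1024, about 0.205, with mean 5 and variance 2.5.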
Problem: Vote
A committee consisting of 5 members votes on
whether or not to hire a new professor. The
probability that each member votes for candidate A
is 0.6. Candidate A receives the offer only if over
half of the committee agrees to hire her.
1) What is the probability that candidate A gets
an offer?
2) What is the probability that candidate A does
not get the offer?
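One way to check the committee problem, assuming the five votes are independent: "over half" of 5 members means 3, 4, or 5 votes, so the offer probability is a sum of three binomial terms. The variable names are illustrative only.

```python
from math import comb

n, p = 5, 0.6   # 5 members, each votes for candidate A with probability 0.6

# "Over half" of 5 members means x = 3, 4, or 5 votes for candidate A.
p_offer = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(3, n + 1))

# The two outcomes are complementary.
p_no_offer = 1 - p_offer

print(f"P(offer)    = {p_offer:.5f}")
print(f"P(no offer) = {p_no_offer:.5f}")
```

The two answers are roughly 0.683 and 0.317.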
The Binomial Table
Pick the appropriate table
Each entry is the probability that x<a
You may need to subtract to get what you
need

Poisson Probability Distribution:
*Poisson distribution: a distribution of the
number of rare events that occur in a unit of
time, distance, space, and so on.
Examples:
number of insurance claims in a unit of time
number of accidents on a ten-mile stretch of highway
number of airplane crashes in the triangle area.

The Poisson distribution:

$$P(x) = \frac{\mu^x e^{-\mu}}{x!}$$

mean: $\mu$, var: $\sigma^2 = \mu$

Where x = number of rare events per unit of time,
distance, space, and so forth, and e = 2.71828
**You don't need to remember those formulas; if they are on the test,
the formulas will be given
Poisson Example
Serious worker injuries at a steel-fabricating
company average 2.7 per year. Given that safety
conditions at the plant remain the same next year,
what is the probability that the number of
serious injuries is less than 2?
Step 1 Is it a Poisson?
Step 2 Identify the mean and x and use the formula
Note - there is also a table for this distribution which works
much like the binomial table.
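A short sketch of Step 2 for the injury example, assuming the yearly count follows a Poisson distribution with mean 2.7. "Less than 2" means x = 0 or x = 1. The helper `poisson_pmf` is a hypothetical name.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """Probability of exactly x events when the mean count is mu."""
    return mu**x * exp(-mu) / factorial(x)

mu = 2.7   # average number of serious injuries per year

# Fewer than 2 injuries means 0 or 1 injuries.
p_less_than_2 = poisson_pmf(0, mu) + poisson_pmf(1, mu)

print(f"P(x < 2) = {p_less_than_2:.4f}")
```

The result is about 0.249, which should agree with the Poisson table entry for the same mean.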


Do an article turned in

Continuous Random Variable probability
distribution
A continuous random variable (time, height, distance, etc.)
can take infinitely many values. If we assigned a positive
probability to each value, the probabilities would not
sum to one.
*Probability distribution (density function):
a function or curve f(x) such that the probability that
x falls in the interval a<x<b is the area under the
curve f(x) between the two points a and b.
Normal distribution

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \qquad -\infty < x < \infty$$

mean $= \mu$, variance $= \sigma^2$

The e and $\pi$ are constants given by 2.7183 and 3.1416
What does the normal distribution look like?
1) The mean is located in the center of the distribution
2) The distribution is symmetric about its mean
3) The shape of the distribution is determined by the standard deviation:
a large SD reduces the height and increases the spread
of the curve; a small SD increases the height and
reduces the spread of the curve.
4) Almost all of the distribution lies within 3 standard deviations of
the mean
5) The total area under the curve is 1
6) The curve extends indefinitely in both directions,
approaching, but never touching, the horizontal axis as it
does so.

Major facts about normally distributed
variables and normal-curve areas:
* Once we know the mean and the standard deviation
of a normally distributed variable, we know its
distribution and associated normal curve. The
mean and standard deviation are the normal
distribution's sufficient statistics: they
completely define the variable's distribution.
* The probability that a normally distributed variable
assumes a value between a and b is equal to the
area under the curve between a and b
Calculation of the probability that a normal
random variable lies within some interval

Theoretically, we need to calculate the area under
the curve between the two end points of the interval.
Integration for each and every different normal
curve

Due to the complication of calculation and the
frequency in which it is done, a standardized way
has been derived
*Standard normal distribution: the normal
distribution with mean=0 and standard
deviation =1.

*Standard normal random variable: The normal
random variable with the standard normal
distribution is called standard normal random
variable.
* Basic properties of the standard normal curve
1) All the criteria from the normal curve
2) The curve is symmetric around 0;
3) Almost all the area under the curve lies
between -3 and 3
4) A table has been set up that has the areas under
this curve already calculated (page 610)
Problem: z is a standard normal random variable, a is
some constant greater than 0. The table shows
$P(0 \le z \le a)$
Using the table, answer the following questions:
Note: Always draw the distribution so you get it right
1) $P(0 \le z \le 1.63)$
2) $P(-0.5 \le z \le 1.0)$
3) P(z -0.53)
4) P(z 1.25)
5) $P(0 \le z \le 4.5)$
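The table lookups can be cross-checked numerically. This sketch uses Python's standard-library `statistics.NormalDist` in place of the printed table. The direction of the inequality in items 3 and 4 is not legible in these notes, so both tails are printed for those two, which is an assumption about what was asked.

```python
from statistics import NormalDist

z = NormalDist()   # standard normal: mean 0, standard deviation 1

def area_between(lo, hi):
    """Area under the standard normal curve between lo and hi."""
    return z.cdf(hi) - z.cdf(lo)

p1 = area_between(0, 1.63)      # item 1
p2 = area_between(-0.5, 1.0)    # item 2
p5 = area_between(0, 4.5)       # item 5, essentially half the total area

# Items 3 and 4: both the left-tail and right-tail readings are shown.
left_3, right_3 = z.cdf(-0.53), 1 - z.cdf(-0.53)
left_4, right_4 = z.cdf(1.25), 1 - z.cdf(1.25)

print(f"1) {p1:.4f}  2) {p2:.4f}  5) {p5:.4f}")
print(f"3) left tail {left_3:.4f}, right tail {right_3:.4f}")
print(f"4) left tail {left_4:.4f}, right tail {right_4:.4f}")
```

Drawing the curve first, as the note suggests, makes it clear which of the two tail values applies.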
Nonstandard Normal Random Variable Case
*Given a random variable x with a normal
distribution with mean $\mu$ and standard deviation $\sigma$,
the standardized random variable

$$z = \frac{x - \mu}{\sigma}$$

will have the standard normal distribution.
This process is called standardization.
*Important property:

$$P(a \le x \le b) = P\left(\frac{a-\mu}{\sigma} \le z \le \frac{b-\mu}{\sigma}\right)$$
Example: a variable x is normally distributed with
mean $\mu$=10 and standard deviation $\sigma$=2. Find
these probabilities:
1) P (x>13.5)
2) P (x<8.2)
3) P (9.4<x<10.6)
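The three probabilities can be sketched by standardizing each endpoint and then using the standard normal curve. `statistics.NormalDist` stands in for the table here, and `standardize` is a hypothetical helper name.

```python
from statistics import NormalDist

mu, sigma = 10, 2
std_normal = NormalDist()   # mean 0, standard deviation 1

def standardize(x):
    """Convert x to its z-score: (x - mean) / standard deviation."""
    return (x - mu) / sigma

p1 = 1 - std_normal.cdf(standardize(13.5))   # P(x > 13.5), z = 1.75
p2 = std_normal.cdf(standardize(8.2))        # P(x < 8.2),  z = -0.90
p3 = (std_normal.cdf(standardize(10.6))      # P(9.4 < x < 10.6), z from -0.30 to 0.30
      - std_normal.cdf(standardize(9.4)))

print(f"1) {p1:.4f}  2) {p2:.4f}  3) {p3:.4f}")
```

These come out near 0.0401, 0.1841, and 0.2358.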
Problem: grade distribution
Suppose the final grades in Econ 70 last semester
were normally distributed. The mean is 75 and the
standard deviation is 10.
1) What is the probability that one student
gets 95?
2) 95% of students will get grades in
which interval?
3) What would you need to score to do
better than 90% of the class?
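A hedged sketch of the grades problem using `statistics.NormalDist`. Two assumptions are made here. First, for a continuous variable the probability of exactly 95 is zero, so question 1 is read as "95 or higher". Second, question 2 is read as the central 95% interval.

```python
from statistics import NormalDist

grades = NormalDist(mu=75, sigma=10)

# 1) Read as P(x >= 95); an exact score has probability 0 for a continuous variable.
p_at_least_95 = 1 - grades.cdf(95)

# 2) Central interval containing 95% of grades (2.5% cut from each tail).
low, high = grades.inv_cdf(0.025), grades.inv_cdf(0.975)

# 3) Score that beats 90% of the class: the 90th percentile.
cutoff_90 = grades.inv_cdf(0.90)

print(f"1) P(x >= 95) = {p_at_least_95:.4f}")
print(f"2) central 95% interval: ({low:.1f}, {high:.1f})")
print(f"3) 90th percentile: {cutoff_90:.1f}")
```

With the table, the same answers come from working backwards: find the z-value for the required area, then convert with x = mean + z * standard deviation.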
Normal Approximation of the Binomial
Probability Distribution
Recall the binomial discrete distribution

$$P(x) = C_x^n \, p^x q^{n-x}, \qquad C_x^n = \frac{n!}{x!\,(n-x)!}$$

It would be nice if we didn't have to go through all
of these calculations
[Figure: Binomial distribution with n=10, p=0.5; x from 0 to 10 on the horizontal axis, probability on the vertical axis, peak value 0.2461]
[Figure: Binomial distributions for varying probabilities
of success and sample sizes]
Note that when p is far from 0.5 the binomial
distribution is skewed. However, as the sample size
(i.e. the number of trials) rises, even these skewed
distributions become more symmetrical and closer
to normal
So under certain circumstances the continuous
normal distribution is a good approximation of the
discrete binomial distribution.
When the probability p of the binomial distribution is
near 0 or 1, or n (the number of trials) is small, the
binomial distribution will be nonsymmetrical and
the normal will not give a good approximation.
To determine when the normal approximation will
be adequate:
Rule of thumb: calculate np and n(1-p); both of them should
be greater than or equal to 5

The normal approximation to the binomial
probability distribution
Approximate the binomial probability distribution
by using a normal curve with

$$\mu = np, \qquad \sigma = \sqrt{npq}$$

Where n=number of trials
p=probability of success on a single trial
q=1-p
Recall: $\mu$ and $\sigma$ are all you need to define the
normal distribution
Example: a binomial distribution with n=10, p=0.5.
1) Calculate the probability that x=2, 3, or 4 by using
the binomial distribution formula
2) Is this an appropriate place to use the normal
approximation?
3) Calculate the probability that x=2, 3, or 4 by using
the normal approximation.

Correction for Continuity:
The process of adding or subtracting 0.5 when adjusting
the values of x for the binomial distribution to those
for the approximating normal distribution is called
correction for continuity.
*Why do we need to adjust?
Look at the graph
*When do we add and when do we subtract 0.5?
Beginning point -0.5 and end point +0.5

Go back to the example and finish question 3.
Compare the answers to question 1 and question 3;
are they close?
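The comparison for the n=10, p=0.5 example can be sketched as follows. The exact answer sums the binomial formula over x = 2, 3, 4. The approximation takes the area under the normal curve from 1.5 to 4.5, which is the beginning point minus 0.5 and the end point plus 0.5. `statistics.NormalDist` replaces the table here.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 10, 0.5
q = 1 - p

# Question 1: exact binomial probability that x is 2, 3, or 4.
exact = sum(comb(n, x) * p**x * q**(n - x) for x in (2, 3, 4))

# Question 2: rule of thumb, both np and nq at least 5.
rule_ok = n * p >= 5 and n * q >= 5

# Question 3: normal curve with mean np and standard deviation sqrt(npq),
# using the continuity correction 2 - 0.5 = 1.5 and 4 + 0.5 = 4.5.
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
approx = approx_dist.cdf(4.5) - approx_dist.cdf(1.5)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}, rule ok: {rule_ok}")
```

The two values differ by less than 0.01 even though np and nq sit right at the rule-of-thumb boundary of 5.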

Procedure for using Normal Approximation to the
Binomial Distribution:
Step 1: Determine n, the number of trials, and p, the success
probability.
Step 2: Determine whether both np and n(1-p) are 5 or
greater. If they are not, do not use the normal approximation.
(Rule of Thumb)
Step 3: Find the mean and standard deviation, using the formulas
$\mu = np$ and $\sigma = \sqrt{npq}$
Step 4: Make the correction for continuity and find the
required area under the normal curve.
Problem: fuse test
The reliability of a fuse is 0.98 (i.e. test 100 fuses
and average 2 defectives). Now we test 1000
fuses.
What is the probability that we find more than 27
defective products?
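A sketch of the four-step procedure applied to the fuse test. It assumes "success" means a defective fuse, so p = 1 - 0.98 = 0.02, and that "more than 27" means 28 or more, which becomes 27.5 after the continuity correction.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: number of trials and probability that a fuse is defective.
n, p = 1000, 0.02
q = 1 - p

# Step 2: rule of thumb, both np and nq at least 5.
rule_ok = n * p >= 5 and n * q >= 5

# Step 3: mean and standard deviation of the approximating normal curve.
mu = n * p
sigma = sqrt(n * p * q)

# Step 4: more than 27 defectives means 28 or more; start the area at 27.5.
p_more_than_27 = 1 - NormalDist(mu, sigma).cdf(27.5)

print(f"mu = {mu}, sigma = {sigma:.3f}, P(x > 27) is about {p_more_than_27:.4f}")
```

The z-value is about 1.69, giving a probability near 0.045.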
Mean and variance of continuous random variable

The mean of random variable x:

$$E(x) = \int_{-\infty}^{\infty} x f(x)\,dx$$

The expectation of a function of x, g(x):

$$E(g(x)) = \int_{-\infty}^{\infty} g(x) f(x)\,dx$$

The expectation of $x^2$:

$$E(x^2) = \int_{-\infty}^{\infty} x^2 f(x)\,dx$$

The variance of x:

$$V(x) = \int_{-\infty}^{\infty} (x-\mu)^2 f(x)\,dx$$

You should verify that the formulas for the mean and variance of
a continuous distribution are the same as those given earlier
for discrete distributions, except that integral signs are
substituted for summation signs and f(x)dx is substituted for
p(x)

Uniform distribution:

$$f(x) = \begin{cases} 1/(b-a) & \text{when } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

$$\mu = \frac{1}{2}(a+b), \qquad \sigma^2 = \frac{(b-a)^2}{12}$$

*What does the uniform density function look like?
*What are the sufficient statistics of the uniform distribution?

Calculation of the probability

For a uniform random variable on [a, b],
the probability that x falls between c and d is
P(c<x<d)=(d-c)/(b-a), if c and d lie within [a,b]

Example: Cutting 8-foot planks of lumber leaves
scrap pieces whose lengths are uniformly distributed
between 1 and 11 inches.
1) What is the expected length of the scrap?
2) What is the probability that a scrap is more than
5 inches long?
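A small sketch of the scrap-length example. The helper `uniform_prob` is a hypothetical name. It also clips c and d to [a, b], which goes slightly beyond the stated formula (that formula assumes c and d already lie inside the interval).

```python
a, b = 1.0, 11.0   # scrap length is uniform between 1 and 11 inches

def uniform_prob(c, d):
    """P(c < x < d) for x uniform on [a, b]; c and d are clipped to [a, b]."""
    lo, hi = max(c, a), min(d, b)
    return max(hi - lo, 0.0) / (b - a)

expected_length = (a + b) / 2        # mean of a uniform distribution
p_more_than_5 = uniform_prob(5, b)   # P(x > 5) = (11 - 5) / (11 - 1)

print(f"expected length = {expected_length} inches")
print(f"P(x > 5) = {p_more_than_5}")
```

The expected scrap length is 6 inches and the probability is 0.6.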

The Exponential Probability Distribution
*The exponential distribution is a model for random
variables that represent waiting times: the
time until a machine breaks down, the
waiting time in a service line, and the length
of life of a piece of industrial equipment.
*Density function:

$$f(x) = \begin{cases} \lambda e^{-\lambda x} & \lambda > 0,\ x \ge 0 \\ 0 & \text{otherwise} \end{cases}$$

$$\mu = 1/\lambda, \qquad \sigma = 1/\lambda$$

Note: How is this different from the Poisson?
*Sufficient statistics:
*Shape of Distribution:
*Calculation of probability:

$$P(x > a) = e^{-\lambda a}, \qquad a \ge 0 \text{ and } \lambda > 0$$

*Example: suppose that the delivery time of an item after
placing a factory order follows an exponential distribution
with a mean of 10 days. What is the probability that it takes
longer than 3 weeks (21 days) from the date of order to the
day of delivery? (book example on page 180)
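A quick numerical check of the delivery-time example. It assumes the usual rate parameterization, in which the rate (called `lam` here, a name chosen for illustration) is 1 divided by the mean waiting time, so that P(x > a) = exp(-lam * a).

```python
from math import exp

mean_days = 10
lam = 1 / mean_days   # rate per day, assuming mean = 1 / rate

# Probability that delivery takes longer than 21 days.
p_longer_than_21 = exp(-lam * 21)

print(f"P(x > 21) = {p_longer_than_21:.4f}")
```

This gives e to the power -2.1, about 0.12.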

Sampling Distribution

* Why do we use samples?
Due to data availability and cost, we have
to use samples instead of population data.

* Why do we need to know the sampling distribution?
Before we use a sample to make inferences about
the population, we need to know the features of
the sampling distribution.

Sampling Distribution of Statistics

*Statistics: numerical descriptive measures
calculated from a sample are called statistics.
We will be using means and proportions.

*Sampling distribution of a statistic (mean or
proportion): the probability distribution of
all possible values of the statistic that result
when random samples of size n are
repeatedly drawn from the population.

Sampling Distribution of the
Sample Mean
You are taking a population (N) and pulling
out of that population a sample of size (n)
then calculating its mean
You do this for all possible combinations of
n out of N and you construct the sampling
distribution of the sample mean from those
means
Sampling Distribution of the
Sample Mean
Example: from Berenson and Levine 1996
A certain population consists of 4 typists; the typists make
3, 2, 1, and 4 mistakes respectively. You take a
random sample of two typists from that population
with replacement; order matters.

1) What are the population mean and standard deviation?
2) What are all the possible samples?
3) Construct the sampling distribution of the sample mean
for this example. What are the mean and
standard deviation of the sample mean?
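The typist example is small enough to enumerate by brute force. This sketch lists all 16 ordered samples of size 2 drawn with replacement and computes the mean and standard deviation of the resulting sample means. Population-style formulas (dividing by the count, not the count minus one) are used throughout, which is an assumption about the intended calculation.

```python
from itertools import product
from math import sqrt

population = [3, 2, 1, 4]   # mistakes made by each of the 4 typists
N = len(population)

# 1) Population mean and standard deviation.
pop_mean = sum(population) / N
pop_sd = sqrt(sum((v - pop_mean) ** 2 for v in population) / N)

# 2) All ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# 3) Sampling distribution of the sample mean.
sample_means = [sum(s) / 2 for s in samples]
mean_of_means = sum(sample_means) / len(sample_means)
sd_of_means = sqrt(
    sum((m - mean_of_means) ** 2 for m in sample_means) / len(sample_means)
)

print(f"population: mean = {pop_mean}, sd = {pop_sd:.4f}")
print(f"{len(samples)} samples; mean of means = {mean_of_means}, sd = {sd_of_means:.4f}")
```

The mean of the sample means equals the population mean, and their standard deviation equals the population standard deviation divided by the square root of 2.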

Some facts about the sampling distribution of
the sample mean:
Fact 1: if a random sample of n measurements is selected
from a population with mean $\mu$ and standard
deviation $\sigma$, the sampling distribution of the
sample mean $\bar{x}$ will possess a mean

$$\mu_{\bar{x}} = \mu$$

and standard deviation (called the standard error) for a
sufficiently large population

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$
Fact 2: if the population possesses a normal
distribution, then the sampling distribution of
$\bar{x}$ will be exactly normally distributed,
regardless of the sample size n

Fact 3: if the population is non-normal, the
sampling distribution of $\bar{x}$ will be closer and
closer to a normal distribution as the
sample size n rises
The Central Limit Theorem
If random samples of n observations are drawn
from a non-normal population with finite mean $\mu$
and standard deviation $\sigma$, then when n is large,
the sampling distribution of the sample mean $\bar{x}$ is
approximately normally distributed with mean and
standard deviation:

$$\mu_{\bar{x}} = \mu \quad \text{and} \quad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

The approximation will become more and more
accurate as n becomes larger and larger
*A rule of thumb: sampling distributions of $\bar{x}$
will be approximately normal for sample sizes
as small as n=25 for most populations of
measurements.
*Example: suppose that you select a random
sample of n=25 from a population with
mean=8 and standard deviation=0.6.
1) Find the approximate probability that the
sample mean $\bar{x}$ will be less than 7.9
2) Find the approximate probability that the sample
mean $\bar{x}$ will lie within 0.1 of the population
mean 8
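A sketch of the n=25 example, assuming the Central Limit Theorem applies so that the sample mean is approximately normal with the population mean and a standard error equal to the population standard deviation divided by the square root of n. `statistics.NormalDist` replaces the table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 0.6, 25

# Standard error of the sample mean.
standard_error = sigma / sqrt(n)

# Approximate sampling distribution of the sample mean.
xbar = NormalDist(mu, standard_error)

p_below = xbar.cdf(7.9)                    # 1) P(sample mean < 7.9)
p_within = xbar.cdf(8.1) - xbar.cdf(7.9)   # 2) P(7.9 < sample mean < 8.1)

print(f"standard error = {standard_error:.2f}")
print(f"1) {p_below:.4f}  2) {p_within:.4f}")
```

The standard error is 0.12, so 7.9 is about 0.83 standard errors below the mean.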
The sampling distribution of a sample
proportion

*Sample proportion: if a random sample of n objects
is selected from the population and x of
these possess the specified characteristic,
then the sample proportion is:

$$\hat{p} = \frac{x}{n}$$

Note: We are working with binomial distributions
here; this is useful with survey response data
Sampling distribution of a
proportion
It is known that the population proportion of
those who like economics is 0.5. You go
and ask 6 people if they like economics.
Create the probability distribution table for
your random variable, the proportion of those
who like economics.
What are the expected value (average) and
standard deviation of your random variable?
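The probability table for the six-person survey can be sketched directly from the binomial formula, assuming the six answers are independent. Each possible count x from 0 to 6 corresponds to a sample proportion x/6.

```python
from math import comb, sqrt

n, p = 6, 0.5   # ask 6 people; population proportion is 0.5

# Map each possible sample proportion x/n to its binomial probability.
table = {x / n: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

# Expected value and standard deviation of the sample proportion.
mean = sum(p_hat * prob for p_hat, prob in table.items())
sd = sqrt(sum((p_hat - mean) ** 2 * prob for p_hat, prob in table.items()))

for p_hat, prob in table.items():
    print(f"p_hat = {p_hat:.3f}  probability = {prob:.4f}")
print(f"mean = {mean:.4f}, standard deviation = {sd:.4f}")
```

The mean equals p and the standard deviation equals the square root of pq/n, about 0.204.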
Properties of the sampling distribution of the
sample proportion
1. If a random sample of n observations is
selected from a binomial population with
parameter p, the sampling distribution of the
sample proportion

$$\hat{p} = \frac{x}{n}$$

will have a mean

$$\mu_{\hat{p}} = p$$

and a standard deviation

$$\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$$

where q=1-p
2. When the sample size n is large, the
sampling distribution of the sample
proportion will be approximately
normal. Remember the rule: np and nq
both greater than or equal to 5

- this is a result of our ability to approximate
binomial distributions with normal
distributions
Example: According to a recent poll, about 46% of
Americans approve of Bush's overall status. Now we
select 100 people and ask for their opinions.
1) What is the mean of the sample proportion?
2) How many people in our sample are expected to
think the policy is unsuccessful?
3) What is the probability that over 50% of the sample
approve of Bush's overall status?
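A hedged sketch of the poll example. It assumes that everyone who does not approve is counted as thinking the policy is unsuccessful (question 2), and it uses the normal approximation to the sample proportion without a continuity correction (question 3). Applying the correction would give a somewhat smaller probability.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.46
q = 1 - p

# Rule of thumb for the normal approximation: np and nq at least 5.
rule_ok = n * p >= 5 and n * q >= 5

mean_p_hat = p                        # 1) mean of the sample proportion
expected_not_approving = n * q        # 2) expected count who do not approve
standard_error = sqrt(p * q / n)      # standard deviation of the sample proportion

# 3) P(sample proportion > 0.50) under the normal approximation.
p_over_half = 1 - NormalDist(mean_p_hat, standard_error).cdf(0.50)

print(f"mean = {mean_p_hat}, expected not approving = {expected_not_approving:.0f}")
print(f"standard error = {standard_error:.4f}, P(p_hat > 0.5) = {p_over_half:.4f}")
```

The standard error is about 0.05, so 0.50 lies roughly 0.8 standard errors above the mean.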
