
Module 4

Probability Distributions
Taxonomy of Probability Distributions

Discrete probability distributions


•Binomial distribution
•Multinomial distribution
•Poisson distribution
•Hypergeometric distribution

Continuous probability distributions


•Normal distribution
•Standard normal distribution
•Gamma distribution
•Exponential distribution
•Chi square distribution
•Lognormal distribution
•Weibull distribution
Discrete Probability Distributions
Binomial Distribution
• In many situations, an experiment has only two possible outcomes: success and failure.
• An experiment that consists of repeated trials, each with two such outcomes, is called a Bernoulli
process. Each trial in it is called a Bernoulli trial.
Example: Firing bullets to hit a target.
• Suppose, in a Bernoulli process, we define a random variable $X$ = number of successes in $n$
trials.
• Such a random variable obeys the binomial probability distribution, if the experiment satisfies the
following conditions:
1)The experiment consists of n trials.
2)Each trial results in one of two mutually exclusive outcomes, one labelled a “success” and the
other a “failure”.
3)The probability of a success on a single trial is equal to $p$. The value of $p$ remains constant
throughout the experiment.
4)The trials are independent.
Defining Binomial Distribution
Definition: Binomial distribution
The function for computing the probability for the binomial probability distribution is
given by

$$f(x) = \binom{n}{x} p^x (1 - p)^{n - x}$$

for x = 0, 1, 2, …, n
Here, $f(x) = P(X = x)$, where $X$ denotes “the number of successes” and
$P(X = x)$ denotes the probability of $x$ successes in $n$ trials.
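As a quick numerical illustration of the binomial probability function, here is a minimal Python sketch using only the standard library. The function name `binomial_pmf` and the coin-toss numbers are illustrative assumptions, not part of the original notes.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Return P(X = x) for n independent trials with success probability p."""
    if not 0 <= x <= n:
        return 0.0  # x lies outside the support 0, 1, ..., n
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Illustrative case (assumed): 5 tosses of a fair coin, exactly 2 heads.
prob_two_heads = binomial_pmf(2, 5, 0.5)
# The probabilities over x = 0, 1, ..., n should add up to 1.
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))
print(prob_two_heads, total)
```

Summing the function over all x from 0 to n is a simple sanity check that it defines a valid probability distribution.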

If $X$ follows the binomial distribution with parameters $n$ and $p$, symbolically, we express

$X \sim B(n, p)$

or $X \sim \text{Bin}(n, p)$
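Moment Generating Function
$M_X(t) = E(e^{tX}) = (q + p e^t)^n$, where $q = 1 - p$.
Mean and Variance of Binomial distribution

Mean of the Binomial distribution is $E(X) = np$.

Variance of Binomial distribution is $\operatorname{Var}(X) = npq$, where $q = 1 - p$.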
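Ex: The mean of a binomial distribution is 5 and standard deviation is 2. Determine
the distribution.
Answer:
Mean $= np = 5$ and variance $= npq = 2^2 = 4$, so $q = 4/5$, $p = 1/5$ and $n = 25$.
Hence the Binomial distribution is

$$P(X = x) = \binom{25}{x} \left(\frac{1}{5}\right)^x \left(\frac{4}{5}\right)^{25 - x}, \quad x = 0, 1, 2, \ldots, 25$$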
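The step from mean and standard deviation to the parameters can be checked with exact fractions. This is a small sketch; the helper name `binomial_parameters` is an assumption made for illustration.

```python
from fractions import Fraction


def binomial_parameters(mean: Fraction, variance: Fraction) -> tuple[Fraction, Fraction]:
    """Recover (n, p) from mean = n*p and variance = n*p*q, with q = 1 - p."""
    q = variance / mean  # (npq) / (np) = q
    p = 1 - q
    n = mean / p
    return n, p


# Mean 5 and standard deviation 2 (so variance 4), as in the example.
n, p = binomial_parameters(Fraction(5), Fraction(2) ** 2)
print(n, p)
```

Using `Fraction` avoids floating-point rounding, so the recovered number of trials comes out as an exact integer.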


Ex: The mean and variance of a binomial distribution are 4 and 4 / 3. Find

Answer:
Given mean $np = 4$ and variance $npq = 4/3$. Dividing, $q = 1/3$, so $p = 2/3$ and $n = 6$.
Ex: An irregular 6 faced die is such that the probability that it gives 3 even numbers
in 5 throws is twice the probability that it gives 2 even numbers in 5 throws. How
many sets of exactly 5 trials can be expected to give no even number out of 2500
sets?
Answer:
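Let $X$ denote the number of even numbers obtained in 5 trials.
Given $P(X = 3) = 2 \, P(X = 2)$, i.e.,

$$\binom{5}{3} p^3 q^2 = 2 \binom{5}{2} p^2 q^3$$

Hence $p = 2q = 2(1 - p)$, so $p = 2/3$ and $q = 1/3$.
Now, $P$(getting no even number) $= P(X = 0) = q^5 = (1/3)^5 = 1/243$.
Number of sets having no success (no even number) out of $N$ sets $= N \times P(X = 0)$.
Required number of sets $= 2500 \times 1/243 \approx 10$.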
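The arithmetic of this example can be verified exactly. The sketch below assumes p = 2/3 for an even number (the value implied by the stated condition) and checks both the condition and the expected count; the helper name `pmf` is illustrative.

```python
from fractions import Fraction
from math import comb


def pmf(x: int, n: int, p: Fraction) -> Fraction:
    """Exact binomial probability P(X = x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


p = Fraction(2, 3)  # probability of an even number on the irregular die
# The stated condition: P(3 even numbers) is twice P(2 even numbers) in 5 throws.
condition_holds = pmf(3, 5, p) == 2 * pmf(2, 5, p)
p_none = pmf(0, 5, p)            # probability of no even number in 5 throws
expected_sets = 2500 * p_none    # expected number of such sets out of 2500
print(condition_holds, p_none, float(expected_sets))
```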
Problem 1: A die is thrown 4 times. Getting a number greater than 2 is a success.
Find the probability of getting (i) exactly one success, (ii) less than 3 successes.

Problem 2: If the chance that any one of 5 telephone lines is busy at any instant is 0.01, what is the probability that
all the lines are busy? What is the probability that more than 3 lines are busy?

Problem 3: A die is thrown three times. Getting a ”3” or a ”6” is considered to be a success. Find the probability of
getting at least two successes.

Problem 4: If 20% of the bolts produced by a machine are defective, determine the probability that out of 4 bolts
chosen at random (i) 1, (ii) 0 will be defective.

Problem 5: Out of 1000 families of 3 children each, how many families would you expect to have two boys and
one girl, assuming that boys and girls are equally likely?

Problem 6: The average percentage of failures in a certain examination is 40. What is the probability that out of
a group of 6 candidates, at least 4 pass in the examination?

Problem 7: X follows a binomial distribution such that 4P(X = 4) = P(X = 2). If n = 6, find p, the probability of
success.

Problem 8: Find the maximum n such that the probability of getting no head in tossing a coin n times is
greater than 0.1.

Problem 9: If the sum of the mean and variance of a binomial distribution of 5 trials is 9/5, find the binomial
distribution.
Poisson Distribution
A type of probability distribution useful in describing the number of events that will
occur in a specific period of time or in a specific area or volume is the Poisson
distribution.

The following are some of the examples, which may be analysed using Poisson
distribution.
1. The number of alpha particles emitted by a radioactive source in a given time
interval.
2. The number of telephone calls received at a telephone exchange in a given time
interval.
3. The number of defective articles in a packet of 100.
4. The number of printing errors on each page of a book.
5. The number of road accidents reported in a city per day.
The Poisson Distribution
Properties of Poisson process
• The number of outcomes in one time interval is independent of the number that
occurs in any other disjoint interval [Poisson process has no memory]
• The probability that a single outcome will occur during a very short interval is
proportional to the length of the time interval and does not depend on the number
of outcomes occurring outside this time interval.
• The probability that more than one outcome will occur in such a short time interval
is negligible.
Poisson distribution
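The probability distribution of the Poisson random variable $X$, representing the number of
outcomes occurring in a given time interval $t$, is

$$p(x; \lambda t) = \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots$$

where $\lambda$ is the average number of outcomes per unit time and $e = 2.71828\ldots$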


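Poisson frequency distribution: If an experiment satisfying the requirements of
Poisson distribution is repeated N times, the expected frequency distribution of
getting $x$ successes is given by

$$N \cdot p(x; \lambda t) = N \, \frac{e^{-\lambda t} (\lambda t)^x}{x!}, \quad x = 0, 1, 2, \ldots$$

Both the mean and the variance of the Poisson distribution p(x; λt) are λt. (Home
work!!)

If we denote the mean as $\mu = \lambda t$, then we can write the probability distribution as

$$p(x; \mu) = \frac{e^{-\mu} \mu^x}{x!}, \quad x = 0, 1, 2, \ldots$$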

Poisson Distribution as Limiting Form of Binomial Distribution

Poisson distribution is a limiting case of binomial distribution under the following
conditions:
(i) the number of trials $n$ is indefinitely large, i.e., $n \to \infty$.
(ii) the constant probability of success $p$ in each trial is very small, i.e., $p \to 0$.
(iii) $np = \lambda$ is finite, or $p = \lambda/n$ and $q = 1 - \lambda/n$.
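The limiting behaviour can be observed numerically. In the sketch below, the mean is held fixed at λ while n grows and p = λ/n shrinks; the values λ = 2 and x = 3 are arbitrary illustrative choices, and the function names are assumptions.

```python
from math import comb, exp, factorial


def binomial_pmf(x: int, n: int, p: float) -> float:
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x: int, lam: float) -> float:
    return exp(-lam) * lam**x / factorial(x)


lam, x = 2.0, 3  # illustrative values (assumed)
errors = []
for n in (10, 100, 1000, 10000):
    # Keep np = lam fixed while n grows, so p = lam / n tends to 0.
    err = abs(binomial_pmf(x, n, lam / n) - poisson_pmf(x, lam))
    errors.append(err)
    print(n, err)
```

The printed gap between the two probabilities shrinks roughly in proportion to 1/n, which is the sense in which the Poisson distribution is the limit of the binomial.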
Ex: During a laboratory experiment, the average number of radioactive particles
passing through a counter in 1 millisecond is 4. What is the probability that 6 particles
enter the counter in a given millisecond?
Answer:
Using the Poisson distribution with x = 6 and λt = 4, we have

$$p(6; 4) = \frac{e^{-4} \, 4^6}{6!} = 0.1042$$
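The value 0.1042 can be reproduced with a few lines of Python. This is a minimal sketch; `poisson_pmf` is an illustrative helper name.

```python
from math import exp, factorial


def poisson_pmf(x: int, mean: float) -> float:
    """Return P(X = x) for a Poisson random variable with the given mean."""
    return exp(-mean) * mean**x / factorial(x)


# x = 6 particles when the mean count per millisecond is 4.
prob_six = poisson_pmf(6, 4.0)
print(round(prob_six, 4))
```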
Ex: In a manufacturing process where glass products are made, defects or bubbles
occur, occasionally rendering the piece undesirable for marketing. It is known that, on
average, 1 in every 1000 of these items produced has one or more bubbles.
What is the probability that a random sample of 8000 will yield fewer than 7 items
possessing bubbles?
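Answer:
This is essentially a binomial experiment with $n = 8000$ and $p = 0.001$. Since $p$ is very close to 0 and
$n$ is quite large, we approximate with the Poisson distribution using mean $np = (8000)(0.001) = 8$.
If $X$ represents the number of items possessing bubbles, then

$$P(X < 7) = \sum_{x=0}^{6} \frac{e^{-8} \, 8^x}{x!} \approx 0.3134$$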
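One way to evaluate this probability is a cumulative Poisson sum. The sketch below assumes the Poisson approximation to the binomial with mean np = 8000 × 0.001 = 8; the helper name `poisson_cdf` is illustrative.

```python
from math import exp, factorial


def poisson_cdf(k: int, mean: float) -> float:
    """Return P(X <= k) for a Poisson random variable with the given mean."""
    return sum(exp(-mean) * mean**x / factorial(x) for x in range(k + 1))


mean = 8000 * 0.001  # np for the glass-defect example
# "Fewer than 7" means X <= 6.
prob_fewer_than_7 = poisson_cdf(6, mean)
print(round(prob_fewer_than_7, 4))
```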
Ex: The number of monthly breakdowns of a computer is a random variable having
Poisson distribution with mean equal to 1.8. Find the probability that this computer
will function for a month.
(a)Without a breakdown
(b)With only one breakdown and
(c)With at least one breakdown
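Answer:
Let $X$ denote the number of breakdowns of the computer in a month.

$X$ follows a Poisson distribution with mean $\lambda = 1.8$, so $P(X = x) = \dfrac{e^{-1.8} (1.8)^x}{x!}$.

(a) $P(X = 0) = e^{-1.8} = 0.1653$
(b) $P(X = 1) = 1.8 \, e^{-1.8} = 0.2975$
(c) $P(X \ge 1) = 1 - P(X = 0) = 0.8347$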
Ex: Fit a binomial distribution for the following data:

Solution: Fitting a binomial distribution means assuming that the given distribution is
approximately binomial and hence finding the probability mass function and then
finding the theoretical frequencies.
To find the binomial frequency distribution $N \binom{n}{r} p^r q^{n-r}$, $r = 0, 1, \ldots, n$, which fits the given
data, we require N, n and p. We assume N = total frequency = 80 and n = no. of
trials = 6 from the given data.
To find p, we compute the mean of the given frequency distribution and equate it to
np (mean of the binomial distribution).
If the given distribution is nearly binomial, the theoretical frequencies are given by the
successive terms in the expansion of $N(q + p)^n$. Thus we get,

Converting these values into whole numbers consistent with the condition that the total
frequency is 80, the corresponding binomial frequency distribution is as follows:
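The fitting steps (estimate p from the sample mean, then scale the binomial probabilities by N) can be sketched in Python. The frequencies in `observed` are hypothetical placeholders chosen only so that N = 80 and n = 6; they are not the data of this example, and `fit_binomial` is an assumed helper name.

```python
from math import comb


def fit_binomial(observed: list[int]) -> tuple[float, list[float]]:
    """Fit a binomial distribution to frequencies observed[x] for x = 0..n."""
    n = len(observed) - 1                      # number of trials
    total = sum(observed)                      # N, the total frequency
    mean = sum(x * f for x, f in enumerate(observed)) / total
    p = mean / n                               # equate the sample mean to np
    expected = [total * comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]
    return p, expected


# Hypothetical frequencies for x = 0, 1, ..., 6 (total 80), for illustration only.
observed = [4, 15, 25, 20, 11, 4, 1]
p, expected = fit_binomial(observed)
print(round(p, 5), [round(e, 1) for e in expected])
```

The expected frequencies are generally not integers; rounding them while keeping the total equal to N is the final step described in the text.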
Ex: Fit a Poisson distribution for the following
distribution:

Solution: Fitting a Poisson distribution for a given distribution means assuming that
the given distribution is approximately Poisson and hence finding the probability
mass function and then finding the theoretical frequencies.
To find the probability mass function

$$P(X = r) = \frac{e^{-\lambda} \lambda^r}{r!}, \quad r = 0, 1, 2, \ldots$$

of the approximate Poisson distribution, we require λ, which is the mean of the
Poisson distribution.

We find the mean of the given distribution and assume it as λ.
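A Poisson fit follows the same pattern: take λ as the sample mean and multiply the Poisson probabilities by the total frequency. The frequencies in `observed` below are hypothetical and serve only to illustrate the steps; `fit_poisson` is an assumed helper name.

```python
from math import exp, factorial


def fit_poisson(observed: list[int]) -> tuple[float, list[float]]:
    """Fit a Poisson distribution to frequencies observed[x] for x = 0, 1, 2, ..."""
    total = sum(observed)                                   # N
    lam = sum(x * f for x, f in enumerate(observed)) / total  # sample mean
    expected = [total * exp(-lam) * lam**x / factorial(x) for x in range(len(observed))]
    return lam, expected


# Hypothetical frequencies for x = 0, 1, 2, 3, 4 (total 200), for illustration only.
observed = [120, 60, 15, 4, 1]
lam, expected = fit_poisson(observed)
print(lam, [round(e, 1) for e in expected])
```

Because the Poisson distribution has unbounded support, the expected frequencies over the listed classes sum to slightly less than N; in hand calculations the last class is often treated as "this value or more".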


Problem 1: There are 50 telephone lines in an exchange. The probability of them being
busy is 0.1. What is the probability that all the lines are busy?
Problem 2: The probability that a bomb dropped from an envelope will strike a certain
target is . If 6 bombs are dropped, find the probability that (i ) exactly 2 will strike the
target and (ii) at least 2 will strike the target.
Problem 3: Suppose that P(X = 2) = P(X = 1), find P(X=0).
Problem 4: Probability of getting no misprint in a page of book is exp(-4). What is the
probability that a page contains more than two misprints?
Problem 5: Six coins are tossed 6400 times. Using the Poisson distribution, what is the
approximate probability of getting six heads 10 times?
Problem 6: Fit a Poisson distribution to the following data and calculate the expected
value and expected (theoretical) frequencies.
The Normal Distribution
The normal distribution, also called the Gaussian distribution, is one of the most widely used
continuous distributions. It is used to model a number of scenarios such as
marks of students, heights of people, salaries of working people, etc.

[Figure: normal curves $f(X)$ plotted against $X$.] Changing μ shifts the
distribution left or right. Changing σ increases or
decreases the spread.
The Normal Distribution:
as mathematical function (pdf)

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2}$$

This is a bell shaped curve with different centers and spreads depending on μ and σ.
Note constants: π = 3.14159, e = 2.71828
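To make the density concrete, the sketch below codes the normal pdf and checks numerically that the total area under the curve is 1. The midpoint-rule integrator, the function names, and the choice μ = 500, σ = 50 are illustrative assumptions.

```python
from math import exp, pi, sqrt


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))


def integrate(f, a: float, b: float, steps: int = 20000) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


mu, sigma = 500.0, 50.0  # illustrative parameters (assumed)
# Ten standard deviations either side captures essentially all of the area.
area = integrate(lambda x: normal_pdf(x, mu, sigma), mu - 10 * sigma, mu + 10 * sigma)
peak = normal_pdf(mu, mu, sigma)  # the curve is highest at x = mu
print(area, peak)
```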
The Normal PDF

It’s a probability function, so no matter what the values of μ
and σ, it must integrate to 1!

$$\int_{-\infty}^{+\infty} \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} dx = 1$$
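Normal distribution is defined by its mean
and standard dev.

$$E(X) = \mu = \int_{-\infty}^{+\infty} x \, \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} dx$$

$$\operatorname{Var}(X) = \sigma^2 = \left( \int_{-\infty}^{+\infty} x^2 \, \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} dx \right) - \mu^2$$

Standard Deviation(X) = σ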

The moment generating function of a normal distribution with respect to
origin is $M_X(t) = e^{\mu t + \frac{1}{2} \sigma^2 t^2}$.
**The beauty of the normal curve:

No matter what μ and σ are, the area between μ-σ and μ+σ is about 68%;
the area between μ-2σ and μ+2σ is about 95%; and the area between μ-3σ
and μ+3σ is about 99.7%. Almost all values fall within 3 standard
deviations.
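These three areas can be computed from the error function in Python's standard library, using the identity P(|Z| ≤ k) = erf(k/√2) for a standard normal Z. The helper name `area_within` is an illustrative assumption.

```python
from math import erf, sqrt


def area_within(k: float) -> float:
    """Area under any normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))


one_sd, two_sd, three_sd = (area_within(k) for k in (1, 2, 3))
print(round(one_sd, 4), round(two_sd, 4), round(three_sd, 4))
```

The result does not depend on μ or σ because standardizing maps every normal curve onto the same standard normal curve.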
68-95-99.7 Rule

[Figure: normal curve with 68% of the data within 1 standard deviation of the mean, 95% of the data within 2, and 99.7% of the data within 3.]


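68-95-99.7 Rule
in Math terms…

$$\int_{\mu - \sigma}^{\mu + \sigma} \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} dx = .68$$

$$\int_{\mu - 2\sigma}^{\mu + 2\sigma} \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} dx = .95$$

$$\int_{\mu - 3\sigma}^{\mu + 3\sigma} \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - \mu}{\sigma} \right)^2} dx = .997$$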
Example
• Suppose GATE scores roughly follow a normal distribution in the
Indian population of college-bound students (with range restricted
to 200-800), and the average score is 500 with a standard deviation
of 50. Then:
• 68% of students will have scores between 450 and 550
• 95% will be between 400 and 600
• 99.7% will be between 350 and 650
The Standard Normal (Z):
“Universal Currency”
The formula for the standardized normal probability
density function is

$$p(Z) = \frac{1}{(1)\sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{Z - 0}{1} \right)^2} = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} Z^2}$$
The Standard Normal Distribution (Z)
All normal distributions can be converted into the standard
normal curve by subtracting the mean and dividing by the
standard deviation:

$$Z = \frac{X - \mu}{\sigma}$$

Somebody calculated all the integrals for the standard normal
and put them in a table! So we never have to integrate!
Even better, computers now do all the integration.
Comparing X and Z units

[Figure: aligned axes. X = 100 and X = 200 on the X scale (μ = 100, σ = 50) correspond to Z = 0 and Z = 2.0 on the Z scale (μ = 0, σ = 1).]
What is the area to the
left of Z=1.51 in a
standard normal curve?

Area is 93.45%

[Figure: standard normal curve with the area to the left of Z = 1.51 shaded.]
Example
• For example: What’s the probability of getting a math SAT score of 575 or less,
μ=500 and σ=50?

$$Z = \frac{575 - 500}{50} = 1.5$$

i.e., A score of 575 is 1.5 standard deviations above the mean

$$\therefore P(X \le 575) = \int_{200}^{575} \frac{1}{(50)\sqrt{2\pi}} \, e^{-\frac{1}{2} \left( \frac{x - 500}{50} \right)^2} dx \approx \int_{-\infty}^{1.5} \frac{1}{\sqrt{2\pi}} \, e^{-\frac{1}{2} Z^2} dz$$

Yikes!
But to look up Z= 1.5 in standard normal chart → no problem! = .9332
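In place of a printed chart, the standard normal left-tail area can be computed from the error function. The sketch below standardizes the score and evaluates the area; `standard_normal_cdf` is an illustrative helper name.

```python
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma = 500, 50
z = (575 - mu) / sigma               # standardize the score
prob = standard_normal_cdf(z)        # P(X <= 575)
print(z, round(prob, 4))
```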
Practice problem
If birth weights in a population are normally distributed with a
mean of 109 oz and a standard deviation of 13 oz,
a. What is the chance of obtaining a birth weight of 141 oz or heavier
when sampling birth records at random?
b. What is the chance of obtaining a birth weight of 120 or lighter?
Answer
a. What is the chance of obtaining a birth weight of 141 oz or heavier
when sampling birth records at random?

$$Z = \frac{141 - 109}{13} = 2.46$$

From the chart → Z of 2.46 corresponds to a right tail (greater than)
area of: P(Z≥2.46) = 1-(.9931)= .0069 or .69 %
Answer
b. What is the chance of obtaining a birth weight of 120 or lighter?

$$Z = \frac{120 - 109}{13} = .85$$

From the chart → Z of .85 corresponds to a left tail area of:
P(Z≤.85) = .8023= 80.23%
Ex: A certain type of storage battery lasts, on average, 3.0 years with a standard
deviation of 0.5 year. Assuming that battery life is normally distributed, find the
probability that a given battery will last less than 2.3 years.

Solution: To find P(X < 2.3), we need to evaluate the area under the normal curve
to the left of 2.3. This is accomplished by finding the area to the left of the
corresponding z value. Hence, we find that

$$z = \frac{2.3 - 3}{0.5} = -1.4$$

and then, using Table,

$$P(X < 2.3) = P(Z < -1.4) = 0.0808$$


Ex: The marks obtained by a number of students in a certain subject are approximately
normally distributed with mean 65 and standard deviation 5. If 3 students are selected
at random from this group, what is the probability that at least 1 of them would have
scored above 75?

Solution: If X represents the marks obtained by the students, X follows the distribution
N(65, 5).
P(a student scores above 75) $= P(X > 75) = P\left(Z > \frac{75 - 65}{5}\right) = P(Z > 2) = 0.0228$

Let p = P(a student scores above 75) = 0.0228 then q = 0.9772 and n = 3. Since p is the
same for all the students, the number Y, of (successes) students scoring above 75,
follows a binomial distribution.
Hence $P(Y \ge 1) = 1 - P(Y = 0) = 1 - q^3 = 1 - (0.9772)^3 \approx 0.0668$.
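This example chains a normal tail probability into a binomial one, which is easy to reproduce in code. The sketch below computes p without table rounding, so the final figure may differ from a table-based answer in the fourth decimal place; the helper name is illustrative.

```python
from math import erf, sqrt


def standard_normal_cdf(z: float) -> float:
    return 0.5 * (1 + erf(z / sqrt(2)))


mu, sigma, n = 65, 5, 3
# Step 1: probability that one student scores above 75.
p = 1 - standard_normal_cdf((75 - mu) / sigma)
# Step 2: Y ~ B(3, p); "at least one" is the complement of "none".
at_least_one = 1 - (1 - p) ** n
print(round(p, 4), round(at_least_one, 4))
```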
Exponential Distribution
Definition: A continuous RV X is said to follow an exponential distribution or negative
exponential distribution with parameter λ > 0, if its probability density function is given
by

$$f(x) = \begin{cases} \lambda e^{-\lambda x}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

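Mean and Variance of the Exponential Distribution

Raw moments about the origin of the exponential distribution are given by

$$\mu_r' = E(X^r) = \int_0^\infty x^r \lambda e^{-\lambda x} \, dx = \frac{r!}{\lambda^r}$$

E(X) = Mean of the exponential distribution $= \mu_1' = \dfrac{1}{\lambda}$

Var(X) $= \mu_2' - (\mu_1')^2 = \dfrac{2}{\lambda^2} - \dfrac{1}{\lambda^2} = \dfrac{1}{\lambda^2}$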


Graphs of several exponential pdf’s

The exponential probability distribution is useful in
describing the time it takes to complete a task.
The exponential random variables can be used to describe:
•Time between vehicle arrivals at a toll booth
•Time required to complete a questionnaire
•Distance between major defects in a highway
In waiting line applications, the exponential distribution is
often used for service times.
Memoryless Property of the Exponential Distribution

Another important application of the exponential distribution is to model the
distribution of component lifetime.

A partial reason for the popularity of such applications is the “memoryless”
property of the exponential distribution: for any $s, t > 0$,

$$P(X > s + t \mid X > s) = P(X > t) \qquad (1)$$
Ex: The mileage which car owners get with a certain kind of radial tire is a RV having
an exponential distribution with mean 40,000 km. Find the probabilities that one of
these tires will last (i) at least 20,000 km and (ii) at most 30,000 km.

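Solution: Let X denote the mileage obtained with the tire. Since the mean is $1/\lambda = 40{,}000$, the density is $f(x) = \frac{1}{40000} e^{-x/40000}$, $x > 0$, and $P(X > x) = e^{-x/40000}$.
(i) $P(X \ge 20{,}000) = e^{-0.5} = 0.6065$
(ii) $P(X \le 30{,}000) = 1 - e^{-0.75} = 0.5276$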
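Both probabilities follow from the exponential survival function P(X > x) = e^(-x/mean). A minimal sketch, with `exponential_survival` as an assumed helper name:

```python
from math import exp


def exponential_survival(x: float, mean: float) -> float:
    """P(X > x) for an exponential random variable with the given mean (rate = 1/mean)."""
    return exp(-x / mean)


mean_km = 40_000
p_at_least_20k = exponential_survival(20_000, mean_km)      # (i) lasts at least 20,000 km
p_at_most_30k = 1 - exponential_survival(30_000, mean_km)   # (ii) lasts at most 30,000 km
print(round(p_at_least_20k, 4), round(p_at_most_30k, 4))
```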


Ex: The time (in hours) required to repair a machine is exponentially distributed with
parameter λ = 1/2.
(a) What is the probability that the repair time exceeds 2 h?
(b) What is the conditional probability that a repair takes at least 10 h given
that its duration exceeds 9 h?

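Solution: If X represents the time to repair the machine, the density function of
X is given by

$$f(x) = \frac{1}{2} e^{-x/2}, \quad x > 0$$

(a) $P(X > 2) = \int_2^\infty \frac{1}{2} e^{-x/2} \, dx = e^{-1} = 0.3679$
(b) $P(X \ge 10 \mid X > 9) = P(X > 1)$ (by the memoryless property) $= e^{-0.5} = 0.6065$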
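The memoryless shortcut can be checked against the direct conditional probability. The sketch below uses λ = 1/2 from the repair-time example; the function name `survival` is illustrative.

```python
from math import exp


def survival(x: float, lam: float) -> float:
    """P(X > x) for an exponential random variable with rate lam."""
    return exp(-lam * x)


lam = 0.5
p_exceeds_2 = survival(2, lam)                       # part (a)
# Part (b) computed directly as P(X > 10) / P(X > 9) ...
conditional = survival(10, lam) / survival(9, lam)
# ... and via the memoryless property as P(X > 1).
shortcut = survival(1, lam)
print(round(p_exceeds_2, 4), round(conditional, 4), round(shortcut, 4))
```

The two routes to part (b) agree because the ratio of survival values depends only on the extra time elapsed, not on how long the repair has already taken.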


The Gamma Function
• To define the family of gamma distributions, we first need to introduce a
function that plays an important role in many branches of mathematics.

• Definition: For α > 0, the gamma function Γ(α) is defined by

$$\Gamma(\alpha) = \int_0^\infty x^{\alpha - 1} e^{-x} \, dx \qquad (4.6)$$
•The most important properties of the gamma function are the
following:

1. For any α > 1, Γ(α) = (α - 1) Γ(α - 1)
[via integration by parts]

2. For any positive integer, n, Γ(n) = (n - 1)!

3. Γ(1/2) = √π
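Python's standard library exposes the gamma function as `math.gamma`, so the standard properties (the recurrence, the factorial identity, and the value Γ(1/2) = √π) can be spot-checked numerically. The test value α = 4.5 is an arbitrary choice.

```python
from math import factorial, gamma, isclose, pi, sqrt

alpha = 4.5  # arbitrary test value greater than 1
# Recurrence: gamma(alpha) = (alpha - 1) * gamma(alpha - 1)
recurrence_ok = isclose(gamma(alpha), (alpha - 1) * gamma(alpha - 1))
# Positive integers: gamma(n) = (n - 1)!
factorial_ok = all(isclose(gamma(n), factorial(n - 1)) for n in range(1, 11))
# Half-integer value: gamma(1/2) = sqrt(pi)
half_ok = isclose(gamma(0.5), sqrt(pi))
print(recurrence_ok, factorial_ok, half_ok)
```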
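Definition: A continuous RV X is said to follow an Erlang distribution or General
Gamma distribution with parameters λ > 0 and k > 0, if its probability density
function is given by

$$f(x) = \begin{cases} \dfrac{\lambda^k x^{k-1} e^{-\lambda x}}{\Gamma(k)}, & x \ge 0 \\ 0, & \text{otherwise} \end{cases}$$

Mean and Variance of Erlang Distribution

$E(X) = \dfrac{k}{\lambda}$ and $\operatorname{Var}(X) = \dfrac{k}{\lambda^2}$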
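Assuming the Erlang density f(x) = λ^k x^(k-1) e^(-λx) / Γ(k) for x ≥ 0, the standard results E(X) = k/λ and Var(X) = k/λ² can be confirmed by numerical integration. The parameter values λ = 0.5 and k = 3, the integration range, and the function names below are illustrative assumptions.

```python
from math import exp, gamma


def erlang_pdf(x: float, lam: float, k: float) -> float:
    """Erlang (general gamma) density with rate lam and shape k, for x >= 0."""
    return lam**k * x ** (k - 1) * exp(-lam * x) / gamma(k)


def raw_moment(r: int, lam: float, k: float, upper: float = 100.0, steps: int = 100000) -> float:
    """Midpoint-rule approximation of E(X^r); the tail beyond `upper` is negligible here."""
    h = upper / steps
    return h * sum(((i + 0.5) * h) ** r * erlang_pdf((i + 0.5) * h, lam, k) for i in range(steps))


lam, k = 0.5, 3  # illustrative parameters (assumed)
mean = raw_moment(1, lam, k)
variance = raw_moment(2, lam, k) - mean**2
print(round(mean, 4), round(variance, 4))  # compare with k/lam and k/lam**2
```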
Ex: In a certain city, the daily consumption of electric power in millions of kilowatt-hours
can be treated as a RV having an Erlang distribution with parameters λ = and . If
the power plant of this city has a daily capacity of millions kilowatt-hours, what is the
probability that this power supply will be inadequate on any given day.

Solution: Let X represent the daily consumption of electric power (in millions of
kilowatt-hours). Then the density function of X is given as

P(the power supply is inadequate)


Ex: If a company employs $n$ salespersons, its gross sales in thousands of rupees may
be regarded as a RV having an Erlang distribution with λ = and . If the sales
cost is Rs 8000 per salesperson, how many salespersons should the company employ to
maximise the expected profit?

Solution: Let $X$ represent the gross sales (in Rupees) by $n$ salespersons.


Ex: Consumer demand for milk in a certain locality, per month, is known to be a general
Gamma (Erlang) RV. If the average demand is liters and the most likely demand is liters
( ), what is the variance of the demand?
Solution Let X represent the monthly consumer demand of milk.
Average demand is the value of E(X).
Most likely demand is the value of the mode of X or the value of X for which its density
function is maximum.
If f (x) is the density function of X, then
Weibull Distribution

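Definition: A continuous RV $X$ is said to follow a Weibull distribution with parameters
$\alpha, \beta > 0$, if the RV $Y = \alpha X^\beta$ follows the exponential distribution with density
function $f_Y(y) = e^{-y}$, $y > 0$.

Density Function of the Weibull Distribution

By the transformation rule, we have

$$f_X(x) = f_Y(y) \left| \frac{dy}{dx} \right| = \alpha \beta x^{\beta - 1} e^{-\alpha x^\beta}, \quad x > 0$$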

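Mean and Variance of the Weibull Distribution

For the density $f(x) = \alpha \beta x^{\beta - 1} e^{-\alpha x^\beta}$, $x > 0$:

$E(X) = \alpha^{-1/\beta} \, \Gamma\left(\frac{1}{\beta} + 1\right)$ and $\operatorname{Var}(X) = \alpha^{-2/\beta} \left[ \Gamma\left(\frac{2}{\beta} + 1\right) - \left\{ \Gamma\left(\frac{1}{\beta} + 1\right) \right\}^2 \right]$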
Ex: Each of the 6 tubes of a radio set has a life length (in years) which may be
considered as a RV that follows a Weibull distribution with parameters and
. If these tubes function independently of one another, what is the probability
that no tube will have to be replaced during the first 2 months of service?

Solution: If $X$ represents the life length of each tube, then its density function is
given by

P(all the 6 tubes are not to be replaced during the first 2 months)

= 0.0155
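The parameter values for this example did not survive in the text, so the sketch below rests on stated assumptions: a Weibull survival function of the form P(X > x) = exp(-α x^β), with α = 25 and β = 2. These values are an assumption chosen because they reproduce the stated answer of 0.0155 for 6 independent tubes over 2 months; they are not confirmed by the source.

```python
from math import exp


def weibull_survival(x: float, alpha: float, beta: float) -> float:
    """P(X > x) for a Weibull RV, assuming the form exp(-alpha * x**beta)."""
    return exp(-alpha * x**beta)


alpha, beta = 25.0, 2.0   # ASSUMED parameters (not given in the surviving text)
tubes = 6
years = 2 / 12            # 2 months expressed in years
p_single = weibull_survival(years, alpha, beta)   # one tube survives 2 months
p_all = p_single**tubes                           # all tubes survive, by independence
print(round(p_all, 4))
```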
Ex: If the life X (in years) of a certain type of car has a Weibull distribution with the
parameter = 2, find the value of the parameter , given that probability that the life
of the car exceeds 5 years is . For these values of and , find the mean and
variance of X.

Solution The density function of X is given by


For the Weibull distribution with parameters
