
Binomial random variable

Toss a coin with probability p of Heads n times.

X: # Heads in n tosses

X is a binomial random variable with parameters n and p.

X is Bin(n, p)

An X that counts the number of successes in many independent Bernoulli trials is called a binomial random variable. The two parameters are

n, the number of trials

p, the probability of success in a trial

n trials

Each trial has only two outcomes, Success or Failure

Trials are independent

Probability of success = p in all trials

Then X = # successes is Bin(n, p)

Sampling with and without replacement


Suppose a sample is drawn from a large dichotomous population, and let X be the number of successes in the sample.
If the sample is drawn without replacement, then X is not exactly binomial, because the draws are not independent. However, if the sample size is small relative to the population size, binomial probabilities provide a good approximation. In this case, in practice, X is modeled as a binomial.

$P(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$

$E(X) = np$, $V(X) = np(1 - p)$

$s.d.(X) = \sqrt{np(1 - p)}$
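
The formulas above can be checked numerically. This is a minimal Python sketch, not part of the original slides; the function name `binom_pmf` and the values n = 10, p = 0.4 are illustrative assumptions.

```python
from math import comb

def binom_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Bin(n, p): C(n, x) * p^x * (1 - p)^(n - x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Illustrative parameters (assumed, not taken from the slides).
n, p = 10, 0.4

pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

# Mean and variance computed directly from the pmf ...
mean = sum(x * q for x, q in enumerate(pmf))
var = sum((x - mean) ** 2 * q for x, q in enumerate(pmf))

# ... which should agree with the closed forms np and np(1 - p).
print(f"total probability = {sum(pmf):.6f}")
print(f"mean     = {mean:.6f}  (np       = {n * p:.6f})")
print(f"variance = {var:.6f}  (np(1-p) = {n * p * (1 - p):.6f})")
```

Summing over the pmf is only practical for small n; for larger n the closed forms are the usual route.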

Calculating probabilities: from table, software

Binomial table

Problem 4.4

Use the table to find the following probabilities:

1. $P(x = 2)$ for n = 10, p = .4
2. $P(x \le 5)$ for n = 15, p = .6
3. $P(x > 1)$ for n = 5, p = .1
4. $P(x \ge 10)$ for n = 15, p = .9

tophat 4.54

Problem 4.5

Calculate $\mu$, $\sigma^2$, and $\sigma$ for the following binomial variables:

1. n = 25, p = .5
2. n = 80, p = .2
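
One way to work this problem in software is to apply the binomial formulas np, np(1 - p), and the square root of np(1 - p) directly. The helper name `binomial_moments` below is an assumption made for this sketch.

```python
from math import sqrt

def binomial_moments(n: int, p: float) -> tuple[float, float, float]:
    """Return (mean, variance, standard deviation) of Bin(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)

# The two cases listed in the problem.
for n, p in [(25, 0.5), (80, 0.2)]:
    mean, variance, sd = binomial_moments(n, p)
    print(f"n={n}, p={p}: mean={mean:.2f}, variance={variance:.2f}, sd={sd:.3f}")
```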

problem 4.48

Among guests in a hotel 66% were aware of its Green program


and among those who were aware of the program 72%
participated in it. Let x be the number of guests in a random
sample of 15 who were aware of the Green program and
participated in it.


Explain why x is approximately a binomial random variable

n identical trials. Although the trials are not exactly identical, they are close. Taking a sample of size n = 15 from a very large population will result in trials being essentially identical.

Two possible outcomes. The hotel guests are either aware of and participate in the conservation efforts or they do not. S = hotel guest is aware of and participates in conservation efforts.

P(S) remains the same from trial to trial. If we sample without replacement, then P(S) will change slightly from trial to trial. However, the differences are extremely small and will essentially be 0.

Trials are independent. Again, although the trials are not exactly independent, they are very close.

The random variable x = number of hotel guests who are aware of and participate in conservation efforts in n = 15 trials. Thus, x is very close to being a binomial. We will


determine p

Define the following events:
A: Hotel guest is aware of the conservation program
B: Hotel guest participates in conservation efforts

Then $P(A \cap B) = P(A)\,P(B \mid A) = .66(.72) = .4752$.

Assume p = .4 and find the probability that x is at least 10.

From the table, $P(x \ge 10) = 1 - P(x \le 9) = 1 - .966 = .034$.
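
The table lookup can be reproduced in software. The sketch below is an illustration, not the method used in the slides: it computes the product P(A)P(B|A) and then the upper tail for n = 15, p = .4 by summing the binomial pmf. The helper name `binom_cdf` is an assumption.

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Bin(n, p), by summing the pmf from 0 to k."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

# P(aware and participates) = P(aware) * P(participates | aware)
p_aware_and_participates = 0.66 * 0.72

# "At least 10" is the complement of "at most 9".
tail = 1 - binom_cdf(9, 15, 0.4)

print(f"P(A and B) = {p_aware_and_participates:.4f}")
print(f"P(x >= 10) for n=15, p=.4 = {tail:.3f}")
```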

problem 4.58
The engineers forecast that 10% of all Denver bridges will have ratings of 4 or below.

Find the probability that in a random sample of 10 bridges at least 3 will have ratings of 4 or below.

We have a binomial with n = 10, p = .1.

From the tables, $P(x \ge 3) = 1 - P(x \le 2) = 1 - .930 = .070$.

If you actually observe that $x \ge 3$, what would you infer?

Since the probability of seeing at least 3 bridges out of 10 with ratings of 4 or less is so small, we can conclude that the forecast that 10% of all major Denver bridges will have ratings of 4 or less in 2020 is too small. The true percentage is probably more than 10%.

Problem 4.58
You have purchased 5 million switches and your supplier has guaranteed that there will be no more than .1% defectives. You randomly sample 500 switches and find 4 defectives. Do you think the supplier has complied with the guarantee?
Assuming the supplier's claim is true,

$\mu = np = 500(.001) = .5$

$\sigma = \sqrt{np(1 - p)} = \sqrt{500(.001)(.999)} = .707$

If the supplier's claim is true, we would expect to find only .5 defective switches in a sample of size 500. Therefore, it is not likely we would find 4. Based on the sample, the guarantee is probably inaccurate.
The z-value of the observed result is

$z = \frac{4 - .5}{.707} = 4.95$

This is an unusually large z-score.
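
The z-score argument can be reproduced in a few lines. This sketch also computes the exact binomial probability of seeing 4 or more defectives, which is an addition for illustration and not part of the original solution; the variable names are assumptions.

```python
from math import comb, sqrt

n, p, observed = 500, 0.001, 4   # sample size, claimed defect rate, defectives found

mean = n * p                      # expected number of defectives under the claim
sd = sqrt(n * p * (1 - p))        # binomial standard deviation
z = (observed - mean) / sd        # how many s.d. the observation lies above the mean

# Exact tail probability under the claim: P(X >= 4) = 1 - P(X <= 3).
p_at_least_observed = 1 - sum(
    comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(observed)
)

print(f"mean = {mean:.3f}, sd = {sd:.3f}, z = {z:.2f}")
print(f"P(X >= {observed}) = {p_at_least_observed:.5f}")
```

A small exact tail probability points the same way as the large z-score: 4 defectives would be surprising if the claimed rate held.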

Poisson
A Poisson random variable takes values x = 0, 1, 2, .... It has one parameter $\lambda$.

X is a Poisson($\lambda$) variable if, for $\lambda > 0$,

$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}$ for x = 0, 1, 2, ...

$E(X) = \lambda$

$V(X) = \lambda$

Poisson distribution

Why study Poisson? If X is binomial(n, p) with n large and p small, then with $\lambda = np$,

$P(X = x) \approx \frac{e^{-\lambda} \lambda^x}{x!}$

The Poisson process is a very important topic in probability theory

[Figure: two panels plotting ppois(x, 50) and pbinom(x, 500, p = 0.1) against x, for x from 0 to 80]
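
The comparison in the figure can be approximated outside R. The sketch below is a Python analogue under stated assumptions: `binom_cdf` and `poisson_cdf` are hypothetical helper names standing in for `pbinom` and `ppois`, and the second case (p = 0.001) is an added illustration of how the approximation improves as p shrinks.

```python
from math import comb, exp, lgamma, log

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam).

    Each term is exp(-lam + x*log(lam) - log(x!)), which avoids
    forming lam**x and x! separately.
    """
    return sum(exp(-lam + x * log(lam) - lgamma(x + 1)) for x in range(k + 1))

def max_cdf_gap(n: int, p: float, upto: int) -> float:
    """Largest |binomial cdf - Poisson cdf| over k = 0..upto, with lam = n*p."""
    lam = n * p
    return max(abs(binom_cdf(k, n, p) - poisson_cdf(k, lam)) for k in range(upto + 1))

gap_figure = max_cdf_gap(500, 0.1, 80)      # the pair of curves named in the figure
gap_small_p = max_cdf_gap(500, 0.001, 20)   # assumed extra case with a much smaller p

print(f"n=500, p=0.1:   max cdf gap = {gap_figure:.4f}")
print(f"n=500, p=0.001: max cdf gap = {gap_small_p:.6f}")
```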

Poisson distribution

How do we calculate probabilities for a Poisson random variable?

First figure out the mean, or $\lambda$, from the problem

Use tables or software (there is no table in the text)

Poisson Distribution Example


Customers arrive at a rate of 72 per hour. What is the probability of 4 customers arriving in 3 minutes?

Poisson Distribution Solution


72 per hr. = 1.2 per min. = 3.6 per 3-min. interval, so $\lambda = 3.6$

$p(x) = \frac{\lambda^x e^{-\lambda}}{x!}$

$p(4) = \frac{3.6^4 e^{-3.6}}{4!} = .1912$
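
The same calculation in software, as a minimal sketch. The function name `poisson_pmf` is an assumption; the rate conversion follows the steps in the solution.

```python
from math import exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """P(X = x) for X ~ Poisson(lam): lam^x * e^(-lam) / x!."""
    return lam**x * exp(-lam) / factorial(x)

rate_per_hour = 72
minutes = 3
lam = rate_per_hour / 60 * minutes   # expected arrivals in a 3-minute interval

p4 = poisson_pmf(4, lam)
print(f"lambda = {lam:.1f}, P(X = 4) = {p4:.4f}")
```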

Poisson Probability Table (Portion)

Cumulative probabilities $P(X \le x)$

λ      x = 0   ...   x = 3   x = 4   ...   x = 9
.02    .980
:      :             :       :             :
3.4    .033    ...   .558    .744    ...   .997
3.6    .027    ...   .515    .706    ...   .996
3.8    .022    ...   .473    .668    ...   .994
:      :             :       :             :

$p(4) = P(x \le 4) - P(x \le 3) = .706 - .515 = .191$

Problem 4.70

Over the last ten years the average number of bank failures per year was 45. Assume that X, the number of bank failures per year, follows a Poisson distribution.

Find E(X) and s.d.(X)

$E(X) = 45$, $s.d.(X) = \sqrt{45} = 6.71$

In 2011, 360 banks failed. How far does this value lie above the mean?

$z = \frac{360 - 45}{6.71} \approx 47$

In 2010, 65 banks failed. Find $P(X \le 65)$

From the table, $P(X \le 65) = .998$
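
These three quantities can also be obtained in software. The sketch below assumes the probability asked for is the cumulative one, P(X ≤ 65), and uses a hypothetical helper `poisson_cdf`; the figures 45, 360, and 65 are the ones quoted in the problem.

```python
from math import exp, lgamma, log, sqrt

def poisson_cdf(k: int, lam: float) -> float:
    """P(X <= k) for X ~ Poisson(lam), summing the pmf on the log scale."""
    return sum(exp(-lam + x * log(lam) - lgamma(x + 1)) for x in range(k + 1))

lam = 45                      # mean number of failures per year
sd = sqrt(lam)                # for a Poisson variable the variance equals the mean

z_2011 = (360 - lam) / sd     # standardized distance of 360 above the mean
p_le_65 = poisson_cdf(65, lam)

print(f"s.d.(X) = {sd:.2f}")
print(f"z for 360 failures = {z_2011:.1f}")
print(f"P(X <= 65) = {p_le_65:.3f}")
```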

Hypergeometric Random Variable

The experiment consists of randomly drawing n elements without replacement from a set of N elements, r of which are S's (for success) and (N - r) of which are F's (for failure).

The hypergeometric random variable x is the number of S's in the draw of n elements.

$p(x) = \frac{\binom{r}{x} \binom{N - r}{n - x}}{\binom{N}{n}}$

$\mu = \frac{nr}{N}$

$\sigma^2 = \frac{r(N - r)\,n(N - n)}{N^2 (N - 1)}$
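
A small numerical check of the hypergeometric formulas, as a sketch only. The values N = 10, r = 4, n = 3 and the function name `hypergeom_pmf` are assumptions chosen so the arithmetic is easy to follow by hand.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, r: int, n: int) -> float:
    """P(x successes) when drawing n of N elements without replacement, r of them S's."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

# Illustrative values (assumed): 10 elements, 4 successes, draw 3.
N, r, n = 10, 4, 3

# x can range from max(0, n - (N - r)) to min(r, n).
support = range(max(0, n - (N - r)), min(r, n) + 1)
pmf = {x: hypergeom_pmf(x, N, r, n) for x in support}

mean = sum(x * q for x, q in pmf.items())
var = sum((x - mean) ** 2 * q for x, q in pmf.items())

# Closed forms for the mean and variance.
mean_formula = n * r / N
var_formula = r * (N - r) * n * (N - n) / (N**2 * (N - 1))

print(f"mean from pmf = {mean:.4f}, formula = {mean_formula:.4f}")
print(f"variance from pmf = {var:.4f}, formula = {var_formula:.4f}")
```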

Sampling with and without replacement


Suppose a sample is drawn from a large dichotomous population, and let X be the number of successes in the sample.
If the sample is drawn without replacement, then X is hypergeometric. However, if the sample size is small relative to the population size, binomial probabilities provide a good approximation. In this case, in practice, X is modeled as a binomial.
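
How good the binomial approximation is can be explored numerically. In this sketch the population sizes (100,000 versus 30), the success fraction .4, and the sample size 15 are all illustrative assumptions, as are the helper names.

```python
from math import comb

def hypergeom_pmf(x: int, N: int, r: int, n: int) -> float:
    """Exact pmf when sampling n of N without replacement, r successes in the population."""
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

def binom_pmf(x: int, n: int, p: float) -> float:
    """Binomial pmf, used as the approximation with p = r / N."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def max_pmf_gap(N: int, r: int, n: int) -> float:
    """Largest |hypergeometric - binomial| probability over x = 0..n."""
    p = r / N
    return max(abs(hypergeom_pmf(x, N, r, n) - binom_pmf(x, n, p)) for x in range(n + 1))

# Sample of 15 from a large population: sample size is tiny relative to N.
gap_large_population = max_pmf_gap(100_000, 40_000, 15)

# Sample of 15 from a population of only 30: half the population is sampled.
gap_small_population = max_pmf_gap(30, 12, 15)

print(f"N=100000: max pmf gap = {gap_large_population:.6f}")
print(f"N=30:     max pmf gap = {gap_small_population:.4f}")
```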

Continuous random variables

probability density, mean, s.d.

Normal distribution

A random variable X is continuous if P(X = x) = 0 for all x

uniform random variable

Suppose X is a number picked at random from the interval [0, 1]

P(X = x) = 0 for all x

But the probability that X falls in an interval is equal to the length of the interval, and is nonzero.

Probabilities involving X can be modeled by the following picture

[Figure: plot for the uniform random variable X]

Identifies probability with area under a function
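
The "probability equals length" idea can be illustrated with a quick simulation. This is a sketch under stated assumptions: the interval [0.2, 0.5], the seed, the number of draws, and the function name `uniform_interval_prob` are all chosen for illustration.

```python
import random

def uniform_interval_prob(a: float, b: float) -> float:
    """P(a <= X <= b) for X uniform on [0, 1]: length of the overlap of [a, b] with [0, 1]."""
    lo, hi = max(a, 0.0), min(b, 1.0)
    return max(hi - lo, 0.0)

a, b = 0.2, 0.5                       # illustrative interval (assumed)
exact = uniform_interval_prob(a, b)   # area of a rectangle of height 1 over [a, b]

# Monte Carlo check: fraction of simulated draws that land in [a, b].
random.seed(0)                        # fixed seed so the run is repeatable
draws = [random.random() for _ in range(100_000)]
estimate = sum(a <= u <= b for u in draws) / len(draws)

print(f"exact = {exact:.3f}, simulated = {estimate:.3f}")
```

A single point has length 0, which matches P(X = x) = 0, while any interval of positive length gets positive probability.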

Continuous Probability Density Function

The graphical form of the probability distribution for a continuous random variable x is a smooth curve

Density curves
A density curve is a mathematical model of a distribution.
The total area under the curve, by definition, is equal to 1, or 100%.

The area under the curve for a range of values is the probability of all
observations for that range.

Histogram of a sample with the smoothed density curve describing theoretically the population.

Density curves come in any imaginable shape.

Some are well known mathematically and others aren't.

Our interest is in a special type of density called the Normal density or Normal distribution