
Random Variables

A random variable is a numerical description of the outcome of an experiment.
It takes its values from the set of possible outcomes of a random experiment.
Ex. Height, weight, or age, when the values obtained arise as a result of chance factors, so that they cannot be exactly predicted in advance.

A discrete random variable may assume either a finite number of values or an infinite sequence of values.
A continuous random variable may assume any
numerical value in an interval or collection of
intervals.

Which of the following is a discrete random variable?


I. The average height of a randomly selected group of boys.
II. The annual number of sweepstakes winners from New
York City.
III. The number of presidential elections in the 20th century.
Discrete Probability Distributions

The probability distribution for a random variable describes how probabilities are distributed over the values of the random variable.

We can describe a discrete probability distribution with a table, graph, or formula.

The probability distribution is defined by a probability function, denoted by f(x), which provides the probability for each value of the random variable.
The required conditions for a discrete probability function are:
$f(x) \geq 0$

$\sum f(x) = 1$
Ex. Probability distribution of TV sales
Units Sold   Number of Days   x   f(x)
0            80               0   .40
1            50               1   .25
2            40               2   .20
3            10               3   .05
4            20               4   .10
Total        200                  1.00
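The f(x) column can be reproduced by dividing each day count by the total number of days. The following is a minimal sketch of that calculation; the names `days_by_units_sold` and `relative_frequencies` are illustrative and do not come from the slides.

```python
# Number of days on which each sales count was observed (from the TV sales table).
days_by_units_sold = {0: 80, 1: 50, 2: 40, 3: 10, 4: 20}


def relative_frequencies(counts):
    """Convert a frequency table into a probability function f(x)."""
    total = sum(counts.values())
    return {x: n / total for x, n in counts.items()}


f = relative_frequencies(days_by_units_sold)

# Check the two required conditions: f(x) >= 0 and the values sum to 1.
assert all(p >= 0 for p in f.values())
assert abs(sum(f.values()) - 1.0) < 1e-12

for x, p in f.items():
    print(x, p)
```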
The discrete uniform probability distribution is the
simplest example of a discrete probability
distribution given by a formula.

The discrete uniform probability function is

f(x) = 1/n

where:
n = the number of values the random
variable may assume
The expected value, or mean, of a random variable is a measure of its central location.
$E(x) = \mu = \sum x f(x)$

The expected value is a weighted average of the values the random variable may assume.
The variance summarizes the variability in the values of a random variable.
$Var(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$

The variance is a weighted average of the squared deviations of a random variable from its mean. The weights are the probabilities.

The standard deviation, σ, is defined as the positive square root of the variance.
Expected Value

x    f(x)   xf(x)
0    .40    .00
1    .25    .25
2    .20    .40
3    .05    .15
4    .10    .40
E(x) = 1.20, the expected number of TVs sold in a day
Variance

x    x - μ    (x - μ)^2    f(x)    (x - μ)^2 f(x)
0    -1.2     1.44         .40     .576
1    -0.2     0.04         .25     .010
2     0.8     0.64         .20     .128
3     1.8     3.24         .05     .162
4     2.8     7.84         .10     .784
Variance of daily sales = σ^2 = 1.660 TVs squared
Standard deviation of daily sales = 1.2884 TVs
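As a check on the two tables, the expected value, variance, and standard deviation can be computed directly from f(x). This is a small sketch with illustrative helper names (`expected_value`, `variance`); it should reproduce the values 1.20, 1.660, and 1.2884 up to rounding.

```python
import math

# Probability function for daily TV sales, taken from the table above.
f = {0: 0.40, 1: 0.25, 2: 0.20, 3: 0.05, 4: 0.10}


def expected_value(pmf):
    """E(x) = sum of x * f(x)."""
    return sum(x * p for x, p in pmf.items())


def variance(pmf):
    """Var(x) = sum of (x - mu)^2 * f(x)."""
    mu = expected_value(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())


mu = expected_value(f)
var = variance(f)
sd = math.sqrt(var)
print(round(mu, 2), round(var, 3), round(sd, 4))
```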
Workout

Children   Couples   p(X)
0          1
1          4
2          3
3          2
4          2
Total      12
H.W
Probability distribution of a random variable y
Y f(y)
2 .20
4 .30
7 .40
8 .10
a. Compute E(y)
b. Compute Var(y) and s.d
Discrete probability distributions

- The binomial

- The Poisson
Binomial Probability Distributions
• A coin-tossing experiment is a simple example of an
important discrete random variable called the
binomial random variable.

• Other situations that are similar to the coin-tossing experiment:
- A sociologist is interested in the proportion of
elementary school teachers who are men.
- A soft-drink marketer is interested in the proportion
of cola drinkers who prefer her brand.
Definition: A binomial experiment is one that has
these four characteristics:
1. The experiment consists of n identical trials.
2. Each trial results in one of two outcomes: one
outcome is called a success, S, and the other a
failure, F.
3. The probability of success on a single trial is
equal to p and remains the same from trial to trial.
The probability of failure is equal to (1 - p) = q.
4. The trials are independent.
Ex. An insurance salesperson visits 10 randomly selected families. The outcome associated with each visit is classified as a success if the family purchases an insurance policy and a failure if the family does not. From past experience, the salesperson knows that the probability that a randomly selected family will purchase a policy is .10.

Check the properties of a binomial experiment.

1. The experiment consists of 10 identical trials; each trial involves contacting one family.
2. Two outcomes are possible on each trial: the family purchases a policy (success) or the family does not purchase a policy (failure).
3. The probabilities of a purchase and a nonpurchase are assumed to be the same for each sales call, with p = .10 and 1 - p = .90.
4. The trials are independent because the families are randomly selected.
The Binomial Probability function

$f(x) = \frac{n!}{x!(n-x)!} p^x (1-p)^{(n-x)}$

where:
x = the number of successes
p = the probability of a success on one trial
n = the number of trials
f(x) = the probability of x successes in n trials
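The binomial probability function translates almost directly into code. The sketch below uses `math.comb` for n!/(x!(n-x)!) and applies it to the insurance example (n = 10, p = .10); the function name `binomial_pmf` is an illustrative choice.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Insurance example: 10 families visited, each purchases with probability .10.
n, p = 10, 0.10
for x in range(4):
    print(x, round(binomial_pmf(x, n, p), 4))

# The probabilities over all possible x (0 through n) sum to 1.
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```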
Mean and Standard Deviation for the Binomial Random Variable:

Mean: $\mu = np$

Variance: $\sigma^2 = npq$

Standard deviation: $\sigma = \sqrt{npq}$


Ex.1 Mr. X applied for a loan from a PSU bank and learned that, over the years, the bank has received about 2920 loan applications per year and that the probability of approval is 0.85.
a) Mr. X wants to know the average and standard deviation of the number of loans approved per year.
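A short sketch of this calculation follows, treating the 2920 applications as independent binomial trials with p = 0.85 (an assumption the exercise implies but does not state). The variable names are illustrative. Under that assumption the mean comes out at 2482 approvals per year, with a standard deviation of roughly 19.3.

```python
import math

n = 2920   # loan applications per year (from the exercise)
p = 0.85   # probability that an application is approved
q = 1 - p  # probability that an application is not approved

mean_approved = n * p               # mu = np
sd_approved = math.sqrt(n * p * q)  # sigma = sqrt(npq)
print(round(mean_approved), round(sd_approved, 2))
```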
Binomial Formula. Suppose a binomial experiment consists of n trials and results in x successes. If the probability of success on an individual trial is P, then the binomial probability is:
$b(x; n, P) = {}_nC_x \cdot P^x \cdot (1 - P)^{n - x}$

EXAMPLE 2

Suppose a die is tossed 5 times. What is the probability of getting exactly 2 fours?
Solution: This is a binomial experiment in which the number of trials is equal to 5, the number of successes is equal to 2, and the probability of success on a single trial is 1/6 or about 0.167. Therefore, the binomial probability is:
$b(2; 5, 0.167) = {}_5C_2 \cdot (0.167)^2 \cdot (0.833)^3$
$b(2; 5, 0.167) = 0.161$
Ex3. In city x, 30% of workers take public transportation daily.
a. In a sample of 10 workers, what is the probability that exactly three workers take public transportation daily?
b. In a sample of 10 workers, what is the probability that at least three workers take public transportation daily?
n x
a. f ( x)    ( p) (1  p)n  x
 x

10!
f (3)  (.30)3 (1  .30)10 3
3!(10  3)!

10(9)(8)
f (3)  (.30)3 (1  .30)7  .2668
3(2)(1)

b. P(x  3) = 1 - f (0) - f (1) - f (2)

10!
f (0)  (.30)0 (1  .30)10  .0282
0!(10)!

10!
f (1)  (.30)1 (1  .30)9  .1211
1!(9)!

10!
f (2)  (.30)2 (1  .30)8  .2335
2!(8)!

P(x  3) = 1 - .0282 - .1211 - .2335 = .6172
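The two answers can be checked numerically. The sketch below recomputes f(3) and uses the complement rule for "at least three"; `binomial_pmf` and `prob_at_least` are illustrative helper names.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def prob_at_least(k, n, p):
    """P(x >= k), computed as 1 minus the probabilities of 0 through k-1."""
    return 1 - sum(binomial_pmf(x, n, p) for x in range(k))


n, p = 10, 0.30
exactly_three = binomial_pmf(3, n, p)
at_least_three = prob_at_least(3, n, p)
print(round(exactly_three, 4), round(at_least_three, 4))
```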


Ex.4. A university found that 20% of its students
withdraw without completing the introductory
statistics course. Assume that 20 students
registered for the course.
a. Compute the probability that two or fewer will withdraw.
b. Compute the probability that exactly four will withdraw.
c. Compute the probability that more than three will withdraw.
d. Compute the expected number of withdrawals.
a. f(0) + f(1) + f(2) = .0115 + .0576 + .1369 = .2060

b. f(4) = .2182

c. 1 - [f(0) + f(1) + f(2) + f(3)] = 1 - .2060 - .2054 = .5886

d. μ = np = 20(.20) = 4
Ex.5
Nine percent of undergraduate students carry credit card balances greater than 7000. Suppose 10 undergraduate students are selected randomly to be interviewed about credit card usage.
a. Is the selection of 10 students a binomial experiment? Explain.
b. What is the probability that two of the students will have a credit card balance greater than 7000?
c. What is the probability that none will have a credit card balance greater than 7000?
d. What is the probability that at least three will have a credit card balance greater than 7000?
Binomial with n = 10 and p = .09

$f(x) = \frac{10!}{x!(10-x)!} (.09)^x (.91)^{10-x}$

a. Yes. Since they are selected randomly, p is the same from trial to trial and the trials are independent.

b. f (2) = .1714

c. f (0) = .3894

d. 1 - f (0) - f (1) - f (2) = 1 - (.3894 + .3851 + .1714) = .0541


Cumulative Binomial Probability
• A cumulative binomial probability refers to the probability that the binomial random variable falls within a specified range.
• For example, we might be interested in the cumulative binomial probability of obtaining 45 or fewer heads in 100 tosses of a coin. This would be the sum of all these individual binomial probabilities.
• b(x ≤ 45; 100, 0.5) = b(x = 0; 100, 0.5) + b(x = 1; 100, 0.5) + ... + b(x = 44; 100, 0.5) + b(x = 45; 100, 0.5)
Example
The probability that a student is accepted to a prestigious college is 0.3. If 5 students from the same school apply, what is the probability that at most 2 are accepted?
• Solution: To solve this problem, we compute 3 individual probabilities, using the binomial formula. The sum of all these probabilities is the answer we seek. Thus,
• b(x ≤ 2; 5, 0.3) = b(x = 0; 5, 0.3) + b(x = 1; 5, 0.3) + b(x = 2; 5, 0.3)
b(x ≤ 2; 5, 0.3) = 0.1681 + 0.3601 + 0.3087
b(x ≤ 2; 5, 0.3) = 0.8369
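Summing many individual terms by hand is tedious, so a cumulative binomial probability is a natural candidate for a loop. The sketch below defines an illustrative `binomial_cdf` helper and applies it to both the college example and the 45-or-fewer-heads example.

```python
from math import comb


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binomial_cdf(k, n, p):
    """Cumulative probability b(x <= k; n, p)."""
    return sum(binomial_pmf(x, n, p) for x in range(k + 1))


at_most_two_accepted = binomial_cdf(2, 5, 0.3)
at_most_45_heads = binomial_cdf(45, 100, 0.5)
print(round(at_most_two_accepted, 4))
print(round(at_most_45_heads, 4))
```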
Poisson Probability Distributions
• The Poisson probability distribution is a good model for data
that represent the number of occurrences of a specified
event in a given unit of time or space.
• Some examples of Poisson random variables:
- The number of calls received by a switchboard during a given period of time
- The number of customer arrivals at a checkout counter during a given minute
- The number of machine breakdowns during a given day
- The number of traffic accidents at a given intersection during a given time period
• A Poisson experiment is a statistical experiment that has
the following properties:
• The experiment results in outcomes that can be classified
as successes or failures.
• The average number of successes (μ) that occurs in a
specified region is known.
• The probability that a success will occur is proportional to
the size of the region.
• The probability that a success will occur in an extremely
small region is virtually zero.
• Note that the specified region could take many forms. For
instance, it could be a length, an area, a volume, a period
of time, etc.
• Poisson Formula. Suppose we conduct a Poisson experiment, in which the average number of successes within a given region is μ. Then, the Poisson probability is:
$P(x; \mu) = \frac{e^{-\mu} \mu^x}{x!}$
• where x is the actual number of successes that result from the experiment, and e is approximately equal to 2.71828.
Example 1

The average number of homes sold by the Acme Realty company is 2 homes
per day. What is the probability that exactly 3 homes will be sold tomorrow?
• Solution: This is a Poisson experiment in which we know the following:
• μ = 2; since 2 homes are sold per day, on average.
• x = 3; since we want to find the likelihood that 3 homes will be sold
tomorrow.
• e = 2.71828; since e is a constant equal to approximately 2.71828.
• We plug these values into the Poisson formula as follows:
• $P(x; \mu) = \frac{e^{-\mu} \mu^x}{x!}$
$P(3; 2) = \frac{(2.71828^{-2})(2^3)}{3!}$
$P(3; 2) = \frac{(0.13534)(8)}{6}$
$P(3; 2) = 0.180$

• Thus, the probability of selling 3 homes tomorrow is 0.180.
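The Poisson formula is a one-line function in code. This is a minimal sketch (the name `poisson_pmf` is illustrative) applied to the Acme Realty example with μ = 2 and x = 3.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the mean is mu."""
    return exp(-mu) * mu**x / factorial(x)


three_homes = poisson_pmf(3, 2)
print(round(three_homes, 3))
```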


Ex.2. On average, 15 aircraft accidents occur each year.
a. Compute the mean number of accidents per month.
b. Compute the probability of no accidents during a month.
c. Compute the probability of exactly one accident during a month.
d. Compute the probability of more than one accident during a month.
a. μ = 15/12 = 1.25 per month

b. $f(0) = \frac{1.25^0 e^{-1.25}}{0!} = .2865$

c. $f(1) = \frac{1.25^1 e^{-1.25}}{1!} = .3581$

d. P(More than 1) = 1 - f(0) - f(1) = 1 - 0.2865 - 0.3581 = .3554


Ex.3 Airline passengers arrive randomly and
independently at the passenger-screening facility at
a major international airport. The mean arrival rate
is 10 passengers per minute.
a. Compute the probability of no arrivals in a one-
minute period.
b. Compute the probability that three or fewer
passengers arrive in a one-minute period.
c. Compute the probability of no arrivals in a 15
second period.
d. Compute the probability of at least one arrival in a
15 second period.
a. $f(0) = \frac{10^0 e^{-10}}{0!} = e^{-10} = .000045$

b. f(0) + f(1) + f(2) + f(3)

f(0) = .000045 (part a)

$f(1) = \frac{10^1 e^{-10}}{1!} = .00045$

Similarly, f(2) = .00225, f(3) = .0075

and f(0) + f(1) + f(2) + f(3) = .010245

c. 2.5 arrivals / 15 sec. period. Use μ = 2.5

$f(0) = \frac{2.5^0 e^{-2.5}}{0!} = .0821$

d. 1 - f(0) = 1 - .0821 = .9179
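The same steps can be sketched in code, including the rescaling of the mean from one minute to 15 seconds. The helper names are illustrative. Note that carrying e^-10 at full precision gives a slightly larger sum for part b (about .0103) than the hand calculation, which rounds f(0) to .000045 before multiplying.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """Probability of exactly x occurrences when the mean is mu."""
    return exp(-mu) * mu**x / factorial(x)


def poisson_cdf(k, mu):
    """Probability of k or fewer occurrences."""
    return sum(poisson_pmf(x, mu) for x in range(k + 1))


rate_per_minute = 10
three_or_fewer = poisson_cdf(3, rate_per_minute)

# Rescale the mean to the interval of interest: 15 seconds is a quarter of a minute.
mu_15_seconds = rate_per_minute * 15 / 60
no_arrivals_15s = poisson_pmf(0, mu_15_seconds)
at_least_one_15s = 1 - no_arrivals_15s

print(round(three_or_fewer, 4), round(no_arrivals_15s, 4), round(at_least_one_15s, 4))
```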


Less than 3 (P(x < 3))
f(0) + f(1) + f(2)
Greater than or more than 3 (P(x > 3))
1 - [f(0) + f(1) + f(2) + f(3)]
At least 3 (P(x ≥ 3))
1 - [f(0) + f(1) + f(2)]
At most 3 (P(x ≤ 3))
f(0) + f(1) + f(2) + f(3)
Continuous Probability Distributions
• A continuous random variable can assume any value in an interval on the real line or in a collection of intervals.
• It is not possible to talk about the probability of the random variable assuming a particular value.
• Instead, we talk about the probability of the random variable assuming a value within a given interval.
• Uniform Probability Distribution
• Normal Probability Distribution
• Exponential Probability Distribution

[Figure: f(x) plotted against x for the uniform, normal, and exponential distributions]
• The probability of the random variable assuming a value within some given interval from x1 to x2 is defined to be the area under the graph of the probability density function between x1 and x2.

[Figure: the area under f(x) between x1 and x2, shown for the uniform, normal, and exponential distributions]
Normal Probability Distribution
• Normal Probability Density Function
$f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-(x - \mu)^2 / 2\sigma^2}$

where:
μ = mean
σ = standard deviation
π = 3.14159
e = 2.71828
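For readers who want to evaluate the density numerically, the formula can be written as a small function. This is a sketch only; `normal_pdf` is an illustrative name, and the mean of 50 and standard deviation of 5 used in the loop are example values. Remember that f(x) is a height of the curve, not a probability.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    coefficient = 1 / (sigma * sqrt(2 * pi))
    exponent = -((x - mu) ** 2) / (2 * sigma**2)
    return coefficient * exp(exponent)


# The curve peaks at the mean and is symmetric around it.
for x in (40, 45, 50, 55, 60):
    print(x, round(normal_pdf(x, mu=50, sigma=5), 5))
```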
• Characteristics
- The distribution is symmetric; its skewness measure is zero.
- The entire family of normal probability distributions is defined by its mean μ and its standard deviation σ.
- The highest point on the normal curve is at the mean, which is also the median and mode.
- The mean can be any numerical value: negative, zero, or positive (for example, -10, 0, or 25).
- The standard deviation determines the width of the curve: larger values result in wider, flatter curves (for example, σ = 25 gives a wider, flatter curve than σ = 15).
- Probabilities for the normal random variable are given by areas under the curve. The total area under the curve is 1 (.5 to the left of the mean and .5 to the right).
• Characteristics (basis for the empirical rule)
- 68.26% of values of a normal random variable are within +/- 1 standard deviation of its mean.
- 95.44% of values of a normal random variable are within +/- 2 standard deviations of its mean.
- 99.72% of values of a normal random variable are within +/- 3 standard deviations of its mean.
[Figure: normal curve showing 68.26% of the area within μ ± 1σ, 95.44% within μ ± 2σ, and 99.72% within μ ± 3σ]
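The empirical-rule percentages can be recomputed with Python's standard-library `statistics.NormalDist`. The percentages quoted above come from a four-decimal table; software gives approximately 68.27%, 95.45%, and 99.73%, so small differences in the last digit are rounding effects. The helper name `area_within` is illustrative.

```python
from statistics import NormalDist


def area_within(k, mu=0.0, sigma=1.0):
    """Area under the normal curve within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
```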
Standard Normal Probability Distribution
• Characteristics
- A random variable having a normal distribution with a mean of 0 and a standard deviation of 1 is said to have a standard normal probability distribution.
- The letter z is used to designate the standard normal random variable.

[Figure: standard normal curve centered at z = 0 with σ = 1]
• Converting to the Standard Normal Distribution

$z = \frac{x - \mu}{\sigma}$

We can think of z as a measure of the number of standard deviations x is from μ.
1. When x is less than the mean, the value of z is negative.
2. When x is more than the mean, the value of z is positive.
3. When x equals the mean, z = 0.
Two types of questions

1. The question specifies a value, or values, for z and asks us to use the table to determine the corresponding areas or probabilities.
2. The question provides an area, or probability, and asks us to use the table to determine the corresponding z value.
Computing probabilities for a Normal Distribution
To find the probability of being    Procedure
Less than z                         Look up z in the table
More than z                         Subtract the above answer from 1
Between z1 and z2                   Look up z1 and z2 in the table and subtract the smaller probability from the larger
Not between z1 and z2               Subtract the above answer from 1
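The four procedures map onto four small functions if the table lookup is replaced by a cumulative distribution function. The sketch below uses `statistics.NormalDist().cdf` in place of the printed table; the function names are illustrative, and results may differ from table values in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def less_than(z):
    """P(Z < z): the table lookup."""
    return Z.cdf(z)


def more_than(z):
    """P(Z > z): subtract the lookup from 1."""
    return 1 - Z.cdf(z)


def between(z1, z2):
    """P(z1 < Z < z2): larger cumulative probability minus the smaller."""
    lo, hi = sorted((z1, z2))
    return Z.cdf(hi) - Z.cdf(lo)


def not_between(z1, z2):
    """Probability of falling outside the interval from z1 to z2."""
    return 1 - between(z1, z2)


print(round(less_than(1.20), 4), round(more_than(0.44), 4))
print(round(between(0, 0.83), 4), round(not_between(-1.96, 1.96), 4))
```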
A random variable is normally distributed with a mean of 50 and a standard deviation of 5.
a. Sketch a normal curve for the probability density function. Label the horizontal axis with values of 35, 40, 45, 50, 55, 60, and 65.
b. What is the probability the random variable will assume a value between 45 and 55?
c. What is the probability the random variable will assume a value between 40 and 60?
Chapter 6. 12. page 281.
a. P(0 ≤ z ≤ .83) = .7967 - .5000 = .2967
b. P(-1.57 ≤ z ≤ 0) = .5000 - .0582 = .4418
c. P(z > .44) = 1 - .6700 = .3300
d. P(z ≥ -.23) = 1 - .4090 = .5910
e. P(z < 1.20) = .8849
 f. P(z ≤ -.71) = .2389
 
Ex.2 Given that z is a standard normal random
variable, find z for each situation
a. The area to the left of z is .9750
b. The area between 0 and z is .4750
c. The area to the left of z is .7291
d. The area to the right of z is .1314
e. The area to the left of z is .6700
f. The area to the right of z is .3300
a. The z value corresponding to a cumulative probability of .9750 is z = 1.96.

b. The area to the left of z is .5000 + .4750 = .9750, so this z value also corresponds to a cumulative probability of .9750: z = 1.96.

c. The z value corresponding to a cumulative probability of .7291 is z = .61.

d. Area to the left of z is 1 - .1314 = .8686. So z = 1.12.

e. The z value corresponding to a cumulative probability of .6700 is z = .44.

f. The area to the left of z is 1 - .3300 = .6700. So z = .44.
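Going from an area back to a z value is the inverse of the table lookup. The sketch below uses `NormalDist().inv_cdf` from the standard library for that step; the three wrapper names are illustrative. Rounded to two decimals, the results should agree with the table-based answers above.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def z_for_left_area(area):
    """z such that the area to the left of z equals `area`."""
    return Z.inv_cdf(area)


def z_for_right_area(area):
    """z such that the area to the right of z equals `area`."""
    return Z.inv_cdf(1 - area)


def z_for_area_from_zero(area):
    """Positive z such that the area between 0 and z equals `area`."""
    return Z.inv_cdf(0.5 + area)


print(round(z_for_left_area(0.9750), 2))
print(round(z_for_area_from_zero(0.4750), 2))
print(round(z_for_right_area(0.1314), 2))
```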


Ex. Given that Z is a standard normal random
variable, find z for each situation.
a. The area to the left of z is .2119
b. The area between –z and +z is .9030
c. The area between –z and +z is .2052
d. The area to the left of z is .9948
e. The area to the right of z is .6915
a. The z value corresponding to a cumulative probability of .2119 is z = -.80.

b. Compute .9030/2 = .4515; z corresponds to a cumulative probability of .5000 + .4515 = .9515. So z = 1.66.

c. Compute .2052/2 = .1026; z corresponds to a cumulative probability of .5000 + .1026 = .6026. So z = .26.

d. The z value corresponding to a cumulative probability of .9948 is z = 2.56.

e. The area to the left of z is 1 - .6915 = .3085. So z = -.50.
Ex.3.
For borrowers with good credit scores, the mean debt for revolving and installment accounts is 15015. Assume the standard deviation is 3540 and the debt amounts are normally distributed.
a. What is the probability that the debt for a borrower with good credit is more than 18000?
b. What is the probability that the debt for a borrower with good credit is less than 10000?
c. What is the probability that the debt for a borrower with good credit is between 12000 and 18000?
d. What is the probability that the debt for a borrower with good credit is no more than 14000?
Let x = debt amount

μ = 15,015, σ = 3540

a. $z = \frac{18{,}000 - 15{,}015}{3540} = .84$

P(x > 18,000) = 1 - P(z ≤ .84) = 1 - .7995 = .2005

b. $z = \frac{10{,}000 - 15{,}015}{3540} = -1.42$

P(x < 10,000) = P(z < -1.42) = .0778

c. At 18,000, z = .84 from part (a)

At 12,000, $z = \frac{12{,}000 - 15{,}015}{3540} = -.85$

P(12,000 < x < 18,000) = P(-.85 < z < .84) = .7995 - .1977 = .6018

d. $z = \frac{14{,}000 - 15{,}015}{3540} = -.29$

P(x ≤ 14,000) = P(z ≤ -.29) = .3859
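These four probabilities can also be obtained without standardizing by hand, by giving the mean and standard deviation to `statistics.NormalDist` directly. This is a sketch with illustrative variable names. Because the hand solution rounds z to two decimals before the table lookup, the software values differ slightly in the third decimal place.

```python
from statistics import NormalDist

# Debt amounts for borrowers with good credit, as given in the exercise.
debt = NormalDist(mu=15015, sigma=3540)

more_than_18000 = 1 - debt.cdf(18000)
less_than_10000 = debt.cdf(10000)
between_12000_and_18000 = debt.cdf(18000) - debt.cdf(12000)
no_more_than_14000 = debt.cdf(14000)

for value in (more_than_18000, less_than_10000,
              between_12000_and_18000, no_more_than_14000):
    print(round(value, 4))
```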


Ex. The average stock price for companies making up the S&P 500 is $30 and the standard deviation is $8.20. Assume the stock prices are normally distributed.
a. What is the probability a company will have a stock price of at least $40?
b. What is the probability a company will have a stock price no higher than $20?
c. How high does a stock price have to be to put a company in the top 10%?
μ = 30 and σ = 8.2

a. At x = 40, $z = \frac{40 - 30}{8.2} = 1.22$

P(z ≤ 1.22) = .8888

P(x ≥ 40) = 1 - .8888 = .1112

b. At x = 20, $z = \frac{20 - 30}{8.2} = -1.22$

P(z ≤ -1.22) = .1112

So, P(x ≤ 20) = .1112

c. A z-value of 1.28 cuts off an area of approximately 10% in the upper tail.

x = 30 + 8.2(1.28) = 40.50

A stock price of $40.50 or higher will put a company in the top 10%.
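Part c works in the opposite direction from parts a and b: it starts from an area and asks for x. A minimal sketch using `NormalDist.inv_cdf` is shown below, with illustrative variable names. The cutoff comes out near 40.51 rather than 40.50 because the hand solution rounds z to 1.28.

```python
from statistics import NormalDist

# Stock prices as described in the exercise: mean $30, standard deviation $8.20.
price = NormalDist(mu=30, sigma=8.2)

at_least_40 = 1 - price.cdf(40)
no_higher_than_20 = price.cdf(20)

# Top 10% means 90% of the area lies to the left of the cutoff.
top_10_cutoff = price.inv_cdf(0.90)

print(round(at_least_40, 4), round(no_higher_than_20, 4), round(top_10_cutoff, 2))
```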
Ex.4 The average return for large-cap domestic stock funds over the three years 2009-2011 was 14.4%. Assume the three-year returns were normally distributed across funds with a standard deviation of 4.4%.
a. What is the probability an individual large-cap domestic stock fund had a three-year return of at least 20%?
b. What is the probability an individual large-cap domestic stock fund had a three-year return of 10% or less?
c. How big does the return have to be to put a domestic stock fund in the top 10% for the three-year period?
Ex. According to the Sleep Foundation, the average night's sleep is 6.8 hours. Assume the standard deviation is .6 hours and that the probability distribution is normal.
a. What is the probability that a randomly selected person sleeps more than 8 hours?
b. What is the probability that a randomly selected person sleeps 6 hours or less?
c. Doctors suggest getting between 7 and 9 hours of sleep each night. What percentage of the population gets this much sleep?
μ = 6.8, σ = .6

a. At x = 8, $z = \frac{8 - 6.8}{.6} = 2.00$

P(x > 8) = P(z > 2.0) = 1 - .9772 = .0228

b. At x = 6, $z = \frac{6 - 6.8}{.6} = -1.33$

P(x ≤ 6) = P(z ≤ -1.33) = .0918

c. At x = 9, $z = \frac{9 - 6.8}{.6} = 3.67$

At x = 7, $z = \frac{7 - 6.8}{.6} = .33$

P(7 < x < 9) = P(.33 < z < 3.67) ≈ 1 - .6293 = .3707, since P(z < 3.67) is essentially 1.

Only 37.07 percent of the population gets the amount of sleep recommended by doctors. Most get less.
