
Statistics for Business Analysis

Day 5 Session-I CONTINUOUS PROBABILITY DISTRIBUTIONS

Learning Objectives
To learn the concept of a continuous random variable
To compute probabilities from the normal distribution
To use the normal probability plot to determine whether a set of data is approximately normally distributed
To compute probabilities from the uniform distribution
To compute probabilities from the exponential distribution
To use the normal distribution to approximate probabilities from the binomial distribution

Created by: Prabhat Mittal Email-Id: profmittal@yahoo.co.in

Probability Distributions
Discrete Probability Distributions: Binomial, Poisson, Hypergeometric
Continuous Probability Distributions: Normal, Uniform, Exponential

Continuous Probability Distributions


A continuous random variable is a variable that can assume any value on a continuum (can assume an uncountable number of values)
Examples:
thickness of an item
time required to complete a task
temperature of a solution
height, in inches

These can potentially take on any value, depending only on the ability to measure accurately.


The Normal Distribution


Bell shaped
Symmetrical
Mean, median and mode are equal
Location is determined by the mean, $\mu$
Spread is determined by the standard deviation, $\sigma$
The random variable has an infinite theoretical range: $-\infty$ to $+\infty$
[Figure: bell-shaped curve f(X), centered where mean = median = mode]

Many Normal Distributions

By varying the parameters $\mu$ and $\sigma$, we obtain different normal distributions.

The Normal Distribution Shape


Changing $\mu$ shifts the distribution left or right.
Changing $\sigma$ increases or decreases the spread.

The Normal Probability Density Function


The formula for the normal probability density function with $X \sim \text{Normal}(\mu, \sigma^2)$ is

$$f(X) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{1}{2}\left[(X - \mu)/\sigma\right]^2}$$

Where

e = the mathematical constant approximated by 2.71828
$\pi$ = the mathematical constant approximated by 3.14159
$\mu$ = the population mean
$\sigma$ = the population standard deviation
X = any value of the continuous variable
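The density formula can be checked numerically. The following is a minimal sketch, not part of the original slides; the function name `normal_pdf` and the values mean 100 and standard deviation 50 (borrowed from the later example) are chosen only for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma^2) at x, written term by term from the formula."""
    coefficient = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    exponent = -0.5 * ((x - mu) / sigma) ** 2
    return coefficient * math.exp(exponent)

# Illustrative values only: mean 100, standard deviation 50.
peak = normal_pdf(100, 100, 50)       # height of the curve at the mean
one_sd_up = normal_pdf(150, 100, 50)  # height one standard deviation above the mean
one_sd_down = normal_pdf(50, 100, 50) # symmetric, so the same height as one_sd_up
print(peak, one_sd_up, one_sd_down)
```

The curve is highest at the mean and has equal heights at equal distances on either side of it, which is the symmetry property listed earlier.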

Empirical Rules
What can we say about the distribution of values around the mean? There are some general rules:
$\mu \pm 1\sigma$ encloses about 68% of X's (68.26%)
$\mu \pm 2\sigma$ covers about 95% of X's (95.44%)
$\mu \pm 3\sigma$ covers about 99.7% of X's (99.73%)
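The empirical rule can be reproduced from the cumulative standardized normal distribution. This is a small sketch using Python's standard `statistics.NormalDist`; the helper name `within` is hypothetical. Values computed this way may differ in the last digit from the slide figures, which appear to come from a four-decimal table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standardized normal: mean 0, standard deviation 1

def within(k):
    """P(mu - k*sigma < X < mu + k*sigma); the same for every normal distribution."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {within(k):.4f}")
```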

The Standardized Normal Distribution


Also known as the Z distribution
Mean is 0
Standard deviation is 1
Values above the mean have positive Z-values, values below the mean have negative Z-values

The Standardized Normal


Any normal distribution (with any mean and standard deviation combination) can be transformed into the standardized normal distribution (Z).
Need to transform X units into Z units.

Translation to the Standardized Normal Distribution


Translate from X to the standardized normal (the Z distribution) by subtracting the mean of X and dividing by its standard deviation:

$$Z = \frac{X - \mu}{\sigma}$$

The Z distribution always has mean = 0 and standard deviation = 1

The Standardized Normal Probability Density Function


The formula for the standardized normal probability density function is

$$f(Z) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}Z^2}$$

Where

e = the mathematical constant approximated by 2.71828
$\pi$ = the mathematical constant approximated by 3.14159
Z = any value of the standardized normal distribution

Example
If X is distributed normally with mean of 100 and standard deviation of 50, the Z value for X = 200 is

$$Z = \frac{X - \mu}{\sigma} = \frac{200 - 100}{50} = 2.0$$

This says that X = 200 is two standard deviations (2 increments of 50 units) above the mean of 100.
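The translation between X units and Z units is simple enough to sketch in code. The function names `z_score` and `x_from_z` are hypothetical, introduced only for this illustration.

```python
def z_score(x, mu, sigma):
    """Translate an X value into standardized (Z) units."""
    return (x - mu) / sigma

def x_from_z(z, mu, sigma):
    """Translate a Z value back into the original X units."""
    return mu + z * sigma

z = z_score(200, mu=100, sigma=50)
print(z)                        # 2.0, two standard deviations above the mean
print(x_from_z(z, 100, 50))     # 200.0, back in original units
```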

Comparing X and Z units

[Figure: the same normal curve on two scales. X scale ($\mu = 100$, $\sigma = 50$): 100 and 200. Z scale ($\mu = 0$, $\sigma = 1$): 0 and 2.0.]

Note that the distribution is the same, only the scale has changed. We can express the problem in original units (X) or in standardized units (Z).

Finding Normal Probabilities


Probability is measured by the area under the curve

$$P(a \le X \le b) = P(a < X < b)$$

(Note that the probability of any individual value is zero)

Probability as Area Under the Curve


The total area under the curve is 1.0, and the curve is symmetric, so half is above the mean, half is below:

$$P(-\infty < X < \mu) = 0.5$$
$$P(\mu < X < \infty) = 0.5$$
$$P(-\infty < X < \infty) = 1.0$$

The Standardized Normal Table


The Cumulative Standardized Normal table in the textbook (Appendix table E.2) gives the probability less than a desired value for Z (i.e., from negative infinity to Z)
Example: P(Z < 2.00) = 0.9772

The Standardized Normal Table


(continued)

The column gives the value of Z to the second decimal point.
The row shows the value of Z to the first decimal point.
The value within the table gives the probability from $Z = -\infty$ up to the desired Z value.

Z     0.00   0.01   0.02
0.0
0.1
. . .
2.0   .9772

P(Z < 2.00) = 0.9772
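A printed table lookup can be mimicked with the cumulative distribution function in Python's standard library. This is only a sketch; the helper name `table_lookup` is hypothetical, and rounding to four decimals is an assumption made to match the precision of the printed table.

```python
from statistics import NormalDist

def table_lookup(z):
    """Cumulative probability P(Z < z), rounded to four decimals like a printed table."""
    return round(NormalDist().cdf(z), 4)

print(table_lookup(2.00))  # 0.9772
print(table_lookup(0.00))  # 0.5
```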

General Procedure for Finding Probabilities


To find P(a < X < b) when X is distributed normally:
Draw the normal curve for the problem in terms of X
Translate X-values to Z-values
Use the Standardized Normal Table
Suppose X is normal with $\mu = 8.0$ and $\sigma = 5.0$. Find P(X < 8.6)

Finding Normal Probabilities


(continued)

Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(X < 8.6)

$$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8.0}{5.0} = 0.12$$

[Figure: X scale ($\mu = 8$, $\sigma = 5$) with 8 and 8.6 marked; Z scale ($\mu = 0$, $\sigma = 1$) with 0 and 0.12 marked.]

P(X < 8.6) = P(Z < 0.12)

Solution: Finding P(Z < 0.12)


Standardized Normal Probability Table (Portion)

Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255

P(X < 8.6) = P(Z < 0.12) = .5478

Upper Tail Probabilities


(continued)

Now find P(X > 8.6)

$$P(X > 8.6) = P(Z > 0.12) = 1.0 - P(Z \le 0.12) = 1.0 - 0.5478 = 0.4522$$

Probability Between Two Values


Suppose X is normal with mean 8.0 and standard deviation 5.0. Find P(8 < X < 8.6)
Calculate Z-values:

$$Z = \frac{X - \mu}{\sigma} = \frac{8 - 8}{5} = 0$$

$$Z = \frac{X - \mu}{\sigma} = \frac{8.6 - 8}{5} = 0.12$$

P(8 < X < 8.6) = P(0 < Z < 0.12)

Solution: Finding P(0 < Z < 0.12)


Standardized Normal Probability Table (Portion)

Z     .00    .01    .02
0.0  .5000  .5040  .5080
0.1  .5398  .5438  .5478
0.2  .5793  .5832  .5871
0.3  .6179  .6217  .6255

$$P(8 < X < 8.6) = P(0 < Z < 0.12) = P(Z < 0.12) - P(Z \le 0) = 0.5478 - 0.5000 = 0.0478$$

Probabilities in the Lower Tail


Suppose X is normal with mean 8.0 and standard deviation 5.0. Now Find P(7.4 < X < 8)

$$P(7.4 < X < 8) = P(-0.12 < Z < 0) = P(Z < 0) - P(Z \le -0.12) = 0.5000 - 0.4522 = 0.0478$$

The normal distribution is symmetric, so this probability is the same as P(0 < Z < 0.12).
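The four probabilities worked out for the distribution with mean 8.0 and standard deviation 5.0 can be cross-checked without a table. The following is a sketch using `statistics.NormalDist`; the variable names are illustrative and not from the slides.

```python
from statistics import NormalDist

x = NormalDist(mu=8.0, sigma=5.0)

below = x.cdf(8.6)                  # P(X < 8.6)
above = 1.0 - x.cdf(8.6)            # P(X > 8.6), the upper tail
between = x.cdf(8.6) - x.cdf(8.0)   # P(8 < X < 8.6)
lower = x.cdf(8.0) - x.cdf(7.4)     # P(7.4 < X < 8), equal to `between` by symmetry

for name, p in [("below", below), ("above", above), ("between", between), ("lower", lower)]:
    print(f"{name}: {p:.4f}")
```

Each result should agree with the table-based answers (0.5478, 0.4522, 0.0478, 0.0478) to four decimals.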

Numerical Problems
Ref. # 5-38 Page No.270: Given that a random variable, X, has a normal distribution with mean 6.4 and standard deviation 2.7, find
a. P(4.0 < X < 5.0)
b. P(X > 2.0)
c. P(X < 7.2)
d. P(X < 3.0)
Ans. a. .1150 b. .9484 c. .6165 d. .1038

Numerical Problems
Ref. # 5-46 Page No.271: Glenn Howell, VP of personnel for the Standard Insurance Company, has developed a new training program that is entirely self-paced. New employees work through various stages at their own pace; completion occurs when the material is learned. Howell's program has been especially effective in speeding up the training process, as an employee's salary during training is only 67% of that earned upon completion of the program. In the last several years, average completion time of the program was 44 days, and the standard deviation was 12 days.
a. What is the probability that an employee will finish the program in 33 to 42 days?
b. What is the probability of finishing the program in fewer than 30 days?
c. Fewer than 25 or more than 60 days?
Ans. a. .2537 b. .1210 c. .1489

Numerical Problems
Ref. # 5-48 Page No.271: R. V. Poppin, the concession stand manager for the local hockey rink, just had 2 cancellations on his crew. This means that if more than 72000 people come to tonight's hockey game, the lines for hot dogs will constitute a disgrace to Mr. Poppin and will harm business at future games. Mr. Poppin knows from experience that the number of people who come to the game is normally distributed with mean 67000 and standard deviation 4000 people.
a. What is the probability that there will be more than 72000 people?
b. Suppose Mr. Poppin can hire two temporary employees to make sure business won't be harmed in the future at an additional cost of $200. If he believes the future harm to business of having more than 72000 fans at the game would be $5000, should he hire the employees? Explain. (Assume there will be no harm if 72000 or fewer fans show up, and that the harm due to too many fans doesn't depend on how many more than 72000 show up.)
Ans.
a. 0.1056
b. Yes, the $200 cost is less than the $528 expected loss to future business. (Since P(X > 72000) = 0.1056, the expected loss is $5000 × 0.1056 = $528.)
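The reasoning in this answer (a tail probability followed by an expected-loss comparison) can be sketched as follows. The variable names are illustrative only, and the computed probability may differ slightly in the later decimals from the table value 0.1056.

```python
from statistics import NormalDist

attendance = NormalDist(mu=67000, sigma=4000)

p_overflow = 1.0 - attendance.cdf(72000)   # P(X > 72000)
expected_loss = 5000 * p_overflow          # expected future harm if nobody is hired
hire_cost = 200

print(f"P(X > 72000) = {p_overflow:.4f}")
print(f"expected loss = ${expected_loss:.0f}")
print("hire the employees" if hire_cost < expected_loss else "do not hire")
```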

Numerical Problems
Ref. # 5-49 Page No.272: Maurine Lewis, an editor for a large publishing company, calculates that it requires 11 months on average to complete the publication process from manuscript to finished book, with a standard deviation of 2.4 months. She believes that the normal distribution well describes the distribution of publication times. Out of 19 books she will handle this year, approximately how many will complete the process in less than a year?
Ans. 13, as P(X < 12) = .6615 and 19 × .6615 ≈ 12.6

Finding the X value for a Known Probability


Steps to find the X value for a known probability:
1. Find the Z value for the known probability
2. Convert to X units using the formula:

$$X = \mu + Z\sigma$$

Finding the X value for a Known Probability


(continued)

Example: Suppose X is normal with mean 8.0 and standard deviation 5.0. Now find the X value so that only 20% of all values are below this X.
[Figure: normal curve with area 0.2000 in the lower tail below the unknown X value; X = 8.0 corresponds to Z = 0]

Find the Z value for 20% in the Lower Tail


1. Find the Z value for the known probability
Standardized Normal Probability Table (Portion)

Z      .03    .04    .05
-0.9  .1762  .1736  .1711
-0.8  .2033  .2005  .1977
-0.7  .2327  .2296  .2266

20% area in the lower tail is consistent with a Z value of -0.84
[Figure: lower-tail area 0.2000; the unknown X corresponds to Z = -0.84, and X = 8.0 corresponds to Z = 0]

Finding the X value


2. Convert to X units using the formula:

$$X = \mu + Z\sigma = 8.0 + (-0.84)(5.0) = 3.80$$

So 20% of the values from a distribution with mean 8.0 and standard deviation 5.0 are less than 3.80.
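The two-step procedure (find Z, then convert to X) can be sketched with the inverse cumulative distribution function from Python's standard library. The variable names are illustrative. Because the code does not round Z to two decimals as the table does, it returns a value slightly below the slide's 3.80.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0

# Step 1: Z value with 20% of the area below it (the table gives about -0.84).
z_value = NormalDist().inv_cdf(0.20)

# Step 2: convert to X units with X = mu + Z * sigma.
x_value = mu + z_value * sigma

# The same answer in one step, working directly in X units.
x_direct = NormalDist(mu, sigma).inv_cdf(0.20)

print(round(z_value, 2), round(x_value, 2), round(x_direct, 2))
```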

Numerical Problems
Ref. # 5-39 Page No.270: In a normal distribution with a standard deviation of 5.0, the probability that an observation selected at random exceeds 21 is 0.14.
a. Find the mean of the distribution.
b. Find the value below which 4% of the values in the distribution lie.
Ans.
a. Find the value of Z such that P(X > 21) = 0.14, or P(X < 21) = 0.86. The value of Z from the table is +1.08 (+ sign because the area is on the right of $\mu$), so $\mu = X - Z\sigma = 21 - 1.08(5.0) = 15.6$
b. The Z value with 4% of the area below it is -1.75, so $X = \mu + Z\sigma = 15.6 + (-1.75)(5.0) = 6.85$, i.e. P(X < 6.85) = 4%

Numerical Problems
Ref. # 5-44 Page No.271: Jarrid Medical, Inc., is developing a compact kidney dialysis machine, but its chief engineer, Mike Crowe, is having trouble controlling the variability of the rate at which fluid moves through the device. Medical standards require that the hourly flow be 4 liters, plus or minus 0.1 liter, 80% of the time. Mr. Crowe, in testing the prototype, has found that 68% of the time, the hourly flow is within 0.08 liter of 4.02 liters. Does the prototype satisfy the medical standards?
Ans. Given P(4.02 - 0.08 < X < 4.02 + 0.08) = 68%, obtain $\mu$ and $\sigma$, then test whether P(3.9 < X < 4.1) reaches 80%.
Since about 68% of the values lie within $\mu \pm 1\sigma$, the interval 4.02 ± 0.08 implies $\mu = 4.02$, $\sigma = 0.08$.
P(3.9 < X < 4.1) = P(-1.5 < Z < 1.0) = 0.7745, i.e. 77.45%, not 80%, so the prototype does not satisfy the standards.

Numerical Problems
Ref. # 5-50 Page No.272: The Quickie Sales Corporation has just been given two conflicting estimates of sales for the upcoming quarter. Estimate I says that sales (in millions of dollars) will be normally distributed with $\mu = 325$ and $\sigma = 60$. Estimate II says that sales will be normally distributed with $\mu = 300$ and $\sigma = 50$. The board of directors finds that each estimate appears to be equally believable a priori. In order to determine which estimate should be used for future predictions, the board of directors has decided to meet again at the end of the quarter to use updated sales information to make a statement about the credibility of each estimate.
a. Assuming that Estimate I is accurate, what is the probability that Quickie will have quarterly sales in excess of $350 million?
b. Rework part (a) assuming that Estimate II is correct.
c. At the end of the quarter, the board of directors finds that Quickie Sales Corp. has had sales in excess of $350 million. Given this updated information, what is the probability that Estimate I was originally the accurate one? (Use Bayes' Theorem)
d. Rework part (c) for Estimate II.
Ans.
a. 0.3385
b. 0.1587
c. P(E1 | X > 350) = ?

Solution Ctd
With D denoting the event X > 350 and equal priors P(E1) = P(E2) = 0.5:

$$P(E_1 \mid D) = \frac{P(D \mid E_1)P(E_1)}{P(D \mid E_1)P(E_1) + P(D \mid E_2)P(E_2)} = \frac{(0.3385)(0.5)}{(0.3385)(0.5) + (0.1587)(0.5)} = \frac{0.1692}{0.1692 + 0.0793} = 0.68$$

$$P(E_2 \mid D) = \frac{P(D \mid E_2)P(E_2)}{P(D \mid E_1)P(E_1) + P(D \mid E_2)P(E_2)} = \frac{(0.1587)(0.5)}{(0.3385)(0.5) + (0.1587)(0.5)} = \frac{0.0793}{0.1692 + 0.0793} = 0.32$$
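The Bayes' Theorem calculation can be sketched in code, combining the two normal tail probabilities with equal priors. The helper name `posteriors` is hypothetical and not part of the original solution.

```python
from statistics import NormalDist

def posteriors(likelihoods, priors):
    """Bayes' Theorem: posterior of each hypothesis given the observed event."""
    joint = [like * prior for like, prior in zip(likelihoods, priors)]
    total = sum(joint)
    return [j / total for j in joint]

# P(D | E1) and P(D | E2), where D is the event "sales exceed 350".
p_d_given_e1 = 1.0 - NormalDist(mu=325, sigma=60).cdf(350)
p_d_given_e2 = 1.0 - NormalDist(mu=300, sigma=50).cdf(350)

post_e1, post_e2 = posteriors([p_d_given_e1, p_d_given_e2], [0.5, 0.5])
print(f"P(E1 | D) = {post_e1:.2f}, P(E2 | D) = {post_e2:.2f}")
```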

Evaluating Normality
Not all continuous random variables are normally distributed.
It is important to evaluate how well the data set is approximated by a normal distribution.

Evaluating Normality
(continued)

Construct charts or graphs


For small- or moderate-sized data sets, do the stem-and-leaf display and box-and-whisker plot look symmetric?
For large data sets, does the histogram or polygon appear bell-shaped?

Compute descriptive summary measures

Do the mean, median and mode have similar values?
Is the interquartile range approximately 1.33 $\sigma$?
Is the range approximately 6 $\sigma$?

Assessing Normality
(continued)

Observe the distribution of the data set


Do approximately 2/3 of the observations lie within mean $\pm$ 1 standard deviation?
Do approximately 80% of the observations lie within mean $\pm$ 1.28 standard deviations?
Do approximately 95% of the observations lie within mean $\pm$ 2 standard deviations?

Evaluate normal probability plot


Is the normal probability plot approximately linear with positive slope?

The Normal Probability Plot


Normal probability plot
1. Arrange data into an ordered array
2. Find the corresponding standardized normal quantile values
3. Plot the pairs of points with observed data values on the vertical axis and the standardized normal quantile values on the horizontal axis
4. Evaluate the plot for evidence of linearity
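The first three steps can be sketched as below. The slides do not say how the quantile values are assigned; this sketch assumes the common convention of giving the i-th ordered value (out of n) the cumulative area i/(n + 1). The function name and the sample data are hypothetical.

```python
from statistics import NormalDist

def normal_quantile_pairs(data):
    """Pair each ordered observation with a standardized normal quantile.

    Assumption: the i-th smallest of n values is assigned cumulative area i / (n + 1).
    """
    ordered = sorted(data)                      # step 1: ordered array
    n = len(ordered)
    std_normal = NormalDist()
    return [(std_normal.inv_cdf(i / (n + 1)), x)  # step 2: quantile for each value
            for i, x in enumerate(ordered, start=1)]

# Hypothetical sample, for illustration only.
sample = [52, 61, 47, 58, 55, 49, 64]
pairs = normal_quantile_pairs(sample)
for z, x in pairs:                              # step 3: points to plot (Z horizontal, X vertical)
    print(f"Z = {z:6.3f}   X = {x}")
```

Plotting the X values against the Z values and judging whether the points fall near a straight line is the fourth step.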

The Normal Probability Plot


(continued)

A normal probability plot for data from a normal distribution will be approximately linear.
[Figure: approximately linear normal probability plot, X (30 to 90) against Z (-2 to 2)]

Normal Probability Plot


(continued)

[Figure: normal probability plots of X (30 to 90) against Z (-2 to 2) for three cases: Left-Skewed, Right-Skewed, Rectangular]

Nonlinear plots indicate a deviation from normality.

Numerical Problems
Example: The following ordered array (from left to right) depicts the amount of money (in dollars) withdrawn from a cash machine by 25 customers at a local bank: 40, 50, 50, 50, 50, 70, 70, 80, 80, 90,100, 100 90,100, 100 90,100, 100, 110, 110, 120, 120, 130, 140, 140, 150, 160, 160, 200
Decide whether or not the data appear to be approximately normally distributed by
a. Evaluating the actual versus theoretical properties.
b. Constructing a normal probability plot.
Ans. Look at the MS-Excel worksheet

Check for Normality


Summary measures (N = 25)

Mean                 106.8
Median               100
Mode                 100
Midrange             120
Midhinge             105
Interquartile range  50
Standard deviation   37.38
Range                160
Skewness             0.3849
Kurtosis             0.2019

[Figure: box-and-whisker plot of VAR001: smallest 40, quartile 1 80, median 100, quartile 3 135, largest 200]
[Figure: histogram of VAR001, frequency against values from 50.0 to 200.0; Std. Dev = 38.16, Mean = 106.8, N = 25]

Check for Normality


Check for normality                                  Results
median = mode = mean                                 median = mode ≈ mean
2/3 of observations in mean $\pm$ 1 S.D.             (69, 144): 18 observations (approx. equal)
4/5 of observations in mean $\pm$ 1.28 S.D.          (59, 155): 19 observations (approx. equal)
19/20 of observations in mean $\pm$ 2 S.D.           (32, 182): 24 observations (approx. equal)
Interquartile range = 1.33 times S.D.                49.7242 against 50 (approx. equal)
Range = 6 times S.D.                                 224.3197 against 160 (not equal)
Skewness = 0                                         .38 (approx. equal)

[Figure: Normal P-P plot of VAR00001, expected cumulative probability against observed cumulative probability, both from 0.00 to 1.00]
Normal Approximation to the Binomial Distribution


The binomial distribution is a discrete distribution, but the normal is continuous.
The normal approximation to the binomial is very convenient because it enables us to solve the problem without extensive tables of the binomial distribution.
To use the normal to approximate the binomial, accuracy is improved if you use a correction for continuity adjustment.
Example:
X is discrete in a binomial distribution, so P(X = 4) can be approximated with a continuous normal distribution by finding P(3.5 < X < 4.5)

Normal Approximation to the Binomial Distribution

(continued)

The closer p is to 0.5, the better the normal approximation to the binomial.
The larger the sample size n, the better the normal approximation to the binomial.
General rule:
The normal distribution can be used to approximate the binomial distribution if $np \ge 5$ and $n(1 - p) \ge 5$

Normal Approximation to the Binomial Distribution


The mean and standard deviation of the binomial distribution are

$$\mu = np$$
$$\sigma = \sqrt{np(1 - p)}$$

Transform binomial to normal using the formula:

$$Z = \frac{X - \mu}{\sigma} = \frac{X - np}{\sqrt{np(1 - p)}}$$

Using the Normal Approximation to the Binomial Distribution


If n = 1000 and p = 0.2, what is $P(X \le 180)$?
Approximate $P(X \le 180)$ using a continuity correction adjustment: $P(X \le 180.5)$
Transform to standardized normal:

$$Z = \frac{X - np}{\sqrt{np(1 - p)}} = \frac{180.5 - (1000)(0.2)}{\sqrt{(1000)(0.2)(1 - 0.2)}} = -1.54$$

So $P(Z \le -1.54) = 0.0618$
[Figure: X = 180.5 corresponds to Z = -1.54; X = 200 corresponds to Z = 0]
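The approximation with the continuity correction can be sketched and compared against the exact binomial sum. Both function names are hypothetical; the exact sum is included only as a sanity check and is not part of the slides. Because the code does not round Z to two decimals, the approximate result differs slightly from the table value 0.0618.

```python
import math
from statistics import NormalDist

def binomial_cdf_normal_approx(x, n, p):
    """Approximate P(X <= x) for a binomial using the normal with continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    z = (x + 0.5 - mu) / sigma      # the continuity correction adds 0.5 to x
    return NormalDist().cdf(z)

def binomial_cdf_exact(x, n, p):
    """Exact P(X <= x), summing the binomial probabilities directly."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

approx = binomial_cdf_normal_approx(180, 1000, 0.2)
exact = binomial_cdf_exact(180, 1000, 0.2)
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```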

Statistics for Business Analysis


Day 5 Session-II CONTINUOUS PROBABILITY DISTRIBUTIONS


The Uniform Distribution


The uniform distribution is a probability distribution that has equal probabilities for all possible outcomes of the random variable.
Also called a rectangular distribution.

The Uniform Distribution


(continued)

The Continuous Uniform Distribution:

$$f(X) = \begin{cases} \dfrac{1}{b - a} & \text{if } a \le X \le b \\ 0 & \text{otherwise} \end{cases}$$

where
f(X) = value of the density function at any X value
a = minimum value of X
b = maximum value of X

Properties of the Uniform Distribution


The mean of a uniform distribution is

$$\mu = \frac{a + b}{2}$$

The standard deviation is

$$\sigma = \sqrt{\frac{(b - a)^2}{12}}$$

Uniform Distribution Example


Example: Uniform probability distribution over the range $2 \le X \le 6$:

$$f(X) = \frac{1}{6 - 2} = 0.25 \quad \text{for } 2 \le X \le 6$$

$$\mu = \frac{a + b}{2} = \frac{2 + 6}{2} = 4$$

$$\sigma = \sqrt{\frac{(b - a)^2}{12}} = \sqrt{\frac{(6 - 2)^2}{12}} = 1.1547$$

Uniform Distribution Example


(continued)

Example: Using the uniform probability distribution to find $P(3 \le X \le 5)$:

$$P(3 \le X \le 5) = (\text{Base})(\text{Height}) = (2)(0.25) = 0.5$$
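The uniform example can be reproduced with a few lines of code. This is a minimal sketch; the function names are hypothetical, and the probability is computed as base times height exactly as in the example.

```python
import math

def uniform_mean(a, b):
    return (a + b) / 2

def uniform_sd(a, b):
    return math.sqrt((b - a) ** 2 / 12)

def uniform_prob(lo, hi, a, b):
    """P(lo <= X <= hi) for X uniform on [a, b]: base times height."""
    base = max(0.0, min(hi, b) - max(lo, a))   # only the part of [lo, hi] inside [a, b] counts
    height = 1.0 / (b - a)
    return base * height

print(uniform_mean(2, 6))               # 4.0
print(round(uniform_sd(2, 6), 4))       # 1.1547
print(uniform_prob(3, 5, 2, 6))         # 0.5
```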


The Exponential Distribution


Often used to model the length of time between two occurrences of an event (the time between arrivals)
Examples:
Time between trucks arriving at an unloading dock
Time between transactions at an ATM machine
Time between phone calls to the main operator

The Exponential Distribution


Defined by a single parameter, its mean $\lambda$ (lambda)
The probability that an arrival time is less than some specified time X is

$$P(\text{arrival time} < X) = 1 - e^{-\lambda X}$$

where
e = mathematical constant approximated by 2.71828
$\lambda$ = the population mean number of arrivals per unit
X = any value of the continuous variable where $0 < X < \infty$

Exponential Distribution Example


Example: Customers arrive at the service counter at the rate of 15 per hour. What is the probability that the arrival time between consecutive customers is less than three minutes?
The mean number of arrivals per hour is 15, so $\lambda = 15$
Three minutes is 0.05 hours

$$P(\text{arrival time} < 0.05) = 1 - e^{-\lambda X} = 1 - e^{-(15)(0.05)} = 0.5276$$

So there is a 52.76% probability that the arrival time between successive customers is less than three minutes.
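The exponential example can be checked in code. This is a short sketch; the function name `exponential_cdf` is hypothetical, and the time is converted to hours so that it uses the same unit as the arrival rate.

```python
import math

def exponential_cdf(x, lam):
    """P(arrival time < x) when arrivals occur at a mean rate of lam per unit of time."""
    return 1.0 - math.exp(-lam * x)

rate_per_hour = 15
three_minutes_in_hours = 3 / 60     # 0.05 hours, the same unit as the rate

p = exponential_cdf(three_minutes_in_hours, rate_per_hour)
print(round(p, 4))  # 0.5276
```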