
BUSINESS STATISTICS
MAIN MODULE

SESSION - VII

INSTRUCTOR: DR. ANKIT SHARMA

QUANTITATIVE METHODS & OPERATIONS MANAGEMENT

09/12/22 Business Statistics: MAIN MODULE SESSION VII


SESSION FLOW

SAMPLING DISTRIBUTION
SAMPLING DISTRIBUTION OF MEAN
CENTRAL LIMIT THEOREM
POPULATION PROPORTION
SUMMARY & WAY FORWARD



Sampling Distribution

The automated production line at the Oxford Cereals main plant fills thousands of boxes of cereal during each
shift. As the plant operations manager, you are responsible for monitoring the amount of cereal placed in each box.
To be consistent with package labeling, boxes should contain a mean of 368 grams of cereal. Because of the speed
of the process, the cereal weight varies from box to box, causing some boxes to be underfilled and others to be
overfilled. If the automated process fails to work as intended, the mean weight in the boxes could vary too much
from the label weight of 368 grams to be acceptable.

Because weighing every single box is too time-consuming, costly, and inefficient, you must take a sample of boxes.
For each sample you select, you plan to weigh the individual boxes and calculate a sample mean. You need to
determine the probability that such a sample mean could have been randomly selected from a population whose
mean is 368 grams. Based on your analysis, you will have to decide whether to maintain, alter, or shut down the
cereal-filling process.



Sampling Distribution
In many applications, you want to make inferences that are based on statistics calculated from samples to
estimate the values of population parameters.

The main concern when making a statistical inference is reaching conclusions about a population, not
about a sample.
For example, a political pollster is interested in the sample results only as a way of estimating the actual
proportion of the votes that each candidate will receive from the population of voters.

Likewise, as plant operations manager for Oxford Cereals, you are only interested in using the mean weight
calculated from a sample of cereal boxes to estimate the mean weight of a population of boxes.



Sampling Distribution of the Mean
The sampling distribution of the mean is the distribution of all possible sample means if you select all
possible samples of a given size.

The sample mean is unbiased because the mean of all the possible sample means (of a given sample size, n) is
equal to the population mean, μ.

• Assume there is a population …
• Population size N = 4.
• Variable of interest is X, the age of individuals.
• Values of X: 18, 20, 22, 24 (years).



Developing a Sampling Distribution
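Summary Measures for the Population Distribution:

$$\mu = \frac{\sum X_i}{N} = \frac{18 + 20 + 22 + 24}{4} = 21$$

$$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$$

[Figure: Uniform Distribution. P(x) versus x for the four individuals A, B, C, D with x = 18, 20, 22, 24.]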
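
As a quick check of the summary measures above, the following minimal Python sketch recomputes the population mean and the population standard deviation (dividing by N, not N - 1). The variable names are illustrative and do not come from the slides.

```python
import math

ages = [18, 20, 22, 24]   # values of X for the N = 4 individuals
N = len(ages)

# Population mean: sum of all values divided by the population size.
mu = sum(ages) / N

# Population standard deviation: root of the mean squared deviation (divide by N).
sigma = math.sqrt(sum((x - mu) ** 2 for x in ages) / N)

print(mu, round(sigma, 3))   # 21.0 2.236
```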



Developing a Sampling Distribution
Now consider all possible samples of size n = 2 (sampling with replacement).

16 possible samples:

1st Obs    2nd Observation
           18       20       22       24
18         18,18    18,20    18,22    18,24
20         20,18    20,20    20,22    20,24
22         22,18    22,20    22,22    22,24
24         24,18    24,20    24,22    24,24

16 sample means:

1st Obs    2nd Observation
           18    20    22    24
18         18    19    20    21
20         19    20    21    22
22         20    21    22    23
24         21    22    23    24
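
The enumeration of all samples of size 2 can be reproduced with a short sketch. It assumes ordered pairs drawn with replacement, as on the slide; `itertools.product` generates them, and the names `samples` and `sample_means` are illustrative.

```python
from itertools import product

ages = [18, 20, 22, 24]

# All ordered samples of size 2, sampling with replacement: 4 * 4 = 16 pairs.
samples = list(product(ages, repeat=2))

# One sample mean per sample, in the same order as `samples`.
sample_means = [(first + second) / 2 for first, second in samples]

print(len(samples))   # 16

# Print the means as a grid: rows are the 1st observation, columns the 2nd.
for first in ages:
    print(first, [(first + second) / 2 for second in ages])
```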



Developing a Sampling Distribution
Sampling Distribution of All Sample Means.

16 sample means:

1st Obs    2nd Observation
           18    20    22    24
18         18    19    20    21
20         19    20    21    22
22         20    21    22    23
24         21    22    23    24

[Figure: Sample Means Distribution. $P(\bar{X})$ versus $\bar{X}$ = 18, 19, ..., 24 (no longer uniform).]
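
To see why the distribution of sample means is no longer uniform, one can tabulate how often each mean occurs among the 16 samples. This is a minimal sketch with illustrative names; it uses exact fractions so the probabilities add up to exactly 1.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

ages = [18, 20, 22, 24]
means = [(a + b) / 2 for a, b in product(ages, repeat=2)]

# Relative frequency of each distinct sample mean among the 16 samples.
counts = Counter(means)
dist = {m: Fraction(c, len(means)) for m, c in sorted(counts.items())}

for m, p in dist.items():
    print(m, p, float(p))
```

The middle value 21 occurs in 4 of the 16 samples, while the extremes 18 and 24 occur once each, which gives the peaked shape.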



Developing a Sampling Distribution
Mean and standard deviation of the sample means:

$$\mu_{\bar{X}} = \frac{18 + 19 + 19 + \cdots + 24}{16} = 21$$

$$\sigma_{\bar{X}} = \sqrt{\frac{(18 - 21)^2 + (19 - 21)^2 + \cdots + (24 - 21)^2}{16}} = 1.58$$

Note: Here we divide by 16 because there are 16 different samples of size 2.
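
The two summary measures of the sample means can be verified directly from the 16 enumerated samples. The sketch below (names are illustrative) also compares the result with $\sigma / \sqrt{n}$, the standard error formula for sampling with replacement.

```python
import math
from itertools import product

ages = [18, 20, 22, 24]
n = 2

# Population parameters (divide by N).
mu = sum(ages) / len(ages)
sigma = math.sqrt(sum((x - mu) ** 2 for x in ages) / len(ages))

# Every possible sample mean for samples of size n, drawn with replacement.
means = [sum(s) / n for s in product(ages, repeat=n)]

# Mean and standard deviation of the 16 sample means (divide by 16).
mu_xbar = sum(means) / len(means)
sigma_xbar = math.sqrt(sum((m - mu_xbar) ** 2 for m in means) / len(means))

print(mu_xbar, round(sigma_xbar, 2), round(sigma / math.sqrt(n), 2))   # 21.0 1.58 1.58
```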



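Comparing the Population Distribution to the Sample Means Distribution

Population (N = 4): $\mu = 21$, $\sigma = 2.236$
Sample Means Distribution (n = 2): $\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$

[Figure: two panels. Left: population distribution, P(X) versus X = 18, 20, 22, 24 (A, B, C, D). Right: sample means distribution, $P(\bar{X})$ versus $\bar{X}$ = 18, 19, ..., 24.]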
Standard Error of the Mean
The value of the standard deviation of all possible sample means, called the standard error of the mean,
expresses how the sample means vary from sample to sample.

As the sample size increases, the standard error of the mean decreases by a factor equal to the square root of
the sample size.

The equation $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ defines the standard error of the mean when sampling with replacement or sampling without replacement
from large or infinite populations.
Returning to the cereal-filling process described earlier, if you randomly select a sample of 25 boxes without
replacement from the thousands of boxes filled during a shift, the sample contains a very small portion of the
population. Given that the standard deviation of the cereal-filling process is 15 grams, compute the standard
error of the mean.
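
For the cereal example, the standard error is $15 / \sqrt{25} = 3$ grams. The sketch below wraps the calculation in a small helper; the function name `standard_error` is illustrative, and the second call (n = 100) is an added comparison, not a case from the slides.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean, sigma / sqrt(n), for large or infinite populations."""
    return sigma / math.sqrt(n)

se_25 = standard_error(15, 25)     # the cereal example: 25 boxes, sigma = 15 grams
se_100 = standard_error(15, 100)   # hypothetical larger sample for comparison

print(se_25, se_100)   # 3.0 1.5
```

Quadrupling the sample size from 25 to 100 halves the standard error, since the sample size enters through its square root.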
Sampling from Normally Distributed Populations
If you are sampling from a population that is normally distributed with mean μ and standard deviation σ,
then regardless of the sample size, n, the sampling distribution of the mean is normally distributed,
with mean $\mu_{\bar{X}} = \mu$ and standard error of the mean $\sigma_{\bar{X}} = \sigma / \sqrt{n}$.

In addition, as the sample size increases, the sampling distribution of the mean still follows a normal distribution,
with $\mu_{\bar{X}} = \mu$, but the standard error of the mean decreases so that a larger proportion of sample means are closer to
the population mean.



Sampling from Normally Distributed Populations
To find the area below 365 grams, standardize the sample mean:

$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{365 - 368}{15 / \sqrt{25}} = \frac{-3}{3} = -1.00$$

The area corresponding to Z = -1.00 in the standardized normal table is 0.1587. Therefore, 15.87% of all the possible samples of 25 boxes have a
sample mean below 365 grams.
The preceding statement is not the same as saying that a certain percentage of individual boxes will contain less
than 365 grams of cereal.
You compute that percentage as follows:

$$Z = \frac{X - \mu}{\sigma} = \frac{365 - 368}{15} = \frac{-3}{15} = -0.20$$

The area corresponding to Z = -0.20 in the standardized normal table is 0.4207. Therefore, 42.07% of the individual boxes are expected
to contain less than 365 grams.
This result is explained by the fact that each sample consists of 25 different values, some small and some large.
The averaging process dilutes the importance of any individual value, particularly when the sample size is large. Therefore, the
chance that the sample mean of 25 boxes is very different from the population mean is less than the chance that a single box is
very different from the population mean.
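
The contrast between sample means and individual boxes can be checked without a table. This minimal sketch assumes a normally distributed filling process with μ = 368 and σ = 15, as in the example, and uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 25

# Distribution of the sample mean of 25 boxes: standard error is 15 / sqrt(25) = 3.
sampling_dist = NormalDist(mu, sigma / sqrt(n))

# Distribution of the weight of a single box.
population = NormalDist(mu, sigma)

p_mean_below = sampling_dist.cdf(365)   # P(sample mean < 365), Z = -1.00
p_box_below = population.cdf(365)       # P(single box < 365),  Z = -0.20

print(round(p_mean_below, 4), round(p_box_below, 4))   # 0.1587 0.4207
```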



Sampling Distribution Properties

$\mu_{\bar{X}} = \mu$ (i.e. $\bar{X}$ is unbiased)

[Figure: a Normal Population Distribution centered at μ, above a Normal Sampling Distribution of $\bar{X}$ centered at $\mu_{\bar{X}}$ (has the same mean).]
Sampling Distribution Properties

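As n increases, $\sigma_{\bar{X}}$ decreases.

[Figure: two sampling distributions of $\bar{X}$ centered at μ, one for a larger sample size and one for a smaller sample size.]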
Determining An Interval Including A Fixed Proportion of the Sample Means

Find a symmetrically distributed interval around µ that will include 95% of the
sample means when µ = 368, σ = 15, and n = 25.
• Since the interval contains 95% of the sample means, 5% of the sample means will be
  outside the interval.
• Since the interval is symmetric, 2.5% will be above the upper limit and 2.5% will be
  below the lower limit.
• From the standardized normal table, the Z score with 2.5% (0.0250) below it is -1.96
  and the Z score with 2.5% (0.0250) above it is 1.96.



Determining An Interval Including A Fixed Proportion of the Sample Means

• Calculating the lower limit of the interval:

$$\bar{X}_L = \mu + Z \frac{\sigma}{\sqrt{n}} = 368 + (-1.96) \frac{15}{\sqrt{25}} = 362.12$$

• Calculating the upper limit of the interval:

$$\bar{X}_U = \mu + Z \frac{\sigma}{\sqrt{n}} = 368 + (1.96) \frac{15}{\sqrt{25}} = 373.88$$

• Based on samples of size 25, the sample means in 95% of all samples are
  between 362.12 and 373.88.
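
The interval limits can be reproduced by looking up the Z score with an inverse normal CDF instead of a printed table. This is a sketch using `statistics.NormalDist.inv_cdf` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 25
coverage = 0.95

# Z score leaving (1 - coverage) / 2 = 2.5% in the upper tail (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - coverage) / 2)

se = sigma / sqrt(n)      # standard error of the mean: 3 grams
lower = mu - z * se       # lower limit of the symmetric interval
upper = mu + z * se       # upper limit of the symmetric interval

print(round(z, 2), round(lower, 2), round(upper, 2))   # 1.96 362.12 373.88
```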



Sampling Distribution: If the Population is not Normal

• We can apply the Central Limit Theorem:
  • Even if the population is not normal,
  • … sample means from the population will be approximately normal as long as the
    sample size is large enough.

• Properties of the sampling distribution:

$$\mu_{\bar{X}} = \mu \quad \text{and} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$



Central Limit Theorem

As the sample size gets large enough (n↑), the sampling distribution of the sample mean becomes almost normal
regardless of the shape of the population.

[Figure: sampling distribution of $\bar{X}$.]
Central Limit Theorem

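Sampling distribution properties:

• Central Tendency: $\mu_{\bar{X}} = \mu$
• Variation: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

[Figure: a Population Distribution centered at μ, above the Sampling Distribution of $\bar{X}$ centered at $\mu_{\bar{X}}$ for a smaller and a larger sample size (becomes normal as n increases).]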
How Large Is Large Enough?

• For most distributions, n > 30 will give a sampling distribution that is
  nearly normal.
• For fairly symmetric distributions, n > 15 is usually enough.
• For a normal population distribution, the sampling distribution of the
  mean is always normally distributed.
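
A small simulation can illustrate these rules of thumb. The sketch below is an illustration under stated assumptions, not part of the slides: it draws samples from a strongly right-skewed exponential population (mean 1, standard deviation 1), uses a fixed seed, and measures the skewness of the sample means, which should shrink toward 0 as n grows. All function and variable names are hypothetical.

```python
import random
from statistics import fmean, pstdev

def simulate_sample_means(n, reps, rng):
    """Means of `reps` samples of size n from an exponential population (mean 1, std 1)."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]

def skewness(values):
    """Average cubed z-score; close to 0 for a symmetric, normal-like shape."""
    m, s = fmean(values), pstdev(values)
    return fmean([((v - m) / s) ** 3 for v in values])

rng = random.Random(7)   # fixed seed so the run is repeatable
results = {}
for n in (2, 15, 30):
    means = simulate_sample_means(n, reps=5000, rng=rng)
    # Store (mean of sample means, std of sample means, skewness of sample means).
    results[n] = (fmean(means), pstdev(means), skewness(means))
    print(n, [round(v, 3) for v in results[n]])
```

In each row the first value should sit near the population mean of 1, the second near $1 / \sqrt{n}$, and the third should fall as n increases.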



Concept Check
• Suppose a population has mean μ = 8 and standard deviation σ = 3. Suppose a
  random sample of size n = 36 is selected.
• What is the probability that the sample mean is between 7.8 and 8.2?
  • Even if the population is not normally distributed, the
    central limit theorem can be used (n > 30).
  • … so the sampling distribution of $\bar{X}$ is
    approximately normal.
  • … with mean $\mu_{\bar{X}} = 8$.
  • … and standard deviation $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{3}{\sqrt{36}} = 0.5$

$$P(7.8 < \bar{X} < 8.2) = P\left( \frac{7.8 - 8}{3 / \sqrt{36}} < \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} < \frac{8.2 - 8}{3 / \sqrt{36}} \right) = P(-0.4 < Z < 0.4) = 0.6554 - 0.3446 = 0.3108$$

[Figure: three panels. Population Distribution (μ = 8, shape unknown); after sampling, the Sampling Distribution of $\bar{X}$ with 7.8 and 8.2 marked; after standardizing, the Standard Normal Distribution ($\mu_Z = 0$) with -0.4 and 0.4 marked.]
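
The concept check can be confirmed numerically. This sketch relies on the central limit theorem approximation stated above (n = 36 > 30) and uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8, 3, 36

# Approximate sampling distribution of the mean: normal with standard error 3 / 6 = 0.5.
xbar_dist = NormalDist(mu, sigma / sqrt(n))

# P(7.8 < sample mean < 8.2) as a difference of two cumulative probabilities.
prob = xbar_dist.cdf(8.2) - xbar_dist.cdf(7.8)

print(round(prob, 4))   # 0.3108
```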



Population Proportions
Consider a categorical variable that has only two categories, such as the customer prefers your brand or the customer
prefers the competitor’s brand.

You are interested in the proportion of items belonging to one of the categories—for example, the proportion of customers
that prefer your brand.

The population proportion, represented by π, is the proportion of items in the entire population with the characteristic of
interest.

The sample proportion, represented by p, is the proportion of items in the sample with the characteristic of interest.

The sample proportion, a statistic, is used to estimate the population proportion, a parameter.

To calculate the sample proportion, you assign one of two possible values, 1 or 0, to represent the presence or absence of
the characteristic.

You then sum all the 1 and 0 values and divide by n, the sample size.

For example, if, in a sample of five customers, three preferred your brand and two did not, you have three 1s and two 0s.
Summing the three 1s and two 0s and dividing by the sample size of 5 results in a sample proportion of 0.60.
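
The 1/0 coding translates directly into code. A minimal sketch of the five-customer example, with an illustrative list name:

```python
# 1 = customer prefers your brand, 0 = customer does not.
prefers_brand = [1, 1, 1, 0, 0]

# Sample proportion: sum of the 1s and 0s divided by the sample size n.
n = len(prefers_brand)
p = sum(prefers_brand) / n

print(p)   # 0.6
```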



Population Proportions

π = the proportion of the population having some characteristic.

• Sample proportion (p) provides an estimate of π:

$$p = \frac{X}{n} = \frac{\text{number of items in the sample having the characteristic of interest}}{\text{sample size}}$$

• 0 ≤ p ≤ 1.
• p is approximately distributed as a normal distribution when n is large
  (assuming sampling with replacement from a finite population or without replacement from an
  infinite population).



Sampling Distribution of p

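Approximated by a normal distribution if:

$$n\pi \geq 5 \quad \text{and} \quad n(1 - \pi) \geq 5$$

where

$$\mu_p = \pi \quad \text{and} \quad \sigma_p = \sqrt{\frac{\pi (1 - \pi)}{n}}$$

Z-Value for Proportions:

$$Z = \frac{p - \pi}{\sigma_p} = \frac{p - \pi}{\sqrt{\dfrac{\pi (1 - \pi)}{n}}}$$

[Figure: Sampling Distribution. P(p) versus p from 0 to 1.]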
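
The normal-approximation conditions and the Z value for a proportion can be written as three small helpers. This is a sketch only: the function names are hypothetical, and the second example (n = 20, π = 0.1) is an invented case that fails the condition nπ ≥ 5.

```python
from math import sqrt

def normal_approx_ok(n: int, pi: float) -> bool:
    """True when both n*pi >= 5 and n*(1 - pi) >= 5 hold."""
    return n * pi >= 5 and n * (1 - pi) >= 5

def proportion_se(n: int, pi: float) -> float:
    """Standard error of the sample proportion, sqrt(pi * (1 - pi) / n)."""
    return sqrt(pi * (1 - pi) / n)

def z_for_proportion(p: float, pi: float, n: int) -> float:
    """Z value of an observed sample proportion p."""
    return (p - pi) / proportion_se(n, pi)

print(normal_approx_ok(200, 0.4))   # True:  80 and 120 are both at least 5
print(normal_approx_ok(20, 0.1))    # False: n * pi is only 2
print(round(proportion_se(200, 0.4), 5), round(z_for_proportion(0.45, 0.4, 200), 2))
```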
Concept Check

• If the true proportion of voters who support Proposition A is π = 0.4, what
  is the probability that a sample of size 200 yields a sample proportion
  between 0.40 and 0.45?
• That is, if π = 0.4 and n = 200, what is P(0.40 ≤ p ≤ 0.45)?

Find $\sigma_p$:

$$\sigma_p = \sqrt{\frac{\pi (1 - \pi)}{n}} = \sqrt{\frac{0.4 (1 - 0.4)}{200}} = 0.03464$$

Convert to standardized normal:

$$P(0.40 \leq p \leq 0.45) = P\left( \frac{0.40 - 0.40}{0.03464} \leq Z \leq \frac{0.45 - 0.40}{0.03464} \right) = P(0 \leq Z \leq 1.44)$$
Utilize the cumulative normal table:
P(0 ≤ Z ≤ 1.44) = 0.9251 – 0.5000 = 0.4251

[Figure: the Sampling Distribution of p with area 0.4251 between 0.40 and 0.45, standardized to the Standardized Normal Distribution with the same area between Z = 0 and Z = 1.44.]
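
The same probability can be computed without rounding Z to two decimals. This sketch uses `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.4, 200

# Normal approximation to the sampling distribution of p.
se = sqrt(pi * (1 - pi) / n)          # about 0.03464
p_dist = NormalDist(pi, se)

# Unrounded probability that p falls between 0.40 and 0.45.
prob = p_dist.cdf(0.45) - p_dist.cdf(0.40)

# Table-style answer: round Z to 1.44 first, as on the slide.
prob_table = NormalDist().cdf(1.44) - 0.5

print(round(prob, 4), round(prob_table, 4))
```

The unrounded result is slightly larger than the table value of 0.4251 because the exact Z score is a little above 1.44.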



Concept Check
What do workers around the world want in a job? An EY global study of full-time workers on work-life
challenges found that one of the most important factors when seeking a job is flexibility, with 74% of workers
saying that being able to work flexibly and still be on track for a promotion is important. (Data extracted from
“Study highlights: people want flexibility,” bit.ly/1I7hOvW.) Suppose you select a sample of 100 full-time
workers.

a. What is the probability that in the sample fewer than 78% say that being able to work flexibly and still be
on track for a promotion is important?

b. What is the probability that in the sample between 70% and 78% say that being able to work flexibly and still
be on track for a promotion is important?

c. What is the probability that in the sample more than 76% say that being able to work flexibly and still be
on track for a promotion is important?

d. If a sample of 400 is taken, how does this change your answers to (a) through (c)?

For (d), with n = 400, the answer to (c) becomes:

P(p > 0.76) = P(Z > 0.9119) = 0.1809
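
Parts (a) through (d) follow the same pattern as the earlier proportion example, so they can be worked in one short sketch. It assumes π = 0.74 from the study and the normal approximation to the sampling distribution of p; the function name `flexibility_probs` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def flexibility_probs(n: int, pi: float = 0.74) -> dict:
    """Normal-approximation answers to parts (a), (b), and (c) for a sample of size n."""
    p_dist = NormalDist(pi, sqrt(pi * (1 - pi) / n))
    return {
        "a": p_dist.cdf(0.78),                      # P(p < 0.78)
        "b": p_dist.cdf(0.78) - p_dist.cdf(0.70),   # P(0.70 < p < 0.78)
        "c": 1 - p_dist.cdf(0.76),                  # P(p > 0.76)
    }

# Part (d): repeat the calculation with n = 400 and compare with n = 100.
for n in (100, 400):
    answers = flexibility_probs(n)
    print(n, {part: round(value, 4) for part, value in answers.items()})
```

With n = 400 the standard error is halved, so the probabilities for the intervals around 0.74 in (a) and (b) rise while the tail probability in (c) falls to the 0.1809 shown above.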


