STATISTICS
MAIN MODULE
SESSION - VII
FLOW S
The automated production line at the Oxford Cereals main plant fills thousands of boxes of cereal during each
shift. As the plant operations manager, you are responsible for monitoring the amount of cereal placed in each box.
To be consistent with package labeling, boxes should contain a mean of 368 grams of cereal. Because of the speed
of the process, the cereal weight varies from box to box, causing some boxes to be underfilled and others to be
overfilled. If the automated process fails to work as intended, the mean weight in the boxes could vary too much
from the label weight of 368 grams to be acceptable.
Because weighing every single box is too time-consuming, costly, and inefficient, you must take a sample of boxes.
For each sample you select, you plan to weigh the individual boxes and calculate a sample mean. You need to
determine the probability that such a sample mean could have been randomly selected from a population whose
mean is 368 grams. Based on your analysis, you will have to decide whether to maintain, alter, or shut down the
cereal-filling process.
The main concern when making a statistical inference is reaching conclusions about a population, not
about a sample.
For example, a political pollster is interested in the sample results only as a way of estimating the actual
proportion of the votes that each candidate will receive from the population of voters.
Likewise, as plant operations manager for Oxford Cereals, you are only interested in using the mean weight
calculated from a sample of cereal boxes to estimate the mean weight of a population of boxes.
The sample mean is unbiased because the mean of all the possible sample means (of a given sample size, n) is
equal to the population mean, μ.
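Example: a population of N = 4 values (A, B, C, D) with a uniform distribution: 18, 20, 22, 24.

$\mu = \frac{\sum X_i}{N} = 21$

$\sigma = \sqrt{\frac{\sum (X_i - \mu)^2}{N}} = 2.236$

Across the 16 possible samples (consistent with samples of size n = 2 drawn with replacement), the sample means range from 18 to 24, with $\mu_{\bar{X}} = 21$ and $\sigma_{\bar{X}} = 1.58$.

[Figure: population distribution, P(X) versus X for the values 18, 20, 22, 24 ($\mu = 21$, $\sigma = 2.236$), beside the sampling distribution of the mean, P($\bar{X}$) versus $\bar{X}$ for the values 18 through 24 ($\mu_{\bar{X}} = 21$, $\sigma_{\bar{X}} = 1.58$)]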
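The figures for the four-value population (18, 20, 22, 24) can be checked by listing every possible sample. The sketch below assumes samples of size n = 2 drawn with replacement, which is what the 16 possible samples and the value 1.58 suggest; the variable names are illustrative only.

```python
from itertools import product
from statistics import mean, pstdev

population = [18, 20, 22, 24]  # values A, B, C, D

mu = mean(population)       # population mean
sigma = pstdev(population)  # population standard deviation (divides by N)

# Assumption: samples of size 2 drawn with replacement, so 4**2 = 16 samples.
n = 2
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu_xbar = mean(sample_means)       # mean of all possible sample means
sigma_xbar = pstdev(sample_means)  # standard error of the mean

print(len(sample_means), mu, round(sigma, 3), mu_xbar, round(sigma_xbar, 2))
```

The mean of the sample means matches the population mean, which is the sense in which the sample mean is unbiased, and the spread of the sample means is smaller than the spread of the population.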
09/12/22 Business Statistics: MAIN MODULE SESSION VII
Standard Error of the Mean
The value of the standard deviation of all possible sample means, called the standard error of the mean,
expresses how the sample means vary from sample to sample.
As the sample size increases, the standard error of the mean decreases by a factor equal to the square root of
the sample size.
The equation $\sigma_{\bar{X}} = \sigma / \sqrt{n}$ defines the standard error of the mean when sampling with replacement or sampling without replacement
from large or infinite populations.
Returning to the cereal-filling process described earlier, if you randomly select a sample of 25 boxes without
replacement from the thousands of boxes filled during a shift, the sample contains a very small portion of the
population. Given that the standard deviation of the cereal-filling process is 15 grams, compute the standard
error of the mean.
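As a minimal sketch of this computation, the standard error is the process standard deviation divided by the square root of the sample size. The function name `standard_error` is illustrative, not part of any library.

```python
import math

def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of n."""
    return sigma / math.sqrt(n)

sigma = 15  # standard deviation of the cereal-filling process, in grams
n = 25      # number of boxes in the sample

print(standard_error(sigma, n))    # 3.0
print(standard_error(sigma, 100))  # 1.5: four times the sample size halves the standard error
```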
Sampling from Normally Distributed Populations
If you are sampling from a population that is normally distributed with mean μ and standard deviation σ,
then regardless of the sample size, n, the sampling distribution of the mean is normally distributed,
with mean $\mu_{\bar{X}} = \mu$ and standard error of the mean $\sigma_{\bar{X}} = \sigma / \sqrt{n}$.
In addition, as the sample size increases, the sampling distribution of the mean still follows a normal distribution,
with $\mu_{\bar{X}} = \mu$, but the standard error of the mean decreases so that a larger proportion of sample means are closer to
the population mean.
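For a sample of 25 boxes, $Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{365 - 368}{15 / \sqrt{25}} = -1.00$.
The area corresponding to Z = -1.00 in the standardized normal table is 0.1587. Therefore, 15.87% of all the possible samples of 25 boxes have a
sample mean below 365 grams.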
The preceding statement is not the same as saying that a certain percentage of individual boxes will contain less
than 365 grams of cereal.
You compute that percentage as follows: $Z = \frac{X - \mu}{\sigma} = \frac{365 - 368}{15} = -0.20$.
The area corresponding to Z = -0.20 in the standardized normal table is 0.4207. Therefore, 42.07% of the individual boxes are expected
to contain less than 365 grams.
This result is explained by the fact that each sample consists of 25 different values, some small and some large.
The averaging process dilutes the importance of any individual value, particularly when the sample size is large. Therefore, the
chance that the sample mean of 25 boxes is very different from the population mean is less than the chance that a single box is
very different from the population mean.
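The two percentages can be reproduced without a printed table. This is a sketch using Python's standard `statistics.NormalDist`, assuming the fill weights are normally distributed with the mean of 368 grams and standard deviation of 15 grams used in the example.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 25

# Sampling distribution of the mean for samples of 25 boxes.
sampling_dist = NormalDist(mu, sigma / sqrt(n))
# Distribution of the weight of one individual box.
individual_dist = NormalDist(mu, sigma)

p_sample_mean = sampling_dist.cdf(365)   # P(sample mean < 365)
p_single_box = individual_dist.cdf(365)  # P(single box < 365)

print(round(p_sample_mean, 4), round(p_single_box, 4))  # 0.1587 0.4207
```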
[Figure: normal population distribution with mean $\mu$, above a normal sampling distribution of $\bar{X}$ with the same mean, $\mu_{\bar{X}} = \mu$ (i.e., $\bar{X}$ is unbiased)]
Sampling Distribution Properties
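As n increases, $\sigma_{\bar{X}}$ decreases.
[Figure: sampling distributions of $\bar{X}$ centered at $\mu_{\bar{X}}$, one for a larger sample size and one for a smaller sample size]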
Determining an Interval Including a Fixed Proportion of the Sample Means
Find a symmetrically distributed interval around µ that will include 95% of the
sample means when µ = 368, σ = 15, and n = 25.
Since the interval contains 95% of the sample means, 5% of the sample means will be
outside the interval.
Since the interval is symmetric, 2.5% will be above the upper limit and 2.5% will be
below the lower limit.
From the standardized normal table, the Z score with 2.5% (0.0250) below it is -1.96
and the Z score with 2.5% (0.0250) above it is 1.96.
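The limits follow from $\bar{X} = \mu \pm Z \cdot \sigma / \sqrt{n}$. A sketch of the arithmetic, using `statistics.NormalDist` in place of the printed table (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 368, 15, 25
coverage = 0.95

standard_error = sigma / sqrt(n)  # 15 / 5 = 3 grams

# Z score with 2.5% of the area above it (about 1.96).
z = NormalDist().inv_cdf(1 - (1 - coverage) / 2)

lower = mu - z * standard_error
upper = mu + z * standard_error

print(round(z, 2), round(lower, 2), round(upper, 2))  # 1.96 362.12 373.88
```

Under these values, 95% of all sample means from samples of 25 boxes fall between about 362.12 and 373.88 grams.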
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

As the sample size gets large enough, the sampling distribution of the sample mean becomes almost normal regardless of the shape of the population.
Central Limit Theorem
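Sampling distribution properties:
Central tendency: $\mu_{\bar{X}} = \mu$
Variation: $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$
[Figure: population distribution with mean $\mu$, above sampling distributions of $\bar{X}$ for a smaller and a larger sample size; the sampling distribution becomes normal as n increases]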
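A small simulation can illustrate these properties. The sketch below assumes a right-skewed exponential population with mean 1 and standard deviation 1, chosen only for illustration; the seed, the number of repetitions, and the helper names are likewise arbitrary.

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(42)  # fixed seed so the run is repeatable

# Assumption: exponential population with mean 1 and standard deviation 1.
pop_mu, pop_sigma = 1.0, 1.0

def sample_means(n, repetitions=5000):
    """Draw `repetitions` samples of size n and return their means."""
    return [sum(rng.expovariate(1.0) for _ in range(n)) / n
            for _ in range(repetitions)]

def skewness(values):
    """Population skewness; it is 0 for a symmetric distribution."""
    m, s = mean(values), pstdev(values)
    return mean(((v - m) / s) ** 3 for v in values)

results = {}
for n in (2, 10, 30):
    means = sample_means(n)
    results[n] = (mean(means), pstdev(means), skewness(means))
    print(n,
          round(results[n][0], 3),        # close to pop_mu
          round(results[n][1], 3),        # close to pop_sigma / sqrt(n)
          round(pop_sigma / sqrt(n), 3),  # theoretical standard error
          round(results[n][2], 3))        # shrinks toward 0 as n grows
```

In a run like this, the mean of the sample means stays near the population mean, their standard deviation tracks $\sigma / \sqrt{n}$, and the skewness falls as n increases, which is the behavior the central limit theorem describes.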
How Large a Value of n Is Large Enough?
You are interested in the proportion of items belonging to one of the categories—for example, the proportion of customers
that prefer your brand.
The population proportion, represented by π, is the proportion of items in the entire population with the characteristic of
interest.
The sample proportion, represented by p, is the proportion of items in the sample with the characteristic of interest.
The sample proportion, a statistic, is used to estimate the population proportion, a parameter.
To calculate the sample proportion, you assign one of two possible values, 1 or 0, to represent the presence or absence of
the characteristic.
You then sum all the 1 and 0 values and divide by n, the sample size.
For example, if, in a sample of five customers, three preferred your brand and two did not, you have three 1s and two 0s.
Summing the three 1s and two 0s and dividing by the sample size of 5 results in a sample proportion of 0.60.
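The five-customer example can be written out directly. This is only a sketch of the 1/0 coding described above; the list name `responses` is illustrative.

```python
# 1 = customer prefers your brand, 0 = customer does not.
responses = [1, 1, 1, 0, 0]

n = len(responses)      # sample size
p = sum(responses) / n  # sample proportion

print(p)  # 0.6
```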
• 0 ≤ p ≤ 1.
• p is approximately distributed as a normal distribution when n is large.
(assuming sampling with replacement from a finite population or without replacement from an
infinite population.)
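The sampling distribution of p is approximated by a normal distribution if

$n\pi \geq 5$ and $n(1 - \pi) \geq 5$,

where $\mu_p = \pi$ and $\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}}$.

[Figure: sampling distribution, P(p) versus p on the scale 0 to 1]

Example with $\pi = 0.4$ and n = 200:

Find $\sigma_p$: $\sigma_p = \sqrt{\frac{\pi(1 - \pi)}{n}} = \sqrt{\frac{0.4(1 - 0.4)}{200}} = 0.03464$

Standardize: $P(0.40 \leq p \leq 0.45) = P(0 \leq Z \leq 1.44) = 0.4251$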
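The proportion example with π = 0.4 and n = 200 can be reproduced as follows. This is a sketch using `statistics.NormalDist`; note that the unrounded Z score gives a probability slightly different from the table-based 0.4251, which comes from rounding Z to 1.44 first.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.40, 200  # population proportion and sample size from the example

# Check the conditions for the normal approximation.
assert n * pi >= 5 and n * (1 - pi) >= 5

sigma_p = sqrt(pi * (1 - pi) / n)  # standard error of the proportion

z_low = (0.40 - pi) / sigma_p
z_high = (0.45 - pi) / sigma_p

std_normal = NormalDist()

# Probability using the unrounded Z scores.
prob = std_normal.cdf(z_high) - std_normal.cdf(z_low)

# Probability using Z rounded to two decimals, as with a printed table.
prob_table = std_normal.cdf(round(z_high, 2)) - std_normal.cdf(round(z_low, 2))

print(round(sigma_p, 5), round(z_high, 2), round(prob_table, 4))  # 0.03464 1.44 0.4251
print(prob)  # close to, but not exactly, the table-based value
```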
a. What is the probability that in the sample fewer than 78% say that being able to work flexibly and still be
on track for a promotion is important?
b. What is the probability that in the sample between 70% and 78% say that being able to work flexibly and still
be on track for a promotion is important?
c. What is the probability that in the sample more than 76% say that being able to work flexibly and still be
on track for a promotion is important?
d. If a sample of 400 is taken, how does this change your answers to (a) through (c)?