Professional Documents
Culture Documents
(a) If you guarantee the sale of at least 600,000 copies next week, what chance
will you face of failing to meet the guarantee?
(b) How many copies of your next issue must you print, in order to have only
one chance in 100 of being caught short?
(c) What mean weekly circulation figure would you be 95% safe in
guaranteeing over the next (52-week) year?
(d) What total circulation figure would you be 95% safe in
guaranteeing over the next (52-week) year?
(e) The estimate of mean weekly circulation is actually based on a
sample of 156 weeks (3 years). Give a 95%-confidence interval
for the true mean weekly circulation.
Advertising Sales
TV Ratings
The A.C. Nielsen Co. wishes to estimate, for each of several cities, the
amount of time a typical adult living in the city spends watching television
each week. In San Diego (which has an adult population of 400,000), they
sample 100 people. The sample mean is 25 hours, with a sample standard
deviation of 10 hours.
Miscellany
Explain, briefly, the difference between sampling bias and sampling error.
A 95%-confidence interval for an estimate indicates how much you can trust
the (fill in the blank) __________________ .
A firm has conducted a study, and has found that a 95%-confidence interval
for the mean time required to process new insurance policies is (10 days,
12 days). This means that
(a) only 5 percent of all policies take less than 10 or more than 12 days to process.
(b) only 5 percent of all policies take between 10 and 12 days to process.
(c) about 95 out of every 100 intervals similarly constructed will contain the true
mean processing time.
(d) the probability is 0.95 that the true mean lies between 10 and 12 days.
Excel Functions
1000 mean
200 standard deviation
900 target
30.8538% #VALUE!
Pr(normally-distributed random variable with given mean and
standard deviation is less than or equal to target value)
1000 mean
200 standard deviation
95.0000% target left-tail probability
1328.97 #VALUE!
value that a normally-distributed random variable with given mean
and standard deviation has "target probability" of being less than
or equal to
9 degrees of freedom
1.5 target value (must be non-negative)
8.3925% #VALUE!
Pr(t-distributed random variable with given number of degrees of
freedom [and with mean 0 and standard deviation 1] is greater
than or equal to target value)
9
5.0000% target two-tail probability
2.2622 #VALUE!
value that the absolute value of a t-distributed random variable with
given number of degrees of freedom has "target probability" of
exceeding
Excel offers functions for computing the variance and standard deviation of a
"population" of values, and a matching set of functions for using sample
values to estimate a population's variance and standard deviation.
1.0
2.0 a range of sample data
3.0
4.0
2.5000 #VALUE! population and sample mean sum of observations, divided by "n"
1.2500 #VALUE! population variance sum of squared differences from the mean, divided by "n"
1.1180 #VALUE! population standard deviation square root of population variance
1.6667 #VALUE! sample variance sum of squared differences from the mean, divided by "n-1"
1.2910 #VALUE! sample standard deviation square root of sample variance
Circulation
Circulation
(a) If you guarantee the sale of at least 600,000 copies next week, what chance
will you face of failing to meet the guarantee?
You will only fail to meet the guarantee if demand runs below 600,000.
(b) How many copies of your next issue must you print, in order to have only
one chance in 100 of being caught short?
You will only run short if demand exceeds the number of copies printed.
(c) What mean weekly circulation figure would you be 95% safe in
guaranteeing over the next (52-week) year?
In a typical advertising contract, you promise to rebate some of the fees if your
circulation doesn't reach a specified target level. In this case, you want a 95%
chance that circulation will be above the guaranteed level, i.e., only a 5% chance
of circulation being below that level.
The key here is to remember that the variance of a sum of independent random
variables is the sum of their variances. Therefore the standard deviation of a
sum of n independent, identically-distributed random variables is sqrt(n) times
the standard deviation of a single one, and the standard deviation of the mean is
the original standard deviation divided by sqrt(n).
52 time period
650,000 mean per week
5,547 standard deviation of mean over time period
5% chance of failing to meet guarantee
Page 5
Circulation
The answer obviously must be 52 times the answer to the previous question.
52 time period
33,800,000 expected total over time period
288,444 standard deviation of total over time period
5% chance of failing to meet guarantee
Page 6
Advertising sales
Advertising Sales
530 35.2800
(c) How would your answer in (b) change, if the subscription list included 400,000 people?
(d) How would your answer in (b) change, if the subscription list included 1,000 people?
Again, it wouldn't change. But this is assuming that the sample procedure
used simple random sampling with replacement. Had it used simple random
sampling without replacement, the answers to (b)-(d) would be:
(e) How would your answer in (b) change, if the sample size were only 10?
(f) Did any of your answers above require special assumptions?
530 128.7642
2.262157 how many standard deviations out from the mean you must go in
the t-distribution with 10-1 = 9 degrees of freedom, in order to cut
off 5% of the distribution in the two tails
(g) If (b) had asked for a 99% confidence interval, would you have reported a
wider or narrower interval?
Page 7
Advertising sales
Page 8
TV ratings
TV Ratings
The A.C. Nielsen Co. wishes to estimate, for each of several cities, the
amount of time a typical adult living in the city spends watching television
each week. In San Diego (which has an adult population of 400,000), they
sample 100 people. The sample mean is 25 hours, with a sample standard
deviation of 10 hours.
25 1.96
(b) How many more people should they interview, in order to be 95%-confident
that their final estimate is wrong by no more than 1 hour?
It takes roughly four times as large a sample, in order to cut the margin
of error in the estimate in half (this is the "square-root" effect). Hence, they
need a total sample of roughly 400 viewers.
(c) Assuming that viewing time varies from person to person in Los Angeles by
about as much as it does in San Diego, and that Nielsen wishes to make as
precise an estimate for L.A. as they did in part (a) for San Diego, how large
a sample should they take in L.A. (which has an adult population of
1,600,000)?
The same size. For practical purposes (i.e., ignoring the distinction between
sampling with and without replacement), the margin of error in an estimate
depends on the sample size, but not on the population size.
Page 9
Miscellany
Miscellany
Explain, briefly, the difference between sampling bias and sampling error.
A 95%-confidence interval for an estimate indicates how much you can trust
the (fill in the blank) __________________ .
... estimation procedure. The point here is that we don't have any idea how
accurate our actual estimate is, but we do know how much accuracy is
inherent in the procedure we used to make the estimate.
A firm has conducted a study, and has found that a 95%-confidence interval
for the mean time required to process new insurance policies is (10 days,
12 days). This means that
(a) only 5 percent of all policies take less than 10 or more than 12 days to process.
(b) only 5 percent of all policies take between 10 and 12 days to process.
(c) about 95 out of every 100 intervals similarly constructed will contain the true
mean processing time.
(d) the probability is 0.95 that the true mean lies between 10 and 12 days.
(a) and (b) are totally wrong, since they refer to the distribution of individuals
within the population, and a 95%-confidence interval tells us nothing about
that. It could well be that half of all the policies are approved instantly, and
the other half take about three weeks.
(c) is objectively (no one could argue) correct. It restates the definition of a
95%-confidence interval.
Page 10
Miscellany
(d) is subjectively (from the viewpoint of the person making the estimate,
who sees only this sample data) correct, if we accept the principle that
most people equate lack-of-personal-knowledge with randomness. (d) is the
interpretation most commonly used in practice.
Page 11