Fall 2019
I expect your work to be typed and well organized. You need to explain / show all of the steps you
used to arrive at your answer. Submit your work through Blackboard as a Word or pdf file.
1. Use Stata and the textbook data file CARS2 for this problem. This dataset consists of
highway mileage (hwympg) for 147 cars from the year 2003.
a. Use Stata’s histogram command with the frequency option to obtain a histogram of
the mileage. Include a title for your graph using Stata’s graphing editor. Copy this
graph and paste the graph into your solutions.
b. Use Stata’s univar command to obtain summary statistics. Copy Stata’s output as a
“picture” and paste this into your solutions. What are the mean, median, and
standard deviation?
Solutions:
a. Command: histogram hwympg, frequency
b. The average (mean) is 28.15 mpg, the median (Mdn) is 28 mpg, and the standard deviation (S.D.) is
6.53 mpg.
2. Calculate the following probabilities using the standard normal distribution:
a. 𝑃(0.0 ≤ 𝑍 ≤ 1.11)
b. 𝑃(−0.79 ≤ 𝑍 ≤ 0.0)
c. 𝑃(0.2 ≤ 𝑍 ≤ 1.5)
d. 𝑃(−2.03 ≤ 𝑍 ≤ 1.71)
e. 𝑃(𝑍 ≤ 1.23)
Solution:
a. 𝑃(0.0 ≤ 𝑍 ≤ 1.11) = 0.3665
b. 𝑃(−0.79 ≤ 𝑍 ≤ 0.0) = 𝑃(0.0 ≤ 𝑍 ≤ 0.79) = 0.2852
c. 𝑃(0.2 ≤ 𝑍 ≤ 1.5) = 𝑃(0.0 ≤ 𝑍 ≤ 1.5) − 𝑃(0.0 ≤ 𝑍 ≤ 0.2) = 0.4332 − 0.0793 =
0.3539
d. 𝑃(−2.03 ≤ 𝑍 ≤ 1.71) = 𝑃(0 ≤ 𝑍 ≤ 2.03) + 𝑃(0 ≤ 𝑍 ≤ 1.71) = 0.4788 +
0.4564 = 0.9352
e. 𝑃(𝑍 ≤ 1.23) = 0.5 + 𝑃(0.0 ≤ 𝑍 ≤ 1.23) = 0.5 + 0.3907 = 0.8907
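The table lookups above can be cross-checked numerically. The snippet below is a minimal sketch in Python (not part of the assigned Stata work); the helper name prob_between is made up for illustration, and NormalDist comes from Python's standard library.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_between(lo, hi):
    """Return P(lo <= Z <= hi) for the standard normal (illustrative helper)."""
    return Z.cdf(hi) - Z.cdf(lo)


answers = {
    "a": prob_between(0.0, 1.11),
    "b": prob_between(-0.79, 0.0),
    "c": prob_between(0.2, 1.5),
    "d": prob_between(-2.03, 1.71),
    "e": Z.cdf(1.23),  # P(Z <= 1.23) is the cumulative probability itself
}

for part, p in answers.items():
    print(f"{part}. {p:.4f}")
```

Rounded to four decimals, each value should agree with the table-based answer for the same part.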
3. Bob owns a business which makes fence posts. When Bob’s machines are calibrated
correctly, the fence posts are produced with an average length of 250 cm and a standard
deviation of 0.4 cm. The lengths of the individual fence posts are normally distributed.
a. What is the probability that an individual fence post is longer than 250.5 cm?
b. Suppose a sample of 49 fence posts is taken at random. What is the probability that
the sample mean is longer than 250.5 cm?
c. How would your answer in (b) change if the lengths of the fence posts were not
normally distributed? Briefly explain.
Solution:
(a) 𝑃(𝑋 > 250.5) = 𝑃(𝑍 > (250.5 − 250)/0.4) = 𝑃(𝑍 > 1.25) = 0.5 − 0.3944 = 0.1056. In words,
the probability that an individual fence post is longer than 250.5 cm is about 10.6%.
(b) 𝑃(𝑋̄ > 250.5) = 𝑃(𝑍 > (250.5 − 250)/(0.4/√49)) = 𝑃(𝑍 > 8.75) ≈ 0. In words, the probability that the
average length is longer than 250.5 cm is about 0%.
(c) The answer in part (b) would not change. Because the sample is large (n = 49), the Central
Limit Theorem ensures that the sampling distribution of the sample mean is approximately normal
regardless of the distribution of the lengths of the individual fence posts.
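Parts (a) and (b) differ only in the standard deviation used to standardize: 0.4 for a single post versus 0.4/√49 for the sample mean. The Python sketch below (an illustrative cross-check, with variable names chosen here rather than taken from the assignment) makes that comparison explicit.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 250.0, 0.4   # calibrated mean and standard deviation (cm)
n = 49                   # sample size in part (b)
cutoff = 250.5           # length threshold (cm)

std_normal = NormalDist()

# (a) A single post: standardize with the population standard deviation.
z_single = (cutoff - mu) / sigma
p_single = 1 - std_normal.cdf(z_single)

# (b) The sample mean: standardize with the standard error sigma / sqrt(n).
std_error = sigma / sqrt(n)
z_mean = (cutoff - mu) / std_error
p_mean = 1 - std_normal.cdf(z_mean)

print(f"(a) z = {z_single:.2f}, P = {p_single:.4f}")
print(f"(b) z = {z_mean:.2f}, P = {p_mean:.2e}")
```

Dividing by √49 = 7 shrinks the spread of the sample mean, which is why the same 0.5 cm gap corresponds to a much larger z value in part (b).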
4. A marketing team hired by a politician running for office wants to determine the average age
of adults in a particular voting district. Based on a random sample of 300 adults, the sample
mean is calculated to be 44 years with a sample standard deviation of 16 years. Construct a
95% confidence interval estimate of the population average age for this voting district.
Round the values in your interval to the nearest tenth digit.
Solution:
The question gives the sample standard deviation, which is a hint that the confidence
interval should use the t distribution. With 300 adults sampled, the t distribution has 299
degrees of freedom. Although 299 is literally off the chart when looking at the t table in the
appendix to our text, we can use information from the bottom row that corresponds to infinite (∞)
degrees of freedom.
𝑡^𝐶_{𝛼/2, 𝑛−1} = 𝑡^𝐶_{.025, 299} ≈ 𝑡^𝐶_{.025, ∞} = 1.96
At this point, we can proceed using the formula for a confidence interval:
𝑥̄ ± 𝑡_{𝛼/2, 𝑛−1} × 𝑠/√𝑛
⇒ 44 ± 1.96 × 16/√300
⇒ (42.2, 45.8)
Thus, we are “95% confident” that the average age of adults in this district is somewhere between
42.2 and 45.8 years.
On a side note, there is something notable about the t distribution and its degrees of freedom. As
the degrees of freedom get very large (e.g., 299 degrees of freedom), the distribution settles
down so that, for example, an area of 0.025 lies above a t value of 1.96. What is so special
about this? Look at the Z table for Z = 1.96 and you will see that this value also corresponds to an
area of 0.025 to the right. It turns out that as the degrees of freedom become larger and larger, the t
distribution becomes more and more like the Z distribution.
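The interval for this problem can be reproduced in a few lines. The Python sketch below is an illustrative cross-check only; it assumes, as the solution does, that the standard normal critical value is an acceptable stand-in for the t critical value with 299 degrees of freedom, and its variable names are chosen here for readability.

```python
from math import sqrt
from statistics import NormalDist

n = 300        # sample size
xbar = 44.0    # sample mean age (years)
s = 16.0       # sample standard deviation (years)
conf = 0.95    # confidence level

# Assumption: use the z critical value in place of t with n - 1 = 299
# degrees of freedom, mirroring the "infinite degrees of freedom" table row.
crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)

margin = crit * s / sqrt(n)           # critical value times standard error
lower, upper = xbar - margin, xbar + margin

print(f"critical value = {crit:.2f}")
print(f"95% CI = ({lower:.1f}, {upper:.1f})")
```

The inverse CDF call plays the role of the table lookup: it returns the value with an area of 0.025 to its right.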
5. You work for a regulatory agency which has just begun an investigation into whether the
weight of the candy M&M’s is less than the advertised 10 ounces. You take a random
sample of 50 bags of M&M’s and weigh each bag. The mean weight from your sample is
9.75 ounces, with a sample standard deviation of 0.8 ounces. Test the hypotheses:
𝑯𝟎 : 𝝁 ≥ 𝟏𝟎
𝑯𝒂 : 𝝁 < 𝟏𝟎
at the 5% level of significance. Assume the population is approximately normal. State the
decision rule, the test statistic, and your decision. Based on your test, is there evidence that
the bags of M&M’s actually weigh less than the advertised 10 ounces?
Solution:
The setup of the problem mentions the sample mean and sample standard deviation, which is a hint
that we should work with the t test statistic. Also, the inequality in the null hypothesis indicates that
this is a one-sided test. The critical value for a one-sided test at the 5% level with 50 observations is
𝑡^𝐶_{𝛼, 𝑛−1} = 𝑡^𝐶_{.05, 49}. If you are using the t-table in the appendix to our textbook, you will not see an
entry for 49 degrees of freedom. In this case, you can round the degrees of freedom down to the
nearest number that you see in the table (rounding the degrees of freedom downward is the most
conservative thing to do). Rounding down, we see 𝑡^𝐶_{.05, 49} ≈ 𝑡^𝐶_{.05, 40} = 1.684. Note, however, that the
hypotheses imply the rejection region is on the negative side of the distribution, so the critical
value in this problem is −1.684. Now we can address each part of the question:
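The three parts the problem asks for (decision rule, test statistic, decision) can be sketched numerically. The Python snippet below is an illustrative cross-check, not the original worked answer: it takes the critical value of 1.684 from the text above rather than computing it, and the variable names are chosen here for readability.

```python
from math import sqrt

n = 50          # number of bags sampled
xbar = 9.75     # sample mean weight (ounces)
s = 0.8         # sample standard deviation (ounces)
mu0 = 10.0      # advertised weight under the null hypothesis

# Critical value from the t table row for 40 degrees of freedom, negated
# because the rejection region is in the left tail.
t_crit = -1.684

# Test statistic: distance of the sample mean from mu0 in standard errors.
t_stat = (xbar - mu0) / (s / sqrt(n))

# Decision rule: reject H0 when the test statistic is below the critical value.
reject = t_stat < t_crit

print(f"decision rule: reject H0 if t < {t_crit}")
print(f"test statistic: t = {t_stat:.2f}")
print("decision:", "reject H0" if reject else "do not reject H0")
```

Under these inputs the test statistic falls below the critical value, so the sketch rejects the null hypothesis, which would indicate evidence that the bags weigh less than the advertised 10 ounces.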
The mean rate of return for the sample of mutual funds is -13.2%. The 99% confidence interval
for the population mean return of mutual funds is -18.4% to -8.1%. Note: you could also issue
the command shown in (b) below to obtain the same confidence interval information.
Stata t test output (99% confidence level):
Ha: mean < -18     Pr(T < t) = 0.9919
Ha: mean != -18    Pr(|T| > |t|) = 0.0163
Ha: mean > -18     Pr(T > t) = 0.0081
Solution:
(a) Command: ttest graduation_rate == graduation_rate1, unpaired unequal level(90)
[Stata output: two-sample t test with unequal variances, 90% confidence interval]
The difference in graduation rates is defined as the rate for public schools minus the rate for private
schools. The 90% confidence interval for the difference in the population mean graduation rates is
−39.8% to −32.6%. This tells us that the average graduation rate is much lower for public schools
than for private schools.
(b) We can use the information from the previous part (a). The hypotheses are:
𝑯𝟎 : 𝝁𝟎 − 𝝁𝟏 ≥ 𝟎
𝑯𝒂 : 𝝁𝟎 − 𝝁𝟏 < 𝟎
Note: 𝝁𝟎 is the population mean graduation rate for public schools and 𝝁𝟏 is the population mean
graduation rate for private schools.
Using the p value method:
Decision rule: reject 𝐻0 if 𝑝 value < 0.01
Test statistic: 𝑝 value = 0.0000
Decision: Reject 𝐻0.
Conclusion: There is evidence that supports Joe Bob’s claim that, on average, graduation
rates at private schools are higher than graduation rates at public schools.
8. Answer each of the questions below using about five sentences or fewer.
a. Suppose you have constructed a 99% confidence interval of (1, 5) for a population
mean. Does this imply the population mean is most likely 3? Explain.
c. A drug must be demonstrated to be safe before the Food and Drug Administration
(FDA) will allow the drug to be sold in the U.S. In essence, the FDA’s null
hypothesis is a drug is unsafe. While the FDA has received criticism for the length of
time involved in its approval process, it says it needs this time to mitigate mistakes.
Briefly explain Type I and Type II errors in the context of the FDA approval process.
Solution (a): A confidence interval indicates the range in which a population mean is likely to be found,
but we cannot conclude anything more. According to the 99% confidence interval, the population
mean could be 1.01, 1.8, 2.5, 3.8, 4.21, or any other number between 1 and 5. We cannot say it is
most likely to be 3 merely because 3 is in the middle of the interval.
Solution (b): The null is rejected at the 5% level, too. The rule using the p-value method is to reject
the null if the p-value is less than alpha. Since the null is rejected at the 1% level, this implies the p-
value is less than 0.01 (the p value could be, for example, 0.006). Therefore, it must be true that the
null is rejected at the 5% level, too.
Solution (c): A drug must be demonstrated to be safe before the FDA will allow the drug to be sold
in the U.S. In essence, the FDA’s null hypothesis is a drug is unsafe. The FDA commits a type I
error if it approves an unsafe drug to be sold in the U.S. A type II error is committed when the FDA
does not approve a safe drug.