
ECON 601 Module 2 Problem Set

Fall 2019

I expect your work to be typed and well organized. You need to explain / show all of the steps you
used to arrive at your answer. Submit your work through Blackboard as a Word or pdf file.

1. Use Stata and the textbook data file CARS2 for this problem. This dataset consists of
highway mileage (hwympg) for 147 cars from the year 2003.
a. Use Stata’s histogram command with the frequency option to obtain a histogram of
the mileage. Include a title for your graph using Stata’s graphing editor. Copy this
graph and paste the graph into your solutions.
b. Use Stata’s univar command to obtain summary statistics. Copy Stata’s output as a
“picture” and paste this into your solutions. What are the average, median, and
standard deviation?

Solutions:
a. Command: histogram hwympg, frequency

b. Commands: ssc install univar (needed only once, to install the command), then univar hwympg


                                        -------------- Quantiles --------------
Variable       n     Mean     S.D.      Min      .25      Mdn      .75      Max
-------------------------------------------------------------------------------
  hwympg     147    28.15     6.53    13.00    25.00    28.00    31.00    68.00
-------------------------------------------------------------------------------

The average (mean) is 28.15 mpg; the median (Mdn) is 28 mpg; and the standard deviation (S.D.) is
6.53 mpg.
2. Calculate the following probabilities using the standard normal distribution:

a. 𝑃(0.0 ≤ 𝑍 ≤ 1.11)

b. 𝑃(−0.79 ≤ 𝑍 ≤ 0.0)

c. 𝑃(0.2 ≤ 𝑍 ≤ 1.5)

d. 𝑃(−2.03 ≤ 𝑍 ≤ 1.71)

e. 𝑃(𝑍 ≤ 1.23)

Solution:
a. 𝑃(0.0 ≤ 𝑍 ≤ 1.11) = 0.3665
b. 𝑃(−0.79 ≤ 𝑍 ≤ 0.0) = 𝑃(0.0 ≤ 𝑍 ≤ 0.79) = 0.2852
c. 𝑃(0.2 ≤ 𝑍 ≤ 1.5) = 𝑃(0.0 ≤ 𝑍 ≤ 1.5) − 𝑃(0.0 ≤ 𝑍 ≤ 0.2) = 0.4332 − 0.0793 =
0.3539
d. 𝑃(−2.03 ≤ 𝑍 ≤ 1.71) = 𝑃(0 ≤ 𝑍 ≤ 2.03) + 𝑃(0 ≤ 𝑍 ≤ 1.71) = 0.4788 +
0.4564 = 0.9352
e. 𝑃(𝑍 ≤ 1.23) = 0.5 + 𝑃(0.0 ≤ 𝑍 ≤ 1.23) = 0.5 + 0.3907 = 0.8907
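
The table lookups in Problem 2 can be cross-checked numerically. The sketch below is not part of the original solutions; it assumes Python's standard-library `statistics.NormalDist` as a stand-in for the Z table, and the helper name `prob_between` is made up for illustration.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_between(lo: float, hi: float) -> float:
    """P(lo <= Z <= hi), computed as a difference of cumulative probabilities."""
    return Z.cdf(hi) - Z.cdf(lo)


# One entry per part of Problem 2 (hypothetical container, for illustration).
answers = {
    "a": prob_between(0.0, 1.11),
    "b": prob_between(-0.79, 0.0),
    "c": prob_between(0.2, 1.5),
    "d": prob_between(-2.03, 1.71),
    "e": Z.cdf(1.23),  # P(Z <= 1.23) is the cumulative probability itself
}

for part, p in answers.items():
    print(f"{part}. {p:.4f}")
```

Rounded to four decimals, these values should agree with the table-based answers; any tiny discrepancy would come from the table's own rounding.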
3. Bob owns a business which makes fence posts. When Bob’s machines are calibrated
correctly, the fence posts are produced with an average length of 250 cm and a standard
deviation of 0.4 cm. The lengths of the individual fence posts are normally distributed.
a. What is the probability that an individual fence post is longer than 250.5 cm?
b. Suppose a sample of 49 fence posts is taken at random. What is the probability that
the sample mean is longer than 250.5 cm?
c. How would your answer in (b) change if the lengths of the fence posts were not
normally distributed? Briefly explain.

Solution:
(a) $P(X > 250.5) = P\left(Z > \frac{250.5 - 250}{0.4}\right) = P(Z > 1.25) = 0.5 - 0.3944 = 0.1056$. In words,
the probability that an individual fence post is longer than 250.5 cm is about 10.6%.
(b) $P(\bar{X} > 250.5) = P\left(Z > \frac{250.5 - 250}{0.4/\sqrt{49}}\right) = P(Z > 8.75) \approx 0$. In words, the probability that the
average length is longer than 250.5 cm is about 0%.

(c) The answer in part (b) would not change in any meaningful way, because of the Central Limit
Theorem. With a sample as large as n = 49, this theorem ensures the sampling distribution of the
sample mean is approximately normal regardless of the distribution of the lengths of the
individual fence posts.
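
As a rough numerical check of parts (a) and (b), the sketch below repeats the two calculations in Python. It is an illustration only, not part of the original solutions; the variable names are assumptions, and `statistics.NormalDist` replaces the Z table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 250.0, 0.4  # calibrated mean and standard deviation, in cm
n = 49                  # sample size in part (b)
cutoff = 250.5          # length threshold, in cm

# (a) A single post: X ~ Normal(mu, sigma)
p_single = 1 - NormalDist(mu, sigma).cdf(cutoff)

# (b) The sample mean: X-bar ~ Normal(mu, sigma / sqrt(n))
se = sigma / sqrt(n)             # standard error of the mean
z_mean = (cutoff - mu) / se      # z value for the sample mean
p_mean = 1 - NormalDist(mu, se).cdf(cutoff)

print(f"(a) P(X > {cutoff}) = {p_single:.4f}")
print(f"(b) z = {z_mean:.2f}, P(X-bar > {cutoff}) = {p_mean:.4f}")
```

The only difference between the two parts is the spread: dividing the standard deviation by √49 = 7 shrinks it enough that 250.5 cm sits far out in the tail of the sampling distribution.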
4. A marketing team hired by a politician running for office wants to determine the average age
of adults in a particular voting district. Based on a random sample of 300 adults, the sample
mean is calculated to be 44 years with a sample standard deviation of 16 years. Construct a
95% confidence interval estimate of the population average age for this voting district.
Round the values in your interval to the nearest tenth digit.

Solution:
The setup of the question gives the sample standard deviation, which is a hint that the confidence
interval uses the t distribution. With 300 adults sampled, the t distribution has 299
degrees of freedom. Although 299 is literally off the chart when looking at the t table in the
appendix to our text, we can use information from the bottom row that corresponds to infinite (∞)
degrees of freedom.

$t^{C}_{\alpha/2,\,n-1} = t^{C}_{.025,\,299} \approx t^{C}_{.025,\,\infty} = 1.96$

At this point, we can proceed using the formula for a confidence interval:
$\bar{x} \pm t^{C}_{\alpha/2,\,n-1} \times \frac{s}{\sqrt{n}}$
$\Rightarrow 44 \pm 1.96 \times \frac{16}{\sqrt{300}}$
$\Rightarrow (42.2,\ 45.8)$

Thus, we are “95% confident” that the average age of adults in this district is somewhere between
42.2 and 45.8 years.
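
The arithmetic behind this interval can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the original solution; the critical value 1.96 is typed in from the t table as described above, not computed, and the variable names are illustrative.

```python
from math import sqrt

n = 300        # sample size
xbar = 44.0    # sample mean age, in years
s = 16.0       # sample standard deviation, in years
t_crit = 1.96  # table value for an upper-tail area of .025, infinite df row

se = s / sqrt(n)      # standard error of the sample mean
margin = t_crit * se  # half-width of the confidence interval
lower, upper = xbar - margin, xbar + margin

print(f"standard error = {se:.4f}, margin = {margin:.4f}")
print(f"95% CI: ({lower:.1f}, {upper:.1f})")
```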

On a side note, there is something peculiar about the t distribution and the degrees of freedom. As
the degrees of freedom get really large (e.g., 299 degrees of freedom), the distribution
settles down so that, for example, an area of 0.025 lies above a t value of 1.96. What is so special
about this? Look at the Z table for Z=1.96 and you will see that this value corresponds to an area of
0.025 to the right. It turns out that as the degrees of freedom become larger and larger, the t
distribution becomes more and more like the Z distribution.
5. You work for a regulatory agency which has just begun an investigation into whether the
weight of the candy M&M’s is less than the advertised 10 ounces. You take a random
sample of 50 bags of M&M’s and weigh each bag. The mean weight from your sample is
9.75 ounces, with a sample standard deviation of 0.8 ounces. Test the hypotheses:
𝑯𝟎 : 𝝁 ≥ 𝟏𝟎
𝑯𝒂 : 𝝁 < 𝟏𝟎
at the 5% level of significance. Assume the population is approximately normal. State the
decision rule, the test statistic, and your decision. Based on your test, is there evidence that
the bags of M&M’s actually weigh less than the advertised 10-ounces?

Solution:
The setup of the problem mentions the sample mean and sample standard deviation, which is a hint
that we should work with the t test statistic. Also, the inequality in the null hypothesis indicates that
this is a one-sided test. The critical value for a one-sided test at the 5% level with 50 observations is
$t^{C}_{\alpha,\,n-1} = t^{C}_{.05,\,49}$. If you are using the t-table in the appendix to our textbook, you will not see an
entry for 49 degrees of freedom. In this case, you can round the degrees of freedom down to the
nearest number that you see in the table (rounding the degrees of freedom downward is the most
conservative thing to do). Rounding down, we see $t^{C}_{.05,\,49} \approx t^{C}_{.05,\,40} = 1.684$. Note, however, that the
hypotheses imply the rejection region is on the negative side of the distribution; thus the critical
value in this problem is -1.684. Now we can address each part of the question:

• The decision rule is to reject 𝐻0 if 𝑡 < −1.684, otherwise do not reject 𝐻0 .
• The test statistic is $t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{9.75 - 10}{0.8/\sqrt{50}} = -2.21$.
• Reject 𝐻0 , because −2.21 < −1.684.
• There is evidence that bags of M&M’s advertised as 10 ounces actually weigh less.
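
The decision in Problem 5 can be checked with a short Python sketch. It is offered only as an illustration, not as part of the original solution; the critical value is the table entry for 40 degrees of freedom quoted above, entered by hand, and the variable names are assumptions.

```python
from math import sqrt

n = 50           # number of bags sampled
xbar = 9.75      # sample mean weight, in ounces
s = 0.8          # sample standard deviation, in ounces
mu0 = 10.0       # advertised weight under the null hypothesis
t_crit = -1.684  # lower-tail critical value (table entry for 40 df, sign flipped)

t_stat = (xbar - mu0) / (s / sqrt(n))  # one-sample t statistic
reject = t_stat < t_crit               # lower-tail decision rule

print(f"t = {t_stat:.2f}; reject H0: {reject}")
```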
6. Use Stata and the textbook data file ONERET2 for this problem. The data consists of one-
year returns for a random sample of 83 mutual funds from July 1, 2002. Pretend that the 1-
year return for the S&P 500 stock index was -18% over the same time period. Round values
to the nearest tenth.
a. What is the mean rate of return for the sample of mutual funds? What is the 99%
confidence interval estimate for the population mean rate of return?
b. Test whether the average one-year return for the mutual funds performed the same as
the S&P 500 index. Use a 1% level of significance. Copy the output from Stata as a
picture and paste this into your solutions. State the hypotheses to be tested, the
decision rule using the p-value method (see p. 38), and your decision to reject or not
reject the null hypothesis. What can you conclude regarding the average
performance of mutual funds versus the S&P 500?
Solutions:
(a) Command: ci means ret1yr, level(99)
    Variable |        Obs        Mean    Std. Err.       [99% Conf. Interval]
-------------+---------------------------------------------------------------
      ret1yr |         83   -13.24217    1.939664        -18.3573   -8.127034

The mean rate of return for the sample of mutual funds is -13.2%. The 99% confidence interval
for the population mean return of mutual funds is -18.4% to -8.1%. Note: you could also issue
the command shown in (b) below to obtain the same confidence interval information.

(b) Command: ttest ret1yr==-18, level(99)


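One-sample t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [99% Conf. Interval]
---------+--------------------------------------------------------------------
  ret1yr |      83   -13.24217    1.939664    17.67118    -18.3573   -8.127034
------------------------------------------------------------------------------
    mean = mean(ret1yr)                                           t =   2.4529
Ho: mean = -18                                   degrees of freedom =       82

   Ha: mean < -18               Ha: mean != -18               Ha: mean > -18
 Pr(T < t) = 0.9919         Pr(|T| > |t|) = 0.0163          Pr(T > t) = 0.0081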

The hypotheses are:

𝑯𝟎 : 𝝁 = −𝟏𝟖
𝑯𝒂 : 𝝁 ≠ −𝟏𝟖

Note: 𝜇 is the population average mutual fund return.

Using the p-value method:

Decision rule: reject 𝐻0 if 𝑝 value < 0.01, otherwise do not reject 𝐻0 .
Test statistic: 𝑝 value = 0.0163 (the two-sided value, Pr(|T| > |t|), from the Stata output)
Decision: The p value of 0.0163 is greater than 0.01. Thus, we do not reject 𝐻0 at the 1% level of significance.
Conclusion: There is not sufficient evidence to believe that, on average, mutual funds
performed differently than the S&P 500 index. In other words, our test did not indicate a
statistical difference in the average performance of mutual funds versus the overall market.
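
To see where Stata's t value comes from and how the p-value rule is applied, the sketch below redoes the calculation in Python from the summary numbers in the output. It is an illustration under stated assumptions: the two-sided p value is copied from the Stata output rather than computed, and the variable names are made up.

```python
mean = -13.24217      # sample mean return, from the Stata output
se = 1.939664         # standard error of the mean, from the Stata output
mu0 = -18.0           # hypothesized mean (the pretend S&P 500 return)
alpha = 0.01          # 1% level of significance
p_two_sided = 0.0163  # Pr(|T| > |t|), copied from the Stata output

t_stat = (mean - mu0) / se    # one-sample t statistic
reject = p_two_sided < alpha  # p-value decision rule

print(f"t = {t_stat:.4f}")
print(f"p = {p_two_sided} vs alpha = {alpha}; reject H0: {reject}")
```

Note that the same p value would lead to rejection at the 5% level, which is why the significance level needs to be fixed before looking at the result.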
7. Use Stata for this problem. The textbook data file named PRIVATE2 contains the
graduation rates for 195 schools and a variable coded 1 for private schools and 0 for public
schools.
a. Construct a 90% confidence interval estimate for the difference between population
average graduation rates for public and private schools. Briefly describe what the
interval is telling us about average graduation rates between public and private
schools.
b. Joe Bob claims that, on average, graduation rates at private schools are higher than
graduation rates at public schools. Use a 10% level of significance to test this claim
and assume unequal population variances. State the hypotheses to be tested, the
decision rule using the p-value method (see p. 38), and your decision to reject or not
reject the null hypothesis. What can you conclude about Joe Bob’s claim?

Solution:
(a) Command: ttest graduation_rate == graduation_rate1, unpaired unequal level(90)
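Two-sample t test with unequal variances
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [90% Conf. Interval]
---------+--------------------------------------------------------------------
gradua~e |      95         .36    .0153438    .1495525    .3345105    .3854895
gradua~1 |     100       .7224    .0154751    .1547505    .6967054    .7480946
---------+--------------------------------------------------------------------
combined |     195    .5458462    .0169522    .2367252    .5178284    .5738639
---------+--------------------------------------------------------------------
    diff |              -.3624    .0217924               -.3984182   -.3263818
------------------------------------------------------------------------------
    diff = mean(graduation_rate) - mean(graduation_rate1)         t = -16.6297
Ho: diff = 0                     Satterthwaite's degrees of freedom =  192.942

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000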

The difference in graduation rates is defined as the rate for public schools minus the rate for private
schools. The 90% confidence interval for the difference in the population mean graduation rates is
-39.8% to -32.6%. Because the entire interval lies below zero, this is telling us that the average
graduation rate is much lower for public schools than for private schools.

(b): We can use the information from the previous part (a). The hypotheses are:
𝑯𝟎 : 𝝁𝟎 − 𝝁𝟏 ≥ 𝟎
𝑯𝒂 : 𝝁𝟎 − 𝝁𝟏 < 𝟎
Note: 𝜇0 is the population mean graduation rate for public schools and 𝜇1 is the population mean
graduation rate for private schools.
Using the p value method:
Decision rule: reject 𝐻0 if 𝑝 value < 0.10, otherwise do not reject 𝐻0 .
Test statistic: 𝑝 value = 0.0000 (the lower-tail value, Pr(T < t), from the Stata output)
Decision: Reject 𝐻0 .
Conclusion: There is evidence that supports Joe Bob’s claim that, on average, graduation
rates at private schools are higher than graduation rates at public schools.
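
The standard error, t value, and Satterthwaite degrees of freedom in the Stata output can be rebuilt from the two group rows. The Python sketch below is an illustration, not part of the original solution; it uses the standard unequal-variance (Welch) formulas, takes the group standard errors as printed by Stata, and uses made-up variable names.

```python
from math import sqrt

# Group rows copied from the Stata output (0 = public, 1 = private)
n0, mean0, se0 = 95, 0.36, 0.0153438
n1, mean1, se1 = 100, 0.7224, 0.0154751

diff = mean0 - mean1             # public minus private
se_diff = sqrt(se0**2 + se1**2)  # standard error of the difference
t_stat = diff / se_diff          # unequal-variance t statistic

# Satterthwaite approximation to the degrees of freedom
df = (se0**2 + se1**2) ** 2 / (se0**4 / (n0 - 1) + se1**4 / (n1 - 1))

print(f"diff = {diff:.4f}, se = {se_diff:.7f}")
print(f"t = {t_stat:.4f}, Satterthwaite df = {df:.3f}")
```

Small differences from Stata in the last digit or two are expected, since the inputs here are already rounded.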
8. Answer each of the questions below using about five sentences or less.

a. Suppose you have constructed a 99% confidence interval of (1, 5) for a population
mean. Does this imply the population mean is most likely 3? Explain.

b. If a null hypothesis is rejected at the 1% level of significance, what decision would
have been made at the 5% level? Briefly explain your answer.

c. A drug must be demonstrated to be safe before the Food and Drug Administration
(FDA) will allow the drug to be sold in the U.S. In essence, the FDA’s null
hypothesis is a drug is unsafe. While the FDA has received criticism for the length of
time involved in its approval process, it says it needs this time to mitigate mistakes.
Briefly explain Type I and Type II errors in the context of the FDA approval process.

Solution (a): A confidence interval indicates the range in which a population mean is likely to be
found, but we cannot conclude anything more. According to the 99% confidence interval, the population
mean could be 1.01, 1.8, 2.5, 3.8, 4.21, or any other number between 1 and 5. We cannot say it is
most likely to be 3 merely because 3 is in the middle of the interval.

Solution (b): The null is rejected at the 5% level, too. The rule using the p-value method is to reject
the null if the p-value is less than alpha. Since the null is rejected at the 1% level, this implies the p-
value is less than 0.01 (the p value could be, for example, 0.006). Therefore, it must be true that the
null is rejected at the 5% level, too.

Solution (c): A drug must be demonstrated to be safe before the FDA will allow the drug to be sold
in the U.S. In essence, the FDA’s null hypothesis is a drug is unsafe. The FDA commits a type I
error if it approves an unsafe drug to be sold in the U.S. A type II error is committed when the FDA
does not approve a safe drug.
