
Stats 250 W17 Exam 2



PRINT Name: __________________________________ UMID: _______________________

Seat: ___________________ Lab Section and GSI Name: ______________________________________

Page     Problem   Points     Points     Grader
Number   Number    Possible   Received   Initial
1        1         9
2        2         11
3 & 4    3 & 4     15
5        5         9
6        6         6
7        7         8
8        8 & 9     9
9        10        8
Total    --        75



Instructions:
1. Fill in your name, UMID number, seat, and Lab Section with GSI name in the spaces provided above.
2. You may use a calculator and the provided blue Stats Help Card. No other aids of any kind are permitted.
3. Some questions require that you clearly write your final answer in a designated place.
4. No credit will be given for providing only the answer without correct supporting work.
You must show how you arrived at your answer (formula used, plug in values, draw a picture, etc.).
(Giving steps that are specific to your calculator is not sufficient work.)
Required: Review instructions and initial here
5. When appropriate, you must carry out numerical answers to 4 decimal places.
6. Final answers must be legible and clear. If multiple answers are given, the worst will be graded.
7. There are 9 pages to your exam (excluding this cover page and formula correction note).
8. Points for each question are indicated in square brackets [X points].
9. Exams are expected to be returned to students electronically by the end of next week.
10. Please help us maintain the high standards of the University of Michigan. We are Michigan. And we are proud.
Excerpt from the LSA Academic Integrity Statement: “…The College holds all members of its community to high standards of scholarship
and integrity. To accomplish its mission of providing an optimal educational environment and developing leaders of society, the College
promotes the assumption of personal responsibility and integrity and prohibits all forms of academic dishonesty and misconduct.
Academic dishonesty may be understood as any action or attempted action that may result in creating an unfair academic advantage
for oneself or an unfair academic advantage or disadvantage for any other member or members of the academic community...”

www.lsa.umich.edu/academicintegrity/

No questions will be answered during the exam.

Answer each question the best you can.


Make sure you include relevant formulas with notation and your calculations.
Good Luck!








Note: Your Blue Formula Card provided with this exam
has an error in the standard error expression
under the Two Population Means General Column.

The boxed cell below shows the correct standard error expression.

s.e.(x̄1 − x̄2) = sqrt( s1²/n1 + s2²/n2 )


1. Vacation Guilt ~ Getting away from it all offers travelers a wide array of benefits—from recharging to reconnecting
with family and friends. Yet research results from the 2017 Alamo Rent A Car Family Vacation Survey show
U.S. workers are feeling more guilt than ever before about taking their vacations.
A new study aims to analyze whether there is a difference in the rates of Millennials and non-Millennials affected by
the issue. In a random sample of 598 Millennials, 356 said they did not use all of their vacation days in the last full
calendar year. In a random sample of 102 non-Millennials, 50 said they did not use all of their vacation days in the last
full calendar year. [9 points]
a. Formulate the null and the alternative hypotheses for assessing if there is a difference in the rates of Millennials
and non-Millennials who say they did not use all of their vacation days in the last full calendar year (1=Millennials,
2=non-Millennials). Circle one.



b. The samples in the new study are random independent samples. Verify the remaining conditions necessary for
testing the hypotheses in part (a) above.











c. Compute the test statistic value and find the corresponding p-value.

















Test statistic: _________ = ____________________ , p-value = ___________________________ .
                (symbol)

d. Select the appropriate decision at the 5% significance level.


Circle one: Reject H0 Fail to Reject H0
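
For reference, the calculations in parts (b) to (d) could be sketched as follows. This is a minimal sketch, not an official solution: it assumes a two-sided alternative (Ha: p1 ≠ p2, since the question asks about "a difference"), the usual large-sample z test for two proportions with a pooled estimate under H0, and a threshold of 10 for each expected count. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# Sample counts from the problem statement (1 = Millennials, 2 = non-Millennials).
x1, n1 = 356, 598
x2, n2 = 50, 102

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)  # common proportion estimate under H0: p1 = p2

# ASSUMPTION: large-sample condition is "each expected count is at least 10".
expected_counts = [n1 * p_pooled, n1 * (1 - p_pooled),
                   n2 * p_pooled, n2 * (1 - p_pooled)]
conditions_ok = all(count >= 10 for count in expected_counts)

# Null standard error and z test statistic.
se_null = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se_null

# Two-sided p-value, because the assumed alternative is p1 != p2.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"conditions met: {conditions_ok}")
print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```

Under these assumptions z comes out close to 1.99, and the two-sided p-value falls slightly below 0.05.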


Page 1 of 9


2. The State of American Jobs ~ Americans are putting more time in at work, both in the average length of a work week
and in the average number of weeks worked per year. In 1980, the average number of weeks worked per year was
43 weeks per year. Could it be that in 2016 Americans worked more weeks per year, on average? The Pew Research
Center surveyed 1,096 full-time employed Americans and one of the variables recorded was the number of weeks
worked in 2016. Using a 5% level of significance, we would like to assess if, on average, the number of weeks worked
per year in 2016 has increased since 1980, that is H0: µ = 43 weeks versus Ha: µ > 43 weeks. The numerical summaries
and R output are provided below. [11 points]


Summary Statistics
Mean               Std. Dev           Sample Size   Std. Error
43.22 weeks/year   4.574 weeks/year   1,096         0.1382 weeks/year




a. Because the sample is large enough, we can conclude that the distribution of the number of weeks worked (per
year) in 2016 for the population of all full-time employed Americans is approximately normal.

Circle one: True False


b. What is the estimated average distance by which the possible values of the sample mean number of weeks worked
per year would vary from the population mean number of weeks worked per year? Include units in your answer.

Final Answer: _________________________________

c. Provide an interpretation in context of the test statistic value of 1.608.







d. Provide a well-labeled, complete sketch showing the p-value for this test. Be sure to include relevant values on
the horizontal axis and shade the area that corresponds to the p-value.

distribution label:

t(80)





x-axis label:


e. Complete the two sentences below by circling the appropriate choice in each statement:

At the 5% significance level, the results   ARE   /   ARE NOT   statistically significant.

Thus, there   IS   /   IS NOT   sufficient evidence to conclude the population mean number of weeks worked per
year in 2016 by all full-time employed Americans has increased from the 43 weeks per year level stated in 1980.

Page 2 of 9


3. What Type of Students Sleep More? Students in two statistics classes at the University of California at Davis were
asked how much sleep they got the night before class (the question was asked on a Monday). Class 1 was made up of
liberal arts majors, and Class 2 was made up of students from more technical majors. The following is based on data
similar to that used in the study. Consider the following definitions for the parameters:
• µ1 is the population mean amount of time (hours) students in liberal arts majors sleep on Sunday nights
• µ2 is the population mean amount of time (hours) students in more technical majors sleep on Sunday nights

A random sample of 25 students was selected from the liberal arts majors, and a random sample of 30 students was
selected from the more technical majors. The following is the R output that is relevant to our analysis. [9 points]
Rcmdr> numSummary(Sleep[,"Sleep"], groups=Sleep$Major, statistics=c("mean", "sd"))
              mean       sd data:n
LibArts   7.760000 1.362596     25
Technical 6.933333 1.311312     30
Rcmdr> leveneTest(Sleep ~ Major, data=Sleep, center="mean")
Levene's Test for Homogeneity of Variance (center = "mean")
      Df F value Pr(>F)
group  1  0.0541  0.817
      53

a. Should we use the general (unpooled) or pooled methods to continue our analysis?

Circle one: General (unpooled) Pooled


Explain why you chose the method above. Be specific.












b. We want to see if there is a difference in the mean amount of sleep on Sunday nights for the two groups of
students. Which of the following R outputs is appropriate to use based on your answer in part (a)?

Circle one: Output #1 Output #2





Output #1
        Welch Two Sample t-test

data:  Sleep by Major
t = 2.2789, df = 50.465, p-value = 0.02694
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.09823602 1.55509731
sample estimates:
  mean in group LibArts mean in group Technical
               7.760000                6.933333

Output #2
        Two Sample t-test

data:  Sleep by Major
t = 2.287, df = 53, p-value = 0.02622
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 0.1016699 1.5516634
sample estimates:
  mean in group LibArts mean in group Technical
               7.760000                6.933333


Page 3 of 9


Problem 3 continues on page 4
c. We wish to test the hypotheses H0: µ1 = µ2 versus Ha: µ1 ≠ µ2 at the 5% significance level. Which is the appropriate
decision based on your selected output in part (b)?

Circle one: Reject H0 Fail to reject H0



d. State your conclusion in the context of the problem.








e. Suppose we want to increase the power of the test. Which of the following would accomplish that?
(Circle all that are appropriate.)
• Decrease the sample sizes
• Increase the sample sizes
• Decrease the significance level of the hypothesis test
• Increase the significance level of the hypothesis test

4. What Americans Know About Science ~ A new Pew Research Center survey found that most Americans can answer
basic questions about several scientific terms and concepts, such as layers of the Earth and elements needed to make
nuclear energy. The results were based on a random sample of Americans who were asked a series of 12 multiple
choice questions representing a small slice of science knowledge. The 90% confidence interval for the mean score on
the multiple choice scientific survey for all Americans was reported to be (7.15 points, 8.65 points). The corresponding
95% confidence interval on the same set of data was calculated to be (6.65 points, 9.15 points). [6 points]

a. Which of the following could be a possible p-value for the hypotheses: H0: μ = 7 points versus Ha: μ ≠ 7 points?

Circle all that apply. 0.01 0.05 0.07 0.10 0.12



b. Complete the following statement:

Based on the results, we would estimate the population mean score obtained by Americans

on the multiple choice scientific survey to be ____________ points.

c. For each statement below, determine if it is a correct or incorrect statement. Clearly circle your answer.

One of the assumptions we must check is that the population mean score on the
multiple choice scientific survey for all Americans follows a normal distribution.
Correct   Not Correct

If this study were repeated many times, about 90% of the time we would expect
the population mean score on the multiple choice scientific survey for all
Americans to be between 7.15 points and 8.65 points.
Correct   Not Correct

With 95% confidence, we estimate that, on average, Americans would obtain a
score of anywhere from 6.65 points to 9.15 points on the multiple choice scientific
survey, for the population of Americans represented by the sample.
Correct   Not Correct

Page 4 of 9


Page 5 of 9



5. When should you purchase your airfare? For anyone planning to travel, the pricing plans of airlines remain a guarded
mystery. A study by CheapAir.com suggested that the cheapest fares are found 49 days before a flight. However,
according to Travelers Today, the best prices are offered 21 days before a flight. Detroit Metropolitan Airport (DTW)
is a hub for Delta Airlines, therefore offering many non-stop flights from DTW to many destinations around the world.
We randomly selected 36 destinations for which Delta Airlines offers non-stop flights and recorded the best price for
a one-way ticket 21 and 49 days prior to the flight. We would like to estimate, with 99% confidence, the population
mean difference in airfare price (21 days before a flight - 49 days before a flight) for a one-way, non-stop ticket from
DTW to all destinations offered by Delta Airlines. [9 points]
> numSummary(airprice[,c("Difference", "Price21days", "Price49days")],
+   statistics=c("mean", "sd", "IQR", "quantiles"), quantiles=c(0,.25,.5,.75,1))
               mean    sd   IQR  0%   25%   50%   75% 100%  n
Difference   -9.389 39.57 55.00 -83 -39.5 -11.0  15.5   85 36
Price21days 153.056 19.47 32.25 107 138.0 154.0 170.2  195 36
Price49days 162.444 32.37 44.25  86 145.0 159.5 189.2  218 36

a. Consider the histogram and QQ plot provided at the right. Clearly state the assumption in context that
would be checked by examining these two plots.







b. Construct a 99% confidence interval for the population mean difference in airfare prices (21 days before a
flight - 49 days before a flight) for a one-way, non-stop ticket from DTW to all destinations offered by Delta Airlines.







Final Answer: ( ___________________ , ____________________ )

c. Provide an interpretation of your resulting confidence interval in context.









d. Based only on your 99% confidence interval in part (b), what is an appropriate recommendation? Circle one.

• buy airline tickets 21 days before departure


• buy airline tickets 49 days before departure


• no difference in airline ticket prices 21 days versus 49 days before departure, on average

Explain:

Page 6 of 9


6. Caffeine, please ~ Suppose the caffeine amount for a standard 12-ounce, regular milk latte made at Starbucks follows
a normal distribution, with a mean of 64 mg and a standard deviation of 1.5 mg of caffeine. Suppose the amount of
caffeine in a standard 12-ounce, regular milk latte made at Espresso Royale follows a normal distribution, with a mean
of 69 mg and a standard deviation of 2 mg of caffeine.

A UM student, on a strict budget and very sleep deprived, has received a standard 12-ounce, regular milk latte
delivered in a personal cup and is hoping it has a lot of caffeine. The student decides he will measure the amount of
caffeine in the drink and if it is 66 mg or higher, he will conclude the latte is from Espresso Royale.

He decides to view this decision-making process in terms of testing the following hypotheses:

H0: the latte is from Starbucks versus Ha: the latte is from Espresso Royale

Thus, the student will reject H0 if the amount of caffeine in the latte is 66 mg or higher. [6 points]

a. Determine the significance level for this test.











Final Answer: _________________________

b. Find the power of this test.











Final Answer: _________________________


c. The student has just now measured the amount of caffeine in the provided latte and it was found to be 67 mg.
Thus the decision is to reject the null hypothesis.

Now that the decision was made, …

i. …what is the probability that the student has made a type 1 error? _____________________


ii. …what is the probability that the student has made a type 2 error? _____________________
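
For parts (a) and (b), the two probabilities are normal tail areas beyond the 66 mg cutoff, one under each hypothesis. The following is a minimal sketch using the distributions stated in the problem; the variable names are illustrative, and it is not an official solution.

```python
from statistics import NormalDist

# Caffeine distributions stated in the problem (mg).
starbucks = NormalDist(mu=64, sigma=1.5)       # model under H0
espresso_royale = NormalDist(mu=69, sigma=2)   # model under Ha
cutoff = 66                                    # reject H0 when caffeine >= 66 mg

# Significance level: P(reject H0 | H0 true).
alpha = 1 - starbucks.cdf(cutoff)

# Power: P(reject H0 | Ha true); beta is the type 2 error probability.
power = 1 - espresso_royale.cdf(cutoff)
beta = 1 - power

print(f"alpha = {alpha:.4f}, power = {power:.4f}, beta = {beta:.4f}")
```

These are probabilities about the decision rule before any latte is measured, which is the distinction part (c) asks about.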

Page 7 of 9


7. Speed Demons ~ Researchers speculate that drivers who do not wear a seatbelt drive at higher speeds, on average,
than drivers who do wear one. The following summary is based on a random sample of 20 drivers who were
clocked to see how fast they were driving (miles per hour), and then stopped to see whether they were wearing a
seatbelt (yes or no).

Summary Statistics
Group     Mean   Std. Dev.   Sample Size
1 = Yes   65.3   7.487       12
2 = No    72.5   8.816       8
Pooled Std. Dev. (both groups combined): 8.030

We are told that we have random samples, and they are independent samples because the driving speeds of those
who wear seatbelts should not impact the driving speeds of those who do not wear a seatbelt (and vice versa).
Graphical displays have been examined, and it is reasonable to assume that both underlying population distributions
for speeds are approximately normal. [8 points]

a. The researchers have determined it is reasonable to use the pooled method to construct a confidence interval for
the difference between the population mean speeds. Calculate the pooled standard error.









Final Answer: __________________________________

b. Compute a 98% confidence interval for the difference in the population mean speeds (that for drivers wearing a
seatbelt minus that for drivers not wearing a seatbelt).









Final Answer: ( ___________________________ , ___________________________ )

c. At the 2% significance level, can you conclude from your confidence interval in part (b) that, on average, drivers
who do not wear a seatbelt drive at higher speeds than drivers who do wear one?

Circle one: Yes No

Explain:
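
The arithmetic for parts (a) and (b) could be sketched as below. This is a hedged illustration rather than an official solution: the summary values come from the table, the formulas are the standard pooled ones, and the multiplier t* = 2.552 is an assumed table value for 98% confidence with 18 degrees of freedom (it is hard-coded, not computed).

```python
from math import sqrt

# Summary statistics from the table (1 = wears seatbelt, 2 = does not).
mean1, sd1, n1 = 65.3, 7.487, 12
mean2, sd2, n2 = 72.5, 8.816, 8

# Pooled standard deviation and pooled standard error.
df = n1 + n2 - 2
pooled_sd = sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
pooled_se = pooled_sd * sqrt(1 / n1 + 1 / n2)

# ASSUMPTION: t* taken from a t table for 98% confidence, df = 18.
t_star = 2.552

# Interval for mu1 - mu2 (seatbelt minus no seatbelt).
diff = mean1 - mean2
lower = diff - t_star * pooled_se
upper = diff + t_star * pooled_se

print(f"pooled sd = {pooled_sd:.3f}, pooled s.e. = {pooled_se:.4f}")
print(f"98% CI: ({lower:.4f}, {upper:.4f})")
```

The recomputed pooled standard deviation matches the 8.030 given in the table, and the resulting interval in this sketch contains 0.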

Page 8 of 9


8. Naughty or Nice? ~ Yale University graduate student J. Kiley Hamlin conducted an experiment in which 16
ten-month-old babies were asked to watch a climber character attempt to ascend a hill. On two occasions, the baby witnesses
the character fail to make the climb. On the third and fourth attempts, the baby witnesses either a helper toy push
the character up the hill or a hinderer toy prevent the character from making the ascent. The helper and hinderer toys
were shown to each baby in a random order for a fixed amount of time. The baby was then placed in front of each toy
and allowed to choose which toy to play with. In 14 out of 16 cases, the baby chose the helper toy. [5 points]

a. The claim to be tested is that babies prefer the helper toy to the hinderer toy; the hypotheses can be represented by
H0: p = 0.50 versus Ha: p > 0.50. Provide a complete definition for the quantity p.

Let p = ___________________________________________________________________________________

___________________________________________________________________________________

b. Based on the results of the study, report the appropriate p-value.












p-value = _____________________
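
With only 16 babies, one reasonable approach to part (b) is an exact binomial calculation rather than a normal approximation. The sketch below assumes that approach: under H0 the number of babies choosing the helper toy is treated as Binomial(16, 0.5), and the p-value is the probability of 14 or more. Variable names are illustrative.

```python
from math import comb

n = 16          # babies in the experiment
observed = 14   # babies who chose the helper toy
p0 = 0.5        # null value of p

# Upper-tail binomial probability P(X >= 14) under H0, matching Ha: p > 0.50.
p_value = sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
              for k in range(observed, n + 1))

print(f"p-value = {p_value:.4f}")
```

The three terms in the sum correspond to 14, 15, and 16 babies choosing the helper toy, which gives 137/65536.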

9. Inflatable Bounce House: Inflatable bounce houses have a weight limit because there is only so much the inflatable
object can hold up. Suppose an inflatable bounce house has a weight limit of 2200 pounds. The manufacturer knows
that the weight for elementary school kids is approximately normal with mean of 70 pounds and standard deviation
of 10 pounds. In order to avoid having to weigh kids before letting them play in the bounce house, the manufacturer
formulated the weight limit as “holds no more than 30 elementary school kids”. What is the probability that the total
weight for a random sample of 30 elementary school kids (who will enter the inflatable bounce house) will exceed the
2200 pound limit? Hint: remember that the average = total/n. [4 points]
















Final Answer: _________________________
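
Following the hint, the event "total exceeds 2200 pounds" is the same as "sample mean exceeds 2200/30 pounds". The sketch below computes the probability both ways, assuming the 30 weights are independent draws from the stated normal distribution; names are illustrative and this is not an official solution.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 70, 10   # weight of one elementary school kid (pounds)
n = 30               # kids allowed in the bounce house
limit = 2200         # weight limit (pounds)

# Distribution of the total weight of n independent kids.
total = NormalDist(mu=n * mu, sigma=sqrt(n) * sigma)
p_exceed = 1 - total.cdf(limit)

# Equivalent calculation through the sample mean (average = total / n).
sample_mean = NormalDist(mu=mu, sigma=sigma / sqrt(n))
p_exceed_via_mean = 1 - sample_mean.cdf(limit / n)

print(f"P(total > {limit}) = {p_exceed:.4f}")
print(f"P(mean > {limit / n:.4f}) = {p_exceed_via_mean:.4f}")
```

Both routes standardize to the same z value, so the two printed probabilities agree.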

Page 9 of 9


10. Name That Scenario ~ A college department is conducting a study and will gather data for a number of variables for
a random sample of recent graduates. The department chair comes to you for your statistical expertise to help decide
the appropriate method to address various research questions. For each part (a) to (d), you first need to determine if
the most appropriate method to address the problem would be to make a confidence interval (CI) or to conduct a test
of hypotheses (HT). If the appropriate method is a confidence interval (CI), then provide the appropriate symbol for
the corresponding parameter that the confidence interval will be used to estimate. If instead the appropriate method
is a test of hypotheses (HT), then provide the corresponding null and alternative hypotheses. Thus, you complete only
one of the two boxes for each part (a) to (d) below. [8 points]
a. The department chair would like to learn what the 4-year graduation rate is, that is, the rate of their students that
graduate within 4 years.

Make a Confidence Interval (CI) for __________________

Conduct a test of hypotheses (HT) of
H0: ____________________________________
Ha: ____________________________________

b. The department student services coordinator would like to assess if students who regularly see their advisors
(defined as scheduling an appointment each term) graduate faster on average than students who do not regularly
see their advisors.

Make a Confidence Interval (CI) for __________________

Conduct a test of hypotheses (HT) of
H0: ____________________________________
Ha: ____________________________________

c. The department has three different sub-programs for its major. The program chairs would like to determine if
there are any differences, on average, between the GPA for students across these three sub-programs.

Make a Confidence Interval (CI) for __________________

Conduct a test of hypotheses (HT) of
H0: ____________________________________
Ha: ____________________________________

d. The department chair believes that the proportion of graduates that are ‘very satisfied’ with their capstone course
experience is more than 30% and would like to assess this claim.

Make a Confidence Interval (CI) for __________________

Conduct a test of hypotheses (HT) of
H0: ____________________________________
Ha: ____________________________________


When done, please BRING your blue help card, seat number(s), exam, and ID up front to sign out ~ and collect your belongings.
Check Canvas over weekend for further announcements. -- The Stats 250 Instructional Team
