
Statistics 252 – Midterm Exam – Paul Cartledge – Winter 2011

Statistics 252 – Midterm Exam

Instructor: Paul Cartledge

Instructions:

1. Read all the instructions carefully.
2. This is a closed book exam.
3. You may use the formula sheets and the tables provided and a calculator only.
4. You have 90 minutes to complete and submit the exam.
5. The exam is out of a total of 50 marks.
6. Show your work in all sections to receive full credit. Final numerical answers should
have THREE significant decimal places (such as 0.00314).
7. Use the backs of the pages for scrap work.
8. Make sure your name and signature are on the front and that your ID number is
on the top of page two.
9. When referring to “log”, I am always referring to the natural log.
10. If no significance level is given, use the “judgment approach”.
11. When asked for a “confidence interval”, state the estimate, the standard error,
and the critical value. Then, calculate and interpret the interval.
12. When asked to “carry out a full analysis in detail”, set up the hypotheses,
calculate the test statistic, state the distribution of the test statistic (such as t9 or
F3,10), provide an exact p-value or range, and state your conclusion in plain
English.

Name: ______________________________________

Signature: ___________________________________

Component       Notes          Worth    Mark

Short Answer    5 questions    10

Long Answer
  Question 6    3 parts        15
  Question 7    4 parts        14
  Question 8    4 parts        11

Total                          50


ID:___________________________

Question 1 (2 marks) At the start of their midterm, a student sees an F-statistic of 5.22.
From a data structure of three groups and eight observations from each group, what is the
range of the corresponding p-value?
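
As a rough check on Question 1, assuming a balanced one-way ANOVA so that the degrees of freedom are (groups - 1) and (total observations - groups), the sketch below works out the df and an upper-tail area. It uses the closed form P(F > f) = (1 + 2f/d2)^(-d2/2), which holds only when the numerator df equals 2. The function names are illustrative and not part of the exam.

```python
def anova_df(groups, n_per_group):
    """Return (numerator df, denominator df) for a balanced one-way ANOVA."""
    total = groups * n_per_group
    return groups - 1, total - groups


def f_upper_tail_df1_is_2(f_stat, d2):
    """Upper-tail probability of an F(2, d2) variable.

    Closed form that is valid only when the numerator df is 2.
    """
    return (1.0 + 2.0 * f_stat / d2) ** (-d2 / 2.0)


d1, d2 = anova_df(groups=3, n_per_group=8)
p = f_upper_tail_df1_is_2(5.22, d2)
print(d1, d2, p)
```

The computed area lies between 0.01 and 0.025, which is the kind of range an F-table would bracket for these degrees of freedom.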

Question 2 (2 marks) Using information from Question 1, apply the “judgment
approach” to make a decision and conclusion about the rejection (OR non-rejection) of
the null hypothesis that claims equality among the three means.

Question 3 (2 marks) Is the following statement true or false? Defend your answer
either way in one or two sentences. Simply an answer of “true” or “false” will not
receive any credit. “It is possible to make causal inferences in an experiment.”

Question 4 (2 marks) A learning statistician attempts to analyse two independent
random samples. Some skewness in the sample distributions requires a log
transformation of the data. Thus, they report a 98% confidence interval of the difference
of the means on the log scale to be (0.049, 0.526). What can you say about the possible
rejection of the null hypothesis testing to see if the medians on the original scale are the
same? Explain your answer in one or two sentences. No calculation required.

Question 5 (2 marks) Using information from Question 4, what is the multiplicative
effect of the medians on the original scale?
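
One way to read Questions 4 and 5, assuming the usual back-transformation in which a difference of means on the log scale corresponds to a ratio of medians on the original scale, is to exponentiate the interval. The sketch below takes the midpoint of the interval as the point estimate, which assumes a symmetric t-based interval. The variable names are illustrative only.

```python
import math

log_ci = (0.049, 0.526)   # 98% CI for the difference of means, log scale

# Point estimate on the log scale: midpoint of a symmetric interval (assumption).
log_estimate = sum(log_ci) / 2

# Back-transform: differences on the log scale become ratios of medians.
ratio_estimate = math.exp(log_estimate)
ratio_ci = tuple(math.exp(end) for end in log_ci)

# Zero on the log scale corresponds to a ratio of 1 on the original scale.
excludes_no_effect = not (ratio_ci[0] <= 1.0 <= ratio_ci[1])

print(ratio_estimate, ratio_ci, excludes_no_effect)
```

Because the log-scale interval excludes 0, the back-transformed interval excludes a ratio of 1.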


Question 6 (15 marks total) If Darwinism has taught us anything, it’s that science is best
understood by analyzing superior beings; since aliens allegedly don’t exist, let’s take a
look at varsity athletes. Keeping it simple, though, let’s analyse their heights. The tables
below summarize: summary statistics of varsity teams (measured in inches); the ANOVA
output; selected linear combinations. Assume all assumptions hold.

Group   Team                       ni    Sample Mean   Sample S.D.
1       Pandas Basketball          15    70.93         2.28
2       Pandas Hockey              25    66.60         2.45
3       Pandas Volleyball          18    71.11         2.45
4       Golden Bears Basketball    14    76.36         4.27
5       Golden Bears Hockey        26    70.92         3.08
6       Golden Bears Volleyball    17    76.29         3.20

Source of Variation   Sum of Squares   df    Mean Square   F-Statistic   p-value
Between (Extra)       1337.34                              30.45
Within (Full)
Total (Reduced)                        114

Contrast Coefficients

                      Type
Contrast     1     2     3     4     5     6
1            1    -1     0     1    -1     0
2            0     1    -1     0     1    -1
3           -1     0     1    -1     0     1
4            1     1     1     0    -3     0
5            2    -1     0     0    -1     0

Contrast Tests (Height, assume equal variances)

Contrast   Value of Contrast   Std. Error   t         df    Sig. (2-tailed)
1           9.770              1.379         7.085    109   .000
2          -9.880              1.301        -7.593    109   .000
3           0.110              1.489          .074    109   .941
4          -4.120              2.113        -1.950    109   .054
5           4.340              1.741         2.493    109   .014

a) (2 marks) If you had to fully analyse “unplanned comparisons” upon the groups, how
many unique pairings of the groups will you need? Also, if the experiment-wise
confidence level is 98.5%, what are the corresponding individual confidence levels?
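
A minimal sketch of the counting in part a), assuming a Bonferroni-style adjustment in which the experiment-wise error rate is split evenly across all pairwise intervals. The exam does not name the adjustment method, so that choice is an assumption here.

```python
import math

groups = 6
experimentwise_level = 0.985

# Unique pairings of six groups: "6 choose 2".
pairs = math.comb(groups, 2)

# Bonferroni (assumption): divide the experiment-wise error rate by the
# number of pairwise comparisons.
alpha_each = (1 - experimentwise_level) / pairs
individual_level = 1 - alpha_each

print(pairs, individual_level)
```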


The following ANOVA output compares average height by grouping the varsity teams by
gender (Pandas are female, Golden Bears are male):
ANOVA: Height

                 Sum of Squares   df    Mean Square   F       Sig.
Between Groups    645.6             1   645.6         44.24   .000
Within Groups    1649.0           113    14.6
Total            2294.6           114

b) (3 marks) Suppose we want to test if a model, where all teams have potentially
different mean heights, is significantly better than this “gender” model. List appropriate
null and alternative hypotheses for such a test, as well as identify SSR and df for the
respective models.

c) Are the volleyball players taller than the basketball players? Carry out an
appropriate test to answer this question.
i) (2 marks) First, define a linear combination of means contrasting the average
heights of volleyball players and basketball players. Fill in the blanks below for
coefficients in your contrast.

γ = ____μ1 + ____μ2 + ____μ3 + ____μ4 + ____μ5 + ____μ6

ii) (2 marks) Give the estimate and standard error for this contrast.

g = ________________________________

S.E.(g) = ________________________________

iii) (3 marks) What are the test statistic and exact p-value for the test to answer the
question listed at the start of part c)?

iv) (3 marks) Make a decision and state your conclusion in plain English.
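
The arithmetic behind a contrast in part c) can be sketched as follows. This assumes the coefficients are chosen as volleyball minus basketball with hockey set to 0, and that the pooled variance (MSW) is recovered from the six-group ANOVA row via F = MSB / MSW. Both choices are assumptions for illustration, not the official solution.

```python
import math

n    = [15, 25, 18, 14, 26, 17]
mean = [70.93, 66.60, 71.11, 76.36, 70.92, 76.29]

# Volleyball (groups 3 and 6) minus basketball (groups 1 and 4); hockey gets 0.
coef = [-1, 0, 1, -1, 0, 1]

# Pooled variance recovered from the ANOVA table: F = MSB / MSW.
ss_between, df_between, f_stat = 1337.34, 5, 30.45
msw = (ss_between / df_between) / f_stat

# Estimate g = sum(c_i * mean_i); S.E.(g) = sqrt(MSW * sum(c_i^2 / n_i)).
g = sum(c * m for c, m in zip(coef, mean))
se = math.sqrt(msw * sum(c * c / k for c, k in zip(coef, n)))
t = g / se

print(g, se, t)
```

Under these assumptions the results agree, to rounding, with the row for contrast 3 in the Contrast Tests table.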


Question 7 (14 marks total) Numerous CEOs at large corporations have recently
become concerned with how much REM (rapid eye movement) sleep they’ve been
having and the average price of their corporation’s stock in the following week.
Consulting an “a-maize-ing” Cobb (yeah, he sounds pretty corny), the data was
surprisingly easy to obtain for the two variables. Time of REM sleep (x) was measured
in minutes and average stock price in the following week (y) was measured in $US.
Assume all regression model assumptions hold.

Here is the SPSS output for the SLR analysis of StockPrice on Sleep.


Model Summary

Model   R          R Square   Adjusted R Square   Std. Error of the Estimate
1       -.862(a)   .743       .732                4.4404
(a) Predictors: (Constant), Sleep
ANOVA(b)

Model           Sum of Squares   df   Mean Square   F        Sig.
1 Regression    1310.957          1   1310.957      66.487   .000(a)
  Residual       453.501         23     19.717
  Total         1764.458         24
(a) Predictors: (Constant), Sleep
(b) Dependent Variable: StockPrice
Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model            B          Std. Error         Beta                        t        Sig.
1 (Constant)     286.209    16.463                                         17.384   .000
  Sleep           -1.304      .160             -.861                       -8.154   .000
(a) Dependent Variable: StockPrice

a) (2 marks) Predict the average stock price of a company in the following week if the
REM sleep of the CEO is 1 hour and 40 minutes.

b) (2 marks) Circle the appropriate words in the parentheses.

Interpreting the correlation value, the variables have a (strong, moderate, weak)
relationship that has a (positive, negative) association.


c) (7 marks) Test if the simple linear regression model is significantly better than the
“one-mean” model. Carry out a full analysis in detail.

d) (3 marks) Calculate a 99% confidence interval for the intercept of the simple linear
regression model above.
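
For parts a) and d), the calculations might be sketched as below. The critical value 2.807 is assumed to be read from a t-table at 23 degrees of freedom with 0.005 in each tail. The variable names are illustrative.

```python
intercept, slope = 286.209, -1.304   # from the Coefficients table
se_intercept = 16.463
df_resid = 23                        # Residual df from the ANOVA table

# a) 1 hour and 40 minutes of REM sleep is 100 minutes.
sleep_minutes = 60 + 40
predicted_price = intercept + slope * sleep_minutes

# d) 99% CI for the intercept: estimate +/- critical value * standard error.
t_crit = 2.807                       # t-table value for df = 23 (assumed input)
margin = t_crit * se_intercept
ci = (intercept - margin, intercept + margin)

print(predicted_price, ci)
```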


Question 8 (11 marks) A YouTube-obsessed physicist (Roy G. Biv) has suddenly
become intrigued with the amount of light being emitted in certain areas of the United
States. In the respective small towns of Daub River and Alfa Ti Vuoto Vizio, Roy and a
friend each used a spectrometer to measure wavelengths at noon on each day over four
months. Randomly sampling 81 observations each from the data obtained in each town,
Roy and his friend, Violet, obtained the following summary statistics (units are in nm).

Summary statistic     DR       ATVV     Difference
Average               577.00   554.00   23.00
Standard Deviation    44.2     50.8     63.0
NOTE: The third column summarizes the differences between the original observations. Depending on
the test you choose, you will use some columns of the table above, but not all of them.

Based on statistical evidence, does light in Daub River have a higher wavelength?

a) (2 marks) Is the above situation two independent samples or a paired sample?

b) (3 marks) Write the appropriate null and alternative hypotheses.

c) (4 marks) Suppose S.E.(Estimate) = 7.482. Calculate the test statistic, state the
distribution of the test statistic, and determine the range of the p-value.

d) (2 marks) Using part c), make a decision and state a conclusion in plain English.
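
A quick way to see which columns of the summary table go with the standard error quoted in part c) is to compute both candidate standard errors. The sketch assumes the unpooled two-sample formula sqrt(s1^2/n1 + s2^2/n2) for independent samples and s_d / sqrt(n) for a paired sample. The variable names are illustrative.

```python
import math

n = 81
sd_dr, sd_atvv, sd_diff = 44.2, 50.8, 63.0
mean_diff = 23.00

# Two independent samples (unpooled): sqrt(s1^2/n + s2^2/n).
se_independent = math.sqrt(sd_dr**2 / n + sd_atvv**2 / n)

# Paired sample: standard deviation of the differences over sqrt(n).
se_paired = sd_diff / math.sqrt(n)

# Test statistic using the S.E. stated in part c).
t_stat = mean_diff / 7.482

print(se_independent, se_paired, t_stat)
```

Only one of the two candidate standard errors rounds to the 7.482 given in part c).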
