
CS7290

Midterm

October 11, 2019

Time: 1 hour 40min

Name (please print):

Show all your work and calculations. Partial credit will be given for work that is partially correct.
Points will be deducted for false statements, even if the final answer is correct. Please circle your
final answer where appropriate.

This exam is closed-book. You may consult one page with your hand-written notes. Calculators
with no online access are permitted.

Honor code: I promise not to cheat on this exam. I will neither give nor receive any unauthorized
assistance. I will not share information about the exam with anyone who may be taking it at a
different time. I have not been told anything about the exam by someone who has taken it earlier.

Signature: Date:

Question Possible Points Actual Points

1 40

2 60

3 15

Total 115

1. For each question, circle whether the statement is true or false, and provide a brief justification.

(a) (5 pts) In a linear regression model with p parameters, Maximum Likelihood estimation
yields an unbiased estimate of the error variance.
TRUE FALSE
Answer:
False. The Maximum Likelihood estimate of σ² is s² = SSE/n, which is biased downward. An
unbiased estimate divides by the degrees of freedom instead: s² = SSE/(n − p).
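A small simulation sketch (illustrative, not part of the exam solutions) of this bias: in a simple linear regression with p = 2 parameters, SSE/n systematically underestimates σ² while SSE/(n − p) does not.

```python
import random

# Simulation sketch: compare SSE/n (MLE) vs SSE/(n - p) as estimates of
# the error variance in a simple linear regression with p = 2 parameters.
random.seed(0)
n, p, sigma2 = 20, 2, 1.0
beta0, beta1 = 1.0, 0.5
x = [i / n for i in range(n)]

mle_sum, unbiased_sum, reps = 0.0, 0.0, 2000
for _ in range(reps):
    y = [beta0 + beta1 * xi + random.gauss(0, sigma2 ** 0.5) for xi in x]
    # Ordinary least-squares fit of y on x
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    mle_sum += sse / n            # MLE: divides by n
    unbiased_sum += sse / (n - p) # unbiased: divides by n - p

print(mle_sum / reps)       # noticeably below sigma2 = 1 (biased)
print(unbiased_sum / reps)  # close to sigma2 = 1 (unbiased)
```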

(b) (5 pts) You are interested in the association between two Normally distributed variables
X and Y . A colleague advises you to calculate the correlation ρ and test H0 : ρ = 0.
Another colleague advises you to calculate the slope of the linear regression Y = β0 + β1 X + ε
and test H0 : β1 = 0. The two approaches are equivalent.
TRUE FALSE
Answer:
True. A regression views one variable as the response (i.e., random) and the other as the pre-
dictor (i.e., fixed), and examines the probability distribution of the response conditional
on the predictor. Correlation views both X and Y as random. If both X and Y are
Normally distributed, there is a direct mathematical relationship between the slope of the
linear regression and the Pearson coefficient of correlation: β1 = ρ · √(σ²Y / σ²X). Therefore the
two hypotheses are equivalent.

(c) (5 pts) P-value is the probability that the null hypothesis is true.
TRUE FALSE
Answer:
False. The p-value is the probability of observing a summary of the data (e.g., a t statistic) at
least as extreme as what we currently observe, given that H0 is true. In frequentist statistics
there is no probability associated with the status of H0.

(d) (5 pts) In a linear regression with one predictor, we test H0 : β1 = 0 vs Ha : β1 ≠ 0.
When the Type I error α decreases, the statistical power increases.
TRUE FALSE
Answer:
False. The relationship is the opposite of what is stated: a smaller Type I error α leads to a
more conservative test, and hence to a loss of statistical power.

(e) (5 pts) The probability of incorrectly rejecting at least one H0 increases with the number
of tests.
TRUE FALSE
Answer:
True.

P {Type I error1 ∪ Type I error2 }


= P {Type I error1 } + P {Type I error2 } − P {Type I error1 ∩ Type I error2 }
≥ P {Type I error1 }
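The growth of the familywise error rate is easy to see numerically. A sketch, assuming the tests are independent (an extra assumption, not required by the union-bound argument above): the probability of at least one false rejection is then 1 − (1 − α)^m.

```python
# Sketch (assuming independent tests): the familywise error rate
# P(at least one false rejection) = 1 - (1 - alpha)^m grows with m.
alpha = 0.05
for m in (1, 2, 5, 10, 20):
    fwer = 1 - (1 - alpha) ** m
    print(m, round(fwer, 4))
```

For m = 2 independent tests this gives 0.0975, already well above the per-test level 0.05.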

(f) (5 pts) 95% confidence interval indicates that the probability that the true value falls
into this interval is 95%.
TRUE FALSE
Answer:
False. A 95% CI indicates that the lower and the upper bounds cover the true value with
the probability of 95%. In frequentist statistics there is no probability associated with the
status of the true value.

(g) (5 pts) In a linear regression with one predictor and fixed values of X, when the sample
size increases by a factor of k, the width of the confidence interval for the expected value
of a new observation decreases by a factor of √k.
TRUE FALSE
Answer:

True. With the X values fixed, both 1/n and (Xh − X̄)²/Σ(Xi − X̄)² scale as 1/k, so the
standard error of Ŷ, and hence the width of the interval, decreases by a factor of √k.

(h) (5 pts) In a linear regression with one predictor and fixed values of X, when the sample
size increases by a factor of k, the width of the prediction interval for a new observation
decreases by a factor of √k.
TRUE FALSE
Answer:
False. The width of the prediction interval has an irreducible component, the MSE (which
estimates the error variance σ²), and this component does not decrease with the sample size.

2. College admissions officers are interested in predicting the college GPA of the admitted stu-
dents (Y ) from their high school scores. They consider three possible predictors: high school
math score (X1 ), high school science score (X2 ), and high school English score (X3 ). They col-
lected the data on n = 224 current students. The scatterplot and the tables below summarize
the data, and the results of fitting three regression models.
Summary             Y        X1       X2       X3
Sample mean         2.6352   8.3214   8.0892   8.0937
Sample variance s²  0.6074   2.6854   2.8888   2.2736

              Model 1   Model 2   Model 3
Predictors    X1        X1, X2    X1, X2, X3
b0            0.9076    0.7404    0.5898
b1            0.2076    0.1756    0.1685
b2                      0.0535    0.0451
b3                                0.0343
R²            0.1905    0.1997    0.2046

[Figure: scatterplot matrix of the pairwise relationships among gpa (Y), math (X1), science (X2), and english (X3).]

(a) (5 pts) Consider Model 1. State all the assumptions, and express them in mathematical
notation used in class.
Answer:
Yi = β0 + β1 Xi1 + εi, where εi ~ iid N(0, σ²), i = 1, . . . , n.

(b) (5 pts) Which assumptions above may potentially be violated, and why?
Answer:
Normality of the response, since the scores are bounded (GPA lies between 0 and 4). Independence of
the students, if they have studied in the same classes or similar programs. There is no strong
evidence against constant variance.

(c) (5 pts) Report the correlation between X1 and Y .


Answer:
r(X1, Y) = √R² = √0.1905 = 0.4365 (taking the positive root, since b1 > 0).

(d) (5 pts) Consider Model 1. Report the SSTO and SSE in this model. [Hint: use the
partial information in the two tables at the beginning of the problem.]
Answer:
SST O = (n − 1) · s2Y = 223 · 0.6074 = 135.4502
SSE = (1 − R2 ) · SST O = (1 − 0.1905) · 135.4502 = 109.6469
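The arithmetic can be verified in a few lines of Python (a check of the calculation above, using only the values from the tables):

```python
# Numeric check of part (d): SSTO from the sample variance of Y,
# SSE from R^2 of Model 1.
n, s2_y, r2 = 224, 0.6074, 0.1905
ssto = (n - 1) * s2_y   # 223 * 0.6074
sse = (1 - r2) * ssto
print(round(ssto, 4), round(sse, 4))   # 135.4502 109.6469
```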

(e) (5 pts) Consider Model 1. Calculate the 95% confidence interval for the mean college
GPA of a student with a score of 5 in high school math.
Answer:
The confidence interval is ŶX1=5 ± t(1 − 0.05/2, 224 − 2) · √(Var{ŶX1=5}), where Var{ŶX1=5}
denotes the estimated variance of the fitted value.
ŶX1=5 = 0.9076 + 0.2076 · 5 = 1.9456.
With 222 degrees of freedom, the Student distribution is approximated by the Normal
distribution, i.e., t(1 − 0.05/2, 222) ≈ z(1 − 0.05/2) = 1.96.

Var{ŶX1=5} = (SSE/(n − 2)) · (1/n + (Xh − X̄1)² / Σi (Xi1 − X̄1)²)
           = (SSE/(n − 2)) · (1/n + (Xh − X̄1)² / ((n − 1) · s²X1))
           = (109.6469/222) · (1/224 + (5 − 8.3214)²/(223 · 2.6854)) = 0.0113

The CI is 1.9456 ± 1.96 · √0.0113 = 1.9456 ± 1.96 · 0.1063, i.e., (1.7372, 2.1540).
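As a numeric check of this interval (reusing n, the Model 1 coefficients, and the summary statistics from the tables, with t(0.975, 222) ≈ 1.96):

```python
import math

# Numeric check of part (e): 95% CI for the mean GPA at X1 = 5 (Model 1).
n = 224
b0, b1 = 0.9076, 0.2076
xbar, s2_x = 8.3214, 2.6854
sse = 109.6469
mse = sse / (n - 2)
xh = 5
yhat = b0 + b1 * xh   # 1.9456
var_hat = mse * (1 / n + (xh - xbar) ** 2 / ((n - 1) * s2_x))
half = 1.96 * math.sqrt(var_hat)
print(round(yhat, 4), round(var_hat, 4))
print(round(yhat - half, 4), round(yhat + half, 4))   # about (1.7372, 2.1540)
```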

(f) (5 pts) Consider Model 1. Calculate the 95% prediction interval for the college GPA of
a future applicant with a score of 5 in high school math.
Answer:
The prediction interval is ŶX1=5 ± t(1 − 0.05/2, 224 − 2) · √(MSE + Var{ŶX1=5}).
1.9456 ± 1.96 · √(109.6469/222 + 0.0113) = 1.9456 ± 1.96 · 0.7107, i.e., (0.5524, 3.3387).
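A numeric check of the prediction interval, reusing MSE and the estimated variance of the fitted value from part (e):

```python
import math

# Numeric check of part (f): 95% prediction interval at X1 = 5 (Model 1).
mse = 109.6469 / 222
var_hat = 0.0113   # Var{Yhat at X1=5} from part (e)
yhat = 1.9456
half = 1.96 * math.sqrt(mse + var_hat)
lower, upper = yhat - half, yhat + half
print(round(lower, 4), round(upper, 4))   # about (0.5524, 3.3387)
```

Note how much wider this is than the confidence interval: the MSE term dominates and does not shrink with n.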

(g) (5 pts) Suppose that the officers want to repeat the study in the future, and fit Model
1, but this time they can only collect data on 51 students. What will be the statistical
power of rejecting H0 : β1 = 0 in this study? Assume that the true value of slope is 0.2
(similar to that in the current dataset), that the error variance in the future study is 0.5
(similar to the current dataset), and that s2X1 and s2Y will be as in the current dataset.
Use α = 0.05.
Answer:
H0 : β1 = 0 vs Ha : β1 ≠ 0. The relationship between the quantities above is

σ² / ((n − 1) · s²X1) = ( (β1 − 0) / (z(1 − α/2) + z(power)) )²

Substituting the values estimated from the current experiment, and solving for the power:

0.5 / (50 · 2.6854) = ( 0.2 / (1.96 + z(power)) )²
z(power) = 0.2 · √(50 · 2.6854 / 0.5) − 1.96
power = Φ( 0.2 · √(50 · 2.6854 / 0.5) − 1.96 ) = Φ(1.3174) = 0.9061

where Φ is the CDF of N(0, 1).


In other words, even this much smaller sample size still provides high statistical power.
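The power calculation can be checked numerically; a Python sketch, computing Φ via math.erf rather than a normal table:

```python
import math

# Numeric check of part (g): power of the test of H0: beta1 = 0 with n = 51.
def phi(x):
    # standard normal CDF via the error function
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

n, s2_x, sigma2, beta1, z_crit = 51, 2.6854, 0.5, 0.2, 1.96
z_power = beta1 * math.sqrt((n - 1) * s2_x / sigma2) - z_crit
print(round(z_power, 4), round(phi(z_power), 4))   # about 1.3174 and 0.9061
```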

(h) (5 pts) Test H0 : β2 = β3 = 0 against Ha : β2 ≠ 0 or β3 ≠ 0. Use the significance level
α = 0.05. [Hint: use the information in the two tables at the beginning of the problem.]
Answer:
Since R² = 1 − SSE/SSTO, we have SSE = SSTO · (1 − R²).

F = { [SSE(Reduced) − SSE(Full)] / [dfE(Reduced) − dfE(Full)] } / { SSE(Full) / dfE(Full) }
  = { [SSTO · (1 − R²(Reduced)) − SSTO · (1 − R²(Full))] / 2 } / { SSTO · (1 − R²(Full)) / (224 − 4) }
  = { [R²(Full) − R²(Reduced)] / 2 } / { (1 − R²(Full)) / 220 }
  = ( (0.2046 − 0.1905)/2 ) / ( (1 − 0.2046)/220 ) = 1.9499 < F(1 − 0.05, 2, 220) ≈ 2.99

Conclude: no evidence against H0.
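A numeric check of the F statistic, using the simplified form in terms of the two R² values:

```python
# Numeric check of part (h): F statistic for H0: beta2 = beta3 = 0,
# expressed through R^2 of the full and reduced models.
r2_full, r2_red = 0.2046, 0.1905
df_num, df_den = 2, 220
f = ((r2_full - r2_red) / df_num) / ((1 - r2_full) / df_den)
print(round(f, 4))   # about 1.95, below the critical value of roughly 3.0
```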

(i) (5 pts) True/False For the test above, there is an equivalent test using Student T
distribution.
TRUE FALSE
Answer:
False. The equivalence between the F test and the Student t test only exists when we test
a simple null hypothesis H0 : βj = 0; in that case, t²(1 − α/2, n − p) = F(1 − α, 1, n − p).
There is no such relationship when we test composite null hypotheses involving more
than one parameter of the linear regression.

(j) (5 pts) Calculate and interpret the partial coefficient of multiple determination of X2 ,
given that X1 is already in the model. [Hint: use the information in the two tables at
the beginning of the problem.]
Answer:
Similarly to 2(h),

R²(X2|X1) = [SSE(X1) − SSE(X1, X2)] / SSE(X1) = [R²(X1, X2) − R²(X1)] / [1 − R²(X1)]
          = (0.1997 − 0.1905) / (1 − 0.1905) = 0.01136

Conclude: the extra variation explained by X2, given that X1 is already in the model, is
almost 0.
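As a numeric check of the partial coefficient of determination:

```python
# Numeric check of part (j): partial R^2 of X2 given X1,
# from the R^2 values of Models 1 and 2.
r2_x1, r2_x1x2 = 0.1905, 0.1997
partial = (r2_x1x2 - r2_x1) / (1 - r2_x1)
print(round(partial, 4))   # about 0.0114, essentially zero
```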

(k) (5 pts) True/False If the college GPA is reported on a new scale, obtained by multiplying
Y by 10, this will not change the values of R² and partial R².
TRUE FALSE
Answer:
True. R² and partial R² are proportions of the total variation in the response that are
explained by its linear association with the predictors. Multiplying the response by a
constant will not change these proportions.

(l) (5 pts) Given the results above, what model do you recommend to the admissions
officers, and what are the potential limitations of the chosen model?
Answer:
The predictors are collinear. The first predictor (math) has a moderate predictive ability.
The remaining two predictors do not add more information. If the model with a single
predictor is chosen, inference should be used with caution due to potential violations of
the model assumptions (such as deviations from the Normal distribution of Y given X).

3. An article used multiple regression to predict attitudes towards homosexuality. The re-
searchers found that the effect of number of years of education on a measure of support of
homosexuality varied from essentially no effect for political conservatives to a strong positive
effect for political liberals.

(a) (5 pts) Draw a sketch, and explain how this is an example of statistical interaction.
Answer:
The intercepts and the slopes of the linear relationships between years of education and
level of support differ by political affiliation.

[Sketch: support (vertical axis) vs. education (horizontal axis); a steep positive line labeled
Liberal and a nearly flat line labeled Conservative.]

(b) (5 pts) State a multiple regression model for this study, using conservative political
views as the baseline. Define all the variables, and state the assumptions. Spell out the
design matrix X for this model, and the expected values of Y for each group. Assume
2 observations from the conservative group, and two from the liberal group.
Answer:
Define Y (support), X1 (years of education), and X2 = I{liberal}. The model is
Yi = β0 + β1 Xi1 + β2 Xi2 + β3 Xi1 · Xi2 + εi,  εi ~ iid N(0, σ²),  i = 1, . . . , 4.
The design matrix is

        [ 1  X11  0  0   ]
    X = [ 1  X21  0  0   ]
        [ 1  X31  1  X31 ]
        [ 1  X41  1  X41 ]

E{Y |conservative} = β0 + β1 X1 and E{Y |liberal} = (β0 + β2 ) + (β1 + β3 )X1

(c) (5 pts) State a multiple regression model for this study, using the average population as
the baseline (assume that the population has an equal number of liberals and conserva-
tives). Define all the variables, and state the assumptions. Spell out the design matrix
X for this model, and the expected values of Y for each group. Assume 2 observations
from the conservative group, and two from the liberal group.
Answer:
Define Y (support), X1 (years of education), and X2 = −1 if conservative, X2 = 1 if
liberal. The model is Yi = β0 + β1 Xi1 + β2 Xi2 + β3 Xi1 · Xi2 + εi,  εi ~ iid N(0, σ²),
i = 1, . . . , 4. The design matrix is

        [ 1  X11  −1  −X11 ]
    X = [ 1  X21  −1  −X21 ]
        [ 1  X31   1   X31 ]
        [ 1  X41   1   X41 ]

E{Y |conservative} = (β0 −β2 )+(β1 −β3 )X1 and E{Y |liberal} = (β0 +β2 )+(β1 +β3 )X1
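Both codings can be sketched programmatically; the education values below are made up purely for illustration (two conservatives followed by two liberals, as in the problem):

```python
# Sketch: the two design matrices from parts (b) and (c), for hypothetical
# education values. Rows 1-2 are conservatives, rows 3-4 are liberals.
educ = [12, 16, 12, 16]   # hypothetical years of education
group = [0, 0, 1, 1]      # I{liberal}

# (b) Dummy (reference-cell) coding, conservative as baseline:
X_dummy = [[1, e, g, e * g] for e, g in zip(educ, group)]

# (c) Effect coding, population average as baseline:
sign = [2 * g - 1 for g in group]   # -1 conservative, +1 liberal
X_effect = [[1, e, s, e * s] for e, s in zip(educ, sign)]

for row in X_dummy:
    print(row)
for row in X_effect:
    print(row)
```

The interaction column is simply the elementwise product of the education column and the group column, which is what makes the slopes differ between groups.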

Standard Normal Probabilities

Table entry for z is the area under the standard normal curve to the left of z.

z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

–3.4 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0002
–3.3 .0005 .0005 .0005 .0004 .0004 .0004 .0004 .0004 .0004 .0003
–3.2 .0007 .0007 .0006 .0006 .0006 .0006 .0006 .0005 .0005 .0005
–3.1 .0010 .0009 .0009 .0009 .0008 .0008 .0008 .0008 .0007 .0007
–3.0 .0013 .0013 .0013 .0012 .0012 .0011 .0011 .0011 .0010 .0010
–2.9 .0019 .0018 .0018 .0017 .0016 .0016 .0015 .0015 .0014 .0014
–2.8 .0026 .0025 .0024 .0023 .0023 .0022 .0021 .0021 .0020 .0019
–2.7 .0035 .0034 .0033 .0032 .0031 .0030 .0029 .0028 .0027 .0026
–2.6 .0047 .0045 .0044 .0043 .0041 .0040 .0039 .0038 .0037 .0036
–2.5 .0062 .0060 .0059 .0057 .0055 .0054 .0052 .0051 .0049 .0048
–2.4 .0082 .0080 .0078 .0075 .0073 .0071 .0069 .0068 .0066 .0064
–2.3 .0107 .0104 .0102 .0099 .0096 .0094 .0091 .0089 .0087 .0084
–2.2 .0139 .0136 .0132 .0129 .0125 .0122 .0119 .0116 .0113 .0110
–2.1 .0179 .0174 .0170 .0166 .0162 .0158 .0154 .0150 .0146 .0143
–2.0 .0228 .0222 .0217 .0212 .0207 .0202 .0197 .0192 .0188 .0183
–1.9 .0287 .0281 .0274 .0268 .0262 .0256 .0250 .0244 .0239 .0233
–1.8 .0359 .0351 .0344 .0336 .0329 .0322 .0314 .0307 .0301 .0294
–1.7 .0446 .0436 .0427 .0418 .0409 .0401 .0392 .0384 .0375 .0367
–1.6 .0548 .0537 .0526 .0516 .0505 .0495 .0485 .0475 .0465 .0455
–1.5 .0668 .0655 .0643 .0630 .0618 .0606 .0594 .0582 .0571 .0559
–1.4 .0808 .0793 .0778 .0764 .0749 .0735 .0721 .0708 .0694 .0681
–1.3 .0968 .0951 .0934 .0918 .0901 .0885 .0869 .0853 .0838 .0823
–1.2 .1151 .1131 .1112 .1093 .1075 .1056 .1038 .1020 .1003 .0985
–1.1 .1357 .1335 .1314 .1292 .1271 .1251 .1230 .1210 .1190 .1170
–1.0 .1587 .1562 .1539 .1515 .1492 .1469 .1446 .1423 .1401 .1379
–0.9 .1841 .1814 .1788 .1762 .1736 .1711 .1685 .1660 .1635 .1611
–0.8 .2119 .2090 .2061 .2033 .2005 .1977 .1949 .1922 .1894 .1867
–0.7 .2420 .2389 .2358 .2327 .2296 .2266 .2236 .2206 .2177 .2148
–0.6 .2743 .2709 .2676 .2643 .2611 .2578 .2546 .2514 .2483 .2451
–0.5 .3085 .3050 .3015 .2981 .2946 .2912 .2877 .2843 .2810 .2776
–0.4 .3446 .3409 .3372 .3336 .3300 .3264 .3228 .3192 .3156 .3121
–0.3 .3821 .3783 .3745 .3707 .3669 .3632 .3594 .3557 .3520 .3483
–0.2 .4207 .4168 .4129 .4090 .4052 .4013 .3974 .3936 .3897 .3859
–0.1 .4602 .4562 .4522 .4483 .4443 .4404 .4364 .4325 .4286 .4247
–0.0 .5000 .4960 .4920 .4880 .4840 .4801 .4761 .4721 .4681 .4641

Table A.5. Percentiles of the t-Distribution
df 90% 95% 97.5% 99% 99.5% 99.9%
1 3.078 6.314 12.706 31.821 63.657 318.309
2 1.886 2.920 4.303 6.965 9.925 22.327
3 1.638 2.353 3.183 4.541 5.841 10.215
4 1.533 2.132 2.777 3.747 4.604 7.173
5 1.476 2.015 2.571 3.365 4.032 5.893
6 1.440 1.943 2.447 3.143 3.708 5.208
7 1.415 1.895 2.365 2.998 3.500 4.785
8 1.397 1.860 2.306 2.897 3.355 4.501
9 1.383 1.833 2.262 2.822 3.250 4.297
10 1.372 1.812 2.228 2.764 3.169 4.144
11 1.363 1.796 2.201 2.718 3.106 4.025
12 1.356 1.782 2.179 2.681 3.055 3.930
13 1.350 1.771 2.160 2.650 3.012 3.852
14 1.345 1.761 2.145 2.625 2.977 3.787
15 1.341 1.753 2.132 2.603 2.947 3.733
16 1.337 1.746 2.120 2.584 2.921 3.686
17 1.333 1.740 2.110 2.567 2.898 3.646
18 1.330 1.734 2.101 2.552 2.879 3.611
19 1.328 1.729 2.093 2.540 2.861 3.580
20 1.325 1.725 2.086 2.528 2.845 3.552
21 1.323 1.721 2.080 2.518 2.831 3.527
22 1.321 1.717 2.074 2.508 2.819 3.505
23 1.319 1.714 2.069 2.500 2.807 3.485
24 1.318 1.711 2.064 2.492 2.797 3.467
25 1.316 1.708 2.060 2.485 2.788 3.450
26 1.315 1.706 2.056 2.479 2.779 3.435
27 1.314 1.703 2.052 2.473 2.771 3.421
28 1.313 1.701 2.048 2.467 2.763 3.408
29 1.311 1.699 2.045 2.462 2.756 3.396
30 1.310 1.697 2.042 2.457 2.750 3.385
40 1.303 1.684 2.021 2.423 2.705 3.307
80 1.292 1.664 1.990 2.374 2.639 3.195
∞ 1.282 1.645 1.960 2.326 2.576 3.090

