Stat 401B Exam 2 Key F15

Stat 401B Exam 2
Fall 2015
I have neither given nor received unauthorized assistance on this exam.
________________________________________________________
Name Signed Date
_________________________________________________________
Name Printed
ATTENTION!
Incorrect numerical answers unaccompanied by supporting reasoning will receive NO
partial credit.
Correct numerical answers to difficult questions unaccompanied by supporting
reasoning may not receive full credit.
SHOW YOUR WORK/EXPLAIN YOURSELF!
Completely absurd answers (that fail basic sanity checks but that you don't identify as
clearly incorrect) may receive negative credit.
1. Below are some data (and corresponding summary statistics) taken from a paper by Leigh and
Taylor that appeared in the Ceramic Bulletin in 1990. They concern measured densities (in g/cc) of
crushed T-61 tabular alumina powder under r = 4 different measurement protocols.
Protocol   Measured densities (g/cc)       n    ȳ       s
1          2.13, 2.15, 2.15, 2.19, 2.20    5    2.164   .030
2          1.96, 2.01, 1.91, 1.95, 2.00    5    1.966   .040
3          2.23, 2.19, 2.18, 2.21, 2.22    5    2.206   .021
4          1.88, 1.90, 1.87, 1.89, 1.89    5    1.886   .011
Initially, consider only data from Protocol 1.
6 pts a) Give two-sided limits that you are 95% sure would contain the next measured density produced
under Protocol 1. (Plug in completely, but you need not simplify.)
5 pts b) A lab manager wishes to announce with 95% confidence that the "measurement capability"
(defined as 2σ) for Protocol 1 is no worse than some number (say, #). Provide an appropriate number,
#, for this person based on the data above.
Now consider data from Protocols 1 and 4 only.

5 pts c) Do the two samples provide definitive indication that the two measurement protocols have different
precisions (different associated variabilities)? Compute some appropriate statistic and use an
appropriate reference distribution. (Say exactly what reference distribution you are considering and
support a "Yes" or a "No" answer.)
6 pts d) Give 95% two-sided confidence limits for the difference in mean densities produced by Protocols 1
and 4. (Plug in completely, but you need not simplify.)
Now consider data from all 4 protocols.
7 pts e) Find a single-number estimate of the standard deviation of measured density for any fixed protocol
under the one-way normal model.
4 pts f) Below is a normal plot of the 20 values y_ij − ȳ_i. Say what it indicates about the reliability of inferences
based on the one-way normal model in this context. (The line on the plot has intercept 0 and slope
1/sP.)
8 pts g) As it turns out, the grand sample variance of the 20 measured densities recorded above is
0.01937342. Use this fact and your answer to part e) above to complete the ANOVA table below.
(If you were unable to do part e), you may use the incorrect value of .020 here.)
SOURCE SS df MS F
6 pts h) Protocols 1 and 3 were in fact carried out using 6-mesh material while Protocols 2 and 4 were
carried out using 60-mesh material. Compare the average of 6-mesh mean densities to the average of
60-mesh mean densities using two-sided 95% confidence limits and your value of sP from the
ANOVA table in part g). (Plug in completely, but you need not simplify.)
6 pts i) Suppose that at some later time, 30 of 50 measurements made using Protocol 2 produce values less
than 2.00 g/cc. Give a lower 95% confidence bound for the fraction of all Protocol 2 measurements
less than 2.00 g/cc. (Plug in completely, but you need not simplify.)
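As a check on the summary statistics quoted for Problem 1, the following R sketch re-enters the 20 densities from the table and recomputes the per-protocol sample sizes, means, and standard deviations, a pooled standard deviation, and the grand sample variance. The variable names (density, protocol, s_pooled, and so on) are illustrative choices that do not appear in the exam, and the assignment of values to protocols assumes the column layout of the original table.

```r
# Measured densities (g/cc), transcribed from the Problem 1 table.
# Assumption: values are grouped by protocol as in the original columns.
density <- c(2.13, 2.15, 2.15, 2.19, 2.20,   # Protocol 1
             1.96, 2.01, 1.91, 1.95, 2.00,   # Protocol 2
             2.23, 2.19, 2.18, 2.21, 2.22,   # Protocol 3
             1.88, 1.90, 1.87, 1.89, 1.89)   # Protocol 4
protocol <- factor(rep(1:4, each = 5))

# Per-protocol sample sizes, sample means, and sample standard deviations
n    <- tapply(density, protocol, length)
ybar <- tapply(density, protocol, mean)
s    <- tapply(density, protocol, sd)
print(round(cbind(n, ybar, s), 3))

# Pooled standard deviation under the one-way normal model:
# a weighted average of the sample variances, weights n_i - 1
s_pooled <- sqrt(sum((n - 1) * s^2) / sum(n - 1))

# Grand sample variance of all 20 values, ignoring protocol
grand_var <- var(density)

# One-way ANOVA sums of squares: total = treatment + error
ss_total     <- (length(density) - 1) * grand_var
ss_error     <- sum(n - 1) * s_pooled^2
ss_treatment <- ss_total - ss_error
print(c(total = ss_total, treatment = ss_treatment, error = ss_error))
```

If the data are entered as above, var(density) should agree with the grand sample variance quoted in part g).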
2. A data set in Probability and Statistics With R for Engineers & Scientists by M. Akritas concerns
heat produced during hardening of cement as related to the composition of the cement. Available were
y = measured heat produced (calories/gm)
x1 = % tricalcium aluminate
x2 = % tricalcium silicate
x3 = % tetracalcium alumino ferrite
x4 = % dicalcium silicate
values for n = 13 cement batches. There is some R code and output based on these data at the end of
this exam. Use it as appropriate in the rest of the exam.
5 pts a) On what basis would you suggest that x4 is the best single predictor of y (from among the
predictors available here)? (What about the printout suggests this?)
Consider first a simple linear regression of y on x2 until further notice.
5 pts b) Give 95% two-sided confidence limits for the standard deviation of measured heat produced at a
fixed tricalcium silicate percentage. (Plug in completely, but there is no need to simplify.)
5 pts c) Give 95% two-sided confidence limits for the rate of change of mean heat produced with respect to
% tricalcium silicate (in the units of the data). (Plug in completely, but there is no need to simplify.)
5 pts d) For what percentage of tricalcium silicate do these data provide the best information about mean
heat produced? Explain.
Now consider the other predictor variables (not just x2 ).
5 pts e) In a model that includes only predictors x1 and x2, give 95% two-sided limits for the rate of change
of mean heat measurement (in cal/g) with respect to % tricalcium silicate.
7 pts f) Under the model that includes only predictors x1 and x2, give limits that you are 95% sure will
contain a next heat measurement under the conditions that x1 = 7 and x2 = 26.
5 pts g) What fraction of the raw variability in heat produced is accounted for by fitting an equation
involving all of x1, x2, and x3?
5 pts h) Give and interpret the p-value for testing the hypothesis that together the three predictor variables
x1, x2, and x3 fail to be useful in modeling heat produced.
5 pts i) In the presence of x1 and x2, does x3 add (statistically) significantly to one's ability to model heat
produced? Give a p-value and say what hypothesis is being tested in what model.
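For readers who want to reproduce the printout at the end of the exam, here is a minimal R sketch that re-enters the 13 cement batches from the CementVS listing and refits the three models. The object names cement.out1, cement.out2, and cement.out3 follow the printout; new_batch is a hypothetical name introduced here. The calls to confint(), predict() with interval = "prediction", and the two-model form of anova() are standard base R tools shown as one possible route to intervals and tests of the kind asked for above, not necessarily the method intended by the exam.

```r
# Cement data transcribed from the CementVS listing in the printout
CementVS <- data.frame(
  y  = c(78.5, 74.3, 104.3, 87.6, 95.9, 109.2, 102.7,
         72.5, 93.1, 115.9, 83.8, 113.3, 109.4),
  x1 = c(7, 1, 11, 11, 7, 11, 3, 1, 2, 21, 1, 11, 10),
  x2 = c(26, 29, 56, 31, 52, 55, 71, 31, 54, 47, 40, 66, 68),
  x3 = c(6, 15, 8, 8, 6, 9, 17, 22, 18, 4, 23, 9, 8),
  x4 = c(60, 52, 20, 47, 33, 22, 6, 44, 22, 26, 34, 12, 12)
)

# The three fits that appear in the printout
cement.out1 <- lm(y ~ x2, data = CementVS)
cement.out2 <- lm(y ~ x1 + x2, data = CementVS)
cement.out3 <- lm(y ~ x1 + x2 + x3, data = CementVS)

# t-based two-sided 95% confidence limits for the coefficients
print(confint(cement.out1, level = 0.95))
print(confint(cement.out2, level = 0.95))

# 95% prediction limits for one new observation at x1 = 7, x2 = 26
# (new_batch is a hypothetical name, not from the exam)
new_batch <- data.frame(x1 = 7, x2 = 26)
print(predict(cement.out2, newdata = new_batch,
              interval = "prediction", level = 0.95))

# Partial F test comparing the x1 + x2 model with the x1 + x2 + x3 model
print(anova(cement.out2, cement.out3))
```

Note that interval = "prediction" gives limits for a new measurement, whereas the printout uses interval = "confidence", which gives limits for the mean response.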
R Code and Output
> CementVS
       y x1 x2 x3 x4
1   78.5  7 26  6 60
2   74.3  1 29 15 52
3  104.3 11 56  8 20
4   87.6 11 31  8 47
5   95.9  7 52  6 33
6  109.2 11 55  9 22
7  102.7  3 71 17  6
8   72.5  1 31 22 44
9   93.1  2 54 18 22
10 115.9 21 47  4 26
11  83.8  1 40 23 34
12 113.3 11 66  9 12
13 109.4 10 68  8 12
> cor(CementVS)
            y         x1         x2         x3         x4
y   1.0000000  0.7307175  0.8162526 -0.5346707 -0.8213050
x1  0.7307175  1.0000000  0.2285795 -0.8241338 -0.2454451
x2  0.8162526  0.2285795  1.0000000 -0.1392424 -0.9729550
x3 -0.5346707 -0.8241338 -0.1392424  1.0000000  0.0295370
x4 -0.8213050 -0.2454451 -0.9729550  0.0295370  1.0000000
> plot(CementVS)
> cement.out1<-lm(y~x2,data = CementVS)
> summary(cement.out1)
Call:
lm(formula = y ~ x2, data = CementVS)
Residuals:
Min 1Q Median 3Q Max
-10.752 -6.008 -1.684 3.794 21.387
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 57.4237 8.4906 6.763 3.1e-05 ***
x2 0.7891 0.1684 4.686 0.000665 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 9.077 on 11 degrees of freedom

Multiple R-squared: 0.6663, Adjusted R-squared: 0.6359
F-statistic: 21.96 on 1 and 11 DF, p-value: 0.0006648
> anova(cement.out1)
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
x2 1 1809.43 1809.43 21.961 0.0006648 ***
Residuals 11 906.34 82.39
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> predict(cement.out1,se.fit=TRUE,interval="confidence",level=.95)
$fit
fit lwr upr
1 77.94093 68.03527 87.84659
2 80.30830 71.30279 89.31381
3 101.61467 95.35687 107.87247
4 81.88655 73.45303 90.32007
5 98.45817 92.73668 104.17967
6 100.82555 94.73114 106.91996
7 113.45154 103.33218 123.57091
8 81.88655 73.45303 90.32007
9 100.03642 94.08677 105.98607
10 94.51255 88.95500 100.07010
11 88.98867 82.67707 95.30028
12 109.50592 100.87732 118.13452
13 111.08417 101.87504 120.29330
$se.fit
[1] 4.500556 4.091580 2.843181 3.831701 2.599516 2.768946 4.597653 3.831701
[9] 2.703176 2.525028 2.867626 3.920335 4.184094
$df
[1] 11
$residual.scale
[1] 9.077126
>
> cement.out2<-lm(y~x1+x2,data = CementVS)
> summary(cement.out2)
Call:
lm(formula = y ~ x1 + x2, data = CementVS)
Residuals:
Min 1Q Median 3Q Max
-2.893 -1.574 -1.302 1.363 4.048
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 52.57735 2.28617 23.00 5.46e-10 ***
x1 1.46831 0.12130 12.11 2.69e-07 ***
x2 0.66225 0.04585 14.44 5.03e-08 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.406 on 10 degrees of freedom
F-statistic: 229.5 on 2 and 10 DF, p-value: 4.407e-09
> anova(cement.out2)
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
x1 1 1450.1 1450.08 250.43 2.088e-08 ***
x2 1 1207.8 1207.78 208.58 5.029e-08 ***
Residuals 10 57.9 5.79
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> predict(cement.out2,se.fit=TRUE,interval="confidence",level=.95)
$fit
fit lwr upr
1 80.07400 77.38679 82.76122
2 73.25092 70.50710 75.99473
3 105.81474 103.96593 107.66355
4 89.25848 86.61956 91.89740
5 97.29251 95.74212 98.84291
6 105.15249 103.33331 106.97167
7 104.00205 100.77704 107.22706
8 74.57542 71.94224 77.20860
9 91.27549 89.00610 93.54487
10 114.53754 110.56117 118.51391
11 80.53567 78.23565 82.83570
12 112.43724 110.05956 114.81493
13 112.29344 109.81199 114.77489
$se.fit
[1] 1.2060356 1.2314382 0.8297558 1.1843598 0.6958245 0.8164554 1.4473996
[8] 1.1817850 1.0185111 1.7846157 1.0322647 1.0671170 1.1136881
$df
[1] 10
$residual.scale
[1] 2.406335
>
> cement.out3<-lm(y~x1+x2+x3,data = CementVS)
> summary(cement.out3)
Call:
lm(formula = y ~ x1 + x2 + x3, data = CementVS)
Residuals:
Min 1Q Median 3Q Max
-3.2543 -1.4726 0.1755 1.5409 3.9711
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 48.19363 3.91330 12.315 6.17e-07 ***
x1 1.69589 0.20458 8.290 1.66e-05 ***
x2 0.65691 0.04423 14.851 1.23e-07 ***
x3 0.25002 0.18471 1.354 0.209
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.312 on 9 degrees of freedom
F-statistic: 166.3 on 3 and 9 DF, p-value: 3.367e-08
> anova(cement.out3)
Analysis of Variance Table
Response: y
Df Sum Sq Mean Sq F value Pr(>F)
x1 1 1450.08 1450.08 271.2642 4.996e-08 ***
x2 1 1207.78 1207.78 225.9385 1.108e-07 ***
x3 1 9.79 9.79 1.8321 0.2089
Residuals 9 48.11 5.35
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1