# Stat 331 - Tutorial 1

Suppose we would like to investigate the relationship between the maintenance cost (in thousands of dollars) of Boeing 787 aircraft and the number of flight hours. The dataset "airline_slr.txt" is available on Learn and contains data for 100 aircraft. Consider the model

$$y_i = \beta_0 + \beta_1 x_i + \epsilon_i, \qquad \epsilon_i \sim N(0, \sigma^2), \qquad i = 1, 2, \ldots, 100,$$

where

- $y$ is the maintenance cost (in thousands of \$) of a randomly selected Boeing 787 aircraft,
- $x$ is the corresponding number of flight hours.

You are provided with the following output from R:

    dataset = read.table("airline_slr.txt", header = T)
    x = dataset$hours
    y = dataset$cost
    mean(x)
    ## [1] 148.8
    mean(y)
    ## [1] 1822
    sum((x - mean(x))^2)
    ## [1] 731093
    sum((y - mean(y))^2)
    ## [1] 47854282
    sum((x - mean(x)) * (y - mean(y)))
    ## [1] 4686071
    slr_model = lm(cost ~ hours, data = dataset)
    summary(slr_model)
    ##
    ## Call:
    ## lm(formula = cost ~ hours, data = dataset)
    ##
    ## Residuals:
    ##    Min     1Q Median     3Q    Max
    ## -815.4 -346.5   50.2  340.3  658.8
    ##
    ## Coefficients:
    ##             Estimate Std. Error t value Pr(>|t|)
    ## (Intercept)  868.487     85.588    10.2   <2e-16 ***
    ## hours          6.410      0.499    12.8   <2e-16 ***
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ##
    ## Residual standard error: 426 on 98 degrees of freedom
    ## Multiple R-squared:  0.628, Adjusted R-squared:  0.624
    ## F-statistic:  165 on 1 and 98 DF,  p-value: <2e-16
    ##
    sum(slr_model$residuals^2)
    ## [1] 17818085

For the remaining parts, you may require the following information:

- $t_{0.025}(98) = 1.9845$, $t_{0.025}(99) = 1.9842$, $t_{0.05}(98) = 1.6606$, $t_{0.05}(99) = 1.6604$
- $F_{0.025}(1, 98) = 5.1818$, $F_{0.025}(1, 99) = 5.1802$, $F_{0.05}(1, 98) = 3.9381$, $F_{0.05}(1, 99) = 3.9371$

1. Verify the least squares estimates of $\beta_0$ and $\beta_1$ from the model summary (using the formulas derived in class).

Solution:

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{4686071}{731093} = 6.4097$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 1822 - 6.4097 \cdot (148.8) = 868.4874$$

Both values agree with the estimates 6.410 and 868.487 in the model summary. (Plugging in the means exactly as printed gives about 868.24 for the intercept; the gap is attributable to rounding in the printed values of $\bar{x}$ and $\bar{y}$.)

2. Verify the estimate of $\sigma$ from the model summary.

Solution:

$$\hat{\sigma} = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - 2}} = \sqrt{\frac{17818085}{98}} = 426.4003$$

3. Use the $R^2$ value to comment on the goodness of fit of the model.

Solution: An $R^2$ of 0.628 means that about 62.8% of the variation in maintenance cost is explained by flight hours, which implies that the model provides a fair fit to the data.

4. Calculate the value of the sample correlation coefficient.

5. Test $H_0 : \beta_1 = 0$ vs $H_1 : \beta_1 \neq 0$ at a 5% significance level using the t-test. Confirm your conclusion using the ANOVA F-test.

Solution: Assuming $H_0 : \beta_1 = 0$, the test statistic is

$$t^* = \frac{\hat{\beta}_1}{\text{s.e.}(\hat{\beta}_1)} = 12.853$$

Since $t^* > t_{0.025}(98) = 1.9845$, we reject $H_0$ in favour of $H_1$ at a 5% level. Alternatively, the p-value provided in the model summary is less than $2 \times 10^{-16}$, so we also reject $H_0$ at a 5% level.

F-test: From the model summary, the F statistic is $F^* = 165 > F_{0.05}(1, 98) = 3.9381$, so we reject $H_0$ at a 5% level.

6. Test (at a 5% significance level) the hypothesis that on average, for a 1 hour increase in flight time, the maintenance cost increases by more than \$6000. (Remember that in the dataset, the maintenance cost is given in thousands of dollars.)

Solution: Test $H_0 : \beta_1 \leq 6$ vs $H_1 : \beta_1 > 6$. Assuming $H_0$, the test statistic is

$$t^* = \frac{\hat{\beta}_1 - 6}{\text{s.e.}(\hat{\beta}_1)} = 0.8215$$
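The calculations in questions 1 to 6 can be reproduced from the printed summary statistics alone, without the raw data file. The following is a minimal sketch under a few assumptions: the variable names (`b1`, `se_b1`, `t_six`, and so on) are illustrative and not from the handout, the slope standard error is taken as $\hat{\sigma} / \sqrt{S_{xx}}$, the sample correlation as $S_{xy} / \sqrt{S_{xx} S_{yy}}$, and the F statistic uses the simple linear regression identity $F = t^2$.

```r
# Summary statistics copied from the R output in the handout
n     <- 100
x_bar <- 148.8      # mean(x), as printed (rounded)
y_bar <- 1822       # mean(y), as printed (rounded)
Sxx   <- 731093     # sum((x - mean(x))^2)
Syy   <- 47854282   # sum((y - mean(y))^2)
Sxy   <- 4686071    # sum((x - mean(x)) * (y - mean(y)))
SSE   <- 17818085   # sum(slr_model$residuals^2)

# Question 1: least squares estimates
b1 <- Sxy / Sxx
b0 <- y_bar - b1 * x_bar   # slightly off the summary value because the means are rounded

# Question 2: residual standard error, plus the standard error of the slope
sigma_hat <- sqrt(SSE / (n - 2))
se_b1     <- sigma_hat / sqrt(Sxx)

# Questions 3 and 4: R-squared and the sample correlation coefficient
r_squared <- 1 - SSE / Syy
r         <- Sxy / sqrt(Sxx * Syy)

# Question 5: two-sided t-test of beta1 = 0 and the equivalent F statistic
t_zero <- b1 / se_b1
f_stat <- t_zero^2

# Question 6: one-sided t-test of beta1 <= 6 against beta1 > 6
t_six <- (b1 - 6) / se_b1

# Critical values at the 5% level with n - 2 = 98 degrees of freedom
crit_two_sided <- qt(0.975, df = n - 2)
crit_one_sided <- qt(0.95, df = n - 2)

print(round(c(b1 = b1, b0 = b0, sigma_hat = sigma_hat, se_b1 = se_b1), 4))
print(round(c(r_squared = r_squared, r = r), 4))
print(round(c(t_zero = t_zero, f_stat = f_stat, t_six = t_six), 4))
print(c(reject_beta1_eq_0 = abs(t_zero) > crit_two_sided,
        reject_beta1_le_6 = t_six > crit_one_sided))
```

Run as is, the slope, residual standard error and both t statistics should agree with the values 6.4097, 426.4003, 12.853 and 0.8215 in the solutions, and the correlation comes out at roughly 0.79. The intercept `b0` lands near 868.24 rather than 868.487 because only the rounded means are available here.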

Since $t^* < t_{0.05}(98) = 1.6606$, we do not reject $H_0$.

7. Construct a 95% confidence interval to support the conclusion in the previous part.

Solution: 95% C.I. for $\beta_1$:

$$\hat{\beta}_1 \pm t_{0.025}(98) \cdot \text{s.e.}(\hat{\beta}_1) \implies (5.42, 7.3993)$$

Informally, the confidence interval tells us that, given the sample, it is plausible that either $\beta_1 \leq 6$ or $\beta_1 > 6$. As such, in the previous part, we couldn't reject the null hypothesis in favour of the alternative. Had the whole confidence interval been strictly greater than 6, we would have rejected $H_0$ in favour of $H_1$ at a 2.5% level. To be more concrete, a two-sided 95% interval corresponds to a one-sided test at the 2.5% level, and the hypothesis in question 6 is also not rejected at that level, since $t^* = 0.8215 < t_{0.025}(98) = 1.9845$.

8. For a certain Boeing 787 aircraft, it is estimated that the number of flight hours in the next month would be 120. Predict the maintenance cost for this aircraft and construct a 95% prediction interval. Recall that

$$V(\hat{y}_p - y_p) = \sigma^2 \left( 1 + \frac{1}{n} + \frac{(x_p - \bar{x})^2}{S_{xx}} \right)$$

Solution:

Predicted value: $\hat{\mu}_p = \hat{\beta}_0 + \hat{\beta}_1 x_p = \hat{\beta}_0 + \hat{\beta}_1 \cdot 120 = 1637.6485$

95% P.I.:

$$\hat{\mu}_p \pm t_{0.025}(98) \cdot \text{s.e.}(\hat{y}_p - y_p) \implies (786.7728, 2488.5242)$$

9. A month has passed and for an aircraft that has flown for 120 hours, the maintenance cost was 2,600 (in thousands of dollars). Assuming the model assumptions are satisfied, explain briefly why you do, or do not, believe that this is too costly and that other external factors have changed significantly since the data was collected.

Solution: Based on the prediction interval, we were fairly confident that the cost would lie between 786 and 2,489. A cost of 2,600 could possibly imply that external factors have changed. However, since the residual standard error is very large, maybe we should have created a prediction interval with a higher prediction level (say 99%). A 99% prediction interval is (511.3051, 2763.9918), and we now see that 2,600 lies in this interval.
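The intervals in questions 7 to 9 can likewise be checked from the printed summary statistics. This is a minimal sketch, not the handout's own code: the variable names are illustrative, the intercept 868.4874 is taken from the solution to question 1 rather than recomputed, the slope standard error is assumed to be $\hat{\sigma} / \sqrt{S_{xx}}$, and `qt()` supplies the critical values, including the 99% one that is not in the table of provided values.

```r
# Summary statistics and estimates taken from the handout
n     <- 100
x_bar <- 148.8            # mean(x), as printed (rounded)
Sxx   <- 731093           # sum((x - mean(x))^2)
SSE   <- 17818085         # sum(slr_model$residuals^2)
b0    <- 868.4874         # intercept, from the solution to question 1
b1    <- 4686071 / Sxx    # slope, Sxy / Sxx

sigma_hat <- sqrt(SSE / (n - 2))
se_b1     <- sigma_hat / sqrt(Sxx)

# Question 7: 95% confidence interval for beta1
t_975 <- qt(0.975, df = n - 2)
ci_b1 <- b1 + c(-1, 1) * t_975 * se_b1

# Question 8: point prediction and 95% prediction interval at x_p = 120
x_p     <- 120
mu_hat  <- b0 + b1 * x_p
se_pred <- sigma_hat * sqrt(1 + 1 / n + (x_p - x_bar)^2 / Sxx)
pi_95   <- mu_hat + c(-1, 1) * t_975 * se_pred

# Question 9: 99% prediction interval, and where the observed cost falls
pi_99    <- mu_hat + c(-1, 1) * qt(0.995, df = n - 2) * se_pred
observed <- 2600

print(round(ci_b1, 4))
print(round(mu_hat, 4))
print(round(pi_95, 4))
print(round(pi_99, 4))
print(c(inside_95 = observed >= pi_95[1] && observed <= pi_95[2],
        inside_99 = observed >= pi_99[1] && observed <= pi_99[2]))
```

The intervals should agree with those quoted in the solutions to within rounding, with the observed cost of 2,600 falling outside the 95% prediction interval but inside the 99% one.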