
EXAMPLE EXAM QUESTIONS ON SIMPLE LINEAR REGRESSION

Questions 1-7 refer to the following situation: Stock price, Y, is assumed to be affected by the annual dividend
rate of the stock, X. A simple linear regression analysis was performed on 20 observations, and the
results were:

Regression Equation Section

Independent   Regression    Standard      T-Value      Prob
Variable      Coefficient   Error         (Ho: B=0)    Level
INTERCEPT     -7.964633     3.11101359    -2.560       0.0166
X1            12.548580     1.27081204     9.874       0.0001

1. What statistical conclusion should you make about the effect of the dividend on average stock price?
A. Since 11.30869 > table value, reject the null hypothesis.
B. Since 12.54858 > table value, reject the null hypothesis.
C. Since 9.874 < table value, reject the null hypothesis.
D. Since 9.874 > table value, reject the null hypothesis.
E. Since 0.7895 < table value, fail to reject the null hypothesis.

2. The 95% confidence interval for a value of Y given an X value of 2.36 is $14.61 to $28.69 (you are given
that the standard error of this estimate is 3.351). This interval is interpreted as: I am 95% confident that
A. the stock price for a stock with a dividend rate of 2.36% falls between $14.61 and $28.69.
B. the mean stock price for all stocks with a dividend rate of 2.36% falls between $14.61 and $28.69.
C. the variance in stock price for all stocks falls between $14.61 and $28.69.
D. the dividend rate for all stocks falls between $14.61 and $28.69.
E. for each one point increase in dividend rate, the stock price will increase from $14.61 and $28.69

3. Which one of the following assumptions is incorrectly stated?


A. The stock price is normally distributed for any dividend rate.
B. The stock price has the same variability for any dividend rate.
C. The stock price for any dividend rate is a linear function of dividend rate.
D. The difference between the stock price and the expected stock price
given the dividend rate is independent from company to company.

4. The interpretation of 0.7895, the value of R-square (the coefficient of determination) is:
A. 78.95% of the sample stock prices (around the mean stock price) can be attributed to a linear relationship
with the dividend rate in the population.
B. the mean stock price will be estimated to increase $97.50 for each point increase in the rate.
C. the mean stock price will increase $78.95 for each point increase in the rate.
D. the stock price will increase $78.95 for each point increase in the rate.
E. 78.95% of the sample variability in stock price (around the mean stock price) can be attributed to a linear
relationship with the dividend rate.

5. What is the estimate of the change in expected stock prices when the dividend rate increases by one point?
A. 97.50
B. -7.964633
C. This is a parameter not a statistic.
D. 12.54858
E. 5.36546

6. The estimate of the slope will vary from sample to sample; the estimate of the standard deviation of
beta-hat is:
A. 3.36284
B. 3.14983
C. 0.39274
D. 12.54858
E. 1.27081

7. A 95% confidence interval for the average stock price given the rate of return will use the following t
value:
A. 9.874
B. -2.560
C. 2.101
D. 2.045
E. 2.153

Answers to 1-7
1. D from computer printout use the t-test value across from X1
2. A this is an interval for an individual value of Y given X, not for the conditional mean
3. C the mean stock price falls on the line
4. E r-square is % of sample variation of y explained by x
5. D This is beta-hat see computer printout to the right of X1
6. E This is the standard error of beta-hat, to the right of X1
7. C All t-values in simple linear regression have n-2 d. f.
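
The arithmetic behind answers 1, 2, 5, and 6 can be checked directly from the printout. What follows is a minimal Python sketch and not part of the original exam: the variable names are illustrative, and the critical value 2.101 (a t with 18 d.f., as in answer 7) is assumed from a t table rather than computed.

```python
# Values copied from the Regression Equation Section printout (n = 20).
intercept = -7.964633
slope = 12.548580          # beta-hat (answer 5)
se_slope = 1.27081204      # standard error of beta-hat (answer 6)

# Test statistic for H0: slope = 0 is estimate / standard error (answer 1).
t_slope = slope / se_slope

# Interval for Y at X = 2.36, using the standard error stated in Question 2.
x0 = 2.36
se_given = 3.351
t_crit = 2.101             # t(18, 0.025), assumed from a t table

y_hat = intercept + slope * x0
lower = y_hat - t_crit * se_given
upper = y_hat + t_crit * se_given

print(f"t for slope: {t_slope:.3f}")
print(f"estimate at X = {x0}: {y_hat:.2f}")
print(f"95% interval: {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval agrees with the $14.61 to $28.69 quoted in the options for Question 2.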

Questions 8-17 are concerned with the following situation: A fire insurance company wants to relate the
amount of fire damage (y) in major residential fires to the distance between the residence and the nearest
fire station (x). The study is to be conducted in a large suburb of a major city; a sample of 15 recent fires in
this suburb is selected. The 15 values and the printout follow:

OBS X Y

1 3.4 26.2
2 1.8 17.8
3 4.6 31.3
4 2.3 23.1
5 3.1 27.5
6 5.5 36.0
7 0.7 14.1
8 3.0 22.3
9 2.6 19.6
10 4.3 31.3
11 2.1 24.0
12 1.1 17.3
13 6.1 43.2
14 4.8 36.4
15 3.8 26.1
16 3.5 .
Dependent Variable: Y $1000 fire damage

Analysis of Variance

                          Sum of         Mean
Source            DF      Squares        Square       F Value    Prob
Model              1      841.76636      841.76636    156.886    0.0001
Error             13       69.75098        5.36546
Total(Adjusted)   14      911.51733

Root MSE      2.31635     R-square    0.9235
Dep Mean     26.41333     Adj R-sq    0.9176
C.V.          8.76961

Parameter Estimates

              Parameter     Standard      T for H0:
Variable      Estimate      Error         Parameter=0    Prob > |T|
INTERCEPT     10.277929     1.42027781     7.237         0.0001
X              4.919331     0.39274775    12.525         0.0001

       Dep Actual   Predicted   95% LCL    95% UCL    95% LCL      95% UCL
Obs    Y            Value       Mean       Mean       Individual   Individual
16     .            27.4956     26.1901    28.8011    22.3239      32.66
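
The estimates in the printout can be reproduced from the 15 complete observations. What follows is a minimal Python sketch, assuming the usual least squares formulas for simple linear regression; the variable names are illustrative and do not come from the printout.

```python
import math

# Distance (miles) and fire damage ($1000) for observations 1-15.
x = [3.4, 1.8, 4.6, 2.3, 3.1, 5.5, 0.7, 3.0, 2.6, 4.3, 2.1, 1.1, 6.1, 4.8, 3.8]
y = [26.2, 17.8, 31.3, 23.1, 27.5, 36.0, 14.1, 22.3, 19.6, 31.3,
     24.0, 17.3, 43.2, 36.4, 26.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Corrected sums of squares and cross products.
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
syy = sum((yi - y_bar) ** 2 for yi in y)   # Total (Adjusted) sum of squares

# Least squares estimates.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Analysis of variance pieces.
ss_model = slope * sxy
ss_error = syy - ss_model
mse = ss_error / (n - 2)                   # error d.f. = n - 2
root_mse = math.sqrt(mse)
r_square = ss_model / syy

# Standard error and t statistic for the slope.
se_slope = root_mse / math.sqrt(sxx)
t_slope = slope / se_slope

print(f"intercept {intercept:.6f}  slope {slope:.6f}")
print(f"SS model {ss_model:.5f}  SS error {ss_error:.5f}  MSE {mse:.5f}")
print(f"Root MSE {root_mse:.5f}  R-square {r_square:.4f}")
print(f"SE(slope) {se_slope:.8f}  t {t_slope:.3f}")
```

Up to rounding, the printed values should agree with the Analysis of Variance and Parameter Estimates sections.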

8. Which one of the following assumptions is incorrect?


(A) The difference between the fire damage and the expected fire damage given the distance is independent
from house to house.
(B) The fire damage is normally distributed for any distance.
(C) The mean fire damage has the same variability for any distance.
(D) The mean fire damage for any distance is a linear function of distance.

9. You will find the value 4.919331 in the printout under Parameter Estimates. This is interpreted as:
(A) The mean fire damage will increase $4,919.33 for each mile from the fire station.
(B) The mean fire damage will be estimated to increase $4,919.33 for each mile from the fire station.
(C) The fire damage will increase $4,919.33 for each mile from the fire station.
(D) The mean fire damage will be $4,919.33 given the distance.
(E) The estimated mean fire damage will be $4,919.33 given the distance.

10. The estimate of the standard deviation of fire damage for all homes the same distance from the fire
station is (in thousands of dollars)
(A) 0.39274775
(B) 2.31635
(C) no information available.
(D) 69.75098
(E) 5.36546

11. The interpretation of 0.9235, the value of R-square (the coefficient of determination) is:
(A) 92.35% of the variability in fire damage (around the mean fire damage) can be attributed to a linear
relationship with the distance to the fire station in the population.
(B) the mean fire damage will be estimated to increase $923.50 for each mile from the fire station.
(C) the mean fire damage will increase $923.50 for each mile from the fire station.
(D) the fire damage will increase $923.50 for each mile from the fire station.
(E) 92.35% of the sample variability in fire damage (around the mean fire damage) can be attributed to a
linear relationship with the distance to the fire station.

12. To test the null hypothesis that the parameter of the slope is zero, the test statistic value is:
(A) 0.9235
(B) 0.9176
(C) 0.39274775
(D) 12.525
(E) 7.237

13. For testing the slope is zero versus the alternative that the slope is not zero (use alpha of 0.05), the
rejection region is: Reject the null hypothesis if
(A) t > 2.160 or t < -2.160
(B) | t | < 12.525
(C) t > 1.771
(D) t > 12.525
(E) t > 2.160

14. The 95% confidence interval for the mean fire damage for all houses 3.5 miles from the fire station is: (in
thousands of dollars)
(A) 15.3442 to 25.8279
(B) 4.070 to 5.768
(C) 10.1999 to 21.1785
(D) 13.4329 to 17.9455
(E) 26.1901 to 28.8011

15. The 95% confidence interval for the mean (25.7076 to 28.2997) for the first house (OBS 1) in the sample
is interpreted as: I am 95% confident that
(A) the fire damage for a house 3.4 miles from the fire station falls between $25,707.60 and $28,299.70.
(B) the fire damage for all houses 3.4 miles from the fire station falls between $25,707.60 and $28,299.70.
(C) the variance in fire damage for all houses 3.4 miles from the fire station falls between $25,707.60 and
$28,299.70.
(D) the average fire damage for all houses 3.4 miles from the fire station falls between $25,707.60 and
$28,299.70.
(E) for each one mile from the fire station, the mean fire damage will increase from $25,707.60 and
$28,299.70

16. In this sample for each one standard deviation that a house is from the fire station, the mean fire damage
will be estimated to increase 0.96 standard deviations. This is
(A) the coefficient of correlation, r
(B) the sample standard deviation, s
(C) the test statistic value, t
(D) coefficient of determination, r-square
(E) the least squares coefficient, beta hat

17. The difference between the actual value of y and the predicted value of y (y - yhat) is called
(A) a standard deviation
(B) a slope
(C) a residual
(D) a sample standard deviation
(E) an error

ANSWERS for 8-17

8. C fire damage has the same variance given distance for any distance
9. B this is the beta hat
10. B this is an estimate of sigma of y given x, the square root of MSE
11. E r-square is % of sample variation of y explained by x
12. D use t value from computer printout across from X
13. A use a t with n-2 degrees of freedom
14. E see the 16th observation under the mean columns
15. D this is a confidence interval for a conditional mean
16. A this is the definition of Pearson's r from class notes
17. C this is the definition of the residual
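
Several of the answers for 8-17 follow from the printout by short calculations. What follows is a minimal Python sketch and not part of the original exam: the variable names are illustrative, and the critical value 2.160 (a t with 13 d.f., as in answer 13) is assumed from a t table rather than computed.

```python
import math

# Values copied from the fire damage printout (n = 15).
slope = 4.919331
se_slope = 0.39274775
mse = 5.36546
r_square = 0.9235
t_crit = 2.160                      # t(13, 0.025), assumed from a t table

# Answer 10: the estimate of sigma of y given x is the square root of MSE.
root_mse = math.sqrt(mse)

# Answers 12 and 13: test statistic and two-sided rejection rule.
t_slope = slope / se_slope
reject_h0 = abs(t_slope) > t_crit

# 95% confidence interval for the slope itself (estimate +/- t * SE).
ci_slope = (slope - t_crit * se_slope, slope + t_crit * se_slope)

# Answer 16: in simple linear regression r is the square root of R-square,
# carrying the sign of the slope.
r = math.copysign(math.sqrt(r_square), slope)

print(f"Root MSE {root_mse:.5f}")
print(f"t {t_slope:.3f}, reject H0: {reject_h0}")
print(f"95% CI for slope: {ci_slope[0]:.3f} to {ci_slope[1]:.3f}")
print(f"r {r:.2f}")
```

The slope interval comes out close to the 4.070 to 5.768 listed as option (B) in Question 14, which is why that option is a distractor there: it is an interval for the slope, not for the mean fire damage at 3.5 miles.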

Questions 18-27 deal with the following situation: The expected sales of a product in a
city are assumed to be affected by the per capita discretionary income and the population of the city. Per
capita discretionary income will be referred to as PCDI in all the questions. In Questions 18-27, examine only
the effect of per capita discretionary income on the mean sales. Thus the following model is hypothesized:

E(Y) = B0 + B1 X1 where

Y = Sales (in thousands of dollars)


X1 = Per Capita Discretionary Income (in dollars)
A sample of 15 cities, along with their sales, per capita discretionary income, and the population of the city
(in thousands) is given in the attached printout. The 15 values and a printout follow:

OBS INCOME SALES


1 2450 162
2 3254 120
3 3802 223
4 2838 131
5 2347 67
6 3782 169
7 3008 81
8 2450 192
9 2137 116
10 2560 55
11 4020 252
12 4427 232
13 2660 144
14 2088 103
15 2605 212
16 2500 .
17 3500 .

Root MSE     49.51434     R-square    0.4087
Dep Mean    150.60000     Adj R-sq    0.3632

Parameter Estimates

             Coefficient   Standard   T for H0:
Variable     Estimate      Error      B=0          Prob
INTERCEP     -10.207       55.147     -0.185       0.8560
INCOME         0.054        0.018      2.998       0.0103

       Dep                     95% LCL   95% UCL   95% LCL      95% UCL
Obs    Actual    Predicted     Mean      Mean      Individual   Individual
16     .         125.5          92.5     158.5      13.5        237.5
17     .         179.8         145.1     214.5      67.3        292.3
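
The interval columns for observations 16 and 17 can be reproduced from the 15 complete observations. What follows is a minimal Python sketch, assuming the usual simple linear regression formulas for the standard error of an estimated mean and of an individual prediction; the function name `intervals` and the constant `T_CRIT` are illustrative, and 2.160 is assumed from a t table for 13 d.f.

```python
import math

# PCDI (dollars) and sales ($1000) for observations 1-15.
income = [2450, 3254, 3802, 2838, 2347, 3782, 3008, 2450, 2137, 2560,
          4020, 4427, 2660, 2088, 2605]
sales = [162, 120, 223, 131, 67, 169, 81, 192, 116, 55,
         252, 232, 144, 103, 212]

T_CRIT = 2.160  # t(13, 0.025); assumed from a t table, not computed here

n = len(income)
x_bar = sum(income) / n
y_bar = sum(sales) / n
sxx = sum((x - x_bar) ** 2 for x in income)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, sales))
syy = sum((y - y_bar) ** 2 for y in sales)

slope = sxy / sxx
intercept = y_bar - slope * x_bar
mse = (syy - slope * sxy) / (n - 2)


def intervals(x0):
    """Return (prediction, mean interval, individual interval) at PCDI x0."""
    y_hat = intercept + slope * x0
    # Standard error of the estimated mean of Y at x0.
    se_mean = math.sqrt(mse * (1 / n + (x0 - x_bar) ** 2 / sxx))
    # An individual value adds the error variance to that of the mean.
    se_indiv = math.sqrt(mse + se_mean ** 2)
    mean_ci = (y_hat - T_CRIT * se_mean, y_hat + T_CRIT * se_mean)
    indiv_pi = (y_hat - T_CRIT * se_indiv, y_hat + T_CRIT * se_indiv)
    return y_hat, mean_ci, indiv_pi


for x0 in (2500, 3500):
    y_hat, mean_ci, indiv_pi = intervals(x0)
    print(f"PCDI {x0}: predicted {y_hat:.1f}, "
          f"mean {mean_ci[0]:.1f} to {mean_ci[1]:.1f}, "
          f"individual {indiv_pi[0]:.1f} to {indiv_pi[1]:.1f}")
```

The individual interval is always wider than the mean interval at the same PCDI, because it also has to cover the city-to-city variation around the mean. Small differences from the printout in the last digit are expected from the rounded critical value.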
18. The 95% confidence interval for the mean sales of all cities with PCDI = 2500 is
A. 92.5 to 158.5
B. can not be calculated because of missing values
C. 3500
D. 88.6 to 156.9
E. 13.5 to 237.5

19. When testing the null hypothesis that the slope equals zero versus the alternative hypothesis that the
slope does not equal zero, the rejection region would be: reject the null if
A. t > t(14, 0.025) or t < -t(14, 0.025)
B. t > t(13, 0.05)
C. F < F(1, 13, 0.05)
D. |t| > t(13, 0.025)
E. p-value > alpha

20. What distribution would you use to infer about the variation of sales among all cities with the same
PCDI?
A. the Chi-square distribution
B. the t distribution
C. the F distribution
D. a t with no interaction and an F with interaction

21. Given the p-value of the F-test is 0.0103, we can interpret this as
A. Given the null is true, there is a 1.03% chance of finding this value of the test statistic or something more
extreme.
B. The percent of sample variability of Y explained by the independent variable is 1.03%
C. There is a 98.97% probability that the null hypothesis is right.
D. There is a 98.97% probability that the null hypothesis is wrong.
E. The probability of a type I error is 0.0103.

22. Does the PCDI help predict the sales of the product?
A. Yes, because 2.998 > the table value
B. No, because .8560 is greater than alpha
C. Yes, because 8.986 < the table value
D. Yes, because of MSE = 2451.66959
E. No, because 0.018 is less than the table value

23. What is the interpretation of the coefficient of determination?


A. Don't know and don't care (Hint, this is a wrong answer and best left unspoken within hearing of
instructor).
B. 40.87 probability that sales is linearly related to PCDI.
C. 40.87 percent of the sample variability of sales can be attributed to changes in PCDI.
D. 40.87 percent of the variability of PCDI can be attributed to a linear relationship between mean PCDI and
sales.
E. 40.87 percent of the sample variability of PCDI can be attributed to a linear relationship between mean
PCDI and sales.

24. What table value would you use in the calculation of a 90% confidence interval for a value of Y given a
value of X?
A. 1.645
B. 3.140
C. 1.771
D. 2.650
E. 2.998

25. How many estimated standard errors is the point estimate of the slope away from zero? Slope is the
change in the mean sales for each dollar increase in PCDI.
A. 0.054
B. 0.4087
C. -10.207
D. 2.998
E. 0.018

26. You know that most cities have small PCDI and only a few have large PCDI. Is this a violation of any
assumption?
A. Yes, because the variation of PCDI would then be unequal.
B. No, because sales has to be normally distributed but PCDI does not have to be.
C. Yes, this would violate the linear relationship between the mean sales and PCDI.
D. No, because the variance of sales has nothing to do with the problem.
E. Yes, a violation of normality.

27. What would be the change in the estimated mean sales for each one standard deviation increase in
PCDI?
A. 0.3632 standard deviations
B. can not be calculated.
C. 0.4087 squared dollars
D. 0.6393 (square root of 0.4087) standard deviations
E. 0.0540 dollars

Answers to 18-27
18. A see observation number 16
19. D use a t with n-2 degrees of freedom
20. A variance is related to chi-squared, see Table 3 in class notes
21. A see definition of p-value in text book
22. A use the t test on the slope (equivalent to the F test here)
23. C see the definition of r-squared
24. C use t with n-2 d.f.
25. D definition of t-test value
26 B assumptions apply to y|x or to e but not on x
27. D this is the definition of r in class notes
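
The relationships used in answers 22, 25, and 27 can be checked from the sales printout. What follows is a minimal Python sketch and not part of the original exam; the variable names are illustrative. Because the printout rounds the slope and its standard error to three decimals, the ratio computed here differs slightly from the printed t of 2.998.

```python
import math

# Values copied from the sales printout (n = 15).
slope = 0.054
se_slope = 0.018
t_printed = 2.998
root_mse = 49.51434
r_square = 0.4087

# Answer 25: the t value counts estimated standard errors between the
# slope estimate and zero. With the rounded inputs this is 3.0, not 2.998.
t_from_rounded = slope / se_slope

# Answer 22: in simple linear regression the F statistic is the square
# of the slope's t statistic, so both tests give the same p-value.
f_stat = t_printed ** 2

# MSE is the square of Root MSE.
mse = root_mse ** 2

# Answer 27: a one standard deviation increase in PCDI changes estimated
# mean sales by r standard deviations, with r = sqrt(R-square) here
# because the slope is positive.
r = math.copysign(math.sqrt(r_square), slope)

print(f"t from rounded estimates: {t_from_rounded:.3f}")
print(f"F = t^2: {f_stat:.3f}")
print(f"MSE: {mse:.2f}")
print(f"r: {r:.4f}")
```

The F value and MSE computed this way are close to the 8.986 and 2451.66959 quoted in the options for Question 22, with small gaps that come from squaring rounded numbers.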