Chapter 12
Simple Regression
True / False Questions
1. A scatter plot is used to visualize the association (or lack of association) between two
quantitative variables.
True False
2. The correlation coefficient r measures the strength of the linear relationship between
two variables.
True False
3. Pearson's correlation coefficient (r) requires that both variables be interval or ratio
data.
True False
4. If r = .55 and n = 16, then the correlation is significant at α = .05 in a two-tailed
test.
True False
5. A sample correlation r = .40 indicates a stronger linear relationship than r = -.60.
True False
6. A common source of spurious correlation between X and Y is when a third
unspecified variable Z affects both X and Y.
True False
7. The correlation coefficient r always has the same sign as $b_1$ in $Y = b_0 + b_1 X$.
True False
8. The fitted intercept in a regression has little meaning if no data values near X = 0
have been observed.
True False
9. The least squares regression line is obtained when the sum of the squared residuals
is minimized.
True False
10. In a simple regression, if the coefficient for X is positive and significantly different
from zero, then an increase in X is associated with an increase in the mean (i.e., the
expected value) of Y.
True False
11. In least-squares regression, the residuals $e_1, e_2, \ldots, e_n$ will always have a zero
mean.
True False
12. When using the least squares method, the column of residuals always sums to
zero.
True False
13. In the model Sales = 268 + 7.37 Ads, an additional $1 spent on ads will increase
sales by 7.37 percent.
True False
14. If $R^2$ = .36 in the model Sales = 268 + 7.37 Ads with n = 50, the two-tailed test for
correlation at α = .05 would say that there is a significant correlation between Sales
and Ads.
True False
15. If $R^2$ = .36 in the model Sales = 268 + 7.37 Ads, then Ads explains 36 percent of the
variation in Sales.
True False
16. The ordinary least squares regression line always passes through the point
($\bar{x}$, $\bar{y}$).
True False
17. The least squares regression line gives unbiased estimates of $\beta_0$ and $\beta_1$.
True False
18. In a simple regression, the correlation coefficient r is the square root of $R^2$.
True False
19. If SSR is 1800 and SSE is 200, then $R^2$ is .90.
True False
20. The width of a prediction interval for an individual value of Y is less than standard
error $s_e$.
True False
21. If SSE is near zero in a regression, the statistician will conclude that the proposed
model probably has too poor a fit to be useful.
True False
22. For a regression with 200 observations, we expect that about 10 residuals will
exceed two standard errors.
True False
23. Confidence intervals for predicted Y are less precise when the residuals are very
small.
True False
24. Cause-and-effect direction between X and Y may be determined by running the
regression twice and seeing whether $Y = \beta_0 + \beta_1 X$ or $X = \beta_1 + \beta_0 Y$ has the larger $R^2$.
True False
25. The ordinary least squares method of estimation minimizes the estimated slope and
intercept.
True False
26. Using the ordinary least squares method ensures that the residuals will be normally
distributed.
True False
27. If you have a strong outlier in the residuals, it may represent a different causal
system.
True False
28. A negative correlation between two variables X and Y usually yields a negative p-
value for r.
True False
29. In linear regression between two variables, a significant relationship exists when the
p-value of the t test statistic for the slope is greater than α.
True False
30. The larger the absolute value of the t statistic of the slope in a simple linear
regression, the stronger the linear relationship exists between X and Y.
True False
31. In simple linear regression, the coefficient of determination ($R^2$) is estimated from
sums of squares in the ANOVA table.
True False
32. In simple linear regression, the p-value of the slope will always equal the p-value of
the F statistic.
True False
33. An observation with high leverage will have a large residual (usually an
outlier).
True False
34. A prediction interval for Y is narrower than the corresponding confidence interval for
the mean of Y.
True False
35. When X is farther from its mean, the prediction interval and confidence interval for Y
become wider.
True False
36. The total sum of squares (SST) will never exceed the regression sum of squares
(SSR).
True False
37. "High leverage" would refer to a data point that is poorly predicted by the model
(large residual).
True False
38. The studentized residuals permit us to detect cases where the regression predicts
poorly.
True False
39. A poor prediction (large residual) indicates an observation with high leverage.
True False
40. Ill-conditioned refers to a variable whose units are too large or too small (e.g.,
$2,434,567).
True False
41. A simple decimal transformation (e.g., from 18,291 to 18.291) often improves data
conditioning.
True False
42. Two-tailed t-tests are often used because any predictor that differs significantly from
zero in a two-tailed test will also be significantly greater than zero or less than zero
in a one-tailed test at the same α.
True False
43. A predictor that is significant in a one-tailed t-test will also be significant in a two-
tailed test at the same level of significance α.
True False
44. Omission of a relevant predictor is a common source of model misspecification.
True False
45. The regression line must pass through the origin.
True False
46. Outliers can be detected by examining the standardized residuals.
True False
47. In a simple regression, there are n - 2 degrees of freedom associated with the error
sum of squares (SSE).
True False
48. In a simple regression, the F statistic is calculated by taking the ratio of MSR to the
MSE.
True False
49. The coefficient of determination is the percentage of the total variation in the
response variable Y that is explained by the predictor X.
True False
50. A different confidence interval exists for the mean value of Y for each different value
of X.
True False
51. A prediction interval for Y is widest when X is near its mean.
True False
52. In a two-tailed test for correlation at α = .05, a sample correlation coefficient r = 0.42
with n = 25 is significantly different than zero.
True False
53. In correlation analysis, neither X nor Y is designated as the independent variable.
True False
54. A negative value for the correlation coefficient (r) implies a negative value for the
slope ($b_1$).
True False
55. High leverage for an observation indicates that X is far from its mean.
True False
56. Autocorrelated errors are not usually a concern for regression models using cross-
sectional data.
True False
57. There are usually several possible regression lines that will minimize the sum of
squared errors.
True False
58. When the errors in a regression model are not independent, the regression model is
said to have autocorrelation.
True False
59. In a simple bivariate regression, $F_{calc} = t_{calc}^2$.
True False
60. Correlation analysis primarily measures the degree of the linear relationship
between X and Y.
True False
Multiple Choice Questions
61. The variable used to predict another variable is called the:
A. response variable.
B. regression variable.
C. independent variable.
D. dependent variable.
62. The standard error of the regression:
A. is based on squared deviations from the regression line.
B. may assume negative values if $b_1$ < 0.
C. is in squared units of the dependent variable.
D. may be cut in half to get an approximate 95 percent prediction interval.
63. A local trucking company fitted a regression to relate the travel time (days) of its
shipments as a function of the distance traveled (miles). The fitted regression is
Time = -7.126 + 0.0214 Distance, based on a sample of 20 shipments. The
estimated standard error of the slope is 0.0053. Find the value of $t_{calc}$ to test for zero
slope.
A. 2.46
B. 5.02
C. 4.04
D. 3.15
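64. A local trucking company fitted a regression to relate the travel time (days) of its
shipments as a function of the distance traveled (miles). The fitted regression is
Time = -7.126 + .0214 Distance, based on a sample of 20 shipments. The estimated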
standard error of the slope is 0.0053. Find the critical value for a right-tailed test to
see if the slope is positive, using α = .05.
A. 2.101
B. 2.552
C. 1.960
D. 1.734
65. If the attendance at a baseball game is to be predicted by the equation Attendance
= 16,500 - 75 Temperature, what would be the predicted attendance if Temperature
is 90 degrees?
A. 6,750
B. 9,750
C. 12,250
D. 10,020
66. A hypothesis test is conducted at the 5 percent level of significance to test whether
the population correlation is zero. If the sample consists of 25 observations and the
correlation coefficient is 0.60, then the computed test statistic would be:
A. 2.071.
B. 1.960.
C. 3.597.
D. 1.645.
67. Which of the following is not a characteristic of the F-test in a simple regression?
A. It is a test for overall fit of the model.
B. The test statistic can never be negative.
C. It requires a table with numerator and denominator degrees of freedom.
D. The F-test gives a different p-value than the t-test.
68. A researcher's Excel results are shown below using Femlab (labor force
participation rate among females) to try to predict Cancer (death rate per 100,000
population due to cancer) in the 50 U.S. states.
Which of the following statements is not true?
A. The standard error is too high for this model to be of any predictive use.
B. The 95 percent confidence interval for the coefficient of Femlab is -4.29 to -0.28.
C. Significant correlation exists between Femlab and Cancer at α = .05.
D. The two-tailed p-value for Femlab will be less than .05.
69. A researcher's results are shown below using Femlab (labor force participation rate
among females) to try to predict Cancer (death rate per 100,000 population due to
cancer) in the 50 U.S. states.
Which statement is valid regarding the relationship between Femlab and Cancer?
A. A rise in female labor participation rate will cause the cancer rate to decrease
within a state.
B. This model explains about 10 percent of the variation in state cancer rates.
C. At the .05 level of significance, there isn't enough evidence to say the two
variables are related.
D. If your sister starts working, the cancer rate in your state will decline.
70. What is the $R^2$ for this regression?
A. .9018
B. .0982
C. .8395
D. .1605
71. A news network stated that a study had found a positive correlation between the
number of children a worker has and his or her earnings last year. You may
conclude that:
A. people should have more children so they can get better jobs.
B. the data are erroneous because the correlation should be negative.
C. causation is in serious doubt.
D. statisticians have small families.
72. William used a sample of 68 large U.S. cities to estimate the relationship between
Crime (annual property crimes per 100,000 persons) and Income (median annual
income per capita, in dollars). His estimated regression equation was Crime = 428 +
0.050 Income. We can conclude that:
A. the slope is small so Income has no effect on Crime.
B. crime seems to create additional income in a city.
C. wealthy individuals tend to commit more crimes, on average.
D. the intercept is irrelevant since zero median income is impossible in a large city.
73. Mary used a sample of 68 large U.S. cities to estimate the relationship between
Crime (annual property crimes per 100,000 persons) and Income (median annual
income per capita, in dollars). Her estimated regression equation was Crime = 428 +
0.050 Income. If Income decreases by 1000, we would expect that Crime will:
A. increase by 428.
B. decrease by 50.
C. increase by 500.
D. remain unchanged.
74. Amelia used a random sample of 100 accounts receivable to estimate the
relationship between Days (number of days from billing to receipt of payment) and
Size (size of balance due in dollars). Her estimated regression equation was Days =
22 + 0.0047 Size with a correlation coefficient of .300. From this information we can
conclude that:
A. 9 percent of the variation in Days is explained by Size.
B. autocorrelation is likely to be a problem.
C. the relationship between Days and Size is significant.
D. larger accounts usually take less time to pay.
75. Prediction intervals for Y are narrowest when:
A. the mean of X is near the mean of Y.
B. the value of X is near the mean of X.
C. the mean of X differs greatly from the mean of Y.
D. the mean of X is small.
76. If n = 15 and r = .4296, the corresponding t-statistic to test for zero correlation is:
A. 1.715.
B. 7.862.
C. 2.048.
D. impossible to determine without α.
77. Using a two-tailed test at α = .05 for n = 30, we would reject the hypothesis of zero
correlation if the absolute value of r exceeds:
A. .2992.
B. .3609.
C. .0250.
D. .2004.
78. The ordinary least squares (OLS) method of estimation will minimize:
A. neither the slope nor the intercept.
B. only the slope.
C. only the intercept.
D. both the slope and intercept.
79. A standardized residual $e_i$ = -2.205 indicates:
A. a rather poor prediction.
B. an extreme outlier in the residuals.
C. an observation with high leverage.
D. a likely data entry error.
80. In a simple regression, which would suggest a significant relationship between X
and Y?
A. Large p-value for the estimated slope
B. Large t statistic for the slope
C. Large p-value for the F statistic
D. Small t-statistic for the slope
81. Which is indicative of an inverse relationship between X and Y?
A. A negative F statistic
B. A negative p-value for the correlation coefficient
C. A negative correlation coefficient
D. Either a negative F statistic or a negative p-value
82. Which is not correct regarding the estimated slope of the OLS regression line?
A. It is divided by its standard error to obtain its t statistic.
B. It shows the change in Y for a unit change in X.
C. It is chosen so as to minimize the sum of squared errors.
D. It may be regarded as zero if its p-value is less than α.
83. Simple regression analysis means that:
A. the data are presented in a simple and clear way.
B. we have only a few observations.
C. there are only two independent variables.
D. we have only one explanatory variable.
84. The sample coefficient of correlation does not have which property?
A. It can range from -1.00 up to +1.00.
B. It is also sometimes called Pearson's r.
C. It is tested for significance using a t-test.
D. It assumes that Y is the dependent variable.
85. When comparing the 90 percent prediction and confidence intervals for a given
regression analysis:
A. the prediction interval is narrower than the confidence interval.
B. the prediction interval is wider than the confidence interval.
C. there is no difference between the size of the prediction and confidence intervals.
D. no generalization is possible about their comparative width.
86. Which is not true of the coefficient of determination?
A. It is the square of the coefficient of correlation.
B. It is negative when there is an inverse relationship between X and Y.
C. It reports the percent of the variation in Y explained by X.
D. It is calculated using sums of squares (e.g., SSR, SSE, SST).
87. If the fitted regression is Y = 3.5 + 2.1X ($R^2$ = .25, n = 25), it is incorrect to conclude
that:
A. Y increases 2.1 percent for a 1 percent increase in X.
B. the estimated regression line crosses the Y axis at 3.5.
C. the sample correlation coefficient must be positive.
D. the value of the sample correlation coefficient is 0.50.
88. In a simple regression $Y = b_0 + b_1 X$ where Y = number of robberies in a city
(thousands of robberies), X = size of the police force in a city (thousands of police),
and n = 45 randomly chosen large U.S. cities in 2008, we would be least likely to
see which problem?
A. Autocorrelated residuals (because this is time-series data)
B. Heteroscedastic residuals (because we are using totals uncorrected for city size)
C. Nonnormal residuals (because a few larger cities may skew the residuals)
D. High leverage for some observations (because some cities may be huge)
89. When homoscedasticity exists, we expect that a plot of the residuals versus the
fitted Y:
A. will form approximately a straight line.
B. crosses the centerline too many times.
C. will yield a Durbin-Watson statistic near 2.
D. will show no pattern at all.
90. Which statement is not correct?
A. Spurious correlation can often be reduced by expressing X and Y in per capita
terms.
B. Autocorrelation is mainly a concern if we are using time-series data.
C. Heteroscedastic residuals will have roughly the same variance for any value of
X.
D. Standardized residuals make it easy to identify outliers or instances of poor fit.
91. In a simple bivariate regression with 25 observations, which statement is most
nearly correct?
A. A non-standardized residual whose value is $e_i$ = 4.22 would be considered an
outlier.
B. A leverage statistic of 0.16 or more would indicate high leverage.
C. Standardizing the residuals will eliminate any heteroscedasticity.
D. Non-normal residuals imply biased coefficient estimates, a major problem.
92. A regression was estimated using these variables: Y = annual value of reported
bank robbery losses in all U.S. banks ($millions), X = annual value of currency held
by all U.S. banks ($millions), n = 100 years (1912 through 2011). We would not
anticipate:
A. autocorrelated residuals due to time-series data.
B. heteroscedastic residuals due to the wide variation in data magnitudes.
C. nonnormal residuals due to skewed data as bank size increases over time.
D. a negative slope because banks hold less currency when they are robbed.
93. A fitted regression for an exam in Prof. Hardtack's class showed Score = 20 + 7
Study, where Score is the student's exam score and Study is the student's study
hours. The regression yielded $R^2$ = 0.50 and SE = 8. Bob studied 9 hours. The quick
95 percent prediction interval for Bob's grade is approximately:

A. 69 to 97.
B. 75 to 91.
C. 67 to 99.
D. 76 to 90.
94. Which is not an assumption of least squares regression?
A. Normal X values
B. Non-autocorrelated errors
C. Homoscedastic errors
D. Normal errors
95. In a simple bivariate regression with 60 observations there will be _____ residuals.
A. 60
B. 59
C. 58
D. 57
96. Which is correct to find the value of the coefficient of determination ($R^2$)?
A. SSR/SSE
B. SSR/SST
C. 1 - SST/SSE
97. The critical value for a two-tailed test of $H_0$: $\beta_1 = 0$ at α = .05 in a simple regression
with 22 observations is:
A. ±1.725
B. ±2.086
C. ±2.528
D. ±1.960
98. In a sample of size n = 23, a sample correlation of r = .400 provides sufficient
evidence to conclude that the population correlation coefficient exceeds zero in a
right-tailed test at:
A. α = .01 but not α = .05.
B. α = .05 but not α = .01.
C. both α = .05 and α = .01.
D. neither α = .05 nor α = .01.
99. In a sample of n = 23, the Student's t test statistic for a correlation of r = .500 would
be:
A. 2.559.
B. 2.819.
C. 2.646.
D. can't say without knowing α.

100. In a sample of n = 23, the critical value of the correlation coefficient for a two-tailed
test at α = .05 is:
A. ±.524
B. ±.412
C. ±.500
D. ±.497
101. In a sample of n = 23, the critical value of Student's t for a two-tailed test of
significance for a simple bivariate regression at α = .05 is:
A. ±2.229
B. ±2.819
C. ±2.646
D. ±2.080
102. In a sample of n = 40, a sample correlation of r = .400 provides sufficient evidence
to conclude that the population correlation coefficient exceeds zero in a right-tailed
test at:
A. α = .025 but not α = .05.
B. α = .05 but not α = .025.
C. both α = .025 and α = .05.
103. In a sample of n = 20, the Student's t test statistic for a correlation of r = .400 would
be:
A. 2.110
B. 1.645
C. 1.852
D. can't say without knowing if it's a two-tailed or one-tailed test.
A. ±.587
B. ±.412
C. ±.444
D. ±.497
A. ±2.060
B. ±2.052
C. ±2.898
D. ±2.074
106. In a sample of size n = 36, a sample correlation of r = -.450 provides sufficient
evidence to conclude that the population correlation coefficient differs significantly
from zero in a two-tailed test at:
A. α = .01
B. α = .05
C. both α = .01 and α = .05.
107. In a sample of n = 36, the Student's t test statistic for a correlation of r = -.450
would be:
A. -2.110.
B. -2.938.
C. -2.030.
A. ±.329
B. ±.387
C. ±.423
D. ±.497
. significance of the slope for a simple regression at α = .05 is:
A. 2.938
B. 2.724
C. 2.032
D. 2.074
110. A local trucking company fitted a regression to relate the travel time (days) of its
shipments as a function of the distance traveled (miles). The fitted regression is
Time = -7.126 + 0.0214 Distance. If Distance increases by 50 miles, the expected
Time would increase by:
A. 1.07 days
B. 7.13 days
C. 2.14 days
D. 1.73 days
111. A local trucking company fitted a regression to relate the cost of its shipments as a
function of the distance traveled. The Excel fitted regression is shown.
Based on this estimated relationship, when distance increases by 50 miles, the
expected shipping cost would increase by:
A. $286.
B. $143.
C. $104.
D. $301.
112. If SSR is 2592 and SSE is 608, then:
A. the slope is likely to be insignificant.
B. the coefficient of determination is .81.
C. the SST would be smaller than SSR.
D. the standard error would be large.
113. Find the sample correlation coefficient for the following data.
A. .8911
B. .9124
C. .9822
D. .9556
114. Find the slope of the simple regression $\hat{y} = b_0 + b_1 x$.
A. 1.833
B. 3.294
C. 0.762
D. -2.228
115. Find the sample correlation coefficient for the following data.
A. .7291
B. .8736
C. .9118
D. .9563
116. Find the slope of the simple regression $\hat{y} = b_0 + b_1 x$.
A. 2.595
B. 1.109
C. -2.221
D. 1.884
117. A researcher's results are shown below using n = 25 observations.
The 95 percent confidence interval for the slope is:
A. [-3.282, -1.284].
B. [-4.349, -0.217].
C. [1.118, 5.026].
D. [-0.998, +0.998].
118. A researcher's regression results are shown below using n = 8 observations.
A. [1.333, 2.284].
B. [1.602, 2.064].
C. [1.268, 2.398].
D. [1.118, 2.449].
119. Bob thinks there is something wrong with Excel's fitted regression. What do you
say?
A. The estimated equation is obviously incorrect.
B. The $R^2$ looks a little high but otherwise it looks OK.
C. Bob needs to increase his sample size to decide.
D. The relationship is linear, so the equation is credible.
Short Answer Questions
120. Pedro became interested in vehicle fuel efficiency, so he performed a simple
regression using 93 cars to estimate the model CityMPG = $\beta_0$ + $\beta_1$ Weight where
Weight is the weight of the vehicle in pounds. His results are shown below. Write a
brief analysis of these results, using what you have learned in this chapter. Is the
intercept meaningful in this regression? Make a prediction of CityMPG when
Weight = 3000, and also when Weight = 4000. Do these predictions seem
believable? If you could make a car 1000 pounds lighter, what change would you
predict in its CityMPG?
121. Mary noticed that old coins are smoother and more worn. She weighed 31 nickels
and recorded their age, and then performed a simple regression to estimate the
model Weight = $\beta_0$ + $\beta_1$ Age where Weight is the weight of the coin in grams and
Age is the age of the coin in years. Her results are shown below. Write a brief
analysis of these results, using what you have learned in this chapter. Make a
prediction of Weight when Age = 10, and also when Age = 20. What does this tell
you? Is the intercept meaningful in this regression?
Chapter 12 Simple Regression Answer Key
True / False Questions
1. A scatter plot is used to visualize the association (or lack of association) between two
quantitative variables.
TRUE
The scatter plot shows association between two quantitative variables.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Learning Objective: 12-01 Calculate and test a correlation coefficient for
significance.
Topic: Visual Displays and Correlation Analysis
2. The correlation coefficient r measures the strength of the linear relationship between
two variables.
TRUE
A correlation coefficient measures linearity between two variables.

AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
significance.
3. Pearson's correlation coefficient (r) requires that both variables be interval or ratio
data.
TRUE
Correlation assumes quantitative data with at least interval measurements.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
significance.
4. If r = .55 and n = 16, then the correlation is significant at α = .05 in a two-tailed test.
TRUE
$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.55)[(16 - 2)/(1 - .55^2)]^{1/2} = 2.464 > t_{.025} = 2.145$ for d.f. = 16 - 2 =
14.
AACSB: Analytic
Blooms: Apply
Difficulty: 2 Medium
significance.
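The test statistic used in the answer to question 4 can be reproduced with a few lines of Python. This is only a sketch: the function name `correlation_t_stat` is illustrative, and the critical value 2.145 is copied from the answer rather than computed.

```python
import math


def correlation_t_stat(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


t_calc = correlation_t_stat(0.55, 16)
t_crit = 2.145  # two-tailed critical value for alpha = .05, d.f. = 14 (taken from the answer)
significant = abs(t_calc) > t_crit
print(round(t_calc, 3), significant)
```

The same function applies to any of the correlation questions that give r and n; only the critical value changes with the degrees of freedom and the tail of the test.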
5. A sample correlation r = .40 indicates a stronger linear relationship than r = -.60.
FALSE
The sign only indicates the direction, not the strength, of the linear relationship.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
significance.
6. A common source of spurious correlation between X and Y is when a third
unspecified variable Z affects both X and Y.
TRUE
Both X and Y could be influenced by Z.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
significance.
7. The correlation coefficient r always has the same sign as $b_1$ in $Y = b_0 + b_1 X$.
TRUE
The t-test for the slope in simple regression gives the same result as the t-test for r.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 12-04 Fit a simple regression on an Excel scatter
plot.
Topic: Regression Terminology
8. The fitted intercept in a regression has little meaning if no data values near X = 0
have been observed.
TRUE
Predicting Y for X = 0 makes little sense if the observed data have no values near X =
0.
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
Learning Objective: 12-02 Interpret the slope and intercept of a regression
equation.
Topic: Simple Regression
9. The least squares regression line is obtained when the sum of the squared residuals
is minimized.
TRUE
The OLS method minimizes the sum of squared residuals.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
plot.
Topic: Ordinary Least Squares Formulas
10. In a simple regression, if the coefficient for X is positive and significantly different
from zero, then an increase in X is associated with an increase in the mean (i.e., the
expected value) of Y.
TRUE
The conditional mean of Y depends on X (unless the slope is effectively zero).
AACSB: Analytic
Blooms: Understand
Difficulty: 1 Easy
equation.
11. In least-squares regression, the residuals $e_1, e_2, \ldots, e_n$ will always have a zero
mean.
TRUE
The residuals must sum to zero if the OLS method is used, so their mean is zero.
AACSB: Analytic
Blooms: Remember
equation.
12. When using the least squares method, the column of residuals always sums to
zero.
TRUE
The residuals must sum to zero if the OLS method is used.

AACSB: Analytic
Blooms: Remember
equation.
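The zero-sum property of OLS residuals can be checked numerically. The sketch below uses a small made-up data set (the x and y values are hypothetical, not from the text) and the usual closed-form OLS formulas for a model with an intercept.

```python
def ols_fit(x, y):
    """Return (b0, b1) for the least squares line y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ss_xy / ss_xx
    b0 = y_bar - b1 * x_bar  # forces the line through (x_bar, y_bar)
    return b0, b1


# Hypothetical data, for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_fit(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
residual_sum = sum(residuals)
print(abs(residual_sum) < 1e-9)
```

Because the intercept is computed from the two means, the fitted value at the mean of x equals the mean of y, which is why the residuals cancel out (up to floating-point rounding).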
13. In the model Sales = 268 + 7.37 Ads, an additional $1 spent on ads will increase
sales by 7.37 percent.
FALSE
The slope coefficient is in the same units as Y (dollars, not percent, in this case).
AACSB: Analytic
Blooms: Apply
equation.
14. If $R^2$ = .36 in the model Sales = 268 + 7.37 Ads with n = 50, the two-tailed test for
correlation at α = .05 would say that there is a significant correlation between Sales
and Ads.
TRUE
$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.60)[(50 - 2)/(1 - .36)]^{1/2} = 5.196 > t_{.025} = 2.011$ for d.f. = 50 - 2 =
48.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
significance.
15. If $R^2$ = .36 in the model Sales = 268 + 7.37 Ads, then Ads explains 36 percent of the
variation in Sales.
TRUE
We can interpret $R^2$ as the fraction of variation in Y explained by X (expressed as a
percent).
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 12-08 Interpret the standard error; R2; ANOVA table; and F
test.
16. The ordinary least squares regression line always passes through the point
($\bar{x}$, $\bar{y}$).
TRUE
The OLS formulas require the line to pass through this point.
AACSB: Analytic
Blooms: Remember
equation.
17. The least squares regression line gives unbiased estimates of $\beta_0$ and $\beta_1$.
TRUE
The expected values of the OLS estimators $b_0$ and $b_1$ are the true parameters $\beta_0$ and
$\beta_1$.
AACSB: Analytic
Blooms: Remember
plot.
18. In a simple regression, the correlation coefficient r is the square root of $R^2$.
TRUE
In fact, we could use the notation $r^2$ instead of $R^2$ when talking about simple
regression.
AACSB: Analytic
Blooms: Remember
test.
19. If SSR is 1800 and SSE is 200, then $R^2$ is .90.
TRUE
$R^2$ = SSR/SST = SSR/(SSR + SSE) = 1800/(1800 + 200) = .90.
AACSB: Analytic
Blooms: Apply
test.
Topic: Tests for Significance
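The arithmetic in the answer to question 19 follows directly from the identity SST = SSR + SSE. A minimal sketch (the function name `r_squared` is illustrative):

```python
def r_squared(ssr: float, sse: float) -> float:
    """Coefficient of determination from the ANOVA sums of squares."""
    sst = ssr + sse  # total variation = explained + unexplained
    return ssr / sst


r2_q19 = r_squared(1800, 200)
r2_q112 = r_squared(2592, 608)  # values from multiple choice question 112
print(round(r2_q19, 2), round(r2_q112, 2))
```

The equivalent form 1 - SSE/SST gives the same value, since SSR/SST + SSE/SST = 1.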
20. The width of a prediction interval for an individual value of Y is less than standard
error $s_e$.
FALSE
The formula for the interval width multiplies the standard error by an expression > 1.
AACSB: Analytic
Blooms: Understand
Learning Objective: 12-09 Distinguish between confidence and prediction intervals for
Y.
Topic: Confidence and Prediction Intervals for Y
21. If SSE is near zero in a regression, the statistician will conclude that the proposed
model probably has too poor a fit to be useful.
FALSE
SSE is the sum of the squared residuals, which would be smaller if the fit is good.
AACSB: Analytic
Blooms: Apply
test.
22. For a regression with 200 observations, we expect that about 10 residuals will
exceed two standard errors.
TRUE
If the residuals are normal, 95.44 percent (190 of 200) will lie within ±2$s_e$ (so 10
outside).
AACSB: Analytic
Blooms: Apply
Learning Objective: 12-11 Identify unusual residuals and high-leverage
observations.
Topic: Unusual Observations
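The 95.44 percent figure in the answer to question 22 comes from the standard normal distribution. The sketch below, which assumes normally distributed residuals, computes the two-tailed probability with `math.erf` and the expected count outside ±2 standard errors for n = 200.

```python
import math


def prob_outside(k: float) -> float:
    """P(|Z| > k) for a standard normal Z."""
    return 1.0 - math.erf(k / math.sqrt(2.0))


n = 200
p_out = prob_outside(2.0)
expected_outside = n * p_out
print(round(p_out, 4), round(expected_outside, 1))
```

The expected count comes out a little above 9, which the answer rounds to "about 10".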
23. Confidence intervals for predicted Y are less precise when the residuals are very
small.
FALSE
Small residuals imply a small standard error and thus a narrower prediction interval.
AACSB: Analytic
Blooms: Understand
Y.
24. Cause-and-effect direction between X and Y may be determined by running the
regression twice and seeing whether $Y = \beta_0 + \beta_1 X$ or $X = \beta_1 + \beta_0 Y$ has the larger $R^2$.
FALSE
Cause and effect cannot be determined in the context of simple regression models.
AACSB: Analytic
Blooms: Understand
equation.
25. The ordinary least squares method of estimation minimizes the estimated slope and
intercept.
FALSE
OLS minimizes the sum of squared residuals.

AACSB: Analytic
Blooms: Remember
plot.
26. Using the ordinary least squares method ensures that the residuals will be normally
distributed.
FALSE
OLS produces unbiased estimates but cannot ensure normality of the residuals.
AACSB: Analytic
Blooms: Remember
Learning Objective: 12-10 Test residuals for violations of regression
assumptions.
Topic: Residual Tests
27. If you have a strong outlier in the residuals, it may represent a different causal
system.
TRUE
Outliers might come from a different population or causal system.
AACSB: Analytic
Blooms: Understand
observations.
Topic: Other Regression Problems (Optional)
28. A negative correlation between two variables X and Y usually yields a negative p-
value for r.
FALSE
The p-value cannot be negative.
AACSB: Analytic
Blooms: Understand
Learning Objective: 12-06 Test hypotheses about the slope and intercept by using t
tests.
29. In linear regression between two variables, a significant relationship exists when the
p-value of the t test statistic for the slope is greater than α.
FALSE
Reject $\beta_1 = 0$ if the p-value is less than α.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
tests.
30. The larger the absolute value of the t statistic of the slope in a simple linear
regression, the stronger the linear relationship exists between X and Y.
TRUE
The correlation coefficient measures linearity, regardless of its sign (+ or -).
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
tests.
31. In simple linear regression, the coefficient of determination ($R^2$) is estimated from
sums of squares in the ANOVA table.
TRUE
$R^2$ = SSR/SST or $R^2$ = 1 - SSE/SST.
AACSB: Analytic
Blooms: Remember
test.
32. In simple linear regression, the p-value of the slope will always equal the p-value of
the F statistic.
TRUE
This is true only if there is one predictor (but is no longer true in multiple regression).
AACSB: Analytic
Blooms: Remember
test.
Topic: Analysis of Variance: Overall Fit
33. An observation with high leverage will have a large residual (usually an
outlier).
FALSE
The concepts are distinct (a high-leverage point could have a good fit).
AACSB: Analytic
Blooms: Understand
observations.
34. A prediction interval for Y is narrower than the corresponding confidence interval for
the mean of Y.
FALSE
Predicting an individual case requires a wider interval than estimating the
mean.
AACSB: Analytic
Blooms: Remember
Y.
35. When X is farther from its mean, the prediction interval and confidence interval for Y
become wider.
TRUE
The width increases when X differs from its mean (review the formula).
AACSB: Analytic
Blooms: Understand
Y.
36. The total sum of squares (SST) will never exceed the regression sum of squares
(SSR).
FALSE
The identity is SSR + SSE = SST.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
test.
37. "High leverage" would refer to a data point that is poorly predicted by the model
(large residual).
FALSE
A high-leverage observation may have a good fit (only its X value determines its
leverage).
AACSB: Analytic
Blooms: Remember
observations.
38. The studentized residuals permit us to detect cases where the regression predicts
poorly.
TRUE
Studentized residuals resemble a t-distribution. A large studentized t-value (e.g.,
t < -2.00 or t > +2.00) would imply a poor fit.
AACSB: Analytic
Blooms: Understand
observations.
39. A poor prediction (large residual) indicates an observation with high leverage.
FALSE
High leverage indicates an unusually large or small X value (not a poor prediction).
A high-leverage observation may have a good fit or a poor fit. Only its X value
determines its leverage.
AACSB: Analytic
Blooms: Understand
observations.
40. Ill-conditioned refers to a variable whose units are too large or too small (e.g.,
$2,434,567).
TRUE
In Excel, a symptom of poor data conditioning is exponential notation (e.g.,
4.3E+06).
AACSB: Analytic
Blooms: Remember
Learning Objective: 12-07 Perform regression analysis with Excel or other
software.
41. A simple decimal transformation (e.g., from 18,291 to 18.291) often improves data
conditioning.
TRUE
Keeping data magnitudes similar helps avoid exponential notation (e.g., 4.3E+06).
AACSB: Analytic
Blooms: Understand
software.
42. Two-tailed t-tests are often used because any predictor that differs significantly from
zero in a two-tailed test will also be significantly greater than zero or less than zero
in a one-tailed test at the same α.
TRUE
True because the critical t is larger in the two-tailed test (the default in most
software).
AACSB: Analytic
Blooms: Apply
tests.
43. A predictor that is significant in a one-tailed t-test will also be significant in a two-
tailed test at the same level of significance α.
FALSE
False because the critical t would be larger in a two-tailed test.
AACSB: Analytic
Blooms: Remember
tests.
44. Omission of a relevant predictor is a common source of model misspecification.
TRUE
In a multivariate world, simple regression may be inadequate.

AACSB: Analytic
Blooms: Remember
software.
45. The regression line must pass through the origin.
FALSE
The OLS intercept estimate does not, in general, equal zero. We might be unable to
reject a zero intercept in a t-test, but the fitted intercept is rarely zero.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
plot.
46. Outliers can be detected by examining the standardized residuals.
TRUE
A poor fit implies a large t-value (e.g., larger than ±3 would be an outlier).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
observations.
47. In a simple regression, there are n - 2 degrees of freedom associated with the error
sum of squares (SSE).
TRUE
This is true in simple regression because we estimate two parameters ($\beta_0$ and $\beta_1$).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
test.
48. In a simple regression, the F statistic is calculated by taking the ratio of MSR to the
MSE.
TRUE
By definition, $F_{calc} = MSR/MSE$ (obtained from the ANOVA table).
AACSB: Analytic
Blooms: Understand
test.
49. The coefficient of determination is the percentage of the total variation in the
response variable Y that is explained by the predictor X.
TRUE
$R^2 = SSR/SST$ or $R^2 = 1 - SSE/SST$ lies between 0 and 1 and often is expressed as
a percent.
AACSB: Analytic
Blooms: Understand
test.
50. A different confidence interval exists for the mean value of Y for each different value
of X.
TRUE
Both the interval width and also $E(Y|X) = \beta_0 + \beta_1 X$ depend on the value of X.
AACSB: Analytic
Blooms: Remember
Y.
51. A prediction interval for Y is widest when X is near its mean.
FALSE
The prediction interval is narrowest when X is near its mean. Review the formula,
which has a term $(x_i - \bar{x})^2$ in the numerator. The minimum would be when $x_i = \bar{x}$.
AACSB: Analytic
Blooms: Remember
Y.
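To illustrate why both intervals widen as X moves away from its mean, the sketch below compares width multipliers at several X values. It assumes the usual simple-regression formulas, in which the half-width is t times s_e times sqrt(1/n + (x - xbar)^2/SSxx) for the mean of Y and sqrt(1 + 1/n + (x - xbar)^2/SSxx) for an individual Y. The X data and the function name width_multipliers are hypothetical, chosen only for illustration.

```python
import math

def width_multipliers(x_new, x):
    """Return (CI multiplier, PI multiplier) at x_new.

    Each multiplier is scaled by t * s_e to get the interval half-width.
    Hypothetical helper, not taken from the text.
    """
    n = len(x)
    x_bar = sum(x) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    core = 1 / n + (x_new - x_bar) ** 2 / ss_xx
    return math.sqrt(core), math.sqrt(1 + core)

x = [2, 4, 6, 8, 10, 12, 14]  # made-up X values with mean 8
for x_new in (8, 11, 14):
    ci, pi = width_multipliers(x_new, x)
    print(x_new, round(ci, 3), round(pi, 3))
```

Both multipliers are smallest at the mean of X, and the prediction-interval multiplier exceeds the confidence-interval multiplier at every X.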
52. In a two-tailed test for correlation at α = .05, a sample correlation coefficient r = 0.42
with n = 25 is significantly different than zero.
TRUE
$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.42)[(25 - 2)/(1 - .42^2)]^{1/2} = 2.219 > t_{.025} = 2.069$
for d.f. = 25 - 2 = 23.
AACSB: Analytic
Blooms: Apply
significance.
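As a quick check on the calculation in question 52, the sketch below evaluates the correlation test statistic in Python. The function name correlation_t is an illustrative choice, and the critical value 2.069 is copied from the answer above rather than computed.

```python
import math

def correlation_t(r, n):
    """t statistic for testing zero correlation, d.f. = n - 2 (hypothetical helper)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

t_calc = correlation_t(0.42, 25)
t_crit = 2.069  # t_.025 for d.f. = 23, taken from the answer above
print(round(t_calc, 3), abs(t_calc) > t_crit)
```

The same helper can be reused for the later correlation questions by changing r and n.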
53. In correlation analysis, neither X nor Y is designated as the independent variable.
TRUE
In correlation analysis, X and Y covary without designating either as "independent."
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
significance.
54. A negative value for the correlation coefficient (r) implies a negative value for the
slope ($b_1$).
TRUE
The sign of r must be the same as the sign of the slope estimate $b_1$.
AACSB: Analytic
Blooms: Remember
plot.
55. High leverage for an observation indicates that X is far from its mean.
TRUE
By definition, observations have higher leverage when X is far from its mean.
AACSB: Analytic
Blooms: Remember
observations.
56. Autocorrelated errors are not usually a concern for regression models using cross-
sectional data.
TRUE
We more often expect autocorrelated residuals in time series data.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
assumptions.
57. There are usually several possible regression lines that will minimize the sum of
squared errors.
FALSE
The OLS solution for the estimators $b_0$ and $b_1$ is unique.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
plot.
58. When the errors in a regression model are not independent, the regression model is
said to have autocorrelation.
TRUE
For example, in first-order autocorrelation $\varepsilon_t$ depends on $\varepsilon_{t-1}$.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
assumptions.
59. In a simple bivariate regression, $F_{calc} = t_{calc}^2$.
TRUE
This statement is true only in a simple regression (one predictor).
AACSB: Analytic
Blooms: Remember
test.
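The identity in question 59 can be verified numerically. The sketch below fits a simple regression by the closed-form OLS formulas on a small made-up data set (the x and y values are hypothetical), then compares F_calc with the square of t_calc. It also shows that the residuals sum to zero and that SSR + SSE = SST.

```python
import math

# Hypothetical data, invented only to illustrate the identities.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.7]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
ss_xx = sum((xi - x_bar) ** 2 for xi in x)
ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# OLS slope and intercept
b1 = ss_xy / ss_xx
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]

# ANOVA sums of squares
sse = sum(e ** 2 for e in residuals)
ssr = sum((fi - y_bar) ** 2 for fi in fitted)
sst = sum((yi - y_bar) ** 2 for yi in y)

mse = sse / (n - 2)                    # error mean square, d.f. = n - 2
f_calc = (ssr / 1) / mse               # MSR / MSE with one predictor
t_calc = b1 / math.sqrt(mse / ss_xx)   # slope divided by its standard error

print(f_calc, t_calc ** 2)
print(sum(residuals), sst - (ssr + sse))
```

The two printed values on the first line agree up to floating-point rounding, and both values on the second line are essentially zero.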
60. Correlation analysis primarily measures the degree of the linear relationship
between X and Y.
TRUE
The sign of r indicates the direction and its magnitude indicates the degree of
linearity.
AACSB: Analytic
Blooms: Remember
significance.
Multiple Choice Questions
61. The variable used to predict another variable is called the:
A. response variable.
B. regression variable.
C. independent variable.
D. dependent variable.
We might also call the independent variable a predictor of Y.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
equation.
62. The standard error of the regression:
A. is based on squared deviations from the regression line.
B. may assume negative values if $b_1 < 0$.
C. is in squared units of the dependent variable.
D. may be cut in half to get an approximate 95 percent prediction interval.
In a simple regression, the standard error is the square root of the sum of the
squared residuals divided by (n - 2).
AACSB: Analytic
Blooms: Apply
test.
Time = -7.126 + 0.0214 Distance, based on a sample of 20 shipments. The
estimated standard error of the slope is 0.0053. Find the value of $t_{calc}$ to test for zero
slope.
A. 2.46
B. 5.02
C. 4.04
D. 3.15
$t_{calc} = b_1/s_{b_1} = (0.0214)/(0.0053) = 4.038$.
AACSB: Analytic
Blooms: Apply
tests.
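The slope test statistic above is the estimated slope divided by its standard error. A minimal sketch of that arithmetic follows, using the numbers given in the question. The variable names are illustrative only.

```python
b1 = 0.0214      # fitted slope from the question
se_b1 = 0.0053   # estimated standard error of the slope from the question
n = 20           # number of shipments

t_calc = b1 / se_b1   # test statistic for zero slope
df = n - 2            # degrees of freedom in simple regression

print(round(t_calc, 3), df)
```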
Time = -7.126 + .0214 Distance, based on a sample of 20 shipments. The estimated
standard error of the slope is 0.0053. Find the critical value for a right-tailed test to
see if the slope is positive, using α = .05.
A. 2.101
B. 2.552
C. 1.960
D. 1.734
For d.f. = n - 2 = 20 - 2 = 18, Appendix D gives $t_{.05} = 1.734$.
AACSB: Analytic
Blooms: Apply
tests.
65. If the attendance at a baseball game is to be predicted by the equation Attendance
= 16,500 - 75 Temperature, what would be the predicted attendance if Temperature
is 90 degrees?
A. 6,750
B. 9,750
C. 12,250
D. 10,020
The predicted Attendance is 16,500 - 75(90) = 9,750.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
equation.
66. A hypothesis test is conducted at the 5 percent level of significance to test whether
the population correlation is zero. If the sample consists of 25 observations and the
correlation coefficient is 0.60, then the computed test statistic would be:
A. 2.071.
B. 1.960.
C. 3.597.
D. 1.645.
$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.60)[(25 - 2)/(1 - .60^2)]^{1/2} = 3.597$.
Comment: Requires formula handout or memorizing the formula.

AACSB: Analytic
Blooms: Apply
significance.
67. Which of the following is not a characteristic of the F-test in a simple regression?
A. It is a test for overall fit of the model.
B. The test statistic can never be negative.
C. It requires a table with numerator and denominator degrees of freedom.
D. The F-test gives a different p-value than the t-test.
$F_{calc}$ is the ratio of two variances (mean squares) that measures overall fit. The test
statistic cannot be negative because the variances are non-negative. In a simple
regression, the F-test always agrees with the t-test.
AACSB: Analytic
Blooms: Remember
test.
68. A researcher's Excel results are shown below using Femlab (labor force
participation rate among females) to try to predict Cancer (death rate per 100,000
population due to cancer) in the 50 U.S. states.
Which of the following statements is not true?
A. The standard error is too high for this model to be of any predictive use.
B. The 95 percent confidence interval for the coefficient of Femlab is -4.29 to -0.28.
C. Significant correlation exists between Femlab and Cancer at α = .05.
D. The two-tailed p-value for Femlab will be less than .05.
The magnitude of $s_e$ depends on Y (and, in this case, the $t_{calc}$ indicates significance).
AACSB: Analytic
Blooms: Apply
tests.
Which statement is valid regarding the relationship between Femlab and Cancer?
A. A rise in female labor participation rate will cause the cancer rate to decrease
within a state.
B. This model explains about 10 percent of the variation in state cancer rates.
C. At the .05 level of significance, there isn't enough evidence to say the two
variables are related.
D. If your sister starts working, the cancer rate in your state will decline.
It is customary to express the $R^2$ as a percent (here, the $t_{calc}$ indicates significance).
AACSB: Analytic
Blooms: Apply
test.
What is the $R^2$ for this regression?
A. .9018
B. .0982
C. .8395
D. .1605
$R^2 = SSR/SST = (5,377.836)/(54,745.225) = .0982$.
AACSB: Analytic
Blooms: Apply
test.
71. A news network stated that a study had found a positive correlation between the
number of children a worker has and his or her earnings last year. You may
conclude that:
A. people should have more children so they can get better jobs.
B. the data are erroneous because the correlation should be negative.
C. causation is in serious doubt.
D. statisticians have small families.

There is no a priori basis for expecting causation.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
significance.
72. William used a sample of 68 large U.S. cities to estimate the relationship between
Crime and Income (income per capita, in dollars). His estimated regression equation was Crime = 428 +
0.050 Income. We can conclude that:
A. the slope is small so Income has no effect on Crime.
B. crime seems to create additional income in a city.
C. wealthy individuals tend to commit more crimes, on average.
D. the intercept is irrelevant since zero median income is impossible in a large city.
Zero median income makes no sense (significance cannot be assessed from given
facts).
AACSB: Analytic
Blooms: Apply
tests.
73. Mary used a sample of 68 large U.S. cities to estimate the relationship between
Crime and Income (income per capita, in dollars). Her estimated regression equation was Crime = 428 +
0.050 Income. If Income decreases by 1000, we would expect that Crime will:
A. increase by 428.
B. decrease by 50.
C. increase by 500.
D. remain unchanged.
The constant has no effect so ΔCrime = 0.050 ΔIncome = 0.050(-1000) = -50.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
equation.
74. Amelia used a random sample of 100 accounts receivable to estimate the
relationship between Days (number of days from billing to receipt of payment) and
Size (size of balance due in dollars). Her estimated regression equation was Days =
22 + 0.0047 Size with a correlation coefficient of .300. From this information we can
conclude that:
A. 9 percent of the variation in Days is explained by Size.
B. autocorrelation is likely to be a problem.
C. the relationship between Days and Size is significant.
D. larger accounts usually take less time to pay.
$R^2 = .30^2 = .09$. These are not time-series data, so there is no reason to expect
autocorrelation. We cannot judge significance without more information.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
test.
75. Prediction intervals for Y are narrowest when:

A. the mean of X is near the mean of Y.
B. the value of X is near the mean of X.
C. the mean of X differs greatly from the mean of Y.
D. the mean of X is small.
Review the formula, which has $(x_i - \bar{x})^2$ in the numerator. The minimum would be when
$x_i = \bar{x}$.
AACSB: Analytic
Blooms: Remember
Y.
76. If n = 15 and r = .4296, the corresponding t-statistic to test for zero correlation is:
A. 1.715.
B. 7.862.
C. 2.048.
D. impossible to determine without α.
$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.4296)[(15 - 2)/(1 - .4296^2)]^{1/2} = 1.715$.
AACSB: Analytic
Blooms: Apply
significance.
77. Using a two-tailed test at α = .05 for n = 30, we would reject the hypothesis of zero
correlation if the absolute value of r exceeds:
A. .2992.
B. .3609.
C. .0250.
D. .2004.
Use $r_{crit} = t_{.025}/(t_{.025}^2 + n - 2)^{1/2} = (2.048)/(2.048^2 + 30 - 2)^{1/2} = .3609$ for d.f. = 30 - 2 = 28.
AACSB: Analytic
Blooms: Apply
significance.
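The critical-r formula used in question 77 is easy to check numerically. The sketch below is a minimal illustration. The function name critical_r is hypothetical, and the value 2.048 is the table value of t_.025 quoted in the answer, not something computed here.

```python
import math

def critical_r(t_crit, n):
    """Critical value of r for a test of zero correlation with d.f. = n - 2."""
    return t_crit / math.sqrt(t_crit ** 2 + n - 2)

r_crit = critical_r(2.048, 30)  # t_.025 for d.f. = 28, from the answer above
print(round(r_crit, 4))
```

Any sample correlation whose absolute value exceeds this threshold would lead to rejection in the two-tailed test.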
78. The ordinary least squares (OLS) method of estimation will minimize:
A. neither the slope nor the intercept.
B. only the slope.
C. only the intercept.
D. both the slope and intercept.
The OLS method minimizes the sum of squared residuals.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
plot.
79. A standardized residual $e_i = -2.205$ indicates:
A. a rather poor prediction.
B. an extreme outlier in the residuals.

C. an observation with high leverage.
D. a likely data entry error.
This residual is beyond $\pm 2s_e$ but is not an outlier (and without $x_i$ we cannot assess
leverage).
AACSB: Analytic
Blooms: Apply
observations.
80. In a simple regression, which would suggest a significant relationship between X
and Y?
A. Large p-value for the estimated slope
B. Large t statistic for the slope
C. Large p-value for the F statistic
D. Small t-statistic for the slope
The larger the $t_{calc}$, the more we feel like rejecting $H_0: \beta_1 = 0$.
AACSB: Analytic
Blooms: Remember
tests.
81. Which is indicative of an inverse relationship between X and Y?
A. A negative F statistic
B. A negative p-value for the correlation coefficient
C. A negative correlation coefficient
D. Either a negative F statistic or a negative p-value
$F_{calc}$ and the p-value cannot be negative.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
test.
82. Which is not correct regarding the estimated slope of the OLS regression
line?
A. It is divided by its standard error to obtain its t statistic.
B. It shows the change in Y for a unit change in X.
C. It is chosen so as to minimize the sum of squared errors.
D. It may be regarded as zero if its p-value is less than α.
We would reject $H_0: \beta_1 = 0$ if its p-value is less than the level of significance.
AACSB: Analytic
Blooms: Remember
tests.
83. Simple regression analysis means that:
A. the data are presented in a simple and clear way.
B. we have only a few observations.
C. there are only two independent variables.
D. we have only one explanatory variable.
Multiple regression has more than one independent variable (predictor).
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
equation.
84. The sample coefficient of correlation does not have which property?
A. It can range from -1.00 up to +1.00.
B. It is also sometimes called Pearson's r.
C. It is tested for significance using a t-test.
D. It assumes that Y is the dependent variable.
Correlation analysis makes no assumption of causation or dependence.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
significance.
85. When comparing the 90 percent prediction and confidence intervals for a given
regression analysis:
A. the prediction interval is narrower than the confidence interval.
B. the prediction interval is wider than the confidence interval.
C. there is no difference between the size of the prediction and confidence intervals.
D. no generalization is possible about their comparative width.
Individual values of Y vary more than the mean of Y.
AACSB: Analytic
Blooms: Remember
Difficulty: 1 Easy
Y.
86. Which is not true of the coefficient of determination?
A. It is the square of the coefficient of correlation.
B. It is negative when there is an inverse relationship between X and Y.
C. It reports the percent of the variation in Y explained by X.
D. It is calculated using sums of squares (e.g., SSR, SSE, SST).
$R^2$ cannot be negative.
AACSB: Analytic
Blooms: Remember
test.
87. If the fitted regression is Y = 3.5 + 2.1X ($R^2 = .25$, n = 25), it is incorrect to conclude
that:
A. Y increases 2.1 percent for a 1 percent increase in X.
B. the estimated regression line crosses the Y axis at 3.5.

C. the sample correlation coefficient must be positive.
D. the value of the sample correlation coefficient is 0.50.
Units are not percent unless Y is already a percent.
AACSB: Analytic
Blooms: Apply
equation.
88. In a simple regression $Y = b_0 + b_1 X$ where Y = number of robberies in a city
(thousands of robberies), X = size of the police force in a city (thousands of police),
and n = 45 randomly chosen large U.S. cities in 2008, we would be least likely to
see which problem?
A. Autocorrelated residuals (because this is time-series data)
B. Heteroscedastic residuals (because we are using totals uncorrected for city size)
C. Nonnormal residuals (because a few larger cities may skew the residuals)
D. High leverage for some observations (because some cities may be huge)
It is not a time series, so autocorrelation would not be expected, but the "size effect"
is likely to produce heteroscedasticity, nonnormality, and unusual leverage.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
assumptions.
89. When homoscedasticity exists, we expect that a plot of the residuals versus the
fitted Y:
A. will form approximately a straight line.
B. crosses the centerline too many times.
C. will yield a Durbin-Watson statistic near 2.
D. will show no pattern at all.
Homoscedastic residuals exhibit no pattern (equal variance for all Y).
AACSB: Analytic
Blooms: Understand
assumptions.
90. Which statement is not correct?
A. Spurious correlation can often be reduced by expressing X and Y in per capita terms.
B. Autocorrelation is mainly a concern if we are using time-series data.
C. Heteroscedastic residuals will have roughly the same variance for any value of X.
D. Standardized residuals make it easy to identify outliers or instances of poor fit.
Heteroscedastic residuals exhibit different variance for different X or Y values.
AACSB: Analytic
Blooms: Understand
assumptions.
91. In a simple bivariate regression with 25 observations, which statement is most
nearly correct?
A. A non-standardized residual whose value is $e_i = 4.22$ would be considered an outlier.
B. A leverage statistic of 0.16 or more would indicate high leverage.
C. Standardizing the residuals will eliminate any heteroscedasticity.
D. Non-normal residuals imply biased coefficient estimates, a major problem.
For simple regression, the "high leverage criterion" is $h_i > 4/n = 4/25 = .16$. We
cannot judge a residual's magnitude without knowing the standard error $s_e$.
Standardizing is only a scale shift so does not reduce heteroscedasticity. Non-normal
errors do not bias the OLS estimates.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
observations.
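To make the leverage criterion concrete, the sketch below computes leverage for each observation in a simple regression and flags those above 4/n. It assumes the standard simple-regression leverage formula h_i = 1/n + (x_i - xbar)^2/SSxx. The X values and the function name leverages are hypothetical.

```python
def leverages(x):
    """Leverage h_i of each observation in a simple regression (depends on X only)."""
    n = len(x)
    x_bar = sum(x) / n
    ss_xx = sum((xi - x_bar) ** 2 for xi in x)
    return [1 / n + (xi - x_bar) ** 2 / ss_xx for xi in x]

# Made-up X values; the last one lies far from the mean.
x = [10, 12, 13, 15, 16, 18, 19, 21, 22, 60]
h = leverages(x)
cutoff = 4 / len(x)  # high-leverage criterion quoted in the answer above

flagged = [xi for xi, hi in zip(x, h) if hi > cutoff]
print(flagged)
```

Note that no Y values enter the calculation, which is why a high-leverage observation can still have a small residual.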
92. A regression was estimated using these variables: Y = annual value of reported
bank robbery losses in all U.S. banks ($millions), X = annual value of currency held
by all U.S. banks ($millions), n = 100 years (1912 through 2011). We would not
anticipate:
A. autocorrelated residuals due to time-series data.
B. heteroscedastic residuals due to the wide variation in data magnitudes.
C. nonnormal residuals due to skewed data as bank size increases over time.
D. a negative slope because banks hold less currency when they are robbed.
It is a time series, so autocorrelation would be expected, and the "size effect" is
likely to produce heteroscedasticity and nonnormality, but growth in both X and Y
would yield a positive slope.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
assumptions.
93. A fitted regression for an exam in Prof. Hardtack's class showed Score = 20 + 7
Study, where Score is the student's exam score and Study is the student's study
hours. The regression yielded $R^2 = 0.50$ and SE = 8. Bob studied 9 hours. The quick
95 percent prediction interval for Bob's grade is approximately:
A. 69 to 97.
B. 75 to 91.
C. 67 to 99.
D. 76 to 90.
The quick interval is $\hat{y} \pm 2s_e$ or 83 ± (2)(8) or 83 ± 16.
AACSB: Analytic
Blooms: Apply
Y.
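The quick-interval rule in question 93 amounts to two lines of arithmetic. The sketch below reproduces it with the numbers from the question. The function name quick_prediction_interval is an illustrative choice.

```python
def quick_prediction_interval(y_hat, s_e):
    """Approximate 95 percent prediction interval: y_hat plus or minus 2 standard errors."""
    return y_hat - 2 * s_e, y_hat + 2 * s_e

y_hat = 20 + 7 * 9   # fitted Score for Study = 9 hours
low, high = quick_prediction_interval(y_hat, 8)
print(y_hat, low, high)
```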
94. Which is not an assumption of least squares regression?
A. Normal X values
B. Non-autocorrelated errors
C. Homoscedastic errors
D. Normal errors
The predictor X is not assumed to be a random variable at all.
AACSB: Analytic
Blooms: Apply
plot.
95. In a simple bivariate regression with 60 observations there will be _____ residuals.
A. 60
B. 59
C. 58
D. 57
There is one residual for every observation.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
Learning Objective: 12-03 Make a prediction for a given x value using a regression
equation.
96. Which is correct to find the value of the coefficient of determination ($R^2$)?
A. SSR/SSE
B. SSR/SST
C. 1 - SST/SSE
We use the ANOVA sums of squares to calculate $R^2$.
AACSB: Analytic
Blooms: Remember
test.
97. The critical value for a two-tailed test of $H_0: \beta_1 = 0$ at α = .05 in a simple regression
with 22 observations is:
A. ±1.725
B. ±2.086
C. ±2.528
D. ±1.960
From Appendix D, $t_{crit} = \pm 2.086$ for d.f. = n - 2 = 22 - 2 = 20.
AACSB: Analytic
Blooms: Apply
tests.
98. In a sample of size n = 23, a sample correlation of r = .400 provides sufficient
evidence to conclude that the population correlation coefficient exceeds zero in a
right-tailed test at:
A. α = .01 but not α = .05.
B. α = .05 but not α = .01.
C. both α = .05 and α = .01.

$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.40)[(23 - 2)/(1 - .40^2)]^{1/2} = 2.000 > t_{.05} = 1.721$ for d.f. = 23 - 2 =
21. However, the test would not be significant for $t_{.01} = 2.518$.
AACSB: Analytic
Blooms: Apply
significance.
99. In a sample of n = 23, the Student's t test statistic for a correlation of r = .500 would
be:
A. 2.559.
B. 2.819.
C. 2.646.
$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.50)[(23 - 2)/(1 - .50^2)]^{1/2} = 2.646$.
AACSB: Analytic
Blooms: Apply
significance.
A. ±.524
B. ±.412
C. ±.500
D. ±.497
Use $r_{crit} = t_{.025}/(t_{.025}^2 + n - 2)^{1/2} = (2.069)/(2.069^2 + 23 - 2)^{1/2} = .4115$ for d.f. = 23 - 2 = 21.
AACSB: Analytic
Blooms: Apply
significance.
A. ±2.229
B. ±2.819
C. ±2.646
D. ±2.080

.025
AACSB: Analytic
Blooms: Apply
tests.
102. In a sample of n = 40, a sample correlation of r = .400 provides sufficient evidence
to conclude that the population correlation coefficient exceeds zero in a right-tailed
test at:
A. α = .025 but not α = .05.
B. α = .05 but not α = .025.
C. both α = .025 and α = .05.

$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.40)[(40 - 2)/(1 - .40^2)]^{1/2} = 2.690 > t_{.025} = 2.024$ for d.f. = 40 - 2
= 38. The test would also be significant a fortiori if we used $t_{.05} = 1.686$.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
significance.
103. In a sample of n = 20, the Student's t test statistic for a correlation of r = .400 would
be:
A. 2.110
B. 1.645
C. 1.852
D. can't say without knowing if it's a two-tailed or one-tailed test.
$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (.40)[(20 - 2)/(1 - .40^2)]^{1/2} = 1.852$.
AACSB: Analytic
Blooms: Apply
significance.
A. ±.587
B. ±.412
C. ±.444
D. ±.497
Use $r_{crit} = t_{.025}/(t_{.025}^2 + n - 2)^{1/2} = (2.101)/(2.101^2 + 20 - 2)^{1/2} = .4437$ for d.f. = 20 - 2 = 18.
AACSB: Analytic
Blooms: Apply
significance.
A. ±2.060
B. ±2.052
C. ±2.898
D. ±2.074

.025
AACSB: Analytic
Blooms: Apply
tests.
106. In a sample of size n = 36, a sample correlation of r = -.450 provides sufficient
evidence to conclude that the population correlation coefficient differs significantly
from zero in a two-tailed test at:
A. α = .01
B. α = .05
C. both α = .01 and α = .05.

$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (-.45)[(36 - 2)/(1 - (-.45)^2)]^{1/2} = -2.938 < t_{.005} = -2.728$ for d.f. =
34. The test would also be significant a fortiori if we used $t_{.025} = -2.032$.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
significance.
107. In a sample of n = 36, the Student's t test statistic for a correlation of r = -.450
would be:
A. -2.110.
B. -2.938.
C. -2.030.
$t_{calc} = r[(n - 2)/(1 - r^2)]^{1/2} = (-.45)[(36 - 2)/(1 - (-.45)^2)]^{1/2} = -2.938$.
AACSB: Analytic
Blooms: Apply
significance.
A. ±.329
B. ±.387
C. ±.423
D. ±.497
Use $r_{crit} = t_{.025}/(t_{.025}^2 + n - 2)^{1/2} = (2.032)/(2.032^2 + 36 - 2)^{1/2} = .3291$ for d.f. = 36 - 2 = 34.
AACSB: Analytic
Blooms: Apply
significance.
. significance of the slope for a simple regression at α = .05 is:
A. 2.938
B. 2.724
C. 2.032
D. 2.074

.025
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
tests.
110. A local trucking company fitted a regression to relate the travel time (days) of its
shipments as a function of the distance traveled (miles). The fitted regression is
Time = -7.126 + 0.0214 Distance. If Distance increases by 50 miles, the expected
Time would increase by:
A. 1.07 days
B. 7.13 days
C. 2.14 days
D. 1.73 days
50(0.0214) = 1.07.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
equation.
111. A local trucking company fitted a regression to relate the cost of its shipments as a
function of the distance traveled. The Excel fitted regression is shown.
Based on this estimated relationship, when distance increases by 50 miles, the
expected shipping cost would increase by:
A. $286.
B. $143.
C. $104.
D. $301.
2.8666(50) = $143.33.
AACSB: Analytic
Blooms: Apply
Difficulty: 1 Easy
equation.
112. If SSR is 2592 and SSE is 608, then:
A. the slope is likely to be insignificant.
B. the coefficient of determination is .81.
C. the SST would be smaller than SSR.

D. the standard error would be large.
$R^2 = SSR/SST = SSR/(SSR + SSE) = 2592/(2592 + 608) = .81$. SST cannot be
smaller than SSR because SST = SSR + SSE. The significance and standard error
cannot be judged without more information.
AACSB: Analytic
Blooms: Apply
test.
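The calculation in question 112 can be reproduced directly from the two sums of squares. The sketch below is a minimal illustration that relies on the identity SST = SSR + SSE stated in the answer. The function name r_squared is hypothetical.

```python
def r_squared(ssr, sse):
    """Coefficient of determination from the ANOVA sums of squares."""
    sst = ssr + sse          # identity: SST = SSR + SSE
    return ssr / sst

r2 = r_squared(2592, 608)
print(r2)
```

Because SSE is non-negative, the ratio can never exceed 1, which is another way to see that SST cannot be smaller than SSR.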
113. Find the sample correlation coefficient for the following data.
A. .8911
B. .9124
C. .9822
D. .9556
Use Excel =CORREL(XData, YData) to verify your calculation using the formula for
r.
AACSB: Analytic
Blooms: Apply
significance.
114. Find the slope of the simple regression $\hat{y} = b_0 + b_1 x$.
A. 1.833
B. 3.294
C. 0.762
D. -2.228
Use Excel to verify your calculations using the formulas for $b_0$ and $b_1$.
AACSB: Analytic
Blooms: Apply
plot.
115. Find the sample correlation coefficient for the following data.
A. .7291
B. .8736
C. .9118
D. .9563
Use Excel =CORREL(XData, YData) to verify your calculation using the formula for
r.
AACSB: Analytic
Blooms: Apply
significance.
116. Find the slope of the simple regression $\hat{y} = b_0 + b_1 x$.
A. 2.595
B. 1.109
C. -2.221
D. 1.884
Use Excel to verify your calculations using the formulas for $b_0$ and $b_1$.
AACSB: Analytic
Blooms: Apply
plot.
117. A researcher's results are shown below using n = 25 observations.
A. [ -3.282, -1.284].
B. [ -4.349, -0.217].
C. [1.118, 5.026].
D. [ -0.998, +0.998].
For d.f. = n - 2 = 25 - 2 = 23, $t_{.025} = 2.069$, so -2.2834 ± (2.069)(0.99855).
AACSB: Analytic
Blooms: Apply
Learning Objective: 12-05 Calculate and interpret confidence intervals for regression
coefficients.
118. A researcher's regression results are shown below using n = 8
observations.
A. [1.333, 2.284].
B. [1.602, 2.064].
C. [1.268, 2.398].
D. [1.118, 2.449].
For d.f. = n - 2 = 8 - 2 = 6, $t_{.025} = 2.447$, so 1.8333 ± (2.447)(0.2307).
AACSB: Analytic
Blooms: Apply
Learning Objective: 12-05 Calculate and interpret confidence intervals for regression
coefficients.
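The interval in question 118 is the slope estimate plus or minus the critical t times the standard error of the slope. The sketch below carries out that arithmetic with the values quoted in the answer. The function name is an illustrative choice, and the critical value 2.447 is taken from the answer rather than computed.

```python
def slope_confidence_interval(b1, se_b1, t_crit):
    """Confidence interval for the slope: b1 plus or minus t_crit * se_b1."""
    margin = t_crit * se_b1
    return b1 - margin, b1 + margin

low, high = slope_confidence_interval(1.8333, 0.2307, 2.447)
print(round(low, 3), round(high, 3))
```

Since the interval excludes zero, the slope differs significantly from zero in the corresponding two-tailed test.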
119. Bob thinks there is something wrong with Excel's fitted regression. What do you
say?
A. The estimated equation is obviously incorrect.
B. The $R^2$ looks a little high but otherwise it looks OK.
C. Bob needs to increase his sample size to decide.
D. The relationship is linear, so the equation is credible.
A visual estimate of the slope is Δy/Δx = (625 - 100)/(200 - 0) = 2.625, so the
indicated slope less than 1 must be wrong, plus the visual intercept is 100 (not
154.61) and the fit seems better than $R^2 = .2284$.
AACSB: Analytic
Blooms: Apply
Difficulty: 3 Hard
plot.
Short Answer Questions
120. Pedro became interested in vehicle fuel efficiency, so he performed a simple
regression using 93 cars to estimate the model $CityMPG = \beta_0 + \beta_1 Weight$ where
Weight is the weight of the vehicle in pounds. His results are shown below. Write a
brief analysis of these results, using what you have learned in this chapter. Is the
intercept meaningful in this regression? Make a prediction of CityMPG when
Weight = 3000, and also when Weight = 4000. Do these predictions seem
believable? If you could make a car 1000 pounds lighter, what change would you
predict in its CityMPG?
It is reasonable that a causal relationship might exist between a vehicle's weight

and its MPG. We expect a negative slope (heavier vehicles would get lower MPG).
The coefficient of Weight differs from zero at any common value of α (the p-value
is less than .0001) and the F statistic is huge. The confidence interval for the
coefficient of the predictor Weight does not include zero. The highly significant
predictor Weight is consistent with the high coefficient of determination ($R^2 = .711$),
which says that well over half the variation in MPG is explained by Weight. If
Weight = 3000, we predict MPG = 47.0484 - .0080 Weight = 47.0484 - .0080(3000)
= 23.05 mpg. If Weight = 4000, we predict MPG = 47.0484 - .0080 Weight =
47.0484 - .0080(4000) = 15.05 mpg. The intercept is not meaningful since no
vehicle has zero weight or a weight close to zero.
Feedback: It is reasonable to postulate that a causal relationship might exist

between a vehicle's weight and its MPG. Our a priori expectation would be that the
slope should be negative since we would expect that heavier vehicles would get
lower MPG. The coefficient of Weight differs from zero at any common value of α
(the p-value is less than .0001) and the F statistic is huge. The confidence interval
for the coefficient of the predictor Weight does not include zero. The slope's sign is
negative, as anticipated a priori. The highly significant predictor Weight is
consistent with the high coefficient of determination ($R^2 = .711$), which says that
well over half the variation in MPG is explained by Weight. If Weight = 3000, we
predict MPG = 47.0484 - .0080 Weight = 47.0484 - .0080(3000) = 23.05 mpg.
When Weight = 4000, we would predict MPG = 47.0484 - .0080 Weight = 47.0484
- .0080(4000) = 15.05 mpg. The intercept is not meaningful since no vehicle has
zero weight or any weight close to zero.
AACSB: Reflective Thinking

Blooms: Evaluate
Difficulty: 3 Hard
tests.
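The two predictions in the CityMPG feedback above, and the effect of a 1000-pound weight reduction, can be checked with a few lines of Python. This is only a minimal sketch: it takes the fitted coefficients (47.0484 and -.0080) as reported in the feedback, and the function name predict_city_mpg is illustrative, not part of the original exercise.

```python
# Fitted coefficients as quoted in the feedback (assumed correct as reported).
INTERCEPT = 47.0484   # mpg when Weight = 0 (not meaningful; far outside the data)
SLOPE = -0.0080       # change in predicted mpg per additional pound


def predict_city_mpg(weight_lb: float) -> float:
    """Predicted CityMPG from the fitted line (hypothetical helper)."""
    return INTERCEPT + SLOPE * weight_lb


mpg_3000 = predict_city_mpg(3000)
mpg_4000 = predict_city_mpg(4000)

# A car 1000 pounds lighter: difference between the two predictions.
gain_1000_lb_lighter = mpg_3000 - mpg_4000

print(f"Weight = 3000: {mpg_3000:.2f} mpg")   # 23.05
print(f"Weight = 4000: {mpg_4000:.2f} mpg")   # 15.05
print(f"1000 lb lighter: +{gain_1000_lb_lighter:.1f} mpg")  # +8.0
```

The 8 mpg difference is simply the slope times 1000, so it does not depend on the starting weight, provided both weights lie within the range of the sample data.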
121. Mary noticed that old coins are smoother and more worn. She weighed 31 nickels
and recorded their age, and then performed a simple regression to estimate the
model Weight = β₀ + β₁ Age where Weight is the weight of the coin in grams and
Age is the age of the coin in years. Her results are shown below. Write a brief
analysis of these results, using what you have learned in this chapter. Make a
prediction of Weight when Age = 10, and also when Age = 20. What does this tell
you? Is the intercept meaningful in this regression?
It is reasonable to postulate a causal relationship between a coin's age and its
weight (negative slope, since we would expect that coins will wear down with
usage). The coefficient of Age differs from zero at any common α (the p-value is
less than .0001) and the F test statistic is large. The confidence interval for the
coefficient of Age does not include zero, and its sign is negative, as anticipated a
priori. Despite the significant predictor Age, the coefficient of determination
(R² = .442) shows that less than half the variation in nickel weights is explained by
Age. If Age = 10, we predict Weight = 5.0210 - .0040 Age = 5.0210 - .0040(10) =
4.981 gm. If Age = 20, we predict Weight = 5.0210 - .0040 Age = 5.0210
- .0040(20) = 4.941 gm. The intercept is meaningful if Age = 0 was in the sample
data set (or at least some Age value near zero). The intercept is logically
meaningful because Age = 0 is something we might observe (i.e., a newly minted
nickel).
Feedback: It is reasonable to postulate that a causal relationship might exist
between a coin's age and its weight. Our a priori expectation would be that the
slope should be negative since we would expect that coins will wear down with
usage. The coefficient of Age differs from zero at any common value of α (the
p-value is less than .0001) and the F test statistic is quite large. The confidence
interval for the coefficient of Age does not include zero, and its sign is negative, as
anticipated a priori. Despite the highly significant predictor Age, the coefficient of
determination (R² = .442) shows that less than half the variation in nickel weights is
explained by Age. Our predictions: If Age = 10, we would predict Weight = 5.0210 -
.0040 Age = 5.0210 - .0040(10) = 4.981 gm. If Age = 20, we would predict Weight
= 5.0210 - .0040 Age = 5.0210 - .0040(20) = 4.941 gm. The intercept is
meaningful, assuming that Age = 0 years was included in the sample data set (or
at least some Age value near zero). The intercept is logically meaningful a priori
because Age = 0 is something we might easily observe (i.e., a newly minted
nickel).
AACSB: Reflective Thinking

Blooms: Evaluate
Difficulty: 3 Hard
tests.
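The nickel predictions above can be checked the same way, and the reported R² = .442 with n = 31 can be turned into the implied correlation and test statistics. The sketch below assumes the quoted coefficients and R² are correct and come from all 31 coins. The regression output table is not reproduced here, so r, t, and F are derived values, not the reported ones, and the name predict_weight is illustrative.

```python
import math

# Values quoted in the question and feedback (assumed correct as reported).
INTERCEPT = 5.0210   # predicted weight in grams of a newly minted nickel (Age = 0)
SLOPE = -0.0040      # change in predicted grams per year of age
R_SQUARED = 0.442
N = 31               # number of nickels weighed


def predict_weight(age_years: float) -> float:
    """Predicted coin weight in grams from the fitted line (hypothetical helper)."""
    return INTERCEPT + SLOPE * age_years


w10 = predict_weight(10)
w20 = predict_weight(20)
loss_per_decade = w10 - w20   # predicted wear over ten years

# In simple regression r is the square root of R², carrying the slope's sign.
r = math.copysign(math.sqrt(R_SQUARED), SLOPE)

# Test statistic for zero correlation, with n - 2 degrees of freedom.
t_stat = r * math.sqrt((N - 2) / (1 - R_SQUARED))
f_stat = t_stat ** 2          # in simple regression, F equals t squared

print(f"Age = 10: {w10:.3f} g")   # 4.981
print(f"Age = 20: {w20:.3f} g")   # 4.941
print(f"Loss per decade: {loss_per_decade:.3f} g")
print(f"r = {r:.4f}, t = {t_stat:.3f}, F = {f_stat:.2f}")
```

The predicted loss of about 0.04 g per decade is small relative to a 5 g coin, which fits a relationship that is statistically significant yet explains less than half of the variation in weight.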

