Tutorial 5 - Linear Regression S and A
Reject $H_0$.
There is a relationship between the Statistics and CPA scores at 2% significance level.
d)
e) The estimated CPA score for a student who scored zero marks in Statistics is 0.8110.
For every one-mark increase in Statistics, the CPA score increases by 0.0323.
f) Reject $H_0$.
Statistics scores have a significant positive effect on CPA scores at the 2.5%
significance level.
g) ()
h)
91.01% of the variation in CPA scores can be explained by the variation in the
Statistics scores.
Only 8.99% is unexplained, due to error.
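As a rough numerical check of parts e) and h), the sketch below evaluates the fitted line implied by the intercept 0.8110 and slope 0.0323 quoted above. The function name `predict_cpa` and the sample marks are illustrative assumptions, not part of the tutorial.

```python
# Intercept and slope as interpreted in part e); R^2 as stated in part h).
INTERCEPT = 0.8110
SLOPE = 0.0323
R_SQUARED = 0.9101


def predict_cpa(statistics_mark: float) -> float:
    """Estimated CPA score for a given Statistics mark (hypothetical helper)."""
    return INTERCEPT + SLOPE * statistics_mark


# Hypothetical marks, chosen only to illustrate the intercept and the slope.
for mark in (0, 50, 51):
    print(f"mark = {mark:3d}   estimated CPA = {predict_cpa(mark):.4f}")

# Share of the variation in CPA scores not explained by Statistics scores.
unexplained = 1 - R_SQUARED
print(f"unexplained share = {unexplained:.2%}")
```

The difference between the estimates at 51 and 50 marks is the slope, which is what the interpretation in part e) describes.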
2. An architect wants to determine the relationship between the height (in feet) of a building
(y) and the number of stories in the building (x). The following results are based on a
sample of ten buildings.
[Scatter plot: height of building y (feet, axis from 400 to 900) against number of stories x (axis from 30 to 70)]
Hint:
$\sum x = 444$, $\sum y = 5968$, $\sum xy = 275237$
$S_{xx} = 870.4$, $S_{yy} = 123921.6$, $b_0 = 73.5391$
a) Does the scatter plot suggest an approximate linear relationship? Explain.
b) Determine the strength of the relationship between the heights of a building and the
number of stories in the building. Interpret the value.
c) Fit a least squares line.
d) Can we conclude that the number of stories in a building has a significant positive
effect on its height at the 5% significance level?
Solution:
a) Yes. The data points fluctuate closely around an upward-sloping straight line.
b) $S_{xy} = 275237 - \dfrac{(444)(5968)}{10} = 10257.8$
$r = \dfrac{10257.8}{\sqrt{(870.4)(123921.6)}} = 0.9877$
The correlation coefficient suggests a strong positive linear relationship between
heights of a building and the number of stories in the building.
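The arithmetic in part b) can be re-checked with a few lines of Python. This is a minimal sketch that uses only the summary values given in the hint; the variable names are my own.

```python
import math

# Summary values from the hint (n = 10 buildings).
n = 10
sum_x, sum_y, sum_xy = 444, 5968, 275237
s_xx, s_yy = 870.4, 123921.6

# S_xy = sum(xy) - sum(x) * sum(y) / n
s_xy = sum_xy - sum_x * sum_y / n

# Pearson correlation coefficient: r = S_xy / sqrt(S_xx * S_yy)
r = s_xy / math.sqrt(s_xx * s_yy)

print(f"S_xy = {s_xy:.1f}")
print(f"r    = {r:.4f}")
```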
c) $b_1 = \dfrac{S_{xy}}{S_{xx}} = \dfrac{10257.8}{870.4} = 11.7852$
$\hat{y} = 73.5391 + 11.7852x$
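A short sketch of the least squares fit in part c), again from the summary values only. The helper `predict_height` and the 50-story input are hypothetical, added purely for illustration.

```python
# Summary values from the hint and from part b).
n = 10
sum_x, sum_y = 444, 5968
s_xx, s_xy = 870.4, 10257.8

# Slope and intercept of the least squares line.
b1 = s_xy / s_xx                       # b1 = S_xy / S_xx
b0 = sum_y / n - b1 * (sum_x / n)      # b0 = y_bar - b1 * x_bar


def predict_height(stories: float) -> float:
    """Estimated height in feet for a given number of stories (hypothetical helper)."""
    return b0 + b1 * stories


print(f"b1 = {b1:.4f}, b0 = {b0:.4f}")
print(f"estimated height of a 50-story building: {predict_height(50):.1f} feet")
```

The intercept computed here agrees with the value of $b_0$ supplied in the hint.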
d) $H_0: \beta_1 = 0$
$H_1: \beta_1 > 0$
$S_e^2 = \dfrac{123921.6 - 11.7852(10257.8)}{8} = 378.9219$
$t_{test} = \dfrac{11.7852 - 0}{\sqrt{378.9219 / 870.4}} = 17.8616$
$t_{0.05,\,8} = 1.8595$
Reject $H_0$.
There is enough evidence to conclude that the number of stories in a building has a
significant positive effect on its height at the 5% significance level.
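The test in part d) can be reproduced as below. This is a sketch under two assumptions: the slope is rounded to four decimals, as in the solution, and the critical value 1.8595 is taken from the solution (a t-table lookup) rather than computed.

```python
import math

n = 10
s_xx, s_yy, s_xy = 870.4, 123921.6, 10257.8
b1 = 11.7852            # slope rounded to four decimals, as in the solution

# Residual variance: S_e^2 = (S_yy - b1 * S_xy) / (n - 2)
s_e2 = (s_yy - b1 * s_xy) / (n - 2)

# Test statistic for H0: beta_1 = 0 against H1: beta_1 > 0
t_test = (b1 - 0) / math.sqrt(s_e2 / s_xx)

# Critical value t_{0.05, 8}, copied from the solution, not computed here.
t_critical = 1.8595

reject_h0 = t_test > t_critical
print(f"S_e^2 = {s_e2:.4f}, t_test = {t_test:.4f}, reject H0: {reject_h0}")
```

Using the unrounded slope changes the test statistic only in the second decimal place, so the conclusion is unaffected.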
3. Suppose that the sales manager of a large automotive parts distributor wants to estimate
the total annual sales of a region. Several factors appear to be related to sales, including
the number of retail outlets ($X_1$), number of automobiles registered ($X_2$), personal incomes
($X_3$), average age of automobiles ($X_4$) and number of supervisors ($X_5$). The following
output shows the results of the analysis obtained by the sales manager. Based on the output,
answer the following questions.
ANOVA(b)
Model            Sum of Squares    df    Mean Square    F          Sig.
1  Regression    1594.237          5     318.847        148.003    .000(a)
   Residual      8.617             4     2.154
   Total         1602.855          9
a Predictors: (Constant), x5, x3, x2, x4, x1
b Dependent Variable: sales
Coefficients(a)
                  Unstandardized Coefficients    Standardized Coefficients
Model             B          Std. Error          Beta                         t         Sig.
1  (Constant)     -20.157    5.041                                            -3.998    .016
   x1             .000       .003                -.020                        -.148     .889
   x2             1.696      .514                .311                         3.299     .030
   x3             .425       .043                .922                         9.775     .001
   x4             2.316      .932                .144                         2.483     .068
   x5             -.145      .203                -.042                        -.714     .515
a Dependent Variable: sales
Correlations
                              sales       x1          x2          x3          x4       x5
sales  Pearson Correlation    1           .899(**)    .604        .962(**)    -.369    .243
       Sig. (2-tailed)                    .000        .064        .000        .294     .500
x1     Pearson Correlation    .899(**)    1           .775(**)    .820(**)    -.504    .144
       Sig. (2-tailed)        .000                    .008        .004        .137     .691
x2     Pearson Correlation    .604        .775(**)    1           .400        -.314    .364
       Sig. (2-tailed)        .064        .008                    .252        .377     .301
x3     Pearson Correlation    .962(**)    .820(**)    .400        1           -.439    .115
       Sig. (2-tailed)        .000        .004        .252                    .204     .751
x4     Pearson Correlation    -.369       -.504       -.314       -.439       1        .471
       Sig. (2-tailed)        .294        .137        .377        .204                 .169
x5     Pearson Correlation    .243        .144        .364        .115        .471     1
       Sig. (2-tailed)        .500        .691        .301        .751        .169
** Correlation is significant at the 0.01 level (2-tailed).
a) Write down the estimated equation of the regression line.
b) Is there sufficient evidence to indicate that there is a positive relationship between
sales and $X_2$ at the 2.5% level of significance?
c) At the 5% significance level, test the overall validity of the model.
d) Which explanatory variables have no significant effect on Y at the 5% significance level?
e) Which variable(s) have a negative relationship with $X_1$?
f) Which two variables have the strongest relationship?
g) Describe the strength and direction of the relationship between $X_5$ and the dependent variable.
h) State the value of the coefficient of determination and interpret it.
Solution:
a) $\hat{y} = -20.157 + 0.000x_1 + 1.696x_2 + 0.425x_3 + 2.316x_4 - 0.145x_5$
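b) $H_0: \beta_2 = 0$, $H_1: \beta_2 > 0$
$t_{test} = 3.299$; the reported Sig. value .030 is two-tailed, so the one-tailed p-value is $0.030 / 2 = 0.015 < 0.025$.
Reject $H_0$.
There is sufficient evidence of a positive relationship between sales and $X_2$ at the 2.5% significance level.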
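c) $H_0: \beta_1 = \beta_2 = \beta_3 = \beta_4 = \beta_5 = 0$, $H_1$: at least one $\beta_i \neq 0$
$F = 148.003$, Sig. $= .000 < 0.05$
Reject $H_0$.
The model is valid at the 5% significance level.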
d) $X_1$, $X_4$ and $X_5$
e) $X_4$
f) $X_3$ and sales
g) There is a weak positive relationship between $X_5$ and the dependent variable.
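The answers to parts e), f) and g) can be read off the Correlations table mechanically. The sketch below copies the Pearson correlations from that table into a nested list; the variable names are my own.

```python
# Pearson correlations copied from the Correlations table (row/column order as in the table).
names = ["sales", "x1", "x2", "x3", "x4", "x5"]
corr = [
    [1.000, 0.899, 0.604, 0.962, -0.369, 0.243],
    [0.899, 1.000, 0.775, 0.820, -0.504, 0.144],
    [0.604, 0.775, 1.000, 0.400, -0.314, 0.364],
    [0.962, 0.820, 0.400, 1.000, -0.439, 0.115],
    [-0.369, -0.504, -0.314, -0.439, 1.000, 0.471],
    [0.243, 0.144, 0.364, 0.115, 0.471, 1.000],
]

# e) Variables with a negative correlation with x1.
i_x1 = names.index("x1")
negative_with_x1 = [names[j] for j in range(len(names)) if corr[i_x1][j] < 0]

# f) Pair of distinct variables with the largest absolute correlation.
pairs = [
    (abs(corr[i][j]), names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
]
strongest = max(pairs)

# g) Correlation between x5 and the dependent variable (sales).
r_x5_sales = corr[names.index("sales")][names.index("x5")]

print("negative with x1:", negative_with_x1)
print("strongest pair:", strongest[1], "and", strongest[2], "with |r| =", strongest[0])
print("r(x5, sales) =", r_x5_sales)
```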
h) $R^2 = 0.9946$
99.46% of the variation in total annual sales can be explained by the variation in
number of retail outlets ($X_1$), number of automobiles registered ($X_2$), personal incomes
($X_3$), average age of automobiles ($X_4$) and number of supervisors ($X_5$).
Only 0.54% is unexplained, due to error.
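Parts c), d) and h) can be cross-checked against the SPSS output with the sketch below. The numbers are copied from the ANOVA and Coefficients tables; the variable names and the 0.05 threshold constant are my own. Small differences from the printed F value are expected because the table entries are rounded.

```python
# ANOVA table entries.
ss_regression, df_regression = 1594.237, 5
ss_residual, df_residual = 8.617, 4
ss_total = 1602.855

# c) F = MS_regression / MS_residual
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_statistic = ms_regression / ms_residual

# h) R^2 = SS_regression / SS_total
r_squared = ss_regression / ss_total

# d) Two-tailed Sig. values of the slope coefficients from the Coefficients table.
p_values = {"x1": 0.889, "x2": 0.030, "x3": 0.001, "x4": 0.068, "x5": 0.515}
ALPHA = 0.05
not_significant = [name for name, p in p_values.items() if p > ALPHA]

print(f"F = {f_statistic:.3f}")
print(f"R^2 = {r_squared:.4f}")
print("not significant at 5%:", not_significant)
```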
4. The electric power consumed (y) each month by a chemical plant is thought to be related to
the average ambient temperature ( ), the average product purity ( ) [...]
[...] = 25 °F, = 24 days, = 15% and = 98 tons.
m) How well does the model fit the data?
Solution:
a) n = 15
b) The average ambient temperature, the number of days in the month, the average product
purity and the tons of product produced.
c) The electric power consumed (y)
d) 3.716
e) The average product purity (
),
= 0.890
f) Reject $H_0$.
The relationship is significant at the 5% significance level.
g)
h) When all the independent variables are zero, the estimated power consumption is 3.716.
Assuming the other variables are constant, for every 1 °F increase in temperature, the
power consumption decreases by 1.4.
i)
and
j)
and
k) Fail to reject $H_0$.
The tons of product produced have no significant effect at the 2% level of significance,
so this variable should not be included in the model.
l) () () () ()
m)
94.6% of the variation in electric power consumed can be explained by the variation in
average ambient temperature, number of days in the month, average product purity
and tons of product produced.
Only 5.4% is unexplained, due to error.