
QMB3200 - Project Make-Up

20 points - Due Apr 25 @ midnight

Watson & Watson Repair Inc. provides maintenance service for a large apartment complex in downtown Saint
Petersburg, Florida. The managers are evaluating the possibility of hiring another maintenance person because of
the increase in maintenance calls. Rafael Roddick and Andy Nadal are currently responsible for maintenance
tasks. To investigate what drives repair time and to be able to hire the best candidate, the managers hire you as
a statistician to conduct a regression analysis. The dependent variable is the time of repair for each repairperson,
Rafael and Andy. You also have the time since the last maintenance service. You should carry out your analysis
step by step, adding one variable at a time. First look at the correlations table to see how the variables are related.
STEP 1: use the dummy variable REPAIRPERSON = 1 IF responsible = RAFAEL; REPAIRPERSON = 0 IF responsible
= ANDY; RUN a regression model using ONLY repairperson as variable to explain REPAIRTIME

1) (1pt) Comment on goodness of fit of the model


Fully explain here:

Based on the regression output for Step 1, the R-Squared value (the goodness-of-fit measure) is
0.614, or 61.4%. This means that 61.4% of the variance in the dependent variable REPAIRTIME is explained
by the independent variable REPAIRPERSON. Since the R-Squared value is below 70%, the fit is relatively
poor.

Answer: the R-Squared value is 61.4% (or 0.614).

2) (2pt) Report the statistical significance of the coefficients.


Fully explain here:
As we can see from the SPSS output, the coefficients and their statistical significance (p-values) are:

Variable                  Value    Statistical Significance
Constant (y-intercept)    4.6      0.000
REPAIRPERSON              -1.58    0.007

Thus, both coefficients (the constant and the REPAIRPERSON coefficient) are statistically significant at the
0.01 level (99% confidence), as the p-values of both coefficients are lower than α=0.01. The constant is
positive and the coefficient for the REPAIRPERSON variable is negative.
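
With a single dummy variable, the fitted model predicts one value per repairperson, so the two coefficients can be read as group predictions. The short Python sketch below illustrates this using the Step 1 coefficients reported above; the function and variable names are illustrative only and are not part of the SPSS output.

```python
# Coefficients as reported in the Step 1 SPSS output.
INTERCEPT = 4.6          # constant (y-intercept)
B_REPAIRPERSON = -1.58   # coefficient of the REPAIRPERSON dummy


def predicted_repair_time(repairperson: int) -> float:
    """Predicted REPAIRTIME for REPAIRPERSON = 1 (Rafael) or 0 (Andy)."""
    return INTERCEPT + B_REPAIRPERSON * repairperson


andy = predicted_repair_time(0)    # dummy = 0, so only the intercept remains
rafael = predicted_repair_time(1)  # dummy = 1, intercept plus the coefficient

print(f"Andy: {andy:.2f}, Rafael: {rafael:.2f}")  # Andy: 4.60, Rafael: 3.02
```

Under this model the intercept is Andy's predicted repair time, and the negative coefficient is the amount by which Rafael's predicted repair time is lower.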

STEP 2: Use MONTHS SINCE LAST SERVICE AND REPAIRPERSON in a regression to explain REPAIRTIME
3) (1pt) Comment on the scatter diagram for Months-since-last-service and Repairtime.
Fully explain here:
The scatter plot shows a possible positive but dispersed relationship: as the value of Months-since-last-service
increases, the repair time also tends to increase.

4) (1pt) Comment on goodness of fit of this model.


Fully explain here:
The R-Squared value is 0.705, or 70.5%, which is higher than in the Step 1 model (which has only one
independent variable, REPAIRPERSON). It shows that 70.5% of the variability in the dependent variable
REPAIRTIME is explained by its linear relationship with the variables REPAIRPERSON and MONTHSLASTSERVICE.
As the value is greater than 70%, we can say that the fit is good.

5) (1pt) Why do you think Repairperson is important or unimportant in this model?


Fully explain here:
The coefficient for REPAIRPERSON is not statistically significant, as its p-value of 0.222 is greater than
α=0.01 (and also greater than 0.05 and 0.1). The R-Squared value increased once MONTHSLASTSERVICE was
added and the model fits well, but the REPAIRPERSON coefficient shrank in magnitude from 1.58 in Step 1 to
0.86 and is no longer significant. Thus, I would say that the variable REPAIRPERSON is unimportant for this model.

6) (2pt) Report the statistical significance of the coefficients.
Fully explain here:
As we can see from the SPSS output, the coefficients and their statistical significance (p-values) are:

Variable                  Value    Statistical Significance
Constant (y-intercept)    3.195    0.015
REPAIRPERSON              -0.86    0.222
MONTHSLASTSERVICE         0.191    0.185

The constant is positive and statistically significant at 0.05 level. The REPAIRPERSON coefficient is negative
and not statistically significant as its p-value = 0.222 is greater than α=0.05. The MONTHSLASTSERVICE
coefficient is positive and not statistically significant as its p-value=0.185 is greater than α=0.05.

7) (1pt) Write down the estimated regression equation here:

According to the SPSS output, the coefficients are:


Constant = 3.195
Repairperson = -0.86
Monthslastservice = 0.191

Thus, the estimated regression equation is:


Yhat = 3.195 – 0.86*X1 + 0.191*X2, where X1 is REPAIRPERSON and X2 is MONTHSLASTSERVICE, or:

REPAIRTIME = 3.195 – 0.86*REPAIRPERSON + 0.191*MONTHSLASTSERVICE

8) (5pt) Interpret the intercept for this model


Fully explain here:
The intercept is the predicted value of the dependent variable when all independent variables are equal to
zero. In our case, it is the predicted repair time when REPAIRPERSON = 0 and MONTHSLASTSERVICE = 0. The
value of the intercept is 3.195. It means that the predicted repair time is 3.195 if the repair person is Andy
and 0 months have passed since the last service.

9) (3pt) Provide an interpretation for the coefficient of Repairperson.


Fully explain here:
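The coefficient of REPAIRPERSON is -0.86, which means that as REPAIRPERSON increases by 1, the predicted
REPAIRTIME decreases by 0.86, holding MONTHSLASTSERVICE constant. In our situation, we can say that if
Rafael does the repair (i.e., REPAIRPERSON = 1), the predicted repair time is 0.86 lower than if Andy does it
(REPAIRPERSON = 0) for the same number of months since last service.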
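
The intercept and the sign of the REPAIRPERSON coefficient can be checked by plugging values into the estimated Step 2 equation. The sketch below is only an illustration: the coefficients come from the SPSS output reported above, while the function name and the choice of 6 months as a test value are assumptions made for the example.

```python
# Coefficients as reported in the Step 2 SPSS output.
B0 = 3.195        # constant (y-intercept)
B_PERSON = -0.86  # REPAIRPERSON (1 = Rafael, 0 = Andy)
B_MONTHS = 0.191  # MONTHSLASTSERVICE


def predict(repairperson: int, months: float) -> float:
    """Predicted REPAIRTIME from the estimated Step 2 equation."""
    return B0 + B_PERSON * repairperson + B_MONTHS * months


# Intercept: Andy (dummy = 0) with 0 months since last service.
baseline = predict(0, 0)

# Repairperson effect: same months value (6 is an arbitrary example),
# only the dummy changes from Andy to Rafael.
difference = predict(1, 6) - predict(0, 6)

print(f"baseline: {baseline:.3f}")               # baseline: 3.195
print(f"Rafael minus Andy: {difference:.2f}")    # Rafael minus Andy: -0.86
```

The difference does not depend on the months value chosen, because the model has no interaction term between the two variables.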

STEP 3: Use MONTHS SINCE LAST SERVICE to capture the curvature explaining REPAIRTIME

10) (1pt) From all the models which you think is best? WHY?
Fully explain here:
As we can see from the SPSS output, all models are statistically significant at the 0.05 level, as the p-values
of all models are lower than α=0.05 (the p-value of the Linear model is 0.006, the p-value of the Quadratic
model is 0.013, and the p-value of the Cubic model is 0.026). The magnitude of the coefficients is higher
for the Cubic model than for the Quadratic and Linear models. The R-squared values are 62.9%, 70.9%, and
76.5% for the Linear, Quadratic, and Cubic models respectively, which means that the Cubic model
explains the largest share of the variation in the dependent variable REPAIRTIME. Thus, we can conclude
that the Cubic model (model 3) performs better than the Linear or Quadratic models.
Answer: the Cubic model is better.

11) (1pt) Write down the estimated equation for the QUADRATIC model
Fully explain here:
According to the SPSS output, the coefficients are:

Constant = 0.213
b1 = 1.130
b2 = -0.072

Thus, the estimated equation for the quadratic model is:


Yhat = 0.213 + 1.13*X – 0.072*X^2, where X = MONTHSLASTSERVICE, or:

REPAIRTIME = 0.213 + 1.13*MONTHSLASTSERVICE – 0.072*(MONTHSLASTSERVICE)^2.

12) (1pt) What is the average repair time for the quadratic model?
Fully explain here:
The average repair time for some number of months since last service can be calculated by using the
equation:
REPAIRTIME = 0.213 + 1.13*MONTHSLASTSERVICE – 0.072*(MONTHSLASTSERVICE)^2

For example, the average repair time if it’s 1 month since last service will be:
REPAIRTIME = 0.213 + 1.13*1 - 0.072*(1)^2 = 0.213 + 1.13 - 0.072 = 1.271.
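
The arithmetic for the quadratic model can be verified with a few lines of Python. This is a minimal sketch using the quadratic coefficients reported above; the function name is illustrative, and 1 month is the same example value used in the text.

```python
# Quadratic model coefficients as reported in the SPSS output.
B0 = 0.213   # constant
B1 = 1.130   # MONTHSLASTSERVICE
B2 = -0.072  # MONTHSLASTSERVICE squared


def quadratic_repair_time(months: float) -> float:
    """Predicted average REPAIRTIME from the estimated quadratic equation."""
    return B0 + B1 * months + B2 * months ** 2


one_month = quadratic_repair_time(1)
print(f"{one_month:.3f}")  # 1.271
```

The same function can be called with any other number of months since last service to get the corresponding predicted average repair time.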
