Course
Instructor
Date
STATISTICAL ASSIGNMENT 2
A).
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.6094
R Square             0.3714
Adjusted R Square    0.3701
Standard Error       10095.7328
Observations         495

ANOVA
             df    SS                  MS                  F          Significance F
Regression   1     29687314944.7007    29687314944.7007    291.2696   0.0000
Residual     493   50248443796.1759    101923821.0876
Total        494   79935758740.8766

             Coefficients   Standard Error   t Stat    P-value   Lower 95%    Upper 95%
Intercept    42362.2458     673.9857         62.8533   0.0000    41038.0071   43686.4845
yearsrank    1201.1490      70.3800          17.0666   0.0000    1062.8672    1339.4307
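As a consistency check on the summary output, R Square and the Standard Error can be recomputed from the ANOVA sums of squares. The sketch below assumes the usual identities R Square = SSR / SST and Standard Error = sqrt(SSE / residual df); the variable names are illustrative and not taken from the spreadsheet.

```python
import math

# Sums of squares and residual degrees of freedom as reported in the ANOVA table
ssr = 29687314944.7007   # regression sum of squares
sse = 50248443796.1759   # residual sum of squares
sst = 79935758740.8766   # total sum of squares
df_residual = 493

r_square = ssr / sst                        # share of salary variation explained
std_error = math.sqrt(sse / df_residual)    # standard error of the estimate

print(f"R Square       = {r_square:.4f}")
print(f"Standard Error = {std_error:.4f}")
```

Both values should agree with the Regression Statistics block to the displayed precision.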
B).
If the years in rank of an academic staff member increase by 1, then their predicted salary
increases by $1,201.149.
C).
When X = 12
Then,
Salary = $42362.25 + 1201.149(12)
Salary (Y) = $ 56,776.0338
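The substitution above can be sketched as a small function. The coefficients are the ones reported in the regression output for part A; the function name `predict_salary` is hypothetical.

```python
# Coefficients as reported in the regression output for part A
intercept = 42362.2458
slope = 1201.1490

def predict_salary(years_in_rank):
    """Predicted salary for a given number of years in rank."""
    return intercept + slope * years_in_rank

salary_at_12 = predict_salary(12)
print(f"Predicted salary at 12 years in rank: ${salary_at_12:,.4f}")
```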
D).
E).
The standard error of the estimate above shows the accuracy of the predictions. The above
standard error value of 10095.733 is a small percentage of the mean value of salary, therefore,
F).
The residual analysis has been performed in the excel spreadsheet and the residual plot is shown
below:
[Residual plot: residuals against yearsrank]
The above plot shows that the model satisfies the assumptions of independence and constant
variance; however, normality of the distribution is not evident from the plot.
G).
b1        1201.1490
SE        70.3800
t value   17.0666
Df        493.0000
P value (from the t distribution)   0.0000%
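The t value in this table is the slope divided by its standard error. A minimal sketch of that calculation follows; it assumes a two-sided test at the 5% level and borrows the critical value 1.9650 reported as the T value in part K, rather than computing it from the t distribution.

```python
b1 = 1201.1490        # slope estimate from the regression output
se_b1 = 70.3800       # standard error of the slope
t_critical = 1.9650   # assumed two-sided 5% critical value (T value in part K)

t_stat = b1 / se_b1
reject_null = abs(t_stat) > t_critical   # True means the slope differs from zero

print(f"t value = {t_stat:.4f}, reject H0: {reject_null}")
```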
H).
Df 493
P value 0.0000%
I).
J).
The true value of the average salary of academic staff would be between $55,655.4287 and
$57,896.6388.
K).
Mean Y                            $56,776.0338
T value                           1.9650
Standard Error of the Estimate    673.9857
Lower                             $55,451.6572
Upper                             $58,100.4103
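The limits in this table follow the pattern Mean Y ± T value × standard error. The sketch below only reproduces that arithmetic with the figures as reported; because the T value is rounded to four decimals, the results match the table to within about one cent.

```python
mean_y = 56776.0338    # predicted salary at 12 years in rank
t_value = 1.9650       # T value as reported in the table
std_error = 673.9857   # standard error figure as reported in the table

margin = t_value * std_error
lower = mean_y - margin
upper = mean_y + margin

print(f"Lower = ${lower:,.4f}")
print(f"Upper = ${upper:,.4f}")
```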
The future observations or the future mean salary of academic staff with a certain probability
A).
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.2737
R Square             0.0749
Adjusted R Square    0.0738
Standard Error       5.3888
Observations         1700

ANOVA
             df     SS           MS          F         Significance F
Regression   2      3991.1549    1995.5774   68.7203   0.0000
Residual     1697   49279.4313   29.0391
Total        1699   53270.5862

            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   8.3604         1.6879           4.9532    0.0000    5.0498      11.6709
ttl_exp     0.3360         0.0289           11.6442   0.0000    0.2794      0.3926
age         -0.1250        0.0432           -2.8921   0.0039    -0.2098     -0.0402

Wage = $8.360 + 0.336(ttl_exp) - 0.125(age)
B).
If the total years of experience of the employee increases by 1 year, then the wage of the
employee would increase by $ 0.336 per hour. If the age of the employee increases by 1 year,
then the wage of the employee would decrease by $ 0.125 per hour.
C).
When X1 = 18 and X2 = 38
Then,
Wage = $8.3604 + 0.3360(18) - 0.125(38)
Wage = $9.6574
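The same substitution can be sketched in code. With the coefficients rounded as printed above, the result is about $9.658 rather than the reported $9.6574; that small gap would be consistent with the spreadsheet using unrounded coefficients, which is an assumption here. The function name `predict_wage` is hypothetical.

```python
# Coefficients as printed in the regression output (rounded to four decimals)
intercept = 8.3604
b_ttl_exp = 0.3360
b_age = -0.1250

def predict_wage(ttl_exp, age):
    """Predicted hourly wage from total experience and age."""
    return intercept + b_ttl_exp * ttl_exp + b_age * age

wage = predict_wage(ttl_exp=18, age=38)
print(f"Predicted wage: ${wage:.4f} per hour")
```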
D).
Standard deviation   5.5995
Size                 1,700
Confidence           $0.2662
LCI                  $9.3912
UCI                  $9.9236
The true value of the average wage of all employees would be between $9.39 and $9.92 per
hour.
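The Confidence figure is consistent with a normal-based margin of error, z × s / sqrt(n). The sketch below assumes a 95% confidence level and treats the value 5.5995 in the table as the standard deviation of wage; both are assumptions, since the source does not state them explicitly.

```python
import math
from statistics import NormalDist

std_dev = 5.5995          # assumed to be the standard deviation of wage
n = 1700                  # sample size from the table
predicted_wage = 9.6574   # predicted wage from part C
confidence_level = 0.95   # assumed confidence level

# Two-sided critical value from the standard normal distribution
z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
margin = z * std_dev / math.sqrt(n)

lci = predicted_wage - margin
uci = predicted_wage + margin
print(f"Confidence = {margin:.4f}, LCI = {lci:.4f}, UCI = {uci:.4f}")
```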
E).
The future observations or the future mean wage of employees with a certain probability would
F).
[Residual plot: residuals against ttl_exp]
[Residual plot: residuals against age]
The above residual plots show that the variance is not constant, and the distribution of the
variables does not seem to be normal. The variables seem to be independent; however, normality
G).
There is no reason to suspect collinearity as the VIF for both variables is less than 10. This is an
acceptable value.
H).
X1        0.3360
SE        0.0289
t value   11.6442
Df        1698.0000
P value (from the t distribution)   0.0000

X2        0.1250
SE        0.0432
t value   2.8921
Df        1698.0000
P value (from the t distribution)   0.0039
On the basis of the t-tests, X1 should be included in the regression model, as its p value of
0.0000 is significant. X2 has a p value of 0.0039 (0.39%), which is also below the 0.05 level of
significance, so it should be retained in the regression model as well.
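The decision rule for each t-test compares the p value with the chosen significance level. A minimal sketch, assuming the 0.05 level used for the F-test in part I and the p values from the tables above (the dictionary keys are illustrative labels):

```python
alpha = 0.05   # assumed significance level, as used in part I

# P values as reported for the two slope coefficients
p_values = {"ttl_exp (X1)": 0.0000, "age (X2)": 0.0039}

# A predictor is significant when its p value falls below alpha
significant = {name: p < alpha for name, p in p_values.items()}

for name, keep in significant.items():
    print(f"{name}: p = {p_values[name]:.4f}, significant = {keep}")
```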
I).
We used the F-test to test the significance of the overall multiple regression model. The p
value is 0.000, which is less than the 0.05 level of significance; therefore, the model is
significant overall.
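The overall F statistic can be recomputed from the ANOVA table for this model as F = MSR / MSE. The sketch below uses the reported sums of squares, with n = 1700 observations and k = 2 predictors; the variable names are illustrative.

```python
ssr = 3991.1549    # regression sum of squares
sse = 49279.4313   # residual sum of squares
n = 1700           # observations
k = 2              # predictors (ttl_exp and age)

msr = ssr / k              # mean square for regression
mse = sse / (n - k - 1)    # mean square error on 1697 residual df
f_stat = msr / mse

print(f"F = {f_stat:.4f}")
```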
J).
The relationship between Y and X1 is significant and Y and X2 is insignificant based on the
K).
Intermediate Calculations
SSR(X1, X2) 3991.1549
SST 53270.5862
SSR(X2) 14130.5316
SSR(X1) 1693.1194
SSR(X1 | X2) -10139.3768
SSR(X2 | X1) 2298.0355
Coefficients
r2 Y1.2 -0.2591
r2 Y2.1 0.0446
The proportion of the variation in Y (wage) that cannot be explained by the reduced model, but
is explained by the fuller model, is -0.2591 for X1 and 0.0446 for X2.
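The intermediate calculations can be traced with the standard formula for the coefficient of partial determination, r2 = SSR(added | included) / (SST - SSR(X1, X2) + SSR(added | included)). The sketch below applies it to the sums of squares exactly as reported; the helper name `partial_r2` is hypothetical.

```python
def partial_r2(ssr_full, sst, ssr_extra):
    """Coefficient of partial determination for the added variable.

    ssr_extra is the extra regression sum of squares gained by adding
    the variable to a model that already contains the other one.
    """
    return ssr_extra / (sst - ssr_full + ssr_extra)

# Sums of squares as reported in the intermediate calculations
ssr_full = 3991.1549   # SSR(X1, X2)
sst = 53270.5862       # SST
ssr_x1 = 1693.1194     # SSR(X1)
ssr_x2 = 14130.5316    # SSR(X2)

ssr_x2_given_x1 = ssr_full - ssr_x1   # SSR(X2 | X1)
ssr_x1_given_x2 = ssr_full - ssr_x2   # SSR(X1 | X2)

r2_y2_1 = partial_r2(ssr_full, sst, ssr_x2_given_x1)
r2_y1_2 = partial_r2(ssr_full, sst, ssr_x1_given_x2)
print(f"r2 Y1.2 = {r2_y1_2:.4f}, r2 Y2.1 = {r2_y2_1:.4f}")
```

A coefficient of partial determination is a proportion, so it normally lies between 0 and 1. In a least-squares fit with an intercept, a one-variable model cannot have a larger regression sum of squares than the model that adds a second variable to it, so the negative value for X1 suggests that the SSR(X2) input is worth rechecking.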
L).
SUMMARY OUTPUT
Regression Statistics
Multiple R 0.2977
R Square 0.0886
If the race of the employee is white, then the wage per hour would increase by $1.492, and if
the race of the employee is black, then the wage per hour would increase by $0.
M).
SUMMARY OUTPUT

Regression Statistics
Multiple R           0.9971
R Square             0.9942
Adjusted R Square    0.9942
Standard Error       0.4256
Observations         1700

ANOVA
             df     SS           MS          F            Significance F
Regression   6      52963.9026   8827.3171   48729.8639   0.0000
Residual     1693   306.6836     0.1811
Total        1699   53270.5862

                      Coefficients   Standard Error   t Stat     P-value   Lower 95%   Upper 95%
Intercept             3.6193         0.3811           9.4966     0.0000    2.8718      4.3668
ttl_exp               0.3192         0.0295           10.8314    0.0000    0.2614      0.3770
age                   -0.0929        0.0098           -9.5045    0.0000    -0.1120     -0.0737
race                  -0.0390        0.0238           -1.6420    0.1008    -0.0856     0.0076
Interaction (X1xX2)   0.0002         0.0004           0.4648     0.6421    -0.0007     0.0011
Interaction (X1xX3)   0.0258         0.0002           161.9759   0.0000    0.0255      0.0261
Interaction (X2xX3)   -0.0082        0.0007           -11.1002   0.0000    -0.0096     -0.0067
N).
Interaction 1
F value   9105827.5343%
Df        1698
P value   0.0000%

Interaction 2
F value   30271.5231%
Df        1698
P value   0.0000%

Interaction 3
F value   402719.2307%
Df        1698
P value   0.0000%
The three interactions improve the regression model significantly, as the p value of the partial F
test is 0.000. Furthermore, the three interactions separately also improve the regression model, as
the partial F test p value is 0.000 for all three interactions. However, as the F value for