Class 28 Assignment Answers

These questions refer to EMBS Case Problem 2, “Alumni Giving,” which concerns data for 48 US national
universities (America’s Best Colleges, Year 2000 Edition). Both the University of Notre Dame and the
University of Virginia are included. The data set contains the following five variables.

Variable                Description
School                  The name of the University
Graduation Rate         Percentage of enrollees who graduate
% of Classes Under 20   Percentage of classes offered with <= 20 students
Student/Faculty Ratio   Number of students enrolled divided by total number of faculty
Alumni Giving Rate      Percentage of living alumni who gave to the University in 2000

                     Graduation   % of Classes   Student/Faculty   Alumni Giving
Statistic            Rate         Under 20       Ratio             Rate
Mean                 83.042       55.729         11.542            29.271
Median               83.5         59.5           10.5              29
Mode                 92           65             13                13
Standard Deviation   8.607        13.194         4.851             13.441
Skewness             -0.282       -0.501         0.582             0.370
Minimum              66           29             3                 7
Maximum              97           77             23                67
Count                48           48             48                48

1. Test the hypothesis that graduation rate and alumni giving rate are (linearly) independent. We expect
universities with higher graduation rates to have higher mean giving rates. [15 points]

A regression of giving rate on graduation rate shows a positive linear relationship with a
reported (two-sided) p-value of 5.24E-10. For Ha: b>0, the p-value is half that, or 2.62E-10,
because the estimated slope is positive. We reject H0 in favor of Ha. The results are
statistically significant.

                  Coefficients   Standard Error   t Stat   P-value
Intercept         -68.76         12.58            -5.46    1.82E-06
Graduation Rate   1.18           0.15             7.83     5.24E-10
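
The halving step can be written out as a small rule. This is a minimal sketch, assuming the reported p-value is the usual two-sided one from a t-test on the slope; the function name `one_sided_p_value` is a hypothetical helper, not something from the regression output.

```python
def one_sided_p_value(two_sided_p: float, estimate: float,
                      expect_positive: bool = True) -> float:
    """Convert a two-sided t-test p-value into a one-sided p-value.

    If the estimate has the sign claimed by Ha, the one-sided p-value is
    half the two-sided one; otherwise it is one minus that half.
    """
    sign_matches = (estimate > 0) == expect_positive
    half = two_sided_p / 2
    return half if sign_matches else 1 - half


# Values taken from the regression table above.
slope = 1.18
two_sided = 5.24e-10

one_sided = one_sided_p_value(two_sided, slope)
print(f"One-sided p-value for Ha: b>0 is {one_sided:.2e}")
```

Had the estimated slope been negative, the same Ha: b>0 would give a one-sided p-value close to 1 and we could not reject H0.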

2. If the graduation rate of school A is 5 percentage points higher than that of school B, how much
higher do we expect school A’s giving rate to be? [10 points]

Using the above regression (graduation rate is all we know), the expected giving rate will be
1.18*5 = 5.9 percentage points higher for school A.

3. If you learn that A and B above have identical student to faculty ratios, what is your revised answer to
question 2? Be certain to explain why it went up (if it went up) or why it went down (if it went down) or
why it stayed the same. Direct your response to a university administrator. [15 points]
For this question, we know both graduation rate and student/faculty ratio. Since the latter is
also predictive of giving rate, we will use a multiple regression to answer this question.

                        Coefficients   Standard Error   t Stat     P-value
Intercept               -19.10631      15.55006         -1.22870   0.22557
Graduation Rate         0.75574        0.16023          4.71669    0.00002
Student/Faculty Ratio   -1.24595       0.28430          -4.38250   0.00007

(Note that the p-value associated with student/faculty ratio is very low. Student/faculty ratio
is an important variable that should not be ignored.) The 5-point higher graduation rate leads
us to expect a 0.756*5 = 3.8 percentage point higher giving rate for A. Our answer went down
(from 5.9 to 3.8) because graduation rates and student/faculty ratios are negatively correlated
in the sample. (Schools with higher graduation rates are expected to have lower student/faculty
ratios, which in turn are also associated with higher giving rates.) The answer to question 2
reflected this reality: the higher graduation rate for A would also imply a lower
student/faculty ratio, and the combination would lead us to expect 5.9 more percentage points
in giving rate. When we learned that A did NOT have a lower student/faculty ratio than B, our
expectations for its giving rate went down, and we now expect a smaller giving rate gap between
the two schools.

4. Provide a point forecast of alumni giving rate for a university with graduation rate of 80, 65 percent of
its classes with 20 or fewer students, and a student/faculty ratio of 20. [25 points] (To answer this
question, I expect you will build a linear regression model. Do not try anything fancy. Just pick which
subset of the three numerically scaled variables you think comprise the best model.)

From a modeling standpoint, the question is whether percent under 20 is needed. Does it add
predictive power to the model given that we already have both grad rate and student/faculty
ratio? To see, we try the three-variable model.

                        Coefficients   Standard Error   t Stat    P-value
Intercept               -20.7201       17.5214          -1.1826   0.2433
Graduation Rate         0.7482         0.1660           4.5082    0.0000
% of Classes Under 20   0.0290         0.1393           0.2084    0.8358
Student/Faculty Ratio   -1.1920        0.3867           -3.0823   0.0035

The p-value associated with %under20 is 0.84, which is not significant. We do not need and
should not use all three variables. The model used to answer Q3 should be used to come up with
the point forecast. Using a sumproduct to perform the calculation results in a point forecast
of 16.4 for the alumni giving rate of the school in question. See below.

                        Coefficients   Value
Intercept               -19.10631      1
Graduation Rate         0.75574        80
Student/Faculty Ratio   -1.24595       20
POINT FORECAST          16.43
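
The sumproduct is simply each coefficient multiplied by its input value, then summed. A minimal sketch of the same arithmetic, using the two-variable coefficients from Q3 (the dictionary names are illustrative):

```python
# Coefficients from the two-variable regression used in Q3.
coefficients = {
    "Intercept": -19.10631,
    "Graduation Rate": 0.75574,
    "Student/Faculty Ratio": -1.24595,
}

# Inputs for the school in question; the intercept is multiplied by 1.
# Percent of classes under 20 is left out because the model does not use it.
inputs = {
    "Intercept": 1,
    "Graduation Rate": 80,
    "Student/Faculty Ratio": 20,
}

# Equivalent of SUMPRODUCT over the two columns.
point_forecast = sum(coefficients[name] * inputs[name] for name in coefficients)

print(round(point_forecast, 2))  # 16.43
```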

5. Of the 48 universities in the data set, which one has the most surprisingly low alumni giving rate? [10
points] (Hint: The answer is not U. of California-Davis. Its last-place giving rate is explained by its
relatively low graduation rate and large classes.)

I will use our 2-variable regression to calculate predictions (expectations) for each of the 48
schools and then identify the school with actual giving rate most below the prediction. This is
the same thing as finding the school with the most negative residual.
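
The procedure can be sketched as follows. The regression coefficients are the ones from Q3, but the three rows of school data are hypothetical placeholders invented for illustration; they are NOT values from the 48-school data set, which is not reproduced here.

```python
# Coefficients from the two-variable regression used in Q3.
INTERCEPT = -19.10631
B_GRAD = 0.75574
B_SF = -1.24595


def predict(grad_rate: float, sf_ratio: float) -> float:
    """Expected giving rate from the two-variable regression."""
    return INTERCEPT + B_GRAD * grad_rate + B_SF * sf_ratio


# HYPOTHETICAL rows: (name, graduation rate, S/F ratio, actual giving rate).
rows = [
    ("School A", 90, 10, 40),
    ("School B", 85, 15, 13),
    ("School C", 70, 20, 10),
]

# Residual = actual minus predicted; negative means lower than expected.
residuals = {
    name: actual - predict(grad, sf)
    for name, grad, sf, actual in rows
}

# The most surprisingly low giving rate is the most negative residual.
most_surprising = min(residuals, key=residuals.get)

for name, resid in sorted(residuals.items(), key=lambda item: item[1]):
    print(f"{name}: residual {resid:+.1f}")
print("Most negative residual:", most_surprising)
```

With the real data, `rows` would hold all 48 schools and the same `min` call would pick out the school furthest below its prediction.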

[Figure: scatter plot of ERRORS or RESIDUALS (vertical axis) versus PREDICTED VALUES (horizontal axis)]

In the scatter plot of errors versus predicted values, the circled point is the one with the
most negative error. It is school 35 (U. of Michigan-Ann Arbor), for which the regression
prediction was 24.9 but the actual giving rate was 13, a full 11.9 points below expectation. I
will leave it to you Notre Dame readers to draw your own conclusions. (You can also identify
the most negative residual by asking Excel to give you the residuals and then either eyeballing
or sorting them.)

6. Bo notices that some of the 48 have “university” in their names, some have “college” and the rest
have “institute”. Bo wonders whether these names are predictive of student/faculty ratio? (Formulate
and test a relevant hypothesis.) [25 points]

Let us use H0: mean S/F ratio is equal for the three names. Ha will be not all equal. We can
use either ANOVA single factor or regression with 2 dummies to test this hypothesis.

SUMMARY OUTPUT

Regression Statistics
Multiple R          0.306267658
R Square            0.093799878
Adjusted R Square   0.053524317
Standard Error      4.719185001
Observations        48

ANOVA
             df   SS          MS        F        Significance F
Regression   2    103.7348    51.8674   2.3290   0.1090
Residual     45   1002.1818   22.2707
Total        47   1105.9167

             Coefficients   Standard Error   t Stat    P-value
Intercept    11.8636        0.7114           16.6754   0.0000
Dcollege     -0.3636        3.4120           -0.1066   0.9156
Dinstitute   -7.3636        3.4120           -2.1582   0.0363

Although the mean S/F ratio for institutes is significantly lower than for universities (the
group not included in the model), with a p-value of 0.036, overall we CANNOT reject H0 (the
p-value for our H0 is 0.1090). The differences among the three sample means are not
statistically significant. Part of the reason is that there are only 2 colleges and 2
institutes, which makes our estimates of their means highly uncertain, a fact accounted for
in our p-value.
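
The overall test can be checked by hand from the ANOVA table. A minimal sketch, assuming only the sums of squares and degrees of freedom shown above. The closed-form tail probability used here is valid only because the numerator has exactly 2 degrees of freedom; it is a shortcut chosen for this sketch, and a general F-test needs an F-distribution function from a statistics package.

```python
# Values from the ANOVA table above.
ss_regression, df_regression = 103.7348, 2
ss_residual, df_residual = 1002.1818, 45

# F statistic = mean square for regression / mean square for residual.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Upper-tail probability of F(2, d2): P(F > x) = (1 + 2x/d2) ** (-d2/2).
# This identity holds ONLY for 2 numerator degrees of freedom.
assert df_regression == 2
p_value = (1 + df_regression * f_stat / df_residual) ** (-df_residual / 2)

# Group means implied by the dummy-variable coefficients; "university"
# is the baseline group captured by the intercept.
intercept, d_college, d_institute = 11.8636, -0.3636, -7.3636
group_means = {
    "university": intercept,
    "college": intercept + d_college,
    "institute": intercept + d_institute,
}

print(f"F = {f_stat:.4f}, p-value = {p_value:.4f}")
for name, mean in group_means.items():
    print(f"Mean S/F ratio, {name}: {mean:.2f}")
```

The implied means show why the institute dummy looks large on its own even though the overall test does not reject H0: the institute mean rests on only 2 schools.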
