Create dummy variables
Fit the linear regression model
Oneway ANOVA Model using Proc GLM
Test for Equality of Variances
Post-hoc comparison of means
The purpose of the tutorial is to create dummy variables that
can be used in a linear regression model.
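A minimal sketch of how such dummy variables might be created in a DATA step. The data set name CARS, the output name CARS2, and the numeric ORIGIN codes (1 = USA, 2 = Europe, 3 = Japan) are assumptions made for illustration; the tutorial's actual coding may differ.

```sas
/* Sketch only: CARS, CARS2, and the ORIGIN codes (1 = USA,
   2 = Europe, 3 = Japan) are assumed, not taken from the tutorial. */
data cars2;
   set cars;
   /* A comparison evaluates to 1 when true and 0 when false.
      Rows with a missing ORIGIN keep missing dummy values. */
   if origin ne . then do;
      usa      = (origin = 1);
      european = (origin = 2);
      japan    = (origin = 3);
   end;
run;

/* Check that each ORIGIN value maps to exactly one dummy equal to 1 */
proc freq data=cars2;
   tables origin*usa*european*japan / list missing;
run;
```

Only two of the three dummies enter a regression model at a time; the one left out defines the reference category.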
Next, we look at the Analysis of Variance table. Always check this table
to be sure the model is set up correctly. The Corrected Total df = n-1,
which is 393 for this model. The Model df = 2, because we have two
dummy variables as predictors. The Error df = 391, which is calculated as:
Error df = Corrected Total df – Model df.
The F test is reported as F(2, 391) = 97.57, p < 0.0001, and indicates that
the overall model is statistically significant.
The Model R-square is 0.3329, indicating that about 33% of the total
variance of MPG is explained by this regression model.
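As a consistency check, the reported F statistic can be recovered approximately from the R-square and the degrees of freedom given above. The symbols below are introduced here only for this check.

```latex
\begin{align*}
df_{\text{Error}} &= df_{\text{Corrected Total}} - df_{\text{Model}} = 393 - 2 = 391 \\
F &= \frac{R^2 / df_{\text{Model}}}{(1 - R^2) / df_{\text{Error}}}
   = \frac{0.3329 / 2}{0.6671 / 391} \approx 97.56
\end{align*}
```

The small difference from the reported 97.57 comes from using the R-square rounded to four decimal places.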
The Analysis of Variance table for this model is the same as for the previous model,
and the Model R-square is the same. However, the parameter estimates differ,
because they represent different quantities than they did in the first model. This is
because we have fit the same model, but used a different way to parameterize the
dummy variables for ORIGIN.
The intercept is now the estimated mean MPG for Japanese cars (30.45). The
parameter estimate for USA represents the contrast in mean MPG for USA cars
vs. Japanese cars (on average, USA cars have an MPG 10.405 lower than Japanese
cars). The parameter estimate for EUROPEAN represents the contrast in mean
MPG for European cars vs. Japanese cars (on average, European cars have an MPG
2.747 lower than Japanese cars).
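Written out as a fitted equation, this parameterization gives the group means directly. The figures below use the rounded estimates quoted above, so the last digits are approximate.

```latex
\begin{align*}
\widehat{\text{MPG}} &= 30.45 - 10.405\,\text{USA} - 2.747\,\text{EUROPEAN} \\
\text{Japanese cars } (\text{USA}=0,\ \text{EUROPEAN}=0)&:\quad 30.45 \\
\text{USA cars } (\text{USA}=1,\ \text{EUROPEAN}=0)&:\quad 30.45 - 10.405 = 20.045 \\
\text{European cars } (\text{USA}=0,\ \text{EUROPEAN}=1)&:\quad 30.45 - 2.747 = 27.703
\end{align*}
```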
3. Oneway ANOVA Model using Proc GLM
The model we investigate here is called a oneway ANOVA because
there is only one categorical predictor. You may also fit a twoway or
higher-way ANOVA, if you have two or more categorical predictors in
the model.
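A minimal sketch of what such a oneway ANOVA might look like in Proc GLM, with ORIGIN as the single categorical predictor of MPG. The data set name CARS is an assumption; the MEANS statement options shown are one possible way to request the test for equality of variances and the post-hoc comparison of means listed in the outline.

```sas
/* Sketch only: the data set name CARS is assumed. */
proc glm data=cars;
   class origin;                 /* treat ORIGIN as categorical */
   model mpg = origin;           /* oneway ANOVA of MPG by ORIGIN */
   means origin / hovtest=levene /* Levene's test for equal variances */
                  tukey;         /* Tukey post-hoc comparison of means */
run;
quit;
```

With a CLASS statement, Proc GLM builds the dummy variables internally, so no DATA step coding is needed for this model.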
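/*SAS COMMAND*/
/* PROC RSQUARE is a legacy procedure; in current SAS the same
   all-subsets search is requested through PROC REG. */
proc reg;
   model y = x1 x2 x3 x4 x1x2 / selection=rsquare cp mse sse adjrsq;
run;
quit;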
3. Model selection: stepwise regression, forward selection,
backward elimination, and maximum r^2 improvement
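/* PROC STEPWISE is a legacy procedure; in current SAS the same
   method is requested through PROC REG. */
proc reg;
   model y = x1 x2 x3 x4 x1x2 / selection=stepwise slentry=0.05 slstay=0.05;
run;
quit;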