Using Correlation and Regression: Mediation, Moderation, and More
Part 3: Moderation with regression
Claremont Graduate University
Professional Development Workshop
August 22, 2015
Dale Berger, Claremont Graduate University (dale.berger@cgu.edu)
Statistics website: http://wise.cgu.edu
This document is designed to aid note taking during the presentation and to serve as a resource
for later use. It provides selected formulas, figures, SPSS syntax and output, and references, with
much more detail than PowerPoint slides that accompany the presentation.
This document, data files, supplemental reading, and other materials are available on a Google
Drive site for which members of the class will receive a link. If you have difficulty, please
contact me at dale.berger@cgu.edu . We won’t cover all of this material in the on-line
presentation.
Moderation analysis with regression
Examples of moderation (identify X, Y, and Z)
Model of salary for men and women
SPSS example of moderation with a dichotomous moderator
SPSS point-and-click commands
SPSS regression output and interpretations
Figure presenting the findings
Table presenting regression analysis
Dummy mediator variable and centered continuous X variable
Multicollinearity and tolerance
SPSS example of moderation with a continuous moderator
SPSS output and table for presentation
Interpretations
Figure presenting the findings
Summary
References
SPSS syntax for moderation analysis
Excel workbook Plotting Regression Interactions

Session 2a: Moderation Analysis with Regression

Moderation Analysis with Regression
Group differences in treatment effects often are especially important to measure and understand.
If a treatment has greater effects for women than for men, we say that sex ‘moderates’ the effects
of the treatment. When there is moderation, it may be misleading to describe overall treatment
effects without taking group membership into account. Moderation analysis can guide decisions
about interpreting effects and redesigning treatments for different groups.
Moderation is interaction. For example, among eighth grade children, drug use by a child can be predicted by drug use of their friends. However, the relationship is weaker for children who have greater parental monitoring. This finding can be displayed by showing the difference in regression lines for children with high parental monitoring and children with low parental monitoring.
“Parental monitoring moderates the relationship between drug use by children and drug use by their friends, such that the relationship is weaker for children with stronger parental monitoring.”
In general, Variable Z is a moderator of the relationship between X and Y if the strength of the
relationship between X and Y depends on the level of Z. A moderator relationship can be
illustrated with an arrow from Variable Z (the moderator variable) pointing to the arrow that
connects X and Y (see Figure 2). A model can include both mediation and moderation. We can
include a path from X to Z, indicating that Z may also mediate the relationship between X and Y.
The X-Z path would not affect our analysis of the moderation effect. Z can be a moderator even
if it has no direct effect on Y and thus no mediation effect. The effect of X on Y for a specific
value of Z is called a simple effect of X on Y for that value of Z.
Figure 2: Model Showing Z Moderating the X-Y Relationship
[Diagram: an arrow from X to Y, with an arrow from Z pointing to the X-Y arrow]

Examples of moderation (identify X, Y, and Z):

The impact of a program is greater for younger adolescents than for older adolescents.
X=          Y=          Z=

The relationship between education and occupational prestige is greater for women than for men.
X=          Y=          Z=

Learning outcomes are positively related to amount of study time for children who use either Book A or Book B, but the relationship is stronger for those who use Book B.
X=          Y=          Z=

Models of Salary for Men and Women

We wish to test the null hypothesis that the relationship between salary and time on the job is the same for men and women in a large organization. We have data on salary, years on the job, and gender for a random sample of n=200 employees.

Y = salary in $1000s, X1 = years on the job, X2 = gender (men = 0, women = 1)

To test for an interaction, we create a special interaction term, X3 = X1 * X2. Regression analysis yielded the following model:

Y' = 55.0 + 1.5X1 - 3.4X2 + .7X3

With this model, we can predict salary for any individual if we know X1 and X2 for that person.

For men, X2 = 0, and also X3 = 0 because X3 = X1 * X2 = X1 * 0 = 0. Thus, for men, the regression model simplifies to Y' = 55.0 + 1.5X1.

For women, X2 = 1, so X3 = X1 * X2 = X1 * 1 = X1. Thus, for women, the regression model simplifies to Y' = 55.0 + 1.5X1 + (-3.4) + .7X1, or Y' = 51.6 + 2.2X1.

We can use the models for men and women to create a diagram showing the simple effects for men and women.
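The separate models for men and women follow mechanically from the combined salary equation. The sketch below is only an illustration of that arithmetic, using the coefficients reported in the text; the function names `group_equation` and `predict_salary` are hypothetical and not part of the workshop materials.

```python
# Coefficients from the salary model in the text:
# Y' = 55.0 + 1.5*X1 - 3.4*X2 + 0.7*X3, where X3 = X1 * X2.
B0, B1, B2, B3 = 55.0, 1.5, -3.4, 0.7


def group_equation(x2):
    """Return (intercept, slope on X1) for a fixed gender code x2."""
    intercept = B0 + B2 * x2   # the constant shifts by B2 for each unit of X2
    slope = B1 + B3 * x2       # simple effect of X1 at this value of X2
    return round(intercept, 3), round(slope, 3)


def predict_salary(x1, x2):
    """Predicted salary in $1000s from the full model."""
    return B0 + B1 * x1 + B2 * x2 + B3 * (x1 * x2)


men = group_equation(0)      # X2 = 0 for men
women = group_equation(1)    # X2 = 1 for women
print("men:   intercept, slope =", men)
print("women: intercept, slope =", women)
print("woman with 10 years on the job:", round(predict_salary(10, 1), 1))
```

The slope for each group is the simple effect of years on the job for that group, so the difference between the two slopes equals the weight on the interaction term.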

Elements of the original regression equation can be interpreted as follows.

The constant (55.0) is the predicted salary for someone who has values of zero on all predictors. In this example, men with zero years on the job have X1 = 0, X2 = 0, and X3 = 0, so the constant is the predicted salary for men with zero years on the job, i.e., 55.0 or $55,000.

The coefficient of 1.5 on X1 applies only to men (who have values of zero on the interaction term), so the model predicts average salary to be $1500 greater for each year on the job for men. This does not mean that every individual man will earn $1500 more each year, but rather 1.5 describes the slope of the best fitting regression line for men in the cross-sectional data. In general, the coefficient on X1 is the simple effect of X1 when X2 = 0. If we did not have an interaction term in the model, then both men and women would be given the same regression coefficient on X1.

The coefficient of -3.4 on X2 is the modeled difference in salary between men and women who have zero years on the job. This is the difference in the constant for the models for men and women. It is not a measure of the average sex effect when the interaction term is in the model.

The coefficient of .7 on the interaction term X3 indicates the difference in the regression weight for men and women. Because X3 = 0 for men and X3 = X1 for women, the weight on X3 is the additional weight given to X1 for women. In the regression models for men and women, the weight on X1 is 1.5 for men and 2.2 for women, a difference of .7, indicating that the average increment in predicted salary per year is $700 greater for women.

The null hypothesis for the test of the regression weight on X3 is that this weight is zero in the population, which would mean that the slopes of the regression model for men and women are the same. If this null hypothesis is rejected, the conclusion is that the slopes are different. In the example, the slope is .7 greater for women.

An assumption for tests of statistical significance is that residuals from the model are reasonably homogeneous and normally distributed across levels of X1 and X2. If relationships are nonlinear, the nonlinear components should be included in the model.

SPSS Example: Moderation Effects with a Dichotomous Moderator

Occupational prestige as measured by a standard scale is positively related to years of education, but is the relationship the same for men and women? For this example, we can use data from a national sample of U.S. adults given in 1991 U.S. General Social Survey.SAV, as provided by SPSS. This sample includes n=1415 cases with complete data on the three variables of occupational prestige, years of education, and gender.

In this example, the dependent variable (Y) is occupational prestige and the independent variable (X) is years of education, while gender is a potential moderator (Z). Moderation in this example is indicated by an interaction between X and Z in predicting Y, which would indicate that the relationship between X and Y depends on the level of Z.

A special term must be constructed to represent the interaction of X and Z. Mathematically, the interaction term can be computed as the product of X and Z. With regression we can test whether this term contributes beyond the main effects of X and Z in predicting Y. To demonstrate how this works in our example, we compute the product of Education (X) and Sex (Z) to create a new variable that we name EdxSex (XZ), and we include EdxSex in a final model to predict Y.

SPSS Commands (Point and Click):

Call up SPSS and the GSS1991 data file (available online under computer files for this course, select file 1991 U.S. General Social Survey.sav).

First, create the interaction term. Click Transform, Compute…, in the Target Variable: window enter EdxSex, select educ and click the black triangle to enter educ into the Numeric Expression: window, click *, select sex and click the triangle. This should give educ * sex in the Numeric Expression: window. This expression can be typed into the window instead of selecting and clicking. You can click OK to run this computation, or you can click Paste to save the syntax in a syntax file to be run later. If you use Paste, go to the syntax window and run this computation because we need the new variable EdxSex for the next analysis. (Highlight the compute statement and the Execute command, press the triangle to run.)

The regression analysis to test interactions is hierarchical, whereby we must enter the main effects of education and sex before we enter the interaction. Click Analyze, Regression, Linear…, select prestg80 and click the black triangle to enter it as the Dependent variable. Select educ and click the triangle to enter educ as the first independent variable. Click Next to go to the second block, select sex and click the triangle to enter sex as the second independent variable. Click Next to go to the third block, select EdxSex and click the triangle to enter EdxSex as the third independent variable in a hierarchical analysis.

For illustration, we will ask for a lot of statistics. Click Statistics…, select Estimates, Model fit, R squared change, Descriptives, Part and partial correlations, and Collinearity diagnostics, and click Continue. Click Plots, select *ZRESID as the Y variable and *ZPRED as the X variable, check Histogram, and click Continue. Click Paste to save the syntax. Go to the syntax window and run the regression analysis. Table 1 shows selected SPSS output.
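For readers who want to see the logic of the hierarchical analysis outside SPSS, here is a minimal sketch in plain Python. It is not the workshop's SPSS syntax: the data are simulated with made-up weights, and the helpers `solve` and `ols` are hypothetical names written for this illustration. It computes the product term and then fits the three blocks in order (educ; educ and sex; educ, sex, and the product), recording R squared at each step.

```python
import random


def solve(a, b):
    """Solve the linear system a * coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(columns, y):
    """Least-squares fit with an intercept; returns (coefficients, R squared)."""
    rows = [[1.0] + list(vals) for vals in zip(*columns)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    pred = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return coef, 1 - ss_res / ss_tot


# Simulated stand-in data (assumption: NOT the GSS file used in the workshop)
random.seed(1)
educ = [random.randint(6, 20) for _ in range(200)]
sex = [random.choice([1, 2]) for _ in range(200)]
y = [20 + 1.5 * e - 5 * s + 0.5 * e * s + random.gauss(0, 8)
     for e, s in zip(educ, sex)]

# Equivalent of computing EdxSex = educ * sex
edxsex = [e * s for e, s in zip(educ, sex)]

# Enter the predictors in three blocks and track R squared at each step
r2 = []
for block in ([educ], [educ, sex], [educ, sex, edxsex]):
    coef, r_squared = ols(block, y)
    r2.append(r_squared)
r2_change = [r2[0]] + [later - earlier for earlier, later in zip(r2, r2[1:])]
print("R squared by step:", [round(v, 3) for v in r2])
print("R squared change: ", [round(v, 3) for v in r2_change])
```

The R squared change at the third step is the quantity of interest: it measures what the product term adds beyond the two main effects. A significance test of that change is not included in this sketch.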

Table 1: Test of Moderation Effects with a Dichotomous Moderator (N=1415)
[SPSS Coefficients output for Models 1 to 3. Dependent Variable: R's Occupational Prestige Score (1980). Predictors: (Constant), Highest Year of School Completed, Respondent's Sex, EDXSEX. Columns: Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, Sig., Correlations (Zero-order, Partial, Part), Collinearity Statistics (Tolerance, VIF).]

The first model uses only Education (X) to predict Y (Occupational Prestige). We see that education is a strong predictor, with r = beta = .520, t(1413) = 22.864, p < .001.

The second model predicts Occupational Prestige (Y) from the additive effects of Education (X) and Sex (Z), assuming no moderation. From Unstandardized Coefficients we find the following:

Predicted Y = Ŷ = B0 + B1X + B2Z
Ŷ = 14.294 + 2.286X - .709Z

The coefficient B1 = 2.286 can be interpreted as indicating that for either males or females, one additional unit of X (one more year of education) is associated with 2.286 more units of predicted Y (+2.286 on the Occupational Prestige scale). However, if there is an interaction, this simple model is not accurate for either group. If the effects of education are different for males and females, the model may be misleading.

The third model in Table 1 includes the interaction term, resulting in the following equation:

Ŷ = B0 + B1X + B2Z + B3XZ
Ŷ = 22.403 + 1.668X - 6.083Z + .412XZ

The test of statistical significance of the interaction term yields t(1411)=2.050, p=.041. We conclude that the relationship between Education and Occupational Prestige differs for males and females. This also means that the sex difference on occupational prestige varies with level of education. (Statistical significance doesn’t necessarily indicate a large or important effect.)

How does this work? It is instructive to compute the regression equations separately for males and females. In this data set, Sex (Z) is coded Z=1 for males and Z=2 for females.

Thus, for males the equation reduces to Ŷ = 22.403 + 1.668X - (6.083)(1) + .412X(1), which can be written as Ŷ = 22.403 - 6.083*1 + 1.668*X + .412*X*1, or Ŷm = 16.320 + 2.080X.

For females the equation is Ŷ = 22.403 + 1.668X - (6.083)(2) + .412X(2), which can be written as Ŷ = 22.403 - 12.166 + 1.668X + .824X, or Ŷf = 10.237 + 2.492X.

The weight on the XZ interaction term (B3 = .412) is the difference in the regression weight on X for females and males (2.492 vs. 2.080). Thus, a test of B3 is a test of the sex difference in the regression weight on education when predicting occupational prestige. We can conclude that, on average, education has a statistically significantly stronger relationship with occupational prestige for females than for males. Of course, statistical significance does not imply that this difference is large enough to be theoretically or practically interesting.

In the model without the interaction term, the regression weight of 2.286 on Education overestimates the relationship for males and underestimates the relationship for females.

An interaction is often illustrated effectively with a figure. Figure 7 shows the size and direction of the main effects and interaction, and where the modeled sex effect is largest, etc.

Figure 7: Modeled Occupational Prestige as a Function of Education for Males and Females (N = 1415)
[Line graph: Occupational Prestige (0 to 70) by Years of Education (0 to 20), with separate lines for Males and Females]

This graph was made with Excel using regression weights from SPSS. You can access this Excel worksheet through http://WISE.cgu.edu under WISE Stuff in a file called Plotting Regression Interactions.XLS. It is important to note that the figure is a model of the relationship, not the actual data (which probably would not show such a nice regular pattern).
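The modeled lines in Figure 7 can be traced by evaluating the Model 3 equation at the ends of the education axis for each sex code. The sketch below is one way to do that arithmetic in Python rather than Excel; the function name `predicted_prestige` is hypothetical, and the crossover calculation is a derived illustration, not a result reported in the handout.

```python
# Unstandardized weights from Model 3 in the text
B0, B1, B2, B3 = 22.403, 1.668, -6.083, 0.412
SEX_CODES = {"males": 1, "females": 2}   # coding used in this data set


def predicted_prestige(educ, sex):
    """Modeled Occupational Prestige for given years of education and sex code."""
    return B0 + B1 * educ + B2 * sex + B3 * educ * sex


# Endpoints of each modeled line at 0 and 20 years of education
lines = {
    group: [round(predicted_prestige(x, z), 3) for x in (0, 20)]
    for group, z in SEX_CODES.items()
}

# Education level at which the modeled sex difference is zero:
# B2 + B3 * educ = 0
crossover = -B2 / B3

for group, (at_0, at_20) in lines.items():
    print(f"{group}: {at_0} at 0 years, {at_20} at 20 years")
print(f"modeled lines cross near {crossover:.1f} years of education")
```

As with the figure itself, these are values from the model, not observed group means.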

Simulation studies have shown that statistical power to detect the effects of a dichotomous moderator variable can be very low if samples are small, the proportions of cases in the two groups are unequal, or if there is restriction of range on the predictor, especially in field studies with large measurement error, low co-occurrence of extreme values of predictors, and small effect sizes (McClelland & Judd, 1993).

Presenting Results from Regression Analysis in a Table

Results can also be presented in a table. Reasonable people may choose to report different statistics, depending on the goals of the study. Table 2 shows one way to present a selection of important information.

Table 2: Moderation Effects of Sex on Education in Predicting Occupational Prestige (N=1415)

Step  Variable            r         R2 Change   B           SEB     Beta
1     Education (years)   .520***   .270***     1.668***    .318    .518***
2     Sex (M=1, F=2)      -.063**   .001        -6.083*     2.689   -.027
3     Education x Sex     ---       .002*       .412*       .201    ---
      (Constant)                                22.403***   4.300

*p<.05; **p<.01; ***p<.001. Cumulative R squared = .273; Adjusted R squared = .271. B and SEB are from the final model at Step 3, and Beta is from the model at Step 2 (all main effects, but no interaction term).

Table 2 summarizes key information with four conceptually distinct types of data, each of which can be useful.

First, we have the simple correlations (r) which tell us how each individual predictor variable is related to the criterion variable, ignoring all other variables. We can see that Education is a much better predictor of Occupational Prestige than Sex, although both correlations are statistically significant. The correlation of the interaction term with the criterion is not easily interpreted, because this correlation is greatly influenced by scaling of the main effects; it is best omitted from the table.

The second type of information comes from R2 Change at each step. R2 Change measures and tests the effect sizes of components, while controlling for variables that were entered into the model earlier. Here the order of entry is critical if the predictors overlap with each other. For example, if Sex had been entered alone on Step 1, R2 Change would have been .004**, statistically significant with p<.01. (R2 Change for the first term entered into a model is simply its r squared.) Because of partial overlap with education, Sex adds only .001 R2 Change (not significant) when it is entered after Education is in the model. However, the interaction term adds significantly beyond the main effects (R2 Change = .002, p<.05), indicating that we do have a statistically significant interaction between Sex and Education in predicting Occupational Prestige.

The third type of information comes from the unstandardized B weights in the final model. These weights allow us to construct the raw regression equation, and we can use them to compute separate equations for males and females, if we wish. The B weights and their tests of

significance on the main effects are not easily interpreted in the final model, because they refer to the unique contribution of each main effect beyond all other terms, including the interaction (which was computed as a product of the main effects). The test of B for the last term entered into the model is meaningful, as it is equivalent to the test of R2 change for the final term. In this case, both tests tell us that the interaction is statistically significant.

The fourth type of information comes from the tests of regression weights for the model that contains only main effects (no interactions). These are tests of the unique contribution of each main effect beyond all other main effects. Here we see that Sex does not contribute significantly beyond Education in predicting Occupational Prestige (beta = -.027), although its simple r was -.063, p<.01. If the main effects do not overlap at all, the beta weight for each variable is identical to its r value. When there is an interaction, these main effects may be misleading.

Dummy Mediator Variables and Centered Continuous Predictor Variables

Interpretability of regression coefficients can be improved by ‘centering’ continuous predictor variables, especially when zero is not meaningful on a scale (e.g., an SAT score of 0 is meaningless). Centering is accomplished by subtracting the mean from the variable. Thus, a centered score is a deviation score. Centering reduces multicollinearity or overlap of the interaction term with other predictors and may improve interpretability. Cohen, Cohen, West, and Aiken (2003, p. 267) recommend that continuous predictor variables be centered before interaction terms are computed, unless the variable has a meaningful zero (also see Marquardt, 1980). We do not center the dependent variable, because we wish to predict values on the original scale.

In our example, the mean on Education is 13.02. We create a centered education variable by subtracting 13.02 from Years of Education for each case. The syntax for this analysis is shown in Appendix A.

Table 3: Moderation Effects of Sex on Education in Predicting Occupational Prestige, Education Centered and Sex Dummy Coded (N=1415)

Step  Variable            r         R2 Change   B          SEB    Beta
1     Education (years)   .520***   .270***     2.080***   .142   .471***
2     Sex (M=0, F=1)      -.063**   .001        -.720      .599   -.027
3     Education x Sex     ---       .002*       .412*      .201   .066*
      (Constant)                                43.401     .450

*p<.05; **p<.01; ***p<.001. Cumulative R squared = .273; Adjusted R squared = .271.

Compare the B coefficients for Sex in Tables 2 and 3. With uncentered Education in Table 2, the test of Sex is for a sex difference when Education = 0 years. With centered variables in Table 3, Education is centered to a mean of zero. Thus, the test of Sex is for a sex difference when Education is at the mean level of education.
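The link between the uncentered weights (Table 2) and the centered, dummy-coded weights (Table 3) is plain algebra: substitute X = C + 13.02 and Z = D + 1 into the uncentered equation and collect terms. The sketch below carries out that substitution with the reported weights. The variable names are hypothetical, and small discrepancies from Table 3 in the third decimal are expected because the inputs are rounded.

```python
# Final-model weights with uncentered Education and Sex coded 1/2 (Table 2)
b0, b1, b2, b3 = 22.403, 1.668, -6.083, 0.412
MEAN_EDUC = 13.02   # sample mean of Education reported in the text
SEX_SHIFT = 1       # dummy code = original code - 1 (M: 1 -> 0, F: 2 -> 1)

# Substitute X = C + MEAN_EDUC and Z = D + SEX_SHIFT into
# Y = b0 + b1*X + b2*Z + b3*X*Z and collect terms in C, D, and C*D.
c0 = b0 + b1 * MEAN_EDUC + b2 * SEX_SHIFT + b3 * MEAN_EDUC * SEX_SHIFT
c1 = b1 + b3 * SEX_SHIFT      # slope on centered Education for males (D = 0)
c2 = b2 + b3 * MEAN_EDUC      # sex difference at the mean of Education
c3 = b3                       # the interaction weight is unchanged

print(f"constant = {c0:.3f}, Education = {c1:.3f}, "
      f"Sex = {c2:.3f}, Education x Sex = {c3:.3f}")
```

The interaction weight does not move, which is why centering changes the interpretation of the main effects but not the test of the interaction.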

We can interpret the constant in Table 3 (B0 = 43.401) as the mean value on Occupational Prestige when all predictors are zero, i.e., for males (Sex = 0) at the mean of education (centered Education = 0). We can also interpret the regression weight on Sex (B2 = -.720) as the difference in Occupational Prestige for males and females at the mean on education. In contrast, in Table 2, the weight on Sex (B2 = -6.083) is the modeled difference between Occupational Prestige for males and females at zero years of education. That information is probably less interesting than the sex difference at the average level of education.

In Table 3 we show the simpler and more common convention of reporting both B and beta for the final model. The beta values for the final model with uncentered data (not shown in Table 2) would have been .378, -.231, and .244, respectively, which are not easily interpreted because of the great overlap between the interaction term and the main effects, and because the tests for the main effects are not at the means of the original scales.

Comparing Tables 2 and 3, we see that the r and R2 Change values are the same. No conclusions are changed.

Multicollinearity and Tolerance

Multicollinearity is the proportion of variance in a predictor that can be predicted from other predictor variables. In contrast, tolerance is one minus multicollinearity, or the proportion of variance in a predictor that cannot be predicted from other predictor variables. If all predictors were independent, the R2 for predicting any one from the others would be zero, and tolerance would be 1.00.

The tolerance for EdxSex = .036. This means that if we were to use multiple regression to predict EdxSex using all of the other predictor variables in the model with uncentered variables (Education and Sex), we would find R2 = .964, and 1 - R2 = 1 - .964 = .036. The error term for a regression weight is inflated in proportion to the inverse of tolerance. This term is the VIF (Variance Inflation Factor). For EdxSex, VIF = 27.5, and tolerance = 1/27.5 = .036.

When predictor variables overlap substantially (high multicollinearity, low tolerance), regression weights are unstable. When two predictors are highly correlated, it is likely that neither one will make a unique contribution to the model, even if each is a good predictor. In this case, it may be desirable to eliminate one of the predictors or to make a composite of the two.

When variables are centered, generally there is much less overlap between the interaction and the two main effects. Centering reduces multicollinearity with the interaction term, so the B coefficients are much more stable (compare the SEB in the two tables). However, the test of the interaction term is not affected by centering, as we can see by comparing the R2 Change, B, and SEB for the interaction term in Tables 2 and 3.

SPSS Example: Moderation Effects with a Continuous Moderator

With two continuous predictors, the interaction term is again computed as the product of two predictors, and it is tested as the contribution of the interaction term beyond the main effects in predicting the dependent variable. However, presenting findings is more challenging.

Is the correlation between education and occupational prestige greater for those people whose mothers have more years of formal education? In this example, we will test whether mother’s education moderates the relationship between years of education and occupational prestige. The data from this example are also from the 1991 U.S. General Social Survey.

Both predictors are centered to a mean of zero. The SPSS analysis for centered continuous predictor variables is shown in Table 4, and Table 5 illustrates a summary table presentation of these results. SPSS syntax for this analysis is shown in Appendix B.

Table 4: SPSS Output for Moderation with a Continuous Moderator (N=1162)
[SPSS Coefficients output for Models 1 to 3. Dependent Variable: prestg80 R's Occupational Prestige Score (1980). Predictors: (Constant), CEDUC2, CMAEDUC, cedxmaed. Columns: Unstandardized Coefficients (B, Std. Error), Standardized Coefficients (Beta), t, Sig., Correlations (Zero-order, Partial, Part), Collinearity Statistics (Tolerance, VIF).]

This moderation analysis is designed to detect a linear by linear interaction. That is, it tests the extent to which the strength of the linear relationship between a predictor and the dependent variable is a linear function of the level of a second predictor variable.

Table 5: Moderation Effects of Mother’s Education on Respondent’s Education in Predicting Occupational Prestige (N=1162)

Step  Variable            r         R2 Change   B        SEB    Beta
1     Education (years)   .507***   .258***     2.624    .133   .554***
2     Mother’s Educ       .148***   .005**      -.273    .107   -.071*
3     Educ x Mom Educ     ---       .006**      .081     .027   .077**
      (Constant)                                43.343   .351

*p<.05; **p<.01; ***p<.001. Cumulative R squared = .268; Adjusted R squared = .266; F(3, 1158) ≈ 141, p < .001.

Education is a strong predictor of Occupational Prestige (r = .507, p < .001). Mother’s Education adds significantly on the second step (R2 Change = .005, p < .01). Notice that the beta for Mother’s Education in Model 2 in Table 4 is negative (-.076, p < .01) although the zero-order correlation is positive (r = .148, p < .001). This may be a surprising finding. Occupational

23 years. the value on the centered scale (cmaeduc) is 6 .343 + 2.21) + .443.557 more points on the predicted Occupational Prestige scale.41 = 6. respectively. we might select levels of Mother’s Education at the mean and at one standard deviation above and below the mean. When we replace cmaeduc with -4. To construct a figure. for people of a given level of education. such as 16. or ˆ = 41.79 and 3.59. This is the model for cases where maeduc = 6. et al. The statistically significant interaction must be considered when main effects are interpreted.41 = -7.046*(ceduc). For respondents with 6 years of education.624*(ceduc) .. The beta weight for Education in Model 2 (. 77-78).21)*(ceduc). great care must be taken with conversions between the raw and centered scales.294 fewer points on the Occupational Prestige scale.44 = 7. We can use Excel with these values to generate a plot of the modeled relationships as shown in Figure 8 (generated with Plotting Regression Interactions.343 + 2.41.79.21 for cmaeduc.236*(ceduc).35 years.651 + 2.273*(-4. In this example.. For someone with average education (centered Education = 0). Occupational Prestige is greater for those whose mothers have less education.79 = -4. each additional year of Mother’s Education is associated with . the value on cmaeduc is 16 . on average. available on 12 Session 2a: Moderation Analysis with Regression .Prestige is greater for those whose mothers have more education. pp.79 + 3. it might be better to pick more meaningful high and low values for Mother’s Education. each additional year of education is associated with 2.10. we could use a high value of 10.081*(ceduc * cmaeduc). An indicator of suppression is when the beta weight for a variable is not between zero and the correlation of that variable with the dependent variable (Y). In our example. The model for uncentered variables is easier to use when generating a figure. the regression equation becomes ˆ = 43.79) + .796. and a low value of 10. 
We have a modest ‘suppression’ relationship (see Cohen.79. 12.624*(ceduc) .273*(5. ceduc = 6 – 13. which is 5.21.79 years.343 + 2.081*(5.41 years. For someone whose mother has average education (centered Mother’s Education = 0). the regression equation becomes ˆ = 43.79 = 5.081*(-4..XLS.41 and 2. and for Mother’s Education these values are 10. 2003. Model 2 does not consider the interaction between predictors. (2003). or ˆ = 44. Suppression may or may not be large enough to be practically or theoretically interesting.507). Entering 5. the mean and SD for Education are 13. A figure can be very helpful to describe complex findings such as these.921 + 3. the mean of 10. For cases where Mother’s Education is 6. ceduc = 20 – 13.21 above the mean of maeduc. we can use information from the regression model of the relationship between Education and Occupational Prestige for each of several levels of Mother’s Education.79)*(ceduc). If centered variables are used to create a figure. Here is an example. For Mother’s Education.539) is greater than the zero-order correlation (.273*(cmaeduc) + .79–3. The final regression model for centered data in Table 4 is ˆ = 43. and 6 years.10. The mean education for respondents was 13.624*(ceduc) . For cases where Mother’s Education is 16. This is the model for cases where maeduc = 16. Following Cohen et al. indicating that mother’s education ‘suppresses’ the relationship between Education and Occupational Prestige when it is not controlled.44 = 14. however. For respondents with 20 years of education..
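The conversion between raw and centered scales is easy to get wrong by hand, so it can help to script it. The sketch below recomputes the simple regression equations for selected raw levels of Mother's Education from the final centered model. The weights and the mean of 10.79 come from the text; the function name `simple_equation` is hypothetical.

```python
# Final model for centered data (Table 4 in the text):
# Y = 43.343 + 2.624*ceduc - .273*cmaeduc + .081*ceduc*cmaeduc
B0, B_ED, B_MA, B_INT = 43.343, 2.624, -0.273, 0.081
MEAN_MAEDUC = 10.79   # mean of Mother's Education reported in the text


def simple_equation(maeduc_years):
    """Intercept and slope on ceduc at a given raw level of Mother's Education."""
    cmaeduc = maeduc_years - MEAN_MAEDUC     # convert raw years to centered scale
    intercept = B0 + B_MA * cmaeduc
    slope = B_ED + B_INT * cmaeduc           # simple effect of Education
    return round(intercept, 3), round(slope, 3)


equations = {years: simple_equation(years) for years in (6, 12, 16)}
for years, (intercept, slope) in equations.items():
    print(f"maeduc = {years}: Y = {intercept} + {slope}*(ceduc)")
```

The equations for 6 and 16 years match the hand calculations above; the 12-year line is obtained the same way.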

We can use Excel with these values to generate a plot of the modeled relationships as shown in Figure 8 (generated with Plotting Regression Interactions.XLS, available on http://WISE.cgu.edu under WISE Stuff). The figure shows the size and direction of the effects more clearly than tabled numbers, and is more suitable for a nontechnical audience.

Figure 8: Modeled Occupational Prestige as a Function of Education and Mother's Education (N = 1162)
[Line graph: Occupational Prestige (0 to 70) by Respondent's Education (6 to 20 years), with separate lines for Mother’s Education of 6, 12, and 16 years]

Keep in mind that a modeled description of the data is not a complete description of actual data, and it may give a misleading impression of regularity in the data. Be sure to plot the raw data to assure that models are appropriate. Especially be careful not to over interpret patterns in the model at the extremes of observed data. Estimates near the ends of the distribution are less reliable than estimates from the middle. In our model, the large effect of Mother’s Education when the respondents’ Education = 6 should not be taken seriously without additional evidence. Very few respondents had as little as six years of education.

Other Issues

Many additional issues are discussed in detail in comprehensive textbooks such as Cohen, et al. (2003) or specialized books such as Aiken and West (1991). Berger (2004) provided short introductions to categorical variables, interactions, centering, multicollinearity, outliers, missing data, nonlinear relationships, power analysis and sample size, correlation and causation, adjusted R squared, and stepwise vs. hierarchical selection of variables.

Summary and Final Advice
The most important advice is to get close to your data and make sure that your models and descriptions are appropriate to the data. A residual plot can help you spot extreme outliers or departures from linearity. Bivariate scatter plots can also provide helpful diagnostics, but a plot of residuals is the best way to find multivariate outliers. A transformation of your data (e.g., log or square root) may reduce the effects of extreme scores, make relationships more linear, and make the distributions closer to normal (see Tabachnick & Fidell, 2007, Chapter Four on "Cleaning Up Your Act").

An assumption of regression analysis is that residuals are random, independent, normally distributed, and homoscedastic (equal variance at all values of predicted Y). It is essential to examine the plot of residuals as a function of predicted Y.

Estimates and tests of mediation and moderation are based on assumptions. In particular, we must assume that residuals from the regression models are reasonably normally distributed. Additionally, sampling must be random and independent if we wish to generalize to the population from which the sample was selected.

Keep in mind that alternate models may also account for the data. A model that hypothesizes causal flow in a different direction may fit the data equally well and also produce statistically significant effects. Be on the lookout for omitted 'lurking' variables that may affect multiple variables in your model. Perhaps when these prior variables are included, the direct (unique) contributions of observed variables will change.

In general, it is important to include effect sizes and directions of effects along with statistical significance. G*Power is a wonderful free program for power analysis that you can download from http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/.

You can find discussion of mediation analysis in program evaluation, a glossary of terms, and additional summary advice in Berger (2004).
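The advice to examine residuals as a function of predicted Y can be tried outside SPSS as well. The following is a minimal sketch, assuming a single predictor and made-up data (the values and the 2.5 cutoff are illustrative assumptions, not from the workshop data set). It fits a least-squares line, standardizes the residuals, and flags cases that sit far from the trend.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Least-squares intercept and slope for a one-predictor model."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def standardized_residuals(x, y):
    """Return (predicted Y, standardized residual) pairs, one per case."""
    intercept, slope = fit_line(x, y)
    predicted = [intercept + slope * xi for xi in x]
    residuals = [yi - pi for yi, pi in zip(y, predicted)]
    sd = pstdev(residuals)  # least-squares residuals have mean zero
    return [(p, r / sd) for p, r in zip(predicted, residuals)]


# Made-up illustrative data; the last case is deliberately far from the trend.
education = [8, 10, 12, 12, 14, 16, 16, 18, 20, 12]
prestige = [30, 34, 40, 42, 46, 52, 50, 58, 62, 75]

pairs = standardized_residuals(education, prestige)

# Flag cases whose standardized residual exceeds an arbitrary cutoff of 2.5.
flagged = [i for i, (_, z) in enumerate(pairs) if abs(z) > 2.5]
print("Cases to inspect:", flagged)
```

Plotting the pairs (predicted Y on the horizontal axis, standardized residual on the vertical axis) gives the residual plot described above. A flagged case is a prompt to look at the data, not a reason to delete it automatically.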

S. and statistical considerations. Y. S. (2007). Using regression analysis. M. CA: Sage. Muller. H. D. L. J. T. 115-134. (2004). L. Newbury Park.. Multiple regression: Testing and interpreting interactions. Journal of Personality and Social Psychology.References Aiken. G. D. Tabachnick.. McClelland. 114. (1980). (2002). 6.htm [Excellent discussion of moderation. E. 2nd ed. Cohen. In S. & Judd. 479-505.. C. E. & West. (2001). A primer on regression artifacts. J. S.. H. Handbook of program development for health behavior research. (2005)..cgu. Mahwah. 877-883. Marquardt. D.).. (2003). 376-390. & Yzerbyt. The moderator-mediator distinction in social psychological research: Conceptual. & Agras W. Boston: Allyn and Bacon.net/cm/moderation. D. S.. 89. New York: Guilford. 1173-1182. and much. Statistical difficulties of detecting interactions and moderator effects. (1991). & Newcomer. Estimating and testing mediation and moderation in within-participant designs. K. 51. A.com/spss-sas-and-mplus-macros-and-code. D. Handbook of practical program evaluation. In Wholey. Fairburn C. 852-863. S. G. West. W. 87-91. & Kenny. G. & Kenny.] Judd. M.. Cohen. (1999). A. B. Hatry. much more. Campbell.afhayes. Judd. L. Berger. CA: Sage Publications. Journal of Personality and Social Psychology. http://www. D. & Aiken. V. Wilson G.). 15 Session 2a: Moderation Analysis with Regression . A. S. Archives of General Psychiatry. Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed. moderation. Psychological Methods. & McClelland. Baron. Sussman (Ed. D.. G. D. G. A. http://davidakenny. M. G... Psychological Bulletin. C. (eds. strategic. C. You should standardize the predictor variables in your regression models. Newbury Park. T. H.). When moderation is mediated and mediation is moderated. and MPlus macros for mediation.html [Downloadable SPSS. Kenny.. D. Journal of the American Statistical Association. R.. SAS. 75. 
Web Interface for Statistics Education: WISE http://wise. Kenny. Jossey Bass. S. Donaldson. (1986). Mediators and moderators of treatment effects in randomized clinical trials. & Fidell.. C. (2001). M. P.). 59.. (1993).edu Berger.] Kraemer H. Hayes. Mediator and moderator analysis in program development. I. NJ: Lawrence Erlbaum Associates. Using multivariate statistics (5th ed. 470-496.

Appendix A: SPSS syntax for moderation analysis with a dichotomous moderator

*Syntax for Multiple Regression Workshop - moderation.
*Data file is GSS1991 from SPSS.

*First, look at the data carefully, checking for errors or violations of assumptions.
FREQUENCIES VARIABLES=sex educ prestg80
  /FORMAT=LIMIT(10)
  /STATISTICS=STDDEV MINIMUM MAXIMUM MEAN SKEWNESS SESKEW KURTOSIS SEKURT
  /HISTOGRAM NORMAL
  /ORDER= ANALYSIS .

*Note data are missing on prestg80.
*Select only cases with complete data.
USE ALL.
COMPUTE filter_$=(sex >= 0 & educ >= 0 & prestg80 >= 0).
VARIABLE LABEL filter_$ 'sex >= 0 & educ >= 0 & prestg80 >= 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .

*Recheck to make sure the filter worked as intended.
FREQUENCIES VARIABLES=sex educ prestg80
  /FORMAT=LIMIT(10)
  /STATISTICS=STDDEV MINIMUM MAXIMUM MEAN SKEWNESS SESKEW KURTOSIS SEKURT
  /HISTOGRAM NORMAL
  /ORDER= ANALYSIS .

*Create interaction term.
COMPUTE EdxSex = educ * sex .
EXECUTE .

*Hierarchical analysis with interaction entered last, using uncentered education.
REGRESSION
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA COLLIN TOL CHANGE ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT prestg80
  /METHOD=ENTER educ
  /METHOD=ENTER sex
  /METHOD=ENTER EdxSex
  /RESIDUALS HIST(ZRESID) .

Create a dummy variable for sex and center education on its mean, 13.02 years. Then calculate the interaction term. The 'EXECUTE' command must be given to create these variables before they can be used in regression.

*Recode sex to dummy variable, center education, create interaction term.
RECODE SEX (1=0) (2=1) INTO sexd.
COMPUTE CEDUC = EDUC - 13.02.
COMPUTE CEDXSEXD = SEXD * CEDUC .
EXECUTE .

*Hierarchical regression with interaction entered last, using centered education.
REGRESSION
  /variables=sexd, ceduc, cedxsexd, prestg80
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS CI R ANOVA TOL CHANGE ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT prestg80
  /METHOD=ENTER ceduc
  /METHOD=ENTER sexd
  /METHOD=ENTER cedxsexd
  /RESIDUALS HIST(ZRESID) .

Appendix B: SPSS syntax for moderation analysis with a continuous moderator

*Example with continuous moderator variable, mother's education.
*Limit analyses to cases with complete data.
USE ALL.
COMPUTE filter_$=(educ >= 0 & maeduc >=0 & prestg80 >= 0).
VARIABLE LABEL filter_$ 'educ >= 0 & maeduc >=0 & prestg80 >= 0 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE .

*Find the means for the new subset of cases.
FREQUENCIES VARIABLES=educ, maeduc, prestg80
  /STATISTICS=STDDEV MINIMUM MAXIMUM MEAN MEDIAN SKEWNESS SESKEW
  /HISTOGRAM NORMAL
  /ORDER= ANALYSIS .

*Recenter education for this reduced sample with N=1162 and create interaction term.
COMPUTE CEDUC2 = EDUC - 13.4088.
COMPUTE CMAEDUC = MAEDUC - 10.7926.
compute cedxmaed = ceduc2*cmaeduc.
EXECUTE .
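To see why the syntax centers the predictors before forming the product term, it can help to compare the correlation between a predictor and its interaction term with and without centering. The following is a minimal sketch in Python, assuming a small made-up data set (the values and the helper names `pearson_r` and `center` are illustrative assumptions, not part of the workshop materials).

```python
from statistics import mean


def pearson_r(a, b):
    """Pearson correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


def center(values):
    """Subtract the sample mean, as COMPUTE does with a typed-in mean."""
    m = mean(values)
    return [v - m for v in values]


# Made-up illustrative values for education and mother's education (years).
educ = [8, 10, 12, 12, 14, 16, 16, 18]
maeduc = [6, 12, 8, 12, 10, 16, 12, 14]

# Product term from raw scores versus product term from centered scores.
raw_product = [x * z for x, z in zip(educ, maeduc)]
ceduc, cmaeduc = center(educ), center(maeduc)
centered_product = [x * z for x, z in zip(ceduc, cmaeduc)]

r_raw = pearson_r(educ, raw_product)
r_centered = pearson_r(ceduc, centered_product)
print(f"r(educ, educ*maeduc) raw:      {r_raw:.2f}")
print(f"r(educ, educ*maeduc) centered: {r_centered:.2f}")
```

With these made-up values the raw product term is almost redundant with education, while the centered product term is only weakly related to it. Centering in this way typically improves tolerance for the lower-order terms without changing the test of the interaction itself.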

*Tables 4 and 5.
REGRESSION
  /variables=ceduc2, cmaeduc, cedxmaed, prestg80
  /DESCRIPTIVES MEAN STDDEV CORR SIG N
  /MISSING LISTWISE
  /STATISTICS COEFF OUTS R ANOVA TOL CHANGE ZPP
  /CRITERIA=PIN(.05) POUT(.10)
  /NOORIGIN
  /DEPENDENT prestg80
  /METHOD=ENTER ceduc2
  /METHOD=ENTER cmaeduc
  /METHOD=ENTER cedxmaed
  /RESIDUALS HIST(ZRESID) .

An alternative method to limit a regression analysis with a subset of variables to those cases that have complete data on all of the variables used in the full model is to use a command like this:
  /variables=ceduc2, cmaeduc, cedxmaed, prestg80
Even if the analysis used only ceduc2 and prestg80, cases missing data on cmaeduc would be omitted.

Excel workbook for plotting regression interactions
Unstandardized regression coefficients (B weights) can be copied from the Coefficients table in SPSS output files and pasted into an Excel worksheet to generate a graph showing interactions. You can download an Excel template for making figures from http://wise.cgu.edu: go to WISE Stuff, Demonstrations using Excel, Excel Downloads, Plotting Regression Interactions.

With two categorical variables, bar graphs may be the best way to present the relationships. The example below is taken from a data set provided by SPSS, showing starting salaries for men and women in different colleges. There is an interaction between gender and college in predicting starting salary.
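The calculation behind an interaction plot built from B weights is simple enough to sketch directly. The following Python example is an illustration only: the coefficients are hypothetical placeholders, not values from the workshop output, and the function names are assumptions made for this sketch. It computes predicted Y along X at three levels of a centered moderator Z, together with the simple slope of X at each level.

```python
def predicted_y(b0, b_x, b_z, b_xz, x, z):
    """Predicted Y from a model with X, Z, and their product term."""
    return b0 + b_x * x + b_z * z + b_xz * x * z


def simple_slope(b_x, b_xz, z):
    """Slope of Y on X at a chosen value of the moderator Z."""
    return b_x + b_xz * z


# Hypothetical B weights (constant, X, Z, X*Z); replace these with the
# unstandardized coefficients from your own Coefficients table.
coef = {"b0": 40.0, "b_x": 2.0, "b_z": 0.5, "b_xz": -0.1}

x_points = (-6.0, 0.0, 6.0)  # centered X values to plot along
z_levels = (-4.0, 0.0, 4.0)  # centered moderator values, one line each

# One list of (x, predicted y) points per moderator level.
lines = {
    z: [(x, predicted_y(x=x, z=z, **coef)) for x in x_points]
    for z in z_levels
}

for z, points in lines.items():
    slope = simple_slope(coef["b_x"], coef["b_xz"], z)
    print(f"Z = {z:+.1f}: slope of X = {slope:.2f}, points = {points}")
```

Each entry in `lines` corresponds to one line in the graph. Lines that are not parallel reflect the interaction; with the placeholder weights above, the slope of X shrinks as Z increases.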

The Excel workbook Plotting Regression Interactions offers templates where regression coefficients from SPSS can be copied into the template to produce graphs. Here are examples showing interactions with education, gender, etc., in predicting occupational prestige.
