
Mediation and Multi-group

Lyytinen & Gaskin
In an intervening variable model,
variable X is postulated to exert an
effect on an outcome variable, Y,
through one or more intervening
variables called mediators (M).
Mediational models advance an X → M → Y
causal sequence and seek to
illustrate the mechanisms through which
X and Y are related. (Mathieu & Taylor)

Why Mediation?
Seeking a more accurate explanation of the
causal effect the antecedent (predictor)
has on the DV (criterion, outcome)
Focus on the mechanisms that make up the causal chain
Missing variables in the causal chain
Intelligence Performance
Intelligence Work Effectiveness

Conditions for mediation
(1) justify the causal order of variables including
temporal precedence;
(2) reasonably exclude the influence of outside factors;
(3) demonstrate acceptable construct validity of
their measures;
(4) articulate, a priori, the nature of the intervening
effects that they anticipate; and
(5) obtain a pattern of effects that are consistent
with their anticipated relationships while also
disconfirming alternative hypotheses through
statistical tests.

Conditions for mediation
Inferences of mediation are founded first and foremost
in terms of theory, research design, and the construct
validity of measures employed, and second in terms of
statistical evidence of relationships.
Mediation analysis requires:
1) Inferences concerning mediational X → M → Y
relationships hinge on the validity of the assertion that
the relationships depicted unfold in that sequence
(Stone-Romero & Rosopa, 2004). As with SEM, multiple
qualitatively different models can be fit equally well to
the same covariance matrix. Using the exact same data,
one could as easily confirm a Y → M → X mediational
chain as an X → M → Y sequence (MacCallum,
Wegener, Uchino, & Fabrigar, 1993).

Conditions for mediation
2) The aim of experimental designs is to isolate and test, as best as possible,
X → Y relationships from competing sources of influence. In
mediational designs, however, this focus is extended to a three-phase
X → M → Y causal sequence requiring random assignment to
both X and M and related treatments.
Because researchers may not be able to randomly assign participants to
conditions, the causal sequence X → M → Y is vulnerable to any selection-related
threats to internal validity (Cook & Campbell, 1979; Shadish et al., 2002). To the
extent that individuals' status on a mediator or criterion variable may alter their
likelihood of experiencing a treatment, the implied causal sequence may also be
compromised. For example, consider a typical training → self-efficacy → performance
mediational chain. If participation in training is
voluntary, and more efficacious people are more likely to seek training, then the
true sequence of events may well be
self-efficacy → training → performance. If higher performing employees develop
greater self-efficacy (Bandura, 1986), then the sequence could actually be
performance → efficacy → training. If efficacy and performance levels remain
fairly stable over time, one could easily misconstrue and find substantial support
for the training → efficacy → performance sequence when the very reverse is
actually occurring. (Mathieu & Taylor, 2006)
Conditions of mediation
It is a hallmark of good theories that they articulate how and
why variables are ordered in a particular way (e.g., Sutton &
Staw, 1995; Whetten, 1989). This is perhaps the only basis for
advancing a particular causal order in non-experimental studies
with simultaneous measurement of the antecedent, mediator,
and criterion variables (i.e., classic cross-sectional designs).
Implicitly, mediational designs advance a time-based model of
events whereby X occurs before M, which in turn occurs before Y. It
is the temporal relationships of the underlying phenomena that
are at issue, not necessarily the timing of measurements.
In mediation analyses, omitted variables
represent a significant threat to the validity of the X → M relationship if
they are related both to the antecedent and to the mediator, and
have a unique influence on the mediator. Likewise, omitted
variables (and related paths) may lead one to conclude falsely that no
direct effect X → Y exists, while in fact it holds in the population.

Importance of theory
Cause and effect
[Figure: alternative causal orderings of Training, Self-efficacy, and Performance]
Types of Mediation
[Figure legend: Significant Path, Insignificant Path]

Indirect Effect

Partial Mediation

Full Mediation
More complex mediation structures
Chain Model
X → M1 → M2 → M3 → Y

Parallel Model
[Figure: X reaches Y through parallel mediators, e.g., X → M2 → Y]
Hypothesizing Mediation
All types of mediation need to be explicitly hypothesized,
with good theoretical reasons and logic,
before testing them
Indirect Effect
You still need to assume and test that X has an
indirect effect on Y, even though there is no effect in the direct X → Y path
X has an indirect, positive effect on Y, through M.
Partial or Full
M partially/fully mediates the effect of X on Y.
The effect of X on Y is partially/fully mediated by M.
The effect of X on Y is partially/fully mediated by M1,
M2, & M3.
Statistical evidence of relationships.
Each type of mediation needs to be backed by appropriate
statistical analysis
Sometimes the analysis can be based on OLS, but in most
cases it needs to be backed by SEM-based path analysis
There are four types of analyses to detect the presence of
mediation relationships
1. Causal steps approach (Baron & Kenny, 1986) (tests for
significance of different paths)
2. Difference in coefficients (evaluates the changes in
betas/coefficients and their significance when new paths are
added to the model)
3. Product of effects approach (tests for indirect effects a*b; this
always needs to be tested or evaluated using bootstrapping)
4. Sometimes evaluating differences in R squares
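As a rough illustration of the quantities the first three approaches work with, the sketch below estimates the paths by OLS on simulated data. Everything in it is an assumption made for illustration: the data, the true path values (0.5, 0.4, 0.1), the sample size, and the helper names. The Python names yx, mx, yx_m, and ym_x stand for the total effect, the X → M path, the direct effect, and the M → Y path; significance testing is left out.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance of two equally long lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def simple_slope(x, y):
    """OLS slope of y regressed on x alone."""
    return cov(x, y) / cov(x, x)

def partial_slopes(x, m, y):
    """OLS slopes of y regressed on x and m together: (slope of x, slope of m)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    return (smm * sxy - sxm * smy) / det, (sxx * smy - sxm * sxy) / det

# Hypothetical data: X -> M with 0.5, M -> Y with 0.4, direct X -> Y with 0.1.
rng = random.Random(42)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.1 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

yx = simple_slope(x, y)              # causal steps: Y on X alone (total effect)
mx = simple_slope(x, m)              # causal steps: M on X
yx_m, ym_x = partial_slopes(x, m, y) # causal steps: Y on X and M together

print(f"total effect yx            = {yx:.3f}")
print(f"direct effect yx.m         = {yx_m:.3f}")
print(f"difference yx - yx.m       = {yx - yx_m:.3f}")  # difference in coefficients
print(f"product mx * ym.x          = {mx * ym_x:.3f}")  # product of effects
```

With observed variables and OLS, the difference in coefficients and the product of effects are the same number, so the two approaches differ mainly in how the significance of that number is tested.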

Statistical evidence of relationships
Convergent validity is critical for mediation tests because it forms
the basis for reliability. Poor reliability of the mediator is a particular concern:
to the extent that a mediator is measured with less than
perfect reliability, the M → Y relationship would likely be
underestimated, whereas the X → Y relationship would likely be
overestimated when the antecedent and mediator are
considered simultaneously (see Baron & Kenny, 1986).
Discriminant validity must be gauged in the context of the larger
nomological network within which the relationships being
considered are believed to reside. Discriminant validity does not
imply that measures of different constructs are uncorrelated;
the issue is whether measures of different variables are so
highly correlated as to raise questions about whether they are
assessing different constructs. It is incumbent on researchers to
demonstrate that their measures of X, M, and Y evidence
acceptable discriminant validity before any mediational tests are performed.
Statistical evidence of

Statistical evidence of the
In simple partial mediation, mx is the coefficient for X
for predicting M, and ym.x and yx.m are the coefficients
predicting Y from both M and X, respectively. Here yx.m
is the direct effect of X, whereas the product mx*ym
(writing ym for ym.x) quantifies the indirect effect of X on Y through M. If all
variables are observed, then yx = yx.m + mx*ym, or
mx*ym = yx - yx.m, where yx is the coefficient for X when Y is predicted from X alone
Indirect effect: the amount by which two cases who
differ by one unit on X are expected to differ on Y
through X's effect on M, which in turn affects Y
Direct effect: the part of the effect of X on Y that is
independent of the pathway through M
Similar logic can be applied to more complex situations
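The decomposition can be derived by substituting the equation for M into the equation for Y. The sketch below writes each coefficient as b with the slide's subscript (an assumed notation) and uses i and e for intercepts and residuals; the numeric values in the last line are hypothetical and only illustrate the arithmetic.

```latex
\begin{align*}
M &= i_1 + b_{mx} X + e_1 \\
Y &= i_2 + b_{yx.m} X + b_{ym.x} M + e_2 \\
  &= (i_2 + b_{ym.x}\, i_1) + (b_{yx.m} + b_{mx}\, b_{ym.x})\, X + (b_{ym.x}\, e_1 + e_2) \\
\Rightarrow\quad b_{yx} &= b_{yx.m} + b_{mx}\, b_{ym.x} \\[4pt]
\text{e.g. } b_{mx} = 0.5,\ b_{ym.x} = 0.4,\ b_{yx.m} = 0.1
  &\;\Rightarrow\; \text{indirect} = 0.5 \times 0.4 = 0.20, \quad b_{yx} = 0.10 + 0.20 = 0.30
\end{align*}
```

The third line is a regression of Y on X alone because the residual e_1 is uncorrelated with X, which is why the coefficient in front of X there equals the total effect.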

What would be the paths

Statistical analysis
The testing of the existence of the mediational effect
depends on the type of indirect effect
The lack of a direct effect X → Y (yx is either zero or not
significant) is not a demonstration of the lack of a
mediated effect
Therefore three different situations prevail:
1. The presence of an indirect effect (mx*ym is significant)
2. The presence of full mediation (yx is significant but
yx.m is not)
3. The presence of partial mediation (yx is significant
and yx.m is nonzero and significant)

Testing for indirect effect

Testing for full mediation

Testing for partial mediation

Observations of statistical
The key is to test for the presence of a significant
indirect effect; just demonstrating the significance of
paths yx, yx.m, ym.x, and mx is not enough
One reason is that Type I testing of statistical
significance of paths is not based on inferences on
indirect effects as products of effects and their
Can be done either using Sobel test (see e.g. or bootstrapping
The Sobel test assumes normality of the product term and
relatively large sample sizes (>200)
It lacks power with small sample sizes or if the distribution
is not normal
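For reference, the first-order form of the Sobel statistic is sketched below. The notation is an assumption: b_mx and b_ym.x are the two path coefficients and s_mx and s_ym.x their standard errors. The ratio is compared against the standard normal distribution, which is where the normality assumption enters.

```latex
\[
z \;=\; \frac{b_{mx}\, b_{ym.x}}
             {\sqrt{\, b_{ym.x}^{2}\, s_{mx}^{2} \;+\; b_{mx}^{2}\, s_{ym.x}^{2} \,}}
\]
```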

Bootstrapping (available in most statistical packages, or there is
additional code to accomplish it for most software packages)
Samples the distribution of the indirect effect by treating the obtained
sample of size n as a miniature representation of the population
and then resampling the sample randomly with replacement: a
resample of size n is built by drawing cases from the original sample,
with any case once drawn thrown back so that it can be redrawn
mx, ym, and their product are estimated and recorded for each resample
The process is repeated k times, where k is large (>1000)
Hence we have k estimates of the indirect effect, and their distribution
functions as an empirical approximation of the sampling distribution of
the indirect effect when taking a sample of size n from the original population
Specific lower and upper bounds for the confidence interval are established
by finding the ith lowest and jth largest values in the rank-ordered
estimates; the null hypothesis that the indirect effect is zero is rejected with
e.g. 95% confidence if the interval excludes zero
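The resampling procedure described above can be sketched in a few lines of plain Python. This is a minimal percentile-bootstrap illustration, not the routine any particular package uses; the simulated data, the true path values (0.6 and 0.5), k = 2000, and all function names are assumptions.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Sample covariance of two equally long lists."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def indirect_effect(x, m, y):
    """Product of the X -> M slope and the M -> Y slope (controlling for X)."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    mx = sxm / sxx
    ym = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return mx * ym

def bootstrap_ci(x, m, y, k=2000, level=0.95, seed=1):
    """Percentile confidence interval for the indirect effect from k resamples."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(k):
        # Draw n case indices with replacement; whole cases are kept together.
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    tail = (1 - level) / 2
    i = round(tail * k)            # position of the lower bound in the ordering
    j = round((1 - tail) * k) - 1  # position of the upper bound in the ordering
    return estimates[i], estimates[j]

# Hypothetical data with a true indirect effect of 0.6 * 0.5 = 0.30.
rng = random.Random(7)
n = 300
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.5 * mi + rng.gauss(0, 1) for mi in m]

lower, upper = bootstrap_ci(x, m, y)
print(f"indirect effect = {indirect_effect(x, m, y):.3f}")
print(f"95% percentile interval = [{lower:.3f}, {upper:.3f}]")
print("indirect effect differs from zero" if lower > 0 or upper < 0
      else "interval includes zero")
```

Resampling whole cases (the same index for X, M, and Y) keeps the relationships among the three variables intact in each resample, which is what lets the k products approximate the sampling distribution without assuming normality.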
Observations of statistical
In full and partial mediation the bivariate X → Y relationship (assessed via correlation
rYX or coefficient yx) must be nonzero in the population if the effects
of X on Y are mediated by M
Hence establishing a significant bivariate relationship is conditional on sample size
For example, assume that N=100 and sample correlations rXM=.30 and
rMY=.30; both would be significant at p<.05. However, the sample
correlation rXY=.09 (= .30 × .30, as implied under full mediation) would not!
Hence tests for full mediation can be precluded even if this is the true
model in the population
This point becomes even more challenging when complex mediations
X → M1 → M2 → M3 → Y are present.
Hence full mediations often go undetected due to
underpowered designs; the same holds for interactions or
suppression variables; in fact the four-step Baron & Kenny approach has power of .52
with a sample size of 200 to detect a medium effect!
This can be overcome by bootstrapping
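The N=100 example can be checked numerically. The sketch below uses the Fisher z approximation for the significance of a correlation, which is an assumed choice for illustration (a t-test on r leads to the same conclusions here); the function names are hypothetical.

```python
from math import atanh, sqrt
from statistics import NormalDist

def correlation_p_value(r, n):
    """Two-tailed p-value for H0: rho = 0, using the Fisher z approximation."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def smallest_significant_n(r, alpha=0.05):
    """Smallest sample size at which a sample correlation r reaches p < alpha."""
    n = 4
    while correlation_p_value(r, n) >= alpha:
        n += 1
    return n

for label, r in [("rXM", 0.30), ("rMY", 0.30), ("rXY", 0.09)]:
    p = correlation_p_value(r, 100)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{label} = {r:.2f}, N = 100: p = {p:.3f} ({verdict})")

print("N needed for r = .09 to reach p < .05:", smallest_significant_n(0.09))
```

The sample size needed for the bivariate X → Y correlation is several times the N at which each individual path is already significant, which is the sense in which a significant-total-effect requirement makes designs underpowered.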

Observations of statistical analysis
Testing for full mediation requires that yx.m
is zero. When yx.m does not drop to zero the
evidence supports partial mediation. This
requires researchers to make a priori
hypotheses concerning full or partial
mediation; otherwise confirmatory
tests turn into exploratory data mining
What counts as a significant reduction from yx
to yx.m is not clear (cf. from .15 to .05 vs.
.75 to .65)
Typically the baseline model for mediation
is partial mediation while theoretical clarity
Testing for Mediation in
Direct Effects First

Regression Weights      Estimate   S.E.   C.R.     P
loylong <--- ctrust     .282       .048   5.812    ***
loylong <--- atrust     .184       .048   3.850    ***
Testing for Mediation in
Add Mediator

Regression Weights      Estimate   S.E.   C.R.     P
value <--- atrust       .210       .048   4.400    ***
value <--- ctrust       .602       .048   12.452   ***
loylong <--- ctrust     .089       .056   1.592    .111
loylong <--- atrust     .123       .047   2.638    .008
loylong <--- value      .312       .052   5.935    ***
Testing significance of partially
mediated paths: Sobel Test
Use for partially mediated relationships.
Use the Sobel Test online calculator
Assumes normal distribution
and sufficiently large sample
Regression Weights      Estimate   S.E.   C.R.     P
value <--- atrust       .210       .048   4.400    ***
value <--- ctrust       .602       .048   12.452   ***
loylong <--- ctrust     .089       .056   1.592    .111
loylong <--- atrust     .123       .047   2.638    .008
loylong <--- value      .312       .052   5.935    ***
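As an alternative to an online calculator, the Sobel statistic can be computed directly from the estimates and standard errors in the table. The sketch below uses the first-order Sobel formula; the function name and the normal-approximation p-value are assumptions for illustration, while the numbers are the table's own.

```python
from math import sqrt
from statistics import NormalDist

def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test for the indirect effect a*b.

    a, se_a: estimate and standard error of the predictor -> mediator path
    b, se_b: estimate and standard error of the mediator -> outcome path
    Returns (indirect effect, z statistic, two-tailed p-value).
    """
    indirect = a * b
    se_indirect = sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return indirect, z, p

# Mediator -> outcome path from the table: loylong <--- value
b, se_b = 0.312, 0.052

# Predictor -> mediator paths from the table: value <--- ctrust, value <--- atrust
paths = {"ctrust": (0.602, 0.048), "atrust": (0.210, 0.048)}

results = {}
for predictor, (a, se_a) in paths.items():
    results[predictor] = sobel_test(a, se_a, b, se_b)
    indirect, z, p = results[predictor]
    print(f"{predictor} -> value -> loylong: "
          f"indirect = {indirect:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Both z statistics are well above 1.96, so under the Sobel test's normality assumption both indirect effects through value differ from zero.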
Testing significance of indirect
effects: Bootstrapping
At least 1000 bootstrap samples

No Missing Values Allowed!

Testing significance of indirect
effects: Bootstrapping (p-values)

No Mediation: indirect p > 0.05
Full Mediation: given the direct effects significant prior to adding the mediator, indirect p < 0.05 and direct p > 0.05
Partial Mediation: indirect p < 0.05 and direct p < 0.05

Direct Effects - Two Tailed
        wu      wf      aut     burnm   burnc
burnm   0.003   0.033   0.026   ...     ...
burnc   0.004   0.969   0.435   ...     ...
satc    0.845   0.026   0.260   0.016   0.007
satw    0.004   0.836   0.020   0.011   0.009

Indirect Effects - Two Tailed
        wu      wf      aut     burnm   burnc
burnm   ...     ...     ...     ...     ...
burnc   ...     ...     ...     ...     ...
satc    0.005   0.546   0.016   ...     ...
satw    0.003   0.115   0.016   ...     ...

Total Effects - Two Tailed Significance
        wu      wf      aut     burnm   burnc
burnm   0.003   0.033   0.026   ...     ...
burnc   0.004   0.969   0.435   ...     ...
satc    0.033   0.024   0.026   0.016   0.007
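The decision rules on this slide can be written as a small function. This is a sketch under assumptions: the function and argument names are hypothetical, the partial-mediation branch follows the definition given earlier under "Statistical analysis" (indirect effect significant and direct effect still significant with the mediator present), and the p-values in the usage lines are stand-ins, with 0.0005 representing a "***" entry.

```python
def classify_mediation(p_direct_before, p_direct_after, p_indirect, alpha=0.05):
    """Classify a mediated relationship from three two-tailed p-values.

    p_direct_before: X -> Y path before the mediator is added
    p_direct_after:  X -> Y path with the mediator in the model
    p_indirect:      bootstrapped indirect effect of X on Y through M
    """
    if p_indirect >= alpha:
        return "no mediation"
    if p_direct_before >= alpha:
        # Indirect effect without a significant direct path to begin with.
        return "indirect effect"
    if p_direct_after >= alpha:
        return "full mediation"
    return "partial mediation"

# Stand-in values patterned on the trust -> value -> loyalty example:
# ctrust: direct path significant before (***), p = .111 with the mediator.
print("ctrust:", classify_mediation(0.0005, 0.111, 0.0005))
# atrust: direct path significant before (***), p = .008 with the mediator.
print("atrust:", classify_mediation(0.0005, 0.008, 0.0005))
```

Checking the indirect effect first mirrors the slide's ordering: without a significant indirect effect there is nothing to classify as full or partial.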
Partial Mediation


Full Mediation

Overall value partially mediates the effect of trust in
agent on long-term loyalty (p < 0.001).
Overall value fully mediates the effect of trust in
company on long-term loyalty (p < 0.001).
Using AMOS for testing chain
models and parallel models

Moderation concept
Based on the observation that an independent-dependent
variable relationship is affected
by another independent variable
This situation is called a moderator effect,
which occurs when a moderator variable, a
second independent variable, changes the
form of the relationship between another
independent variable and the DV
Can be expanded to a situation where the
mediated relationship is moderated

Moderation: affecting the
Moderating variables must be chosen with strong
theoretical support (Hair et al., 2010)
The causality of the moderator cannot be tested
Becomes potentially confounded as the moderator
becomes correlated with either of the variables in the relationship
Testing is easiest when the moderator has no significant
relationship with other constructs
This assumption is important in distinguishing
moderators from mediators, which (by definition) are
related to both constructs of the mediated relationship

Moderation: Multi-group
Non-metric moderators: categorical
variables are hypothesized as moderators
(gender, age, turbulence vs. non-turbulence,
non-customer vs. customer)
For non-metric variables a multi-group
analysis is applied, i.e. data is split into
separate groups for analysis based on
variable values and tested for statistical
difference (both for measurement and
structural model)
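One simple way to test whether a structural path differs between two groups is a z-test on the difference between the group-specific coefficients. The sketch below is an assumed stand-in for illustration, not the SEM multi-group procedure itself: it fits a single-predictor OLS regression per group on simulated data, and the group labels, slopes, and function names are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist

def fit_slope(x, y):
    """OLS slope of y on x and the slope's standard error."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, sqrt(sse / (n - 2) / sxx)

def compare_groups(group_1, group_2):
    """z-test for the difference between two independent group slopes."""
    b1, se1 = fit_slope(*group_1)
    b2, se2 = fit_slope(*group_2)
    z = (b1 - b2) / sqrt(se1 ** 2 + se2 ** 2)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return b1, b2, z, p

def simulate_group(true_slope, n, rng):
    """Hypothetical data for one group: y = true_slope * x + noise."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [true_slope * xi + rng.gauss(0, 1) for xi in x]
    return x, y

rng = random.Random(3)
group_a = simulate_group(0.8, 150, rng)  # path is strong in this group
group_b = simulate_group(0.2, 150, rng)  # path is weak in this group

b1, b2, z, p = compare_groups(group_a, group_b)
print(f"group A slope = {b1:.3f}, group B slope = {b2:.3f}")
print(f"z = {z:.2f}, p = {p:.4f}")
print("path differs across groups" if p < 0.05 else "no evidence of moderation")
```

A significant difference between the group slopes is what the categorical variable moderating the path would look like; in a full multi-group analysis the measurement model would be compared across groups first.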

Multi-group example

[Figure: Exercise → Weight Loss, drawn once per group; one group is labeled Low]

Moderator vs. Mediator

Mediator: the means by which IV affects DV


Moderator: a variable that influences the

magnitude of the effect an IV has on a DV


Mediation vs. Moderation Example

Notice that the mediator and the moderator can be the

Can a mediator also be used as a moderator?
- see Baron and Kenny 1986 for a complex example
Some Theory-based Criteria
(i.e., arguments for mediation and moderation are based on theory
first, rather than statistical correlations)

Mediator:
Logical effect of IV
Logical cause of DV
Moderator:
Not logically correlated to IV or DV (if
Holistic/multiplicative effect (interaction)
Varying effect for different categorical values

Driving home the point: Neither,
Moderator or Mediator? One or the
Caloric intake
Positive reinforcement
Exercise partner
IQ
[Figure: Exercise → Weight Loss]
Koufteros & Marcoulides 2006