0092 6566 (86) 90129 7 PDF
HERBERT W. MARSH
AND
GARRY E. RICHARDS
INTRODUCTION
The Internal-External construct as inferred from the Rotter scale. In-
ternal-external (IE) locus of control is hypothesized to be a bipolar con-
struct. The locus is internal if a person perceives events to be contingent
upon his/her behavior or relatively enduring personal characteristics; the
locus is external when events are seen to be contingent upon luck, fate,
the control of powerful others, the environment, or some characteristic
not under his/her own control (Lefcourt, 1976, 1981; Rotter, 1966, 1975;
The authors acknowledge the assistance of Jennifer Barnes in data preparation,
and thank the participants and staff of the Outward Bound programme. Address requests
for reprints to Dr. Herbert Marsh, Faculty of Education, University of Sydney, Sydney,
NSW 2006, Australia.
0092-6566/86 $3.00
Copyright © 1986 by Academic Press, Inc.
All rights of reproduction in any form reserved.
510 MARSH AND RICHARDS
Stipek & Weisz, 1981). While a large number of I-E scales have been
developed, the most widely used is the Rotter Scale, and this instrument
will be the focus of the present investigation. The Rotter I-E scale consists
of 23 pairs of statements, using a forced-choice format, and six filler
questions. Each pair contains one internal statement and one external
statement, and subjects make a dichotomous choice between the two
alternatives. The Rotter instrument is based on the assumptions that (a)
the IE construct is relatively unidimensional, (b) internality and externality
represent endpoints of a bipolar dimension, and (c) the use of a dichotomous
forced-choice format is the most effective way to infer the construct.
The purpose of the present investigation is to examine these assumptions.
Although Rotter suggested that the IE construct inferred from his
instrument was relatively unidimensional, subsequent research has shown
it to be clearly multidimensional (e.g., Abrahamson, Schludermann, &
Schludermann, 1973; Collins, 1974; Dixon, McKee, & McRae, 1976;
Gurin, Gurin, & Morrison, 1978; Marsh & Richards, in press; Mirels, 1970;
O’Brien & Kabanoff, 1981; Watson, 1981; and Zuckerman & Gerbasi,
1977). Marsh and Richards (in press), on the basis of a review of this research
and empirical analyses with confirmatory factor analysis, found that five
factors can be identified in responses to the Rotter scale: General Luck
(GL); Political Control (PC); Success via Personal Initiative (SV); In-
terpersonal Control in Social Relations (IC); and Control in Academic
Situations (AS). It should be noted, however, that the validity of responses
to the Rotter instrument does not depend on the unidimensionality as-
sumption. Scores representing the separate facets may be useful, and
the total score may adequately reflect a higher-order or more general
construct that incorporates the specific components.
Rotter also assumed that the IE construct is bipolar: that the correlation
between independently derived measures of internality and externality
would approach -1.0 when corrected for unreliability. While Rotter
presented a theoretical justification for this assumption, it is not testable
with the forced-choice format employed in his instrument, and research
with other scales suggests that the IE construct may not be bipolar when
independent ratings are made of internal and external items (e.g., Marsh,
Cairns, Relich, Barnes, & Debus, 1984; also see Collins, 1974; Klockars &
Varnum, 1975; Zuckerman & Gerbasi, 1977). Unless this bipolarity as-
sumption can be supported, the forced-choice format used in the Rotter
IE scale may be dubious. Even if the bipolarity assumption and forced-
choice format are supported, it may be that an expanded forced-choice
scale is superior to the dichotomous scale used by Rotter.
The present investigation. Marsh, Richards, and Barnes (1986)
found that participation in the Outward Bound program produced more
internal scores on the Rotter IE scale. Since the program is specifically
designed to affect the IE construct, this finding provided support for the
ROTTER IE SCALE 511
is a questionnaire designed to find out how well a person (in this case, YOU) can assess
views which another person might hold. You should base your assessment on everything
you know about the person, that is, what they say, what they do, the way you feel they
think about things in general and themselves.” In completing this task, approximately half
the subjects used the original format in making their judgments about all the other group
members, and half used the expanded forced-choice format; responses to the independent
rating task were not collected due to time limitations. Through this process, each subject
was described by approximately three observers using the original format and by approximately
three different observers using the expanded format.
Eight sets of scores were computed for each subject: (a) three sets of scores from time 1
(self-responses to the original, the expanded, and the rating formats); (b) three corresponding
sets of self-response scores from time 2; and (c) two sets of scores representing observer-
responses at time 2 (the original and expanded formats). In each set there were five scores
representing the five factors previously identified in responses to the Rotter instrument¹
and a total score. For just the independent rating tasks, total internal and total external
scores were also computed (these could not be determined for the original and expanded
forced-choice formats). Because of the design of the study, nearly all subjects completed
all six self-report instruments (418 of 426 instruments were completed), and there were
few missing responses on the completed instruments (less than 1/2 of 1%; the group mean
was substituted for the few missing values that did occur on otherwise completed instruments).
Scores for the two sets of external observations, one for the original format and one for
the expanded format, each represented the mean responses of approximately three observers.
For the original Rotter format, the response to each of the 23 item pairs was scored 1
(internal) or 0 (external), and scores for the five components and the total consisted of
the number of internal statements that were selected. For the expanded forced-choice
format, each item pair was scored on an 11 (most internal) to 0 (most external) scale, and
scores for the five components and the total were the sum of these responses. For the
independent rating format, each item pair from the original format was represented by
independent ratings of the internal and the external statement. Each of the five components
and the total score was represented as the sum of responses to the internal statements
minus the sum of responses to the external statements. The total internal score was the
sum of responses to the internal statements, and the total external score was the sum of
responses to the external statements.
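The scoring rules just described can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' scoring program: the facet-to-item assignments are those given in the article's footnote on the five IE facets, while the function names, the dictionary layout, and the example respondent are hypothetical.

```python
# Facet membership follows the article's footnote on the five IE facets.
# Function names, data layout, and the example respondent are hypothetical.
FACETS = {
    "GL": [1, 12, 13, 15, 17, 20],
    "PC": [2, 10, 14, 18, 23],
    "SV": [5, 7, 9, 11, 22],
    "IC": [3, 6, 16, 21],
    "AS": [4, 8, 19],
}


def score_forced_choice(responses):
    """Sum item-pair scores per facet and overall.

    `responses` maps item number (1-23) to a score in which higher means
    more internal: 1/0 for the original format, or a graded value for the
    expanded forced-choice format.
    """
    scores = {name: sum(responses[i] for i in items) for name, items in FACETS.items()}
    scores["total"] = sum(responses.values())
    return scores


def score_ratings(internal, external):
    """Independent rating format: internal sum minus external sum."""
    scores = {
        name: sum(internal[i] for i in items) - sum(external[i] for i in items)
        for name, items in FACETS.items()
    }
    scores["total_internal"] = sum(internal.values())
    scores["total_external"] = sum(external.values())
    scores["total"] = scores["total_internal"] - scores["total_external"]
    return scores


# Made-up respondent who picks the internal statement on odd-numbered items.
original = {i: i % 2 for i in range(1, 24)}
print(score_forced_choice(original))
```

Only the rating format yields separate total internal and total external scores, which is why `score_ratings` returns two extra entries that the forced-choice scorer cannot provide.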
RESULTS
Program Effects with Different Formats
The purpose of the first set of analyses is to determine if the effect
of participation in the Outward Bound program, the difference between
time 1 and time 2 responses, varied as a function of the particular IE
component or the response format used to assess it. In a preliminary
analysis conducted with the commercially available MANOVA procedure
(Hull & Nie, 1981), the effect of time was highly significant, but its effect
¹ The five IE facets were defined to be the unweighted average of responses to the
following items (the numbers 1-23 refer to the 23 Rotter items, excluding the filler items,
in the order that they appear on the original Rotter instrument): General Luck (1, 12, 13,
15, 17, and 20); Political Control (2, 10, 14, 18, and 23); Success via Personal Initiative
(5, 7, 9, 11, and 22); Interpersonal Relations (3, 6, 16, and 21); Control in Academic
Situations (4, 8, and 19).
varied significantly with both the IE component and the response format.
The nature of these complex interactions was examined in a set of analyses
summarized in Table 1. For all three response formats, scores are more
internal at time 2 than time 1, and the differences are statistically significant
for all but the IC facet. The effect sizes are larger for the expanded
forced-choice and independent rating formats than for the original format.
While the ordering of the effect sizes for different IE components is not
completely consistent for the three response formats, the effects for the
GL and SV (and also the total score) tend to be the largest, while the
effect for IC fails to reach statistical significance for any of the formats.
Also, for the independent rating task, total internal scores are more
internal at time 2 than time 1, and total external scores are less external.
An evaluation of the effectiveness of the Outward Bound program is
not the major purpose of this investigation, and alternative explanations
of the time 1/time 2 differences may be viable and will be discussed in
more detail elsewhere (see Marsh et al., 1986, for further discussion).
Nevertheless, if a specific intervention designed to alter the IE construct
results in a systematic change in responses to the Rotter instrument,
then the findings provide one source of support for the construct validity
of responses to the IE scale. However, these findings also suggest that
this support for the construct validity varies with the IE component and
with the response format. In particular, the support is stronger for both
of the alternative formats than for the original format.
Note. A repeated-measures ANOVA was conducted to determine the significance of the pretest/post-test difference. While the main effect of time was highly significant, the size of this effect varied (i.e., interacted significantly) with the particular scale and the form of responses. Consequently, paired t tests were employed to examine pair-wise differences.
* p < .05; ** p < .01.
a Effect size is defined as the pretest (time 1)/post-test (time 2) difference divided by the pretest standard deviation. The tests of statistical significance presented with the effect sizes are based on paired t tests.
b For the independent rating format, the five specific factor scores and the total score are the sum of responses to internal statements minus the sum of responses to external statements. The Total Internal and Total External scores are the sum of responses to the internal and external statements.
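As a worked illustration of the effect-size definition in the table note, the following Python sketch computes the time 1/time 2 difference in pretest standard deviation units, together with a paired t statistic. The scores are invented for the example. The use of the sample (n - 1) standard deviation and the sign convention (time 2 minus time 1) are assumptions, since the note does not specify them.

```python
from math import sqrt
from statistics import mean, stdev


def effect_size(pre, post):
    """Mean time 2 minus mean time 1, divided by the time 1 standard deviation.

    Assumption: sample (n - 1) standard deviation; the note does not say which.
    """
    return (mean(post) - mean(pre)) / stdev(pre)


def paired_t(pre, post):
    """Paired t statistic for the time 1/time 2 difference (df = n - 1)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Made-up scores for five respondents, not data from the study.
time1 = [10, 12, 14, 16, 18]
time2 = [12, 13, 17, 17, 21]
print(round(effect_size(time1, time2), 2))  # 0.63
print(round(paired_t(time1, time2), 2))     # 4.47
```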
TABLE 2
CORRELATIONS AMONG ROTTER TOTAL SCORES

Scales                       1     2     3     4     5     6     7     8     9    10    11    12
Time 1 self-responses
 1. Original format       (63)
 2. Expanded                71  (71)
 3. Ratings-internal        65    56  (75)
 4. Ratings-external       -60   -54   -30  (78)
 5. Ratings-total           71    69    79   -82  (81)
Time 2 self-responses
 6. Original format         55    53    43   -42    53  (81)
 7. Expanded                44    56    42   -32    45    85  (91)
 8. Ratings-internal        50    51    55   -34    55    82    82  (88)
 9. Ratings-external       -45   -45   -28    59   -55   -76   -82   -61    WN
10. Ratings-total           53    52    45   -52    60    88    91    89   -91  (93)
Time 2 observer-responses^a
11. Original format         16    02    06   -07    08    34    29    40   -21    33  (84)^a
12. Expanded                23    31    14   -19    20    42    33    28   -30    32    41  (85)

Note. Correlations, presented without decimal points, larger than .23 are statistically significant. The values in parentheses are coefficient alpha estimates of reliability.
a The multiple observer ratings for each subject, about three for each type of response, were averaged, and this average was correlated with the self-ratings. The
reliability estimates for the observer responses are for separate sets of responses and are comparable to reliability estimates for each of the other total scores.
familiarity with the Rotter instrument. For both time 1 and time 2, responses
to the original format are least reliable, while those to the rating format
are most reliable.
An important purpose of the present investigation is to test the bipolarity
of the IE construct. While this assumption is not testable with the original
Rotter format (or the expanded forced-choice format), the correlation
between the total internal and total external scores for the rating format
does provide such a test. This correlation is -.30 and -.61 (-.39 and
-.68 after correction for attenuation) for responses from time 1 and time
2, respectively. The more negative correlation at time 2 probably again
reflects the effect of the intervention (i.e., participants became more
internal and less external), but may also reflect an increased familiarity
with the scale. While these correlations are clearly in the predicted
direction, the size of the correlations (even after correction for unreliability,
and particularly at time 1, which more closely approximates the typical
application of the instrument) may not be sufficiently large to support
the assumption of bipolarity.
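The correction for attenuation applied here is the standard one: the observed correlation is divided by the square root of the product of the two reliabilities. A short Python check, using the time 1 values reported in Table 2 (r = -.30, coefficient alphas of .75 and .78), reproduces the corrected value. The function name is hypothetical.

```python
from math import sqrt


def disattenuate(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both measures."""
    return r_xy / sqrt(rel_x * rel_y)


# Time 1: total internal vs. total external ratings, with their coefficient alphas.
print(round(disattenuate(-0.30, 0.75, 0.78), 2))  # -0.39
```

Under strict bipolarity the corrected value would approach -1.0, which is the benchmark against which the observed -.39 is judged.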
Correlations between observer-responses and self-responses offer an
important test of the validity of self-responses to the Rotter instrument
against an external validity criterion. The average correlation between
self-responses and observer responses at time 2 is .34, and this does not
seem to depend on the particular response format (though only the original
and expanded formats were completed by the external observers). These
findings offer modest support for the validity of responses to the Rotter
total score, but do not appear to offer evidence for the superiority of
any of the three response formats.
To the extent that the Outward Bound program does have a systematic
(i.e., valid) effect on the IE construct, then observer responses at the
end of the program should correlate more highly with self-responses at
the end of the program than with self-responses before the start of the
program. Inspection of the correlations in Table 2 supports this prediction
and thus offers further support for the interpretation of the intervention
and for the validity of responses to the Rotter instrument. While alternative
explanations may again be viable, these findings, coupled with the sys-
tematic shifts in responses to both internal and external items as described
earlier, do suggest that the Outward Bound program does systematically
affect the IE construct.
Specific Rotter facets-Self-responses. Multidimensionality is typically
inferred through techniques such as factor analysis and multitrait-mul-
timethod (MTMM) analysis. Marsh and Richards (in press) argued for
the existence of five factors in the Rotter instrument on the basis of a
review of previous factor analytic research and the results of their con-
firmatory factor analyses. One purpose of the present investigation is to
further test these factors with MTMM analyses. Campbell and Fiske
Scales^a    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
Original
 1. GL      -   14   46   32   30 61^b   09   53   37   24   73  -02   54   43   27
 2. PC     27    -   28  -05   04   34   64   11  -03   00   20   66   12   04   07
 3. SV     64   48    -   09   49   45   05   54   05   32   55   11   ss   18   31
 4. IC     40   07   33    -   11   10  -06   10   60   22   23  -08   17   11   08
 5. AS     39   08   46   29    -   37   09   43   12   54   34   06   32   31   46
Expanded
 6. GL     76   32   62   45   40    -   12   67   28   31   65   20   49   35   24
 7. PC     25   87   42   14   14   41    -   09   12  -13  -05   64   11   02  -06
 8. SV     62   38   70   34   42   84   48    -   13   36   52   10   68   17   27
 9. IC     44   18   39   21   34   51   29   50    -   25   21   02   22   64   09
10. AS     47   13   44   30    2   60   30   62   50    -   35  -09   30   36    2
Ratings
11. GL     II   37   72   42   43   82   45   74   50   57    -   11   64   42   42
12. PC     30   88   44   18   12   40   90   44   24   20   47    -   02   05  -05
13. SV     66   43   zz   39   44   76   44  sg.   47   53   81   45    -   28   48
14. IC     47   23   41   28   25   53   30   47   ss   43   53   29   54    -   30
15. AS     50   25   55   31   63   61   28   61   45   76   52   24   59   45    -
Coefficient alphas
Time 1     29   52   38   40   44   50   70   54   34   30   71   71   54   57   70
Time 2     55   76   59   60   56   84   89   82   69   66   80   89   81   73   78

Note. All coefficients are presented without decimal points and those larger than .23 are statistically significant.
a See Table 2 for definitions of the scales.
b The underlined coefficients are convergence coefficients, the correlation between the same trait inferred from two different response formats.
specific IE traits identified by Marsh and Richards (in press). Even though
the convergence coefficients were substantially higher for time 2 (Table 2),
support for the distinctiveness of the traits was evident for both time 1
and time 2.
Specific Rotter factors-Observer responses. Correlations among the
two sets of observer responses collected at time 2 appear as part of
Table 4. While an MTMM analysis is appropriate for the examination of
this 10 x 10 correlation matrix (i.e., five traits and two methods), the
interpretation is quite different from that described earlier. Convergence
in Table 3 represented correlations between self-responses by the same
person to the same set of items with different response formats. Con-
vergence here represents correlations between external observations by
different individuals to the same set of items with different response
formats. The application of the four Campbell-Fiske guidelines to the
MTMM matrix for the observer responses indicates that:
(1) Four of five convergence coefficients are statistically significant
(mean r = .29).
(2) Convergence coefficients (mean r = .29) are higher than other
correlations in the same row or column of the corresponding heterotrait-
heteromethod squares (mean r = .19) for only 27 of 40 comparisons.
(3) Convergence coefficients (mean r = .29) are higher than other
correlations in the same row or column of the corresponding heterotrait-
monomethod triangles (mean r = .51) for only 7 of 40 comparisons.
(4) The pattern of correlations among the different traits appears to
be similar for the two response formats (and also similar to the pattern
observed for the self-responses in that the correlation between GL and
SV is large, while correlations involving the PC factor and to a lesser
extent the IC factor tend to be lower).
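The counting behind guidelines (2) and (3) can be made concrete with a small Python sketch. It assumes two methods and a correlation matrix ordered with all traits for the first method followed by the same traits for the second. The three-trait matrix is invented for illustration and is not data from this study, and the function names are hypothetical. With five traits the same procedure yields the 40 comparisons referred to above.

```python
def symmetric(lower):
    """Expand a lower-triangular list of rows into a full symmetric matrix."""
    n = len(lower)
    return [[lower[max(i, j)][min(i, j)] for j in range(n)] for i in range(n)]


def campbell_fiske_counts(r, n_traits):
    """Count how often each convergence coefficient exceeds its comparisons.

    Returns (heteromethod wins, monomethod wins, comparisons per guideline).
    Rows/columns 0..k-1 are method A traits; k..2k-1 are method B traits.
    """
    k = n_traits
    hetero_wins = mono_wins = total = 0
    for t in range(k):
        conv = r[k + t][t]  # same trait, different methods
        for u in range(k):
            if u == t:
                continue
            # Guideline 2: same row and same column of the heteromethod block.
            for other in (r[k + t][u], r[k + u][t]):
                hetero_wins += conv > other
            # Guideline 3: trait t with trait u inside each single method.
            for other in (r[t][u], r[k + t][k + u]):
                mono_wins += conv > other
            total += 2
    return hetero_wins, mono_wins, total


# Hypothetical 3-trait, 2-method matrix (lower triangle, diagonal included).
r = symmetric([
    [1.0],
    [0.50, 1.0],
    [0.40, 0.30, 1.0],
    [0.60, 0.20, 0.15, 1.0],
    [0.25, 0.42, 0.10, 0.45, 1.0],
    [0.30, 0.12, 0.28, 0.35, 0.25, 1.0],
])
print(campbell_fiske_counts(r, 3))  # (11, 7, 12)
```

In this invented example the convergence coefficients win most heteromethod comparisons but fewer monomethod ones, the same qualitative pattern that guidelines (2) and (3) report for the observer responses.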
In summary there is modest agreement between two different sets of
observers using different response formats (i.e., convergence) for four
of the five facets, but support for the divergent validity of the traits is
weaker. Support for both the convergent and divergent validity of the
Political Control facet is strongest; there is little support for either con-
vergent or divergent validity of IC, and support for the divergent validity
of the other three facets is weak. Notably, convergence on PC (.47) and
GL (.39) was comparable to convergence on the total score (.41), even
though these subscales were based on fewer items. The
results suggest that different external observers are able to agree on
some aspects of the IE construct, though apparently not on IC, and are
able to differentiate the PC from other facets.
Specific Rotter facets-Self and observer agreement. Correlations be-
tween observer responses and self-responses also appear in Table 4.
While it would be possible to combine the three methods of self-response
(summarized above and in Table 3) and the two methods of observer
Scales^a    1    2    3    4    5    6    7    8    9   10
Observer original
 1. GL      -
 2. PC     56    -
 3. SV     17   60    -
 4. IC     59   48    -
    49
 5. AS     70   52   68   42    -
Observer expanded
 6. GL     39   05   31   25   07    -
 7. PC     41   47   31   35   31   27    -
 8. SV     32   03   22   15   11   78   32    -
 9. IC     02   01  -01   OS   00   45   31   46    -
10. AS     33   18   35   21   27   53   32   67   26    -
Self original
11. GL     33   13   22   10   26   22   07   33   13   36
12. PC     32   21   30   18   21   39   18   37   29   21
13. SV     21   16  1J.   14   02   37  -02   II   23   20
14. IC     25   08   07   00   15   17   08   22   11   25
15. AS     02  -10  -05   04   II   09  -06   20   29    2
Self expanded
16. GL     19   03   14   03   14   21  -04   31   08   23
17. PC     35   22   29   33   22   26   29   24   17   21
18. SV     19   04   II   07   02   30  -10    2   07   21
19. IC     19   03  -10   iu   05   09   09   06   07   13
20. AS     13  -05  -06   01   14   15   03   28   12    2
Self ratings
21. GL     24   14   19   14   13   32  -06   30   10   26
22. PC     35   28   29   28   24   30   II   29   18   24
23. SV     29   19   22   14   12   33  -06   31   08   29
24. IC     27   05   04   00   11   13   04   15  -&I   22
25. AS     09  -11   05  -02   07   23  -04   29   18   22

Note. All coefficients are presented without decimal points and those larger than .23 are statistically significant. The underlined coefficients are
convergence coefficients, the correlation between the same trait inferred from different response formats and/or inferred by different individuals.
a See Table 2 for definitions of the scales.
than were scores from the other two formats. However, support for the
divergent validity for the specific IE facets (Table 3) and agreement
between self-responses and observer-responses did not vary substantially
with the particular response format. These findings provide some evidence
against the dichotomous forced-choice format used in the original Rotter
instrument, but are not particularly compelling.
The assumption that internality and externality represent endpoints of
a bipolar continuum has important theoretical implications for IE research
and is implicit in the use of the original forced-choice format (and the
expanded forced-choice format). One implication of this assumption is
that independently derived scores of internality and externality should
correlate -1.0 with each other after correction for unreliability. When
subjects were asked to make independent ratings of the internal and
external statement from each Rotter item-pair, the correlation was only
modestly negative before the start of the intervention but substantially
negative after the intervention. The modest size of the correlation at
time 1 provides evidence against the bipolarity of the construct when
no intervention has taken place, and the intervention apparently affects
the extent of the bipolarity of the IE construct. A second implication of
this assumption is that if an intervention produces an increase in internality,
then it should produce a decrease in externality of a similar magnitude.
While the intervention did increase internality and decrease externality,
the effect size was substantially larger for the internal score. Taken
together, these findings cast doubt on the validity of the bipolarity
assumption.
Rotter originally assumed that responses to his instrument were uni-
dimensional, but subsequent research has clearly shown this not to be
the case. Marsh and Richards (in press), on the basis of a literature
review and factor analysis, identified five factors in responses to the Rotter
instrument. The results of the MTMM analyses summarized in Table 3
provide strong support for these five facets and show that the differentiation
among the facets does not depend upon the response format. Also, the
effect size of the intervention effect and the extent of agreement between
self-responses and observer-responses varied with the specific IE facet.
However, the different scales, particularly those based on the original
format, are not sufficiently reliable (see Table 3) to be considered separately;
their practical application would require the construction of new scales
that contain more items and items that are more clearly related to the
specific facet that each is designed to measure.
The Outward Bound program is specifically designed to alter the IE
construct, and the Rotter IE instrument is specifically designed to measure
the IE construct. Thus, using a construct validity approach, the present
study provides support for the effectiveness of the intervention and for
the validity of the Rotter IE instrument. While alternative explanations
exist for the findings presented here, and those in Marsh, et al. (1986),
a more detailed examination of the results renders some of them
implausible. First, Marsh, et al. found significant effects on both the
Rotter IE construct and multiple dimensions of self-concept, but reported
that changes in self-concept were not substantially correlated with changes
in IE. Alternative explanations based on response biases, a placebo
effect, or what Marsh, et al. called a post-group euphoria effect would
produce changes in these two self-report measures that were substantially
correlated. Second, the effects in the present study were consistent across
scores derived from three response formats, indicating that the results
do not depend on a particular format. Third, after the intervention the
internal scores for the independent rating format were more internal and
external scores less external, a finding that is inconsistent with many
forms of response bias. Fourth, observer-responses collected at the end
of the program were modestly correlated with self-responses at the end
of the intervention, but were less correlated with self-responses collected
before it. This finding, particularly when coupled with the increased
internality of the time 2 scores, suggests that the intervention produced
systematic, observable changes in the IE construct over the time 1/time
2 interval; it is unlikely that changes produced by most potential sources
of invalidity would be systematically related to observer responses. Hence,
while alternative explanations may be viable, the findings support the
effectiveness of the intervention in altering IE and the construct validity
of the Rotter IE instrument as an indicator of this change.
The responses of external observers to the Rotter items were also used
as a validity criterion in the present investigation. This use of the observer
responses assumes that the Rotter items are appropriate for use by ob-
servers, that the observers are able to infer IE, and that observers are
able to differentiate among the different IE facets. There was little evidence
to suggest that observers were able to differentiate among the specific
IE components in a manner that was similar to the self-responses. However,
the finding that two sets of observer responses are modestly correlated
with each other, and with the self-responses, provides support for the
first two assumptions and for the construct validity of the self-responses.
While the extent of self-observer agreement in this study is only modest,
Shrauger and Schoeneman (1979) reviewed studies of the agreement
between self-perceptions and the evaluations by others across a wide
variety of constructs and concluded that “there is no consistent agreement
between people’s self-perceptions and how they are actually viewed by
others” (p. 549). Hence, evidence for construct validity found in the
present investigation is stronger than typically reported in other research
that uses this approach (but see Marsh et al., 1985).
What are the implications of this study for the IE construct and the
Rotter instrument? The results of this study provide evidence for the
Stipek, D. J., & Weisz, J. R. (1981). Perceived personal control and academic achievement.
Review of Educational Research, 51, 101-137.
Watson, J. M. (1981). A note on the dimensionality of the Rotter Locus of Control scale.
Australian Journal of Psychology, 33, 319-330.
Zuckerman, M., & Gerbasi, K. C. (1977). Dimensions of the I-E scale and the relationship
to other personality measures. Educational and Psychological Measurement, 37, 159-175.