
Child Development, November/December 2007, Volume 78, Number 6, Pages 1640 – 1656

Longitudinal Study of Preadolescent Sport Self-Concept and Performance:
Reciprocal Effects and Causal Ordering

Herbert W. Marsh, University of Oxford
Erin Gerlach, University of Paderborn
Ulrich Trautwein and Oliver Lüdtke, Max Planck Institute for Human Development
Wolf-Dietrich Brettschneider, University of Paderborn

Do preadolescent sport self-concepts influence subsequent sport performance? Longitudinal data (Grades 3, 4,
and 6) for young boys and girls (N = 1,135; mean age = 9.67) were used to test reciprocal effects model (REM)
predictions that sport self-concept is both a cause and a consequence of sport accomplishments. Controlling prior
sport performance (performance-based measures and teacher assessments), prior sport self-concept had positive
effects on subsequent sport performance in both Grade 4 and Grade 6 and for both boys and girls. Coupled with
previous REM studies of adolescents in the academic domain, this first test for preadolescents in the sport domain
supports the generalizability of REM predictions over gender, self-concept domain, preadolescent ages, and the
transition from primary to secondary school.

Work on the present investigation was conducted, in part, while H.W.M. was a visiting scholar at the Center for Educational Research at the Max Planck Institute for Human Development and was supported in part by the University of Western Sydney and the Max Planck Institute. These data are from a large-scale German project directed by W.-D.B. (Department for Sport & Health, University of Paderborn). The study was supported by a grant from the Foundation of the Sparkasse Bank of Paderborn.
Correspondence concerning this article should be addressed to Professor Herbert W. Marsh, Department of Educational Studies, University of Oxford, 15 Norham Gardens, Oxford OX2 6PY, United Kingdom. Electronic mail may be sent to herb.marsh@edstud.ox.ac.uk.

© 2007 by the Society for Research in Child Development, Inc. All rights reserved. 0009-3920/2007/7806-0002

The overarching purpose of our longitudinal panel study is to test the causal ordering of sport self-concept and physical performance as posited in the reciprocal effects model (REM; Marsh, 1990a, 1990b, 1993a, 2007b; Marsh & Craven, 1997, 2006). The reciprocal pattern of relations between self-concept and performance posited in the REM is also represented in many other theoretical accounts of related self-belief constructs (e.g., Bandura, 1986, 1997; Byrne, 1996, 2002; Eccles & Wigfield, 2002; Harter, 1998, 1999; Hattie, 1992; Skaalvik, 1997; Skaalvik & Hagtvet, 1990; Valentine & DuBois, 2005; Wigfield & Eccles, 2002) as well as in the broader themes of reciprocal patterns of relation in developmental psychology (e.g., Lerner, 1982, 1996). Thus, for example, expectancy-value theorists (Eccles & Wigfield, 2002) hypothesize academic self-beliefs to be a function of prior academic successes and to affect subsequent academic success directly or indirectly through their influence on other mediating constructs. More generally, in their theoretical review and meta-analysis of empirical research, Valentine and DuBois concluded that reciprocal effects relating academic self-beliefs and achievement are consistent with theories of learning and human development that view the self as a causal agent (e.g., Bandura, 1997; Carver & Scheier, 1981; Deci & Ryan, 1985).

Two critical questions identified by both the Marsh and Craven (2006; also see Marsh, Byrne, & Yeung, 1999) review and the Valentine and DuBois (2005) meta-analysis form the focus of the present investigation. First, does support for REM predictions generalize over age, particularly for responses by preadolescents (because existing support is limited largely to adolescents)? Second, does support for the REM generalize over different domains, particularly the sport domain (because existing support is limited largely to the academic domain)?

Developmental Perspectives on REM

From a theoretical perspective, many researchers posited that young children’s understanding of competence changes with age such that with increasing age, academic self-concepts are likely to be less positive, more stable, and more systematically related to external academic outcomes. Thus, for example,
A Reciprocal Effects Model 1641

Wigfield and Karpathian (1991) argued that: “Once ability perceptions are more firmly established the relation likely becomes reciprocal: Students with high perceptions of ability would approach new tasks with confidence, and success on those tasks is likely to bolster their confidence in their ability” (p. 255). Skaalvik and Hagtvet (1990) also proposed that in early school years, academic self-concept might be shaped more by achievement but that as self-concept becomes more established, it may influence subsequent achievement. Consistent with this perspective, Skaalvik and Hagtvet found support for the REM for older students (sixth and eighth grades) but not for younger students (third and fourth grades). Similarly, Skaalvik (1997) reported support for reciprocal effects for high school students but support for only the effect of performance on self-concept during primary school. Hence, developmental theory and these longitudinal studies suggest that during primary school years, academic achievement influences academic self-concept but that self-concept may not influence achievement, whereas this relation becomes reciprocal in adolescence.

In their review, Marsh et al. (1999) argued that although relations between academic self-concept and achievement become stronger and more stable with age, there was insufficient evidence to determine whether the causal ordering of relations between these variables actually changes with age or whether any such differences reflect underlying processes or researchers’ inability to measure these constructs reliably with young children (see Marsh, Debus, & Bornholt, 2005). In their meta-analysis of REM studies based on academic self-beliefs and achievement, Valentine and colleagues (Valentine, 2001; Valentine & DuBois, 2005) also predicted that REM support would be stronger in middle and high school than in primary school. However, Valentine found the smallest effects for middle and junior high school students. Even those studies specifically chosen to provide the strongest tests of age as a moderator effect produced results that were “mixed, suggesting the need for further focused study of the effects of age on the relations between age, self-concept, and achievement” (Valentine, 2001, p. 55).

Guay, Marsh, and Boivin (2003) pursued this challenge for developmental researchers that was proposed by Marsh et al. (1999) and by Valentine (2001). Guay et al. used a multivariable–multioccasion–multicohort design (i.e., three age cohorts: students in Grades 2, 3, and 4, each with three measurement occasions separated by 1-year intervals). They found that as children grew older, their academic self-concept responses became more reliable, more stable, and more strongly correlated with academic achievement. However, the magnitude of these developmental differences was small. Importantly, there was strong support for the REM for all the three age cohorts, and these results were reasonably invariant over age when rigorously tested with multigroup tests of invariance across the three age cohort groups. Although this study provides good support for the generalizability of reciprocal effects for young children, there is a need to replicate and extend this study as it apparently runs counter to prevailing conclusions in developmental psychology.

Generalizability to the Sport Domain

Nearly all REM tests have been based on self-concepts and school performance in traditional academic subjects for which school grades provide such a salient source of feedback about performance. However, there is limited research to test the generalizability of these results in other domains. Sport is ideally suited for this purpose because feedback about sport performance, particularly in preadolescence, comes largely from social comparison with the performances of peers, direct feedback from peers, and a variety of sources that are not directly related to school performance in traditional academic subjects. This also means that children have much more flexibility in the way that they form their sport self-concepts. In particular, they typically receive more direct and salient feedback from other children in sport (e.g., striking out with the bases loaded, being the last child chosen to be on a team, etc.) than in traditional school subjects. In the present investigation, for example, young children in the early stages of our study received no formal feedback from teachers about their performance in sport. Hence, this critical source of feedback (school grades and feedback from teachers) in the formation of academic self-concepts in traditional academic domains was completely absent. Also, because many sports involve improvement over time in the same activity (running faster, jumping higher, hitting a ball further, increasing accuracy, etc.), there is more scope to use improvement on previous performance along a reasonable absolute metric that does not depend on external feedback as a basis for self-concept formation than is typical for school grades (which are typically idiosyncratic and normatively scored). Also, because sport is primarily an extracurricular activity and extracurricular sport is typically optional for children, support for the REM should be even stronger than in academic domains. Thus, even
1642 Marsh, Gerlach, Trautwein, Lüdtke, and Brettschneider

though physical education classes are obligatory for the German preadolescent students in the present investigation, much of the sporting activity takes place outside of the normal school curriculum (e.g., at recess, after school, or on the weekend), unlike, for example, traditional academic subjects such as mathematics. Particularly during preadolescence, decisions about whether or not to participate in traditional academic school subjects are highly constrained. These constraints are considerably less strong, more varied, and more idiosyncratic for sport (Eccles & Harold, 1991). Hence, children with low levels of sport self-concept have much more opportunity to opt out of sporting activities that are optionally available to them outside the physical education classes, and opting out should lead to poorer sport performance that reinforces negative sport self-concepts: a negative pattern of reciprocal relations.

Recently, there have been several tests of the REM in the sport domain. Marsh, Chanal, Sarrazin, and Bois (2006) demonstrated REM support for gymnastics self-concept and performance measures collected before and after a 10-week high school gymnastics program. As predicted by the REM, the results in this short longitudinal study showed that gymnastics self-concept and gymnastics performance were both determinants and consequences of each other. Marsh and Perry (2005) tested the effects of sport self-concept on subsequent performance for 270 elite swimmers from 30 countries participating in the Pan Pacific Swimming Championships and the World Short Course Championships. Whereas subsequent championship performance was highly related to prior personal best performances (r = .90), structural equation models (SEMs) demonstrated that elite athlete self-concept contributed significantly to the prediction of subsequent championship performance, explaining approximately 10% of the residual variance after controlling for personal best performances.

However, support for the REM provided by these studies is limited. The Marsh and Perry study was based on only a single wave of self-concept data and was not a true longitudinal study (although the existence of prior personal best performances and subsequent performance provided two measures of performance), thus precluding a full test of the REM. Although the Marsh, Chanal, et al. study included self-concept and performance data on two occasions over a 10-week period, stronger tests require at least three waves of data collected over a longer period of time to more adequately assess developmental trends in the data. Furthermore, neither of these studies focused on the formation of sport self-concept in relation to performance during preadolescent ages that is the focus of the present investigation and where support for the REM is more tenuous.

Gender-Stereotypic Self-Concept Differences, the REM, and Sport

Small gender differences in self-esteem favoring boys (Kling, Hyde, Showers, & Buswell, 1999) mask larger, counterbalancing gender-stereotypic differences in specific components of self-concept. Here, as in so many other areas of self-concept research (Marsh, 2007b; Marsh & Craven, 2006), it is important to take a multidimensional perspective and evaluate how differences vary for particular self-concept domains. Consistently across age groups, males report higher sport, physical appearance, and math self-concepts, whereas females report higher verbal and social self-concepts (Marsh, 1989; also see Crain, 1996; Jacobs, Lanza, Osgood, Eccles, & Wigfield, 2002; Wigfield et al., 1997). Although there are small, gender-stereotypic differences in academic components of self-concept, the largest differences are for physical and sport self-concepts.

In relation to the evaluation of the REM, it is important to test the generalizability of results over gender, particularly in gender-stereotypic domains such as math, verbal, and sport self-concept. Marsh (1989; also see Eccles, 1987; Eccles & Wigfield, 2002; Maccoby & Jacklin, 1974) posited and tested a differential socialization hypothesis in which “sex-linked differences in socialization patterns may fail to reinforce adequately boys’ positive attitudes, expectations, and performance in verbal areas as well as failing to reinforce adequately girls’ positive attitudes, expectations, self-concepts, and performance in mathematics” (Marsh, 1989a, p. 195). Although he found small gender differences in the expected direction for math and verbal constructs, relations among variables were largely invariant over gender. Marsh (1993a, 1993b) provided an alternative test of the differential socialization model, in which it was predicted that: verbal self-concept would be more highly related to academic and general self-concept for girls; math self-concept would be more highly related to academic and general self-concept for boys; and these gender differences would grow larger with age. These predictions were consistent with Eccles’ finding that gender differences in the value placed on math and verbal competence grew larger with age and with what Hill and Lynch (1983) called “gender-role intensification” in which conformity to gender role stereotypes becomes increasingly important with age. However, Marsh (1993b) found no support for

these predictions as relations were similar across eight groups (2 gender × 4 adolescent age groups).

Eccles and Harold (1991) extended research on gender stereotypes to the sport domain in relation to expectancy-value theory. Consistent with other research, they found that gender-stereotypic differences in favor of boys were much larger for sport self-concept (8% of variance explained) than those in math and reading (1% of variance explained) and that these differences generalized over age. However, there were no systematic differences between boys and girls in the correlation between sport self-concept and performance. Although consistent with findings by Marsh (1993a, 1993b, 2007b) in academic domains, none of these studies tested the juxtaposition of the REM and gender-stereotypic models proposed in the present investigation. However, bringing together these two research literatures into a common theoretical perspective is particularly relevant in sport because gender differences in sport self-concept are larger than in other domains and because children have more flexibility in participating in sport than typically is the case for academic domains.

The Present Investigation

In the present investigation, we extend REM tests to the sport domain for preadolescent German boys and girls (N = 1,135; mean age = 9.67) based on longitudinal data collected in Grades 3, 4, and 6. In addition to this focus on the REM in the sport domain and the preadolescent age period that are our overarching concerns, important features include questions such as:

1. Do expected gender differences in favor of boys for physical performance and sport self-concept change with age?
2. Does support for REM predictions generalize over gender and age?
3. Do results differ depending on different forms of physical achievement (school grades, teacher ratings, and standardized physical performance tests)? This question is relevant because academic self-concept research shows that academic self-concept is less strongly related to standardized achievement tests than to school grades that reflect features (e.g., effort, timely completion of materials, improvement time, and other considerations) that are more sensitive to processes instigated by a high academic self-concept than standardized test scores (see Marsh & Craven, 2006; Marsh, Trautwein, Lüdtke, Köller, & Baumert, 2005).
4. Does the shift from primary school to secondary school disrupt sport self-concept and its relation to sport accomplishments beyond what might be expected by typical developmental changes? Although not the major focus of our study, this question is relevant to the growing body of largely North American research showing negative psychosocial consequences of the shift from primary to secondary school that are associated with changes in educational demands, teacher attitudes, grading systems, and social networks (e.g., Eccles & Wigfield, 2002; Harter, 1990; Jacobs et al., 2002; Wigfield, Eccles, Mac Iver, Reuman, & Midgley, 1991). In most of this research, transitional (shift from primary to secondary school) and developmental (onset of adolescence and puberty) factors are confounded. However, in contrast to most North American settings, in the German system this transition occurs at the end of Grade 4, prior to the typical onset of adolescence and puberty. Hence, developmental theory and previous empirical research suggest that self-concept and its relations with external criteria should become increasingly stable with age, whereas research based on transitions suggests that stability should be lower during the period of transition.

Method

Sample

The data considered here are drawn from a larger study (see Brettschneider & Gerlach, 2004) consisting of a representative sample of approximately one third of the students living in the city of Paderborn, Germany.

At Time 1 (T1; March 2001), third-grade students (N = 1,438; 50.0% female) with a mean age of 9.67 years (SD = 0.64, range = 8–12) completed a self-concept questionnaire and were evaluated by teachers in terms of sport performance. Approximately 15 months later, when students were near the end of fourth grade (Time 2 [T2]; June and July, 2002, prior to the 6-week summer vacation) and again approximately 15 months after that, when students were in sixth grade (Time 3 [T3], late 2003), the same children completed the questionnaire measure again. Most students were Caucasian (more than 95%) and had

German citizenship (83.8%). Children with Russian (6.7%) and Turkish citizenship (2.2%) represented the largest minority groups. Because the sample was representative, it was diverse in terms of socioeconomic background. The high response rate and the focus of the present investigation on the relations between constructs collected at different measurement occasions meant that we only considered the 1,135 students with reasonably complete data for all three measurement occasions (i.e., did not have complete missing data for any one wave).

Instruments and Procedures

Standardized test of basic physical test scores. At T1 only, students’ overall physical skills were tested with the Hagedorn Obstacle Course (Riepe & Zindel, 1996), which is a standardized test of coordination, balance, and speed. The course consists of slalom, a balancing exercise, a forward roll, and getting over and through obstacles. Students completed the test twice, and the performance score was the average time a student needed to complete the course across the two trials. To facilitate interpretations, the test was reverse scored so that higher scores represent better performances. This performance test was conducted during regular physical education lessons by trained research assistants (mostly physical education university students). This test is routinely administered to students in the third grade across the school district every year to evaluate sport potential and, thus, is widely accepted by teachers, parents, and students. Its validity is supported by significant correlations with observer ratings and other fitness measures. For example, in the present investigation, test scores were substantially correlated with teachers’ overall evaluation of sporting potential at T1 (r = .52) collected during the same school year and with T2 physical education class performance based on school grades collected the following school year (r = .46).

Performance in physical education classes. Grades in physical education classes were the main performance measure in the present investigation. However, school grades were not available at T1 because students had not previously received grades in physical education. Hence, only for T1, physical education teachers were asked to rate each child on a single global item, asking whether the child had the appropriate prerequisites for sports in terms of sport endurance, coordination, and appropriate physique on a 5-point response scale: below average (−2) . . . average (0) . . . above average (+2). At T2 and T3, sport performance was based on school grades on report cards using the six-level grading system implemented throughout Germany, ranging from excellent to very poor. We reverse coded school grades, resulting in the following six levels: excellent (6), good (5), satisfactory (4), sufficient (3), poor (2), and very poor (1). Importantly, these school grades were based on physical performance in physical education classes and not on written examinations, as is typically the case for other school subjects. In this regard, T1 measures consisting of teacher ratings are a reasonable approximation of school grades at T2 and T3. T1 teacher ratings also differ from T2 and T3 school grades in that students did not actually know what T1 teacher ratings they received. Interestingly, even though T1 ratings vary along a 5-point response scale (rather than the 6-point response scale used for school grades at T2 and T3), standard deviations (SDs) of T1 scores were not smaller than SDs for the corresponding T2 and T3 measures (see Appendix). T1 teacher ratings were also as highly correlated with T3 sport self-concept scores (.46) as were T2 school grades (.44), even though the T1–T3 time gap (30 months) was twice the length of the T2–T3 time gap. These preliminary results suggest that T1 teacher ratings are a reasonable measure of sport performance.

Sport self-concept. A German adaptation of the sport self-concept scale from Harter’s Self-Perception Profile for Children (Harter, 1985) was administered. Students responded to the five items (I am very good at sport, I learn quicker than others of my age in sport, I learn new exercises very quickly in sport, I am as good as others of my age in sport, and I am just not good at sport) on a 4-point Likert response format (ranging from 1 = disagree to 4 = agree). To ensure high data quality, the self-concept instrument was administered during regular school hours by trained research assistants (university students) in small groups of about six students. Whereas sport self-concept was significantly higher at T1 than either T2 or T3 (p < .01), in part because of the large sample size, the differences were not large (effect sizes of about 0.1 SD) and there were no significant differences between T2 and T3 sport self-concept (p > .05).

Statistical Analysis

SEMs were conducted with LISREL (Version 8.54) using maximum likelihood estimation (for further discussion of SEM, see Bollen, 1989; Byrne, 1998; Jöreskog & Sörbom, 1993; Kaplan, 2000; Marsh, 2007a). As emphasized by Cole and Maxwell (2003) and others (e.g., Marsh & Craven, 2006), important advantages of the SEM approach to multivariable–multioccasion data include the following: (a) measurement error can be controlled by the incorporation

of measurement models based on multiple indicators, (b) relations among all latent variables can be analyzed simultaneously, (c) reciprocal relations can be evaluated, and (d) various potentially confounding variables can be included to test the construct validity of causal interpretations.

Diagrams of selected models to be tested (Figure 1) are discussed in more detail as part of the results, whereas specific methodological issues are described below. Although the interpretation of relations in an SEM as causal effects should always be done with appropriate caution, the use of longitudinal data provides a much stronger basis of inference than cross-sectional studies or studies that do not represent the variables as latent constructs. As a part of the analyses, we consider the robustness of interpretations in relation to gender and age and to alternative representations of the sport performance variable.

Model evaluation. Following Marsh, Hau, and Grayson (2005; Marsh, Balla, & McDonald, 1988; but also see Marsh, Hau, & Wen, 2004), we considered the Tucker–Lewis index (TLI), the relative noncentrality index (RNI), and the root mean square error of approximation (RMSEA) to evaluate goodness of fit, as well as the normal theory chi-square test statistic and an evaluation of parameter estimates. The TLI and RNI vary along a 0–1 continuum, in which values greater than .90 and .95 are typically taken to reflect acceptable and excellent fits to the data, respectively. RMSEA values of less than .05 and .08 are taken to reflect a close fit and a reasonable fit, respectively, whereas RMSEA values between .08 and .10 reflect a mediocre fit, and values greater than .10 are generally unacceptable.

Following recommendations by Marsh and Hau (1996; Jöreskog, 1979) and general recommendations for the evaluation of the REM (Marsh et al., 1999), correlated uniquenesses were included for the matching sport self-concept items collected at T1, T2, and T3. Their exclusion would have positively biased the corresponding test–retest stability estimates and resulted in a poorer fit. However, their inclusion had no substantively important effect on the pattern of parameter estimates, suggesting that the inclusion of correlated uniquenesses was not a critical issue. To facilitate interpretation of the substantive import of the results, only the models with correlated uniquenesses are presented.

Missing data. In longitudinal research, missing data are an inevitable problem, particularly in studies that incorporate responses from more than 1 year in school and across important transitions in school such as the shift from primary to secondary school. Because results from the present study are based on schools within the same city and extreme diligence was used in pursuing students with missing data, we had reasonably complete data for all three occasions for 1,135 of the 1,438 students who responded at T1. For this final sample of 1,135 students with data from all three occasions, there were small amounts of missing data for individual items (less than ½ of 1% of responses, whereas the typical guideline for missing data becoming an important issue is 5%; e.g., Graham & Hofer, 2000). Because there was so little missing data in this final sample, the choice of how we dealt with missing data was not a particularly critical issue. Nevertheless, there is growing evidence that the expectation maximization (EM) algorithm that we chose to use is superior to traditional methods such as pair-wise deletion or case-wise deletion.

In order to explore further the implications of missing data, several additional analyses were undertaken. First, we evaluated the nature of this missingness by comparing scores for the 1,135 students who had data from all three occasions and the remaining students with data completely missing at T2 or T3. There were no significant differences between these two groups in terms of age, T1 or T2 sport self-concept, T1 physical test scores, and T1 or T2 teacher performance ratings (all p > .05). Whereas the final sample with complete data for all three occasions had a significantly higher proportion of girls, the difference was not large (51% vs. 43%, p < .05), and gender was included as a background variable in models reported in the Results section. Second, we used the EM algorithm to impute missing data for the entire set of 1,438 students: the 1,135 with data from all three occasions and the remaining students with completely missing data from T2 or T3. Although the amount of missing data for this sample was substantially larger (9.75% of the data points), the parameter estimates using this approach were nearly the same as for the final sample. These supplemental analyses suggest that the results are reasonably robust in relation to problems associated with missing data.

Tests of invariance. Multiple-group SEM tests of invariance (Byrne, 1998; Marsh, 1994, in press) were used to test the generalizability of the results based on analyses of separate covariance matrices for boys and girls. Tests of factorial invariance traditionally posit a series of nested models in which the end points are the least restrictive model with no invariance constraints and the most restrictive (total invariance) model with all parameters constrained to be the same across all groups. Differences between nested models, under appropriate conditions, can be tested for

Figure 1. Three tests of the reciprocal effects model (REM). [Path diagrams for Models 2, 3, and 4 not reproduced here; nodes are T1–T3 SSC and T1–T3 Perf, plus Sex, Age, and Age × Sex in Models 3 and 4 and Phys Test in Model 4.]

Note. In each of the models, only selected paths central to the REM are presented (see Tables 1 and 2 for full set of parameter estimates). In Model 2, we present a full-forward multioccasion–multivariable model with five multiple indicators of sport self-concept (SSC) and sport performance (Perf) collected in three successive occasions (Time 1 [T1], Time 2 [T2], and Time 3 [T3]). The boxes represent indicators (of SSC or Perf each occasion). Ovals represent latent constructs (SSC or Perf factors); straight, single-headed arrows represent “causal” paths. In the full-forward model, each latent construct has paths leading to all latent constructs on subsequent occasions. Paths connecting the same variable on multiple occasions reflect stability (the solid gray paths in Model 1) but may be very different from the corresponding test–retest correlations (which do not include the effects of other variables). The self-enhancement model (SSC → Perf) predicts that the dashed black paths are positive. The skill development model (Perf → SSC) predicts that the solid black paths are positive. The REM predicts that the solid and dashed paths are both positive. All three models assume stability over time for each construct (solid gray lines are positive). Model 3 differs from Model 2 through the inclusion of age, gender (sex), and their interaction (new paths associated with these variables are presented as solid black lines, whereas paths considered in Model 2 are in gray). Model 4 differs from Model 3 by the inclusion of the standardized physical test (Phys Test at T1 only; new paths associated with this variable are presented as solid black lines, whereas paths considered in Model 3 are in gray). Within each occasion for all three models, SSC and Perf (and Phys Test at T1 in Model 4) are correlated, but these correlations are not presented in order to avoid clutter (but see Tables 1 and 2). Similarly, control variables (age, gender, and their interaction) are assumed to be correlated in Models 3 and 4, but these correlations are not presented (see Tables 1 and 2). Correlated uniquenesses (covariances between measured variable residuals) associated with each SSC indicator are included between occurrences of the same SSC indicator on different occasions.
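As a reading aid, the cross-lagged core of the full-forward model described in the Figure 1 note can be written as a pair of structural equations for the Time 2 latent constructs. The symbols used here (β for paths between latent constructs, ζ for disturbances) are notation assumed for illustration and do not come from the article.

```latex
\begin{align}
\mathrm{SSC}_{2}  &= \beta_{SS}\,\mathrm{SSC}_{1}  + \beta_{SP}\,\mathrm{Perf}_{1} + \zeta_{S2},\\
\mathrm{Perf}_{2} &= \beta_{PP}\,\mathrm{Perf}_{1} + \beta_{PS}\,\mathrm{SSC}_{1}  + \zeta_{P2}.
\end{align}
```

In this notation, the self-enhancement model predicts $\beta_{PS} > 0$, the skill development model predicts $\beta_{SP} > 0$, and the REM predicts both; $\beta_{SS}$ and $\beta_{PP}$ are the stability paths. The Time 3 equations have the same form, with both the Time 1 and the Time 2 constructs as predictors.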
statistical significance. However, Marsh, Hau, et al. (2005; also see Marsh, 2007a) emphasized that the limitations in the use of the chi-square test statistic to test the difference between two models are even more problematic than when the statistical test is used to evaluate the fit of a single model, so that it is typically more appropriate to compare different models in terms of the fit indexes.

Although tests of invariance typically are conducted over different groups, for longitudinal data it is also possible to evaluate the invariance of corresponding parameters from different occasions (e.g., Marsh & Grayson, 1995). Of particular interest in the present investigation is whether parameter estimates from the two primary grades (Grades 3 and 4) are more similar to each other than to the corresponding parameter estimates based on Grade 6 responses following the transition to secondary school. For the purposes of these analyses, we began with a confirmatory factor analysis (CFA) model with the two factors (sport self-concept and performance) for each of the three occasions. In the least restrictive model, all parameters were freely estimated. In the most restrictive model, the set of five factor loadings for sport self-concept, the corresponding set of five uniquenesses, and the correlation between sport self-concept and performance were all constrained to be the same for each of the three occasions of data. (We did not constrain the factor loadings for the performance factor because it was based on a single indicator, it was not based on responses by the children, and the variance of the teacher rating at T1 was systematically larger than the variance of school grades at T2 and T3.) In alternative models, we freed parameters from T1, T2, or T3 (holding parameters invariant across T2 and T3, T1 and T3, and T1 and T2, respectively). To the extent that the transition from primary to secondary school substantially disrupts self-concept, we

would expect better support for the invariance of T1 and T2 than for T2 and T3. Alternatively, based on developmental theory, we would expect responses to be as similar or more similar for T2 and T3, when children are older, better able to cope with the demands of completing a self-concept instrument, and have more accurate perceptions of their relative strengths and weaknesses as one basis for forming self-perceptions.

Results

Relations Among the Constructs

In Model 1, an initial CFA based on all variables, the a priori factor structure provided a very good fit to the data (e.g., TLI = .989, Model 1 in Table 2). Factor loadings for the five sport self-concept items were all statistically significant and substantial at T1, T2, and T3. Of particular relevance is the pattern of correlations among the different constructs. Sport self-concept is reasonably stable over time; test–retest correlations varied from .57 (for T1–T3) to .69 (for T1–T2) and .74 (for T2–T3). At T1, sport self-concept was substantially related to both the physical test scores (.45) and the teacher ratings of sporting potential (.42), whereas the two performance measures correlated .52 with each other. Correlations between sport self-concept and performance were .60 at both T2 and T3. Girls had significantly lower scores than boys for both T1 test scores and T1 teacher ratings and lower sport self-concepts at all three occasions (see correlations with gender in Table 1). Interestingly, there were no significant gender differences on physical education school grades at T2 and T3, and even the gender differences on teacher ratings at T1 (r = −.10) were significantly smaller than the corresponding differences on test scores at T1 (r = −.25). Apparently, teachers used different standards for boys and girls rather than a common, “absolute” metric that was independent of gender (as with T1 physical test scores). Given the restricted age range because of consideration of only a single “year in school” cohort, it is not surprising that age was not significantly related to any of the sport self-concept or teacher assessment measures, although older students had significantly higher scores on the T1 physical test (r = .12). There were no statistically significant Age × Gender interaction effects for any outcomes.

Tests of the REM and its Extension

In addition to Model 1 (the CFA model in Table 1), we pursued a set of three additional models to test the REM (see Models 2, 3, and 4 in Figure 1). Because the measurement component (factor loadings and uniquenesses) of these SEMs was nearly identical to the corresponding CFA Model 1 (Table 1) and Models 2, 3, and 4 all provided excellent fits to the data (Table 2), we limit attention to the path coefficients (Table 3) used to test the REM.

Model 2 (Table 3 and Figure 1) most closely matches the prototypical REM, consisting of parallel measures of performance (based on teacher assessments) and sport self-concepts at T1, T2, and T3. Stability coefficients (Table 3) relating the measures of the same construct on different occasions were all positive and highly significant (although these values differ from the corresponding test–retest correlations in Table 1 because the path coefficients in Table 3 control for the effects of other, preceding variables). Of critical importance are the path coefficients relating sport self-concept and sport performance. In support of the REM, the effects of T1 sport self-concept on T2 performance (.23) and of T1 performance on T2 sport self-concept (.21) are both statistically significant. The effect of T2 sport self-concept on T3 performance (.23) is also statistically significant and similar in size to the corresponding path coefficient from T1 sport self-concept to T2 performance. Interestingly, the path from T2 performance to T3 sport self-concept is not statistically significant, although there is a statistically significant path from T1 performance to T3 sport self-concept (.16). This result is consistent with the observation that T3 sport self-concept was somewhat more highly related to T1 performance than T2 performance (Table 1), suggesting that T1 teacher ratings of sporting potential are qualitatively somewhat different from the physical education class grades and, perhaps, more relevant to sport self-concept than physical education class grades (see subsequent discussion).

In Model 3 (Figure 1 and Table 3), the effects of gender, age, and gender–age interaction were added to Model 2 but had almost no effect on the path coefficients from Model 2. Whereas girls had significantly lower sport self-concept and performance scores at T1, path coefficients relating gender to T2 and T3 measures were largely nonsignificant. There was, however, a small, statistically significant negative effect of gender on T3 sport self-concept (−.05). Hence, whereas gender differences on subsequent measures were largely mediated by gender differences in prior measures, the girls’ lower sport self-concepts at T3 remained even after controlling for T1 and T2 measures of sport self-concept and performance.

In Model 4, the T1 physical test scores were added to Model 3. These test scores were substantially
Table 1
Confirmatory Factor Analysis Model 1: Sport Self-Concept, Performance, and Covariates at Times 1, 2, and 3

Factor loadings
Indicator             Factor        Loading   Uniqueness   T1CUb   T2CUb
T1
  T1Prf               T1Prf         1.00a     .00a
  T1SSC1              T1SSC         .81*      .34*
  T1SSC2              T1SSC         .58*      .67*
  T1SSC3              T1SSC         .66*      .57*
  T1SSC4              T1SSC         .60*      .64*
  T1SSC5              T1SSC         .48*      .77*
T2
  T2Prf               T2Prf         1.00a     .00a
  T2SSC1              T2SSC         .85*      .28*         .00*
  T2SSC2              T2SSC         .64*      .59*         .12*
  T2SSC3              T2SSC         .71*      .49*         .04*
  T2SSC4              T2SSC         .71*      .50*         .12*
  T2SSC5              T2SSC         .60*      .64*         .08*
T3
  T3Prf               T3Prf         1.00a     .00a
  T3SSC1              T3SSC         .82*      .32*         .02     .04*
  T3SSC2              T3SSC         .63*      .61*         .07*    .16*
  T3SSC3              T3SSC         .68*      .53*         .02     .06*
  T3SSC4              T3SSC         .66*      .56*         .06*    .05*
  T3SSC5              T3SSC         .66*      .56*         .09*    .04*
Covariates
  Sex (M = 0, F = 1)  Sex           1.00a     .00a
  Age                 Age           1.00a     .00a
  Sex × Age           Sex × Age     1.00a     .00a
  Test scores         Test scores   1.00a     .00a

Factor correlations
              T1Prf   T1SSC   T2Prf   T2SSC   T3Prf   T3SSC   Sex     Age     Sex × Age   Test scores
T1Prf         1
T1SSC         .42*    1
T2Prf         .50*    .40*    1
T2SSC         .46*    .69*    .60*    1
T3Prf         .37*    .33*    .44*    .44*    1
T3SSC         .46*    .57*    .44*    .74*    .60*    1
Sex           −.10*   −.23*   .06     −.15*   .03     −.18*   1
Age           .01     .06     .01     .06     .02     .05     .07     1
Sex × Age     .01     .01     .02*    .01     .00     .00     .02     .03     1
Test scores   .52*    .45*    .46*    .52*    .42*    .51*    −.25*   .12*    .00         1

Note. Times 1, 2, and 3 (T1, T2, and T3) are Grade 3, Grade 4, and Grade 6. Prf = performance (based on a teacher rating at T1 and school grades in physical education at T2 and T3); SSC = sport self-concept; test scores = a standardized physical achievement test administered at T1.
a Constructs inferred from a single indicator had factor loadings fixed to 1 and uniqueness terms fixed to be 0.
b Correlated uniquenesses hypothesized a priori for responses to the same self-concept items measured on different occasions.
*p < .05.
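A quick consistency check on the measurement part of Table 1: if the reported loadings are completely standardized and each item loads on a single factor (both are assumptions here, not statements from the article), each uniqueness should be close to one minus the squared loading. The sketch below applies that check to the five T1 sport self-concept items; the function name is hypothetical.

```python
# Standardized loadings and reported uniquenesses for the five T1 sport
# self-concept items, copied from Table 1.
t1_loadings = [0.81, 0.58, 0.66, 0.60, 0.48]
t1_uniquenesses = [0.34, 0.67, 0.57, 0.64, 0.77]


def implied_uniqueness(loading):
    """Residual variance of a standardized indicator loading on one factor."""
    return 1.0 - loading ** 2


# Uniquenesses implied by the loadings, rounded to the table's precision.
implied = [round(implied_uniqueness(value), 2) for value in t1_loadings]

# Largest absolute gap between implied and reported uniquenesses.
max_gap = max(abs(i - u) for i, u in zip(implied, t1_uniquenesses))

for loading, reported, derived in zip(t1_loadings, t1_uniquenesses, implied):
    print(f"loading={loading:.2f} reported={reported:.2f} implied={derived:.2f}")
```

Gaps of about .01 are what one would expect from loadings that were themselves rounded to two decimals before squaring.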

correlated with the sport self-concept ratings and the other sport performance measures but were qualitatively different from the other performance measures, which were based on teacher assessments. Thus, not surprisingly, the path coefficients in Model 4 differed somewhat from those in Models 2 and 3. The pattern of effects of prior sport self-concept on subsequent performance was the same, but the coefficients were smaller (reflecting the fact that some of the variance in subsequent performance measures could be explained by the T1 test scores). Similarly, the pattern of effects of T1 performance and T2 performance on subsequent sport self-concept measures was the same, but the sizes of these paths were somewhat smaller. Consistent with these observations, all four paths leading from the T1 test scores to the subsequent T2 and T3 sport performance and sport self-concept measures were statistically significant and positive. In particular, the T1 test scores had direct effects on T3 sport self-concept and T3 performance in addition to the effects that were mediated by the corresponding T2 outcomes.

Generalizability of the REM Predictions Over Gender

In the next set of models (Models 5a–5e in Table 2), we evaluated the invariance of the factor structure in Model 4 over gender, that is, whether the parameter estimates differed as a function of gender. In pursuing these tests of invariance, separate covariance matrices were constructed for responses by boys and girls based on all measures in Model 4. In the detailed set of models, we evaluated the goodness of fit in a partially nested set of models, varying from the least restrictive model with no invariance constraints (Model 5a) to the most restrictive model in which all parameter estimates were constrained to be the same for boys and girls (Model 5e). Although the models are complex, the results are easy to summarize. Particularly for indexes that incorporate controls for parsimony (TLI and RMSEA in Table 2), the model with complete invariance of all parameter estimates (Model 5e) provided a better fit than the corresponding model with no invariance constraints. Of particular relevance to the present investigation was the invariance of path coefficients that are critical for testing REM predictions. Comparing Model 5b (factor loadings invariant) with Model 5c (factor loadings and path coefficients invariant) provided a multivariate test of the invariance of the path coefficients. The TLI and RMSEA goodness of fit indexes are better for the more restrictive Model 5c, and the nonsignificant change in chi-square values (15.5) is less than the change in degrees of freedom (df = 23). In summary, the tests of invariance demonstrated that REM support generalizes over gender.

The Transition From Primary to Secondary School: Generalizability Over Time

The question addressed in this section is whether the transition from primary to secondary school (between T2 and T3) substantially altered the factor structure for sport self-concept and performance and particularly the relation between self-concept and performance. According to a “transitional perspective,” results should be more similar for T1 and T2 (an interval that did not contain a transition) than for T2 and T3 (an interval that did contain a transition). In contrast, developmental theory predicts that self-concept stability and its relations with external constructs should become stronger with age. Hence, according to this perspective, the results should be as similar, or even more similar, for responses by the children at T2 and T3 when students are older than for responses by the same children at T1 and T2 when they are younger.

Several preliminary observations based on results already presented bear on this issue. For instance, internal consistency estimates (see Appendix) were lower at T1 (.76) than at T2 and T3 (.83 and .84, respectively). Whereas the mean sport self-concept was significantly higher at T1 than T2 or T3, the means at T2 and T3 were not significantly different from each other (see earlier discussion). In Table 1, the correlation between T1 performance and T1 sport self-concept (.42) is substantially lower than the corresponding correlations at T2 (.60) and T3 (.60). Similarly, factor loadings are somewhat lower at T1 than the corresponding factor loadings at T2 and T3. These results do not support a transitional perspective in that T2 and T3 results (before and after the transition) are more similar to each other than are T1 and T2 results.

In order to provide a more formal test of some of these observations, we posited a series of models in which these parameters were held invariant across all three occasions or across all combinations of two occasions (T1 and T2, T1 and T3, and T2 and T3). Even the most restrictive model provided a good fit to the data (TLI = .979; RMSEA = .052), but this fit was poorer than the corresponding values for the model with no invariance constraints (TLI = .985; RMSEA = .044). Consistent with our preliminary observations, freeing the parameter estimates for T1 improved the fit more than freeing the corresponding parameters for T2 or T3. In fact, the goodness of fit statistics for the

Table 2
Summary of Goodness of Fit for All Models

Model   χ²      df    TLI    RNI    RMSEA   Description

Total group tests
1       418.2   156   .989   .990   .035    CFA: SSC, Perf, covariates, and test scores
2       342.1   108   .985   .990   .044    SEM: SSC, Perf
3       404.4   144   .983   .989   .040    SEM: SSC, Perf, covariates (see Table 3)
4       418.2   156   .989   .990   .038    SEM: SSC, Perf, covariates, and test scores

Multiple-group tests: invariance over gender
5a      526.6   265   .984   .989   .042    SEM: no invariance
5b      548.8   277   .984   .988   .042    SEM: invariant (IN) = factor loadings (FL)
5c      564.3   300   .986   .989   .040    SEM: IN = FL and path coefficients (PC)
5d      586.5   312   .991   .992   .039    SEM: IN = FL, PC, and factor variance/covariance (FV/CV)
5e      654.2   342   .985   .986   .040    SEM: IN = FL, PC, Fc/Rfc, U/Cu (total invariant)

Invariance over three occasions (T1, T2, T3)
6a      342.1   108   .985   .990   .044    CFA: no invariance
6b      444.2   119   .981   .985   .049    CFA: T1 and T2 invariant, T3 free; Perf vars free
6c      469.5   119   .979   .984   .051    CFA: T1 and T3 invariant, T2 free; Perf vars free
6d      377.0   119   .985   .988   .044    CFA: T2 and T3 invariant, T1 free
6e      524.1   130   .979   .982   .052    CFA: total invariance; Perf vars free

Note. N = 1,135. CFA = confirmatory factor analysis; TLI = Tucker–Lewis index; RNI = relative noncentrality index; RMSEA = root mean square error of approximation; SRMR = standardized root mean square residual; Perf = performance; SSC = sport self-concept. See Table 1 for results of Model 1 and Table 3 for selected parameter estimates from Models 2–4. In Models 5a–5e, the invariance over gender of different parameters was evaluated. In Models 6a–6e, the invariance of parameters (FLs for self-concept ratings, uniquenesses for self-concept ratings, and the correlation between self-concept and performance) across the three occasions was evaluated.
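The fit statistics in Table 2 can be related to one another with a few lines of arithmetic. The sketch below is an illustration under stated assumptions, not the authors' procedure: it uses the common single-group RMSEA point estimate, sqrt(max(χ² − df, 0) / (df · (N − 1))), and a closed-form chi-square upper-tail probability for integer df to illustrate the nested comparison of Model 5b with Model 5c. The function names are hypothetical, and only the Python standard library is used.

```python
import math

N = 1135  # sample size from the note to Table 2

# (chi-square, df, reported RMSEA) for single-group models in Table 2
models = {
    "2": (342.1, 108, 0.044),
    "6b": (444.2, 119, 0.049),
    "6c": (469.5, 119, 0.051),
    "6d": (377.0, 119, 0.044),
    "6e": (524.1, 130, 0.052),
}


def rmsea(chi2, df, n):
    """Single-group RMSEA point estimate (N - 1 convention assumed)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0:
        return 1.0
    chi = math.sqrt(x)
    density = math.exp(-x / 2.0) / math.sqrt(2.0 * math.pi)  # normal pdf at chi
    if df % 2 == 1:
        total = math.erfc(chi / math.sqrt(2.0))  # two-sided normal tail
        term = chi  # chi^(2r-1) / (1*3*...*(2r-1)), starting at r = 1
        for r in range(1, (df - 1) // 2 + 1):
            if r > 1:
                term *= x / (2 * r - 1)
            total += 2.0 * density * term
        return total
    total, term = 1.0, 1.0
    for r in range(1, (df - 2) // 2 + 1):
        term *= x / (2 * r)  # chi^(2r) / (2*4*...*(2r))
        total += term
    return math.sqrt(2.0 * math.pi) * density * total


computed = {name: round(rmsea(c, d, N), 3) for name, (c, d, _) in models.items()}

# Nested comparison: Model 5c adds invariant path coefficients to Model 5b.
delta_chi2 = round(564.3 - 548.8, 1)
delta_df = 300 - 277
p_value = chi2_sf(delta_chi2, delta_df)

for name, (_, _, reported) in models.items():
    print(name, computed[name], reported)
print(delta_chi2, delta_df, round(p_value, 3))
```

Under these assumptions the recomputed RMSEA values agree with the tabled values for the models listed, and the chi-square change for the path-coefficient constraints is smaller than its degrees of freedom, which is why the change is nonsignificant.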

model that constrained parameters to be equal at T2 and T3 (TLI = .985; RMSEA = .044) were nearly the same as those for the model with no invariance constraints and better than models that freed parameters associated with measures at T2 or at T3. This invariance over time provides no support for a transitional perspective. Indeed, consistent with developmental theory, the results for T2 and T3 are more similar than the results for T1 and T2. Although differences over time are not large, the direction of these differences is statistically significant and in the opposite direction from those predicted by the transitional perspective; results are significantly more different between T1 and T2 (before the transition) than between T2 and T3 (before and after the transition). In summary, these results provide good support for the invariance of the results over time and grade level, reasonable support for a developmental perspective, and no support for the transitional perspective.

Discussion

The results of the present investigation demonstrate clear support for REM predictions that sport self-concept and sport performance for young children are each a cause and an effect of the other and that this support for their reciprocal effects generalizes over age, gender, and the transition from primary to secondary school. Developmentally, the study is important because most previous research has been based on adolescents and some researchers (e.g., Skaalvik & Hagtvet, 1990) even speculated that REM predictions may only apply to adolescents (for alternative perspectives, see Marsh et al., 1999; Marsh & Craven, 2006). Support for the REM in the sport domain is important because most REM research is based on the academic domain. Although there have been previous studies of the REM in the sport domain, the present investigation is apparently stronger, including

sport performance and sport self-concept measures in each of the three occasions for a large, representative sample of young children. Hence, support for REM predictions in the sport domain based on responses by young children provides important new support for the generalizability of the REM.

Representation of Performance in Sport and Academic Domains

In most REM studies in the academic domain, academic performance measures are based on school-based performance (e.g., school grades or teacher ratings), standardized achievement test scores, or both. Academic self-concept is typically more highly related to school-based performance measures, particularly school grades, than to standardized achievement tests (e.g., Marsh & Craven, 1997). The theoretical and substantive rationale for this finding is that school-based performance is a more immediate basis of feedback about academic accomplishments and is likely to be more strongly affected by motivational influences (e.g., effort, persistence, conscientiousness) that are related to academic self-concept (for further discussion, see Marsh, Trautwein, et al., 2005). For this reason, Marsh et al. (1999; Marsh & Craven, 2006) recommended that school grades and test scores should be considered as separate constructs rather than combined to form a single construct.

Somewhat analogously to research in the academic domain, performance in the present investigation was based on performance on standardized physical test scores (T1), teacher ratings of sporting potential (T1), and performance based on grades in physical education classes (T2 and T3). Unlike typical findings in the academic domain, T1 sport self-concept was more strongly related to the T1 physical test scores (.45) than the T1 teacher ratings (.42). Even at T3, the T1 physical test scores collected several years previously were substantially correlated with T3 sport self-concept (.51), more highly correlated than the T2 school-based performance measure (.44) and nearly as highly correlated as the T3 school-based performance measure (.60). Consistent with this finding, the T1 physical test scores continued to make a significant contribution to the prediction of T2 sport self-concept and T3 sport self-concept even after controlling for the effects of the other performance measures (Model 4 in Table 3). Whereas academic self-concepts for students are likely to reflect primarily what happens in classroom settings, sport self-concepts are apparently more likely to reflect extracurricular activities that may not even be formally associated with the school but which provide a basis of self-evaluation in addition to that provided by school-based measures. This reinforces critical differences between the sport and the academic domains and supports the need to more fully test the REM in the sport domain. Clearly, this is a relevant area for further research, with potentially important theoretical and practical implications about how children incorporate information about their performances in different domains into the formation of their self-concepts and how these influence future physical performance, health-related physical activity, physical fitness, and associated outcomes such as obesity.

Gender Differences in Sport Self-Concept and REM Predictions

Gender differences in mean levels of sport self-concept were in the expected, stereotypic direction (higher values for boys). However, in contrast to predictions based on the “differential socialization” model (that paths from sport self-concept to sport performance and from sport performance to sport self-concept would be stronger for boys than for girls), the results were reasonably invariant over gender. This invariance over gender is consistent with previous research showing a similar pattern of invariance of relations among variables in gender-stereotypic academic (math and verbal) domains. Hence, support for the generalizability of REM predictions over gender is consistent with these a priori expectations based on previous research. However, because gender differences in sport are stronger and more consistent than those in other self-concept domains, our study offered an even stronger test of the generalizability of REM predictions over gender. Taken together with previous research, our results provide strong support for the generalizability of REM predictions over gender.

The Effects of Transition From Primary to Secondary School

An interesting feature of the present investigation is that German students moved from primary school to secondary school at the end of Grade 4 (between T2 and T3 in our study). Thus, during the T1–T2 period, students moved from Grade 3 to Grade 4 within the same school, whereas during the T2–T3 period students moved from Grade 4 in primary schools to Grade 6 in secondary schools. Here, as is typically the case, the effect of the transition is confounded with age-related differences. However, because the transition occurs earlier in the German school system than in many systems, the transition is less likely to be

Table 3
Path Coefficients From Models 2, 3, and 4

Outcome       T1Prf   T1SSC   T2Prf   T2SSC   Sex     Age     Sex × Age   Test scores

Model 2: sport self-concept and performance
T1Prf         --      --      --      --
T1SSC         --      --      --      --
T2Prf         .40*    .23*    --      --
T2SSC         .21*    .60*    --      --
T3Prf         .15*    .01     .21*    .23*
T3SSC         .16*    .08     −.06    .66*

Model 3: sport self-concept, performance, and covariates (gender and age)
T1Prf         --      --      --      --      −.10*   .00     .01
T1SSC         --      --      --      --      −.23*   .04     .00
T2Prf         .40*    .24*    --      --      .03     .03     .02
T2SSC         .21*    .60*    --      --      .01     .02     .02
T3Prf         .16*    .02     .21*    .23*    .04     .01     .01
T3SSC         .15*    .06     −.06    .66*    −.05*   .00     .01

Model 4: sport self-concept, performance, covariates (gender and age), physical test scores
T1Prf         --      --      --      --      −.10*   .00     .01         --
T1SSC         --      --      --      --      −.23*   .04     .00         --
T2Prf         .30*    .18*    --      --      .07*    .05*    .02         .24*
T2SSC         .13*    .55*    --      --      .04     .00     .02         .22*
T3Prf         .10*    .00     .19*    .19*    .07*    .01     .01         .20*
T3SSC         .12*    .05     −.08*   .63*    .04     .01     .00         .12*
Test scores   --      --      --      --      −.24*   .11*    .00

Note. Times 1, 2, and 3 (T1, T2, and T3) are Grade 3, Grade 4, and Grade 6. Prf = performance (based on a teacher rating at T1 and school grades in physical education at T2 and T3); SSC = sport self-concept; test scores = a standardized physical achievement test administered at T1. Sex: 0 = male, 1 = female. Coefficients represent standardized path coefficients leading from constructs listed at the top of the table to constructs listed in the left-hand column for each model.
*p < .05.
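Because Model 2 is a full-forward model, a Time 1 construct can reach a Time 3 construct both directly and through the Time 2 constructs. The sketch below traces those routes using the standardized Model 2 coefficients from Table 3 and sums direct and indirect contributions. The resulting totals are an illustration only and are not reported in the article; the sign of the T2Prf to T3SSC path follows the value shown in Figure 1, and the helper name is hypothetical.

```python
# Standardized path coefficients from Table 3, Model 2:
# (source, target) -> coefficient. The T2Prf -> T3SSC sign follows Figure 1.
paths = {
    ("T1Prf", "T2Prf"): 0.40, ("T1SSC", "T2Prf"): 0.23,
    ("T1Prf", "T2SSC"): 0.21, ("T1SSC", "T2SSC"): 0.60,
    ("T1Prf", "T3Prf"): 0.15, ("T1SSC", "T3Prf"): 0.01,
    ("T2Prf", "T3Prf"): 0.21, ("T2SSC", "T3Prf"): 0.23,
    ("T1Prf", "T3SSC"): 0.16, ("T1SSC", "T3SSC"): 0.08,
    ("T2Prf", "T3SSC"): -0.06, ("T2SSC", "T3SSC"): 0.66,
}


def total_effect(source, target):
    """Sum of coefficient products over every directed route source -> target."""
    total = 0.0
    for (start, end), coef in paths.items():
        if start != source:
            continue
        # Direct path, or a path continued through an intermediate construct.
        total += coef if end == target else coef * total_effect(end, target)
    return total


# Self-enhancement direction: T1 self-concept -> T3 performance.
ssc1_on_perf3 = total_effect("T1SSC", "T3Prf")
# Skill-development direction: T1 performance -> T3 self-concept.
perf1_on_ssc3 = total_effect("T1Prf", "T3SSC")

print(round(ssc1_on_perf3, 4), round(perf1_on_ssc3, 4))
```

The recursion terminates because paths only run forward in time, so no construct can be revisited. In the self-enhancement direction most of the total comes from the route through T2 sport self-concept (.60 × .23) rather than from the direct path (.01).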

confounded with the onset of adolescence and puberty than in most research (typically based on North American settings).

Developmentally, we expected self-concept responses to become slightly lower, more reliable, more stable, and more highly correlated with performance as children grow older. In the present investigation, we compared predictions from a transitional perspective (that T1 and T2 scores before the transition would be more similar to each other than to the T3 scores after the transition) and a developmental perspective (that T3 scores would be more similar to T2 scores than to the T1 scores). Although there was reasonably good support for the invariance of the REM over time, there was support for the developmental perspective and evidence against the transitional perspective.

The results still leave unanswered the question as to why the transition effects were so weak in our study, a result that runs counter to many studies showing the negative consequences of educational transitions. However, several features of the present investigation bear on interpretations of these results. First, most previous research has not specifically contrasted a developmental perspective as an alternative explanation to the results. Second, most research in this area (based on North American research) confounds the results of the actual transition with developmental changes associated with the onset of puberty and adolescence, whereas in the present investigation this transition occurred at an earlier preadolescent age (at the end of fourth grade). Finally, much of the focus of sport in Germany occurs outside of school so that the potential disruption because of a school transition might be smaller in sport than other academic and nonacademic components of self-concept more directly related to school. Hence, there is need for further research that specifically contrasts developmental and transitional perspectives for different components of self-concept (as well as other variables) at different developmental stages and that unconfounds the effects of the transition from the effects of the onset of adolescence and puberty. In this respect, our study offers an alternative, apparently novel approach to studying the effects of transitions from a developmental perspective.

Conclusions and Practical Implications

In conclusion, the direction of causality between self-concept and performance has profound practical implications for educators, counselors, and parents, as well as for applied and academic psychologists. Historically, research in this area has focused on two models that offer contrasting interpretations with important theoretical and practical implications (Marsh, 2007b; Marsh & Craven, 2006). According to the self-enhancement model, the direction of causality is from self-concept to performance. Support for this model would justify placing more effort into enhancing students’ self-concepts rather than focusing solely on performance and achievement. In opposition to the self-enhancement model, the skill development model predicts that the direction of causality is from performance to self-concept. Support for this model implies that educators should focus solely on improving performance and test scores, as this would be the best way to improve self-concept. In contrast to both these apparently simplistic (either–or) models, the REM implies that self-concept and performance are reciprocally related and mutually reinforcing. Improved self-concepts will lead to better performance, and improved performance will lead to better self-concepts. Thus, for example, if educators enhance self-concepts without improving corresponding levels of test scores and performance, then the gains in self-concept are likely to be short lived (as emphasized in the REM as well as related research such as Bandura’s, 1986, self-efficacy theory). However, if educators improve students’ test scores and performance levels without also fostering students’ self-beliefs in their capabilities, then the performance gains are unlikely to be long lasting. If educators focus on one construct to the exclusion of the other, then both are likely to suffer. The REM suggests that the most effective strategy is to improve both self-concept and performance simultaneously. The results of the present investigation extend the generalizability of these conclusions in relation to the sport domain, preadolescent ages, gender, and the critical transition from primary to secondary school.

Research in the academic domain suggests that the REM generalizes to a variety of academic outcomes (Marsh, 2007; Marsh & Craven, 2006). Although beyond the scope of the present investigation, this suggests the relevance of the REM in the physical domain to a variety of physical outcomes such as health-related physical activity, physical fitness, obesity, and cardiovascular problems associated with sedentary lifestyles that are likely to have their origin in childhood.

References

Bandura, A. (1986). Social foundations of thought and action: A social cognitive theory. Englewood Cliffs, NJ: Prentice-Hall.
Bandura, A. (1997). Self-efficacy: The exercise of control. New York: Freeman.
Bollen, K. A. (1989). Structural equations with latent variables. New York: John Wiley & Sons.
Brettschneider, W.-D., & Gerlach, E. (2004). Sportliches Engagement und Entwicklung im Kindesalter [Sports involvement and development in children]. Aachen, Germany: Meyer & Meyer.
Byrne, B. M. (1996). Academic self-concept: Its structure, measurement, and relation to academic achievement. In B. A. Bracken (Ed.), Handbook of self-concept (pp. 287–316). New York: Wiley.
Byrne, B. M. (1998). Structural equation modeling with LISREL, PRELIS, and SIMPLIS: Basic concepts, applications and programming. Mahwah, NJ: Erlbaum.
Byrne, B. M. (2002). Validating measurement and structure of self-concept: Snapshots of past, present and future research. American Psychologist, 57, 897–909.
Carver, C. S., & Scheier, M. F. (1981). Attention and self-regulation: A control theory approach to human behavior. New York: Springer-Verlag.
Cole, D. A., & Maxwell, S. E. (2003). Testing mediational models with longitudinal data: Questions and tips in the use of structural equation modeling. Journal of Abnormal Psychology, 112, 558–577.
Crain, R. M. (1996). The influence of age, race, and gender on child and adolescent multidimensional self-concept. In B. A. Bracken (Ed.), Handbook of self-concept: Developmental, social, and clinical considerations (pp. 395–420). New York: Wiley.
Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. New York: Plenum Press.
Eccles, J. S. (1987). Gender roles and achievement patterns: An expectancy value perspective. In J. M. Reinisch, L. A. Rosenblum, & S. A. Sanders (Eds.), Masculinity/femininity: Basic perspectives (pp. 240–280). New York: Oxford University Press.
Eccles, J. S., & Harold, R. D. (1991). Gender differences in sport involvement: Applying the Eccles’ expectancy-value model. Journal of Applied Sport Psychology, 3, 7–35.
Eccles, J. S., & Wigfield, A. (2002). Motivational beliefs, values, and goals. Annual Review of Psychology, 53, 109–132.
Graham, J. W., & Hofer, S. M. (2000). Multiple imputation in multivariate research. In T. D. Little, K. U. Schnabel, & J. Baumert (Eds.), Modeling longitudinal and multilevel data: Practical issues, applied approaches, and specific examples (pp. 201–218). Mahwah, NJ: Erlbaum.
Guay, F., Marsh, H. W., & Boivin, M. (2003). Academic self-concept and academic achievement: Developmental perspectives on their causal ordering. Journal of Educational Psychology, 95, 124–136.
Harter, S. (1985). Manual for the Self-Perception Profile for Children. Denver, CO: University of Denver.

Harter, S. (1990). Processes underlying adolescent self-concept formation. In R. Montemayor, G. Adams, & T. Gullotta (Eds.), From childhood to adolescence: A transitional period? (pp. 205 – 239). Thousand Oaks, CA: Sage.
Harter, S. (1998). Developmental perspectives on the self-system. In N. Eisenberg (Vol. Ed.), Handbook of child psychology (5th ed., Vol. 3, pp. 553 – 618). New York: Wiley.
Harter, S. (1999). The construction of the self: A developmental perspective. New York: Guilford.
Hattie, J. A. (1992). Self-concept. Hillsdale, NJ: Lawrence Erlbaum.
Hill, J. P., & Lynch, M. E. (1983). The intensification of gender-related role expectations during early adolescence. In J. Brooks-Gunn & A. C. Peterson (Eds.), Girls at puberty (pp. 201 – 228). New York: Plenum Press.
Jacobs, J. E., Lanza, S., Osgood, D. W., Eccles, J. S., & Wigfield, A. (2002). Changes in children’s self-competence and values: Gender and domain differences across grades one through twelve. Child Development, 73, 509 – 527.
Jöreskog, K. G. (1979). Statistical estimation of structural models in longitudinal investigations. In J. R. Nesselroade & P. B. Baltes (Eds.), Longitudinal research in the study of behavior and development (pp. 303 – 351). New York: Academic Press.
Jöreskog, K. G., & Sörbom, D. (1993). LISREL 8 [Computer software manual]. Chicago, IL: Scientific Software International.
Kaplan, D. (2000). Structural equation modeling: Foundations and extensions. Newbury Park, CA: Sage.
Kling, K. C., Hyde, J. S., Showers, C. J., & Buswell, B. N. (1999). Gender differences in self-esteem: A meta-analysis. Psychological Bulletin, 125, 470 – 500.
Lerner, R. M. (1982). Children and adolescents as producers of their own development. Developmental Review, 2, 342 – 370.
Lerner, R. M. (1996). Relative plasticity, integration, temporality, and diversity in human development: A developmental contextual perspective about theory, process, and method. Developmental Psychology, 32, 781 – 786.
Maccoby, E. E., & Jacklin, C. N. (1974). The psychology of sex differences. Stanford, CA: Stanford University Press.
Marsh, H. W. (1989a). Age and sex effects in multiple dimensions of self-concept: Preadolescence to early-adulthood. Journal of Educational Psychology, 81, 417 – 430.
Marsh, H. W. (1989b). Sex differences in the development of verbal and math constructs: The High School and Beyond study. American Educational Research Journal, 26, 191 – 225.
Marsh, H. W. (1990a). The causal ordering of academic self-concept and academic achievement: A multiwave, longitudinal panel analysis. Journal of Educational Psychology, 82, 646 – 656.
Marsh, H. W. (1990b). A multidimensional, hierarchical self-concept: Theoretical and empirical justification. Educational Psychology Review, 2, 77 – 172.
Marsh, H. W. (1993a). Academic self-concept: Theory measurement and research. In J. Suls (Ed.), Psychological perspectives on the self (Vol. 4, pp. 59 – 98). Hillsdale, NJ: Erlbaum.
Marsh, H. W. (1993b). The multidimensional structure of academic self-concept: Invariance over gender and age. American Educational Research Journal, 30, 841 – 860.
Marsh, H. W. (1994). Confirmatory factor analysis models of factorial invariance: A multifaceted approach. Structural Equation Modeling, 1, 5 – 34.
Marsh, H. W. (2007a). Application of confirmatory factor analysis and structural equation modeling in sport/exercise psychology. In G. Tenenbaum & R. C. Eklund (Eds.), Handbook of sport psychology (3rd ed., pp. 774 – 798). New York: Wiley.
Marsh, H. W. (2007b). Self-concept theory, measurement and research into practice: The role of self-concept in educational psychology. Leicester, UK: British Psychological Society.
Marsh, H. W., Balla, J. R., & McDonald, R. P. (1988). Goodness of fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 391 – 410.
Marsh, H. W., Byrne, B. M., & Yeung, A. S. (1999). Causal ordering of academic self-concept and achievement: Reanalysis of a pioneering study and revised recommendations. Educational Psychologist, 34, 154 – 157.
Marsh, H. W., Chanal, J. P., Sarrazin, P. G., & Bois, J. E. (2006). Self-belief does make a difference: A reciprocal effects model of the causal ordering of physical self-concept and gymnastics performance. Journal of Sport Sciences, 24, 101 – 111.
Marsh, H. W., & Craven, R. (1997). Academic self-concept: Beyond the dustbowl. In G. Phye (Ed.), Handbook of classroom assessment: Learning, achievement, and adjustment (pp. 131 – 198). Orlando, FL: Academic Press.
Marsh, H. W., & Craven, R. G. (2006). Reciprocal effects of self-concept and performance from a multidimensional perspective: Beyond seductive pleasure and unidimensional perspectives. Perspectives on Psychological Science, 1, 133 – 163.
Marsh, H. W., Debus, R., & Bornholt, L. (2005). Validating young children’s self-concept responses: Methodological ways and means to understand their responses. In D. M. Teti (Ed.), Handbook of research methods in developmental science (pp. 138 – 160). Oxford, UK: Blackwell Publishers.
Marsh, H. W., & Grayson, D. (1995). Latent-variable models of multitrait-multimethod data. In R. H. Hoyle (Ed.), Structural equation modeling: Issues and applications (pp. 177 – 198). Thousand Oaks, CA: Sage.
Marsh, H. W., & Hau, K.-T. (1996). Assessing goodness of fit: Is parsimony always desirable? Journal of Experimental Education, 64, 364 – 390.
Marsh, H. W., Hau, K.-T., & Grayson, D. (2005). Goodness of fit evaluation in structural equation modeling. In A. Maydeu-Olivares & J. McCardle (Eds.), Contemporary psychometrics. A festschrift to Roderick P. McDonald (pp. 275 – 340). Mahwah, NJ: Erlbaum.
Marsh, H. W., Hau, K.-T., & Wen, Z. (2004). In search of golden rules: Comment on hypothesis testing approaches to setting cutoff values for fit indexes and dangers in overgeneralising Hu & Bentler’s (1999) findings. Structural Equation Modelling, 11, 320 – 341.

Marsh, H. W., & Perry, C. (2005). Does a positive self-concept contribute to winning gold medals in elite swimming? The causal ordering of elite athlete self-concept and championship performances. Journal of Sport and Exercise Psychology, 27, 71 – 91.
Marsh, H. W., Trautwein, U., Lüdtke, O., Köller, O., & Baumert, J. (2005). Academic self-concept, interest, grades and standardized test scores: Reciprocal effects models of causal ordering. Child Development, 76, 397 – 416.
Riepe, L., & Zindel, M. (1999). Talentsuche und Talentförderung in NRW [Talent selection and promotion in North-Rhine Westphalia]. Düsseldorf, Germany: Ministerium für Arbeit, Soziales und Stadtentwicklung, Kultur und Sport des Landes Nordrhein-Westfalen.
Skaalvik, E. M. (1997). Issues in research on self-concept. In M. L. Maehr & P. R. Pintrich (Eds.), Advances in motivation and achievement (Vol. 10, pp. 51 – 98). Greenwich, CT: JAI Press.
Skaalvik, E. M., & Hagtvet, K. A. (1990). Academic achievement and self-concept: An analysis of causal predominance in a developmental perspective. Journal of Personality & Social Psychology, 58, 292 – 307.
Valentine, J. C. (2001). The relation between self-concept and achievement: A meta-analytic review. Doctoral dissertation, University of Missouri, Columbia.
Valentine, J. C., & DuBois, D. L. (2005). Effects of self-beliefs on academic achievement and vice-versa: Separating the chicken from the egg. In H. W. Marsh, R. G. Craven, & D. M. McInerney (Eds.), International advances in self research (Vol. 2, pp. 53 – 78). Greenwich, CT: Information Age.
Wigfield, A., & Eccles, J. S. (2002). The development of competence beliefs, expectancies for success, and achievement values from childhood through adolescence. In A. Wigfield & J. S. Eccles (Eds.), Development of achievement motivation (pp. 173 – 195). San Diego, CA: Academic Press.
Wigfield, A., Eccles, J., Mac Iver, D., Reuman, D., & Midgley, C. (1991). Transitions at early adolescence: Changes in children’s domain-specific self-perceptions and general self-esteem across the transition to junior high school. Developmental Psychology, 27, 552 – 565.
Wigfield, A., Eccles, J. S., Yoon, K. S., Harold, R. D., Arbreton, A., Freedman-Doan, K., et al. (1997). Changes in children’s competence beliefs and subjective task values across the elementary school years: A three-year study. Journal of Educational Psychology, 89, 451 – 469.
Wigfield, A., & Karpathian, M. (1991). Who am I and what can I do? Children’s self-concepts and motivation in achievement situations. Educational Psychologist, 26, 233 – 261.

Appendix
Means, Standard Deviations (SDs), and Correlations Among Variables Considered in the Present Investigation

Correlations

Variable M SD 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22

1 T1PGRD 0.537 0.937 1.0


2 T1SS1 3.174 0.852 .391 1.0
3 T1SS2 2.244 1.059 .200 .462 1.0
4 T1SS3 3.171 0.900 .227 .508 .443 1.0
5 T1SS4 3.239 0.931 .211 .478 .335 .446 1.0
6 T1SS5 3.650 0.721 .223 .406 .194 .322 .332 1.0
7 T2PGRD 4.828 0.678 .495 .359 .202 .235 .215 .183 1.0
8 T2SS1 2.922 0.894 .408 .499 .353 .366 .328 .293 .557 1.0
9 T2SS2 2.271 0.985 .264 .367 .363 .297 .255 .154 .360 .555 1.0
10 T2SS3 3.089 0.896 .278 .400 .301 .344 .281 .228 .355 .585 .532 1.0
11 T2SS4 3.161 0.954 .344 .414 .233 .315 .395 .273 .389 .592 .438 .527 1.0
12 T2SS5 3.677 0.676 .302 .355 .201 .269 .274 .281 .373 .510 .300 .429 .465 1.0
13 T3PGRD 4.733 0.773 .374 .285 .141 .177 .209 .191 .435 .379 .277 .308 .329 .250 1.0
14 T3SS1 3.001 0.857 .397 .418 .273 .288 .243 .199 .383 .553 .400 .447 .436 .401 .530 1.0
15 T3SS2 2.363 0.932 .238 .339 .283 .242 .228 .151 .241 .394 .452 .390 .302 .230 .370 .530 1.0
16 T3SS3 3.122 0.859 .293 .327 .223 .256 .207 .184 .242 .394 .319 .420 .347 .321 .344 .548 .470 1.0
17 T3SS4 3.209 0.859 .290 .326 .201 .231 .265 .193 .296 .400 .271 .395 .410 .368 .379 .510 .388 .528 1.0
18 T3SS5 3.572 0.721 .343 .309 .200 2.30 .222 .220 .320 .408 .275 .370 .370 .403 .419 .548 .355 .446 .480 1.0
19 Sex 1.518 0.500 .099 .201 .204 .111 .082 .126 .062 .159 .126 .085 .064 .068 .034 .160 .156 .130 .075 .093 1.0
20 Age 9.674 0.634 .011 .043 .024 .059 .030 .015 .013 .046 .067 .071 .001 .021 .022 .064 .017 .031 .026 .005 .065 1.0
21 ZS × ZA 0.064 0.967 .008 .004 .002 .025 .014 .015 .017 .032 .039 .004 .003 .023 .003 .001 .036 .000 .025 .014 .023 .030 1.0
22 PSKIL 0.027 0.930 .515 .409 .251 .249 .225 .222 .458 .460 .315 .343 .342 .339 .418 .437 .269 .341 .335 .345 .245 .122 .003 1.0

Note. Each variable that was measured on multiple occasions has a prefix indicating the occasion (T1 = Time 1, T2 = Time 2, T3 = Time 3). On each occasion, there are five sport self-concept items (labeled SS1 – SS5). PGRD = physical education grade; Sex = gender (0 = male, 1 = female); ZS × ZA = Sex × Age cross product after each variable had been standardized (M = 0, SD = 1); PSKIL = physical test scores of sport ability (measured at T1 only). Internal consistency (Cronbach’s alpha) was .76, .83, and .84 for responses at T1, T2, and T3, respectively. Coefficient alpha estimates of reliability were nearly the same for girls and boys (T1: .74 vs. .76; T2: .83 vs. .82; T3: .81 vs. .82), although sport self-concept scores for girls were lower and had larger SDs (see subsequent discussion of gender difference).
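
As a rough check on the internal consistency values in the Note, a standardized alpha can be computed from the inter-item correlations in the Appendix alone. The sketch below is illustrative and is not the authors' procedure. It assumes the correlation-based formula alpha = k * r_mean / (1 + (k - 1) * r_mean), whereas the reported values presumably derive from raw item scores, so small differences are to be expected. The names INTER_ITEM_R, standardized_alpha, and ALPHAS are hypothetical.

```python
# Inter-item correlations among SS1-SS5 at each wave, copied from the
# lower triangle of the Appendix correlation matrix (10 pairs per wave).
INTER_ITEM_R = {
    "T1": [.462, .508, .443, .478, .335, .446, .406, .194, .322, .332],
    "T2": [.555, .585, .532, .592, .438, .527, .510, .300, .429, .465],
    "T3": [.530, .548, .470, .510, .388, .528, .548, .355, .446, .480],
}


def standardized_alpha(pair_corrs, k):
    """Standardized Cronbach's alpha from the mean inter-item correlation.

    Assumption: this is the correlation-based variant, not the raw-score
    alpha that the Note most likely reports.
    """
    expected_pairs = k * (k - 1) // 2
    if len(pair_corrs) != expected_pairs:
        raise ValueError(
            f"expected {expected_pairs} correlations, got {len(pair_corrs)}"
        )
    mean_r = sum(pair_corrs) / len(pair_corrs)
    return k * mean_r / (1 + (k - 1) * mean_r)


# One estimate per measurement occasion, each based on k = 5 items.
ALPHAS = {wave: standardized_alpha(r, k=5) for wave, r in INTER_ITEM_R.items()}

if __name__ == "__main__":
    for wave, alpha in ALPHAS.items():
        print(f"{wave}: standardized alpha = {alpha:.2f}")
```

Under these assumptions the estimates are approximately .76, .83, and .82 for T1, T2, and T3, close to the reported .76, .83, and .84.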
