
Journal of Educational Psychology

© 2021 American Psychological Association
ISSN: 0022-0663

2022, Vol. 114, No. 3, 498–512
https://doi.org/10.1037/edu0000709

Class Composition, Student Achievement, and the Role of the Learning Environment

Jeroen Lavrijsen¹, Jonas Dockx¹, Elke Struyf², and Karine Verschueren¹

¹ Faculty of Psychology and Educational Sciences, KU Leuven
² Faculty of Social Sciences, University of Antwerp

This study considers how class composition, in terms of between-student variability and the average level of achievement, is related to the academic development of students, and how these relationships can be explained by features of the class learning environment. At the start of secondary education, Flemish schools can decide autonomously how to group their students, leading to variation in class mean and class heterogeneity between classes. In a sample of 2,895 Flemish students from 158 classes, math achievement at the end of grade 8 was found to be unrelated to class heterogeneity, after accounting for previous achievement, intelligence, gender, and social background. Path analyses showed that class heterogeneity was positively associated with teachers’ use of differentiated instruction to accommodate for differences between students, and that differentiated instruction was related to higher student achievement. Second, students were found to achieve better in classes with high average achievement. While this held true for all students, high achieving students seemed to benefit the most from being in a class with a high average level. Although class-average achievement was positively related to the academic orientation of the class, this did not explain the association between class mean level and achievement. These results suggest that, although it might be beneficial for the students in the high ability groups, grouping students in distinct classes according to ability might have little overall benefit, and emphasize that teachers’ responses to student diversity might be more decisive for improving student achievement than homogenizing classes in terms of ability.

This document is copyrighted by the American Psychological Association or one of its allied publishers.
This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Educational Impact and Implications Statement


In a sample of 2,895 Grade 8 students from 158 classes, students were found to achieve better when they were in a class with a high mean achievement level, after accounting for previous achievement, intelligence, gender, and social background. By contrast, class heterogeneity (i.e., the extent of between-student differences) was largely unrelated to subsequent student achievement. Hence, while sorting students according to ability could benefit those sorted into high ability classes (given the benefits associated with being in a class with a high mean level), this would come at the expense of those sorted into low ability classes. Moreover, teachers in heterogeneous classes appeared to differentiate their instruction more to student capacities, which was subsequently related to higher student achievement.

Keywords: ability grouping, heterogeneity, class mean, differentiated instruction, academic orientation

This article was published Online First December 30, 2021.
Jeroen Lavrijsen https://orcid.org/0000-0001-9005-8350
The authors have no declarations of interest. This work was funded by Research Foundation–Flanders (FWO), Project S002917N. The work described has not been published previously and is not under consideration for publication elsewhere. Its publication is approved by all authors and tacitly or explicitly by the responsible authorities where the work was carried out. If accepted, it will not be published elsewhere in the same form, in English or in any other language, including electronically without the written consent of the copyright holder.
Correspondence concerning this article should be addressed to Jeroen Lavrijsen, Faculty of Psychology and Educational Sciences, KU Leuven, Tiensestraat 102 bus 3717, 3000 Leuven, Belgium. Email: Jeroen.Lavrijsen@kuleuven.be

Previous research has suggested that grouping students according to ability can benefit their academic development. However, overall, the observed effects have been modest at best, in particular for arrangements that group students in distinct classes depending on ability. Moreover, the mechanisms through which ability grouping would affect achievement remain unclear. This study considered how class composition, in terms of average achievement and between-student variability, was associated with individual math achievement in a large sample of eighth-graders. In addition, differential effects for low and high achievers were considered: Do class average level and class heterogeneity affect students with different levels of achievement differently? Finally, the study examined how these relationships could be explained through the impact of class composition on two features of the class learning environment, that is, the instructional behavior of the teacher, and the academic orientation of the class. Accordingly, this study aimed to better understand the processes through which class composition relates to the academic development of students.

Ability Grouping and Achievement

Over the past decades, numerous studies have investigated the effects of ability grouping on academic achievement. This


literature has been summarized in multiple meta-analyses: A second order meta-analysis recently integrated findings from no less than 13 earlier meta-analyses (Steenbergen-Hu et al., 2016). This synthesis demonstrated that students can benefit from certain types of grouping arrangements, in particular within-class grouping (i.e., assigning students to small homogeneous groups within their class) and cross-grade subject grouping (i.e., grouping students of different grade levels together for a particular subject), although aggregated effect sizes for both types were modest (g = .25 and g = .26, respectively).

However, no benefits were observed for between-class grouping, which involves placing students into distinct classrooms according to ability (g = –.03). Individual studies on between-class grouping have been largely inconsistent: Even meta-analytical effect sizes have been either negative (e.g., Henderson, 1989), zero (e.g., Slavin, 1990), or positive (e.g., Kulik & Kulik, 1982). In addition, although ability grouping has often been assumed to be beneficial to high achievers, but harmful to low achievers (Schofield, 2010), the second order meta-analysis found little evidence for differential effects for low and high achievers (Steenbergen-Hu et al., 2016).

What might explain these inconsistencies? A first explanation might be methodological. Effects of ability grouping have often been established by comparing student achievement in classes with and without formal ability grouping arrangements. However, such arrangements might be, in reality, largely unrelated to actual class composition. For example, even schools that do not formally adopt ability grouping might sort students in different classes depending on proxies of ability, such as students’ program choices (Betts & Shkolnik, 2000). If this would be the case, comparing student development in grouped and ungrouped classes without information on actual class composition would be like “comparing apples to apples and finding little difference” (Rees et al., 2000, p. 18).

Second, it has been argued that “grouping does not produce achievement: instruction does” (Gamoran, 1987, p. 341). This argument, developed at length by Hattie (2002), proposes that grouping only affects the circumstances in which instruction takes place, but whether such arrangements promote learning depends more on how teachers adapt to and capitalize on these circumstances than on the grouping in itself. For example, ability grouping has been promoted under the assumption that more homogeneous classes would lead teachers to better match instruction to student capacities. However, whether teachers actually take advantage of this opportunity depends on teacher behavior and views. In the end, the physical placement of students in ability groups might be less decisive than how teachers respond to student needs (Lou et al., 1996).

This study aimed to advance the scientific understanding of the relationship between class composition and student achievement in three ways. First, instead of comparing the effects of formal grouping arrangements, this study directly considered two aspects of class composition related to it, that is, class heterogeneity and class mean ability. Second, differential effects of compositional characteristics on high and low achievers were investigated. Third, the study examined how characteristics of the learning environment would mediate the relationship between class composition and individual achievement.

Class Composition

In general, students have been found to flourish when the difficulty of their schoolwork is in line with their capacities (Shernoff et al., 2014), in particular when they face challenges that are just beyond their current level of mastery (“zone of proximal development,” Vygotsky, 1980). However, when differences between classmates become too large, it could become difficult to tailor schoolwork to divergent student capacities (Hallinan & Kubitschek, 1999). Hence, it has been argued that in heterogeneous classes, schoolwork might be too easy for high achieving students and too difficult for low achievers, leaving the former bored and the latter frustrated and anxious (Acee et al., 2010), while in homogeneous classes, students would become more motivated and efficient learners (Duflo et al., 2011).

Despite these theoretical expectations, studies relating class heterogeneity to individual achievement have been inconclusive. While some of these studies found that students in heterogeneous classes achieved worse than their peers in homogeneous classes (Cheung & Rudowicz, 2003; Fertig, 2003; Luyten & der Hoeven-van Doornum, 1995), other studies did not find any systematic association (Hanushek et al., 2003; Leiter, 1983; Sanders et al., 1997), and some even reported small positive effects of class heterogeneity (Chiu et al., 2017; Duru-Bellat & Mingat, 1998; Raitano & Vona, 2013; Vigdor & Nechyba, 2007). A number of explanations have been put forward to understand why class heterogeneity does not have to be detrimental to student achievement. For example, it has been suggested that heterogeneous classes create more help opportunities between high and low achieving students (Chiu et al., 2017); this could be beneficial not only to the recipient but also to the provider, as having to restructure and reconsider knowledge in order to explain it to others might deepen the latter’s own understanding of the material. It has also been suggested that students would benefit from having a few high ability classmates, as these might already improve the quality of class interactions (e.g., class discussions). If adding a few high ability students to a low ability class would outweigh the corresponding loss in “peer quality” in high ability classes (Raitano & Vona, 2013), the more equal distribution of high ability students between classes associated with increased class heterogeneity might be another asset of heterogeneous classes (Duru-Bellat & Mingat, 1998).

A second aspect of class composition associated with ability grouping is class mean level. In particular, ability grouping is expected to polarize classes in high and low ability classes (compared to ungrouped classes of medium ability). Class mean ability has been consistently found to positively predict individual achievement (Cheung & Rudowicz, 2003; Duru-Bellat & Mingat, 1998; Fruehwirth, 2013; Opdenakker et al., 2002; Reynolds et al., 2014; Stäbler et al., 2017): Students in classes with high average ability usually perform better than similar students in classes with low average ability. Several explanations have been put forward to understand this relationship; for example, it has been suggested that teachers in high ability classes would hold higher expectations for their students, positively affecting individual achievement (Jussim & Harber, 2005; Kelly & Carbonaro, 2012; McGillicuddy & Devine, 2018).

Differential Effects

In addition, class composition might have differential effects on low and high achieving students. In general, ability grouping has been argued to enlarge the gaps between low and high achieving students (Duru-Bellat & Mingat, 1998; Hallinan & Kubitschek, 1999; Hoffer, 1992; Schofield, 2010). For example, stratified educational systems (which track their students into different streams according to ability) have been found to have larger score dispersion in international student tests than comprehensive school systems (Hanushek & Wößmann, 2006; Lavrijsen & Nicaise, 2016; Van de Werfhorst & Mijs, 2010). Of note, however, this dispersion might be a mere consequence of the effect of class mean level on achievement: In a tracked system, high achievers are grouped in classes with high average ability, which would subsequently predict better individual performance; for low achievers, the reverse would hold true. By contrast, this study addressed whether class mean level or class heterogeneity themselves would have differential effects on high and low achievers. First, it has been argued that in particular high ability students might benefit from being in a class with a high mean level (Opdenakker et al., 2002). In such classes, instruction is usually provided at a high level and pace. As such, a cognitively demanding instruction would better suit the needs of high ability students (Preckel et al., 2010); such students would maximally benefit from being in a high level class. Second, it has been suggested that in particular low ability students would benefit from being in a heterogeneous class (Deunk et al., 2018; Lou et al., 1996), as this implies the presence of high achieving peers upholding teacher standards and instructional level (Raitano & Vona, 2013). By contrast, being grouped in a homogeneously low achieving class would deprive students of the example and stimulation provided by high achieving peers (“behavioral contagion”; Slavin, 1987, p. 297).

The Mediating Role of The Learning Environment

Beyond establishing relationships between class composition and individual achievement, this study focused on the learning environment as a possible mediator of these relationships.

First, to accommodate for differences between students, teachers may differentiate their instruction, that is, identify learners’ needs and provide instruction in accordance with these needs. Overall, differentiated instruction has been proposed as a powerful instructional strategy to address diverse student needs (Tomlinson, 2014). However, because differentiated instruction has been conceptualized in diverging ways (Graham et al., 2021), estimates of its association with student achievement have been inconsistent, ranging from very weak (Seidel & Shavelson, 2007) to moderate (Smale-Jacobse et al., 2019). In this study, we defined differentiated instruction as instructing students with different capacities differently, for example by providing students with different tasks according to their ability.

Arguably, class composition is related to teachers’ use of differentiated instruction in two ways. First, the opportunities to capitalize on student variability might be more salient for teachers in classes with large between-student differences. Accordingly, teachers in heterogeneous classes might be more eager to implement differentiated instruction than teachers in homogeneous classes. For example, a German study found teachers in comprehensive schools to employ differentiated instruction more often than teachers in highly selective (and thus more homogeneous) schools (Pozas et al., 2020). In addition, differentiated instruction could also moderate the effects of class composition on achievement. In particular, differentiated instruction might matter most in heterogeneous classes, as in such classes adequately accommodating for the large differences between students would be most crucial.

As the second characteristic of the learning environment, we considered the academic orientation of the class, that is, the importance classmates attach to studying. In general, differentiation-polarization theory postulates that grouping creates a polarization between an academically oriented class culture in high ability groups and a less academically oriented class culture in low ability groups (Hargreaves, 1967). On average, students referred to lower ability classes evaluate schoolwork as less relevant and are more fatalistic about their educational prospects (Malmberg & Trempała, 1997; Van Houtte & Stevens, 2008). The concentration of such attitudes may fuel the development of an “antischool” culture in low ability classes (Abraham, 1989; Berends, 1995), which in turn might deteriorate the academic development of students in these classes (Agirdag et al., 2012; Kindermann, 2007; Van Houtte, 2016; Van Houtte & Stevens, 2010). By contrast, high ability classes often have a more orderly and quiet classroom atmosphere, which would create a fertile environment for student development (Opdenakker & Van Damme, 2001, 2006). Hence, we investigated whether class composition was associated with individual achievement through the academic orientation of the class.

Present Study

The present study investigated the association between class composition and individual student achievement and considered how these associations might be mediated through the class learning environment. To accommodate for potential selection biases (Dicke et al., 2018), we controlled for predictors both at the individual level and the class level, such as previous achievement (Hailikari et al., 2008), general intelligence (Kriegbaum et al., 2018), gender (Voyer & Voyer, 2014), social background (Sirin, 2005), and class size (Nye et al., 2000). Moreover, we adopted a multilevel structural “doubly latent” model to simultaneously correct for measurement error and sampling error (see Statistical Strategy section).

We formulated the following three research objectives.

Research Objective 1: Investigating the Associations Between Class Heterogeneity and Class Mean Level and Individual Achievement

First, instead of comparing formal grouping arrangements, we used student scores on a standardized math test to directly measure two aspects of class composition that are most relevant for student learning, in particular, class mean level and class heterogeneity. As students have consistently been found to benefit from having high-ability peers, we expected class mean level to be positively associated with individual academic development (H1). By contrast, as previous studies in this regard have been inconsistent, we investigated the association between class heterogeneity and achievement in an exploratory way (RQ1).
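
As an illustration only, and not the authors' analysis script, the two composition measures named in Research Objective 1 can be sketched with plain manifest scores: class heterogeneity as the within-class standard deviation of prior math achievement, and class mean level as the average prior achievement of a student's classmates, leaving out the student's own score. The record layout, the function name, and the choice of the sample (n − 1) standard deviation are assumptions made for this sketch; the study itself relied on a doubly latent multilevel model rather than raw aggregates.

```python
from collections import defaultdict
from statistics import stdev


def class_composition(records):
    """Return class SD and leave-one-out class mean for every student.

    `records` is a list of (student_id, class_id, prior_score) tuples.
    This layout is hypothetical, not the structure of the study's data.
    """
    scores_by_class = defaultdict(list)
    for _, class_id, score in records:
        scores_by_class[class_id].append(score)

    result = {}
    for student_id, class_id, score in records:
        scores = scores_by_class[class_id]
        n = len(scores)
        # Heterogeneity: sample SD over all class members (assumed n - 1 form).
        class_sd = stdev(scores) if n > 1 else 0.0
        # Mean level: classmates only, excluding the student's own score.
        loo_mean = (sum(scores) - score) / (n - 1) if n > 1 else float("nan")
        result[student_id] = {"class_sd": class_sd, "class_mean_loo": loo_mean}
    return result


# Toy standardized prior scores for two small classes (invented values).
example = [
    ("s1", "A", -0.5), ("s2", "A", 0.0), ("s3", "A", 0.5),
    ("s4", "B", 1.0), ("s5", "B", 1.2),
]
composition = class_composition(example)
```

Leaving out the student's own score keeps the class mean from being partly a copy of the individual predictor it is entered alongside.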

Research Objective 2: Examining Differential Effects of Class Composition on Low and High Achievers

In addition, we explored possible differential effects of class composition between low and high achievers. First, as high achieving students would benefit the most from the high standards and demanding instruction in high ability classes, we expected a positive interaction between individual achievement and class mean level (H2). Second, we expected class heterogeneity to have a more positive association with achievement for low achievers than for high achievers, as for low achievers the presence of a few high achieving peers could uphold teacher standards, instructional level, and peer stimulation (H3).

Research Objective 3: Examining the Role of the Class Learning Environment in the Association Between Class Composition and Achievement

Finally, beyond establishing associations between class composition and achievement, we considered two features of the class learning environment which may explain these associations (Hattie, 2002), in particular, teachers’ use of differentiated instruction and the academic orientation of the class. First, we expected class mean to positively predict class academic orientation (H4), with class academic orientation subsequently predicting higher individual achievement (H5). Second, we expected class heterogeneity to relate positively to teachers’ use of differentiated instruction (H6), with differentiated instruction positively predicting subsequent achievement (H7). In addition, we hypothesized that class heterogeneity and differentiated instruction would interact, with differentiated instruction being more strongly associated with achievement in classes with large differences between students (H8).

Study Context

The study was carried out among Grade 8 students in Flanders (age at the first measurement occasion: 13.4 years). At this stage of secondary education, Flemish classes can still be fairly heterogeneous, although the level of heterogeneity varies between classes and schools (Lavrijsen et al., 2014). In principle, all students who successfully completed primary education transfer to a program with a largely shared curriculum and a strong emphasis on basic education (A-stream).¹ For example, Grade 8 students spend 24 weekly periods out of 32 on a common core (which also includes mathematics). Schools can, however, decide autonomously how to group their students. In practice, this is often done according to students’ choices for the remaining eight weekly periods (option choice), in which students can choose to take classical languages, supplementary basic education, technology, and so on. As these choices are not independent from student ability (e.g., average ability levels are the highest among students opting for classical languages; Magez & Bos, 2015), grouping students according to option choice implies a certain homogenization of classes, which might however be less thorough compared to educational systems with a more explicit reliance on ability grouping (e.g., systems in which students are assigned to tracks based on their performance on standardized tests). This variation in class composition offers an interesting opportunity to investigate the association between class heterogeneity and class mean level and student achievement.

¹ Students who did not successfully complete primary education and students with severe learning disabilities are usually referred to distinct programs (vocational preparatory or special education programs).

Method

Participants and Procedure

This study used data from a large longitudinal study following 3,409 Flemish secondary students in 27 schools through Grades 7 and 8. In this article, we considered student academic development throughout Grade 8. The study was approved by the Ethical Committee of KU Leuven. Prior to conducting the study, we obtained informed consent from students, their parents, and their teachers. Out of the original sample, 469 students left the school before the end of Grade 8. An additional 45 students switched between classes throughout Grade 8 and were discarded from the analysis. This left us with an analytic sample consisting of 2,895 students from 158 classes (48.9% boys, mean age at the end of the study: 14.0 years).

Within each Grade 8 class, the standard deviation and the mean achievement of all class members were determined. This was done by using students’ average standardized score on two standardized mathematics tests administered in fall (November) and spring (May) of Grade 7. By combining scores on two previous measurement waves, we aimed at minimizing measurement error (Dicke et al., 2018). Subsequent individual achievement was measured using a standardized math test at the end of Grade 8 (May 2019). In the fall of Grade 8 (November 2018), students completed a survey to assess the perceived fit between capacities and schoolwork difficulty and to measure the degree of differentiation implemented by the math teacher; mentor teachers were surveyed on the academic orientation of their class.

All data and analysis scripts have been made publicly available at Open Science Framework and can be accessed at https://osf.io/wchqm/.

Instruments

Standardized Mathematics Test

Items were adopted from the LiSO-test, a standardized mathematics test lasting two hours that has demonstrated good reliability and validity (Dockx et al., 2017). IRT-analyses confirmed that the test measured one underlying math competency and that the test had good reliability (α = .89). On the basis of these IRT-analyses, raw test scores were converted into math achievement scores using weighted likelihood estimation (Warm, 1989). Scores were standardized for each test.

Intelligence

General intelligence was measured at the beginning of the study (October 2017). Each student completed a cognitive test (CoVaT-CHC) measuring both fluid and crystallized intelligence (Magez et al., 2015). The test, which builds on other validated cognitive tests (Magez, 2015) within a CHC model of intelligence (Horn & Cattell, 1966), has demonstrated both content validity (Tierens, 2015) and criterion validity (Magez & Bos, 2015). An IQ score for each student was calculated based on a comparison of test results with a

representative norming sample, resulting in a score with population mean 100 and standard deviation 15 (Tierens & Magez, 2016).

Gender

This was coded with males as the reference category (= 0).

Social Background

Social background was measured with three dichotomous measures delivered by a government database. Each of these measures indicated an aspect of coming from a disadvantaged background, in particular (1) whether the student was entitled to a school allowance, (2) whether the mother of the student did not complete secondary school, and (3) whether the student lived in a neighborhood with strong educational disadvantage. On each of the measures, values equal to 1 indicated social disadvantage (reference: 0).

Differentiated Instruction

Students completed a 5-item scale measuring the degree of differentiated instruction implemented by their math teachers. Three items were derived from a scale documented by Bos et al. (2005), which assesses the degree to which teachers tune tasks to student capacities (e.g., “My math teacher gives students with different capacities different tasks”). To increase the reliability of the scale, two items were added (“My math teacher succeeds in instructing students with different capacities differently”; “My math teacher succeeds in letting students with different capacities help each other”). All items were scored by students on a 5-point Likert scale. The reliability of the 5-item scale was acceptable (α = .76). ICC(2) at the item-level ranged between .586 and .828, which is considered to be good to excellent and allows for aggregation at the class level. Similarly, a multilevel CFA demonstrated that 30.11% of the variance of the latent construct was at the class level (the ICC of the manifest scale average was 23.3%). For each class, an average value was calculated representing the use of differentiated instruction by the class math teacher.

Academic Orientation of the Class

Mentors of each class evaluated the academic orientation of their class with 4 items from a scale developed in earlier research (Dockx et al., 2015) (e.g., “For students in this class, learning something is very important”). All items were scored by the mentor on a 5-point Likert scale. Internal reliability was high (α = .88).

Missing Data

Of the 2,895 students from 158 classes in the sample, one class with 20 students did not participate in the final mathematics test due to time shortage. In addition, 112 students (3.9% of the sample) were absent at the time of the test administration, mostly due to illness. This left a final analytic sample of 2,763 students from 157 classes. A share of the students did not complete the items on perceived educational fit (7.3%) and differentiated instruction (8.9%); values on these items were, however, averaged within classes. The mentor teachers of 43 classes with, in total, 745 students did not complete the teacher survey measuring class academic orientation. To ensure maximal use of the sample, full information maximum likelihood (FIML) estimation was used. This means that parameters were estimated by maximizing a likelihood function that indicates the probability to observe the available data. Hence, with this method, observations with missing values do not have to be discarded, and all available information is used to estimate the model.

Statistical Strategy

Within each class, the standard deviation and the mean of the previous math achievement of all class members were calculated to represent class composition characteristics (excluding the students’ own contribution to assess class mean level).

Measurement Error, Sampling Error, and Phantom Effects

Considering that our research questions focus on student-level latent variables that are aggregated to the class level, we need to simultaneously correct for measurement error and sampling error. The measurement error is caused by the indicators of a latent variable being unreliable measures of that latent variable. The sampling error of each class’s aggregate is caused by the limited number of students per class (16.3 on average in our study). If the measurement error of a latent variable is unaccounted for in a regression model, the regression coefficient estimates will be biased toward zero (attenuation bias; Charles, 2005). If the sampling error of the finite number of students per class is unaccounted for, the between-class variance estimate will be positively biased. Structural equation modeling makes it possible to model and account for the measurement error by specifying how well the indicators measure a latent variable. In a similar way, multilevel modeling makes it possible to model and account for the sampling error of student-level variables that are aggregated to the class level, based on the number of students and the variance partitioning coefficient by using a shrinkage estimator (Greenland, 2000). Harker and Tymms (2004) have shown that when both measurement error and sampling error of classes are present when estimating composition effects, phantom effects can occur. These are positive composition effects that result from the measurement error and sampling error (Dicke et al., 2018). Accordingly, to simultaneously correct for measurement error and sampling error, multilevel structural equation modeling (MSEM) should be used. In research on the big-fish-little-pond effect, such a model has been referred to as a doubly latent model (e.g., Lüdtke et al., 2011; Marsh et al., 2009).

Using Measurement Models to Account for Measurement Error

In MSEM, measurement error of latent variables is accounted for by specifying measurement models of latent variables that have a good fit to the data (Baumgartner & Steenkamp, 2006; Cheung & Rensvold, 2002). Which measurement models fit best must be determined before specifying the structural equation model. Hence, we needed to assess how well a multilevel factor model for each latent variable reproduces the observed covariance matrix of a latent variable’s indicators. Three fit indices were used to examine the fit: the comparative fit index (CFI), the Tucker–Lewis index (TLI), and the root mean square error of approximation (RMSEA). We followed Hu and Bentler’s (1999) cutoff criteria. This was tested for differentiated instruction and class academic orientation. For differentiated instruction, the indicators
were allowed to vary at the student and class level; for academic orientation of the class, the indicators were only allowed to vary at the class level. The latent variables’ means and variances were respectively constrained to zero and one at the class level. Differentiated instruction’s variance was unconstrained at the student level, whereas class academic orientation’s variance at the student level was constrained to zero. The factor loadings of the indicators were freely estimated, but constrained to equality across the student and class level.

Satisfactory model fit was achieved for differentiated instruction (RMSEA = .05, CFI = .96, TLI = .97). To achieve this model fit, it was necessary to let the residuals of item 3 and item 4 covary freely, both at the student level and class level. Satisfactory model fit was achieved for academic orientation of the class (RMSEA = .00, CFI = 1.00, TLI = 1.00). To achieve this model fit, it was necessary to let the residuals of item 1 and item 4 covary freely. These measurement models were incorporated into the structural equation models.

However, for the standardized mathematics tests of our baseline and final measurements, we were unable to incorporate their measurement models into the analyses, because they have too many parameters. This was also the case for the cognitive ability test. Instead, we used these variables’ estimated reliabilities to include a single-indicator reliability correction (Hayduk & Littvay, 2012) when specifying the multilevel structural equation model. Assuming that measurement error was situated mostly at the individual level, we incorporated these reliability estimates from these measurement models into the structural equation models (Geldhof et al., 2014).

Multilevel Structural Equation Models

We specified three multilevel structural equation models. The first model estimated math achievement at the end of Grade 8 as a function of prior student-level achievement, gender, cognitive ability and social background at the individual level and class mean level and the class’s variability in achievement at the class level. In the second model, interaction terms between individual achievement and class composition characteristics were added. Third, a multilevel path model was specified. In this model, indirect paths from class compositional characteristics to achievement were also included, that is, accommodating for the possible mediation of the association between class composition and achievement through both features of the learning environment. To accommodate for the potentially complex relationships between different features of class composition, we estimated a saturated model, that is, a model including all hypothesized paths between class-level predictors, class-level mediators, and achievement.

Estimation

These models were specified in Mplus 8, and full information maximum likelihood was used for parameter estimation (Muthén & Muthén, 2015). Variables were standardized (mean: 0, standard deviation: 1) according to the guidelines put forward by Enders and Tofighi (2007, p. 136), who suggested grand mean centering of Level 1 predictors when one is primarily interested in effects of Level 2 predictors, and centering within cluster when cross-level interactions are to be examined. Hence, in this study, grand mean centering was used in all analyses except for the analysis addressing RQ2; for the latter analysis, achievement was standardized within each class in order to investigate its interaction with class composition characteristics. All latent variables’ means and variances were respectively constrained to zero and one at the class level. For Level 1 predictors, standardized coefficients were calculated using the STDYX option in Mplus. For Level 2 predictors in a multilevel doubly latent model, however, this option appears to produce potentially unreliable standardized coefficients (see Marsh et al., 2009; Parker et al., 2013; Pinxten et al., 2015). To accommodate for this, and following recommendations by Marsh et al. (2009, pp. 791–792), we standardized the estimates of class-level predictors with respect to the between variance of the predictor and the total (within + between) variance of the criterion (see also Pinxten et al., 2015). In addition, following Lorah (2018), for the class-level predictors of interest (i.e., those with a significant association with achievement), we additionally reported Cohen’s f², which reflects the proportion of variance explained by the given effect relative to the proportion of outcome variance unexplained. Cohen’s f² was considered small at a value of .02, medium at a value of .15, and large at a value of .35 (Lorah, 2018). To assess the magnitude of indirect effects, the completely standardized indirect effect (CSIE) was calculated (Preacher & Kelley, 2011), in combination with a Sobel test to test statistical significance of the indirect effect.

Results

Descriptive Statistics

Table 1 presents the means, standard deviations and correlations between the study variables. Math achievement at the end of Grade 8 was strongly related to previous achievement (r = .77) and intelligence (r = .59), for which we control when assessing relationships between class level variables and individual achievement. Class mean achievement and class heterogeneity proved to be moderately negatively related to each other (r = –.32), showing that classes with higher average math achievement tend to be more homogeneous as well. Differentiated instruction was moderately positively related (r = .41) to class heterogeneity: In classes with a larger variance in student capacities, teachers make more use of differentiated instruction according to their students. Finally, class academic orientation was moderately to strongly positively related to class mean achievement (r = .49), suggesting that the climate in classes with high initial achievement was more oriented toward learning.

Research Objective 1

Table 2 presents the results from a multilevel structural model predicting individual math achievement at the end of Grade 8 as a function of class heterogeneity and class mean, controlling for individual previous achievement, intelligence, gender, and social background at the individual level, and gender and social composition at the class level. Compared to an empty multilevel model, the model explained large parts of the variance in achievement scores, with 61.2% of the variance at the individual level and 75.0% of the variance at the class level attributed to the predictors included in the model. At the individual level, previous achievement in Grade 7 was most strongly associated with math achievement in Grade 8, with intelligence additionally predicting achievement. Gender (ref.: male) was positively associated with achievement,
Table 1
Means, Standard Deviations, and Correlations Between Study Variables

                                                     Correlations
Variable                              M       SD     1     2     3     4     5     6     7     8     9     10    11    12
Individual level variables
1 Math achievement (G8)               0       1      —
2 Math achievement (G7)               0.01    0.95   .77*  —
3 Intelligence                        104.43  14.03  .59*  .71*  —
4 Female                              0.51    0.5    .03   .11   .18*  —
5 Disadvantaged neighbourhood         0.15    0.36   .01   .04   .10*  .01*  —
6 Low educated mother                 0.11    0.31   .16*  .18*  .20*  .06*  .17*  —
7 School allowance                    0.18    0.39   .17*  .19*  .19*  .03*  .14   .32*  —
Class level variables
8 Class mean                          0.01    0.64   .59*  .63*  .49*  .06*  .04*  .20   .17*  —
9 Class heterogeneity                 0.70    0.15   .17*  .21*  .13*  .02*  .08   .08*  .05*  .32*  —
10 Female proportion                  0.51    0.2    .06*  .11*  .15*  .39*  .06*  .05*  .05*  .16*  .05*  —
11 Class size                         19.34   3.75   .31*  .33*  .25*  .04*  .08   .14*  .11*  .48*  .08*  .10*  —
12 Use of differentiated instruction  2.51    0.49   .18*  .26*  .17*  .03*  .02   .14   .14*  .38*  .41*  .06*  .23*  —
13 Academic orientation               3.76    0.65   .37*  .34*  .28*  .05*  .02*  .11   .09*  .49*  .03*  .13   .32*  .16*
* p < .05.
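
The two class composition variables in Table 1 and the two centering schemes described under Estimation can be illustrated with a small sketch. It assumes that class heterogeneity is computed as the within-class standard deviation of prior achievement (the text only says "between-student variability", so the exact estimator is an assumption), and the data layout and function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def class_composition(records):
    """records: list of (class_id, prior_score) tuples (hypothetical layout).

    Returns, per class, the class mean and an assumed heterogeneity measure
    (population standard deviation of prior scores within the class).
    """
    by_class = defaultdict(list)
    for class_id, score in records:
        by_class[class_id].append(score)
    return {
        c: {"mean": mean(scores), "heterogeneity": pstdev(scores)}
        for c, scores in by_class.items()
    }


def center_within_class(records):
    """Centering within cluster: subtract each student's own class mean."""
    comp = class_composition(records)
    return [(c, s - comp[c]["mean"]) for c, s in records]


def center_grand_mean(records):
    """Grand mean centering: subtract the mean over all students."""
    grand = mean(s for _, s in records)
    return [(c, s - grand) for c, s in records]


# Toy data: class A is heterogeneous, class B is perfectly homogeneous.
data = [("A", 1.0), ("A", 3.0), ("B", 4.0), ("B", 4.0)]
comp = class_composition(data)
within = center_within_class(data)
grand = center_grand_mean(data)
```

Within-class centering removes all between-class differences from the individual score, which is why it suits the cross-level interaction analysis, whereas grand mean centering keeps them.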

meaning that at the end of Grade 8, girls outperformed boys who were at a similar level of achievement in Grade 7. Having a lower educated mother and receiving a school allowance negatively predicted individual achievement. At the class level, the mean level of the class was significantly positively associated² with subsequent individual achievement at the end of the school year, supporting H1. When standardized according to guidelines by Marsh et al. (2009), the effect size equaled .335, while Cohen’s f² (Lorah, 2018) equaled .417, which is to be interpreted as a large effect. By contrast, class heterogeneity was not significantly associated with achievement at the end of Grade 8 (RQ1).

² We also added a quadratic effect for class mean; however, this quadratic term was non-significant (b = –0.166, SE = 0.088, b/SE = –1.887, p = .059).

Research Objective 2

Next, two interaction effects between individual achievement and class composition characteristics were added. This addition showed that class mean level both had a positive main effect (b = .839, SE = .348, b/SE = 2.409, p = .016) and a significantly positive interaction effect with individual level (b = .074, SE = .019, b/SE = 3.896, p < .001), indicating that high achievers benefited the most from being in a class with a high mean level, supporting H2. To illustrate this interaction effect, Figure 1 plots the effect of being in a high mean level class for three hypothetical students with different levels of individual ability (i.e., one student achieving two standard deviations below the grand mean, one student achieving at the grand mean, and one student achieving two standard deviations above the grand mean). Hence, Figure 1 shows how being in a class with a high mean level is beneficial to all students, but somewhat more for high achieving students.

Class heterogeneity did not have a main effect on individual achievement (b = .025, SE = .030, b/SE = .831, p = .406). However, a positive interaction between class heterogeneity and individual achievement just reached significance (b = .114, SE = .057, b/SE = 2.010, p = .044), suggesting that for high but not for low achievers being in a heterogeneous class could be slightly beneficial. Hence, H3, which hypothesized a negative interaction effect, could not be supported.

Research Objective 3

Finally, Table 3 presents the results of a multilevel structural and saturated model with paths between class composition and achievement through use of differentiated instruction and the academic orientation of the class (information on the measurement part of this model can be found in the Appendix). Direct effects of individual and class-level predictors were in line with previous results. Regarding the paths through the learning environment mediators, class mean level was positively associated with the academic orientation of the class, supporting H4. However, the relationship between academic orientation of the class and individual achievement did not reach significance, failing to provide statistical support for H5. The completely standardized indirect effect of the path from class mean to achievement through academic orientation equaled .049, with a Sobel test indicating nonsignificance (z = 1.383, p = .167). Class heterogeneity was positively related to the use of differentiated instruction, supporting H6. In turn, differentiated instruction led to higher achievement at the end of the year, net of preexisting differences, supporting H7. Cohen’s f² for the association between differentiated instruction and achievement equaled .037, which is to be interpreted as a small to medium effect (Lorah, 2018). The completely standardized indirect effect of the path from class heterogeneity to achievement through differentiated instruction was .054, which was significant at the 5% level (z = 2.222, p = .027). Beyond these two expected paths, the saturated model also yielded smaller but significant associations between class heterogeneity and academic
Table 2
Structural Multilevel Model Predicting Individual Math Achievement in Terms of Individual and Class Composition Characteristics

Predictor                        b          SE      b/SE     p
Individual level
  Previous achievement           0.730***   0.027   27.039   .000
  Intelligence                   0.061*     0.029   2.069    .019
  Female                         0.077***   0.016   4.755    .000
  Disadvantaged neighborhood     0.014      0.017   0.800    .212
  Low educated mother            0.034*     0.017   2.004    .023
  School allowance               0.034*     0.015   2.214    .012
Class level
  Class mean                     0.335***   0.075   4.456    .000
  Class heterogeneity            0.072      0.043   1.687    .092
  Female proportion              0.008      0.049   0.168    .866
  Class size                     0.035      0.059   0.589    .556
Model fit
  χ²                             2,997.069
  df                             24
  CFI                            0.923
  RMSEA                          0.077
  SRMR
    Within                       0.090
    Between                      0.172
Variance components
  Individual level               0.388
  Class level                    0.250
Variance explained
  Individual level               0.612
  Class level                    0.750
Sample size
  Number of students             2,736
  Number of classes              156
Note. Standardized estimates. CFI = comparative fit index; RMSEA = root mean square error of approximation; SRMR = standardized root mean squared residual.
* p < .05. *** p < .001.
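
As a rough illustration of two quantities used around Table 2, the sketch below shows (a) how the "variance explained" rows relate to the residual variance components, assuming the latter are expressed as proportions of the corresponding empty-model variance, and (b) a common form of Cohen's f² for a single effect, with the size labels taken from the cutoffs cited in the text (.02, .15, .35). The function names and the R² inputs are hypothetical; the reduced-model R² needed to reproduce the reported f² values is not given in the article.

```python
def variance_explained(residual_variance, total_variance=1.0):
    """Share of variance explained, given the residual (unexplained) variance.

    Assumes residual_variance is on the same scale as total_variance
    (here: standardized, so the total defaults to 1).
    """
    return 1.0 - residual_variance / total_variance


def cohens_f2(r2_full, r2_reduced):
    """Local effect size: variance uniquely explained by one effect,
    relative to the variance left unexplained by the full model."""
    return (r2_full - r2_reduced) / (1.0 - r2_full)


def label_f2(f2):
    """Verbal label using the cutoffs cited in the text (.02, .15, .35)."""
    if f2 >= 0.35:
        return "large"
    if f2 >= 0.15:
        return "medium"
    if f2 >= 0.02:
        return "small"
    return "negligible"


# Residual variance components from Table 2 (standardized).
individual_explained = variance_explained(0.388)
class_explained = variance_explained(0.250)

# Hypothetical R-squared values, for illustration only.
example_f2 = cohens_f2(r2_full=0.5, r2_reduced=0.4)
```

Under these assumptions the Table 2 components reproduce the reported 61.2% and 75.0%, and the f² values quoted in the text (.417 and .037) fall in the "large" and "small" bands of this simple labeling.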

orientation (positively) and between class mean level and differentiated instruction (negatively).

Finally, an interaction term between class heterogeneity and differentiated instruction was added. While the interaction analysis model confirmed the positive main effect of differentiated instruction on individual achievement, the interaction term with class heterogeneity did not reach significance (b = –.010, SE = .032, b/SE = –.297, p = .380). Hence, differentiated instruction seemed to benefit students in all classes, irrespective of the extent of between-student differences.

Discussion

Although heterogeneous classes have been argued to depress student achievement because in such classes schoolwork difficulty would be less aligned with student capacities (Kulik & Kulik, 1982), research on between-class grouping, that is, the sorting of students in distinct classes according to ability, has yielded inconclusive results (Steenbergen-Hu et al., 2016). In this study, we tested whether the class mean achievement level and class heterogeneity, operationalized as the between-student variability in previous math achievement, were associated with subsequent math achievement at the end of Grade 8, using data on 2,895 students from 158 classes. In addition, we examined whether these associations were different between low and high achieving students. Finally, we investigated whether such associations could be explained through two features of the class learning environment, in particular, through the use of differentiated instruction and the academic orientation of the class.

The first class composition characteristic that depends on grouping arrangements is the class mean level. In line with earlier research (Reynolds et al., 2014), this study found students to greatly benefit from having high-achieving peers, after accounting for various predictors at the individual (previous achievement, intelligence, gender, social background) and class level (gender and social composition). While this appeared to be an asset for all students, in particular high achieving students benefited from being in a high achieving class. This might be due to the fact that in classes with a high mean level, instruction can be cognitively demanding, which would primarily suit the needs of high ability students (Preckel et al., 2010). Our finding is also in line with earlier indications on the potential of gifted classes and gifted programs to cater for the needs of high ability students (Bailey et al., 2012; Rogers, 2007). In particular, research on gifted education has emphasized that high ability students risk being underchallenged at school (Acee et al., 2010), due to limited fit between the needs of these students and the learning environment in regular classes (Gallagher et al., 1997; Kanevsky & Keighley, 2003). This may lead to disengagement and underachievement among gifted students (Obergriesser & Stoeger, 2015; Reis & McCoach, 2000; Snyder & Linnenbrink-
Figure 1
Graphic Representation of Interaction Effect Between Individual Achievement and Class Mean Level

Note. Predicted achievement as a function of class mean level for three students’ profiles with different levels
of initial achievement (2 standard deviations below group mean, at the group mean, and 2 standard deviations
above group mean). See the online article for the color version of this figure.
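
The three profiles shown in Figure 1 can be approximated with a simple-slopes calculation. The sketch below is an illustration only: it assumes a linear predictor in which the effect of class mean level for a student with within-class standardized achievement z equals b_main + b_interaction × z, and it plugs in the coefficients reported under Research Objective 2 (.839 and .074). How the variables were scaled in the fitted model is not fully specified in the text, so the resulting numbers should be read as indicative.

```python
# Coefficients reported in the text for Research Objective 2.
B_CLASS_MEAN = 0.839   # main effect of class mean level
B_INTERACTION = 0.074  # class mean x individual achievement


def class_mean_slope(z_individual, b_main=B_CLASS_MEAN, b_int=B_INTERACTION):
    """Simple slope of class mean level for a student whose (standardized)
    achievement is z_individual, under the assumed linear specification."""
    return b_main + b_int * z_individual


# The three hypothetical student profiles from Figure 1.
profiles = {"2 SD below": -2.0, "at the mean": 0.0, "2 SD above": 2.0}
slopes = {name: round(class_mean_slope(z), 3) for name, z in profiles.items()}
```

All three slopes are positive and increase with individual achievement, which is the pattern the figure describes: a high class mean level is associated with higher achievement for every profile, somewhat more so for high achievers.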

Garcia, 2013). Hence, implementing gifted classes has been suggested as a solution to provide high ability students with adequately challenging material (Preckel et al., 2010). Gifted classes have indeed been shown to promote interest in school among gifted students (Vogl & Preckel, 2014; Zeidner & Schleyer, 1999), leading students in such classes to perform better than their counterparts in regular classes (Preckel et al., 2019).

This study then considered whether the positive association between class-average achievement and individual achievement could be explained through the academic orientation of the class, that is, the importance students attach to studying. Differentiation-polarization theory suggests that classes with a high concentration of high achievers might have a stronger emphasis on academic effort and performance, while the reverse would be true in classes with low mean achievement (Hargreaves, 1967). Indeed, we observed that the academic orientation of a class was related to its mean achievement level. However, the academic orientation of the class was not significantly associated with individual achievement. It should be noted that while class compositional characteristics and achievement were operationalized as subject-specific (i.e., measured with a math test), academic orientation was measured with reference to schoolwork in general. Motivational characteristics are generally known to have smaller effects on subject-specific outcomes when they are measured in a general instead of a subject-specific way (e.g., Steinmayr et al., 2019). More substantively, the fact that academic orientation did not significantly mediate the relationship between class-average achievement and subsequent individual achievement could also suggest that other features of the learning environment would be more powerful explanations of this relationship. For example, it has been argued that teachers in high achieving classes hold higher expectations about their students and that this could promote learning in such classes (and decrease learning in low achieving classes; Jussim & Harber, 2005; Kelly & Carbonaro, 2012; McGillicuddy & Devine, 2018).

The second class compositional characteristic that is affected by grouping is class heterogeneity. However, we did not find a main effect of class heterogeneity on subsequent math achievement. This finding corroborates and extends earlier research that did not find such an association either (Hanushek et al., 2003; Leiter, 1983; Sanders et al., 1997), although other studies have yielded divergent results (either positive or negative; Cheung & Rudowicz, 2003; Chiu et al., 2017; Duru-Bellat & Mingat, 1998; Fertig, 2003; Luyten & der Hoeven-van Doornum, 1995; Raitano & Vona, 2013; Vigdor & Nechyba, 2007). In addition, we found that the more students within a class differed in terms of previous achievement, the more their teachers were found to adapt their instruction to student needs. In particular, math teachers in heterogeneous classrooms were reported by their students to make more use of differentiated instruction, that is, identifying the needs of students and modifying instructional strategies and methods in accordance with these needs (Tomlinson, 2014). This more intensive use of differentiated instruction was found to positively relate to subsequent student performance.

These findings might corroborate earlier arguments that rather than reducing class heterogeneity through ability grouping, it might be how teachers respond to differences between students that is decisive for student achievement; in this sense, indeed, “grouping does not produce achievement: instruction does” (Gamoran, 1987, p. 341). Students are known to learn most when they
Table 3
Model Predicting Individual Math Achievement in Terms of Individual Characteristics and Class Composition, Through Features of the Class Learning Environment

Predictor                                            b          SE      b/SE     p
Individual level
  Previous achievement                               0.602***   0.019   31.109   .000
  Intelligence                                       0.140***   0.021   6.558    .000
  Female                                             0.078***   0.015   5.161    .000
  Disadvantaged neighborhood                         0.015      0.016   0.917    .180
  Low educated mother                                0.033*     0.016   1.993    .023
  School allowance                                   0.032*     0.015   2.166    .015
Class level
  Class mean                                         0.350***   0.089   3.921    .000
  Class heterogeneity                                0.032      0.067   0.486    .627
  Female proportion                                  0.015      0.050   0.302    .763
  Class size                                         0.041      0.056   0.735    .462
Paths to learning environment
  Class mean → Academic orientation                  0.561***   0.091   6.140    .000
  Class heterogeneity → Academic orientation         0.239**    0.082   2.930    .003
  Class mean → Differentiated instruction            0.253**    0.088   2.883    .004
  Class heterogeneity → Differentiated instruction   0.347***   0.093   3.744    .000
Paths from learning environment to achievement
  Academic orientation                               0.088      0.062   1.404    .160
  Differentiated instruction                         0.149**    0.054   2.749    .006
Model fit
  χ²                                                 5,003.010
  df                                                 136
  CFI                                                0.979
  RMSEA                                              0.019
  SRMR
    Within                                           0.033
    Between                                          0.086
Variance components
  Individual level                                   0.498
  Class level                                        0.205
Variance explained
  Individual level                                   0.502
  Class level                                        0.795
Sample size
  Number of students                                 2,736
  Number of classes                                  156
Note. Standardized estimates. CFI = comparative fit index; RMSEA = root mean square error of approximation; SRMR = standardized root mean squared residual.
* p < .05. ** p < .01. *** p < .001.
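
The Sobel tests reported in the text can be checked approximately from the path coefficients and standard errors in Table 3. The sketch below uses the first-order Sobel standard error, sqrt(b²·SE_a² + a²·SE_b²), and a two-sided normal p value; whether the authors used exactly this variant is an assumption, and the function name is hypothetical.

```python
from math import erfc, sqrt


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test for an indirect effect a*b.

    a, se_a: path from predictor to mediator and its standard error.
    b, se_b: path from mediator to outcome and its standard error.
    Returns (indirect_effect, z, two_sided_p).
    """
    indirect = a * b
    se_indirect = sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = indirect / se_indirect
    p = erfc(abs(z) / sqrt(2.0))  # two-sided p value under the standard normal
    return indirect, z, p


# Class mean -> academic orientation -> achievement (values from Table 3).
ind_orient, z_orient, p_orient = sobel_test(0.561, 0.091, 0.088, 0.062)

# Class heterogeneity -> differentiated instruction -> achievement (Table 3).
ind_diff, z_diff, p_diff = sobel_test(0.347, 0.093, 0.149, 0.054)
```

With the rounded Table 3 values this gives z of roughly 1.38 for the path through academic orientation and roughly 2.22 for the path through differentiated instruction, in line with the test statistics reported in the text; small deviations are expected because the inputs are rounded to three decimals.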

are provided with learning material that suits the level of their capacities, for example, challenges that they can meet with their teachers’ support, but not without (Lavrijsen et al., 2021). However, to provide appropriately challenging schoolwork, teachers’ instructional behavior seems to matter more than class heterogeneity (Fraser et al., 1987; Scheerens, 2000). The finding that, contrary to expectations, class heterogeneity was slightly more beneficial for high than for low achieving students, could also be interpreted in this regard: Possibly, in heterogeneous classes, teachers’ preparedness to use differentiated instruction could benefit high achieving students who are often reported to experience limited fit between capacities and schoolwork demands (e.g., Snyder & Linnenbrink-Garcia, 2013). Of note, however, any associations of achievement with class heterogeneity were markedly smaller than the associations with class mean level. In particular, while class heterogeneity was positively associated with differentiated instruction, which in turn was positively associated with individual achievement, the total effect of class heterogeneity on individual achievement did not reach significance (i.e., because of a nonsignificant but negative direct effect; see Table 2). Hence, the association between class heterogeneity and the use of differentiated instruction did not, overall, produce a distinguishable positive effect on individual achievement.

Still, overall, such findings may also help to understand why earlier research on ability grouping found that the effects of ability grouping were highly dependent on the particular form of ability grouping: For example, the second order meta-analysis by Steenbergen-Hu et al. (2016) did not find any clear advantage of between-class grouping (that is, placing students in different classes according to ability), while within-class grouping (that is, creating small, temporary, and flexible ability groups within a single class) did have a significant positive effect on student achievement. Indeed, within-class grouping is usually employed as an organizational format to support differentiated instruction (Prast et al., 2018); in such a case, forming more homogeneous student groups may increase, not reduce, teachers’ sensitivity to
between-student variability (Deunk et al., 2018). Given this emphasis on the learning environment, the present study may also help to understand the inconsistency in earlier studies on between-class grouping (e.g., Kulik & Kulik, 1982; Slavin, 1990). Indeed, studies comparing achievement in grouped and ungrouped classes, without taking into account how teachers responded to between-student differences (e.g., Duflo et al., 2011), may be difficult to generalize.

Finally, it should be noted that class mean and class heterogeneity were inversely correlated (r = –.32). This could be explained as follows: High average achievement classes are often only accessible to high achievers, and thus are fairly homogeneous, while there might be a diversity of reasons why students end up in low ability classes. For example, disengaged students have been found to sometimes opt for classes below their actual level of achievement (Van Praag et al., 2013); in addition, students from socially disadvantaged backgrounds have been found to be channeled toward low ability classes (Boone et al., 2018; Dustmann, 2004). In addition to this correlation between class mean and class heterogeneity, our saturated models showed that both characteristics were significantly associated with both features of the learning environment (i.e., class mean level predicting the use of differentiated instruction, and class heterogeneity predicting the academic orientation of the class). This entanglement of class mean level and class heterogeneity emphasizes the importance of taking both characteristics into account when considering the effects of grouping on student achievement.

Strengths, Limits, and Directions for Future Research

The present study added to the literature on class composition in three ways. First, instead of comparing associations between achievement and formal grouping arrangements, it directly measured the relevant classroom characteristics, that is, class heterogeneity and class mean achievement. Second, it considered differential effects of class composition on low and high achievers. Third, beyond merely establishing associations between compositional characteristics and achievement, it attempted to explain these associations by considering changes in the learning environment associated with class composition. In this way, this study complements the broader move in educational effectiveness research from “input-output” to “input-process-output” models (Reynolds et al., 2014). This study stood out by making use of a large-scale, longitudinal database consisting of 2,895 students from 158 classes, allowing us to reliably estimate the associations between a number of individual-level and class-level variables and individual achievement. Moreover, the study carefully took into account other individual and class-level characteristics related to achievement, such as previous achievement, gender, intelligence, and social background. Despite this effort, of course, we cannot exclude the possibility that other important covariates were not yet controlled for; as with any observational study on school or class composition effects, the results of the present study should thus be interpreted with caution.

Moreover, results from the present study have to be interpreted in the specific educational context in which students were followed. In Flemish Grade 8 classes, the degree of heterogeneity might have already been somewhat constrained (e.g., in comparison with fully comprehensive educational systems; Dupriez et al., 2008), as Flemish schools often group students according to student choices not independent of ability (Lavrijsen et al., 2014). It might be that with higher levels of heterogeneity, differentiated instruction ceases to be a feasible solution to manage differences between students (Chiu et al., 2017). More importantly, through their study choices, students may have had some influence over their sorting into classes. This may have introduced a selection bias in the analysis, as study choices might not only covary with achievement but also with other student factors not included in this study (e.g., engagement, interest, parental involvement, and so on). Hence, even when we carefully controlled for other important covariates (previous achievement, cognitive ability, gender, and SES), the specific context of Flemish secondary education may have affected our results (in particular, the strong association between class mean level and individual achievement).

In addition, this study considered only achievement in mathematics as the marker of student academic development. Future research could investigate how class composition affects other cognitive (e.g., achievement in language domains) and noncognitive (e.g., motivation) outcomes of education.

Finally, the present study focused only on a limited number of classroom processes associated with class composition. There might be additional mechanisms associated with class composition, such as teacher expectations (Kelly & Carbonaro, 2012), that were not considered in this study. In addition, to further elucidate the different mechanisms connecting class composition to individual achievement, it would also be worthwhile to directly assess the level of educational fit between student capacities and schoolwork requirements.³ A more detailed account of the classroom environment could help to understand how this environment affects achievement. In particular, more “rich and thick” descriptions (Reynolds et al., 2014) of the classroom environment could be derived from more qualitative approaches, beyond student reports, for example by assessing the diverse and subtle ways in which teachers respond to differences between students (Coubergs et al., 2017; Pablico et al., 2017).

³ While the present study included a researcher-developed measure of perceived educational fit, this measure was not included as a learning environment feature in the current analysis due to concerns over its multilevel and factorial structure.

Conclusion

In a sample of 2,895 Grade 8 students from 158 classes, math achievement was found to be positively associated with class mean level, after accounting for previous achievement, intelligence, gender, and social background. While this was the case for all students, in particular high achieving students benefited from being in a class with high average achievement. Class heterogeneity turned out to be unrelated to individual achievement. Path analyses showed that teachers in classes with large between-student variability appeared to be more sensitive to differences between students, and more differentiated instruction was positively associated with subsequent achievement. Hence,

teachers’ responses to student diversity seemed to matter more for student achievement than whether students are physically sorted in distinct classes according to ability.

References

Abraham, J. (1989). Testing Hargreaves’ and Lacey’s differentiation-polarisation theory in a setted comprehensive. The British Journal of Sociology, 40(1), 46–81. https://doi.org/10.2307/590290
Acee, T. W., Kim, H., Kim, H. J., Kim, J.-I., Chu, H.-N. R., Kim, M., & Wicker, F. W. (2010). Academic boredom in under- and over-challenging situations. Contemporary Educational Psychology, 35(1), 17–27. https://doi.org/10.1016/j.cedpsych.2009.08.002
Agirdag, O., Van Houtte, M., & Van Avermaet, P. (2012). Why does the ethnic and socio-economic composition of schools influence math achievement? The role of sense of futility and futility culture. European Sociological Review, 28(3), 366–378. https://doi.org/10.1093/esr/jcq070
Bailey, R., Pearce, G., Smith, C., Sutherland, M., Stack, N., Winstanley, C., & Dickenson, M. (2012). Improving the educational achievement of gifted and talented students: A systematic review. Talent Development & Excellence, 4(1), 33–48.
Baumgartner, H., & Steenkamp, J.-B. E. M. (2006). An extended paradigm for measurement analysis of marketing constructs applicable to panel data. Journal of Marketing Research, 43(3), 431–442. https://doi.org/10.1509/jmkr.43.3.431
Berends, M. (1995). Educational stratification and students’ social bonding to school. British Journal of Sociology of Education, 16(3), 327–351. https://doi.org/10.1080/0142569950160304
Betts, J. R., & Shkolnik, J. L. (2000). The effects of ability grouping on student achievement and resource allocation in secondary schools. Economics of Education Review, 19(1), 1–15. https://doi.org/10.1016/S0272-7757(98)00044-2
Boone, S., Seghers, M., & Van Houtte, M. (2018). Transition from primary to secondary education in a rigidly tracked system: The case of Flanders. In A. Tarabini & N. Ingram (Eds.), Educational choices, transitions and aspirations in Europe: Systemic, institutional and subjective challenges (pp. 53–70). Routledge. https://doi.org/10.4324/9781315102368-4
Bos, W., Lankes, E.-M., Prenzel, M., Schwippert, K., Valtin, R., Voss, A., & Walther, G. (2005). IGLU - Skalenhandbuch zur Dokumentation der Erhebungsinstrumente [Scale manual for the documentation of the survey instruments]. Waxmann.
Charles, E. P. (2005). The correction for attenuation due to measurement error: Clarifying concepts and creating confidence sets. Psychological Methods, 10(2), 206–226. https://doi.org/10.1037/1082-989X.10.2.206
Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9(2), 233–255. https://doi.org/10.1207/S15328007SEM0902_5
Cheung, C.-K., & Rudowicz, E. (2003). Academic outcomes of ability grouping among junior high school students in Hong Kong. The Journal of Educational Research, 96(4), 241–254. https://doi.org/10.1080/00220670309598813
Chiu, M. M., Chow, B. W.-Y., & Joh, S. W. (2017). Streaming, tracking and reading achievement: A multilevel analysis of students in 40 countries. Journal of Educational Psychology, 109(7), 915–934. https://doi.org/10.1037/edu0000188
Coubergs, C., Struyven, K., Vanthournout, G., & Engels, N. (2017). Measuring teachers’ perceptions about differentiated instruction: The DI-Quest instrument and model. Studies in Educational Evaluation, 53, 41–54. https://doi.org/10.1016/j.stueduc.2017.02.004
Deunk, M. I., Smale-Jacobse, A. E., de Boer, H., Doolaard, S., & Bosker, R. J. (2018). Effective differentiation practices: A systematic review and meta-analysis of studies on the cognitive effects of differentiation practices in primary education. Educational Research Review, 24, 31–54. https://doi.org/10.1016/j.edurev.2018.02.002
Dicke, T., Marsh, H. W., Parker, P. D., Pekrun, R., Guo, J., & Televantou, I. (2018). Effects of school-average achievement on individual self-concept and achievement: Unmasking phantom effects masquerading as true compositional effects. Journal of Educational Psychology, 110(8), 1112–1126. https://doi.org/10.1037/edu0000259
Dockx, J., Stevens, E., Custers, C., Fidlers, I., & De Fraine, B. (2015). LiSO-project - Vragenlijst voor vakleerkrachten februari 2014 - Technische rapportering [LiSO project: Questionnaire for subject teachers February 2014 - Technical reporting]. KU Leuven.
Dockx, J., Van den Branden, N., Stevens, E., Denies, K., & De Fraine, B. (2017). LiSO-project: Toetsen wiskunde 2013–2016. IRT-analyses [LiSO-project: Mathematics tests 2013–2016. IRT-analyses]. KU Leuven.
Duflo, E., Dupas, P., & Kremer, M. (2011). Peer effects, teacher incentives, and the impact of tracking: Evidence from a randomized evaluation in Kenya. The American Economic Review, 101(5), 1739–1774. https://doi.org/10.1257/aer.101.5.1739
Dupriez, V., Dumay, X., & Vause, A. (2008). How do school systems manage pupils’ heterogeneity? Comparative Education Review, 52(2), 245–273. https://doi.org/10.1086/528764
Duru-Bellat, M., & Mingat, A. (1998). Importance of ability grouping in French “colleges” and its impact upon pupils’ academic achievement. Educational Research and Evaluation, 4(4), 348–368. https://doi.org/10.1076/edre.4.4.348.6951
Dustmann, C. (2004). Parental background, secondary school track choice, and wages. Oxford Economic Papers, 56(2), 209–230. https://doi.org/10.1093/oep/gpf048
Enders, C. K., & Tofighi, D. (2007). Centering predictor variables in cross-sectional multilevel models: A new look at an old issue. Psychological Methods, 12(2), 121–138. https://doi.org/10.1037/1082-989X.12.2.121
Fertig, M. (2003). Educational production, endogenous peer group formation and class composition—Evidence from the PISA 2000 Study (IZA Discussion Paper No. 714). Institute for the Study of Labor.
Fraser, B. J., Walberg, H. J., Welch, W. W., & Hattie, J. A. (1987). Syntheses of educational productivity research. International Journal of Educational Research, 11(2), 147–252. https://doi.org/10.1016/0883-0355(87)90035-8
Fruehwirth, J. C. (2013). Identifying peer achievement spillovers: Implications for desegregation and the achievement gap. Quantitative Economics, 4(1), 85–124. https://doi.org/10.3982/QE93
Gallagher, J., Harradine, C. C., & Coleman, M. R. (1997). Challenge or boredom? Gifted students’ views on their schooling. Roeper Review, 19(3), 132–136. https://doi.org/10.1080/02783199709553808
Gamoran, A. (1987). Organization, instruction, and the effects of ability grouping: Comment on Slavin’s “Best-evidence synthesis.” Review of Educational Research, 57(3), 341–345. https://doi.org/10.3102/00346543057003341
Geldhof, G. J., Preacher, K. J., & Zyphur, M. J. (2014). Reliability estimation in a multilevel confirmatory factor analysis framework. Psychological Methods, 19(1), 72–91. https://doi.org/10.1037/a0032138
Graham, L. J., De Bruin, K., Lassig, C., & Spandagou, I. (2021). A scoping review of 20 years of research on differentiation: Investigating conceptualisation, characteristics, and methods used. Review of Education, 9(1), 161–198. https://doi.org/10.1002/rev3.3238
Greenland, S. (2000). Principles of multilevel modelling. International Journal of Epidemiology, 29(1), 158–167. https://doi.org/10.1093/ije/29.1.158
Hailikari, T., Nevgi, A., & Komulainen, E. (2008). Academic self-beliefs and prior knowledge as predictors of student achievement in mathematics: A structural model. Educational Psychology, 28(1), 59–71. https://doi.org/10.1080/01443410701413753
Hallinan, M. T., & Kubitschek, W. N. (1999). Curriculum differentiation and high school achievement. Social Psychology of Education, 3, 41–62. https://doi.org/10.1023/A:1009603706414

Hanushek, E. A., Kain, J. F., Markman, J. M., & Rivkin, S. G. (2003). Does peer ability affect student achievement? Journal of Applied Econometrics, 18(5), 527–544. https://doi.org/10.1002/jae.741
Hanushek, E. A., & Wößmann, L. W. (2006). Does educational tracking affect performance and inequality? Differences-in-differences evidence across countries. The Economic Journal, 116(510), 63–76. https://doi.org/10.1111/j.1468-0297.2006.01076.x
Hargreaves, D. (1967). Social relations in a secondary school. Routledge.
Harker, R., & Tymms, P. (2004). The effects of student composition on school outcomes. School Effectiveness and School Improvement, 15(2), 177–199. https://doi.org/10.1076/sesi.15.2.177.30432
Hattie, J. A. (2002). Classroom composition and peer effects. International Journal of Educational Research, 37(5), 449–481. https://doi.org/10.1016/S0883-0355(03)00015-6
Hayduk, L. A., & Littvay, L. (2012). Should researchers use single indicators, best indicators, or multiple indicators in structural equation models? BMC Medical Research Methodology, 12(1), Article 159. https://doi.org/10.1186/1471-2288-12-159
Henderson, N. D. (1989). A meta-analysis of ability grouping achievement and attitude in the elementary grades. Mississippi State University.
Hoffer, T. B. (1992). Middle school ability grouping and student achievement in science and mathematics. Educational Evaluation and Policy Analysis, 14(3), 205–227. https://doi.org/10.3102/01623737014003205
Horn, J. L., & Cattell, R. B. (1966). Refinement and test of the theory of fluid and crystallized general intelligences. Journal of Educational Psychology, 57(5), 253–270. https://doi.org/10.1037/h0023816
Hu, L., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6(1), 1–55. https://doi.org/10.1080/10705519909540118
Jussim, L., & Harber, K. D. (2005). Teacher expectations and self-fulfilling prophecies: Knowns and unknowns, resolved and unresolved controversies. Personality and Social Psychology Review, 9(2), 131–155. https://doi.org/10.1207/s15327957pspr0902_3
Kanevsky, L., & Keighley, T. (2003). To produce or not to produce? Understanding boredom and the honor in underachievement. Roeper Review, 26(1), 20–28. https://doi.org/10.1080/02783190309554235
Kelly, S., & Carbonaro, W. (2012). Curriculum tracking and teacher expectations: Evidence from discrepant course taking models. Social Psychology of Education, 15(3), 271–294. https://doi.org/10.1007/s11218-012-9182-6
Kindermann, T. A. (2007). Effects of naturally existing peer groups on changes in academic engagement in a cohort of sixth graders. Child Development, 78(4), 1186–1203. https://doi.org/10.1111/j.1467-8624.2007.01060.x
Kriegbaum, K., Becker, N., & Spinath, B. (2018). The relative importance of intelligence and motivation as predictors of school achievement: A meta-analysis. Educational Research Review, 25, 120–148. https://doi.org/10.1016/j.edurev.2018.10.001
Kulik, C.-L. C., & Kulik, J. A. (1982). Effects of ability grouping on secondary school students: A meta-analysis of evaluation findings. American Educational Research Journal, 19(3), 415–428. https://doi.org/10.3102/00028312019003415
Lavrijsen, J., & Nicaise, I. (2016). Educational tracking, inequality and performance: New evidence from a differences-in-differences technique. Research in Comparative and International Education, 11(3), 334–349. https://doi.org/10.1177/1745499916664818
Lavrijsen, J., Nicaise, I., & Poesen-Vandeputte, M. (2014). The Flemish education system in comparative perspective: A re-assessment of educational regime typologies. KU Leuven.
Lavrijsen, J., Preckel, F., Verachtert, P., Vansteenkiste, M., & Verschueren, K. (2021). Are motivational benefits of adequately challenging schoolwork related to students’ need for cognition, cognitive ability, or both? Personality and Individual Differences, 171, Article 110558. https://doi.org/10.1016/j.paid.2020.110558
Leiter, J. (1983). Classroom composition and achievement gains. Sociology of Education, 56(3), 126–132. https://doi.org/10.2307/2112381
Lorah, J. (2018). Effect size measures for multilevel models: Definition, interpretation, and TIMSS example. Large-Scale Assessments in Education, 6(1), Article 8. https://doi.org/10.1186/s40536-018-0061-2
Lou, Y., Abrami, P. C., Spence, J. C., Poulsen, C., Chambers, B., & d’Apollonia, S. (1996). Within-class grouping: A meta-analysis. Review of Educational Research, 66(4), 423–458. https://doi.org/10.3102/00346543066004423
Lüdtke, O., Marsh, H. W., Robitzsch, A., & Trautwein, U. (2011). A 2 × 2 taxonomy of multilevel latent contextual models: Accuracy-bias trade-offs in full and partial error correction models. Psychological Methods, 16(4), 444–467. https://doi.org/10.1037/a0024376
Luyten, H., & der Hoeven-van Doornum, V. (1995). Classroom composition and individual achievement: Effects of classroom composition and teacher goals in Dutch elementary education. Tijdschrift Voor Onderwijsresearch, 20(1), 42–62.
Magez, W. (2015). Biografische testvaliditeit CoVaT-CHC [Biographic test validity CoVaT-CHC]. Thomas More.
Magez, W., & Bos, A. (2015). Validiteit wederzijdse relatie schoolse criteria—testresultaten [Validity relationship educational criteria—Test results]. Thomas More.
Magez, W., Tierens, M., Van Huynegem, J., Van Parijs, K., Decaluwé, V., & Bos, A. (2015). CoVaT-CHC: Cognitieve vaardigheidstest volgens het CHC-model [CoVaT-CHC: Intelligence test based on the CHC-model]. Thomas More.
Malmberg, L.-E., & Trempała, J. (1997). Anticipated transition to adulthood: The effect of educational track, gender, and self-evaluation on Finnish and Polish adolescents’ future orientation. Journal of Youth and Adolescence, 26(5), 517–537. https://doi.org/10.1023/A:1024577805149
Marsh, H. W., Lüdtke, O., Robitzsch, A., Trautwein, U., Asparouhov, T., Muthén, B., & Nagengast, B. (2009). Doubly-latent models of school contextual effects: Integrating multilevel and structural equation approaches to control measurement and sampling error. Multivariate Behavioral Research, 44(6), 764–802. https://doi.org/10.1080/00273170903333665
McGillicuddy, D., & Devine, D. (2018). “Turned off” or “ready to fly”—Ability grouping as an act of symbolic violence in primary school. Teaching and Teacher Education, 70, 88–99. https://doi.org/10.1016/j.tate.2017.11.008
Muthén, B., & Muthén, L. (2015). Mplus statistical analysis with latent variables user’s guide.
Nye, B., Hedges, L. V., & Konstantopoulos, S. (2000). The effects of small classes on academic achievement: The results of the Tennessee class size experiment. American Educational Research Journal, 37(1), 123–151. https://doi.org/10.3102/00028312037001123
Obergriesser, S., & Stoeger, H. (2015). The role of emotions, motivation, and learning behavior in underachievement and results of an intervention. High Ability Studies, 26(1), 167–190. https://doi.org/10.1080/13598139.2015.1043003
Opdenakker, M.-C., & Van Damme, J. (2001). Relationship between school composition and characteristics of school process and their effect on mathematics achievement. British Educational Research Journal, 27(4), 407–432. https://doi.org/10.1080/01411920120071434
Opdenakker, M.-C., & Van Damme, J. (2006). Differences between secondary schools: A study about school context, group composition, school practice, and school effects with special attention to public and Catholic schools and types of schools. School Effectiveness and School Improvement, 17(1), 87–117. https://doi.org/10.1080/09243450500264457
Opdenakker, M.-C., Van Damme, J., De Fraine, D. F., Van Landeghem, G., & Onghena, P. (2002). The effect of schools and classes on mathematics achievement. School Effectiveness and School Improvement, 13(4), 399–427. https://doi.org/10.1076/sesi.13.4.399.10283

Pablico, J. R., Diack, M., & Lawson, A. (2017). Differentiated instruction in the high school science classroom: Qualitative and quantitative analysis. International Journal of Learning, Teaching and Educational Research, 16(7), 30–54.
Parker, P. D., Marsh, H. W., Lüdtke, O., & Trautwein, U. (2013). Differential school contextual effects for math and English: Integrating the big-fish-little-pond effect and the internal/external frame of reference. Learning and Instruction, 23, 78–89. https://doi.org/10.1016/j.learninstruc.2012.07.001
Pinxten, M., Wouters, S., Preckel, F., Niepel, C., De Fraine, B., & Verschueren, K. (2015). The formation of academic self-concept in elementary education: A unifying model for external and internal comparisons. Contemporary Educational Psychology, 41, 124–132. https://doi.org/10.1016/j.cedpsych.2014.12.003
Pozas, M., Letzel, V., & Schneider, C. (2020). Teachers and differentiated instruction: Exploring differentiation practices to address student diversity. Journal of Research in Special Educational Needs, 20, 217–230. https://doi.org/10.1111/1471-3802.12481
Prast, E. J., Van de Weijer-Bergsma, E., Kroesbergen, E. H., & Van Luit, J. E. (2018). Differentiated instruction in primary mathematics: Effects of teacher professional development on student achievement. Learning and Instruction, 54, 22–34. https://doi.org/10.1016/j.learninstruc.2018.01.009
Preacher, K. J., & Kelley, K. (2011). Effect size measures for mediation models: Quantitative strategies for communicating indirect effects. Psychological Methods, 16(2), 93–115. https://doi.org/10.1037/a0022658
Preckel, F., Götz, T., & Frenzel, A. (2010). Ability grouping of gifted students: Effects on academic self-concept and boredom. The British Journal of Educational Psychology, 80(Pt. 3), 451–472. https://doi.org/10.1348/000709909X480716
Preckel, F., Schmidt, I., Stumpf, E., Motschenbacher, M., Vogl, K., Scherrer, V., & Schneider, W. (2019). High-ability grouping: Benefits for gifted students’ achievement development without costs in academic self-concept. Child Development, 90(4), 1185–1201. https://doi.org/10.1111/cdev.12996
Raitano, M., & Vona, F. (2013). Peer heterogeneity, school tracking and students’ performances: Evidence from PISA 2006. Applied Economics, 45(32), 4516–4532. https://doi.org/10.1080/00036846.2013.791020
Rees, D. I., Brewer, D. J., & Argys, L. M. (2000). How should we measure the effect of ability grouping on student performance? Economics of Education Review, 19(1), 17–20. https://doi.org/10.1016/S0272-7757(98)00050-8
Reis, S. M., & McCoach, D. B. (2000). The underachievement of gifted students: What do we know and where do we go? Gifted Child Quarterly, 44(3), 152–170. https://doi.org/10.1177/001698620004400302
Reynolds, D., Sammons, P., De Fraine, B., Van Damme, J., Townsend, T., Teddlie, C., & Stringfield, S. (2014). Educational effectiveness research (EER): A state-of-the-art review. School Effectiveness and School Improvement, 25(2), 197–230. https://doi.org/10.1080/09243453.2014.885450
Rogers, K. B. (2007). Lessons learned about educating the gifted and talented: A synthesis of the research on educational practice. Gifted Child Quarterly, 51(4), 382–396. https://doi.org/10.1177/0016986207306324
Sanders, W. L., Wright, S. P., & Horn, S. P. (1997). Teacher and classroom context effects on student achievement: Implications for teacher evaluation. Journal of Personnel Evaluation in Education, 11(1), 57–67. https://doi.org/10.1023/A:1007999204543
Scheerens, J. (2000). Improving school effectiveness. UNESCO.
Schofield, J. W. (2010). International evidence on ability grouping with curriculum differentiation and the achievement gap in secondary schools. Teachers College Record, 112(5), 1492–1528.
Seidel, T., & Shavelson, R. J. (2007). Teaching effectiveness research in the past decade: The role of theory and research design in disentangling meta-analysis results. Review of Educational Research, 77(4), 454–499. https://doi.org/10.3102/0034654307310317
Shernoff, D. J., Csikszentmihalyi, M., Schneider, B., & Shernoff, E. S. (2014). Student engagement in high school classrooms from the perspective of flow theory. School Psychology Quarterly, 18(2), 158–176. https://doi.org/10.1521/scpq.18.2.158.21860
Sirin, S. R. (2005). Socioeconomic status and academic achievement: A meta-analytic review of research. Review of Educational Research, 75(3), 417–453. https://doi.org/10.3102/00346543075003417
Slavin, R. E. (1987). Ability grouping and student achievement in elementary schools: A best-evidence synthesis. Review of Educational Research, 57(3), 293–336. https://doi.org/10.3102/00346543057003293
Slavin, R. E. (1990). Achievement effects of ability grouping in secondary schools: A best-evidence synthesis. Review of Educational Research, 60(3), 471–499. https://doi.org/10.3102/00346543060003471
Smale-Jacobse, A. E., Meijer, A., Helms-Lorenz, M., & Maulana, R. (2019). Differentiated instruction in secondary education: A systematic review of research evidence. Frontiers in Psychology, 10, Article 2366.
Snyder, K. E., & Linnenbrink-Garcia, L. (2013). A developmental, person-centered approach to exploring multiple motivational pathways in gifted underachievement. Educational Psychologist, 48(4), 209–228. https://doi.org/10.1080/00461520.2013.835597
Stäbler, F., Dumont, H., Becker, M., & Baumert, J. (2017). What happens to the fish’s achievement in a little pond? A simultaneous analysis of class-average achievement effects on achievement and academic self-concept. Journal of Educational Psychology, 109(2), 191–207. https://doi.org/10.1037/edu0000135
Steenbergen-Hu, S., Makel, M. C., & Olszewski-Kubilius, P. (2016). What one hundred years of research says about the effects of ability grouping and acceleration on K–12 students’ academic achievement: Findings of two second-order meta-analyses. Review of Educational Research, 86(4), 849–899. https://doi.org/10.3102/0034654316675417
Steinmayr, R., Weidinger, A. F., Schwinger, M., & Spinath, B. (2019). The importance of students’ motivation for their academic achievement—Replicating and extending previous findings. Frontiers in Psychology, 10, Article 1730. https://doi.org/10.3389/fpsyg.2019.01730
Tierens, M. (2015). Onderzoeksrapport Constructvaliditeit [Research report: Construct validity]. Thomas More.
Tierens, M., & Magez, W. (2016). Onderzoeksrapport Normering [Research report: Norming]. Thomas More.
Tomlinson, C. A. (2014). The differentiated classroom: Responding to the needs of all learners. ASCD.
Van de Werfhorst, H. G., & Mijs, J. J. (2010). Achievement inequality and the institutional structure of educational systems: A comparative perspective. Annual Review of Sociology, 36, 407–428. https://doi.org/10.1146/annurev.soc.012809.102538
Van Houtte, M. (2016). Lower-track students’ sense of academic futility: Selection or effect? Journal of Sociology, 52(4), 874–889. https://doi.org/10.1177/1440783315600802
Van Houtte, M., & Stevens, P. A. (2008). Sense of futility: The missing link between track position and self-reported school misconduct. Youth & Society, 40(2), 245–264. https://doi.org/10.1177/0044118X08316251
Van Houtte, M., & Stevens, P. A. (2010). The culture of futility and its impact on study culture in technical/vocational schools in Belgium. Oxford Review of Education, 36(1), 23–43. https://doi.org/10.1080/03054980903481564
Van Praag, L., Van Houtte, M., Boone, S., & Stevens, P. (2013, August). The paradox of the cascade system: When homogeneous grouping leads to more heterogeneity in the classroom. 11th Conference of the European Sociological Association, Turin, Italy.
Vigdor, J., & Nechyba, T. (2007). Peer effects in North Carolina public schools. In L. Woessman & P. Peterson (Eds.), Schools and the equal opportunity problem (pp. 73–101). MIT Press.
Vogl, K., & Preckel, F. (2014). Full-time ability grouping of gifted students: Impacts on social self-concept and school-related attitudes. Gifted Child Quarterly, 58(1), 51–68. https://doi.org/10.1177/0016986213513795

Voyer, D., & Voyer, S. D. (2014). Gender differences in scholastic achievement: A meta-analysis. Psychological Bulletin, 140(4), 1174–1204. https://doi.org/10.1037/a0036620
Vygotsky, L. S. (1980). Mind in society: The development of higher psychological processes. Harvard University Press. https://doi.org/10.2307/j.ctvjf9vz4
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54(3), 427–450. https://doi.org/10.1007/BF02294627
Zeidner, M., & Schleyer, E. J. (1999). Evaluating the effects of full-time vs part-time educational programs for the gifted: Affective outcomes and policy considerations. Evaluation and Program Planning, 22(4), 413–427. https://doi.org/10.1016/S0149-7189(99)00027-0

Appendix

Table A1
Measurement Model (Item Level Estimates Relative to Table 3)

Predictor                     Estimate  SE      Est./SE   p-value
Academic orientation BY
  KTKLASG1_C                  0.698     0.093   7.493     0.000
  KTKLASG2_C                  0.714     0.073   9.716     0.000
  KTKLASG3_C                  0.615     0.079   7.758     0.000
  KTKLASG4_C                  0.782     0.084   9.276     0.000
KTKLASG1_C WITH
  KTKLASG4_C                  0.136     0.051   2.681     0.007
Differentiated instruction BY
  WLKDI1                      0.386     0.033   11.689    0.000
  WLKDI2                      0.416     0.035   11.794    0.000
  WLKDI3                      0.178     0.019   9.430     0.000
  WLKDI4                      0.195     0.020   9.995     0.000
  WLKDI5                      0.262     0.022   12.171    0.000
WLKDI3 WITH
  WLKDI4                      0.049     0.010   4.768     0.000
Intercepts
  KTKLASG1_C                  0.036     0.084   0.434     0.664
  KTKLASG2_C                  0.062     0.083   0.745     0.456
  KTKLASG3_C                  0.030     0.088   0.345     0.730
  KTKLASG4_C                  0.039     0.086   0.452     0.651
  WLKDI1                      0.002     0.036   0.048     0.962
  WLKDI2                      0.006     0.038   0.155     0.877
  WLKDI3                      0.008     0.031   0.247     0.805
  WLKDI4                      0.002     0.029   0.075     0.940
  WLKDI5                      0.012     0.033   0.379     0.705
Residual variances
  KTKLASG1_C                  0.302     0.093   3.232     0.001
  KTKLASG2_C                  0.320     0.064   4.978     0.000
  KTKLASG3_C                  0.456     0.064   7.162     0.000
  KTKLASG4_C                  0.186     0.075   2.482     0.013
  WLKDI1                      0.000     0.000   999.000   999.000
  WLKDI2                      0.000     0.000   999.000   999.000
  WLKDI3                      0.066     0.014   4.804     0.000
  WLKDI4                      0.050     0.009   5.215     0.000
  WLKDI5                      0.042     0.011   3.792     0.000
Note. Standardized estimates. KTKLASG = class academic orientation; WLKDI = use of differentiated instruction.
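
As a reading aid for Table A1, the following minimal Python sketch illustrates how the Est./SE and p-value columns may relate to each other. It assumes (this is not stated in the table) that Est./SE is treated as a z statistic and that the p-value is two-tailed under a standard normal distribution. The function names `wald_z` and `two_tailed_p` are illustrative only and are not part of the original analysis. Because the table reports rounded values, recomputing Est./SE from the printed Estimate and SE reproduces the reported ratio only approximately.

```python
import math


def wald_z(estimate: float, se: float) -> float:
    """Ratio of an estimate to its standard error (hypothetical helper)."""
    return estimate / se


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value of z under a standard normal (assumed test)."""
    return math.erfc(abs(z) / math.sqrt(2))


# (Est./SE, p-value) pairs copied from Table A1 as printed.
reported = {
    "KTKLASG1_C WITH KTKLASG4_C": (2.681, 0.007),
    "Intercept KTKLASG1_C": (0.434, 0.664),
    "Residual variance KTKLASG4_C": (2.482, 0.013),
}

for label, (z, p_table) in reported.items():
    p = two_tailed_p(z)
    print(f"{label}: z = {z:.3f}, recomputed p = {p:.3f}, table p = {p_table:.3f}")

# Recomputing the ratio from the rounded table entries is only approximate.
print(f"0.136 / 0.051 = {wald_z(0.136, 0.051):.3f} (table: 2.681)")
```

Under these assumptions the recomputed p-values agree with the printed ones to three decimals for the rows shown, which suggests the p-value column can be read as a two-tailed test of each estimate against zero.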

Received June 29, 2020
Revision received June 29, 2021
Accepted July 31, 2021
