
Measuring pre-service teachers' decision-making in classroom management: A video-based assessment approach

Article in Teaching and Teacher Education · January 2024


DOI: 10.1016/j.tate.2023.104426




Teaching and Teacher Education 138 (2024) 104426


Research paper

Measuring pre-service teachers’ decision-making in classroom management: A video-based assessment approach
Jonas Weyers ᵃ,*, Charlotte Kramer ᵃ, Kai Kaspar ᵇ, Johannes König ᵃ

ᵃ University of Cologne, Gronewaldstraße 2a, 50931, Köln, Germany
ᵇ University of Cologne, Herbert-Lewin-Str. 10, 50931, Köln, Germany

* Corresponding author.
E-mail address: jonas.weyers@uni-koeln.de (J. Weyers).

https://doi.org/10.1016/j.tate.2023.104426
Received 7 May 2023; Received in revised form 14 October 2023; Accepted 20 November 2023
0742-051X/© 2023 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

ARTICLE INFO

Keywords: Classroom management; Teacher expertise; Teacher noticing; Decision-making; Test; Video clips

ABSTRACT

This study presents a video-based test instrument entitled CME-Decide that focuses on pre-service teachers’ decision-making skills as part of their Classroom Management Expertise. A sample of 284 pre-service teachers viewed 12 short video clips of instructional practice and indicated after each how they would behave in this situation. Correlation and regression analyses revealed that the CME-Decide test was associated with perception/interpretation skills in classroom management, general pedagogical knowledge, teaching enthusiasm, and school-based learning opportunities but not with university-based learning opportunities in general pedagogy. These findings suggest that the CME-Decide test is appropriate for assessing pre-service teachers’ competence development.

1. Introduction

Current research increasingly emphasizes a set of so-called “situation-specific skills” (namely, perception, interpretation, and decision-making in the context of instructional practice) as a crucial part of teacher competence (Blömeke, Gustafsson, & Shavelson, 2015; Kaiser, Busse, Hoth, König, & Blömeke, 2015; Krauss et al., 2020). This theoretical bundle of three skills has inspired the development of performance-oriented testing in the field of teacher competence research (e.g., Kaiser et al., 2015), but also draws on concepts and findings from research on teacher expertise (see König et al., 2022).

Researchers have invested considerable effort in developing video-based test instruments to capture teachers’ situation-specific skills and to provide a measure of competence that more accurately captures teachers’ classroom behavior than traditional paper-and-pencil tests (Jamil, Sabol, Hamre, & Pianta, 2015; Kersting, 2008; Seidel & Stürmer, 2014). For such video-based assessments, teachers usually view classroom video clips and answer test questions related to the video material. With the expansion of school-based learning opportunities within university teacher education programs across several countries worldwide (e.g., Christiansen & Erixon, 2021), video-based assessments are already being used to capture pre-service teachers’ situation-specific skills as a learning outcome (Mertens & Gräsel, 2018; Stürmer, Seidel, & Schäfer, 2013; Wiens, Hessberg, LoCasale-Crouch, & DeCoster, 2013).

The importance of considering situation-specific skills is particularly evident regarding teachers’ ability to manage their classrooms. This ability is regarded as a central aspect of teacher competence, and several video-based instruments have been developed to assess perception and interpretation with respect to classroom management (Gold & Holodynski, 2017; Jamil et al., 2015; König, 2015). Regarding decision-making in classroom management, several studies have focused on differences between expertise groups (Stahnke & Blömeke, 2021; Wolff, Jarodzka, & Boshuizen, 2017; Wolff, van den Bogert, Jarodzka, & Boshuizen, 2015), on promoting decision-making through specific interventions (e.g., Demiraslan Çevik & Andre, 2014), and on qualitative analyses of teachers’ responses to classroom disruptions (Kasperski & Yariv, 2022; Lampert, Burnett, Comber, Ferguson, & Barnes, 2020). However, studies that include high-quality standardized testing of decision-making skills in classroom management are lacking (initial approaches can be found in Gippert, Hörter, Junker, &
Holodynski, 2022; see below).

This paper focuses on the concept of Classroom Management Expertise (CME), which can be seen as a specific conceptualization of teachers’ situation-specific skills in the area of classroom management. The CME concept is based on findings from expertise research and has been used as the basis for the video-based measurement of competence (see König, 2015; König et al., 2021).¹ Focusing on pre-service teachers’ decision-making skills as part of their CME, we present a newly developed video-based test instrument entitled ‘CME-Decide.’ The participants (a sample comprising secondary and vocational school pre-service teachers) viewed 12 short video clips and were asked to assume the teacher’s perspective and indicate how they would behave immediately after each observed situation. The purpose of this study is to examine the reliability of this instrument and to provide empirical evidence for the validity of intended test score interpretations by demonstrating plausible empirical associations with other relevant variables in university teacher education.

¹ The term competence is not used consistently, but primarily refers to malleable skills and characteristics that enable individuals to cope with specific professional demands (e.g., Baumert & Kunter, 2013). Research on teacher competence draws on findings from research on teacher expertise (e.g., Berliner, 1992), which also refers to skills in a professional context but has focused primarily on the characteristics of highly experienced teachers compared to novices. We situate this work within research on teacher competence. However, to highlight the connections with expertise research and to ensure consistency with previous work, we use the term Classroom Management Expertise with regard to our conceptualization of situation-specific skills.

2. Theoretical framework and state of research

As a general theoretical framework, we draw on Blömeke et al.’s (2015) competence model, in which competence is viewed as a continuum that ranges from dispositions (i.e., characteristics of a person), including cognition (e.g., professional knowledge) and affect–motivation (e.g., teachers’ motivational orientation), to observable behavior (e.g., teaching quality). Furthermore, the model includes three situation-specific skills (perception, interpretation, and decision-making), which refer to the cognitive processes in specific job situations (e.g., during instruction) and which are conceptualized as a link between disposition and performance.

Drawing on Blömeke et al.’s (2015) model, Fig. 1 illustrates the analytical model that underpins the test development and validation process for CME-Decide.² For the conceptualization of situation-specific skills, we highlight the concept of CME and explore empirical associations with pre-service teachers’ competence dispositions, that is, general pedagogical knowledge (GPK) for cognition and teaching enthusiasm for affect–motivation. Furthermore, following commonly used models (e.g., Kaiser & König, 2019; Kunter, Kleickmann, Klusmann, & Richter, 2013), the analytical model assumes that (pre-service) teachers use opportunity to learn (OTL) to acquire professional competence.

² While Blömeke et al. (2015) list perception and interpretation separately, the present study used a test instrument that measures these skills together. Therefore, perception and interpretation were combined for the analytical model (Fig. 1).

In the following, we shall focus first on situation-specific skills in the area of classroom management, with particular attention to decision-making skills as part of CME (König, 2015), and provide an overview of existing instruments. As the basis for our validation strategy, we also highlight the links between situation-specific skills and other key variables in teacher education, namely, cognition, affect–motivation, and OTL.

2.1. Situation-specific skills regarding classroom management

Classroom management is broadly defined as “the actions teachers take to create an environment that supports and facilitates both academic and social-emotional learning” (Evertson & Weinstein, 2006, p. 4). In this sense, classroom management involves the implementation of various strategies, such as establishing rules and routines, monitoring students’ activities, and dealing with disruptive behavior (Doyle, 2006). Knowledge about classroom management is considered to be part of teachers’ GPK (König, 2014; Voss, Kunter, & Baumert, 2011), and successful classroom management has been shown to predict both student achievement (Hattie, 2012; Wang, Haertel, & Walberg, 1993) and student motivation (Kunter, Baumert, & Köller, 2007). However, many pre-service teachers do not feel adequately prepared for the demands of classroom management (Jones, 2006; Maulana, Helms-Lorenz, & van de Grift, 2016).

Drawing on teacher expertise research (Berliner, 1992; Carter, Cushing, Sabers, Stein, & Berliner, 1988), we conceptualize teachers’ situation-specific skills in the area of classroom management as CME (König, 2015). Three cognitive demands associated with CME are conceptualized as indicators of expertise (König, 2015): (1) “accuracy of perception,” that is, teachers’ capacity to identify, categorize, and recall meaningful instructional details; (2) “holistic perception,” which denotes teachers’ ability to reconstruct the context of classroom events within a lesson, thereby anticipating the lesson’s further course; (3) “interpretation/justification of action,” which concerns teachers’ reasoning about the function of classroom events. CME is closely linked to the situation-specific skills outlined by Blömeke et al. (2015), with the first two cognitive demands belonging to perception and the third cognitive demand representing interpretation. In addition, there are clear parallels between CME and the construct of teacher noticing. Research on teacher noticing, also referred to as professional vision, addresses the specialized ways in which teachers perceive, interpret, and respond to instructional events (Choy & Dindyal, 2020; König et al., 2022), and also relates teachers’ cognitive processes to dimensions of instructional quality, including classroom management (Gold & Holodynski, 2017; Seidel & Stürmer, 2014).³

³ Overall, research on situation-specific skills is characterized by different terminologies for similar constructs (for example, noticing or professional vision). In the following, we use “situation-specific skills” as an umbrella term. When referring to individual studies, we use the terminology suggested there (for example, professional vision of classroom management).

The effectiveness of classroom management is highly dependent on how the teacher decides to respond (or not respond) to student behavior (Doyle, 2006). Moreover, teachers’ decision-making may also serve as a valid indicator of expertise. Experienced teachers are assumed to have more elaborate mental representations of teaching situations (so-called “scripts”), enhancing the accessibility of routinized responses (Borko, Roberts, & Shavelson, 2008; Borko & Shavelson, 1990). Empirical evidence also suggests that experienced teachers generate more alternative actions than novices when reflecting on videotaped classroom management events (Stahnke & Blömeke, 2021; Wolff et al., 2015, 2017).

Therefore, we extend the CME framework to include “interactive decision-making” (Borko & Shavelson, 1990), focusing on the decisions that the teachers make as they interact with their students during the course of the lesson. This is to be distinguished from, for example, decisions that teachers make in the course of lesson planning (Borko & Shavelson, 1990). Following Shavelson and Stern’s (1981) model of teachers’ interactive decision-making, teaching is primarily conceptualized as the performance of routines, that is, “recurring activities that become established within a particular classroom as predictable sequences or ‘scripts’ for teacher and student behavior” (Doyle, 1979, p. 61). The teacher delivers instruction in accordance with a mentally represented lesson plan, thereby monitoring student activities. When an
instructional event, such as disruptive behavior, exceeds the threshold of what the teacher deems acceptable, the teacher may respond by considering and initiating a routine (e.g., calling for quiet and attention) or, if no suitable routine is available, by generating a spontaneous response prior to reinitiating the teaching routine.

Fig. 1. Analytical model used in the present study.

2.2. Measurement of situation-specific skills

Several test instruments have been developed to measure situation-specific skills in a standardized way, commonly combining video clips of teaching practice with writing prompts or closed item formats (Weyers, König, Santagata, Scheiner, & Kaiser, 2023). However, few instruments developed to date have focused primarily on classroom management. Assessing professional vision of classroom management, Gold and Holodynski (2017) used a video-based measure to capture (pre-service) teachers’ ability to describe and interpret instructional events. Similarly, König’s (2015) video-based assessment of CME focuses on perception and interpretation (hereafter, the CME-PI test). The Video Assessment of Interactions and Learning (VAIL; Jamil et al., 2015; Wiens et al., 2013) examines teachers’ skills to identify effective teacher–student interactions in video clips, including a separate subscale to address “classroom organization.”

In terms of decision-making, Gold and Holodynski’s (2015) situational judgment test captures teachers’ strategic knowledge in classroom management. For this test, teachers read short descriptions of instructional situations and then evaluate the effectiveness of several possible responses using rating scales. However, the test takers are not required to actually formulate responses themselves. In Gippert et al.’s (2022) study, pre-service teachers watched a 4.5-min video clip and were asked (among other things) to generate at least one alternative action. Although this approach appears promising, the generation of alternative courses of action does not refer exclusively to interactive decision-making but also intersects with reflection on teaching.

2.3. Associated variables in university teacher education

Developing an assessment of interactive decision-making requires a careful validation process in which validity is understood as “the degree to which evidence and theory support the interpretations of test scores for proposed uses of tests” (American Educational Research Association; American Psychological Association; National Council on Measurement in Education, 2014, p. 11). Regarding video-based measures of pre-service teachers’ competence, plausible correlations between test scores and other relevant variables within teacher education offer a key source of validity evidence (e.g., Stürmer, Könings, & Seidel, 2015; Wiens et al., 2013). For CME-Decide, we first address the relationship between decision-making, further situation-specific skills (i.e., perception and interpretation), and other aspects of competence, i.e., cognition and affect–motivation (see Fig. 1, middle columns). Second, conceptualizing situation-specific skills as a learning outcome of teacher education, we focus on the association between situation-specific skills and learning opportunities (see Fig. 1, left column). Demonstrating associations between test scores and learning opportunities is a central part of the validation strategy. If test scores covary with the use of learning opportunities, this can be taken as evidence that the test allows for valid inferences about learning processes in the context of university teacher education. This is particularly important if the test is to be used for the purpose of evaluating interventions.


2.3.1. Situation-specific skills, cognition, and affect–motivation

Teachers’ situation-specific skills have been conceptualized as closely interrelated (Jacobs, Lamb, & Philipp, 2010; Kaiser et al., 2015). Focusing on mathematics teaching, Bastian, Kaiser, Meyer, Schwarz, and König (2022) identified medium to strong latent correlations between decision-making and perception (r = .462) as well as between decision-making and interpretation (r = .815).

Situation-specific skills are further assumed to be based on professional knowledge (Blömeke et al., 2015; Kaiser et al., 2015; Seidel & Stürmer, 2014). Accordingly, previous studies have repeatedly demonstrated associations between video-based measures and (paper-and-pencil) knowledge testing (Kersting, Givvin, Sotelo, & Stigler, 2010; König, Blömeke, et al., 2014; Meschede, Fiebranz, Möller, & Steffensky, 2017). Focusing on CME, König’s (2015) CME-PI test showed a positive correlation with GPK (r = .47). Similarly, Gold and Holodynski’s (2017) professional vision of classroom management test was positively correlated with strategic knowledge of classroom management (r = .42).

Following Blömeke et al. (2015), situation-specific skills have also been regarded as connected to affect–motivation. In the present study, we focus on teacher enthusiasm, defined as “an affective, person-specific characteristic that reflects the subjective experience of enjoyment, excitement, and pleasure, and that is manifested in certain behaviors in the classroom” (Kunter, Frenzel, Nagy, Baumert, & Pekrun, 2011, p. 290). Thus, teacher enthusiasm can be seen as a trait-like, dispositional characteristic that varies between teachers but also shows situation-specific (i.e., state-like) variance components (e.g., Gaspard & Lauermann, 2021). Although the association between affect–motivation and situation-specific skills has been studied rarely, there are theoretical and empirical arguments suggesting a positive relationship, especially for the domain of classroom management. For example, König (2015) reported negative correlations between CME-PI and teacher burnout scales. Kunter (2013) demonstrated that teachers who reported greater general enthusiasm for teaching exhibited more effective classroom management. Moreover, teacher enthusiasm is closely related to teacher emotions, which condition teachers’ instructional behavior, including classroom management strategies, whereby positive emotions (e.g., enjoyment) promote the availability of alternative actions (Fredrickson, 2001; Frenzel, 2014). Overall, we expect that teachers’ decision-making skills in classroom management will be positively related to enthusiasm.

2.3.2. Opportunity to learn

According to the educational concept of learning opportunities in teacher education (Floden, 2002; Schmidt, Cogan, & Houang, 2011), (pre-service) teachers’ acquisition of competence is substantially influenced by their use of OTL. OTLs, broadly defined as experiences aimed at achieving specific learning outcomes (Tatto et al., 2012), are created and implemented by teacher education institutions and policy makers to organize (prospective) teachers’ learning processes (Schmidt et al., 2008). Two types of OTL applied in teacher education may be distinguished: university OTLs (e.g., lectures) and in-school OTLs (e.g., guided teaching during school practicum) (Flores, 2016).

Few studies have investigated the relationship between OTL and situation-specific skills. Measures of professional vision have been shown to be associated with the number of pedagogical courses that pre-service teachers have completed (Stürmer et al., 2015) and their subject-specific study focus (Todorova, Sunder, Steffensky, & Möller, 2017). Studies have also demonstrated that participation in school internships (i.e., in-school OTL) supports the development of situation-specific skills (Mertens & Gräsel, 2018; Stürmer et al., 2013; Weber, Gold, Prilop, & Kleinknecht, 2018). However, some studies have detected no positive effects of in-school OTL (Gippert et al., 2022; Todorova et al., 2017; Wiens et al., 2013), suggesting that further research is necessary.

3. Context of the study and research questions

The CME-Decide has been designed to assess pre-service teachers’ decision-making skills in the context of classroom management for research and evaluation purposes. In addition to evaluating this instrument’s reliability, this study aims to provide validity evidence for two aspects of test score interpretations: first, test scores should be interpretable as indicative of pre-service teachers’ decision-making skills in classroom management as part of their professional competence. Second, test scores should be interpretable as a learning outcome of university teacher education. Three research questions structure the study, as follows.

RQ1. Does CME-Decide provide a reliable measure of decision-making skills in classroom management (1a) that is distinguishable from a video-based measure of perception/interpretation skills (1b)?

We assumed that the CME-Decide test would show sufficient reliability as a one-dimensional measure, which is a prerequisite for further analyses. Consistent with previous research suggesting the separability of perception, interpretation, and decision-making measures (e.g., Bastian et al., 2022; Gippert et al., 2022), we hypothesized that the CME-Decide test, focusing on decision-making, and the CME-PI test (König, 2015), focusing on perception/interpretation, would provide empirically separable measures, suggesting that two distinct constructs are being addressed. In line with the concept of situation-specific skills as closely interrelated processes, we expected a moderate to high correlation.

RQ2. To what extent does CME-Decide correlate with measures regarding teachers’ cognition (2a), that is, GPK assessed by means of a paper-and-pencil test, and regarding teachers’ affect–motivation (2b), that is, self-reported teaching enthusiasm?

We anticipated a positive correlation between CME-Decide and GPK assessed using a paper-and-pencil test, in accordance with findings based on the CME-PI test. Given the heterogeneity of previous studies’ results, no specific effect size was assumed. Furthermore, we hypothesized a positive correlation between CME-Decide and self-reported teaching enthusiasm. Again, no effect size was specified in light of the lack of previous findings and theoretical reference points.

RQ3. To what extent does the CME-Decide test relate to self-reported university OTL (3a) and in-school OTL (3b)?

We hypothesized that both university and in-school OTLs reported by pre-service teachers would be positively associated with the CME-Decide test scores; however, we expected small effect sizes, in line with existing findings. Moreover, the association between OTL and test scores is highly dependent on the sample’s characteristics (for example, the variance of OTL and test scores). Therefore, the available data were used to compare the effects of university and in-school OTLs on the CME-Decide test, the CME-PI test, and the GPK test using comparable samples. This approach also allows us to contextualize the effects with respect to the CME-Decide, identifying possible weaknesses but also the added value of the instrument compared to two established instruments.

4. Method

4.1. Sample and data collection

The sample used is part of a larger survey at the University of Cologne (project: Zukunftsstrategie Lehrer*innenbildung [Future Strategy for Teacher Education]), in which selected cohorts of pre-service teachers are surveyed annually throughout their studies. The survey focused on pre-service teachers’ knowledge, skills, and beliefs in addition to program features. Owing to time constraints, the CME instruments were only presented to those participants who were not studying German or mathematics as a subject. Pre-service teachers who studied German or mathematics were given subject-specific knowledge tests instead, which
were evaluated for other research projects. At the University of Cologne, instructional feedback. The participants were informed that they would
it is mandatory for primary school teachers and special education be shown four videos of teaching situations and that they would be
teachers to study either mathematics or German. As such, these pre- required to answer questions about the teaching that they had observed
service teachers could not be included in the sample. Consequently, after each video. The participants were permitted to view each video
the sample was limited to vocational and secondary school pre-service only once, and the videos were presented in randomized order. The test
teachers. Data from two cohorts, each with two measurement time took approximately 20 min to complete.
points, i.e., 2019 and 2020, were included: a bachelor’s cohort, which After each video clip, multiple choice and open response questions
was in the 2nd bachelor’s semester in 2019 and in the 4th bachelor’s were used to capture the participants’ perception and interpretation
semester in 2020; a master’s cohort, which was in the 2nd master’s se­ skills. The open response items were coded based on a coding manual
mester in 2019 and in the 4th master’s semester in 2020. The standard including 48 coding categories. Interrater agreement between two raters
period of study is 6 semesters for the bachelor’s program and 4 semesters was investigated based on 30 questionnaires (about 10% of all ques­
for the master’s program. tionnaires). Cohen’s Kappa was calculated for each of the 48 dichoto­
Participants were contacted via their student e-mail addresses and mous coding categories, yielding good results (Mκ = .867; SDκ = .146).
provided with a link to the survey, which was administered online using After aggregating the coding categories (see König, 2015), the final
EFS Survey software (TIVIAN XI GmbH) and which took about 1.5 h to scoring included 24 dichotomous items (multiple choice: 5 items; open
complete. In order to obtain the largest possible sample, participation response: 19 items), with 14 items targeting accuracy of perception, six
was possible for about six months (April to September, i.e., the entire items focusing on holistic perception, and four items addressing inter­
summer semester). Participants could choose when and where to com­ pretation (for item examples, see Fig. 2). Consistent with previous
plete the online survey. Data collection and processing were performed research that used this instrument, all 24 items were aggregated into one
in line with the General Data Protection Regulation. mean score.4
Participants were included in the analysis reported below if they CME-PI data were available for 283 participants. A one-dimensional
provided valid values for at least half of the CME-Decide test items. The IRT scaling analysis indicated acceptable reliability (weighted likeli­
final sample comprised 134 participants for 2019 and 150 participants hood estimation [WLE] reliability = .626; expected a posteriori/plau­
for 2020. Table 1 details the descriptive sample statistics. Because both sible values [EAP/PV] reliability = .634), with all items showing good
cohorts were invited to participate in 2019 and 2020, in principle, all item discrimination (>.20). Weighted mean squares (WMNSQ) were in
pre-service teachers could have participated at both time points. How­ the recommended range (0.75–1.30; Bond, Yan, & Heene, 2020).
ever, only n = 62 individuals provided data for both measurement time
points, and the data were analyzed cross-sectionally. 4.2.2. Classroom Management Expertise: decision-making
To assess representativeness, this sample was compared with the The CME-Decide test was implemented to address decision-making
study’s target population (see Table 2). We defined the target population in classroom management. As part of the test development, 17 instruc­
as those pre-service secondary and vocational teachers who were tional videos were initially selected from databases and own classroom
enrolled at the University of Cologne as part of one of the two cohorts videography. To ensure that the teaching situations were sufficiently
studied. For this purpose, data pertaining to the distribution of different authentic, particularly the teaching disruptions shown, only video re­
teacher education programs as well as the sample’s gender distribution cordings of authentic teaching practice were considered as opposed to
were provided by the university administration. Although some statistically significant differences emerged for the 2019 sample (gender for bachelor’s students, teacher education program for master’s students), the sample appears to be generally comparable to the study population. However, selection effects based on other characteristics cannot be ruled out (e.g., increased willingness to participate among high-achieving students; subject-specific selection). Therefore, the inferential statistical analyses should be interpreted with caution.

Participants first completed a general survey with questions about learning opportunities, enthusiasm for teaching, and GPK, and then the specific survey about CME (first perception/interpretation, then decision-making). To avoid survey overload, the survey questions were adapted to the cohort’s characteristics and study phase. Consequently, the questions about teaching enthusiasm, which focused on teachers’ experience during instruction, were administered only to the master’s cohort in 2020 (n = 52). For this cohort, a sufficient level of practical experience may be assumed, as the cohort would typically have completed the long-term teaching practicum by that time. Of the 52 respondents, 49 indicated that they had already completed their long-term practicum (currently in practicum: one person; not yet in practicum: one person; no answer: one person).

4.2. Measures

4.2.1. Classroom Management Expertise: perception/interpretation

To assess perception and interpretation regarding classroom management, we implemented the CME-PI test, which has been successfully used in previous studies (König, 2015; König et al., 2021). The instrument comprises four video clips that portray authentic instructional practice and include typical classroom management situations focusing on (1) management of transitions, (2) management of instructional time, (3) management of student behavior, and (4) management of

scripted video vignettes. In a pilot survey, 12 pre-service teachers and 18 in-service teachers indicated how they would respond as teachers in the scenarios that they viewed and further rated the image and sound quality, the videos’ lengths, and the level of difficulty for each video. Based on this survey, 12 videos lasting 9–58 s were selected for the final test version.

For the present study, the pre-service teachers were advised in advance that they would be shown short videos and that they would be required to indicate how they would respond as teachers immediately after the situations they observed. The participants were prompted to respond immediately without giving considerable thought to their responses. At the beginning of each video, a black screen showing the video source was presented for 5 s followed by a still image of the ensuing instructional practice for a further 5 s, offering the participants an overview of the teaching situation before the instructional sequence began. Participants could view each video only once and were presented with the same writing prompt after each video: “As a teacher, how would you act now (following the sequence you saw)?”. The videos were presented in randomized order, and all participants were given the opportunity to view all videos and answer the questions. Due to technical problems and/or other unknown reasons, some participants did not respond to all video clips. Missing values were kept as missing values in further analyses (and not recoded as incorrect). Scaling and averaging were performed using the available data. However, the proportion of valid responses was high, as the final sample responded to an average of 11.67 videos (SD = 0.86; Min = 7; Max = 12). The CME-Decide test took approximately 20 min to complete.

4 For all test instruments, mean scores were calculated by dividing the number of points obtained by the number of items completed by each individual participant.
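The scoring rule in footnote 4, combined with the decision to keep missing responses as missing, can be illustrated with a minimal sketch. The function name `mean_score`, the use of `None` for a missing response, and the example response vector are assumptions made for illustration; they are not taken from the study's materials.

```python
from typing import Optional, Sequence


def mean_score(item_scores: Sequence[Optional[int]]) -> Optional[float]:
    """Mean of dichotomous item scores over completed items only.

    None marks a missing response; it is skipped rather than counted as 0.
    """
    answered = [s for s in item_scores if s is not None]
    if not answered:
        return None  # no completed items, so no score can be computed
    return sum(answered) / len(answered)


# Hypothetical participant: 13 scoring items, two without a response.
responses = [1, 0, None, 1, 0, 0, 1, None, 0, 1, 0, 0, 1]
print(mean_score(responses))  # 5 points divided by 11 completed items
```

Under this rule the hypothetical participant obtains 5/11 rather than 5/13, which is the value that would result if missing responses were recoded as incorrect.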

J. Weyers et al. Teaching and Teacher Education 138 (2024) 104426

Table 1
Sample statistics.

| Measurement time point | n | n Bachelor | n Master | Female, Ba | Female, Ma | Age M (SD), Ba | Age M (SD), Ma | Study semester M (SD), Ba | Study semester M (SD), Ma | Average grade M (SD), Ba | Average grade M (SD), Ma | Program I (%) | Program II (%) | Program III (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2019 | 134 | 93 (69%) | 41 (31%) | 78% | 74% | 22 (7) | 26 (3) | 2 (−) | 8 (−) | 2.2 (0.62) | 2.3 (0.52) | 29 | 60 | 11 |
| 2020 | 150 | 98 (65%) | 52 (35%) | 70% | 75% | 24 (5) | 27 (3) | 4 (−) | 10 (−) | 2.3 (0.63) | 2.2 (0.50) | 22 | 64 | 14 |
| Total | 284 | 191 (67%) | 93 (33%) | 74% | 74% | 23 (6) | 26 (3) | 3 (1) | 9 (1) | 2.3 (0.63) | 2.2 (0.51) | 25 | 62 | 13 |

Note. Ba = bachelor; Ma = master. Average grade: average grade in final secondary school examinations (lower values indicate better performance); the study semester for the master’s cohort was determined by adding the master’s semester plus six semesters (i.e., the typical time required to complete a preceding bachelor’s degree); I: non-academic track (lower secondary school); II: academic track (lower and upper secondary school); III: vocational school.

Table 2
Comparison of population and sample statistics for teacher education program and gender.

| | Bachelor: Population | Bachelor: Sample | Bachelor: χ2 (p) | Master: Population | Master: Sample | Master: χ2 (p) |
|---|---|---|---|---|---|---|
| 2019 | | | | | | |
| Teacher education program | | | | | | |
| Non-academic track | 25% | 28% | 0.37 (.82) | 21% | 32% | 6.56 (.04) |
| Academic track | 67% | 66% | | 68% | 47% | |
| Vocational school | 8% | 7% | | 12% | 21% | |
| Gender | | | | | | |
| Female | 61% | 78% | 8.42 (<.01) | 63% | 74% | 1.74 (.19) |
| Male | 39% | 22% | | 37% | 26% | |
| N/n (response rate) | 1151 | 93 (8%) | | 505 | 41 (8%) | |
| 2020 | | | | | | |
| Teacher education program | | | | | | |
| Non-academic track | 24% | 24% | 2.83 (.24) | 21% | 19% | 0.81 (.67) |
| Academic track | 67% | 63% | | 68% | 65% | |
| Vocational school | 8% | 13% | | 11% | 15% | |
| Gender | | | | | | |
| Female | 61% | 70% | 3.51 (.06) | 63% | 75% | 3.22 (.07) |
| Male | 39% | 30% | | 37% | 25% | |
| N/n (response rate) | 945 | 98 (10%) | | 481 | 52 (11%) | |

Note. The population includes all pre-service secondary and vocational teachers who were enrolled at the University of Cologne as part of one of the two cohorts studied.

Fig. 2. Sample items from the CME-PI test, focusing on a) accuracy of perception, b) holistic perception, and c) interpretation.

Table 3 presents an overview of the videos, which encompassed instructional practice covering different school types and grades as well as teachers with varying experience levels. The instrument was originally developed to capture decision-making across teachers in all types of schools. As such, the videos include teaching situations from primary and secondary schools, although the present sample does not include primary school teachers. Each video depicted a teaching situation wherein the teacher could decide to proceed with instruction as planned (for example, by continuing the classroom dialogue) or to interrupt the lesson’s flow to deal with disturbance to ensure that the planned activities would be successful. The videos addressed various issues relating to classroom management (see Table 3, last column).

A coding manual that included literature-based response criteria for each video was developed to score the written answers. These response


Table 3
Overview of the videos included in the CME-Decide.

| Video | Duration | School type | Grade | Description | Issues in classroom management |
|---|---|---|---|---|---|
| 1 | 17 s | Sec. (non-ac.) | 7 | The sequence occurs at the beginning of the lesson. The teacher writes the lesson’s topic on the blackboard and begins saying, “Our topic of this lesson …”, while the students converse with one another and do not attend to the teacher or blackboard. | Goal clarity, disruptive behavior |
| 2 | 27 s | Sec. (non-ac.) | 7 | The teacher has written a text on the blackboard that the students are asked to copy. One student is unable to identify a word on the blackboard and asks the teacher “What’s written there?”. The teacher goes to the student. | Managing momentum, group focus |
| 3 | 9 s | Sec. (non-ac.) | 7 | The teacher has finished writing the text on the blackboard and says: “I hear some whispering. Who has not yet finished copying the text? Who still needs one or 2 min?” Two students raise their hands. | Time management, group focus |
| 4 | 28 s | Prim. | 1–4 | The video depicts a classroom during the working phase. Two students leave their places to ask the teacher for help. The teacher asks, “What is our rule when we have a question?” The students answer, “We raise our hands”. | Rules and routines |
| 5 | 11 s | Prim. | 4 | The teacher stands in front of the class explaining the next steps. She says, “In a moment, I will ask everyone to take out a pencil”. While the teacher wishes to continue instruction, the students begin searching for pencils in their school bags. | Clarity of instruction |
| 6 | 23 s | Prim. | 4 | During a plenary phase, the students sit quietly in a circle. The teacher announces that she will give a worksheet to one student and that she will now prepare the partitions for the next step. While she moves toward the partitions, another student begins talking. The teacher stops moving and looks at the student. | Disruptive behavior |
| 7 | 33 s | Prim. | 4 | The students sit at their tables, talking to each other. The teacher first gives an acoustic signal and then places herself in front of the class showing a card and using gestures to communicate that the students should all stand up. Most students do not react to this instruction. | Rules and routines |
| 8 | 25 s | Prim. | 4 | In a plenary phase, the students sit at tables in groups, while the teacher explains that for the next working phase, they will change places as soon as she gives an acoustical signal. She then names three groups of students (lions, bees, and mice) and assigns one table to each group. | Managing transitions, clarity of instruction |
| 9 | 33 s | Sec. (ac.) | 5 | In a plenary phase, the teacher sits at the teacher’s desk and clarifies the next steps for the following phase, in which the students will conduct an experiment. He says that the safety instructions will be discussed later but that he first wants the students to explain the next steps. A student chosen by the teacher begins reading the safety instructions. | Clarity of instruction |
| 10 | 58 s | Sec. (ac.) | 5 | In a plenary phase, the teacher and the students sit in a circle. One student explains a subject-related issue using an example. As the student’s turn is comparably long, the other students in the circle begin moving in their seats. | Group focus |
| 11 | 17 s | Prim. | – | The students are sitting at group tables during classroom dialogue. Suddenly, one student complains that his neighbor is making fun of him. The two students begin arguing loudly while the rest of the class remains quiet. | Disruptive behavior |


Table 3 (continued)

| Video | Duration | School type | Grade | Description | Issues in classroom management |
|---|---|---|---|---|---|
| 12 | 25 s | Sec. (ac.) | 5 | The students work at group tables. As they are talking to one another, the classroom is comparably noisy. The teacher is moving in the middle of the classroom and begins talking to the plenary, beginning “So …”. | Managing transitions |

Note. Sec. = secondary school, Prim. = primary school, non-ac. = non-academic track (i.e., vocationally oriented schools that do not qualify students for university studies), ac. = academic track (i.e., schools that prepare and qualify students for university studies).

criteria had been validated and adjusted as necessary based on an expert survey of 12 teachers, who had an average of 20.73 years of teaching experience (SD = 13.34). Given that multiple correct reactions were possible for most videos, the final coding manual included 21 coding categories. For example, regarding video 10 (see Table 3), one category focused on completing the student’s turn (e.g., “I would thank the student for his detailed answer including the example and then carry on with instruction”); a second category addressed the activation of the entire group (e.g., “I would try to integrate the entire group into the dialogue and not only focus on one single student”).

Each coding category was implemented as dichotomous (0 = the answer does not match the category, 1 = the answer matches the category). A double-blind coding was conducted by two independent raters for 60 questionnaires (about 20% of all questionnaires). Cohen’s Kappa was calculated for each of the 21 dichotomous coding categories, indicating good interrater reliability (Mκ = .930, SDκ = .075). Preliminary analyses revealed strong dependencies among categories within one video, as, for example, participants tended to write only one out of two responses categorized. To avoid testlet effects, the coding categories were combined into a single dichotomous scoring category per video. For example, for video 10, the scoring item was 1 if the answer matched at least one of the two coding categories. Only for video 11 were the two coding categories, (1) disrupting the dispute (immediate response) and (2) settling the dispute after class (delayed response), sufficiently independent to justify the inclusion of both categories in the score. Consequently, the final CME-Decide scoring included 13 dichotomous items, which were aggregated into a mean score.

4.2.3. General pedagogical knowledge

For GPK, a paper-and-pencil test that was originally developed for the international comparative study TEDS-M was implemented (König & Blömeke, 2010). Owing to data collection constraints, a short form was applied using a rotated design based on two equivalent test booklets randomly assigned to the participants. It took approximately 15 min to complete one booklet. While 36 items were included in both test booklets, 11 items were included in only one of the booklets. Across both test booklets, the test contained 58 test items (30 multiple choice items and 28 open response items). Trained raters coded the open-ended responses on the basis of an established coding manual.

Previous studies have indicated the reliability of the short version of the GPK test (e.g., Brühwiler, Hollenstein, Affolter, Biedermann, & Oser, 2017; Gerhard, Jäger-Biela, & König, 2023; König, Blömeke, Paine, Schmidt, & Hsieh, 2011). Four areas of GPK were addressed: (1) dealing with heterogeneous learning groups (“adaptivity”), (2) preparing, structuring and evaluating lessons (“structure”), (3) organizing classrooms and motivating students (“classroom management/motivation”), and (4) assessing and evaluating students’ learning and ability (“assessment”). Fig. 3 illustrates item examples.

GPK test data were available for 227 participants, who provided valid values for at least half of the items administered. Owing to the small number of items within each area, which led to unacceptable reliabilities in some cases, a one-dimensional measure of GPK was specified using the mean score. After items with low item discrimination (<.10) were removed, the test comprised 50 dichotomous items. IRT scaling analysis revealed acceptable reliability (WLE reliability = .730; EAP/PV reliability = .587) with sufficient item discrimination (>.10) and WMNSQ in the recommended range.

4.2.4. Teaching enthusiasm

Teaching enthusiasm was assessed using a scale developed by Baumert et al. (2008) comprising three statements rated on four-point Likert scales (1 = fully incorrect to 4 = fully correct). The items were originally constructed to assess teachers’ subject-specific enthusiasm in mathematics and were reformulated with a general focus on teaching (e.g., “In my classes, I try to inspire the students”). The scale’s reliability was good (Cronbach’s α = .736, M = 3.36, SD = 0.50).

4.2.5. Opportunity to learn

Table 4 presents the OTL measures implemented, including sample items, descriptive statistics, and reliability. University OTL in general pedagogy was assessed using a scale developed by König, Ligtvoet, Klemenz, and Rothland (2017). Each item referred to a specific item of learning content, and participants indicated whether it had been covered during their teacher training (yes/no). The content represented central topics within German university teacher training.

In-school OTL was assessed using two operationalizations. First, an instrument developed by König, Tachtsoglou, Darge, and Lünnemann (2014) was applied, focusing on four central activities during teaching practicum: lesson planning, teaching, linking theories to situations, and reflecting on practice. Each area included several items that referred to specific activities (e.g., formulating learning goals), for which participants indicated whether they had engaged in this activity during teaching practicum (yes/no).

Second, a measure of invested learning time during teaching practicum that had been developed in a previous project (König, Tachtsoglou, et al., 2014) was implemented. Four activities were addressed: observing instruction, supporting/co-teaching, teaching with guidance, and teaching alone (i.e., teaching without being guided by an experienced teacher). For each of the four activities, the participants indicated how many hours they had invested using a rating scale (0: “0 h”, 1: “1–10 h”, 2: “11–20 h”, 3: “21–31 h”, 4: “31–50 h”, 5: “51–100 h”, 6: “more than 100 h”).

4.3. Data analysis

Regarding RQ1, CME-Decide’s reliability was estimated by specifying a one-dimensional Rasch model using the software ConQuest (Wu, Adams, & Wilson, 1997), additionally accounting for internal consistency as indicated by Cronbach’s α. To investigate the empirical association between the CME-Decide and CME-PI tests, first, a one-dimensional Rasch model was estimated, with both measures specified as based on a single latent variable. Second, a two-dimensional Rasch model was estimated by differentiating two latent variables: perception/interpretation and decision-making. The two scaling models were compared, accounting for reliability, theta variance, model deviance, and the sample size-adjusted Bayesian information criterion (BIC).

The relationship between CME-Decide and other competence measures (RQ2), namely GPK and teaching enthusiasm, was investigated using Pearson correlation. The effects of university and in-school OTLs (RQ3) were examined based on multiple regression models using Mplus (Muthén & Muthén, 1998–2006). This procedure allowed estimation of the effects of OTL while controlling for the participants’ individual


Fig. 3. Sample items from the TEDS-M GPK test, focusing on (a) motivation, and (b) structure.

characteristics, including average grade in final school examinations, age, gender, and teacher education program. Given that OTL measures typically show high intercorrelations, a separate model was specified for each measure (see König et al., 2017). For each regression model predicting the CME-Decide scores, two equivalent models with CME-PI and GPK as criterion were specified. Thus, we could compare the effects of OTL on CME-Decide with the effects of OTL on two established competence measures. For the regression models, the sample structure (two cohorts at two measurement times) was considered by specifying the study semester as a stratification variable (option “type = complex” in Mplus).5

In cross-sectional analyses with the full sample (i.e., correlations with knowledge, CME-PI, and learning opportunity effects), 62 individuals who participated twice are included twice in the analysis. This could bias the results. We have therefore replicated the central analyses on the CME-Decide test, excluding second participations, and report discrepancies in the text.

5 After removing for each instrument those cases that had completed less than 50% of the items, the proportions of missing values were low for CME-Decide (2.5%) and CME-PI (7.8%). Owing to the rotated design, the proportion of missing values was higher for the GPK test (22.0%). Regarding the predictor variables, no additional information for any predictor was available for 31 cases (10.9%). Regarding the remaining cases, the proportion of missing values was acceptable for OTL in general pedagogy (7.6%), in-school OTL (9.9%), and invested time (7.9%). Cases without valid values on one or more predictors were excluded from the respective regression models.

5. Results

5.1. RQ1: Reliability analysis and decision-making as part of CME

The reliability analysis revealed that CME-Decide had comparably low reliability (α = .512, WLE reliability = .459, EAP/PV reliability = .501). Item discrimination was sufficient for all items, and WMNSQ were in the recommended range. Table 5 presents an overview of IRT scaling models for the three proficiency tests, including CME-Decide. Reliability was thus lower than recommended by widely used rules of thumb. However, as reliability represents the proportion of “true” variance, at least 50% of the variance was explained by the underlying construct. As all items also showed positive item discrimination, we consider this sufficient to explore further analyses, although reliability must be discussed critically.

Based on IRT scaling, a one-dimensional model (CME as one unidimensional latent ability) was compared to a two-dimensional model (CME-PI versus CME-Decide; see Table 6). The one-dimensional model showed good reliability in light of the higher number of items targeting one latent variable. However, reliability estimates were still acceptable for the two-dimensional model, except for the CME-Decide measure’s WLE reliability. Regarding model fit, the deviance was lower for the two-dimensional model, suggesting that this model fitted the data better than the one-dimensional model. The difference in model fit was statistically significant. By contrast, the BIC, accounting for the number of parameters estimated, was in favor of the more parsimonious one-dimensional model. The latent correlation between perception/


interpretation and decision-making was high (r = .813). In sum, these results are partly ambiguous but also provide evidence in favor of the two-dimensional model reflecting two interrelated but distinguishable abilities.

5.2. RQ2: Associations with other competence measures

Table 7 presents Pearson’s correlations between the different competence measures. A moderate manifest correlation was observed between the CME-PI and CME-Decide tests, which was consistent with our expectations, highlighting that the two measures are empirically distinct. In line with our expectations, GPK showed significant but small correlations with both CME-PI and CME-Decide. This can be seen as an encouraging finding, as knowledge and decision-making were associated to a similar extent as knowledge and a more established measure of perception/interpretation. The effects remained stable when second participations were excluded from analyses.

Teaching enthusiasm showed a significant, medium-sized correlation with CME-Decide. This finding corresponds to our assumption that decision-making, which is closely related to teachers’ instructional behavior, relates to enthusiasm. Since enthusiasm was not significantly correlated with either knowledge or CME-PI, the effect may be specific to decision-making rather than generally linking performance on pedagogy-related tests to positive emotions.

5.3. RQ3: Effects of OTL in university teacher education

The manifest correlations between competence measures, control variables, and OTL measures are detailed in Supplementary Material A (Tables A1 and A2). The CME-Decide measure was significantly correlated with the study semester (r = .121), with more advanced semesters associated with better decision-making skills. Moreover, moderate to high correlations emerged between study semester and OTL scales. In light of our aim to capture decision-making as a function of OTL across university teacher education, the study semester was not included in the regression models; as such, the regression coefficient estimated for the OTL scales represents the effect of OTL across all study semesters. For each model, an equivalent model including the study semester is reported in Supplementary Material A (Tables A3 to A5), corresponding to the effect of OTL within a semester.

Regarding the control variables (teacher education program, gender, age, and average grade), pre-service teachers in the academic track significantly outperformed their counterparts in the non-academic and vocational track for both video-based CME instruments. Consistent with

Table 4
Overview of OTL measures.

| Scale | Sample item | n valid cases | n items | M (SD) | α |
|---|---|---|---|---|---|
| Opportunity to learn: General pedagogy | | | | | |
| Adaptivity | Individual instructional support | 239 | 11 | .39 (.28) | .809 |
| Structure | Teaching methods | 239 | 9 | .62 (.34) | .882 |
| Classroom management/Motivation | Classroom rules | 237 | 8 | .36 (.32) | .838 |
| Assessment | Diagnostics of learning processes | 239 | 9 | .44 (.40) | .945 |
| Total (OTL – Pedagogy) | | 241 | 37 | .45 (.27) | .939 |
| In-school opportunity to learn | | | | | |
| Lesson planning | I have formulated learning goals aligned with the curriculum. | 234 | 12 | .44 (.33) | .911 |
| Teaching | I have checked attendance. | 234 | 32 | .45 (.30) | .950 |
| Linking theories to situations | I have observed teaching methods that I have learned at my university/teacher training college course. | 229 | 11 | .39 (.31) | .872 |
| Reflecting on practice | I have drawn conclusions for future teaching. | 229 | 11 | .35 (.31) | .870 |

| Learning time | Sample item | n valid cases | Median | Category |
|---|---|---|---|---|
| Observing instruction (hours) | – | 235 | 5 | 51–100 h |
| Supporting/Co-Teaching (hours) | – | 239 | 2 | 11–20 h |
| Teaching with guidance (hours) | – | 233 | 1 | 1–10 h |
| Teaching alone (hours) | – | 234 | 0 | 0 h |

Note. Means represent the relative frequency of content or experiences reported by participants; n valid cases indicates the number of cases for which sufficient data were available with respect to this variable.

Table 7
Bivariate Pearson’s correlations of competence measures.

| Variables | CME-Decide, r (n) | CME-PI, r (n) | GPK, r (n) |
|---|---|---|---|
| CME-Decide | – | | |
| CME-PI | .445*** (283) | – | |
| GPK | .214** (227) | .206** (226) | – |
| Teaching enthusiasm | .339** (52) | .178 (52) | .081 (50) |

Note. *p < .05, **p < .01, ***p < .001. Correlations are based on mean scores.

Table 5
Reliability analysis of proficiency tests.

| Measure | n Items | M (SD) | α | WLE reliability | EAP/PV reliability | Theta variance | WMNSQ (min–max) | Item discrimination (mean; min–max) |
|---|---|---|---|---|---|---|---|---|
| CME-Decide | 13 | .39 (.17) | .512 | .459 | .501 | .419 | 0.95–1.04 | .38; .24–.47 |
| CME-PI | 24 | .51 (.15) | .709 | .626 | .634 | .480 | 0.94–1.07 | .34; .21–.49 |
| GPK | 50 | .52 (.13) | – | .730 | .587 | .417 | 0.87–1.12 | .31; .12–.53 |

Note. For the GPK test, a rotation design based on two test booklets was implemented to reduce testing time. Consequently, Cronbach’s α is not reported for GPK.
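For readers less familiar with the internal consistency coefficient reported in Table 5, the following minimal sketch computes Cronbach's α for a matrix of dichotomous item scores. It assumes complete data (every respondent answered every item), which is a simplification: the study retained missing responses and did not report α for the rotated GPK design. The function name and the toy data are hypothetical.

```python
from statistics import pvariance
from typing import List


def cronbach_alpha(scores: List[List[int]]) -> float:
    """Cronbach's alpha for a respondents-by-items matrix without missing values."""
    k = len(scores[0])  # number of items
    # Variance of each item across respondents.
    item_vars = [pvariance([row[j] for row in scores]) for j in range(k)]
    # Variance of the respondents' sum scores.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Toy data (hypothetical): four respondents, three dichotomous items.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(round(cronbach_alpha(toy), 3))
```

Because the coefficient depends on the ratio of summed item variances to total-score variance, a short test measuring a heterogeneous construct tends to yield lower values, which is the pattern discussed for CME-Decide.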

Table 6
Comparison of IRT models.

| Model | BIC | Deviance | Parameters | Likelihood ratio test | WLE reliability | EAP/PV reliability | Variance |
|---|---|---|---|---|---|---|---|
| CME | 10952.22 | 10737.42 | 38 | | .713 | .716 | 0.412 |
| CME-PI vs. CME-Decide | 10955.59 | 10729.49 | 40 | ΔDeviance = 7.92, df = 2, p < .05 | PI .626; D .459 | PI .691; D .638 | PI 0.480; D 0.431 |

Note. BIC: Bayesian information criterion; Deviance: −2 log (likelihood ratio); ΔDeviance: χ2-distributed test statistic; CME = Classroom Management Expertise as one latent variable; CME-PI & CME-Decide = Classroom Management Expertise modeled as two latent variables perception/interpretation (PI) and decision-making (D). For the likelihood ratio test, the two-dimensional model shows better fit.
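The likelihood ratio test in Table 6 can be retraced from the rounded deviances and parameter counts. The sketch below is illustrative only: it uses the closed-form upper tail of the χ2 distribution that holds for even degrees of freedom (which covers df = 2), and the helper name `chi2_sf_even_df` is an assumption. Because it starts from rounded table values, the difference comes out as 7.93 rather than the reported 7.92.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable; valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    terms = (half ** k / math.factorial(k) for k in range(df // 2))
    return math.exp(-half) * sum(terms)


# Values as printed in Table 6 (rounded to two decimals).
deviance_1d, params_1d = 10737.42, 38  # one-dimensional model (CME)
deviance_2d, params_2d = 10729.49, 40  # two-dimensional model (PI vs. Decide)

delta_deviance = deviance_1d - deviance_2d
df = params_2d - params_1d
p_value = chi2_sf_even_df(delta_deviance, df)
print(f"delta deviance = {delta_deviance:.2f}, df = {df}, p = {p_value:.3f}")
```

The resulting p-value lies below .05, in line with the table, while the BIC difference of about 3.4 points favors the one-dimensional model. This is the ambiguity noted in the text.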


previous research, there were effects of gender (better performance by female students) and average grade (better performance by students with better grades) on GPK (e.g., König & Seifert, 2012). In addition, older students showed higher GPK even when controlling for the study semester, which may be due to extracurricular pedagogical experience.

5.3.1. OTL in general pedagogy

Table 8 presents multiple regression models, including university OTL in general pedagogy. As expected, the four OTL scales significantly predicted GPK with small to moderate effect sizes. Contrary to our hypothesis, however, no OTL scale showed a significant effect for CME-Decide or CME-PI. While the predictors explained a considerable proportion of variance for GPK (around 30%), only a small proportion of variance was explained for both video-based assessments.

5.3.2. In-school OTL

Regarding in-school OTL, operationalized as specific in-school activities, small but significant effects were identified for lesson planning, linking theories to situations, and reflecting on practice, predicting decision-making skills (see Table 9). For perception/interpretation, no significant effect was found for any in-school OTL scale, whereas all four in-school OTL scales significantly predicted GPK with small effect sizes.

Regarding in-school OTL, operationalized as invested learning time during school practicum, three activities significantly predicted decision-making skills: observing instruction, supporting instruction, and teaching with guidance (see Table 10). Perception/interpretation was only predicted by observing instruction, while GPK was predicted by supporting instruction and teaching with guidance. Effect sizes for all models were small. Teaching alone was unrelated to any competence measure.

When the second participations (n = 62) were removed from the analysis, the effects of in-school OTL on the CME-Decide test became slightly weaker and in some cases no longer significant. However, significant effects still emerged for reflecting on practice and teaching with guidance.

6. Discussion

The study presents the CME-Decide test, which was developed to assess pre-service teachers’ decision-making skills in classroom management as a learning outcome of university teacher education. Participants watched short video clips and noted how they would respond as teachers in the situations observed. The measure’s appropriateness was investigated by examining its reliability as a separable facet of CME (RQ1), its associations with GPK and teaching enthusiasm (RQ2), and the effects of university and in-school OTLs (RQ3).

6.1. Reliability of measurement and separability of perception/interpretation

Regarding RQ1, CME-Decide’s reliability was sufficient for subsequent analyses but remained comparably low. However, the teachers surveyed had limited teaching experience, and thus the low reliability may also be a result of the sample’s limited variance. Moreover, the number of items used to measure a comparatively heterogeneous construct was small, which may have further negatively impacted the estimation of reliability. We thus consider the reliability to be particularly encouraging, also in view of CME-Decide’s innovative potential as based on a writing prompt that is open compared to previous instruments. Neither a pre-specified interpretation of the classroom practice shown (Bastian et al., 2022) nor a range of alternative actions (Gold & Holodynski, 2015) was provided during testing. Adopting a multitrait-multimethod approach, Müller and Gold (2023) reported that the assessment of describing and interpreting skills was highly dependent on the item format used (open writing prompts vs. rating scales), suggesting that the task format significantly shapes the measurement of situation-specific skills. The CME-Decide test may thus be promising in providing a new analytical perspective on decision-making in classroom management.

Based on IRT scaling analysis, the CME-Decide was closely connected to perception and interpretation skills in classroom management. Model fit indices showed ambiguous findings with respect to whether the one-dimensional (CME) or two-dimensional model (perception/interpretation vs. decision-making) fitted the data better. This finding should be interpreted cautiously as the model fit of the two-dimensional model is affected by the low reliability of the CME-Decide test. By contrast, on the manifest level, CME-Decide and CME-PI were only moderately correlated, suggesting that the measures represent two interrelated but distinct constructs. This is in line with previous research conceptualizing decision-making as based on the accurate perception and interpretation of instructional practice (Jacobs et al., 2010; Kaiser et al., 2015; Shavelson & Stern, 1981).

6.2. Associations with knowledge and teaching enthusiasm

Regarding RQ2, a positive correlation was found between the CME-Decide and GPK as measured by a paper-and-pencil test. However, the effect size was small. Previous studies’ findings concerning professional knowledge and video-based competence measures show considerable heterogeneity in effect sizes (r = .29; König, Blömeke, et al., 2014; r = .42; Gold & Holodynski, 2017; r = .56; Meschede et al., 2017). Exploring the relationship between CME-PI and GPK, König (2015) reported a moderate correlation (r = .47). As Müller and Gold (2023) have recently remarked, the association between professional knowledge and

Table 8
Multiple regression models for CME-Decide, CME-PI, and GPK regressed on university OTL.

Criterion: CME-Decide (M1–M4), CME-PI (M5–M8), GPK (M9–M12). All entries except R2 are standardized regression coefficients (β).

| Variable | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 | M11 | M12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | .02 | .03 | .03 | .02 | .06 | .07 | .06 | .07 | .35*** | .39*** | .38*** | .37*** |
| Gender | .00 | −.01 | .00 | −.01 | −.06 | −.06 | −.06 | −.06 | −.22*** | −.25*** | −.24*** | −.24*** |
| Average grade | −.06 | −.07 | −.07 | −.07 | .00 | .00 | .00 | −.01 | −.15* | −.17* | −.17** | −.17** |
| Program^a: Non-academic | −.23** | −.22** | −.22** | −.24** | −.21** | −.21** | −.21** | −.19* | −.06 | −.04 | −.05 | −.10 |
| Program^a: Vocational | −.16* | −.16* | −.15* | −.16* | −.04 | −.03 | −.04 | −.03 | −.05 | −.03 | −.02 | −.03 |
| OTL-GP: Adaptivity | .07 | | | | .01 | | | | .30*** | | | |
| OTL-GP: Structure | | .04 | | | | −.01 | | | | .20*** | | |
| OTL-GP: Classroom management | | | .02 | | | | .00 | | | | .26*** | |
| OTL-GP: Assessment | | | | .08 | | | | −.06 | | | | .26*** |
| R2 | .08 | .08 | .08 | .08 | .05 | .05 | .05 | .05 | .32 | .27 | .30 | .29 |

Note. *p < .05, **p < .01, ***p < .001. M: model; β: standardized regression coefficients.
^a Teacher education program was accounted for using dummy variables; reference group is academic track.
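The dummy coding mentioned in the table note can be sketched as follows. The program labels and the function name `dummy_code` are hypothetical and serve only to show how a three-level program variable yields two indicator variables when the academic track is the reference group; the study's actual variable coding in Mplus is not documented here.

```python
from typing import Dict

# Hypothetical labels for the three teacher education programs.
PROGRAMS = ("academic", "non_academic", "vocational")


def dummy_code(program: str, reference: str = "academic") -> Dict[str, int]:
    """Return 0/1 indicators for every program except the reference group."""
    if program not in PROGRAMS:
        raise ValueError(f"unknown program: {program!r}")
    return {f"program_{p}": int(program == p) for p in PROGRAMS if p != reference}


print(dummy_code("academic"))      # both indicators are 0 for the reference group
print(dummy_code("non_academic"))  # only the non-academic indicator is 1
```

With this coding, a negative coefficient for the non-academic or vocational indicator expresses a lower predicted score relative to pre-service teachers in the academic track.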


Table 9
Multiple regression models for CME-Decide, CME-PI, and GPK regressed on in-school OTL (OTL in general pedagogy included as control variable).
Outcomes: CME-Decide (M1–M4), CME-PI (M5–M8), GPK (M9–M12). All entries are β.

| Variable | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 | M11 | M12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | −.04 | −.01 | −.01 | −.02 | .05 | .07 | .08 | .05 | .25*** | .29*** | .30*** | .30*** |
| Gender | −.01 | −.02 | .02 | .01 | −.07 | −.08 | −.09 | −.08 | −.21*** | −.22*** | −.18** | −.20*** |
| Average grade | −.06 | −.06 | −.06 | −.07 | −.01 | −.01 | −.01 | .00 | −.14* | −.15* | −.13* | −.15* |
| Program^a: Non-academic | −.25** | −.25** | −.22** | −.25** | −.23** | −.22** | −.24** | −.25** | −.10 | −.10 | −.07 | −.09 |
| Program^a: Vocational | −.14* | −.16* | −.14* | −.17* | −.03 | −.03 | −.04 | −.05 | −.03 | −.04 | −.03 | −.06 |
| OTL-GP | −.02 | .03 | −.03 | −.02 | −.06 | −.02 | −.02 | −.03 | .18* | .23** | .18* | .21** |
| In-school OTL: Planning | .18* | | | | .08 | | | | .27*** | | | |
| In-school OTL: Teaching | | .09 | | | | .02 | | | | .17* | | |
| In-school OTL: Linking | | | .17* | | | | −.04 | | | | .24** | |
| In-school OTL: Reflecting | | | | .23** | | | | .07 | | | | .21** |
| R² | .10 | .09 | .10 | .13 | .07 | .06 | .07 | .08 | .35 | .33 | .34 | .37 |

Note. *p < .05, **p < .01, ***p < .001. M: model; β: standardized regression coefficients.
^a Teacher education program was accounted for using dummy variables; reference group is academic track.

Table 10
Multiple regression models for CME-Decide, CME-PI and GPK regressed on learning time during teaching practicum (OTL in general pedagogy included as control
variable).
Outcomes: CME-Decide (M1–M4), CME-PI (M5–M8), GPK (M9–M12). All entries are β.

| Variable | M1 | M2 | M3 | M4 | M5 | M6 | M7 | M8 | M9 | M10 | M11 | M12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Age | .00 | −.03 | −.05 | .00 | .05 | .04 | .03 | .07 | .32*** | .26*** | .26*** | .33*** |
| Gender | .00 | .01 | −.01 | .01 | −.06 | −.07 | −.08 | −.08 | −.22*** | −.22*** | −.23*** | −.23*** |
| Average grade | −.05 | −.07 | −.05 | −.08 | .01 | −.01 | .00 | −.02 | −.13* | −.16* | −.13* | −.15* |
| Program^a: Non-academic | −.24** | −.25** | −.27*** | −.24** | −.21** | −.23** | −.24** | −.23** | −.09 | −.11 | −.12 | −.10 |
| Program^a: Vocational | −.18** | −.17* | −.18** | −.16* | −.05 | −.02 | −.04 | −.04 | −.07 | −.03 | −.07 | −.06 |
| OTL-GP | .01 | −.01 | −.02 | .06 | −.07 | −.05 | −.07 | −.03 | .27*** | .21** | .21** | .30*** |
| Learning time: Observing | .17* | | | | .16* | | | | .12 | | | |
| Learning time: Supporting | | .21** | | | | .09 | | | | .21** | | |
| Learning time: Teaching (with guidance) | | | .26** | | | | .12 | | | | .27*** | |
| Learning time: Teaching (alone) | | | | .11 | | | | .02 | | | | .04 |
| R² | .11 | .11 | .13 | .10 | .08 | .07 | .07 | .07 | .32 | .34 | .36 | .31 |

Note. *p < .05, **p < .01, ***p < .001. M: model; β: standardized regression coefficients.
^a Teacher education program was accounted for using dummy variables; reference group is academic track.
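
The models in Tables 8 to 10 share one structure: an outcome is regressed on control variables, on dummy variables for the teacher education program (academic track as reference group), and on one OTL or learning-time scale per model. The following is a minimal sketch of that structure, assuming ordinary least squares on z-scored variables. The article does not show its estimation code, and the software used may standardize coefficients differently. All function names and the toy data are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize a list of numbers to mean 0 and sample SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def dummy_code(labels, reference):
    """Return one 0/1 column per category other than the reference group."""
    levels = sorted(set(labels) - {reference})
    return {lvl: [1.0 if lab == lvl else 0.0 for lab in labels] for lvl in levels}


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def standardized_betas(outcome, predictors):
    """OLS on z-scored variables; returns {predictor name: beta}.

    No intercept is estimated because every z-scored variable has mean zero.
    """
    names = list(predictors)
    zy = zscore(outcome)
    zx = [zscore(predictors[name]) for name in names]
    xtx = [[sum(p * q for p, q in zip(zi, zj)) for zj in zx] for zi in zx]
    xty = [sum(p * q for p, q in zip(zi, zy)) for zi in zx]
    return dict(zip(names, solve(xtx, xty)))


# Hypothetical toy data (NOT from the study): six students.
program = ["academic", "academic", "non-academic",
           "non-academic", "vocational", "vocational"]
otl_reflecting = [3.0, 4.0, 2.0, 3.0, 1.0, 4.0]
score = [0.9, 1.4, 0.2, 0.8, 0.1, 1.0]

predictors = dummy_code(program, reference="academic")
predictors["otl_reflecting"] = otl_reflecting
betas = standardized_betas(score, predictors)
```

In this sketch, each dummy coefficient compares one program with the academic track while holding the OTL scale constant, which mirrors how the Program rows in the tables are read.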

situation-specific competence may be particularly dependent on the type of knowledge assessed (e.g., procedural–conditional vs. declarative knowledge), the item format used to measure situation-specific skills (e.g., open prompts vs. rating scales), and the sample’s expertise level. Assuming that teachers develop their situation-specific skills when reflecting on instructional practice drawing on professional knowledge, the association between knowledge and skills may be further consolidated with increased practical experience. This assumption may be supported by explorative analysis. In the present sample, the correlations between both CME measures and GPK were lower for bachelor’s students (n = 149; r(CME-PI, GPK) = .159, p = .054; r(CME-D, GPK) = .099, p = .229) than for master’s students (n = 78; r(CME-PI, GPK) = .344, p < .01; r(CME-D, GPK) = .334, p < .01).

For the subsample of pre-service teachers at the end of their master’s program, a positive correlation was found between the CME-Decide test and self-reported teaching enthusiasm, which is consistent with previous findings on teaching enthusiasm predicting teachers’ classroom management performance (Kunter, 2013). This finding suggests that the link between teachers’ affective–motivational disposition and situation-specific skills warrants greater attention in future research. Regarding teaching enthusiasm and decision-making, the constructs may influence one another reciprocally. On the one hand, pre-service teachers with elaborate decision-making skills pertaining to classroom management may be more successful as teachers, fostering more positive emotions in the classroom (Frenzel, 2014). On the other hand, robust teaching enthusiasm may be regarded as a disposition that promotes successful decision-making in the classroom (Fredrickson, 2001; Frenzel, 2014; Kunter, 2013). The affective–motivational disposition may also have directly affected the testing situation. Teachers with higher enthusiasm levels may have experienced greater enjoyment during testing, which may, in turn, have influenced their decision-making favorably.

6.3. The effect of opportunity to learn

Regarding RQ3, CME-Decide was predicted by in-school OTL, including lesson planning, linking theories to situations, and reflecting on practice, as well as by time invested in observing instruction, supporting instruction, and teaching with guidance. This is in line with previous findings that school-based OTLs promote situation-specific competence (Mertens & Gräsel, 2018; Stürmer et al., 2013; Weber et al., 2018). The CME-Decide test was even slightly more strongly associated with in-school OTLs than the more established CME-PI instrument, underscoring the added value of the CME-Decide test. The pattern of effects suggests that theory-based analysis and reflection on teaching, rather than mere teaching practice, promote the acquisition of competence. Consequently, the effectiveness of in-school learning opportunities may be enhanced by preparatory and accompanying


courses, for example, video-based seminars that specifically promote the application of theoretical knowledge to practical situations (Gold, Pfirrmann, & Holodynski, 2021; Stürmer et al., 2013; Weber et al., 2018).

By contrast, decision-making was not predicted by university OTL in general pedagogy, which may suggest that classroom management generally receives comparatively scant attention in university teacher education in Germany (Helmke, 2022; König et al., 2017). As O’Neill and Stephenson (2012) highlighted, pre-service teachers require sufficient time during their coursework to become familiar with classroom management concepts and to practice classroom management strategies. This suggests not only that more learning opportunities are needed but that learning opportunities that specifically address the link between theory and practice (e.g., through practical exercises) are crucial.

Furthermore, focusing on the assessment of OTL, the operationalization of university OTL as a list of pedagogical content areas may not map learning processes that condition the acquisition of CME. This is consistent with the finding that even the more established CME-PI test was not associated with university OTL. Rather, this operationalization appears to be more strongly linked to the acquisition of declarative knowledge, in line with the finding that GPK was predicted by all university OTL scales.

In conclusion, CME-Decide’s associations with in-school OTL and invested time during school practicum suggest that the test values may be validly interpreted as resulting from specific learning processes in teacher education. CME-Decide may thus be implemented to investigate the effects of specific courses or interventions designed to promote CME, for example, using video-based learning environments (Gold et al., 2021).

6.4. Limitations and directions for future research

This study has several limitations that should be acknowledged and recognized as providing impetus for future research. First, CME-Decide’s relatively low reliability suggests that a higher number of video clips and test items may be useful. However, given that our sample included only pre-service teachers (which should also be acknowledged as a limitation), the low variance may have reduced the measure’s reliability. Future studies should thus apply comparable decision-making measures for in-service teachers. Expert–novice comparisons can provide evidence that the test scores are validly interpretable as measures of professional competence. Moreover, the extent to which measures of teachers’ decision-making predict classroom performance should be explored. A previous study using CME-PI demonstrated that teachers’ test scores predict both cognitive activation and student learning progress in mathematics classrooms (König et al., 2021).

Regarding the effects of OTL, the effect sizes were small, with considerable variance in CME-Decide test values remaining unexplained. Despite the test’s sensitivity to specific OTL effects, it is unclear which factors within university teacher education are critical to developing decision-making skills. Developing and implementing more specific OTL measures (for example, measures focusing on representations of practice, such as discussing video clips or text vignettes) will help advance the field. Provided that classroom management learning opportunities are systematically implemented at the university level, comprehensive measures that include more fine-grained queries regarding learning content and assess pre-service teachers’ confidence in using different classroom management approaches may also be promising (e.g., O’Neill & Stephenson, 2012).

The study’s major limitations relate to the sample characteristics. The analyses are based on a convenience sample and the response rate was low. The sample’s resulting limited representativeness could bias the results and limit the validity of inferential statistical conclusions. For example, assuming that high-achieving students are more likely to participate (see Sax, Gilmartin, & Bryant, 2003), associations may be masked due to limited variance or results may not be generalizable to other groups, including lower-achieving students. Moreover, the sample did not include prospective primary school and special education teachers, who are required to take a greater number of general pedagogy courses and thus may have greater variance in their CME. Consequently, further research should be conducted to determine the extent to which the results can be generalized to these two groups. The results regarding teaching enthusiasm are further limited to the sub-sample of advanced master’s students and may thus be biased by this specific sub-sample’s characteristics (e.g., recent participation in a long-term practicum, including teaching practice guided and supported by mentor teachers). The question of whether the associations shown may be replicated with in-service teachers or students in earlier stages of study remains open. Moreover, given that our findings are based on cross-sectional analyses and considering that correlations do not imply causal effects, longitudinal analyses, including at least two measurement time points, should be performed to explore the mutual causal influences of OTL and competence.

The CME-Decide test is based on a single open writing prompt, which may reinforce testing consistency. However, the use of a single method may impact the empirical results’ generalizability (Shadish, Cook, & Campbell, 2002). Comparing the use of open and closed item formats (see Müller & Gold, 2023), for example, using situational judgment methodology, would permit a better understanding of what CME-Decide measures.

Bearing in mind these limitations, we conclude that CME-Decide offers a promising measure of teachers’ decision-making. The test requirements were conceptualized and designed to be as close as possible to teachers’ actual behavior in the classroom, which can potentially enhance ecological validity regarding studies on teacher competence.

Funding

This work was supported as part of a larger project called Zukunftsstrategie Lehrer*innenbildung Köln (ZuS), which is part of the Qualitätsoffensive Lehrerbildung (Quality Initiative for Initial Teacher Education), a joint initiative of the Federal Government and the Länder aiming to improve the quality of teacher training. The programme is funded by the Federal Ministry of Education and Research (grant numbers 01JA1515 and 01JA1815). The authors are responsible for the content of this publication.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

The datasets used for analyses in this paper are not available.

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.tate.2023.104426.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (2014). Standards for educational and psychological testing. American Educational Research Association.
Bastian, A., Kaiser, G., Meyer, D., Schwarz, B., & König, J. (2022). Teacher noticing and its growth toward expertise: An expert–novice comparison with pre-service and in-service secondary mathematics teachers. Educational Studies in Mathematics, 110(2), 205–232. https://doi.org/10.1007/s10649-021-10128-y
Baumert, J., Blum, W., Brunner, M., Dubberke, T., Jordan, A., Klusmann, U., et al. (2008). Professionswissen von Lehrkräften, kognitiv aktivierender Mathematikunterricht


und die Entwicklung von mathematischer Kompetenz (COACTIV): Dokumentation der Erhebungsinstrumente [Cognitive activation in the mathematics classroom and professional competence of teachers: Documentation of the survey instruments]. Max-Planck-Institut für Bildungsforschung.
Baumert, J., & Kunter, M. (2013). The COACTIV model of teachers’ professional competence. In M. Kunter, J. Baumert, W. Blum, U. Klusmann, S. Krauss, & M. Neubrand (Eds.), Cognitive activation in the mathematics classroom and professional competence of teachers (pp. 25–48). Springer. https://doi.org/10.1007/978-1-4614-5149-5_2
Berliner, D. C. (1992). The nature of expertise in teaching. In F. K. Oser, A. Dick, & J.-L. Patry (Eds.), Effective and responsible teaching: The new synthesis (pp. 227–248). Jossey-Bass.
Blömeke, S., Gustafsson, J.-E., & Shavelson, R. J. (2015). Beyond dichotomies: Competence viewed as a continuum. Zeitschrift für Psychologie, 223(1), 3–13. https://doi.org/10.1027/2151-2604/a000194
Bond, T. G., Yan, Z., & Heene, M. (2020). Applying the Rasch model: Fundamental measurement in the human sciences (4th ed.). Routledge.
Borko, H., Roberts, S. A., & Shavelson, R. (2008). Teachers’ decision making: From Alan J. Bishop to today. In P. Clarkson, & N. Presmeg (Eds.), Critical issues in mathematics education (pp. 37–67). Springer.
Borko, H., & Shavelson, R. J. (1990). Teacher decision making. In B. F. Jones, & L. Idol (Eds.), Dimensions of thinking and cognitive instruction (pp. 311–346). Lawrence Erlbaum Associates Publishers.
Brühwiler, C., Hollenstein, L., Affolter, B., Biedermann, H., & Oser, F. (2017). Welches Wissen ist unterrichtsrelevant? Prädiktive Validität dreier Messinstrumente zur Erfassung des pädagogisch-psychologischen Wissens von Lehrpersonen [What knowledge is relevant to teaching? Predictive validity of three instruments for measuring general pedagogical knowledge of teachers]. Zeitschrift für Bildungsforschung, 7(3), 209–228. https://doi.org/10.1007/s35834-017-0196-1
Carter, K., Cushing, K., Sabers, D., Stein, P., & Berliner, D. (1988). Expert-novice differences in perceiving and processing visual classroom information. Journal of Teacher Education, 39(3), 25–31. https://doi.org/10.1177/002248718803900306
Choy, B. H., & Dindyal, J. (2020). Teacher noticing, mathematics. In M. A. Peters (Ed.), Encyclopedia of teacher education (living). Springer. https://doi.org/10.1007/978-981-13-1179-6_241-1
Christiansen, I. M., & Erixon, E.-L. (2021). Opportunities to learn mathematics pedagogy and learning to teach mathematics in Swedish mathematics teacher education: A survey of student experiences. European Journal of Teacher Education, 1–19. https://doi.org/10.1080/02619768.2021.2019216
Demiraslan Çevik, Y., & Andre, T. (2014). Studying the impact of three different instructional methods on preservice teachers’ decision-making. Research Papers in Education, 29(1), 44–68. https://doi.org/10.1080/02671522.2012.742923
Doyle, W. (1979). Chapter II: Making managerial decisions in classrooms. Teachers College Record, 80(6), 42–74. https://doi.org/10.1177/016146817908000602
Doyle, W. (2006). Ecological approaches to classroom management. In C. M. Evertson, & C. S. Weinstein (Eds.), Handbook of classroom management: Research, practice, and contemporary issues (pp. 97–125). Routledge.
Evertson, C. M., & Weinstein, C. S. (Eds.). (2006). Handbook of classroom management: Research, practice, and contemporary issues. Routledge.
Floden, R. (2002). The measurement of opportunity to learn. In National Research Council (Ed.), Methodological advances in cross-national surveys of education achievement (pp. 231–266). National Academies Press.
Flores, M. A. (2016). Teacher education curriculum. In J. Loughran, & M. L. Hamilton (Eds.), International handbook of teacher education (pp. 187–230). Springer. https://doi.org/10.1007/978-981-10-0366-0_5
Fredrickson, B. L. (2001). The role of positive emotions in positive psychology: The broaden-and-build theory of positive emotions. American Psychologist, 56(3), 218–226. https://doi.org/10.1037/0003-066X.56.3.218
Frenzel, A. C. (2014). Teacher emotions. In R. Pekrun, & L. Linnenbrink-Garcia (Eds.), Educational psychology handbook series. International handbook of emotions in education (pp. 494–518). Routledge/Taylor & Francis Group.
Gaspard, H., & Lauermann, F. (2021). Emotionally and motivationally supportive classrooms: A state-trait analysis of lesson- and classroom-specific variation in teacher- and student-reported teacher enthusiasm and student engagement. Learning and Instruction, 75, Article 101494. https://doi.org/10.1016/j.learninstruc.2021.101494
Gerhard, K., Jäger-Biela, D. J., & König, J. (2023). Opportunities to learn, technological pedagogical knowledge, and personal factors of pre-service teachers: Understanding the link between teacher education program characteristics and student teacher learning outcomes in times of digitalization. Zeitschrift für Erziehungswissenschaft, 26, 653–676. Advance online publication. https://doi.org/10.1007/s11618-023-01162-y
Gippert, C., Hörter, P., Junker, R., & Holodynski, M. (2022). Professional vision of teaching as a focus-specific or focus-integrated skill: Conceptual considerations and video-based assessment. Teaching and Teacher Education, 117, Article 103797. https://doi.org/10.1016/j.tate.2022.103797
Gold, B., & Holodynski, M. (2015). Development and construct validation of a situational judgment test of strategic knowledge of classroom management in elementary schools. Educational Assessment, 20(3), 226–248. https://doi.org/10.1080/10627197.2015.1062087
Gold, B., & Holodynski, M. (2017). Using digital video to measure the professional vision of elementary classroom management: Test validation and methodological challenges. Computers & Education, 107, 13–30. https://doi.org/10.1016/j.compedu.2016.12.012
Gold, B., Pfirrmann, C., & Holodynski, M. (2021). Promoting professional vision of classroom management through different analytic perspectives in video-based learning environments. Journal of Teacher Education, 72(4), 431–447. https://doi.org/10.1177/0022487120963681
Hattie, J. (2012). Visible learning for teachers: Maximizing impact on learning. Routledge.
Helmke, A. (2022). Unterrichtsqualität und Professionalisierung: Diagnostik von Lehr-Lern-Prozessen und evidenzbasierte Unterrichtsentwicklung [Instructional quality and professionalization: Diagnostics of teaching-learning processes and evidence-based development of teaching]. Klett Kallmeyer.
Jacobs, V. R., Lamb, L. L. C., & Philipp, R. A. (2010). Professional noticing of children’s mathematical thinking. Journal for Research in Mathematics Education, 41(2), 169–202. https://doi.org/10.5951/jresematheduc.41.2.0169
Jamil, F. M., Sabol, T. J., Hamre, B. K., & Pianta, R. C. (2015). Assessing teachers’ skills in detecting and identifying effective interactions in the classroom. The Elementary School Journal, 115(3), 407–432. https://doi.org/10.1086/680353
Jones, V. (2006). How do teachers learn to be effective classroom managers? In C. M. Evertson, & C. S. Weinstein (Eds.), Handbook of classroom management: Research, practice, and contemporary issues (pp. 887–907). Routledge.
Kaiser, G., Busse, A., Hoth, J., König, J., & Blömeke, S. (2015). About the complexities of video-based assessments: Theoretical and methodological approaches to overcoming shortcomings of research on teachers’ competence. International Journal of Science and Mathematics Education, 13(2), 369–387. https://doi.org/10.1007/s10763-015-9616-7
Kaiser, G., & König, J. (2019). Competence measurement in (mathematics) teacher education and beyond: Implications for policy. Higher Education Policy, 32(4), 597–615. https://doi.org/10.1057/s41307-019-00139-z
Kasperski, R., & Yariv, E. (2022). Gut reaction or rational problem-solving? Teachers’ considerations when coping with classroom disruptions. Emotional & Behavioural Difficulties, 27(2), 152–162. https://doi.org/10.1080/13632752.2022.2125210
Kersting, N. B. (2008). Using video clips of mathematics classroom instruction as item prompts to measure teachers’ knowledge of teaching mathematics. Educational and Psychological Measurement, 68(5), 845–861. https://doi.org/10.1177/0013164407313369
Kersting, N. B., Givvin, K. B., Sotelo, F. L., & Stigler, J. W. (2010). Teachers’ analyses of classroom video predict student learning of mathematics: Further explorations of a novel measure of teacher knowledge. Journal of Teacher Education, 61(1–2), 172–181. https://doi.org/10.1177/0022487109347875
König, J. (2014). Designing an international instrument to assess teachers’ general pedagogical knowledge (GPK): Review of studies, considerations, and recommendations. In Technical paper prepared for the OECD innovative teaching for effective learning (ITEL) - phase II project: A survey to profile the pedagogical knowledge in the teaching profession (ITEL teacher knowledge survey). OECD. https://one.oecd.org/document/EDU/CERI/CD/RD(2014)3/REV1/en/pdf
König, J. (2015). Measuring classroom management expertise (CME) of teachers: A video-based assessment approach and statistical results. Cogent Education, 2(1), Article 991178. https://doi.org/10.1080/2331186X.2014.991178
König, J., & Blömeke, S. (2010). Pädagogisches Unterrichtswissen (PUW): Dokumentation der Kurzfassung des TEDS-M Testinstruments zur Kompetenzmessung in der ersten Phase der Lehrerausbildung [General pedagogical knowledge: Documentation of the short version of the TEDS-M test instrument for competence measurement in the first phase of teacher education]. Berlin: Humboldt-Universität.
König, J., Blömeke, S., Jentsch, A., Schlesinger, L., née Nehls, C. F., Musekamp, F., et al. (2021). The links between pedagogical competence, instructional quality, and mathematics achievement in the lower secondary classroom. Educational Studies in Mathematics, 107(1), 189–212. https://doi.org/10.1007/s10649-020-10021-0
König, J., Blömeke, S., Klein, P., Suhl, U., Busse, A., & Kaiser, G. (2014). Is teachers’ general pedagogical knowledge a premise for noticing and interpreting classroom situations? A video-based assessment approach. Teaching and Teacher Education, 38, 76–88. https://doi.org/10.1016/j.tate.2013.11.004
König, J., Blömeke, S., Paine, L., Schmidt, W. H., & Hsieh, F.-J. (2011). General pedagogical knowledge of future middle school teachers: On the complex ecology of teacher education in the United States, Germany, and Taiwan. Journal of Teacher Education, 62(2), 188–201. https://doi.org/10.1177/0022487110388664
König, J., Ligtvoet, R., Klemenz, S., & Rothland, M. (2017). Effects of opportunities to learn in teacher preparation on future teachers’ general pedagogical knowledge: Analyzing program characteristics and outcomes. Studies in Educational Evaluation, 53, 122–133. https://doi.org/10.1016/j.stueduc.2017.03.001
König, J., Santagata, R., Scheiner, T., Adleff, A.-K., Yang, X., & Kaiser, G. (2022). Teacher noticing: A systematic literature review of conceptualizations, research designs, and findings on learning to notice. Educational Research Review, 36, Article 100453. https://doi.org/10.1016/j.edurev.2022.100453
König, J., & Seifert, A. (Eds.). (2012). Lehramtsstudierende erwerben pädagogisches Professionswissen: Ergebnisse der Längsschnittstudie LEK zur Wirksamkeit der erziehungswissenschaftlichen Lehrerausbildung. Waxmann.
König, J., Tachtsoglou, S., Darge, K., & Lünnemann, M. (2014). Zur Nutzung von Praxis: Modellierung und Validierung lernprozessbezogener Tätigkeiten von angehenden Lehrkräften im Rahmen ihrer schulpraktischen Ausbildung [On the use of practice: Modeling and validating future teachers’ learning process related activities during in-school opportunities to learn]. Zeitschrift für Bildungsforschung, 4(1), 3–22. https://doi.org/10.1007/s35834-013-0084-2
Krauss, S., Bruckmaier, G., Lindl, A., Hilbert, S., Binder, K., Steib, N., et al. (2020). Competence as a continuum in the COACTIV study: The “cascade model”. ZDM-Mathematics Education, 52(2), 311–327. https://doi.org/10.1007/s11858-020-01151-z
Kunter, M. (2013). Motivation as an aspect of professional competence: Research findings on teacher enthusiasm. In M. Kunter, J. Baumert, W. Blum, U. Klusmann, S. Krauss, & M. Neubrand (Eds.), Cognitive activation in the mathematics classroom and


professional competence of teachers (pp. 273–289). Springer. https://doi.org/10.1007/978-1-4614-5149-5_13
Kunter, M., Baumert, J., & Köller, O. (2007). Effective classroom management and the development of subject-related interest. Learning and Instruction, 17(5), 494–509. https://doi.org/10.1016/j.learninstruc.2007.09.002
Kunter, M., Frenzel, A., Nagy, G., Baumert, J., & Pekrun, R. (2011). Teacher enthusiasm: Dimensionality and context specificity. Contemporary Educational Psychology, 36(4), 289–301. https://doi.org/10.1016/j.cedpsych.2011.07.001
Kunter, M., Kleickmann, T., Klusmann, U., & Richter, D. (2013). The development of teachers’ professional competence. In M. Kunter, J. Baumert, W. Blum, U. Klusmann, S. Krauss, & M. Neubrand (Eds.), Cognitive activation in the mathematics classroom and professional competence of teachers (pp. 63–78). Springer.
Lampert, J., Burnett, B., Comber, B., Ferguson, A., & Barnes, N. (2020). ‘It’s not about punitive’: Exploring how early-career teachers in high-poverty schools respond to critical incidents. Critical Studies in Education, 61(2), 149–165. https://doi.org/10.1080/17508487.2017.1385500
Maulana, R., Helms-Lorenz, M., & van de Grift, W. (2016). Validating a model of effective teaching behaviour of pre-service teachers. Teachers and Teaching, 23(4), 1–23. https://doi.org/10.1080/13540602.2016.1211102
Mertens, S., & Gräsel, C. (2018). Entwicklungsbereiche bildungswissenschaftlicher Kompetenzen von Lehramtsstudierenden im Praxissemester [The development of educational competences during long-term internships in teacher education]. Zeitschrift für Erziehungswissenschaft, 21(6), 1109–1133. https://doi.org/10.1007/s11618-018-0825-z
Meschede, N., Fiebranz, A., Möller, K., & Steffensky, M. (2017). Teachers’ professional vision, pedagogical content knowledge and beliefs: On its relation and differences between pre-service and in-service teachers. Teaching and Teacher Education, 66, 158–170. https://doi.org/10.1016/j.tate.2017.04.010
Müller, M. M., & Gold, B. (2023). Videobasierte Erfassung wissensbasierten Verarbeitens als Teilprozess der professionellen Unterrichtswahrnehmung – Analyse eines geschlossenen und offenen Verfahrens [Video-based measurements of knowledge-based reasoning as a process of professional vision – analysis of a closed and open task format]. Zeitschrift für Erziehungswissenschaft, 26(1), 7–29. https://doi.org/10.1007/s11618-022-01128-6
Muthén, B. O., & Muthén, L. K. (1998–2006). Mplus (Version 4.2) [Computer software]. Los Angeles, CA.
O’Neill, S., & Stephenson, J. (2012). Does classroom management coursework influence pre-service teachers’ perceived preparedness or confidence? Teaching and Teacher Education, 28(8), 1131–1143. https://doi.org/10.1016/j.tate.2012.06.008
Sax, L. J., Gilmartin, S. K., & Bryant, A. N. (2003). Assessing response rates and nonresponse bias in web and paper surveys. Research in Higher Education, 44(4), 409–432. https://doi.org/10.1023/A:1024232915870
Schmidt, W. H., Cogan, L., & Houang, R. (2011). The role of opportunity to learn in teacher preparation: An international context. Journal of Teacher Education, 62(2), 138–153. https://doi.org/10.1177/0022487110391987
Schmidt, W. H., Houang, R. T., Cogan, L., Blömeke, S., Tatto, M. T., Hsieh, F. J., et al. (2008). Opportunity to learn in the preparation of mathematics teachers: Its structure and how it varies across six countries. ZDM-Mathematics Education, 40(5), 735–747. https://doi.org/10.1007/s11858-008-0115-y
Seidel, T., & Stürmer, K. (2014). Modeling and measuring the structure of professional vision in preservice teachers. American Educational Research Journal, 51(4), 739–771. https://doi.org/10.3102/0002831214531321
Shadish, W. R., Cook, T. D., & Campbell, D. T. (2002). Experimental and quasi-experimental designs for generalized causal inference. Houghton Mifflin.
Shavelson, R. J., & Stern, P. (1981). Research on teachers’ pedagogical thoughts, judgments, decisions, and behavior. Review of Educational Research, 51(4), 455–498.
Stahnke, R., & Blömeke, S. (2021). Novice and expert teachers’ situation-specific skills regarding classroom management: What do they perceive, interpret and suggest? Teaching and Teacher Education, 98, Article 103243. https://doi.org/10.1016/j.tate.2020.103243
Stürmer, K., Könings, K. D., & Seidel, T. (2015). Factors within university-based teacher education relating to preservice teachers’ professional vision. Vocations and Learning, 8(1), 35–54. https://doi.org/10.1007/s12186-014-9122-z
Stürmer, K., Seidel, T., & Schäfer, S. (2013). Changes in professional vision in the context of practice. Gruppendynamik und Organisationsberatung, 44(3), 339–355. https://doi.org/10.1007/s11612-013-0216-0
Tatto, M. T., Peck, R., Schwille, J., Bankov, K., Senk, S. L., Rodriguez, M., et al. (2012). Policy, practice, and readiness to teach primary and secondary mathematics in 17 countries: Findings from the IEA teacher education and development study in mathematics (TEDS-M). ERIC.
Todorova, M., Sunder, C., Steffensky, M., & Möller, K. (2017). Pre-service teachers’ professional vision of instructional support in primary science classes: How content-specific is this skill and which learning opportunities in initial teacher education are relevant for its acquisition? Teaching and Teacher Education, 68, 275–288. https://doi.org/10.1016/j.tate.2017.08.016
Voss, T., Kunter, M., & Baumert, J. (2011). Assessing teacher candidates’ general pedagogical/psychological knowledge: Test construction and validation. Journal of Educational Psychology, 103(4), 952–969. https://doi.org/10.1037/a0025125
Wang, M. C., Haertel, G. D., & Walberg, H. J. (1993). Toward a knowledge base for school learning. Review of Educational Research, 63(3), 249–294. https://doi.org/10.3102/00346543063003249
Weber, K. E., Gold, B., Prilop, C. N., & Kleinknecht, M. (2018). Promoting pre-service teachers’ professional vision of classroom management during practical school training: Effects of a structured online- and video-based self-reflection and feedback intervention. Teaching and Teacher Education, 76, 39–49. https://doi.org/10.1016/j.tate.2018.08.008
Weyers, J., König, J., Santagata, R., Scheiner, T., & Kaiser, G. (2023). Measuring teacher noticing: A scoping review of standardized instruments. Teaching and Teacher Education, 122, Article 103970. https://doi.org/10.1016/j.tate.2022.103970
Wiens, P. D., Hessberg, K., LoCasale-Crouch, J., & DeCoster, J. (2013). Using a standardized video-based assessment in a university teacher education program to examine preservice teachers knowledge related to effective teaching. Teaching and Teacher Education, 33, 24–33. https://doi.org/10.1016/j.tate.2013.01.010
Wolff, C. E., Jarodzka, H., & Boshuizen, H. P. (2017). See and tell: Differences between expert and novice teachers’ interpretations of problematic classroom management events. Teaching and Teacher Education, 66, 295–308. https://doi.org/10.1016/j.tate.2017.04.015
Wolff, C. E., van den Bogert, N., Jarodzka, H., & Boshuizen, H. P. A. (2015). Keeping an eye on learning: Differences between expert and novice teachers’ representations of classroom management events. Journal of Teacher Education, 66(1), 68–85. https://doi.org/10.1177/0022487114549810
Wu, M. L., Adams, R. J., & Wilson, M. R. (1997). ConQuest [Computer software]. Camberwell, Australia: Australian Council for Educational Research.
