
J Pers Eval Educ (2007) 20:165–184

DOI 10.1007/s11092-008-9053-z

What is the Relationship Between Teacher Quality and Student Achievement? An Exploratory Study

James H. Stronge & Thomas J. Ward & Pamela D. Tucker & Jennifer L. Hindman

Received: 19 December 2007 / Accepted: 25 January 2008 / Published online: 13 February 2008
© Springer Science + Business Media, LLC 2008

Abstract The major purpose of the study was to examine what constitutes effective
teaching as defined by measured increases in student learning with a focus on the
instructional behaviors and practices. Ordinary least squares (OLS) regression analyses
and hierarchical linear modeling (HLM) were used to identify teacher effectiveness
levels while controlling for student-level and class/school-level variables. Actual
achievement of 1936 third grade students in 85 classrooms on the Virginia Standards of
Learning (SOL) Assessment results in English, Mathematics, Social Studies, and
Science were compared to expected achievement resulting in an indicator of teacher
effectiveness. Based on student learning gains, teachers were divided into quartiles. The
statistical modeling approach facilitated comparisons of outcomes that were free of
influences of identified extraneous variables. A double blind design was selected for in-
depth cross-case studies with teachers from the highest quartile representing highly
effective teachers (N=5) and the lowest quartile the less effective teachers (N=6). The
observation team assessed the third grade teachers (N=11) based on 20 categories within
four domains: instruction, student assessment, classroom management, and personal
qualities. Key findings indicate that effective teachers scored higher across the four
domains. Additionally, effective teachers tended to ask a greater number of higher
level (e.g., analysis) questions and had fewer incidences of off-task behavior than
ineffective teachers. The exploratory study identified instructional behaviors and
practices of teachers that result in higher student learning gains.

J. H. Stronge (*)
School of Education, The College of William and Mary, P.O. Box 8795, Williamsburg,
VA 23187-8795, USA
e-mail: jhstro@wm.edu

T. J. Ward
The College of William and Mary, Williamsburg, VA, USA

P. D. Tucker
University of Virginia, Charlottesville, VA, USA

J. L. Hindman
Teacher Quality Resources, LLC, Williamsburg, VA, USA

Keywords Teacher quality · Teacher effectiveness · Ineffective teacher · Effective teacher · Student achievement · Questioning · Student learning gains

In recent years, research has focused on the value-added connection between
teaching and learning, with leading examples of this assessment process including
the Tennessee Value-Added Assessment System and the Dallas Independent Public
Schools (see, for example, Mendro 1998; Nye et al. 2004; Wright et al. 1997).
Analysis of data from these and other programs offers dramatic evidence regarding
the influence of the classroom teacher on student learning (Stronge and Tucker 2000;
Tucker and Stronge 2005; Wenglinsky 2002). Thus, we have seen in recent years the
emergence of a new approach to answering an age-old question: What is the value-
added impact of teachers on student learning?
The purpose of the study reported here was to examine what constitutes effective
teaching as defined by measured increases in student learning. Specifically, what are
the instructional practices of teachers who facilitate high growth in student
achievement measures? In an effort to address this guiding question, we engaged
in the following two steps:
1. Used regression analyses (ordinary least squares and hierarchical linear
modeling) identifying teacher effectiveness as measured by student learning
gains while controlling for both student-level and class/school-level concomitant
variables; and
2. Identified behaviors and practices distinguishing top quartile versus bottom
quartile teachers (i.e., teachers who effected higher versus lower than predicted
gains in student learning).

1 Background

1.1 Demand for Accountability

The current demand for educational accountability has been building and
crystallizing since the post-Sputnik period, when reforms based on “excellence”
and “accountability” emerged due in part to the Elementary and Secondary
Education Act (ESEA) of 1965. This predecessor of today’s No Child Left Behind
Act (NCLB) was intended to increase quality and equity by emphasizing an
accountability component that required evidence of effectiveness for Title I
programs (Sacks 1999). In subsequent decades, we have experienced wave after
wave of educational reform efforts, most notably those advocated in A Nation at Risk
in 1983 (National Commission on Excellence in Education) which “solidified the
accountability trends of the 1960s and 1970s” (Heinecke et al. 2003, p. 22) and
galvanized the national agenda of high standards. The last 40 years of reform efforts
have focused primarily on the development of curriculum standards, assessments to
measure student achievement, and school level reporting mechanisms to publicly
explain results. Most recently, reauthorization of the Elementary and Secondary
Education Act, better known as the No Child Left Behind Act, is intended to tie
federal education funding directly to improvements in student test scores.

Unfortunately, much of the foregoing policy discussion has overlooked the most
fundamental unit of change—the classroom—and the primary catalyst for improvement
in our schools—the teacher. In recent years, there has been a renewed interest in the role
of the teacher as the key to school improvement (Darling-Hammond and Youngs 2002).
In fact, the 2001 NCLB legislation (34 CFR Part 200: Title I–Improving the Academic
Achievement of the Disadvantaged; Final Rule) codifies the emphasis of having a
highly qualified teacher in every classroom. Although the various states appear to be
operationalizing a “highly qualified teacher” under NCLB in relatively simplistic
licensure terms, at least the focus for reform has moved from state-, district-, or even
school-level reforms to the classroom. To a large extent, this transition is grounded in
the realization that any significant improvement in schools and in student learning
must have the teacher as a centerpiece (see, for example, Darling-Hammond 1997;
Mendro 1998; Stronge and Tucker 2000; Tucker and Stronge 2005).
Basic teacher qualifications, as stipulated under NCLB, are certainly an important
starting point in acknowledging the critical role of teachers in student learning. In the
following section, however, we move beyond issues of preparation and
qualifications for teaching to ones of teacher competence and effectiveness.

1.2 Relationship Between Teacher Effectiveness and Student Achievement

Over the past few decades, numerous studies have focused on defining the
characteristics of effective schools and teachers. Contemporary research has focused on the
value-added connection between teaching and learning, with leading examples of this
assessment process including the Tennessee Value-Added Assessment System and the
Dallas Independent Public Schools. Analysis of data from these and other programs
offers dramatic evidence regarding the influence of the classroom teacher on student
learning (Mendro 1998; Nye et al. 2004; Wright et al. 1997).
A growing body of work critiques the Tennessee Value-Added Assessment
System research (see, for example, Kupermintz 2002). Nonetheless, the evidence
from multiple studies seems to confirm the efficacy of value-added approaches for
assessing teacher quality. In a review of studies that utilize value-added modeling to
explain teacher effects on student achievement, McCaffrey et al. (2003) concluded
that while the value-added approach has limitations it nonetheless should be an
alternative in examining teacher quality. They stated, “…given the current state of
knowledge about VAM [value-added modeling] we expect that some efforts to
estimate teacher effects could provide useful information on teachers” (p. 114).
The over-arching finding from value-added studies is that effective teachers are,
indeed, essential for student success. For example, Wright et al. (1997) found there is
evidence that lower-achieving students are more likely to be placed with less
effective teachers. Thus, the neediest students are being instructed by the least
capable teachers. Using a multi-year database, Sanders and colleagues found that
when children, beginning in third grade, were placed with three high performing
teachers in a row, they scored, on average, at the 96th percentile on Tennessee’s
statewide mathematics assessment at the end of fifth grade. When children with
comparable achievement histories starting in third grade were placed with three low
performing teachers in a row, their average score on the same mathematics
assessment was at the 44th percentile, yielding a 52-percentile point difference.
They claimed that “the immediate and clear implication of this finding is that
seemingly more can be done to improve education by improving the effectiveness of
teachers than by any other single factor” (Wright et al. 1997, p. 63).
A more recent study based in Tennessee supports the conclusions of Wright et al. (1997)
regarding the magnitude of teacher effects. In a randomized experiment in which
students and teachers were randomly assigned to classes from Kindergarten through
grade 3, Nye et al. (2004) concluded that “the results of this study support the idea
that there are substantial differences among teachers in the ability to produce
achievement gains in their students” (p. 253). Students of less effective teachers
experienced reading achievement gains one third of a standard deviation smaller than
those of students with effective teachers. In mathematics the difference was slightly
less than one half of a standard deviation.
In addition, data from the Dallas Independent Public Schools revealed that there is
a powerful residual effect on student learning based on the quality of the teacher. If a
student has a high performing teacher for just 1 year, the student will remain ahead
of peers for at least the next few years of schooling. Unfortunately, if a student has
an ineffective teacher, the influence on student achievement is not remediated fully
for up to 3 years (Mendro 1998).
A study of third grade teachers in an urban Virginia school district found that
students of teachers in the top quartile of effectiveness (based on hierarchical linear
modeling predictions) scored approximately 30–40 scale score points higher than
expected on the Virginia Standards of Learning state assessment in English,
Mathematics, Science, and Social Studies, respectively. Students of teachers in the
bottom quartile of effectiveness scored approximately 24–32 points below expected
scores. (The range for scale scores on these assessments is 200 to 600). These
differences occurred after controlling for multiple demographic variables (e.g.,
gender, ethnicity, free/reduced lunch, special education status, ESL status, and days
absent), students’ prior achievement, and class-level differences (e.g., class size,
free/reduced lunch percentages; Stronge and Ward 2002).

1.3 Qualities of Effective Teachers

Dimensions that characterize teacher effectiveness synthesized from a meta-review
of extant research (Stronge 2002, 2007) were used as the conceptual framework for
this study. From this review, the qualities of effective teachers were divided into the
dimensions of instructional expertise, student assessment, learning environment, and
personal qualities of the teacher (Table 1). Each of these dimensions focuses on a
fundamental aspect of the teacher’s professional qualifications or responsibilities and
is summarized below.1

1 Due to the extensive nature of the extant research related to qualities of effective teachers, it is not
feasible to provide a comprehensive review in this manuscript. Thus, the manuscript provides only a
summary table depicting prominent research related to key teacher qualities. For more in-depth coverage
of teacher qualities, see Stronge 2002, 2007, and similar reviews.

Table 1 Summary of teacher effectiveness dimensions and related research

Dimensions of teacher effectiveness   Representative research base

Instruction
  Focus on instruction                Allington 2002; Darling-Hammond 2000; Johnson 1997; Wenglinsky 2000
  Expectations for achievement        Peart and Campbell 1999; Wenglinsky 2002
  Planning for instruction            Good and Brophy 1997; Jay 2002; Shellard and Protheroe 2000
  Range of strategies                 Pressley et al. 2004; Walsh and Sattes 2005; Weiss et al. 2003
  Questioning                         Eisner 2003/2004; Peart and Campbell 1999; Sternberg 2003; Zahorik et al. 2003
  Student engagement                  Cawelti 2004; Walsh and Sattes 2005; Wenglinsky 2002
  Homework                            Allington 2002; Berliner 1986; Cawelti 2004; Cotton 2000; Johnson 1997
Student assessment
  Monitor student progress            Cotton 2000; Foegen et al. 2007; Janisch and Johnson 2003; Yesseldyke and Bolt 2007
  Differentiation                     Shellard and Protheroe 2000; Tomlinson 1999, 2003; VanTassel-Baska 2005
Learning environment
  Classroom management                Johnson 1997; Marzano et al. 2003; Pressley et al. 2004; Wang et al. 1993
  Organization                        McLeod et al. 2003; Zahorik et al. 2003
  Behavioral expectations             Good and Brophy 1997; Hamre and Pianta 2005; Marzano 2003; Pressley et al. 2004
Personal qualities
  Caring                              Boyle-Baise 2005; Collinson et al. 1999
  Fairness and respect                McBer 2000; Peart and Campbell 1999
  Interactions with students          Corbett and Wilson 2002; Cruickshank and Haefele 2001; Darling-Hammond 2001; Peart and Campbell 1999
  Enthusiasm and motivation           Rowan et al. 1997; Quek 2005
  Attitude toward teaching            Hamre and Pianta 2005; Southeast Center for Teaching Quality 2003
  Reflective practice                 Cruickshank and Haefele 2001; Good and Brophy 1997

2 Method

2.1 Part I: Identification of Effective Teachers

2.1.1 Setting and Target Population

The data for the current study were collected from third grade students and teachers
in a moderately sized urban school district located in Virginia. The school district
has 36 schools, a student population of approximately 23,000, and a teacher
population of nearly 1,500. The student population is predominantly 60% African-
American and 35% white; approximately 2% of the student population receives ESL
services. The sample selected for this study consisted of the third grade regular
classroom teachers and students in the school district. Data for 1936 students and 85
classrooms were used for the analyses.

2.1.2 Identifying Teacher Effectiveness

The methodology used for determining teacher effectiveness for the study relied on
the assumption that effective teachers are those who foster achievement gains
beyond that expected from the student’s past achievement. This methodology is
similar to other value-added systems that have been in use for some time (Mendro et
al. 1994). The methodology employed was both ordinary least squares (OLS) and
hierarchical linear modeling (HLM). Control variables were used at both the
individual and classroom levels as previous research (Mendro et al. 1994) has shown
that effectiveness estimates can be biased if individual and classroom level
background influences are not controlled for. Extant research also has shown that
multiple models of the data need to be estimated and examined for fit (Webster et al.
1998). Simple one-stage OLS, two-stage OLS, and two-stage, two-level HLM
models have been found to be the best fitting in previous applications and were
employed in the current study. The tested models and predictors for fitting student
achievement are described in Table 2. The HLM analysis was conducted using HLM
6 (Raudenbush et al. 2005). For the HLM analysis, grand-mean centering was utilized.
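
For readers less familiar with the two-level structure, a generic random-intercept model with grand-mean-centered student predictors can be written as below. This is a textbook form offered only as a sketch: the notation (Y, X, W, and the coefficient symbols) is assumed for illustration, and the source does not report the exact specification that was fitted.

```latex
\begin{align}
Y_{ij} &= \beta_{0j} + \sum_{k=1}^{K} \beta_{k}\,\bigl(X_{kij} - \bar{X}_{k}\bigr) + r_{ij},
  & r_{ij} &\sim N(0, \sigma^{2}) \\
\beta_{0j} &= \gamma_{00} + \sum_{m=1}^{M} \gamma_{0m}\, W_{mj} + u_{0j},
  & u_{0j} &\sim N(0, \tau_{00})
\end{align}
```

Here $Y_{ij}$ stands for the SOL score of student $i$ in classroom $j$, $X_{kij}$ for a student-level predictor with grand mean $\bar{X}_{k}$, and $W_{mj}$ for a classroom-level predictor. Under grand-mean centering, the intercept $\beta_{0j}$ can be read as the adjusted mean for classroom $j$.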

Table 2 Teacher effectiveness identification models employed in the study

Basic OLS regression
  1st stage predictors: Gender; Age; Free/reduced lunch status; Race; Days absent; School mobility; English proficiency status; Degrees of reading proficiency grade 1; Degrees of reading proficiency grade 2
  2nd stage predictors: None

Two-stage OLS regression
  1st stage predictors, Block 1: Gender; Age; Free/reduced lunch status; Race; Days absent; School mobility
  1st stage predictors, Block 2: Degrees of reading proficiency grade 1; Degrees of reading proficiency grade 2
  2nd stage predictors: Class size; Percent free/reduced lunch; Percent male; Percent minority; Percent ESL

Two-stage, two-level HLM
  1st stage predictors (student level): Gender; Age; Free/reduced lunch status; Race; Days absent; School mobility; English proficiency status; Degrees of reading proficiency grade 1; Degrees of reading proficiency grade 2
  2nd stage predictors (classroom level): Class size; Percent free/reduced lunch; Percent male; Percent minority; Percent ESL; Interactions of dichotomous variables

The target variables in each case were the third grade results on Virginia’s high stakes
student assessment, Standards of Learning (SOL), in English, Mathematics, Social
Studies, and Science. It should be noted that third grade teachers in the school district
were in self-contained settings and, consequently, each teacher was primarily responsible
for teaching all four subject areas to the students assigned to her/his class. The purposes
of the state’s standards-based assessments at selected grades and high school subjects are
to inform parents and teachers about what students are learning in relation to the SOL
and to hold schools accountable for teaching the SOL content (Hambleton et al. 2000).
Selection of the specific statistical models was based on examination of statistical
fit. In each instance, two-stage OLS models were determined to provide a sufficient
model.2 Table 3 presents the results of the selected model for each of the dependent
variables. Age, race, number of days absent, and previous achievement (second
grade measure of reading ability) were the consistent predictors across the analyses.
Gender, class size, and percent receiving free or reduced lunch were other predictors
that appeared in at least one of the analyses. The identified OLS models were used to
establish the achievement expectations for each student.
Actual achievement was then compared to expected achievement estimates from the
selected OLS equation. In these analyses, positive differences indicated student
achievement beyond expectation, zero differences indicated achievement commensurate
with expectation, and negative differences indicated achievement below expectation.
The difference scores of the students were standardized, aggregated, and averaged to
develop a composite for each teacher. Consistent with previous research, a minimum of
ten student cases per teacher was set as the floor value for establishing a teacher
composite (Mendro 1998). Teacher composites were then corrected for class size.
Analysis of the distribution of teacher composites allowed the identification of the most
effective and least effective teachers, based on comparisons of student achievement
scores after controlling for the various factors noted above. Figures 1, 2, 3 and 4
illustrate the distribution of teacher residuals for the four subject areas examined.
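
The composite procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the input layout of (teacher, actual score, expected score) tuples, and the inclusive quartile method are all hypothetical choices. The class-size correction is omitted because the text does not say how it was computed.

```python
from collections import defaultdict
from statistics import mean, pstdev, quantiles

MIN_STUDENTS = 10  # floor value stated in the text


def teacher_composites(records, min_students=MIN_STUDENTS):
    """Average standardized (actual - expected) differences per teacher.

    records: iterable of (teacher_id, actual_score, expected_score).
    Teachers with fewer than min_students cases receive no composite.
    """
    records = list(records)
    diffs = [actual - expected for _, actual, expected in records]
    mu, sigma = mean(diffs), pstdev(diffs)  # standardize across all students
    by_teacher = defaultdict(list)
    for (teacher, _, _), diff in zip(records, diffs):
        by_teacher[teacher].append((diff - mu) / sigma)
    return {t: mean(z) for t, z in by_teacher.items() if len(z) >= min_students}


def quartile_labels(composites):
    """Label each teacher as top quartile, bottom quartile, or middle."""
    # Assumption: inclusive quartile cut points; the source does not specify.
    q1, _, q3 = quantiles(composites.values(), n=4, method="inclusive")
    return {
        t: "top" if c >= q3 else "bottom" if c <= q1 else "middle"
        for t, c in composites.items()
    }
```

A positive composite corresponds to achievement beyond expectation and a negative composite to achievement below expectation, matching the interpretation of the difference scores given above.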

2.2 Part II: Comparative Analysis of Effective and Less Effective Teachers

2.2.1 Sample

Part II of the study involved an examination of the instructional practices of teachers
who effected higher than predicted gains in student learning and those who effected
lower than predicted gains in student learning as measured by the SOL assessments.
In order to explore the phenomenon of effective teaching, exploratory cross-case
analyses were used. The results of part I were used to identify third-grade teachers
for in-depth case studies from among the highest and lowest quartiles based on their
student academic growth composite. Consequently, five of the 24 teachers from the
top quartile and six of the 21 teachers from the lowest quartile were selected.3 Given

2 Since the Basic OLS and HLM models were not utilized, the statistical results for those analyses are not
presented here.
3 Due to the extensive time and cost involved in conducting case study research, a small sample was
selected (N=11). Thus, caution should be exercised in interpreting the results of this study due to the small
sample size.

Table 3 Results of regression analyses

Variable         Model           R²     Significant predictors
English          Two-stage OLS   0.74   Gender; Age; Race; Days absent; Degrees of reading proficiency grade 2; Class size
Mathematics      Two-stage OLS   0.69   Age; Race; Days absent; Degrees of reading proficiency grade 2; Class size
Social studies   Two-stage OLS   0.70   Gender; Age; Race; Days absent; Degrees of reading proficiency grade 2; Percent free/reduced lunch
Science          Two-stage OLS   0.69   Age; Race; Days absent; Degrees of reading proficiency grade 2; Percent free/reduced lunch

that the Effective Teacher (ET) and Ineffective Teacher (IT) samples were small, we
decided to approach this part of the study as a set of case studies. Therefore, we will
not report statistical comparisons of the two groups in this paper.

2.2.2 Data Analysis Approach

In order to explore the phenomenon of effective teaching, the qualitative approach of
exploratory cross-case analysis was used. Using multiple cases makes it possible to
build a logical chain of evidence (Miles and Huberman 1994; Yin 1994).
Additionally, cross-case analysis allows for analysis of consistencies identified
across the cases (Welker 2004).

2.2.3 Instrumentation

A variety of data collection instruments were developed or adapted for use in this study
to empirically capture selected instructional practices. Specifically, the following
instruments were used: (a) questioning analysis chart, (b) narrative running record, (c)
time-on-task chart, (d) student-teacher interaction analysis, (e) checklist of student
assessment practices, (f) overall time use chart, and (g) teacher interview form.
Following the observation and interview, both observers were asked to complete a
teacher effectiveness behavior scale based on the dimensions identified in Table 1.

Questioning Analysis Chart One observer recorded all instructional questions asked
by the teacher, orally and in writing, for one to two lessons, or the equivalent of a
60-min time period during the 3-h observation. Subsequently, the observer coded
and tallied the questions based on the six levels in Bloom’s taxonomy (1984):
knowledge, comprehension, application, analysis, synthesis, and evaluation.

Fig. 1 Standardized teacher residuals for English

Narrative Running Record This instrument was designed to record and code the type of
classroom activities and interactions, at 5-min intervals, for a 60-min time period during
the 3-h observation. This instrument is based on the Glickman et al. (1998) Teacher Verbal
Behaviors Instrument. Following the actual classroom observation, the audiotapes were
reviewed and verbatim quotes and examples of classroom activities were added to the
record. The teacher’s interactions with the students were categorized as directions/
procedures, monitoring, feedback, management, modifications, and questioning.

Fig. 2 Standardized teacher residuals for mathematics (histogram; x-axis: Residuals, y-axis: Frequency; Mean = .00, Std. Dev = .42, N = 85)

Fig. 3 Standardized teacher residuals for social studies (histogram; x-axis: Residuals, y-axis: Frequency; Mean = -.01, Std. Dev = .52, N = 85)

Time-on-Task Chart This instrument was designed to record student engagement in
the teaching-learning process at 5-min intervals for a 60-min period. Additionally,
comments regarding off-task behaviors and teacher responses were recorded. It is a
modified version of an instrument used in the validity study of the National Board
for Professional Teaching Standards (Bond et al. 2000). Student off-task behaviors
and teacher management of the behavior, both preventive and reactive, were noted.

Fig. 4 Standardized teacher residuals for science (histogram; x-axis: Residuals, y-axis: Frequency; Mean = -.00, Std. Dev = .46, N = 85)

Student–Teacher Interaction Analysis This instrument was based on the Flanders
Interaction Analysis (Flanders 1970) methodology to capture teacher interactions
with students throughout a 60-min interval. Teacher interactions with students were
categorized according to the following: accepts feelings, praises/encourages, accepts
or uses student ideas, asks questions, lectures, gives directions, reprimands or asserts
authority, records student talk response, and notes student talk initiation.

Checklist of Student Assessment Practices Using a checklist of possible types of
student assessments, one observer noted the types of assessments used in the
classroom and made follow-up notes based on information provided during the
teacher interview after the observation.

Overall Time Use Chart This was a simple recording of how time was used in the
classroom during the 3-h visitation. Major activities and the time dedicated to each were
noted. Results of this analysis were used to determine the amount of time focused on
instruction as compared to administrative tasks, transitions, and other non-instructional
activities. The instrument utilized the Stallings Observation System (Stallings 1986)
method of providing a snapshot of activities engaged in by the teacher.

Teacher Interview Form A structured interview protocol that took 20–30 min was
completed following the classroom observation. It was used to solicit information
from the teacher on teaching credentials, professional development, student
assessment strategies, and lesson objectives.

Teacher Effectiveness Behavior Scale After the 3-h observation, the two raters
scored the entire observation using a behavioral summary scale (Bond et al. 2000;
McGreal 1990) of effective teacher behaviors. The scale is based on research of
effective teaching behaviors and is designed to capture both the types of behaviors
and the degree to which the participating classroom teachers exhibited those
behaviors. Four levels of performance—ranging from most effective to least
effective—were defined for each dimension of effective teaching.

2.2.4 Procedures

A double blind design was employed in which the teachers were not informed as to the
reason for their inclusion in the study; additionally the observers who collected
observational data did not know the effective/ineffective identities of the teachers. All
identifying teacher information was coded such that only a single school district
employee knew the identity of teachers in each group. Classroom observations were
conducted with the selected five teachers from the highest quartile and six teachers from
the lowest quartile. Two observers used a variety of data collection strategies during a
3-h classroom visit and a subsequent half hour interview with each selected teacher.
A training session was provided for classroom observers on conducting observations
using the specific instruments developed for this study. The session included an
overview of the study, specific training on the use of each protocol, and instruction on
synthesizing the data for the overall rating of the observation. Observers were given
opportunities to practice using the various observation instruments while viewing
practice videotapes. The practice session continued until observers were able to score the
videotaped performance of the teaching simulations with an 80% or above agreement.
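
The 80% training criterion can be illustrated with a simple percent-agreement calculation. The sketch below assumes exact-match agreement between two observers rating the same items on the four-point scale; the source does not state how agreement was computed, and the function names are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of items on which two raters gave the identical rating."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


def meets_criterion(rater_a, rater_b, threshold=80.0):
    """True when agreement reaches the training threshold (80% in the text)."""
    return percent_agreement(rater_a, rater_b) >= threshold
```

Exact-match agreement is the strictest simple choice; chance-corrected indices such as Cohen's kappa are a common alternative when rating categories are few.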

3 Results

The following sections report the findings from the observational data of “effective”
teachers, those who facilitated higher than expected learning gains for students, and
“ineffective” teachers, those who facilitated lower than expected learning gains.

3.1 Student–Teacher Interactions

During a 1-h segment of a lesson, the observers recorded student–teacher interactions
in three specific domains: indirect, direct, and student talk. There were no noteworthy
differences between the effective and less effective teachers noted in this analysis
(Table 4).
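
As a small worked check, the three domain means reported in Table 4 can be summed to reproduce the total interactive behavior for each group and differenced to show the direction of each gap. The sketch below only transcribes values from Table 4; the variable and function names are illustrative.

```python
# Domain-level means as reported in Table 4.
DOMAIN_MEANS = {
    "Indirect": {"ET": 66.20, "IT": 70.67},
    "Direct": {"ET": 39.20, "IT": 56.33},
    "Student talk": {"ET": 36.80, "IT": 31.17},
}


def total_interactions(group):
    """Sum the three domain means for one group ('ET' or 'IT')."""
    return round(sum(row[group] for row in DOMAIN_MEANS.values()), 2)


def domain_gaps():
    """ET mean minus IT mean per domain."""
    return {d: round(r["ET"] - r["IT"], 2) for d, r in DOMAIN_MEANS.items()}
```

The sums match the totals in the last row of Table 4 (142.20 for ET and 158.17 for IT), and the direct domain shows the largest gap.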

3.2 Teacher Classroom Behaviors

The observation team assessed each of the selected third grade teachers (N=11) based on
20 categories within four specific domains: instruction, student assessment, classroom
management, and personal qualities. The information in Table 5 lists the results of the
observation teams’ rating using a four-point behavioral summary scale rubric.
The observational data were summarized for the two groups and compared.
Because of the small samples representing the two teacher groups, statistical
comparisons are not reported here. Nonetheless, in 18 out of 20 dimensions on which
the teachers were compared, the effective teachers received higher scores. Although
we did not report statistical significance on each analysis due to the exploratory nature
of the study and the small sample sizes, it is worth noting the effective teacher group
did perform higher on two dimensions, instructional differentiation and complexity of
instruction, at a significance level of p<.05. A summary of key findings from the
comparative analysis of teacher classroom behaviors revealed the following:
Instruction:
1. The effective teachers studied provided more complex instruction with a greater
emphasis on meaning versus memorization than those teachers who were
considered ineffective.
2. The effective teachers studied demonstrated a broader range of instructional
strategies, using a variety of materials and media to support the curriculum, than
those teachers who were considered ineffective.
Student Assessment:
1. As a domain, student assessment was found to have noteworthy differences
favoring the effective teachers.

Table 4 Comparative analysis between effective and ineffective teachers regarding types of student–teacher interactions

Description                         Effective teachers (ET) mean   Ineffective teachers (IT) mean   Comparison favors
Indirect                            66.20                          70.67                            IT
  Accepts feelings                  1.40                           1.17                             ET
  Praises/encourages                16.20                          11.67                            ET
  Accepts student ideas             5.80                           3.33                             ET
  Asks questions                    42.80                          54.46                            IT
Direct                              39.20                          56.33                            IT
  Lectures                          4.00                           3.00                             ET
  Gives directions                  26.20                          34.83                            IT
  Reprimands or asserts authority   9.00                           18.50                            IT
Student talk                        36.80                          31.17                            ET
  Response to                       23.60                          21.33                            ET
  Initiation of                     13.20                          9.83                             ET
Total interactive behavior          142.20                         158.17                           IT

2. The effective teachers studied provided more differentiated assignments for
students than did the ineffective teachers.
Learning Environment:
1. The effective teachers studied were more organized than ineffective teachers
with efficient routines and procedures for daily tasks.
2. The behavioral expectations for students of the effective teachers studied were
higher than the expectations of the ineffective teachers.
Personal Qualities:
1. There was a difference between teachers deemed effective and those deemed
ineffective in the overall domain for personal qualities.
2. When compared to the ineffective teachers, the effective teachers studied
demonstrated a higher degree of respect for and fairness toward students.
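
The count of 18 out of 20 dimensions reported at the start of this section can be reproduced from the dimension-level means in Table 5. The sketch below transcribes those means as (ET, IT) pairs; the names `DIMENSIONS` and `favored_counts` are illustrative only.

```python
# (ET mean, IT mean) for each of the 20 dimensions in Table 5.
DIMENSIONS = {
    "Instructional focus": (3.40, 2.67),
    "Achievement expectations": (3.00, 3.17),
    "Planning": (3.20, 2.83),
    "Range of strategies": (3.20, 2.33),
    "Clarity of expectations": (3.20, 2.83),
    "Complexity of instruction": (3.00, 1.83),
    "Questioning": (2.40, 2.00),
    "Student engagement": (3.00, 2.50),
    "Homework": (0.60, 1.67),
    "Monitoring students": (2.80, 2.50),
    "Differentiation": (2.60, 1.17),
    "Management": (3.60, 3.33),
    "Organization": (3.60, 2.83),
    "Behavioral expectations": (3.80, 2.83),
    "Caring": (3.60, 3.00),
    "Fairness and respect": (3.80, 3.17),
    "Interactions with students": (3.40, 2.50),
    "Enthusiasm and motivation": (3.40, 2.67),
    "Dedication to teaching": (2.20, 1.00),
    "Reflective practice": (2.20, 1.50),
}


def favored_counts(dimensions=DIMENSIONS):
    """Return (number of dimensions favoring ET, list of those favoring IT)."""
    favors_it = [name for name, (et, it) in dimensions.items() if it > et]
    favors_et = sum(1 for et, it in dimensions.values() if et > it)
    return favors_et, favors_it
```

The two dimensions on which the ineffective group scored higher are achievement expectations and homework.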

3.3 Teacher Questioning Analysis

During 1 h of the observation period, the total number of questions asked by the
teachers on three levels was tallied: recall questions, comprehension questions, and
higher-order questions (based on Bloom’s Taxonomy). Table 6 illustrates key
findings from this analysis. A comparative analysis of the types of questions asked
by teachers revealed that the effective teachers asked more higher-level questions
(i.e., application, analysis, synthesis, evaluation) than did the ineffective teachers,
approximately seven times as many.
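
As a rough check on the size of this difference, the comparison can be recomputed from the group means reported in Table 6. The sketch below is only illustrative: the dictionary keys and the helper name are assumptions introduced here, and the figures are the published means (which exclude the one teacher omitted as an anomaly), not the underlying observation data. The ratio of the two higher-order means comes out near eight, in the same range as the roughly sevenfold difference described above.

```python
# Mean number of questions per observed hour, copied from Table 6.
# The key names are illustrative labels, not terms defined by the study.
table6 = {
    "effective": {"recall": 48.40, "comprehension": 8.80, "application_and_beyond": 9.80},
    "ineffective": {"recall": 52.00, "comprehension": 26.80, "application_and_beyond": 1.20},
}


def higher_order_share(means):
    """Fraction of all tallied questions that were above the comprehension level."""
    return means["application_and_beyond"] / sum(means.values())


# Ratio of higher-order question means, effective over ineffective teachers.
ratio = (
    table6["effective"]["application_and_beyond"]
    / table6["ineffective"]["application_and_beyond"]
)

# Share of higher-order questions within each group's total.
shares = {group: higher_order_share(means) for group, means in table6.items()}

print(f"higher-order ratio (ET / IT): {ratio:.1f}")
for group, share in shares.items():
    print(f"{group}: {share:.1%} of tallied questions were higher order")
```

Expressed as shares, higher-order questions made up roughly 15% of the effective teachers' tallied questions and under 2% of the ineffective teachers' questions.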

3.4 Student Off-Task Behavior

During a 60-min period, one member of the observation team noted the number of
students who were disengaged or disruptive at 5-min intervals. Table 7 describes the
mean number of students who were disengaged, the mean number who were
disruptive, and the total mean number of students who were off-task (disruptive and/or
disengaged). The results of this analysis revealed that the effective and the
ineffective teachers had essentially the same number of students noted as
disengaged; however, the ineffective teachers’ students exhibited more off-task
behaviors than did those of the effective teachers. Teachers who were considered
ineffective had, on average, almost five disruptive behaviors during the 60-min
observation periods compared to approximately one half of a disruptive event per
60-min period for the effective teachers.

Table 5 Comparative analysis between effective and ineffective teachers on research-based dimensions

Description                         Effective teachers (ET)   Ineffective teachers (IT)   Comparison
                                    Mean                      Mean                        Favors

Instruction                         25.20                     21.83                       ET
  Instructional focus               3.40                      2.67                        ET
  Achievement expectations          3.00                      3.17                        IT
  Planning                          3.20                      2.83                        ET
  Range of strategies               3.20                      2.33                        ET
  Clarity of expectations           3.20                      2.83                        ET
  Complexity of instruction         3.00                      1.83                        ET
  Questioning                       2.40                      2.00                        ET
  Student engagement                3.00                      2.50                        ET
  Homework                          0.60                      1.67                        IT
Student assessment                  5.40                      3.67                        ET
  Monitoring students               2.80                      2.50                        ET
  Differentiation                   2.60                      1.17                        ET
Classroom management                11.00                     9.00                        ET
  Management                        3.60                      3.33                        ET
  Organization                      3.60                      2.83                        ET
  Behavioral expectations           3.80                      2.83                        ET
Personal qualities                  18.60                     13.83                       ET
  Caring                            3.60                      3.00                        ET
  Fairness and respect              3.80                      3.17                        ET
  Interactions with students        3.40                      2.50                        ET
  Enthusiasm and motivation         3.40                      2.67                        ET
  Dedication to teaching            2.20                      1.00                        ET
  Reflective practice               2.20                      1.50                        ET
Overall effectiveness               60.20                     48.33                       ET

Table 6 Questioning analysis of effective and ineffective teachers

Description                         Effective teachers (ET)   Ineffective teachers (IT)   Comparison
                                    Mean                      Mean                        Favors

Recall                              48.40                     52.00                       IT
Comprehension                       8.80                      26.80                       IT
Application and beyond              9.80                      1.20                        ET

A review of the data indicated that one teacher asked a total of 117 questions during the observation
session. This case was considered an anomaly (2.6 questions per minute) and was omitted from the
analysis.

Table 7 Comparative analysis between effective and ineffective teachers regarding student behavior

Description                              Effective teachers (ET)   Ineffective teachers (IT)   Comparison
                                         Mean                      Mean                        Favors

Disengaged students                      7.80                      7.33                        ET
Disruptive students                      0.60                      4.83                        IT
Total students with off-task behaviors   8.40                      12.17                       IT

4 Discussion

4.1 Using Statistical Models to Assess Teacher Effectiveness

The current environment for education is permeated with new calls for accountability
at the student, teacher, and school levels. The NCLB Act calls for more attention
to student gains and effectiveness of teachers. In the current study, we focused on the
identification of teacher outcomes that can be closely linked to accountability.
Fairness and usability are central issues in any system of accountability that would
be proposed for use in educational settings (Webster and Mendro 1997). Part 1 of the
current study employed a statistical methodology that ensures fairness by creating
the teacher composites through the use of statistical controls of concomitant
variables. The use of statistical models allows for the comparison of outcomes that
are free from the influences of identified extraneous variables.
The statistical models tested in Part 1 have the advantage of including measures at
the student and classroom levels. Both OLS and HLM regression models were
tested, and the two-stage OLS regression models were found to provide an adequate
fit. Previous research had recommended the use of two-level HLM (Webster et
al. 1996) but also found that OLS solutions were highly correlated and relatively free
from bias. The testing of multiple models is recommended as an additional safeguard
in the process of identifying effective teachers.
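
The general logic of a residual-based effectiveness index (actual achievement compared with expected achievement) can be sketched in a few lines. The example below is a deliberately simplified illustration and not the study's model: it assumes a single student-level predictor (a hypothetical prior score), uses made-up toy records, and averages the OLS residuals by teacher, whereas the models described above also included class/school-level variables. All names (`records`, `fit_simple_ols`, `teacher_index`) are introduced here for illustration only.

```python
from statistics import mean

# Made-up toy records for illustration only: (teacher_id, prior_score, current_score).
records = [
    ("T1", 400, 430), ("T1", 450, 470), ("T1", 500, 525),
    ("T2", 400, 405), ("T2", 450, 455), ("T2", 500, 500),
    ("T3", 400, 415), ("T3", 450, 465), ("T3", 500, 510),
]


def fit_simple_ols(xs, ys):
    """Return the least-squares (intercept, slope) for a single predictor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Step 1: predict each student's expected score from the student-level covariate.
intercept, slope = fit_simple_ols(
    [prior for _, prior, _ in records],
    [current for _, _, current in records],
)

# Step 2: collect residuals (actual minus expected) and average them by teacher.
residuals_by_teacher = {}
for teacher, prior, current in records:
    expected = intercept + slope * prior
    residuals_by_teacher.setdefault(teacher, []).append(current - expected)

teacher_index = {t: mean(r) for t, r in residuals_by_teacher.items()}

# Rank teachers from highest to lowest mean residual.
ranked = sorted(teacher_index, key=teacher_index.get, reverse=True)
for teacher in ranked:
    print(f"{teacher}: mean residual {teacher_index[teacher]:+.2f}")
```

With a sample as large as the 85 classrooms in Part 1, a ranked list of this kind could then be cut into quartiles; with three toy teachers only the ordering is meaningful. Running the same records through a second model and comparing the rankings is one way to act on the recommendation to test multiple models.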
The advantages to school systems that employ such methods to help identify
effective and less effective teachers include the possible future benefits to teacher
evaluation and teacher development, as well as the ability to demonstrate compliance
with new calls for accountability. Although seemingly at odds (Danielson and
McGreal 2000; Stronge 2006), the purposes of accountability and professional
growth in a teacher evaluation system can be met by examining teacher effects on
student achievement and behaviors of those teachers for whom students experience
higher than expected learning gains.
First, adding teacher effectiveness to a school district’s accountability system
would provide a critical empirical perspective to the multifaceted process of teacher
evaluation. Second, when the data from teacher effectiveness are associated with
professional development opportunities that are structured on the instructional
characteristics and behaviors of effective teachers, the ultimate outcome may be
increased educational success of more students. The improvement orientation of
evaluating teacher effectiveness serves to meet the professional needs of the teacher
and to support reform efforts within a school (Stronge 2006). Logic dictates that if
teaching improves then student achievement will improve as well.

4.2 Characteristics and Behaviors of Effective Teachers

One important finding of this exploratory cross-case analysis is the preliminary
identification of instructional characteristics and behaviors of those teachers who
produced high gains in student learning. In the study, assessments were used that were
closely aligned with the curriculum taught by the teachers, which allowed for a
meaningful interpretation of student learning gains, both greater and lower than
expected. Studies such as this may help us begin to better understand the links between
classroom processes and desirable student outcomes. Moreover, by focusing on the
hallmarks of effective teachers, eventually we may be better equipped to educate
teachers more expertly, to set meaningful performance expectations once teachers are in
classrooms, and to evaluate and reward teachers more fairly. This exploratory study
identified three distinct differences in the practices of those teachers who effected greater
than expected learning gains for students and those who effected lower than expected
learning gains: (1) differentiation and complexity of instructional strategies, (2)
questioning practices, and (3) level of disruptive student behavior. Consequently, the
study reinforces the link between student learning and these teacher behaviors.
• Differentiation and complexity of instruction. The effective teachers in this study
demonstrated that they understood the need to alter the lesson presentation and
materials in order to promote student learning, given that a one-size-fits-all
approach typically is not the best fit.
• Questioning. The effective and less effective teachers asked comparable numbers
of lower-level questions; the distinction between the two groups occurred with
effective teachers asking a far greater number of higher-level questions
(approximately seven times more).
• Disruptive student behavior. Effective teachers in the study had a disruptive
behavior incident about once every 2 h whereas the ineffective teachers in the
case analyses had a disruptive event approximately every 12 min.
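
The intervals quoted for disruptive behavior follow from the Table 7 means by simple division. The sketch below is a back-of-the-envelope conversion under two assumptions stated here rather than in the study: that each disruptive student counted in Table 7 corresponds to one disruptive event, and that events are spread evenly over the 60-min observation. The function and variable names are illustrative.

```python
OBSERVATION_MINUTES = 60

# Mean number of disruptive students per 60-min observation, from Table 7.
disruptive_per_observation = {"effective": 0.60, "ineffective": 4.83}


def minutes_between_events(events, window=OBSERVATION_MINUTES):
    """Average spacing between events, assuming they are spread evenly over the window."""
    return window / events


intervals = {
    group: minutes_between_events(count)
    for group, count in disruptive_per_observation.items()
}

for group, minutes in intervals.items():
    print(f"{group}: about one disruptive event every {minutes:.0f} min")
```

On these assumptions the spacing is about 12 min for the ineffective teachers and about 100 min (a little under 2 h) for the effective teachers.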

4.3 Limitations

Due to the very limited sample size of the cross-case analysis (N=11), the large number
of variables, and the large number of statistical tests, the analyses are presented as
exploratory analyses focused on the trends of the findings rather than as statistical
analyses. Thus, due caution should be exercised in interpreting or generalizing the
results of the study.
Given the promise in these findings, and considering the limitations of the current
study, we recommend that future work continue this line of research. In particular,
studies that could provide for a larger and more representative sample would allow
for more robust statistical analyses to be conducted.

5 Conclusions

Although policy makers periodically have suggested that schools have little impact
on student learning, recent studies indicate that schools and their efforts do make a
difference, and much of that difference can be linked directly to teachers (Darling-
Hammond 2000). Given the clear and undeniable link that exists between teacher
effectiveness and student learning, the use of student achievement information, when
it is curriculum based, can provide an invaluable tool to explore the classroom
practices of teachers who enhance student learning beyond predicted levels of
accomplishment. Student achievement can be, indeed, should be, an important source
of feedback on the effectiveness of schools, administrators, and teachers. The
challenge for educators and policy makers is to make certain that student achievement
is placed in the broader context of what teachers and schools are accomplishing.
Moreover, given the central role that teachers have always played in successful
schools, connecting teacher performance and student performance is a natural extension
of the educational reform agenda. “The purpose of teaching is learning, and the purpose
of schooling is to ensure that each new generation of students accumulates the
knowledge and skills needed to meet the social, political, and economic demands of
adulthood” (McConney et al. 1997, p. 162). Most educators view teaching and learning
as a reciprocal process, an equal partnership, in which teachers and students, alike,
shape the environment and support the learning endeavor through their thoughts and
behaviors. Hence, how we conceptualize teacher effectiveness should reflect a balance
of both the instructional practices of teachers that enhance teaching and
curriculum-based assessments of student learning.

References

Allington, R. L. (2002). What I’ve learned about effective reading instruction. Phi Delta Kappan, 83, 740–747.
Berliner, D. C. (1986). In pursuit of the expert pedagogue. Educational Researcher, 15(7), 5–13.
Berliner, D. C., & Rosenshine, B. V. (1977). The acquisition of knowledge in the classroom. In R. C.
Anderson, R. J. Spiro, & W. E. Montague (Eds.) Schooling and the acquisition of knowledge (pp.
375–396). New Jersey: Lawrence Erlbaum Associates.
Bloom, B. S. (1984). The search for methods of group instruction as effective as one-to-one tutoring.
Educational Leadership, 41(8), 4–17.
Bond, L., Smith, T., Baker, W. K., & Hattie, J. A. (2000). The certification system of the National Board
for Professional Teacher Standards: A construct and consequential validity study. Greensboro, NC:
Center for Educational Research and Evaluation, The University of North Carolina at Greensboro.
Boyle-Baise, M. (2005). Preparing community-oriented teachers: Reflections from a multicultural service-
learning project. Journal of Teacher Education, 56(5), 446–458.
Cawelti, G. (Ed.). (2004). Handbook of research on improving student achievement (2nd ed.). Arlington,
VA: Educational Research Service.
Collinson, V., Killeavy, M., & Stephenson, H. J. (1999). Exemplary teachers: Practicing an ethic of care in
England, Ireland, and the United States. Journal for a Just and Caring Education, 5(4), 349–366.
Corbett, D., & Wilson, B. (2004). What urban students say about good teaching. Educational Leadership,
60(1), 18–22.
Cotton, K. (2000). The schooling practices that matter most. Portland, OR: Northwest Regional Educational
Laboratory, & Alexandria, VA: Association for Supervision and Curriculum Development.
Cruickshank, D. R., & Haefele, D. (2001). Good teachers, plural. Educational Leadership, 58(5), 26–30.
Danielson, C., & McGreal, T. L. (2000). Teacher Evaluation to Enhance Professional Practice. Princeton,
NJ: Educational Testing Service.
Darling-Hammond, L. (1997). Doing what matters most: Investing in quality teaching. New York:
National Commission on Teaching and America’s Future.
Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence.
Education Policy Analysis Archives, 8(1). Retrieved from http://olam.ed.asu.edu/epaa/v8n1.
Darling-Hammond, L. (2001). The challenge of staffing our schools. Educational Leadership, 58(8), 12–17.
Darling-Hammond, L., & Youngs, P. (2002). Defining “highly qualified teachers”: What does
“scientifically based research” actually tell us? Educational Researcher, 31(9), 13–25.
Eisner, E. W. (2003/2004). Preparing for today and tomorrow. Educational Leadership, 61(4), 6–10.
Flanders, N. A. (1970). Analyzing teaching behavior. New York: Addison–Wesley.
Foegen, A., Jiban, C., & Deno, S. (2007). Progress monitoring measures in Mathematics: A review of the
literature. Journal of Special Education, 41(2), 121–139.
Glickman, C. D., Gordon, S. P., & Ross-Gordon, J. (1998). Supervision of instruction: A developmental
approach (4th ed.). Boston: Allyn and Bacon.
Good, T. L., & Brophy, J. E. (1997). Looking in classrooms (7th ed.). New York: Addison–Wesley.
Hambleton, R. K., Crocker, L., Cruse, K., Dodd, B., Plake, B. S., & Poggio, J. (2000). Review of selected
technical characteristics of the Virginia Standards of Learning (SOL) assessments. Richmond, VA:
Virginia Department of Education.
Hamre, B. K., & Pianta, R. C. (2005). Can instructional and emotional support in the first-grade classroom
make a difference for children at risk of school failure? Child Development, 76(5), 949–967.
Heinecke, W. F., Curry-Corcoran, D. E., & Moon, T. R. (2003). U. S. schools and the new standards
accountability initiative. In D. L. Duke, M. Grogan, P. D. Tucker, & W. F. Heinecke (Eds.)
Educational leadership in an age of accountability: The Virginia experience (pp. 7–35). Albany, NY:
SUNY.
Janisch, C., & Johnson, M. (2003). Effective literacy practices and challenging curriculum for at-risk
learners: Great expectations. Journal of Education for Students Placed At-Risk, 8(1), 295.
Jay, J. K. (2002). Points on a continuum: An expert/novice study of pedagogical reasoning. The
Professional Educator, 24(2), 63–74.
Johnson, B. L. (1997). An organizational analysis of multiple perspectives of effective teaching:
Implications for teacher evaluation. Journal of Personnel Evaluation in Education, 11, 69–87.
Kupermintz, H. (2002). Value-added assessment of teachers: The empirical evidence. In School proposals:
Research evidence by A. Molnar (Ed.). Retrieved February 14, 2002 from http://www.asu.edu/educ/
epsl/Reports/epru/EPRU%202002–101/epru-2002–101.htm.
Marzano, R. J. (2003). What works in schools. Alexandria, VA: Association for Supervision and
Curriculum Development.
Marzano, R. J., Marzano, J. S., & Pickering, D. J. (2003). Classroom management that works. Alexandria,
VA: Association for Supervision and Curriculum Development.
McBer, H. (2000). Research into teacher effectiveness: A model of teacher effectiveness. (Research report
#216). Department for Education and Employment: England.
McCaffrey, D. F., Lockwood, J. R., Koretz, D. M., & Hamilton, L. S. (2003). Evaluating value-added
models for teacher accountability. Santa Monica, CA: RAND.
McConney, A. A., Schalock, M. D., & Schalock, H. D. (1997). Indicators of student learning in teacher
evaluation. In J. H. Stronge (Ed.) Evaluating teaching: A guide to current thinking and best practice
(pp. 162–192). Thousand Oaks, CA: Corwin.
McGreal, T. L. (1990). The use of rating scales in teacher evaluation: Concerns and recommendations.
Journal of Personnel Evaluation in Education, 4, 41–58.
McLeod, J., Fisher, J., & Hoover, G. (2003). The key elements of classroom management: Managing time
and space, student behavior, and instructional strategies. Alexandria, VA: Association for
Supervision and Curriculum Development.
Mendro, R. L. (1998). Student achievement and school and teacher accountability. Journal of Personnel
Evaluation in Education, 12, 257–267.
Mendro, R. L., Webster, W. J., Bembry, K., & Orsak, T. H. (1994). An application of hierarchical linear
modeling in determining school effectiveness. Phoenix, Arizona: Rocky Mountain Educational
Research Association.
Miles, M. B., & Huberman, A. M. (1994). Qualitative data analysis: an expanded sourcebook. Thousand
Oaks, CA: Sage.
No Child Left Behind Act of 2001, Pub. L. no. 107–110, 115 Stat. 1425 (codified in 20 USC §6301).
National Commission on Excellence in Education (1983). A nation at risk: The imperative for educational
reform. Washington, DC: US Department of Education.
Nye, B., Konstantopoulos, S., & Hedges, L. V. (2004). How large are teacher effects? Educational
Evaluation and Policy Analysis, 26(3), 237–257.
Panasuk, R., Stone, W., & Todd, J. (2002). Lesson planning strategy for effective mathematics teaching.
Education, 22(2), 714, 808–827.
Peart, N. A., & Campbell, F. A. (1999). At-risk students’ perceptions of teacher effectiveness. Journal for
a Just and Caring Education, 5(3), 269–284.
Pressley, M., Raphael, L., Gallagher, J. D., & DiBella, J. (2004). Providence St. Mel school: How a school
that works for African American students works. Journal of Educational Psychology, 96(2), 216–235.
Quek, C.G. (2005). A national study of scientific talent development in Singapore. Unpublished doctoral
dissertation, The College of William and Mary, Williamsburg, Virginia.
Raudenbush, S., Bryk, A., & Congdon, R. (2005). HLM for Windows. Lincolnwood,
IL: SSI.
Rowan, B., Chiang, F. S., & Miller, R. J. (1997). Using research on employees’ performance to study the
effects of teachers on student achievement. Sociology of Education, 70, 256–284.
Sacks, P. (1999). Standardized minds: The high price of America’s testing culture and what we can do to change
it. Cambridge, MA: Perseus.
Sanders, W. L., & Horn, S. P. (1995). The Tennessee Value-Added Assessment System (TVAAS): Mixed
model methodology in educational assessment. In A. J. Shinkfield, & D. L. Stufflebeam (Eds.)
Teacher evaluation: Guide to effective practice. Boston: Kluwer.
Shellard, E., & Protheroe, N. (2000). Effective teaching: How do we know it when we see it? The informed
educator series. Arlington, VA: Educational Research Services.
Southeast Center for Teaching Quality (2003). How do teachers learn to teach effectively? Quality
indicators from quality schools. Teaching Quality in the Southeast: Best Practices and Policies, 7(2),
1–2.
Stallings, J. (1986). Using time effectively: A self-analytic approach. In K. K. Zumwalt (Ed.) Improving
teaching (pp. 15–27). Alexandria, VA: Association for Supervision and Curriculum Development.
Sternberg, R. J. (2003). What is an expert student? Educational Researcher, 32(8), 5–9.
Stronge, J. H. (2002). Qualities of effective teachers. Alexandria, VA: Association for Supervision and
Curriculum Development.
Stronge, J. H. (2006). Teacher evaluation and school improvement: Improving the educational landscape.
In J. H. Stronge (Ed.) Evaluating teaching: A guide to current thinking and best practice (pp. 1–23,
2nd ed.). Thousand Oaks, CA: Corwin.
Stronge, J. H. (2007). Qualities of effective teachers (2nd ed.). Alexandria, VA: Association for Supervision
and Curriculum Development.
Stronge, J. H., & Tucker, P. D. (2000). Teacher evaluation and student achievement. Washington, DC:
National Education Association.
Stronge, J. H., & Ward, T. J. (2002). Alexandria City public schools teacher effectiveness study. Report for
Alexandria City Public Schools, Alexandria, VA: Authors.
Tobin, K. (1980). The effect of extended teacher wait-time on science achievement. Journal of Research in
Science Teaching, 17, 469–475.
Tomlinson, C. (1999). The differentiated classroom: Responding to the needs of all learners. Alexandria,
VA: Association for Supervision and Curriculum Development.
Tomlinson, C. A. (2003). Differentiation of Instruction in the Early Grades. ERIC Digest. Washington,
DC: ERIC Clearinghouse on Teaching and Teacher Education (ERIC Document Reproduction service
no. ED443572).
Tucker, P. D., & Stronge, J. H. (2005). Linking teacher evaluation and student learning. Alexandria, VA:
Association for Supervision and Curriculum Development.
VanTassel-Baska, J. (2005). Lessons learned from curriculum differentiation, instruction, and assessment.
Presentation at the National Curriculum Network Conference, Williamsburg, VA.
Walsh, J. A., & Sattes, B. D. (2005). Quality questioning: Research-based practice to engage every
learner. Thousand Oaks, CA: Corwin.
Wang, M. C., Haertel, G. D., & Walberg, H. J. (1993). What helps students learn? Educational
Leadership, 51(4), 74–79.
Webster, W., & Mendro, R. (1997). The Dallas Value-Added Accountability System. In J. Millman (Ed.).
Grading teachers, grading schools: Is Student Achievement a Valid Evaluation Measure? Thousand
Oaks, CA: Corwin Press.
Webster, W. J., Mendro, R. L., Orsak, T. H., & Weerasinghe, D. (1996, April). The applicability of
selected regression and hierarchical linear models to the estimation of school and teacher effects.
Paper presented at the annual meeting of the National Council on Measurement in Education, New
York.
Webster, W. J., Mendro, R. L., Orsak, T. H., & Weerasinghe, D. (1998). An application of hierarchical
linear modeling to the estimation of school and teacher effect. Paper presented at the Annual Meeting
of the American Educational Research Association (San Diego, CA, April 13–17, 1998).
Weiss, I. R., Pasley, J. D., Smith, P. S., Banilower, E. R., & Heck, D. J. (2003). A study of k-12
mathematics and science education in the United States. Chapel Hill, NC: Horizon Research.
Welker, G. A. (2004). Patterns of order processing: A study of the formalization of the ordering process in
order-driven manufacturing companies. The Netherlands: University of Groningen. Dissertation.
Retrieved January 11, 2008 from http://dissertations.ub.rug.nl/faculties/management/2004/g.a.welker.
Wenglinsky, H. (2000). How teaching matters: Bringing the classroom back into discussions of teacher
quality. Princeton, NJ: Millikan Family Foundation and Educational Testing Service.
Wenglinsky, H. (2002). How schools matter: The link between teacher classroom practices and student
academic performance. Education Policy Analysis Archives, 10(12). Retrieved February 13, 2007
from http://epaa.asu.edu/epaa/v10n12/.
Wright, S. P., Horn, S. P., & Sanders, W. L. (1997). Teacher and classroom context effects on student
achievement: Implications for teacher evaluation. Journal of Personnel Evaluation in Education, 11,
57–67.
Ysseldyke, J., & Bolt, D. M. (2007). Effect of technology-enhanced continuous progress monitoring on
math achievement. School Psychology Review, 36(3), 453–467.
Yin, R. K. (1994). Case study research. Design and methods, applied social research methods series.
Thousand Oaks, CA: Sage.
Zahorik, J., Halbach, A., Ehrle, K., & Molnar, A. (2003). Teaching practices for smaller classes.
Educational Leadership, 61(1), 75–77.
