ance of classroom teachers. The evaluation of teachers is criticized as perfunctory and haphazard, relying on limited information and subject to the whims of the evaluators. The image of a principal dropping into a classroom, watching from the back of the class for 10 minutes, and then filling out a form rating the teacher’s performance as satisfactory may seem like a caricature, but there’s enough truth in the depiction to cause anyone concerned with teaching and learning to squirm in discomfort. When some school districts hand out top ratings to nearly all of their teachers, even when their evaluations of student achievement aren’t nearly so positive, something is seriously amiss.

The recognition that teacher evaluation is broken has given rise to the view, not necessarily true, that almost any alternative would be better than the status quo. But several key issues must be considered when designing an effective teacher evaluation system.

Design evaluation systems to promote clear purposes. There are many reasons to evaluate teacher performance, and the features of an evaluation system may support some of these better than others. Evaluations can be used to certify teachers as competent (for example, the award of tenure) or to select them for opportunities or rewards (for example, professional development to address weaknesses or a bonus based on excellent performance). Evaluations can also direct the attention of teachers and administrators to what the schools deem important (for example, raising students’ test scores or maintaining order in the classroom). And evaluations, especially when coupled with low- and high-stakes rewards and punishments, might motivate teachers to perform at high levels.

Beyond the consequences of evaluation for individual teachers, there are arguments that teacher evaluation can transform the broader [...] to young adults considering teaching as an occupation, that teacher performance is taken seriously. In this way, teacher evaluation might have longer-term effects on who enters teaching, and on the distribution of good teaching in U.S. schools.

Tensions often exist among the diverse purposes of teacher evaluation, especially when principals have primary responsibility for evaluating teachers. Some systems ask principals to police teachers’ performance through evaluation so that poor performers get drummed out. Others ask principals to support teachers and guide them toward more effective practices. Principals report discomfort with the “cop vs. coach” roles thrust on them by teacher evaluation systems, and a recent study of a teacher evaluation system in Chicago shows that they may manage this discomfort by inflating their evaluations of teacher performance to maintain the trust and support of teachers (Sartain, Stoelinga, and Brown 2009). Thus, the certification and selection functions of teacher evaluation may be weakened to support the motivational and directional functions.

In addition, the number of categories used to describe teacher performance, and the labels associated with them, may differ depending on the evaluation’s purpose. Certifying teachers as competent, for example, might require evaluators simply to differentiate those who meet some threshold for competence from those who do not. In other cases, evaluators might need to distinguish teachers who are highly skilled from those who are competent.

For some purposes, evaluators might need to compare or rank teachers’ performances against one another; for other purposes, they might need to compare a teacher’s performance against an absolute standard. In assessing a teacher’s contribution to student learning, for example, an evaluator might hold samples of student work to an absolute standard, then judge whether each teacher has met the stan-

AARON M. PALLAS

Although the current system of teacher evaluation is an easy target, designing a better system is more complicated than it appears.

R&D appears in each issue of Kappan with the assistance of the Deans’ Alliance, which is composed of the deans of the education schools/colleges at the following universities: Harvard University, Michigan State University, Northwestern University, Stanford University, Teachers College Columbia University, University of California Berkeley, University of California Los Angeles, University of Michigan, University of Pennsylvania, and University of Wisconsin.

AARON M. PALLAS is a professor of sociology and education at Teachers College, Columbia University, New York.
is one reason the National Assessment of Educational Progress is often called the “gold standard” of assessments in relation to NCLB-style state tests. New York, by contrast, has 48 8th-grade mathematics standards, but the state’s 2009 test allowed just seven of those standards to account for 50% of the total points available. If we value the contribution that teachers make to students’ mastery of standards that don’t appear on an assessment, we shouldn’t design evaluation systems that give that mastery no weight. The next generation of assessments will be aligned with the Common Core standards that many states adopted in 2010 and may thus allow evaluators to assess teachers’ contribution to the learning of most standards. But, it will be several years before they appear.

If we are to hold schools accountable for diverse student learning goals, those goals should be reflected in the learning measures used to evaluate teachers.

Recognize that teacher evaluation expresses what we value as good teaching practice. We have made great strides toward establishing what counts as good teaching, but there is still much we don’t know about how to measure it. And though we hope that good teaching will result in substantial learning, an uncomfortable circularity is built into identifying effective teaching practices based on their association with value-added measures of student learning on tests that assess only a small fraction of the learning that we value. Are some practices universally good or bad? How much does good practice depend on the context in which it is observed? The assessment of teachers’ classroom practices will not wait until we have comprehensive and validated measures of good practice. Instead, the state of the art in assessing teachers’ practices is to rely on classroom observation protocols.

Even classroom observation protocols, however, are an expression of which dimensions of teachers’ work are to be valued. Teachers’ work includes planning and classroom assessment skills (among many others), and to rely solely on classroom observation protocols to represent teachers’ practices is to implicitly devalue other dimensions of teachers’ professional practice. If a feature of teachers’ practices gets no weight in the evaluation system, the evalu-

serve as the foundation for classroom observation rubrics give short shrift to the knowledge and practices that might make teachers successful in teaching trigonometry, or chemistry, or Shakespeare (Goe, Bell, and Little 2008). Classroom observation rubrics are often oriented toward the presence or absence of low-inference, observable student behaviors. For example, the recent design standards for teacher evaluation created by The New Teacher Project suggest rating the percentage of students who raise their hands when the teacher poses a question (The New Teacher Project 2010). It’s not evident that effective practice in the teaching of complex subject matter can be measured in ways that don’t rely on complex inferences. For this reason, some districts hire experienced secondary school subject-matter teachers, who presumably have deep content knowledge, to observe secondary classroom teachers.

The good news is that, at least at the elementary level, evidence exists that principals and other observers can be trained to use classroom observation protocols reliably, such that two observers watching the same teaching lesson can generate similar ratings. A key element of this training is having observers watch and rate videotaped lessons for which there is consensus among experts about whether the teachers’ practices are desirable or undesirable, and providing feedback to the observers that allows them to align their ratings to the expert judgments (Sartain, Stoelinga, and Brown 2009).

Synchronize data collection with reasonable beliefs about how quickly teachers’ performance changes. We’re accustomed to using a single school year as the timeframe for evaluating teacher performance. However, researchers, administrators, and experienced teachers recognize that the performance of novice teachers can change rapidly over the first few years, but that after 10 or 15 years of teaching, additional years of experience don’t seem to matter as much for outcomes such as students’ performance on standardized tests. Because novice teachers’ performance can