
Program on Education Policy and Governance Working Papers Series
The Impact of Performance Pay for Public School Teachers: Theory and Evidence
Marcus A. Winters
University of Arkansas
Gary W. Ritter
Ryan H. Marsh
Jay P. Greene
Marc J. Holley
PEPG 08-15
Preliminary draft
Please do not cite without permission
Prepared for the CESifo/PEPG joint conference

“Economic Incentives: Do They Work in Education”
Insights and Findings from Behavioral Research
CESifo Conference Center

Munich, Germany
May 16-17, 2008
The Impact of Performance Pay for Public School Teachers: Theory and Evidence
Marcus A. Winters*
Ph.D. Candidate
Gary Ritter**
Associate Professor
Ryan Marsh**
Research Associate
Jay P. Greene**
Professor
Marc Holley**
Ph.D. Student
* Department of Economics, University of Arkansas

** Department of Education Reform, University of Arkansas
Abstract:
This paper derives a theoretical model for understanding the impact of providing teachers
with bonuses for increasing student academic proficiency. Similar to previous labor
economic research, we model a teacher’s choice of effort as a labor-leisure tradeoff.
However, we recognize that teachers are different from other workers in that they could
have some internal motivation to increase productivity and that this “caring” for students
could have implications for the impact of performance pay. We then test these predictions
using data from a generous performance pay program in Little Rock, Arkansas. Using a
differences-in-differences approach, we find that students whose teachers were eligible
for performance pay made substantially larger test score gains in math, reading, and
language than students taught by untreated teachers. Further, we find a negative
relationship between the average performance of a teacher’s students the year before
treatment began and the additional gains made after treatment.
I) Introduction
In the United States, the majority of public school teachers receive compensation
according to a salary schedule that is nearly entirely determined by their number of years
of service and their highest degree attained. This system, however, has seen increasing
attacks from policymakers and researchers in recent years. Several school systems have
considered adding a component to the wage structure that directly compensates teachers
based upon the academic gains made by the students in a teacher’s care, at least partly
measured by student scores on standardized tests. Several public school systems
including Florida, New York City, Denver, and Nashville have recently adopted such
“performance-pay” policies. Recent survey research suggests that nearly half of all
Americans support performance-pay for teachers whose students are making academic
progress, while about a third of Americans directly oppose such a plan (Howell, West,
and Peterson 2007).
The focus on performance-pay programs recognizes the consensus that teacher
quality is one of the most important parts of the education process. Analyses using panel
data suggest that the quality of the teacher in a classroom is one of the most important
predictors of student achievement (Rivkin, Hanushek and Kain 2005; Harris and Sass
2006; Aaronson, Barrow and Sander 2003; Ballou, Sanders and Wright 2004; Goldhaber
and Brewer 1997; Rockoff 2004). Other research has focused on identifying observable
characteristics that predict teacher productivity, though these papers have had little
success in their search (for a complete review of this literature see Hanushek and Rivkin
2006).
However, an important limitation of this previous empirical research is that it
treats teacher productivity as a function of only teacher ability. The papers assume that
this ability is either exogenously given or it is increased through professional
development. Given that these papers have found little evidence that professional
development increases teacher effectiveness, this research suggests that ability is the sole
determinant of the substantial variation in teacher quality.
What may be missing from these previous models is an understanding of the
impact of teacher effort in the educational production process. There are several reasons
that it is important to consider the impact of effort in a teacher production function. First,
focusing on the decision of teachers to put forth effort will align discussion of educational
production with that of the rest of the labor economics literature. In fact, the lack of
discussion of teacher effort is interesting given that the decision to put forth effort at the
job is the driving force of productivity models in other sectors of the labor market. To the
extent that we understand microeconomic decisions to work, wage offers, and worker
productivity in the general labor market we do so through models driven by an
individual’s rational maximization of utility through the choice to put forth effort.
Understanding teacher effort could provide similar understanding of the educational
production function.
Secondly, it is impossible to discuss the impact of performance-pay programs on
student achievement without incorporating the chosen effort level of teachers. Those in
favor of such policies suggest that they will increase teacher productivity at least in part
by providing them with an incentive to put forth effort in the classroom. Thus, a rigorous
understanding of the impact of performance-pay demands an analysis of teacher effort.

Finally, the motivational situation in education may not be as directly comparable
to that of the firm as would first appear. The basic structure of the motivational problem
evaluated here is nothing new to research on firms and is the subject of a wide theoretical
and empirical literature comparing salaries and piece-rate compensation structures, and it
is to this literature and general idea that those in favor of performance-pay programs for
teachers point. However, the labor structure in education differs from that of firms in
important ways that could alter the effects of a piece-rate (performance-pay) system.
Important assumptions of the previous research that make sense for evaluating
piece-rate pay for some workers do not apply in education. In particular, this previous
research assumes that under a straight salary structure employees must continue to meet a
minimum benchmark in order to avoid being fired, while the landscape in education is
such that it is nearly impossible to fire a teacher for cause. Secondly, many of the results
of this previous research derive from the assumption that measuring worker productivity
is costly, but in education there is little to no cost from such measurement because of
wide-spread student achievement testing that is in practice for political and policy reasons
completely separate from the performance-pay program. (See Lazear 1986 for a treatment
of the implications of these assumptions for regular workers).
However, perhaps the most important difference between teachers and workers in
most firms is that we may expect that teachers are to some extent internally motivated to
produce a high quality product in a way that most other workers are not. One of the most
consistent criticisms of performance-pay policies is that teachers could already be
working to their fullest potential because their love for their students motivates them to
work their hardest to produce academic gains. A similar relationship would exist in the
labor market outside education among workers who take pride in their job or take a
particular interest in their own output. We expect that this internalization of the quality of
one’s output could be magnified for teachers who are tasked with providing children with
a better life than they would otherwise have in absence of teacher effort.
In the context of education, we develop a formal framework for understanding
teacher productivity where effort is the result of a utility maximization problem and
teachers directly gain utility from student achievement. Our model allows such “caring”
for students to be heterogeneous across teachers and sets up a general framework for
understanding the implications of this for performance-pay systems.
We also go on to empirically evaluate the predictions resulting from the theory by
studying the effects of a currently operating performance-pay program. Thus, a second
important contribution of this paper is to add to the limited empirical research on the
impact of performance-pay policies on student achievement.
Several researchers have evaluated the impact of performance pay programs on
reported teacher satisfaction, classroom practices, and retention (Johns, 1988; Jacobson,
1992; Heneman and Milanowski, 1999; Horan and Lambert, 1994). Some U.S. evidence
suggests that programs providing bonuses to entire schools, rather than changing the pay
of individual teachers, have a positive impact on student test scores (Clotfelter and Ladd,
1996). However, there is currently very little empirical evidence from the United States
suggesting that direct teacher-level performance pay leads to better student outcomes. 1
1 There is also limited evidence on the impact of performance pay in other countries. Lavy (2002) found
that a school-based program in Israel increased student performance, and Glewwe, Ilias, and Kremer (2003)
found similar results from a program in Kenya.
Figlio and Kenny (2006) independently surveyed the schools that participated in
the often-used National Educational Longitudinal Survey (NELS). They then
supplemented the NELS dataset with information on whether schools compensated
teachers for their performance. They found that test scores were higher in schools that
individually rewarded teachers for their classroom performance.
Eberts, Hollenbeck, and Stone (2000) used a differences-in-differences approach
to evaluate the impact of a performance incentive for teachers in an alternative high
school in Michigan. They found that the program had no effect on grade point averages
or attendance rates and actually increased the percentage of students who failed the
program. However, the study was unable to provide a direct evaluation of student
achievement (i.e. test scores). Further, the study’s focus on an alternative dropout
recovery school produces difficult estimation problems and could limit its use in the
discussion of traditional public K-12 education.
Finally, Keys and Dee (2005) evaluated an incentive improving career ladder
program in Tennessee. They took advantage of the fact that this program operated at the
same time as the notable Tennessee STAR program, a random assignment experiment on
the impact of class size on student achievement. Under STAR, students were randomly
assigned to classrooms of different sizes. This assignment additionally meant that
students were randomly assigned into classrooms led by teachers who were or were not
participating in a state sponsored performance pay program. Importantly, however,
teachers were not similarly randomly assigned to participate in the performance pay
program, and thus the study cannot be considered a conventional random assignment
experiment of the performance pay plan. Nonetheless, they found that students randomly
assigned to classrooms with teachers participating in the performance pay program made
exceptional gains in math and reading, though these results could be driven by selectivity
in the teachers that choose to participate in performance pay programs, rather than the
incentives of the program itself.
We add to this limited empirical research and further evaluate the predictions of
our theory by studying the impact of a generous performance-pay program in Little Rock,
Arkansas on student achievement in math, reading, and language. We find that adoption
of performance-pay substantially increased student proficiency in each of these subjects.
We also find evidence of an inverse relationship between previous teacher performance
before treatment and the positive impact of performance-pay on teacher productivity. No
other previous study of which we are aware has evaluated the impact of performance-pay
on the distribution of test score gains across teachers.
The remainder of this paper is organized into five sections. Section II
develops the theoretical model for understanding teacher productivity and evaluates the
impact of performance-pay programs for student achievement. Section III discusses the
performance-pay program in Little Rock, Arkansas evaluated in the paper. Section IV
discusses the data analyzed in the paper and develops the empirical models used to
measure the impact of performance pay on student achievement. Section V reports the
results of this estimation, and Section VI concludes.
II) The Model
Teachers, indexed by i, maximize utility, which depends on leisure (L), wages
(w), and student achievement gains (s), each of which has positive but diminishing
returns to utility.
(1) $U_i = U_i(w_i, L_i, s_i)$ subject to $L_i \in [0, \bar{L}]$, $U'_n > 0$, $U''_n < 0$, $n \in \{w, L, s\}$

The framework is innovative in two ways. First, we are aware of no previous
research in the education literature modeling teacher effort. Secondly, we recognize that
teachers receive utility from the productivity of their students – we call this teacher
“caring”. Without this addition we have the classical labor-leisure trade-off discussed in
previous labor economic research.
Such a caring for students is commonly attributed to teachers, but has yet to be
modeled in the literature. A clear motivation is simply to notice the differences between
children and other forms of output – humans have a natural inclination to care for
children in a way that they do not for other widgets.
It is a normal assumption in the education literature that teachers are so clearly
internally motivated that teacher effort warrants little discussion. Economists and those
interested in modern market-based education reforms famously pursue the opposite track
when discussing the labor force by assuming that individuals are entirely self-interested.
Here, we marry these two extreme points and recognize that our schools are staffed with
self-interested teachers who also possess at least some altruistic intentions.
The premise of this model may also be used to discuss the labor decisions of
workers in the general labor force who have an internal desire to be highly productive.
However, in the model here for teachers we assume that there is no external incentive for
teachers to put forth positive effort under the current pay system. This assumption rests
on the idea that it is nearly impossible to fire a public school teacher for cause in the
United States, which is supported by anecdotal and some empirical evidence. 2 Extensions
of this model for the general labor force would need to include a minimal effort level that
the individual must meet in order to avoid termination, such as in Lazear (2000).
We assume that there are at least some teachers that are internally motivated
enough to put forth some positive effort in absence of a financial reward and that there
are at least some teachers who are not so internally motivated as to work to their highest
ability in absence of a financial or other external motivating force. Further, we allow
different teachers to care about their students with different strengths. If teacher 1 cares
more than teacher 2, then teacher 1 will be willing to accept lower levels of wages and
leisure in exchange for higher levels of student achievement. Thus in order to declare
teacher 1 to care more strongly than teacher 2, we compare their marginal rates of
substitution:
(2) $MRS_1(s,x) > MRS_2(s,x)$, $x \in \{L, w\}$
or expressed as marginal utilities, where $MU_x^i$ is the first derivative of utility for teacher
i with respect to input x:
(3) $\dfrac{MU_s^1(w,L,s)}{MU_x^1(w,L,s)} > \dfrac{MU_s^2(w,L,s)}{MU_x^2(w,L,s)}$, $x \in \{L, w\}$
2
For example, using Freedom of Information Act requests, journalist Scott Reader found that in the ten
year period between 1995 and 2005 there were only 555 instances of a formal remediation of a public
school teacher in the entire 873 schools in the state of Illinois. That is, each year an average of 55 public
school teachers were sanctioned for any reason – including sexual misconduct and other indiscretions other
than poor teaching performance – which amounts to about 0.04% of teachers per year. With such low rates
of sanctioning for any reason coupled with such high rates of low student academic achievement, it is clear
that public school teachers are not often fired (or even reprimanded) for failing to adequately instruct
students.
For ceteris paribus conditions, we assume that the marginal rate of substitution between
leisure and wages at equal levels of each is constant across individuals. This assumption
is trivial, for simplification only, and does not affect the results.
Each student is endowed with an initial ability level λ. In order to most directly
focus on teacher effort, here we consider λ as constant and homogeneous across students.
Though this is clearly not the case, it is justifiable in our treatment of performance-pay
since these programs most often utilize value-added measures to provide pay bonuses,
which is an attempt to hold student ability constant. In the empirical work of the next
section, we attempt to hold λ constant by controlling for prior student achievement. The
theoretical results would differ if student ability were heterogeneous and teachers were
simply paid for having higher performing students.
Student achievement gains (sj) are a function of the student’s initial ability (λ) as
well as the productivity of their classroom teacher (ti). Teacher productivity is a function
of the teacher’s effort (ei). We assume that both functions have positive but diminishing
returns to the inputs.
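(4) $s_j = s_j(\lambda_j, t_i)$, $\dfrac{\partial s}{\partial \lambda} > 0$, $\dfrac{\partial^2 s}{\partial \lambda^2} < 0$ and $\dfrac{\partial s}{\partial t} > 0$, $\dfrac{\partial^2 s}{\partial t^2} < 0$
(5) $t_i = t_i(e_i)$, $\dfrac{\partial t}{\partial e} > 0$, $\dfrac{\partial^2 t}{\partial e^2} < 0$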
Effort and leisure are negatively related, such that effort is defined as maximum
available leisure time less the amount of leisure actually enjoyed by the teacher:
(6) $e_i = \bar{L} - L_i$
Solving (6) for $L_i$ and combining equations (1) and (4)-(6), teachers maximize utility by choosing
their level of effort:
(7) $\max_{e_i} U_i = U_i\big(w_i,\ \bar{L} - e_i,\ s_j(\lambda_j, t_i(e_i))\big)$
Teachers have an incentive to put forth effort to the extent that it is utility
increasing. We compare the chosen effort levels under both the conventional pay system
and a performance-pay program. We then go on to discuss the potential for
heterogeneous effects of performance-pay across teachers with different levels of internal
motivation.
RESULT 1: If wages are independent of teacher effort and two teachers differ only on
how much they care about student achievement, the teacher that cares more will put forth
greater effort and thus produce higher student achievement.
PROOF:
Consider two teachers such that teacher 1 cares more for her students’
achievement than teacher 2. For a given s, say $\bar{s}$, the difference in caring is expressed in
equation (3).
Differentiating (7) with respect to effort, the first order condition for the teacher’s
maximization problem is
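(8) $\dfrac{\partial U_i}{\partial e} = \dfrac{\partial U_i}{\partial L}\cdot\dfrac{\partial L}{\partial e} + \dfrac{\partial U_i}{\partial s}\cdot\dfrac{\partial s}{\partial t}\cdot\dfrac{\partial t}{\partial e} = 0$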
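If we solve for teacher 2, we find
(9) $\left(\dfrac{\partial U_2}{\partial L}\cdot\dfrac{\partial L}{\partial e}\right)_{e^*} + \left(\dfrac{\partial U_2}{\partial s}\cdot\dfrac{\partial s}{\partial t}\cdot\dfrac{\partial t}{\partial e}\right)_{e^*} = 0$, or
$\left(MU_l^2(w,L,s)\cdot\dfrac{\partial L}{\partial e}\right)_{e^*} + \left(MU_s^2(w,L,s)\cdot\dfrac{\partial s}{\partial t}\cdot\dfrac{\partial t}{\partial e}\right)_{e^*} = 0$
where e* is the equilibrium level of effort for teacher 2. This can be rewritten as
(10) $\dfrac{\partial L}{\partial e} + \left(\dfrac{MU_s^2(w,L,s)}{MU_l^2(w,L,s)}\cdot\dfrac{\partial s}{\partial t}\cdot\dfrac{\partial t}{\partial e}\right)_{e^*} = 0$
The marginal utility of effort for teacher 1 at the equilibrium level for teacher 2, e*, is
(11) $\dfrac{\partial U^1}{\partial e} = \dfrac{\partial L}{\partial e} + \left(\dfrac{MU_s^1(w,L,s)}{MU_l^1(w,L,s)}\cdot\dfrac{\partial s}{\partial t}\cdot\dfrac{\partial t}{\partial e}\right)_{e^*} > 0$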
We can now evaluate whether utility is increasing or decreasing in effort for
teacher 1 at e* by evaluating this condition. The marginal rate of substitution for teacher
1 between student achievement gains and leisure is greater than that for teacher 2. Since
the two derivatives that multiply this MRS are positive and we know that, evaluated at
e*, marginal utility is 0 for teacher 2, we know that teacher 1's utility is still increasing
in effort at e*. Because of this, the utility-maximizing effort level for teacher 1, $\tilde{e}$, is
greater than e*. Since student achievement gains are an increasing function of effort, this
also implies that teacher 1 will have greater student achievement gains than teacher 2,
holding innate ability constant.
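Result 1 can be illustrated numerically. The sketch below is not part of the model itself: it assumes log utility in each argument, a production function in which achievement equals ability plus the square root of effort, and hypothetical normalizations (`L_BAR`, `LAMBDA`, `W0`). The parameter `caring` scales the weight on achievement, so a larger value corresponds to the higher marginal rate of substitution in (3).

```python
import math

L_BAR = 1.0    # maximum available leisure (hypothetical units)
LAMBDA = 1.0   # student ability, held constant as in the text
W0 = 1.0       # salary that does not depend on effort

def achievement(effort):
    """Assumed s(lambda, t(e)) with t(e) = sqrt(e): increasing, diminishing returns."""
    return LAMBDA + math.sqrt(effort)

def utility(effort, caring):
    """Assumed log utility over wages, leisure, and achievement gains."""
    leisure = L_BAR - effort
    return math.log(W0) + math.log(leisure) + caring * math.log(achievement(effort))

def optimal_effort(caring, grid_size=100_000):
    """Grid search for the utility-maximizing effort on the open interval (0, L_BAR)."""
    grid = [L_BAR * k / grid_size for k in range(1, grid_size)]
    return max(grid, key=lambda e: utility(e, caring))

effort_low = optimal_effort(caring=0.5)   # teacher 2
effort_high = optimal_effort(caring=2.0)  # teacher 1, who cares more
print(effort_low, effort_high)
```

For these assumed forms the first-order condition (8) solves to e* = (c/(2+c))^2, where c is the caring weight, so the two teachers choose roughly 0.04 and 0.25 of the available time. The teacher who cares more works harder, as Result 1 states.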

RESULT 2: A performance-pay plan that makes wages dependent upon student
achievement will cause teachers to put forth more effort and therefore should result in
higher student achievement.
PROOF:
The expected positive overall effect from performance-pay is quite intuitive.
Consider a performance-pay plan set up with payouts being linearly related to student
achievement (or achievement gains; the plan would have the same effect). This changes
the wages from a simple w to
(12) $w_i(s_j(t_i(e_i))) = w_0 + a \cdot s_j(t_i(e_i))$ where $a > 0$
This makes wages equal to some initial base pay, w0, plus a linear function of student
achievement gains.
To determine the incentives faced by the teacher in changing from a traditional
pay plan to a performance-pay plan, we evaluate the marginal utility of effort at the old
equilibrium after including the new pay plan. The traditional condition for equilibrium is
found in equation (8).
Marginal utility with respect to effort under performance-pay is:
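(13) $\dfrac{\partial U_i}{\partial e} = \left(MU_w^i(w,L,s)\cdot\dfrac{\partial w}{\partial s}\cdot\dfrac{\partial s}{\partial t}\cdot\dfrac{\partial t}{\partial e}\right) + \left(MU_l^i(w,L,s)\cdot\dfrac{\partial L}{\partial e}\right) + \left(MU_s^i(w,L,s)\cdot\dfrac{\partial s}{\partial t}\cdot\dfrac{\partial t}{\partial e}\right)$.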
To determine the effect of implementing a performance pay plan, we evaluate this
derivative at the original equilibrium level of achievement. As seen specifically for
teacher 2 in equation (9), the latter two terms of (13) sum to zero at the original
equilibrium, leaving marginal utility at the old equilibrium equal to the first term of (13).
The sign of this term and therefore of the derivative is positive since all components of
the product are positive. Thus beginning at the old level of effort, the teacher can increase
utility by putting forth more effort when the compensation structure is changed to include
a performance-pay mechanism. The proof of higher achievement follows the same lines
as the argument in the proof of Result 1. Teacher ability is constant across regimes and effort increases,
so teacher production increases and therefore student achievement increases.
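As a worked step, note that under the linear schedule (12) the derivative of wages with respect to achievement is simply the bonus rate a. Substituting this into (13) and using the fact that the last two terms cancel at the old equilibrium e* gives:

```latex
\left.\frac{\partial U_i}{\partial e}\right|_{e^*}
  = MU_w^i(w,L,s)\cdot a\cdot\frac{\partial s}{\partial t}\cdot\frac{\partial t}{\partial e} \;>\; 0,
\qquad \text{since } \frac{\partial w}{\partial s} = a > 0 \text{ by (12)}.
```

Every factor on the right is positive, so the teacher gains by raising effort above e*. Setting a = 0 recovers the conventional salary case, in which this term vanishes.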
We can further examine the performance pay plan by reorganizing equation 13
into the equation seen below:
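(14) $\dfrac{\partial U_i}{\partial e} = \left(\dfrac{MU_w^i(w,L,s)}{MU_l^i(w,L,s)}\cdot\dfrac{\partial w}{\partial s} + \dfrac{MU_s^i(w,L,s)}{MU_l^i(w,L,s)}\right)\cdot\left(\dfrac{\partial s}{\partial t}\cdot\dfrac{\partial t}{\partial e}\right) + \left(\dfrac{\partial L}{\partial e}\right)$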
This reorganization demonstrates a key difference across the two teacher pay
plans. Before, wages were constant with respect to student achievement gains, so the
marginal utility with respect to student achievement gains was the derivative of utility
with respect to its third component. Now, student achievement gains impact wages and
therefore the marginal utility of wages enters into the marginal utility with respect to
student achievement gains.
Granted that, however, we can evaluate effort based on the difference between
teachers' marginal rates of substitution. Following the same lines as the proof of Result 1 and utilizing
the simplifying assumption that the marginal rate of substitution between leisure and
wages at equal levels of each is constant across individuals, the demonstration again
reduces to a comparison of marginal rates of substitution between student achievement
gains and leisure. Utilizing our terminology, this once again demonstrates that the teacher
who cares more will put forth more effort.
Potential for a Differential Effect of Performance-Pay Across Teachers with Different
Internal Motivations
The recognition that teachers have heterogeneous internal motivations raises the
question of whether performance-pay might affect teachers of different "caring" levels
differently. Intuitively, we may expect that performance-pay could have its greatest
impact on teachers with lower internal motivation because teachers with high levels of
caring are already putting forth a potentially large amount of positive effort. However,
though the intuition here is strong, we will see that formally evaluating this relationship is
quite complicated and fails to yield a clear analytical solution.
Let $e_1^*$ be the level of effort chosen by teacher 1 under the performance pay plan
from result 2. Utilizing the fact that the original plan is simply a variant of performance
pay that has a = 0, we can consider the change in effort level by evaluating $de_1^*/da$ through
the use of comparative statics; the same can be done for teacher 2. To determine the
teacher that reacts more strongly to the introduction of performance pay, we simply
compare these comparative static derivatives.

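(15)
$$
\frac{de_i^*}{da} = -\,\frac{
\dfrac{\partial U}{\partial w}\dfrac{\partial^2 w}{\partial s\,\partial a}\dfrac{\partial s}{\partial e}
+ \dfrac{\partial^2 U}{\partial w^2}\dfrac{\partial w}{\partial a}\dfrac{\partial w}{\partial s}\dfrac{\partial s}{\partial e}
- \dfrac{\partial^2 U}{\partial L\,\partial w}\dfrac{\partial w}{\partial a}
+ \dfrac{\partial^2 U}{\partial s\,\partial w}\dfrac{\partial w}{\partial a}\dfrac{\partial s}{\partial e}
}{
\begin{aligned}
&\dfrac{\partial U}{\partial w}\left[\dfrac{\partial^2 w}{\partial s^2}\left(\dfrac{\partial s}{\partial e}\right)^2 + \dfrac{\partial w}{\partial s}\dfrac{\partial^2 s}{\partial e^2}\right]
+ \dfrac{\partial^2 U}{\partial w^2}\left(\dfrac{\partial w}{\partial s}\dfrac{\partial s}{\partial e}\right)^2
+ \dfrac{\partial^2 U}{\partial L^2}
+ \dfrac{\partial U}{\partial s}\dfrac{\partial^2 s}{\partial e^2}
+ \dfrac{\partial^2 U}{\partial s^2}\left(\dfrac{\partial s}{\partial e}\right)^2 \\
&- 2\,\dfrac{\partial^2 U}{\partial w\,\partial L}\dfrac{\partial w}{\partial s}\dfrac{\partial s}{\partial e}
+ 2\,\dfrac{\partial^2 U}{\partial w\,\partial s}\dfrac{\partial w}{\partial s}\left(\dfrac{\partial s}{\partial e}\right)^2
- 2\,\dfrac{\partial^2 U}{\partial L\,\partial s}\dfrac{\partial s}{\partial e}
\end{aligned}
}
$$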
Clearly, the comparative static derivative is a complicated function of second and
cross-partial derivatives, about which we are not in a strong position to make
assumptions. Because of this we can make no theoretical statement about the strength of
the program’s impact on teachers who care at different levels. However, though a formal
expectation for the differential impact of performance-pay on teacher productivity
escapes us, this framework provides an interesting question that we can answer
empirically.
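Although (15) cannot be signed in general, the response of effort to the bonus rate can be computed for any particular parameterization. The sketch below does this under assumed functional forms that are not taken from the paper: log utility in each argument, achievement equal to ability plus the square root of effort, and hypothetical normalizations. It compares chosen effort at a = 0 with chosen effort at a positive bonus rate for several caring weights.

```python
import math

L_BAR, LAMBDA, W0 = 1.0, 1.0, 1.0  # hypothetical normalizations

def achievement(effort):
    """Assumed production function: ability plus sqrt(effort)."""
    return LAMBDA + math.sqrt(effort)

def utility(effort, caring, a):
    """Assumed log utility; wages follow the linear schedule in equation (12)."""
    wage = W0 + a * achievement(effort)
    return (math.log(wage) + math.log(L_BAR - effort)
            + caring * math.log(achievement(effort)))

def optimal_effort(caring, a, grid_size=20_000):
    """Grid search for the utility-maximizing effort on (0, L_BAR)."""
    grid = [L_BAR * k / grid_size for k in range(1, grid_size)]
    return max(grid, key=lambda e: utility(e, caring, a))

def effort_response(caring, a=0.5):
    """Change in chosen effort when the bonus rate moves from 0 to a."""
    return optimal_effort(caring, a) - optimal_effort(caring, 0.0)

responses = {c: effort_response(c) for c in (0.5, 1.0, 2.0, 4.0)}
for caring, delta in responses.items():
    print(f"caring={caring}: effort change={delta:.4f}")
```

Every response is positive, in line with Result 2. How the size of the response varies with the caring weight depends on the assumed forms and need not be monotone, which is consistent with the ambiguity of (15); nothing here should be read as a prediction for actual teachers.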
III) The Program
The Achievement Challenge Pilot Project (ACPP) is a teacher and staff pay-for-
performance program that has operated within the Little Rock School District (LRSD) for
three years since the 2004-05 school year. The purpose of the program is to motivate
faculty and staff to bring about greater student achievement gains. The ACPP uses
student improvement on nationally-normed standardized tests as the only basis for
financial rewards.
The funding for this project has come through a partnership between private
foundations and the LRSD. In the first year, private foundations supported ACPP at a
single elementary school and the program expanded to include another school in its
second year. In the third year the program adopted three additional elementary schools.
For reasons discussed below, our analyses will focus entirely on the impact of
performance-pay in the three schools that began treatment in the third year of the
program. The discussion that follows describes how the program operated in these three
schools.
The performance-pay program provided bonuses directly to teachers based on the
average spring-to-spring achievement gain of students in the teacher’s class on the
composite score of the Iowa Test of Basic Skills. The composite score includes student
achievement on the math, reading, and language arts portion of the exam.
Teachers whose students have an average achievement growth between 0-4% earn
$50 times the number of students in their class; teachers whose students have an average
achievement growth between 5-9% earn $100 times the number of students in their class;
teachers whose students have an average achievement growth between 10-14% earn
$200 times the number of students in their class; and teachers whose students have an average
achievement growth over 15% earn $400 times the number of students in their class.
Table 1 displays the average bonuses that were actually earned in the schools included in
the analysis. Other staff members could also earn various bonuses based on their level of
responsibility.
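The teacher bonus schedule can be written as a small lookup. The function name `acpp_teacher_bonus` is hypothetical, and the handling of boundary cases is an assumption, since the description leaves them open: growth is rounded to a whole percent, negative growth earns nothing, and exactly 15% is placed in the top tier.

```python
def acpp_teacher_bonus(avg_growth_pct, class_size):
    """Bonus in dollars under the tiered schedule described in the text.

    Assumptions (not stated in the text): growth is rounded to a whole
    percent, negative growth earns nothing, and exactly 15% falls in
    the top tier.
    """
    growth = round(avg_growth_pct)
    if growth < 0:
        per_student = 0
    elif growth <= 4:
        per_student = 50
    elif growth <= 9:
        per_student = 100
    elif growth <= 14:
        per_student = 200
    else:
        per_student = 400
    return per_student * class_size

print(acpp_teacher_bonus(7, 20))  # 2000
```

A teacher with 20 students and 7% average growth falls in the 5-9% tier and so earns 20 times $100. Because the payout per student doubles at the upper tiers, small differences in average growth near a threshold translate into large differences in the bonus.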
[TABLE 1 ABOUT HERE]
Schools were selected to participate in ACPP based on their high percentages of
students who were struggling academically and economically disadvantaged. Table 2
reports baseline descriptive statistics for those variables used in the analyses below.
About 63 percent of the LRSD students that were not in a performance-pay eligible
school in 2007 qualified for the federal free and reduced lunch program, and 67 percent of
these students are African American. The schools that were eligible for the program in
2007 served a more disadvantaged group of students: 88 percent of whom qualify for the
federal free and reduced lunch program and 88 percent of whom are African American.
The table also shows that students in untreated schools had baseline scores in
math, reading, and language that were substantially above those of students who were in
treated schools. Further, students in untreated schools made substantially larger
improvements in these subjects the year before treatment took place.
IV) Data and Method
We acquired individual data for the universe of public school students enrolled in
Little Rock, Arkansas elementary schools in the 2005 through 2007 school years,
providing us with two observations of student test scores gains. 3 For each elementary
student in the district, this dataset included demographic information, test scores, an
identifier for the student’s classroom teacher, and a unique student identifier that allows
us to track each student’s performance over time. We evaluate the impact of adoption of
the performance-pay program on student proficiency in math, reading, and language.
3 Here and throughout this paper we use the spring term year to identify the school year. That is, the
2004-05 school year is referred to as 2005.
Test scores are reported in our dataset in Normal Curve Equivalent (NCE) units.
NCEs rank the student on a normal curve compared to a nationally representative group
of students who have taken the test. NCEs are similar to percentile scores, but differ in
that they are equal-interval scaled, meaning that the difference between two scores on one
part of the curve is equivalent to the difference of a similar interval on another part of
the curve. NCE scores are scaled between 1 and 99 with a mean of 50.
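The relationship between NCEs and percentile ranks can be sketched as follows. The conversion uses the conventional NCE definition (a normal score with mean 50 and standard deviation of about 21.06), which is not spelled out in the text; the function names are hypothetical.

```python
from statistics import NormalDist

NCE_MEAN = 50.0
NCE_SD = 21.06  # conventional constant: makes NCE 1 and 99 line up with percentiles 1 and 99

def percentile_to_nce(percentile):
    """Convert a national percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile / 100.0)
    return NCE_MEAN + NCE_SD * z

def nce_to_percentile(nce):
    """Convert an NCE score back to a national percentile rank."""
    return 100.0 * NormalDist().cdf((nce - NCE_MEAN) / NCE_SD)

# The same 10-point NCE step covers very different percentile ranges
gap_middle = nce_to_percentile(60) - nce_to_percentile(50)
gap_tail = nce_to_percentile(90) - nce_to_percentile(80)
print(round(gap_middle, 1), round(gap_tail, 1))
```

A 10-point NCE gain near the middle of the distribution spans far more percentile points than the same gain in the upper tail. This is why gains measured in NCE units can be averaged and compared across students in a way that percentile gains cannot.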
We utilize the differences-in-differences procedure to study the impact of
performance pay. Unfortunately, we are forced to exclude students in the schools that
began the performance pay treatment prior to 2007. The reason for the exclusion is that
since these schools were treated in each year for which we have data, in the analysis they
would become part of the comparison group.
We use OLS to estimate a model taking the form:
(18) $Y_{i,a,t} = \beta_0 + \beta_1 Y_{i,a,t-1} + \beta_2 Student_{i,t} + \beta_3 School_{i,t} + \beta_4 Year_t + \beta_5 Treat_{i,t} + \varepsilon_{i,t}$
Where $Y_{i,a,t}$ is the test score of student i in subject a in the spring of year t; Student is a
vector of observable characteristics about the student; School is a vector indicating the
school that the student attended; Year is an indicator variable for the year; and ε is a
stochastic term clustered by school.
Treat is an indicator variable for whether the observation occurred for a student
attending a treatment school during the treatment year. That is, this variable is an
interaction between Year = 2007 and the indicator variable for each school that was
eventually treated. When Equation (18) is estimated using OLS, the Treat coefficient ($\beta_5$)
becomes an estimate of the change in the conditional expectations of test score gains
resulting from the performance pay treatment. That is, $\beta_5$ represents the impact of the
performance pay treatment after accounting for the differences in the test scores that
occur naturally over time and within the individual schools.
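The logic of the differences-in-differences estimate can be seen in its simplest form. The sketch below is a simplification of equation (18), not the estimation used in the paper: it drops the lagged score and student controls and replaces the school indicators with a single treated-group indicator. In that case the coefficient on the interaction equals a comparison of four cell means. The data are invented purely for illustration.

```python
from statistics import mean

def did_estimate(rows):
    """Differences-in-differences from a list of
    (treated_school, post_year, score_gain) tuples."""
    def cell_mean(treated, post):
        return mean(g for t, p, g in rows if t == treated and p == post)
    change_treated = cell_mean(True, True) - cell_mean(True, False)
    change_control = cell_mean(False, True) - cell_mean(False, False)
    return change_treated - change_control

# Hypothetical gains, invented for illustration only
rows = [
    (False, False, 2.0), (False, False, 4.0),  # untreated schools, 2006
    (False, True, 3.0), (False, True, 5.0),    # untreated schools, 2007
    (True, False, 0.0), (True, False, 2.0),    # eventually treated schools, 2006
    (True, True, 4.0), (True, True, 6.0),      # treated schools, 2007
]
print(did_estimate(rows))  # 3.0
```

Here the untreated schools improve by 1 point between years while the treated schools improve by 4, so the estimate attributes the remaining 3 points to treatment. The regression form in (18) generalizes this comparison by adding controls and allowing a separate intercept for each school.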
We also estimate a model working from (18) but which includes a teacher fixed
effect. This model takes the form:

$$Y_{i,a,t} = \psi_0 + \psi_1 Y_{i,a,t-1} + \psi_2 \text{Student}_{i,t} + \psi_3 \text{School}_{i,t} + \psi_4 \text{Year}_t + \psi_5 \text{Treat}_{i,t} + \psi_6 \text{Teacher}_{i,t} + \rho_{i,t} \tag{19}$$
Where Teacher is an indicator for the student's teacher, ρ is a stochastic term
clustered by school, and all other variables are as previously defined.
Secondly, as discussed in the above theoretical section, we are interested in
testing whether there is a differential relationship between the impact of performance-pay
and a teacher's prior productivity. We can evaluate whether teachers of varying success
had different responses to performance-pay by altering equation (18) to contain an
interaction between the treatment and a measure of a teacher’s pre-treatment productivity.
An obvious measure of pre-treatment productivity that is available in our dataset is the
average test score gain of students in the teacher’s classroom in the year prior to adoption
of the policy, 2006. Since treatment begins in 2007 and our test scores go back only to
2005, the gains in 2006 are the only available measure of pre-treatment productivity.
We slightly alter equation (18) to take the form:
$$Y_{i,a,t} = \phi_0 + \phi_1 Y_{i,a,t-1} + \phi_2 \text{Student}_{i,t} + \phi_3 \text{School}_{i,t} + \phi_4 \text{Year}_t + \phi_5 \text{Pre\_Gain}_{i,a} + \phi_6 \text{Treat}_{i,t} + \phi_7 (\text{Pre\_Gain}_{i,a} \times \text{Treat}_{i,t}) + \rho_{i,t} \tag{20}$$
Where Pre_Gaini,a is the average test score gain in 2006 for students in the class of
student i’s current teacher, and ρ is again a normally distributed mean zero stochastic
term.
We are now particularly interested in φ7, which can be interpreted as the
heterogeneous effect of the performance-pay treatment by previous teacher performance.
If we find that φ7 < 0, we could interpret it as indicating that lower-performing teachers
made the largest gains from the performance-pay policy.
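The interpretation of φ7 follows from comparing the conditional expectation of the test score with and without treatment in equation (20), holding all other regressors fixed. A short sketch of that step, using only the symbols defined for equation (20):

```latex
E\left[Y_{i,a,t} \mid \text{Treat}_{i,t} = 1\right] - E\left[Y_{i,a,t} \mid \text{Treat}_{i,t} = 0\right]
  = \phi_6 + \phi_7 \, \text{Pre\_Gain}_{i,a}
```

With φ6 > 0 and φ7 < 0, the implied treatment effect shrinks as the teacher's prior gain rises and would turn negative only for teachers whose prior gain exceeds −φ6/φ7.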

We are able to estimate these equations in math, reading, and language in
elementary schools. However, the grades included in the analyses of each subject differ
due to limitations of the testing schedule in Little Rock. Students were administered the
math version of the ITBS in all grades K-5 in each of the three years from 2005 to 2007,
and so each of these grades is included in the analyses. However, Little Rock students
were not administered the ITBS language or reading test in grades 3, 4, or 5 until 2006.
Further, students were not administered the ITBS reading test in Kindergarten until 2007.
These data limitations lead us to include only students in grades 2 and 3 in the reading
analyses and students in grades 1, 2, or 3 in the language analyses, the only grades for
which we have both a pre- and post-test score for students in both the baseline and
treatment-eligible years.
A potential limitation of our approach is that we may have an endogeneity
problem since schools were not randomly assigned to the performance-pay treatment. In
particular, as discussed above, the treatment was made available to schools non-randomly
and treated schools had higher minority populations and lower income students on
average.
We are able to partially account for this endogeneity bias by including school and,
in one analysis, teacher fixed effects in order to account for heterogeneity in school
quality. However, it is also worth noting that summary statistics indicate that any
endogeneity bias would likely lead us to underestimate the impact of the performance pay
treatment. Table 2 shows that in 2006, the year before the policy was available,
students in eventually treated schools made smaller average test score improvements
in each of the three subjects used in our analyses. That is, we should expect that in the
absence of treatment these schools would have made smaller test score improvements
than the control schools, which would tend to bias the estimated treatment effect
downward. Nonetheless, we recognize that the lack of random assignment is a concern with
any results.
V) Results
The results from estimation of equation (18) are reported in Table 3. Recall that
we are forced to use a more restricted group of grades in the reading and language
analyses, which accounts for the variation in the number of observations across subjects.
Our results suggest that students made statistically significant improvements in
math and reading, though the results in language just fail the test for significance at the
10% level (p = 0.126). The analyses suggest that the performance-pay treatment led to an
increase of about 3.52 NCE points in math, 3.29 NCE points in reading, and 4.56 NCE
points in language.
The size of these effects is substantial. We can use the summary statistics for
baseline achievement in these subjects reported in Table 2 to put our results into terms of
standard deviation units. Dividing the effect size by the standard deviation of the baseline
test score in the subject, our results suggest that performance-pay increased student
proficiency by 0.16 standard deviations in math, 0.15 standard deviations in reading, and
0.22 standard deviation units in language.
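The conversion to standard deviation units can be reproduced directly from the reported tables. The sketch below divides the Treat coefficients from Table 3 by the baseline standard deviations from Table 2; it assumes the "All" column of Table 2 is the relevant denominator, which is our reading since those values reproduce the figures quoted above.

```python
# Treat coefficients (NCE points) as reported in Table 3.
treat_coef = {"math": 3.52, "reading": 3.29, "language": 4.56}

# Baseline test score standard deviations, "All" column of Table 2 (assumed denominator).
baseline_sd = {"math": 21.54, "reading": 21.53, "language": 21.13}


def effect_size_in_sd(coef: float, sd: float) -> float:
    """Express a coefficient in units of the baseline standard deviation."""
    return coef / sd


effect_sizes = {
    subject: round(effect_size_in_sd(treat_coef[subject], baseline_sd[subject]), 2)
    for subject in treat_coef
}
print(effect_sizes)  # {'math': 0.16, 'reading': 0.15, 'language': 0.22}
```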
Table 4 reports the results of estimation of the overall treatment effect when we
include a fixed effect for each individual teacher. The table shows that the results are
qualitatively similar to those without a teacher fixed effect.

Somewhat surprisingly, the small gain in the R-Squared value between the
analyses reported in Tables 3 and 4 suggests that the teacher fixed effect explains
very little of the variance in student achievement. We tested the explanatory power of the
teacher fixed effect itself by estimating a regression of math test scores against only the
teacher fixed effect. That is, we estimated Equation (19) but removed all independent
variables other than the teacher fixed effect. These analyses found R-Squared values
between 0.20 and 0.25 for the three subjects.4 This indicates that there is variation in
teacher effectiveness but that here it is correlated with other regressors included in the
model estimated in (18).
Table 5 reports the results of estimating equation (20). Here we are interested in
evaluating any differential impact from the performance-pay treatment by the teacher’s
previous productivity. In each subject we find that the coefficient on the overall treatment
effect remains positive, though the treatment effect in reading just misses the threshold
for significance (p = 0.110). However, we find a negative relationship between the
teacher's prior productivity (measured by the average test score gain of students in the
teacher's classroom in the baseline year) and the impact of performance-pay on teacher
productivity. The inverse relationship between prior teacher productivity and the
performance-pay effect is statistically significant in each subject. These results suggest
that the previously lowest performing teachers made the greatest improvements due to the
incentives of the performance-pay program.
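The point estimates in Table 5 can be combined to show what the interaction implies for teachers at different levels of prior productivity. In equation (20) the implied treatment effect for a teacher is the Treat coefficient plus the interaction coefficient times the teacher's average 2006 gain. The sketch below evaluates that expression; the prior-gain values of -5, 0, and 5 NCE points are illustrative choices of ours, and the calculation uses point estimates only, ignoring sampling uncertainty.

```python
# Point estimates from Table 5: (Treat, Treat * Average 2006 Gain for Teacher).
table5 = {
    "math": (6.93, -0.48),
    "reading": (3.63, -0.35),
    "language": (4.24, -0.50),
}


def implied_effect(subject: str, prior_gain: float) -> float:
    """Implied treatment effect (NCE points) for a teacher with the given 2006 average gain."""
    treat, interaction = table5[subject]
    return treat + interaction * prior_gain


def break_even_gain(subject: str) -> float:
    """Prior average gain at which the implied treatment effect equals zero."""
    treat, interaction = table5[subject]
    return -treat / interaction


for subject in table5:
    effects = [round(implied_effect(subject, g), 2) for g in (-5.0, 0.0, 5.0)]
    print(subject, effects, round(break_even_gain(subject), 1))
```

On these point estimates the implied effect stays positive unless a teacher's prior average gain exceeded roughly 14 NCE points in math, 10 in reading, or 8.5 in language.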
4 Analyses available upon request.
VI) Conclusion
This paper makes a variety of contributions to the literature through the lens of
performance-pay for teachers. First, we provide a general theoretical framework for
understanding teacher productivity that is aligned with the labor economics structure of a
decision to exert effort, which has been so far absent from the economics of education
literature.
We suggest that the labor-leisure trade-off for teachers could differ from
that of other workers in important ways that are worth considering in future
theoretical and empirical work. In particular, teachers very likely have adopted the
quality of their production, student achievement, into their own utility function. We
show that this could hold important consequences for understanding teacher productivity. We
believe that these theoretical contributions could prove fruitful for future research not
only on the impacts of performance-pay, but also for our understanding of academic
productivity more generally.
We have also added to the limited empirical research on performance-pay
programs for teachers. The results of our evaluation of the performance-pay program in
Little Rock, Arkansas coincide with the theoretical predictions. We find that adoption of
performance-pay led to substantial improvements in student math, reading, and language
proficiency. Further, the results indicate that performance-pay was beneficial for nearly
all teachers, and had a particularly large effect on the lowest-performing teachers.
References
Aaronson, D., Barrow, L., & Sander, W. (2003). “Teachers and student achievement in
the Chicago public high schools”. Unpublished manuscript.
Ballou, D., Sanders, W., & Wright, P. (2004). “Controlling for student background in
value-added analysis of teachers”. Journal of Educational and Behavioral Statistics,
29(1), 37-65.
Clotfelter, C., & Ladd, H. (1996). “Recognizing and Rewarding Success in Public
Schools”. In H. Ladd (Ed.), Holding Schools Accountable: Performance-Based Reform in
Education. Washington, D.C.: Brookings Institution.
Eberts, R., Hollenbeck, K., & Stone, J. (2002). “Teacher Performance Incentives and
Student Outcomes”. Journal of Human Resources, 37, 913-927.
Figlio, D., & Kenny, L. (2006). “Individual Teacher Incentives and Student
Performance”. Journal of Public Economics, doi: 10.1016/j.jpubeco.2006.10.001,
forthcoming.
Glewwe, P., Ilias, N., & Kremer, M. (2003). “Teacher Incentives”. NBER Working Paper
9671.
Goldhaber, D.D., & Brewer, D.J. (1997). “Why don’t schools and teachers seem to
matter? Assessing the impact of unobservables on educational productivity”. Journal of
Human Resources, 32(3), 505-523.
Hanushek, E.A., & Rivkin, S.G. (2006). “Teacher Quality”. In E. Hanushek & F.
Welch (Eds.), Handbook of the Economics of Education, Volume 2 (pp. 1051-1075). Elsevier.
Harris, D., & Sass, T.R. (2006). “The effects of teacher training on teacher value added”.
Unpublished manuscript.
Heneman, H.G., & Milanowski, A.T. (1999). “Teachers’ attitudes about teacher
bonuses under school-based performance award programs”. Journal of Personnel
Evaluation in Education, 12, 327-341.
Horan, C.B., & Lambert, V. (1994). “Evaluation of Utah career ladder programs”. Beryl
Buck Institute for Education. Utah State Office of Education and Utah State Legislature.
Howell, W.G., West, M.R., & Peterson, P.E. (2007). “What Americans think about their
schools”. Education Next, 7(4), 12-26.
Jacobson, S.L. (1992). “Performance-related pay for teachers: the American experience”. In
H. Tomlinson (Ed.), Performance-related pay in education (pp. 34-54). London:
Routledge.
Johns, H.E. (1988). “Faculty perceptions of a teacher career ladder program”.
Contemporary Education, 59(4), 198-203.
Keys, B., & Dee, T. (2005). “Dollars and Sense”. Education Next, 5, 60-67.
Lavy, V. (2002). “Evaluating the Effect of Teachers’ Group Performance Incentives on
Pupil Achievement”. Journal of Political Economy, 110, 1286-1317.
Lazear, E.P. (2000). “Performance pay and productivity”. American Economic Review,
90(5), 1346-1361.
Rivkin, S.G., Hanushek, E.A., & Kain, J.F. (2005). “Teachers, schools and academic
achievement”. Econometrica, 73(2), 417-458.
Rockoff, J.E. (2004). “The impact of individual teachers on student achievement: Evidence from
panel data”. American Economic Review, 94(2), 247-252.
Table 1
Summary of ACPP Payouts by Year and School
                                         Highest    Lowest     Average                  Average
                             Total       Teacher    Teacher    Teacher      Total       Cost Per
School          Year         Bonus       Bonus      Bonus      Bonus        Enrollment  Pupil
Mabelvale       2006-2007    $39,550     $6,400     $450       $1,187.50    338         $117
Geyer Springs   2006-2007    $64,530     $7,600     $350       $2,846       333         $194
Romine          2006-2007    $12,450     $5,200     $450       $723         365         $34
Table 2
Baseline Descriptive Statistics
                                                     All       Never Treated  Eventually Treated
Variable                                  Mean      Std.      Mean      Std.      Mean      Std.
Black                                     0.69      0.46      0.67      0.47      0.88      0.33
Asian                                     0.02      0.12      0.02      0.13      0.00      0.00
Hispanic                                  0.04      0.19      0.04      0.19      0.06      0.23
Indian                                    0.00      0.06      0.00      0.06      0.00      0.05
Male                                      0.50      0.50      0.50      0.50      0.52      0.50
Eligible for Free or Reduced Lunch        0.65      0.48      0.63      0.48      0.88      0.33
Baseline Math                            50.41     21.54     51.15     21.57     38.57     17.27
Baseline Reading                         50.16     21.53     51.12     21.55     40.53     18.87
Baseline Language                        49.87     21.13     50.88     21.18     40.21     18.02
Math Gain 2006                            1.94     14.37      2.14     14.25     -1.29     15.83
Reading Gain 2006                         1.83     14.51      1.89     14.53      1.19     14.29
Language Gain 2006                        0.00     16.07      0.18     15.90     -1.75     17.45
Note: Only students included in overall math regression are included in above summary statistics
for demographic variables. Reading and language test descriptive statistics include only students used in
those regressions.
Table 3
Regression Results - Overall Treatment
                          Math                  Reading               Language
Variable                  Coef       t          Coef       t          Coef       t
Math t-1                  0.70   69.42 ***
Reading t-1                                     0.68   64.59 ***
Language t-1                                                          0.68   51.33 ***
Black                    -4.60  -11.01 ***     -4.69  -11.27 ***     -2.75   -5.48 ***
Asian                     3.65    4.46 ***      1.04    1.00          5.81    4.53 ***
Hispanic                 -1.14   -1.72 *       -1.62   -2.38 **       1.18    1.50
Indian                   -1.80   -1.47         -3.78   -1.79 *       -3.19   -1.01
Male                      0.03    0.14         -0.41   -1.45         -2.87  -10.43 ***
Lunch Eligible           -2.47  -10.17 ***     -2.88   -6.06 ***     -3.19   -8.58 ***
Treat                     3.52    2.32 **       3.29    1.91 *        4.56    1.57
Constant                 23.11   20.24 ***     19.40   26.84 ***     20.04   27.49 ***
Teacher Fixed Effect        No                    No                    No
N                       13,389                 5,948                 8,933
R-Squared               0.6479                0.7118                0.6211
Estimated via OLS. Models also control for school, grade, and year fixed effects. Standard errors clustered
by school.
*** Significant at p<= .01
** Significant at p<= .05
* Significant at p<= .10
Table 4
Regression Results - Overall Treatment with Teacher Fixed Effect
                          Math                  Reading               Language
Variable                  Coef       t          Coef       t          Coef       t
Math t-1                  0.71   71.94 ***
Reading t-1                                     0.69   65.22 ***
Language t-1                                                          0.68   51.22 ***
Black                    -4.41  -10.82 ***     -4.56  -11.04 ***     -2.70   -4.69 ***
Asian                     3.64    4.01 ***      1.33    1.23          5.92    5.37 ***
Hispanic                 -0.86   -1.30         -1.27   -1.85 *        1.68    2.10 **
Indian                   -1.34   -0.93         -2.89   -1.58         -3.11   -0.97
Male                      0.06    0.29         -0.43   -1.32         -2.71  -10.30 ***
Lunch Eligible           -2.24   -8.33 ***     -2.82   -5.55 ***     -2.90   -6.96 ***
Treat                     5.23    2.21 **       3.05    2.76 ***      2.04    0.93
Constant                 17.36    9.47 ***     22.60    6.87 ***     24.54    9.12 ***
Teacher Fixed Effect       Yes                   Yes                   Yes
N                       13,388                 5,948                 8,933
R-Squared               0.6780                0.7293                0.6541
Standard errors clustered by school.
Table 5
Regression Results - Differential Effect by Prior Teacher Productivity
                                            Math                  Reading               Language
Variable                                    Coef       t          Coef       t          Coef       t
Math t-1                                    0.72   62.03 ***
Reading t-1                                                       0.66   50.93 ***
Language t-1                                                                            0.65   38.16 ***
Black                                      -4.17   -9.10 ***     -4.57   -9.04 ***     -2.48   -4.03 ***
Asian                                       3.90    4.19 ***     -0.10   -0.07          6.48    5.54 ***
Hispanic                                   -0.79   -1.00         -1.80   -2.49 **       0.89    0.94
Indian                                      0.87    0.62         -2.01   -0.93         -1.86   -0.43
Male                                       -0.05   -0.24         -0.56   -1.48         -2.90   -8.27 ***
Lunch Eligible                             -2.62  -10.02 ***     -3.07   -5.24 ***     -3.61   -8.14 ***
Average 2006 Gain for Teacher               0.62   17.42 ***      0.22    3.06 ***      0.37    6.59 ***
Treat                                       6.93   14.32 ***      3.63    1.65          4.24    3.93 ***
Treat * Average 2006 Gain for Teacher      -0.48  -13.72 ***     -0.35   -3.61 ***     -0.50   -9.76 ***
Constant                                   17.13   16.88 ***     17.66   17.77 ***     19.42   14.53 ***
N                                         10,305                 4,560                 6,695
R-Squared                                 0.6756                0.7015                0.6025
Standard errors clustered by school.
