Editorial

Journal of Teacher Education, 2019, Vol. 70(2), 86-89
© 2019 American Association of Colleges for Teacher Education
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/0022487118824331
journals.sagepub.com/home/jte

Assessment and the Future of Teacher Education

Gail Richmond (Michigan State University), María del Carmen Salazar (University of Denver), and Nathan Jones (Boston University)

Corresponding author: Gail Richmond, Michigan State University, 620 Farm Lane, Room 319, East Lansing, MI 48824, USA. Email: gailr@msu.edu

It is always a challenge to decide how to select manuscripts to appear together in a hard-copy issue of the Journal of Teacher Education, especially when many have already appeared online. For the second issue in our current volume, we chose to bring together a set of papers because, although they differ in grain size and specific focus, they address a matter which has broad reach and at the same time is situated at the heart of discussion and debate about teacher and program quality: the issue of assessment.

The papers in this issue all represent efforts to address a persistent challenge facing teacher education: how do we develop assessments that are informative, scalable, and accepted by the majority of experts in the field? Identifying common approaches to assessment has rarely been met with unbridled enthusiasm and agreement; it has instead been met with skepticism, debate, and frustration. In addition, there are those who doubt that assessment tools can ever be developed that are independent enough of specific contextual variables to be implemented meaningfully at sufficient scale and thus to provide guidance which diverse stakeholders can trust and implement confidently. Then there are those who challenge assessments that are purportedly objective and neutral and thus perceived as scalable (Salazar, 2018). Critics argue that assessments are value statements based on the perspectives of the developers of those assessments (Flynn, 2015; Salazar, 2018). When accreditation agencies prioritize certain indices of teacher quality (e.g., dispositions, core practices) and issues of scalability and value-neutrality are ignored, those playing leadership roles in teacher preparation programs may find themselves without the necessary time, resources, and expertise to respond thoughtfully to external demands. The result may be the cobbling together of assessments that are not useful for evaluative or learning purposes and may not reflect programmatic or institutional values.

In this editorial, we want to build upon some of these dilemmas as well as related ideas we presented in an earlier editorial (Bartell, Floden, & Richmond, 2018). We also want to challenge some of the notions upon which tools and assertions about assessment are often built. We propose that four questions ought to be asked to carefully examine the assessment of teaching and teacher preparation programs: Why? What? Who? and How?

Why

Why do we need to assess? Authors of the articles in this issue discuss a variety of goals for teacher and program assessment. These include improving teacher quality and demonstrating impact on student outcomes, accreditation, program improvement, self-reflection, and social justice–centered instruction.

Scholars also have asserted that the goal of assessment should be to address issues that challenge the nation, such as global competition, college and career readiness, and academic achievement gaps (Darling-Hammond, 2015; Wagner, 2014). We also must find ways to use data in powerful ways to support the development of excellent, equity-driven teachers who advance student learning and are committed to and skilled in addressing problems of social justice. Moreover, we must not lose sight of the fact that assessment should not be used to assimilate children but rather to be responsive to, sustain, and revitalize their cultural resources (Ladson-Billings, 2014; McCarty & Lee, 2014; Paris, 2012) while adding resources that are prized in U.S. society. We assess so that the results are meaningful and empowering for every member of the educational community.

One obvious conclusion is that assessment is complex and serves many different purposes. And these different purposes should determine not only the kind of data which are collected, but the ways these data are analyzed, used, and shared with others.

What

What should we assess? Goe, Bell, and Little (2008) state, "What is measured is a reflection of what is valued, and as a corollary, what is measured is valued" (p. 4). What is measured is of utmost importance in an assessment tool. The
articles in this edition assess dispositions, knowledge, and skills that are said to improve teaching and learning. What knowledge, dispositions, and skills should be assessed? There are many sources used to answer this question, including, but not limited to, standards, accreditation requirements, nationally implemented classroom observation tools, theory and research, practice, and personal experience. Many researchers have offered frameworks for understanding what teachers should know and be able to do. Darling-Hammond (2012) proposes one set of knowledge and skills, composed of the following: understand content concepts; connect content to prior knowledge and experiences; scaffold learning; facilitate standards-based and outcome-based instruction; provide students with opportunities to apply knowledge and master content; assess student learning, make instructional adjustments, and support students in monitoring their own learning; give explicit feedback; and manage student behavior and classroom routines. Other researchers have proposed that the field of teaching can be improved through a tight focus on a set of "high-leverage" or "core" practices that cut across grades, subjects, and student populations (e.g., Ball & Forzani, 2009; Forzani, 2014; Grossman, Hammerness, & McDonald, 2009; McDonald, Kazemi, & Kavanagh, 2013). It should be noted, however, that some have pointed out the danger of such a focus leading to the privileging of subject matter over practices which address issues of social justice and cultural responsivity (Philip et al., 2018; Richmond, Bartell, Floden, & Petchauer, 2017).

Such generic knowledge and skills are often perceived as objective and neutral, and they are often meant to promote effective teaching for all students. But others have challenged this notion. Mirra, Garcia, and Morrell (2015), for example, argue that neutral assessment tools may serve to marginalize certain populations: "What is seen as objective, in fact, represents the experience of those who possess more societal power, while the experiences of marginalized others are downplayed or outright ignored" (p. 17). Flynn (2015) repudiates notions of neutrality as false promises, myth-making, and bamboozling. He argues, "The very creation of a tool happens in a context with a certain set of assumptions, intentions, and repercussions (both intended and unintended)" (p. 212). In reality, because "most of the protocols for measuring performance give inadequate attention to teaching practices, generally called 'culturally responsive pedagogy' or CRP, any high-stakes teaching evaluation is likely, unintentionally and ironically, to fail the very students most in need of highly effective teaching" (Hawley & Irvine, 2011, p. 1). Such practice situates the dominant culture at the center of assessment; this results in the systemic marginalization of those who are "othered" (Salazar, 2018).

The effort to identify neutral, objective measures also may mask epistemological differences surrounding what counts as effective teaching and learning. Jones and Brownell (2014) document how commonly used observation tools, such as the Framework for Teaching, may not reflect the kinds of instructional practices that research indicates benefit students with disabilities. In her article in this issue, Nava challenges notions of objectivity and neutrality by reflecting the values and needs of diverse learners. She states that many classroom observation assessment tools exclude equity, humanizing pedagogy, and social justice and, instead, describes the development of content-specific observation rubrics that embody program values of equity and humanizing pedagogy.

The issue of what data matter most raises the question of the extent to which those same data do or should reflect program or institutional values and mission (e.g., social justice, culturally responsive pedagogy), and of whether value-neutral measures are to be preferred or are even possible. And although the paper by Vagi, Pivovarova, and Barnard in this issue suggests greater retention of teachers who have higher observational scores while in their teacher preparation programs, there remains the question of the extent to which such measures reflect the mission, values, or focus of the program in which they are being used. Indeed, it has been argued that it is unlikely that any value-neutral measures of teacher effectiveness are possible to create (e.g., Caughlan & Jiang, 2014).

The field will continue to grapple with the question of what should be assessed, and we will continue to engage in context-reduced and context-responsive approaches to assessment. There is a need for additional research on the impact of each of these approaches on teaching and learning and on job satisfaction and retention.

Who

By "who," we mean "Who should be developing and administering assessments; who should be interpreting and communicating data gleaned from these assessments; and for whom are these assessments meaningful?" The Organization for Economic Cooperation and Development (2009) emphasizes the importance of including a range of stakeholders in teacher assessment, such as "parents, students, teachers, school leaders, teacher unions, educational administrators and policy makers in the development and implementation of teacher evaluation and assessment processes" (p. 4). Salazar and Lerner (2019) emphasize the importance of including students and parents in assessment, "particularly those whose survival depends on education to be the greatest equalizer, as promised by Horace Mann" (p. 144). Although scholars and practitioners encourage an inclusive approach to who should be involved in assessment, it is important to ask the question: For whom is assessment meaningful? It is meaningful for everyone involved in the educational community, but especially for those who are systematically left behind. The language that we use to describe the participants in assessment matters; for example, the word "stakeholder" implies a language of transaction and return on investment, whereas the word "community" denotes a language of
collaboration and meaning-making. The articles in this issue use a variety of terms to describe the assessment community and assert that assessment is meaningful for students, teachers, policy makers, programs, and the field. Ultimately, assessment is meaningful when participants can use the results in powerful ways to improve teaching and learning. Concomitantly, who is developing the assessments also matters. This is not only a matter of considering, at the broadest level, the knowledge and skills needed for assessment development. Because assessment developers advance notions of teaching that are based on their own assumptions of quality and worth (Flynn, 2015), it is important that those who develop assessments of teacher and program quality are inclusive of, as well as representative of, the communities these assessments will serve (Salazar & Lerner, 2019).

How

How should we assess? Researchers in the K-12 evaluation field have written much about lessons learned concerning the complex and thorny issues that emerge in trying to use measurement tools for purposes other than those for which they were designed. Classroom-based observation is the most widely used type of tool to measure teacher effectiveness (Little, Goe, & Bell, 2009). Since 2013, all states have required classroom observation as a component of their state teacher evaluation system (Hull, 2013). Almost all of the observation tools used in teacher evaluation were developed for use in research settings, and we should not assume that they will function in similar ways in the context of teacher education. We have learned, for example, that observation scores are often subject to bias and can be sensitive to a variety of contextual factors (Garrett & Steinberg, 2015; Gill, Shoji, Coen, & Place, 2016; Steinberg & Garrett, 2016; Whitehurst, Chingos, & Lindquist, 2014). And school and building administrators commonly struggle to use observation systems in the ways in which they were trained (Bell et al., 2013; Bell, Jones, Qi, & Lewis, 2018; Donaldson & Woulfin, 2018). Administrators also approach the observation process in different ways and with different priorities than raters would in the context of a research study: they are not focused only on creating "reliable" scores but also on managing relationships with employees, where a central goal is helping their staff improve. These findings have implications for how we think about the use of these tools in preservice settings. How can those who should be using these tools be prepared to understand the purposes for which particular tools were developed and to use them and apply scores in valid, appropriate ways?

The articles in this volume address challenges faced in teacher and program assessment, including pressure to adopt rigorous and high-stakes performance-based assessments; data deficiencies; measurement error; a changing policy landscape; and dealing with unintended consequences. Salazar and Lerner (2019) encourage scholars and practitioners to "move beyond our self-imposed boundaries" (p. 144) when considering teacher assessment. They and other scholars (e.g., Croft, Roberts, & Stenhouse, 2015) advocate for communal and egalitarian strategies for teacher assessment. These include engaging students and communities in developing assessment tools; assessing teacher impact on students' full potential (e.g., academic, cultural competence, transformative capacities); and using alternative terms for teacher assessment, such as "teaching and learning collectivo" or "teacher and student development," to drive efforts for collaboration and support (Salazar & Lerner, 2019, p. 145).

Conclusion

Collectively, the papers in this issue highlight the central role of measurement in teacher preparation and development, and they appear at an interesting time; the ground has shifted in the K-12 evaluation space, with less enthusiasm for the kinds of rigorous, systematic efforts to connect teachers to the performance of their students. Although policies around evaluations are still on the books, policy makers and practitioners seem to be moving away from the idea of using observation data for making decisions about promotion and tenure; classroom observations are increasingly seen as a source of information to guide teacher development.

Looking from this specific set of papers to the broader set of issues raised about assessment within teacher education, we suggest three principal "takeaways." The first is that this kind of work, that is, articulated efforts to build a rigorous and robust set of measurement tools in teacher education, takes time to develop, substantial effort across many researchers, and sufficient funding to support large-scale studies of the kind that have allowed measurement work on teacher quality to grow in the K-12 literature.

The second takeaway is that we should be focusing our energies on two goals simultaneously: developing tools which are context responsive, and developing a common language for communicating "lessons learned" from the use of these tools and for the purposes of decision making.

The third and final takeaway is that the time and resources demanded for such work should be set against a sense of great urgency, as we see greater and greater opportunity gaps between those from more privileged communities and those from communities that have the most to gain from the informed, powerful use of data and who have historically been the least advantaged by the educational system.

References

Ball, D., & Forzani, F. M. (2009). The work of teaching and the challenge for teacher education. Journal of Teacher Education, 60(5), 497-511.
Bartell, T., Floden, R., & Richmond, G. (2018). What data and measures should inform teacher preparation. Journal of Teacher Education, 69(5), 426-428.
Bell, C. A., Jones, N. D., Lewis, J. M., Qi, Y., Liu, S., & McLeod, M. (2013). Understanding consequential assessment systems of teaching: Year 1 final report. Los Angeles, CA: Los Angeles Unified School District.
Bell, C. A., Jones, N. D., Qi, Y., & Lewis, J. M. (2018). Strategies for assessing classroom teaching: Examining administrator thinking as validity evidence. Educational Assessment, 23(4), 229-249.
Caughlan, S., & Jiang, H. (2014). Observation and teacher quality: Critical analysis of observational instruments in preservice teacher performance assessment. Journal of Teacher Education, 65(5), 375-388. doi:10.1177/0022487114541546
Croft, S. J., Roberts, M. A., & Stenhouse, V. L. (2015). The perfect storm of education reform: High-stakes testing and teacher evaluation. Social Justice, 42, 70-92.
Darling-Hammond, L. (2012). Creating a comprehensive system for evaluating and supporting effective teaching. Stanford Center for Opportunity Policy in Education. Retrieved from https://edpolicy.stanford.edu/sites/default/files/publications/creating-comprehensive-system-evaluating-and-supporting-effective-teaching.pdf
Darling-Hammond, L. (2015). The flat world and education: How America's commitment to equity will determine our future. New York, NY: Teachers College Press.
Donaldson, M. L., & Woulfin, S. (2018). From tinkering to going "rogue": How principals use agency when enacting new teacher evaluation systems. Educational Evaluation and Policy Analysis, 40(4), 531-556.
Flynn, J. E. (2015). Racing the unconsidered: Considering whiteness, rubrics, and the function of oppression. In M. Tenam-Zemach & J. E. Flynn (Eds.), Rubric nation: Critical inquiries on the impact of rubrics in education (pp. 201-221). Charlotte, NC: Information Age.
Forzani, F. M. (2014). Understanding "core practices" and "practice-based" teacher education: Learning from the past. Journal of Teacher Education, 65(4), 357-368.
Garrett, R., & Steinberg, M. P. (2015). Examining teacher effectiveness using classroom observation scores: Evidence from the randomization of teachers to students. Educational Evaluation and Policy Analysis, 37(2), 224-242.
Gill, B., Shoji, M., Coen, T., & Place, K. (2016). The content, predictive power, and potential bias in five widely used teacher observation instruments (REL 2017-191). Washington, DC: U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Mid-Atlantic. Retrieved from http://ies.ed.gov/ncee/edlabs
Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. National Comprehensive Center for Teacher Quality. Retrieved from https://files.eric.ed.gov/fulltext/ED521228.pdf
Grossman, P., Hammerness, K., & McDonald, M. (2009). Redefining teaching, re-imagining teacher education. Teachers and Teaching: Theory and Practice, 15(2), 273-289.
Hawley, W. D., & Irvine, J. J. (2011, December). The teaching evaluation gap: Why students' cultural identities hold the key. Education Week, 31, 30-31.
Hull, J. (2013). Trends in teacher evaluation: How states are measuring teacher performance. Alexandria, VA: National School Boards Association.
Jones, N. D., & Brownell, M. T. (2014). Examining the use of classroom observations in the evaluation of special education teachers. Assessment for Effective Intervention, 39(2), 112-124.
Ladson-Billings, G. (2014). Culturally relevant pedagogy 2.0: A.k.a. the remix. Harvard Educational Review, 84(1), 74-84.
Little, O., Goe, L., & Bell, C. (2009). A practical guide to evaluating teacher effectiveness. National Comprehensive Center for Teacher Quality. Retrieved from https://files.eric.ed.gov/fulltext/ED543776.pdf
McCarty, T., & Lee, T. (2014). Critical culturally sustaining/revitalizing pedagogy and Indigenous education sovereignty. Harvard Educational Review, 84(1), 101-124.
McDonald, M., Kazemi, E., & Kavanagh, S. S. (2013). Core practices and pedagogies of teacher education: A call for a common language and collective activity. Journal of Teacher Education, 64(5), 378-386.
Mirra, N., Garcia, A., & Morrell, E. (2015). Doing youth participatory action research: Transforming inquiry with researchers, educators, and students. New York, NY: Routledge.
Organization for Economic Cooperation and Development. (2009). Teacher evaluation: A conceptual framework and examples of country practices. Retrieved from https://www.oecd.org/education/school/44568106.pdf
Paris, D. (2012). Culturally sustaining pedagogy: A needed change in stance, terminology, and practice. Educational Researcher, 41(3), 93-97.
Philip, T. M., Souto-Manning, M., Anderson, L., Horn, I. J., Carter Andrews, D., Stillman, J., & Varghese, M. (2018). Making justice peripheral by constructing practice as "core": How the increasing prominence of core practices challenges teacher education. Journal of Teacher Education. https://doi.org/10.1177/0022487118798324
Richmond, G., Bartell, T., Floden, R., & Petchauer, E. (2017). Core teaching practices: Addressing both social justice and academic subject matter. Journal of Teacher Education, 68(5), 432-434.
Salazar, M. (2018). Interrogating teacher evaluation: Unveiling whiteness as the normative center and moving the margins. Journal of Teacher Education, 69(5), 463-476.
Salazar, M., & Lerner, J. (2019). Teacher evaluation as culture: A framework for equitable and excellent teaching. New York, NY: Routledge.
Steinberg, M. P., & Garrett, R. (2016). Classroom composition and measured teacher performance: What do teacher observation scores really measure? Educational Evaluation and Policy Analysis, 38(2), 293-317.
Wagner, T. (2014). The global achievement gap: Why even our best schools don't teach the new survival skills our children need—and what we can do about it. New York, NY: Basic Books.
Whitehurst, G., Chingos, M., & Lindquist, K. (2014). Evaluating teachers with classroom observations: Lessons learned in four districts. Washington, DC: Brown Center on Education Policy at Brookings.