Integrated Literature Review: The Relationship Between Assessment and Student Achievement
William Kralovec
In education, the term assessment refers to the wide variety of methods or tools that
educators use to evaluate, measure, and document the academic readiness, learning progress,
skill acquisition, or educational needs of students. (Great Schools Partnership, 2013) A precise
definition of student achievement is difficult because the term has several layers of meaning and
nuance. Student learning and student achievement are closely related ideas and are often used
understandings, and skills at one point in time. (Linn, Bond, Darling-Hammond, Harris, Hess,
& Shulman, 2011). Student learning is the growth of achievement, or as previously defined,
assessed through standardized tests, developed by testing companies and administered to large
groups of students. Standardized tests compare and rank students. Teachers, students, parents,
policymakers and others focus on what is being measured, thus assessment and achievement are
closely related. The type of assessment determines what type of achievement is measured and
therefore valued. Achievement can also be measured through graduation rates, attendance,
university acceptances, and ultimately, whether a student matures into a happy, contributing
member of society. As Dewey wrote in The Child and the Curriculum (1902):
The world in which most of us live is a world in which everyone has a calling and
occupation, something to do. Some are managers and others are subordinates. But the
great thing for one as for the other is that each shall have had the education which enables
him to see within his daily work all there is in it of large and human significance.
RELATIONSHIP BETWEEN ASSESSMENT & STUDENT ACHIEVEMENT 3
This review does not consider the full interpretation of achievement; it focuses on the
achievement that is measured or tested. However, student achievement is larger than what is
formally tested in the core subjects. There is a larger body of knowledge and skills that can be
measured that are not included on standardized tests. There is also an even larger amount of
classroom learning that cannot be measured easily. Linn et al. (2011) capture this model in
Figure 1.
Assessment takes many forms and serves different purposes. One major division of
assessment is formative versus summative. The seminal study of formative assessment by Black
and Wiliam (1998) defined it as all those activities undertaken by teachers, and/or by their
students, which provide information to be used as feedback to modify the teaching and learning
activities in which they are engaged. Dunn and Mulvenon (2009) further clarified that the
assessments need to be used to monitor student progress during the learning process, and
teachers must provide qualitative and quantitative feedback to the student, if they are to be
considered formative. Many researchers, including Black and Wiliam (1998) in their meta-analysis of
over 250 articles related to formative assessment, have shown conclusively that formative assessment
does improve learning, and that gains in student achievement were among the largest ever
reported. One criticism of the numerous studies demonstrating this is that there are no
well-defined practices or artifacts that represent formative assessment. Implementations
differ greatly from one setting and student population to the next.
(Bennett, 2011) Bennett urges that new development focus on conceptualising well-specified
approaches built around process and methodology rooted within specific content domains.
Stiggins and Chappuis (2005) did set conditions for effective formative assessment as follows:
communicated clearly.
The research on formative assessment does back up the claim that these aspects of
formative assessment improve student achievement. Orsmond, Merry and Reiling (2002)
demonstrated that the use of exemplars, self-assessment and peer marking criteria improved first-year
previous students' work at different levels. Feedback, done well, is a major influence on learning
and achievement (Hattie & Timperley, 2007). This means asking the right questions to
identify knowledge gaps, find erroneous understanding and determine remediation or alternative
steps to learning. Feedback is so important that teachers should automate many of the routine
tasks in schools so more time and resources can be devoted to responding to feedback. (Hattie &
Jaeger, 1998)
the individualization of learning. For example, Faber, Luyten & Visscher (2017) found large
gains in grade 3 mathematics from the use of digital formative assessment programs. Other technology
tools like Audience Response Systems (clicker-based technologies) and mobile devices (phones,
laptops) also had a small but positive effect on learning through increased communication of
feedback between teacher and students. (Hunsu, Adesope, & Bailey, 2016; Sung, Chang, & Liu,
2016)
In contrast, summative assessments are used to evaluate student learning, skill acquisition
and academic achievement at the conclusion of a defined instructional period. (Great Schools
Partnership, 2013) This is typically at the end of a project, unit, course, semester, program or
school year. Smith (2014) gives a comprehensive classification model of all categories of
Summative testing summarizes how well an individual is doing at a given point in time.
(Looney, 2009; Brookhart, 2011) Scores are disseminated at the district, state, national, regional
or world levels. Testing for assessment evaluates academic progress and may direct instruction,
hence becoming formative in nature. For example, graded work with comments will allow
students to take the feedback and improve on the next summative assessment. (Basey, Maines,
Frances, 2014) Testing for advancement determines the academic trajectory for students. For
junior high schools, high schools and universities is determined by exams. (Yamamoto, 2016)
Examples of testing for accreditation are teacher certification exams or high school graduation
proficiency exams, which can serve both for accreditation and advancement.
In the USA, testing for accountability is becoming more prevalent. Because formative
tests have become synonymous with student achievement. Testing for accountability shifts the
blame from low-performing students to low-performing schools, including teachers and
administration. (Apple, 1999) There is much research outlining the history of this movement in
America, starting in the 1970s, when education leaders established a link between test scores and
school accountability. (Dorn, 2007; Kornhaber, 2004; Carl, 1994; Lee, 2008) The failure to
narrow the achievement gap between upper- and middle-class white students and disadvantaged
minority students was the focus of the 1983 A Nation at Risk report. (Hopmann, 2008) The Educate
America Act of 1994 under the Clinton administration pushed for national standards and
voluntary national testing in grades 4, 8 and 12. (Carl, 1994) The movement culminated with the
No Child Left Behind Act (NCLB), signed by George W. Bush in January of 2002. The law
required states to link standards, assessment and accountability. (Lohman, 2010) Schools were
judged on making adequate yearly progress (AYP) towards 100% student proficiency within
three years or faced sanctions and even potential school closure. (Springer, 2008)
Does the increase in testing for accountability improve student
achievement? A meta-analysis of 25 states by Nichols, Glass & Berliner (2012)
looked at the relationship between high-stakes testing pressure and student achievement: "...a
pattern seems to have emerged that suggests that high-stakes testing has little or no relationship
to reading achievement, and a weak to moderate relationship to math, especially in fourth grade
but only for certain student groups." (Nichols, Glass, & Berliner, 2012, p. 3) Dee & Jacob (2009)
concurred that NCLB improved math achievement in grade 4 students, improved the lower
Beyond the impact on student test scores, other unintended consequences resulted from
assessment for accountability. Nichols, Glass & Berliner (2005) reminded us of Campbell's
Law and that high-stakes tests cannot be trusted because they are corrupted and distorted. Campbell's
Law predicts that the more quantitative indicators are used for social decision-making, the more subject
they will be to corruption pressures and the more apt they will be to distort and corrupt the social
processes they are intended to monitor. (Campbell, 1979, p. 85) Scholarly evidence finds that
curriculum (Au, 2007; Jacob, 2005), gaming the system by reclassifying students to remove them
from the testing pool (Cullen & Reback, 2006; Figlio & Getzler, 2006; Jacob, 2005), and outright
cheating. (Jacob & Levitt, 2003) A recent report from the National Assessment of Educational
Progress (NAEP) showed little progress between the 2008 and 2016 assessments in music and
visual arts achievement of grade 8 students. (Johnson, 2017) This may confirm the narrowing of
the curriculum with resources going towards mathematics and reading because they are assessed
The goal of 100% proficiency announced with the NCLB legislation was unrealistic and
the consequences of school closure and personnel changes were too high for schools. This
resulted in individual states lowering their standards to ensure all students reached the
proficiency standards (Peterson & Hess, 2008). In response, the United States Department of
Education announced the Race to the Top program in 2009. The competitive grant program
administrators, adopting common learning standards and policies that do not prohibit charter
schools (US Department of Education, 2015). The program shifts control away from the states toward
a national focus on standards and inducements, instead of punitive measures. (Lohman, 2010)
There has been much public discourse on the adoption of Common Core State Standards (CCSS)
and the growth of charter schools. Lee and Wu (2017) found that the CCSS movement has
indeed raised standards but not student achievement on the NAEP math and reading assessments.
The Common Core has helped America race to the top for performance standards, but not for
performance outcomes yet. (Lee and Wu, 2017) This might be because the NAEP is not yet aligned
to the CCSS. (Wixson, Valencia, Murphy, & Phillips, 2013; Hughes, Daro, Holtzman, &
Middleton, 2013)
The testing for accountability movement is spreading throughout the world. (Smith,
2014) The Programme for International Student Assessment (PISA) is one of the best-known
assessments used to measure student achievement across the world. The number of countries
participating has increased from 43 in the first round in 2000 to 71 in the latest round
of tests in 2015. Surveys show PISA affects educational systems throughout the world through
ministries of education learning about and emulating practices and policies of countries of high
achievement or that have demonstrated growth. (PISA, 2017) One example is Germany, where reading
achievement improved greatly between 2000 and 2009. PISA evidence indicated
great inequalities among schools in the country, and Germany invested in sub-par and disadvantaged
schools. (Hanushek & Woessmann, 2010) In another example, US Education Secretary Arne
Duncan in 2012, called for ...accelerating achievement in secondary school and the need to
close large and persistent achievement gaps and the results of the revealed US students failed
computer-based adaptive assessments in reading, mathematics, language usage and science can
be administered to students in kindergarten through grade 10. Students can compare themselves
to norm groups in the USA, international schools and the several regional associations of
international schools. Currently, over 1,000 international schools in 145 countries are using the
assessment, along with over 5,000 schools in the USA. (NWEA, 2017) Precise feedback is
provided immediately to teachers and it is designed for teachers to use to direct instruction. The
assessment is usually given two to three times per year in order to assess progress.
NWEA has a team of 20 curious and innovative researchers who spend their days
investigating strategies to advance academic student growth and measurement. (NWEA, 2017)
In a series of studies focusing on high-achieving students, defined as being in the 90th percentile
on the MAP assessment, approximately 40% dropped to around the 80th percentile over time;
however, a larger number of students in the 80th percentile rose to the 90th percentile and above.
The study also revealed that high achievers in both high poverty and low poverty schools had no
difference in growth rates. (Xiang, Dahlin, Cronin, Theaker, & Durant, 2011) Another study
reaffirmed that the poverty rate at a given school had little effect on the growth of these high
achieving students. There was also much variance in growth among schools, with a significant
number of the high-poverty schools among the 1,300 schools in the study outperforming low-poverty
schools in the growth of high-achieving students. (Dahlin & Tarasawa, 2013) In both studies,
although growth rates in high- and low-poverty schools were the same, pre-existing achievement
gaps gave more students in high-achieving schools access to merit-based scholarships. Dahlin
and Tarasawa conclude that policy makers should not define the nation's elite students as those
scoring in the top 1%, 5% or 10% of the national standardized pool; rather, every school has its own
top 10% of students, and improving the achievement of these elite students promotes American
competitiveness and a more fair and just society. (Dahlin & Tarasawa, 2013, p. 5)
For an assessment so commonly used in American and international schools, there is a
disturbing lack of studies by independent researchers on the relationship between MAP and
student achievement. One study showed that MAP had no significant impact on the scores of
students on the state reading tests in grades 4 and 5. Evidence showed that subgroups of low- and
high-achieving students may benefit most from MAP, but more research is needed. (Cordray,
Pion, Brandt, & Molefe, 2013) Other groups, such as English language learners, low-income students and special needs
students, grow at the same rate as all students on MAP reading, mathematics and language usage
teacher evaluation using MAP. Gray (2010) found principals partially able to identify
On the basis of the research findings in this work, the following recommendations to
growth and achievement, greater emphasis on best practices and wider implementation of
Policies of assessment for accountability have too narrowly focused the attention of
teachers and schools on low-performing students and students close to proficiency level,
and a broader assessment program is needed to address the needs of all students. For our
satisfied with proficiency or world averages, but individualize assessment and focus on
achievement for all students. This will allow us to have more data to support
Assessment of reading and mathematics has narrowed teaching and learning to these
subjects. A drive to improve and assess student achievement in the fine arts, natural
sciences, technology and innovation, instead of solely assessing these two subjects
would benefit students. Developing assessments for achievement in these areas should be
completed.
international schools.
Using assessment data at our school, explore developing a value-added model of teacher
evaluation.
Although not an issue at our school, improving the family and home life of high-poverty and
disadvantaged students would raise the achievement of students coming from these homes.
Research indicates that the pre-existing conditions of students determine their ultimate
take the form of outreach meetings and workshops for our families (parents,
References
Andere, E. (2015). Are teachers crucial for academic achievement? Finland educational success
http://dx.doi.org/10.14507/epaa.v23.1752
Apple, M. W. (1999). Rhetorical reforms: Markets, standards, and inequality. Current Issues in
Basey, J. M., Maines, A. P., & Frances, C. D. (2014). Time efficiency, written feedback, and student
Ben-Shakhar, G., & Sinai, Y. (1991). Gender differences in multiple-choice tests: The role of
Benavot, A., & Tanner, E. (2007). The growth of national learning assessments in the world,
1995-2006. Paper commissioned for the EFA Global Monitoring Report 2008, Education
for All by 2015: will we make it? New York, NY: United Nations Educational, Scientific,
Black, P. & Wiliam, D. (1998). Assessment and classroom learning. Assessment in Education,
5(1), 7-74.
Black, P., & Wiliam, D. (1998). Inside the black box: Raising standards through classroom
Bracey, G. (2007). The proficiency illusion. Phi Delta Kappan, 89(4), 316.
Brookhart, S. (2011). Educational assessment knowledge and skills for teachers. Educational
(1367603795)
Candal, C. S. (2016). Massachusetts charter public schools: Best practices using data to
Carl, J. (1994). Parental choice as national policy in England and the United States. Comparative
Carney, S., Rappleye, J., & Silova, I. (2012). Between faith and science: World culture theory and
Carnoy, M., & Rothstein, R. (2013). What do international tests really show about U.S. student
doi:10.1080/08963560903017680
Cave, P. (2007). Primary school in Japan: Self, individuality and learning in elementary
Center on Education Policy. (2008). Has student achievement increased since 2002? State test
score trends through 2006-2007. Washington, DC: The George Washington University.
Cordray, D., Pion, G., Brandt, C., & Molefe, A. (2013). The impact of the measures of
doi:10.1016/S1590-8658(04)00533-X
Cullen, J. B., & Reback, R. (2006). Tinkering toward accolades: School gaming under a
Research.
Culross, R., & Tarver, E. (2011). A summary of research on the international baccalaureate
doi:10.1177/1475240911422139
Dahlin, M., & Tarasawa, B. (2013). A level playing field? How college readiness standards
Association.
Dee, T., & Jacob, B. (2009). The impact of No Child Left Behind on student achievement.
Dewey, J. (2008). The child and the curriculum including, the school and society. New York,
Dorn, S. (2007). Accountability Frankenstein: Understanding and taming the monster. Charlotte,
Dunn, K. E., & Mulvenon, S. W. (2009). A critical review of research on formative assessment:
Faber, J. M., Luyten, H., & Visscher, A. J. (2017). The effects of a digital formative assessment
Figlio, D. N., & Getzler, L. S. (2002). Accountability and disability: Gaming the system.
Figlio, D., & Loeb, S. (2011). School accountability. In Hanushek, E., Machin, S. & Woessman,
doi:10.1016/B978-0-444-53429-3.00008-9
Govan, C. (2013). Observing the academic performance and student graduation rates of grade
8 students who failed Texas state assessments (Doctoral dissertation). Retrieved from
teachers principal ratings and residual gain on standardized tests. (Doctoral dissertation)
Great Schools Partnership. (2013). Glossary of education reform [website]. Retrieved from
http://edglossary.org/
Hanushek, E., & Woessmann, L. (2010). The high cost of low educational performance. Paris,
doi:10.1787/9789264077485-en
Harris, D., & Herrington, C. (2006). Accountability standards and the growing achievement gap:
Lessons from the past half-century. American Journal of Education, 112(2), 209-238.
doi:10.1086/498995
Hattie, J., Biggs, J., & Purdie, N. (1996). Effects of learning skills interventions on student
doi:10.3102/00346543066002099
Hattie, J., & Jaeger, R. (1998). Assessment and classroom learning: A deductive approach.
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research,
Hopmann, S. T. (2008). No child, no school, no state left behind: Schooling in the age of
doi:10.1080/00220270801989818
Hughes, G., Daro, P., Holtzman, D., & Middleton, K. (2013). A study of the alignment between
the NAEP mathematics framework and the common core state standards for mathematics
Jacob, B. A., & Levitt, S. D. (2003). Rotten apples: An investigation of the prevalence and
Jacob, B. A. (2005). Accountability, incentives and behavior: The impact of high-stakes testing
doi:10.1016/j.jpubeco.2004.08.004
Johnson, L. (2017). Nation's report card finds mixed grades for U.S. students in visual arts,
music. National Public Radio Ed: How Learning Happens. Retrieved from
http://www.npr.org/sections/ed/2017/04/25/525444055/nations-report-card-finds-mixed-grades-for-u-s-students-in-visual-arts-and-music
Korb, M. B. (2014). Differences race to the top funded programs make in student AIMS reading
(1545674285)
Lauen, D., & Gaddis, S. (2012). Shining a light or fumbling in the dark? The effects of NCLB's
Lay, Y., & Chandrasegaran, A. (2016). Availability of school resources and TIMSS grade 8
Lee, J. (2008). Is test-driven external accountability effective? Synthesizing the evidence from
Lee, J., & Wu, Y. (2017). Is the Common Core racing America to the top? Tracking changes in
state standards, school practices, and student achievement. Education Policy Analysis
Li, H. (2012). How is formative assessment related to students' reading achievement? Findings
from PISA 2009. Paper presented at the Annual Meeting of the Northeastern Educational
Linn, R., Bond, L., Darling-Hammond, L., Harris, D., Hess, F., & Shulman, L. (2011). Student
learning, student achievement: How do teachers measure up. Arlington, VA: National
Lohman, J. (2010). Comparing No Child Left Behind and Race to the Top. (2010-R-0235).
Looney, J. W. (2009). Assessment and innovation in education. OECD Working Papers in
Luo, S. (2013). The effects of advanced placement and international baccalaureate programs
19(2), 202-235.
"measures of academic progress" (MAP) data. Retrieved from ProQuest Central K12.
(3642219).
educational progress. Washington, DC: National Center for Education Statistics (ED IES
13 C 0025)
Nichols, S., Glass, G., & Berliner, D. (2012). High-stakes testing and student achievement:
Updated analyses with NAEP data. Education Policy Analysis Archives, 20, 20.
doi:10.14507/epaa.v20n20.2012
Northwest Evaluation Association. (2013). NWEA's measures of academic progress myths and
Organisation for Economic Co-operation and Development. (2016). PISA 2015 results in focus.
Orsmond, P., Merry, S., & Reiling, K. (2002). The use of exemplars and formative feedback
when using student derived marking criteria in peer and self-assessment. Assessment &
Peterson, P. E., & Hess, F. M. (2008). Few states set world-class standards: In fact, most render
Phelps, R. (2000). Trends in large-scale testing outside the United States. Educational
Reardon, S., Fahle, E., Kalogrides, D., Podolsky, A., & Zarate, R. (2016). Test format and the
variation of gender achievement gaps within the United States. Paper presented at
impacts of accountability when standards are set low. Economics of Education Review,
44, 1-16.
Schleicher, A. (2016). Challenges for PISA. eJournal of Educational Research, Assessment and
Evaluation, 22(1)
Smith, W. (2014). The global transformation toward testing for accountability. Education
doi:10.1016/j.econedurev.2007.06.004
Stevens, J. J., Schulte, A. C., Elliott, S. N., Nese, J. F. T., & Tindal, G. (2015). Growth and gaps
doi:10.1016/j.jsp.2014.11.001
Stiggins, R., & Chappuis, J. (2005). Using student-involved classroom assessment to close
Sung, Y., Chang, K., & Liu, T. (2016). The effects of integrating mobile devices with teaching
United Nations Educational, Scientific and Cultural Organization Education for All. (2007). The
growth of national learning assessments in the world, 1995-2006 New York, NY:
schools under Race to the Top. Washington DC: Synergy Enterprises, Inc.
Au, W. (2007). High-stakes testing and curricular control: A qualitative metasynthesis.
Wixson, K. K., Valencia, S. W., Murphy, S., Phillips, G. W., American Institutes for Research,
& National Center for Education Statistics. (2013). A study of NAEP reading and writing
frameworks and assessments in relation to the Common Core State Standards in English
Xiang, Y., Dahlin, M., Cronin, J., Theaker, R., & Durant, S. (2011). Do high flyers maintain
their altitude? Performance trends of top students. (No. 198). Washington DC, USA: