doi: 10.1093/deafed/enaa020
Advance Access Publication Date: 22 July 2020
Theory/Review Manuscript
Abstract
This review systematically identified and compared the technical adequacy (reliability and validity evidence) of reading
curriculum-based measurement (CBM) tasks administered to students who are deaf and hard of hearing (DHH). This review
included all available literature written in English. The nine studies identified used four CBM tasks: signed reading fluency,
silent reading fluency, cloze (write in missing words given blank lines within a passage), and maze (circle the target word
given multiple choice options within a passage). Data obtained from these measures were generally found to be internally
consistent and stable with validity evidence varying across measures. Emerging evidence supports the utility of CBM for
students who are DHH. Further empirical evidence is needed to continue to explore technical properties, identify if student
scores are sensitive to growth over short periods of time, and examine whether CBM data can be used to inform
instructional decision-making to improve student outcomes.
Students who are deaf or hard of hearing (DHH) are a heterogeneous population, with some students meeting or exceeding literacy benchmarks and others demonstrating limited proficiency (Mayer & Trezek, 2018). A large majority of students experience persistent delays in obtaining proficiency similar to their same-aged peers with typical levels of hearing (Allinder & Eccarius, 1999; Luckner, 2013). Limited literacy proficiency can impede students who are DHH from finishing high school (Appelman et al., 2012), can create barriers in postsecondary education (Hartmann, 2010), and can present challenges with meeting literacy demands in the workplace (Luft, 2012).

Educational legislation calls for the use of data-based instruction and evidence-based educational strategies to efficiently and effectively provide high-quality instruction and targeted individualized instruction to students who are at risk or with disabilities (IDEIA, 2004; ESSA, 2015). As such, there is a need for researchers to develop and practitioners to implement evidence-based practices to address the persistent problem of underachievement for this subgroup of students who are DHH (Luft, 2018). Evidence-based practices are supported by the use of valid and reliable assessments to measure student response to instruction and by the use of data to modify the approach when needed (Rose, 2007; Thomas & Marvin, 2016).

Within the field, there remains a critical need for assessment tools that can reliably measure a student's skill level, can generate data that predict performance on comprehensive assessment measures, are sensitive to growth over short periods of time, and can inform instructional decision-making for students who are DHH (Thomas & Marvin, 2016). One assessment method that shows promise to inform instructional planning is curriculum-based measurement (CBM; Devenow, 2003; Luckner & Bowen, 2006; Rose, 2007).

CBM, as conceptualized by Deno (1985), is an approach that can effectively provide "vital signs" of a student's overall academic performance. CBM was designed to measure students' responsiveness to instruction and inform service delivery.
Received December 18, 2019; revisions received June 13, 2020; accepted June 15, 2020
© The Author(s) 2020. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
E. A. Lam et al. 399
Deno (1985) described that CBM should meet the following criteria: It should be (a) reliable and valid, (b) simple and efficient, (c) easy to understand, and (d) inexpensive. CBM tools are designed to be quick, efficient, and sensitive to growth over short periods of time.

Use of CBM with Hearing Populations

For students with typical levels of hearing, oral reading fluency (ORF) is the most commonly used CBM tool. Within the context of progress monitoring, students are administered the ORF task as often as once a week with performance scored and charted. The general process is as follows: Students are presented with a passage and prompted to read the text aloud in the time allotted (e.g., 1 min). The examiner tracks as the student reads, marks

Figure 1. Systematic review screening process documents the systematic search process including the following steps: identification, screening, eligibility, and included studies.

only communication mode, and that communication modalities may be more complex than a three-category delineation. Luckner (2013) called for further research exploring the technical properties of student scores and the potential of using CBM for progress monitoring. Given the need for viable assessment tools for students who are DHH, the appropriateness of these tools for other special populations, and the call for future research to explore the technical features of CBM with students who are DHH, we conducted this systematic review.

Purpose of Review

The purpose of this review was to explore the technical adequacy of CBM with students who are DHH. According to Fuchs (2004), a
examined 38 technical reports from the Research Institute on Progress Monitoring (RIPM), and 1 article was identified (Rose, 2008) and met the inclusion criteria in the full-text screen. We identified an additional unpublished master's thesis (Barkmeier, 2009), which shared the same sample of participants as Rose (2008) but analyzed student performance on a different set of measures. The master's thesis was screened, met criteria, and was ultimately included.

Last, we conducted an interrater agreement (IRA) analysis to determine the consistency in decision-making. A trained researcher, independent of the authors, conducted the review of 20% of the articles at each stage. IRA was as follows: initial database (99% title screen, 100% full screen), ancestral searches (88%, 100%), frequently occurring journals (99%, 100%), and RIPM (100%, 100%). All disagreements were reviewed and resolved.
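Agreement percentages like those reported for the screening stages are simple proportions of matching decisions between two raters. A minimal sketch, using hypothetical include/exclude decisions rather than the review's actual data:

```python
# Interrater agreement (IRA) as the percentage of matching screening
# decisions. The decision lists below are hypothetical illustrations.

def percent_agreement(rater_a, rater_b):
    """Return the percentage of items on which two raters agree."""
    if len(rater_a) != len(rater_b):
        raise ValueError("Raters must screen the same items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100 * matches / len(rater_a)

# Example: title-screen decisions (include/exclude) for 10 sampled articles
author = ["in", "ex", "ex", "in", "ex", "in", "in", "ex", "ex", "in"]
rater  = ["in", "ex", "ex", "in", "ex", "in", "ex", "ex", "ex", "in"]
print(percent_agreement(author, rater))  # -> 90.0
```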
Inclusion Criteria

To be included in this review, each study needed to (1) assess the reading performance of students who are DHH using at least one CBM reading task, (2) explore the technical characteristics (validity and reliability) of student scores, and (3) be written in English, inclusive of theses, dissertations, and technical reports. For the CBM requirement, we accepted studies that generally met the criteria set forth by Deno (1985), except in some cases the scoring did not meet the criteria of simple and efficient. For the reliability and validity requirements, we used the definitions set forth by the American Educational Research Association (AERA, 1999). As defined, reliability refers to "the consistency of such measurements when the testing procedure is repeated on a population of individuals or groups" (p. 25). Within

IRA was conducted with a trained researcher with 20% of the studies selected for review. In some cases, the first author reported ranges where the rater reported the summary or composite value, though both representations of the data were correct. Inter-rater agreement was 96%; two disagreements were identified and resolved.

Results

The literature search yielded nine studies that fell into four categories (signed reading fluency, silent reading fluency, cloze, and maze), described in more depth below. For each study, the reliability and validity findings were reviewed. See Table 1 for the reliability and Table 2 for the validity findings for each study.
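The validity coefficients reported in Table 2 are correlations between CBM scores and criterion measures. As a reminder of what a single coefficient summarizes, here is a minimal Pearson correlation sketch over hypothetical score pairs (not data from the reviewed studies):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical CBM scores and achievement-test scores for six students
cbm = [12, 18, 9, 25, 20, 15]
ach = [195, 200, 190, 215, 205, 210]
print(round(pearson_r(cbm, ach), 2))  # -> 0.85
```

A coefficient near 1 means students who score high on the CBM task also tend to score high on the criterion measure, which is the pattern the reviewed studies probe.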
Table 1 Reliability of CBM for students who are deaf or hard of hearing
Allinder & Eccarius (1999) Inter-rater Words read correctly 40 to 100% (M = 78.69%)
Idea units retold 0 to 100% (M = 78.76%)
Internal consistency 1- & 3-min passages .89 to .97
Alternate form 1-min passages .85
3-min passages .94
Lam (2020) Alternate form: Correct words identified .50 to .69 <.01
paper–pencil Correct boundaries .51 to .70 <.01
Percent correct .39 to .59 <.01 to .02
Alternate form: e-based Correct boundaries .68 to .75 <.01
Percent correct .49 to .71 <.01
Cloze
LaSasso (1980) Internal consistency Fifth grade passages (six forms) .67 to .82
Kelly & Ewoldt (1984) Inter-rater (sample of 100 Meaningful to passage 82%
responses) Meaningful in sentence 81%
Related to English form 79%
Sign form classification 82%
Maze
Devenow (2003) Alternate form Correct (Phase 2) .60 to .80 <.001
Corrected (Phase 2) .64 to .82 <.001
Scan (Phase 2) .45 to .70 <.001
Note: CMC, correct maze choices; IMC, incorrect maze choices; CMC-IMC, correct maze choices minus incorrect maze choices; CMC-IMC/2, correct maze choices minus
the value of incorrect maze choices divided by a value of 2.
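The maze metrics defined in the note above are simple arithmetic on a student's choice counts. A minimal sketch of that scoring (the function name and example counts are illustrative, not taken from the reviewed studies):

```python
# Maze scoring metrics as defined in the table note: CMC, IMC,
# CMC-IMC, and CMC-IMC/2 (correct choices minus half the incorrect
# choices). Example counts are hypothetical.

def maze_scores(cmc, imc):
    """Return the four maze metrics for one administration."""
    return {
        "CMC": cmc,
        "IMC": imc,
        "CMC-IMC": cmc - imc,
        "CMC-IMC/2": cmc - imc / 2,
    }

print(maze_scores(cmc=18, imc=4))
# -> {'CMC': 18, 'IMC': 4, 'CMC-IMC': 14, 'CMC-IMC/2': 16.0}
```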
Table 2 Validity of CBM for students who are deaf or hard of hearing
Allinder & Eccarius Achievement test TERA-DHH M no. of words read 1 min .30 ns
(1999) M no. of words read 3 min .21 ns
M no. of idea units retold .36 ns
M no. of words retold .46 p < .05
M no. of unique words retold .47 p < .05
% of content words retold .46 p < .05
Easterbrooks and Achievement test WRMT-R Word comprehension .38 to .46 p < .05
Huston (2008) Passage comprehension .55 to .64 p < .01
Rose, (2008) Fluency test TOSCRF Correct words identified .84 & .90 Not reported
Achievement test MAP Correct words identified .58 to .75 Not reported
Informal Teacher ratings Correct words identified .54 to .85 Not reported
Lam (2020) Achievement WJ-III Passage Correct words identified .33 to .48 <.01 to .05
(paper–pencil CBM) Comprehension Correct boundaries .30 to .45 <.01 to .07
Total wrong −.51 to −.26 <.01 to .12
Percent correct .27 to .49 <.01 to .10
MAP Correct words identified .55 to .71 <.01
Correct boundaries .53 to .72 <.01
Total wrong −.42 to −.71 <.01 to .04
Percent correct .44 to .73 <.01 to .03
Achievement WJ-III Passage Correct boundaries .25 to .34 .04 to .14
(e-based CBM) Comprehension Total wrong −.53 to .05 .01 to .76
Percent correct .08 to .50 <.01 to .62
MAP Correct boundaries .37 to .51 <.01 to .07
Total wrong −.59 to −.21 <.01 to .32
Percent correct .28 to .58 <.01 to .18
Cloze
LaSasso (1980) CBM Reading for Concepts series Passages (third, fifth, & seventh) Ranking (easy to difficult): fifth, seventh, third grade NA
Maze
Chen (2002) Achievement test TOWL-3 (winter, Correct .76 & .88 p < .01
spring) Corrected .77 & .89 p < .01
Scan .82 & .83 p < .01
Accuracy .52 & .62 p < .05 & p < .01
Incorrect .26 & −.17 ns
Informal measure Teacher ratings Correct .79 & .76 p < .01
(winter, spring) Corrected .82 & .74 p < .01
Scan .80 & .74 p < .01
Accuracy .48 & .50 p < .05 & p < .01
Incorrect .19 & .01 ns
Devenow (2003) Achievement test SAT P1-2M correct .72 & .74 p < .05
P1-2M corrected .74 & .75 p < .05
P1-2M scan .56 & .71 p < .05
P1-4M correct .64 & .64 p < .05
P1-4M corrected .64 & .66 p < .05
(Continued)
404 Journal of Deaf Studies and Deaf Education, 2020, Vol. 25, No. 4
Table 2 Continued
Barkmeier (2009) Achievement test MAP Elementary .80 & .91 Not reported
Middle school .59 to .91 Not reported
Lam (2020) Achievement WJ-III Passage CMC .33 to .57 <.01 to .05
(paper–pencil CBM) Comprehension IMC −.27 to −.43 <.01 to .10
CMC-IMC .50 to .58 <.01 to .02
CMC-IMC/2 .36 to .58 <.01 to .03
MAP CMC .49 to .65 <.01 to .01
IMC −.50 to −.20 .01 to .33
CMC-IMC .50 to .67 <.01 to .01
CMC-IMC/2 .50 to .67 <.01 to .01
Achievement WJ-III Passage CMC .43 to .50 <.01
(e-based CBM) Comprehension IMC −.50 to −.33 <.01 to .05
CMC-IMC .49 to .52 <.01
CMC-IMC/2 .46 to .51 <.01
MAP CMC .56 to .60 <.01
IMC −.18 to −.04 .40 to .84
CMC-IMC .51 to .61 <.01
CMC-IMC/2 .54 to .61 .01
Note: TERA-DHH, Test of Early Reading Ability—Deaf or Hard of Hearing; WRMT-R, Woodcock Reading Mastery Test—Revised; TOSCRF, Test of Silent Contextual Reading Fluency; MAP, Measures of Academic Progress; CBM, curriculum-based measurement; WJ-III, Woodcock-Johnson Tests of Achievement—Third Edition; SAT-HI, Stanford Achievement Test—Hearing Impaired; TOWL-3, Test of Written Language—Third Edition; SAT, Stanford Achievement Test; P1-2M, Phase 1—2 min; P1-4M, Phase 1—4 min; P1, Phase 1; P2, Phase 2; CMC, correct maze choices; IMC, incorrect maze choices; CMC-IMC, correct maze choices minus incorrect maze choices; CMC-IMC/2, correct maze choices minus the value of incorrect maze choices divided by a value of 2.
Fluency Rubric for Deaf Children (Easterbrooks & Huston, 2008). Study participants were first administered the Signed Reading Fluency Rubric for Deaf Children and then, 0 to 4 months later, were administered two subtests (Word Comprehension and Passage Comprehension) from the Woodcock Reading Mastery Test—Revised (WRMT-R; Woodcock, 1987).

Inter-rater agreement was strong for the fluency envelope metric (overall visual appearance of the signed interpretation—with or without voice) and moderate for the visual grammar metric (elements that demonstrate the reader is deriving meaning from the text—in an English-like mode or ASL). Internal consistency fell in the strong range. Criterion validity evidence for the Signed Reading Fluency Rubric for Deaf Children with the WRMT-R (Woodcock, 1987) spanned the moderate to strong ranges.

In summary, two studies (Allinder & Eccarius, 1999; Easterbrooks & Huston, 2008) explored the reliability and validity of student performance when students were presented with a passage and prompted to read the passage in sign language. Participants within the studies varied in the features of their expressive sign language communication, which may impact how the students processed the task, the degree of difficulty of the task, and the type of student response generated. The authors varied in their approaches to scoring student responses, which included requiring word-by-word translation or assessing the quality of the signed interpretation of the text. Since the task demands and scoring methods varied widely for signed reading fluency, it is questionable if the authors were measuring the same construct. As such, these findings suggest a complexity in administering, scoring, and interpreting the performance of signed reading fluency with students who are DHH who use a sign-based system.

Silent Reading Fluency

Two research teams explored silent reading fluency (Rose, 2008; Lam, 2020). Rose (2008) conducted research at a school for the deaf within the context of a school-wide progress monitoring program. Participants (n = 101, grades 3 to 12) attended a residential school for the deaf, with six students attending the community public school part-time. All students qualified for special education services due to their hearing loss status (n = 36 mild to moderate, n = 61 severe to profound, n = 4 range unknown). Twenty-three percent (n = 23) of the sample had additional disabilities. Within the school-wide program, Rose (2008) analyzed student performance on the Measures of Academic Progress (MAP; NWEA, 2003), silent reading fluency test (SRFT; Rose & McAnally, 2008), test of silent contextual reading fluency (TOSCRF; Hammill et al., 2006), and teacher ratings. See the Maze section for further details on the findings from Barkmeier (2009), who analyzed student data from the CBM maze, MAP (NWEA, 2003), and teacher ratings.

Rose (2008) presented the SRFT, in which students read modified passages in which the story was presented in all upper case letters with no spaces or punctuation and students put a slash between the boundaries of words. This measure used the formatting structure of the TOSCRF (Hammill et al., 2006) with content derived from Reading Milestones (Quigley et al., 2001) and Reading Bridge (Quigley et al., 2003). For the SRFT and TOSCRF tasks, the participant's score was calculated by summing the correct number of words identified. The SRFT and the TOSCRF,

ranges. For criterion validity in the paper–pencil conditions, WJ-III Passage Comprehension (Woodcock et al., 2001) correlations fell in the weak to moderate range and MAP (NWEA, 2003) correlations in the moderate to strong ranges. For e-based, the correlations ranged from weak to strong for passage comprehension subtests and weak to strong for MAP.

In summary, Rose (2008) and Lam (2020) explored the reliability and criterion-related validity of silent reading fluency. The results across the studies varied, with the results of Rose (2008) generally reporting higher correlations for reliability and validity as compared to Lam (2020). Even when comparing the studies using only the paper–pencil condition and the same metrics, the differences remained. It appears as if sample differences in size, demographics, measures selected, and length of the tasks presented may have impacted differences in scores. Due to these
In summary, two studies (LaSasso, 1980; Kelly & Ewoldt, 1984) explored the reliability and criterion validity of cloze. Overall, reliability was moderate to strong (LaSasso, 1980) and relatively high (Kelly & Ewoldt, 1984). To evaluate validity, the two authors selected different methods of analysis. For LaSasso (1980), the passages did not rank from easy to hard based on student performance, and for Kelly and Ewoldt (1984), there was higher agreement in the consistency of decisions for the SAT-HI than the story retell.

Maze

Four studies (Chen, 2002; Devenow, 2003; Barkmeier, 2009; Lam, 2020) explored the technical adequacy of the maze, which is a

Measurement, 1995) yielded the strongest correlations when administered in the 2-min time condition.

As noted earlier, Rose (2008) and Barkmeier (2009) shared the same sample of participants, but Rose (2008) analyzed silent reading fluency results, whereas Barkmeier (2009) explored maze. Maze passages, derived from the basic academic skill sample (Espin et al., 1989), were administered three times during the school year. The score was the number of incorrect words subtracted from the total number of words correct. Performance was compared to the MAP (NWEA, 2003) and teacher ratings.

For reliability, student performance varied across forms and grade levels, with the highest correlations noted for elementary-aged students using testing form D. For criterion-related validity, strong correlations were present for both MAP (NWEA, 2003) and
of the student's overall reading ability? Would a student's performance on a CBM task be similar to their performance on an achievement task (e.g., high CBM performance and high achievement performance, would the inverse be true)? Below we describe the reliability and validity evidence for each of the four measures.

Signed Reading Fluency

For signed reading fluency, variation was present in how the authors approached, viewed, interpreted, and scored student signed responses (Allinder & Eccarius, 1999; Easterbrooks & Huston, 2008). Across studies, correlations were strong for internal consistency, with more notable variation for other reliability types. Validity evidence between signed reading fluency and an

role of oral reading fluency when used with students who are DHH who use oral communication.

Silent Reading Fluency

Silent reading fluency was measured in two of the nine studies (Rose, 2008; Lam, 2020) in this review. In Rose (2008), reliability was high. Additionally, for students with typical levels of hearing, correlations between the TOSCRF (the measure on which the SRFT was based) and criterion measures ranged from .57 to .80 (Hammill et al., 2006), which is generally similar to the findings of Rose (2008) using the SRFT with students who are DHH. In contrast, Lam (2020) reported lower reliability coefficients for test-retest and validity coefficients that were rarely sufficient.
students. This pattern of more modest correlations in the upper grades is consistent with the pattern of correlations for students with typical levels of hearing (Marston, 1989; Wayman et al., 2007).

These findings provided preliminary validity evidence that student data on the maze may serve as a general outcome measure of students' overall reading competency. In general, when students performed well on maze, their overall performance on achievement tests was also high, with the inverse finding present as well. In addition to the empirical evidence, theoretically maze may be a promising option as it does not require oral or signed production and may allow for variation in cognitive processing (Marschark, 2006; Marschark et al., 2011). Since only four studies explored maze, additional evidence is needed to confirm these findings.

Additionally, students who are DHH who use spoken English to communicate were underrepresented in this review. Since ORF is the most common measure for students with typical hearing, consideration of its appropriateness for students who are DHH who use spoken English is warranted. Available research suggests that students who are DHH are diverse in how they acquire reading skills and the cognitive processes they use when engaging in the reading process (Herman et al., 2019). Reading tasks, including ORF, provide limited information as to how the student is cognitively processing the text and assume auditory and phonological fluency in English (Luft, 2019). Understanding these factors is needed when administering and interpreting the findings of student performance using ORF and may limit comparison of performance to students with typical levels of hearing.
References

Allinder, R. M., & Eccarius, M. A. (1999). Exploring the technical adequacy of curriculum-based measurement in reading for children who use manually coded English. Exceptional Children, 65(2), 271–283. doi: 10.1177/001440299906500210

Appelman, K. I., Callahan, J. O., Mayer, M. H., Luetke, B. S., & Stryker, D. S. (2012). Education, employment, and independent living of young adults who are deaf and hard of hearing. American Annals of the Deaf, 157(3), 264–275. doi: 10.1353/aad.2012.1619

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Fuchs, L. S., Fuchs, D., & Hamlett, C. L. (1989). Monitoring reading growth using student recalls: Effects of two teacher feedback systems. Journal of Educational Research, 83, 103–111. doi: 10.1080/00220671.1989.10885938

Fuchs, L. S., & Vaughn, S. (2012). Responsiveness-to-intervention: A decade later. Journal of Learning Disabilities, 45(3), 195–203. doi: 10.1177/0022219412442150

Hammill, D. D., & Larsen, S. C. (1996). Test of Written Language (3rd ed.). Austin, TX: Pro-Ed.

Hammill, D. D., Wiederholt, J. L., & Allen, E. A. (2006). TOSCRF: Test of Silent Contextual Reading Fluency, examiner's manual. PRO-ED.

Harcourt Brace Educational Measurement (1995). Stanford Achievement Test (9th ed., Form S (Stanford 9)). Pearson.

Hartmann, E. A. (2010). Evaluating employment outcomes of adults

hearing students. Journal of Deaf Studies and Deaf Education, 23(2), 148–163. doi: 10.1093/deafed/enx057

Luft, P. (2019). Strengths-based reading assessment for deaf and hard-of-hearing students. Psychology in the Schools, 57(3), 375–393. doi: 10.1002/pits.22277

Luckner, J. L. (2013). Using the dynamic indicators of basic literacy skills with students who are deaf or hard of hearing: Perspectives of a panel of experts. American Annals of the Deaf, 158(1), 7–19. doi: 10.1353/aad.2013.0012

Luckner, J., & Bowen, S. (2006). Assessment practices of professionals serving students who are deaf or hard of hearing: An initial investigation. American Annals of the Deaf, 151(4), 410–417. doi: 10.1353/aad.2006.0046

Marschark, M. (2006). Intellectual functioning of deaf adults and children: Answers and questions. European Journal of Cogni-

Rose, S., McAnally, P., Barkmeier, L., Virnig, S., & Long, J. (2008). Silent reading fluency test: Reliability, validity, and sensitivity to growth for students who are deaf and hard of hearing at the elementary, middle school, and high school levels (Report No. 9). Minneapolis, MN: University of Minnesota.

Sandberg, K. L., & Reschly, A. L. (2011). English learners: Challenges in assessment and the promise of curriculum-based measurement. Remedial and Special Education, 32(2), 144–154. doi: 10.1177/0741932510361260

Shin, J., & McMaster, K. (2019). Relations between CBM (oral reading and maze) and reading comprehension on state achievement tests: A meta-analysis. Journal of School Psychology, 73, 131–149. doi: 10.1016/j.jsp.2019.03.005

Talbott, E., Maggin, D. M., Van Acker, E. Y., & Kumm, S. (2017). Quality indicators for reviews of research in