
Learning and Individual Differences 32 (2014) 40–53

journal homepage: www.elsevier.com/locate/lindif

Development of a new reading comprehension assessment: Identifying comprehension differences among readers☆

Sarah E. Carlson a,⁎, Ben Seipel b, Kristen McMaster c

a University of Oregon, Center on Teaching and Learning, United States
b California State University, Chico, United States
c University of Minnesota, Twin Cities, United States

Article info

Article history:
Received 28 September 2012
Received in revised form 9 January 2014
Accepted 17 March 2014

Keywords:
Reading comprehension assessment
Individual differences
Reading comprehension processes

Abstract

The purpose of this study was to evaluate the Multiple-choice Online Cloze Comprehension Assessment (MOCCA), designed to identify individual differences in reading comprehension. Data were collected with two sets of 3rd through 5th grade students during two years: 92 students participated in Year 1 and 98 students participated in Year 2 to address primary research questions, and an additional 94 (N = 192) students participated in Year 2 to address the limitation of test administration time. Participants were group administered the MOCCA and a standardized reading proficiency assessment, and individually administered other reading measures. Preliminary analyses indicated that the MOCCA produced reliable and valid scores as a new reading comprehension assessment for identifying types of comprehension processes used during reading, as well as for identifying individual differences in the types of comprehension processes used during reading. Findings are discussed in terms of developing a new measure to identify cognitive reading comprehension processes used during reading. Future research is needed to provide additional support for the technical adequacy of the assessment.

© 2014 Elsevier Inc. All rights reserved.

☆ This research was supported by Grant #R305C050059 from the Institute of Education Sciences (IES), U.S. Department of Education, to the University of Minnesota and through the Interdisciplinary Education Sciences Predoctoral Training Program, “Minnesota Interdisciplinary Training in Education Sciences (MITER)”, for data collection and resources, as well as by Grant #R305b110012 from the IES, U.S. Department of Education, to the Center on Teaching and Learning at the University of Oregon, through a Postdoctoral Fellowship, for writing resources. The opinions expressed are those of the authors and do not necessarily represent views of the IES or the U.S. Department of Education.
⁎ Corresponding author at: University of Oregon, 1600 Millrace Drive, Suite 207, Eugene, OR 97403, United States. Tel.: +1 541 346 8363; fax: +1 541 346 8353.
E-mail address: carlsons@uoregon.edu (S.E. Carlson).

http://dx.doi.org/10.1016/j.lindif.2014.03.003
1041-6080/© 2014 Elsevier Inc. All rights reserved.

1. Introduction

Many students struggle with reading, and in particular, reading comprehension. As students advance in school, they transition from learning to read (e.g., learning to decode and developing fluency and comprehension skills) to reading to learn (e.g., using comprehension skills to learn from text; Chall, 1996). This transition is often most evident in the upper elementary grades, when many readers begin to encounter difficulties with new comprehension requirements (Shanahan & Shanahan, 2008, 2012).

Assessments are needed to determine why readers experience comprehension difficulties in order to develop appropriate instruction to meet their individual needs, yet few such assessments are available. Thus, the purpose of this study was to report preliminary findings from a new reading comprehension assessment, the Multiple-choice Online Cloze Comprehension Assessment (MOCCA), developed to identify individual differences in reading comprehension. In this paper, we first discuss theories of reading comprehension that guided the development of MOCCA. Second, we describe existing reading comprehension assessments used to measure specific aspects of comprehension, and how they have informed the development of MOCCA. Finally, we report initial evidence of the reliability and validity of MOCCA, and discuss how the present study extends the reading comprehension assessment literature.

1.1. Reading comprehension theories and assessments

Reading comprehension is a complex and multidimensional construct; thus, the development of reading comprehension assessments should be guided by theory (August, Francis, Hsu, & Snow, 2006; Fletcher, 2006). Reading comprehension theories help identify constructs that work during the process of comprehension and specify the relationships among them so that researchers can better operationalize the dimensions to be assessed.

Reading comprehension theories suggest that successful reading comprehension involves the extent to which a reader can develop a coherent mental representation of a text through developing a coherent situation model (e.g., Graesser, Singer, & Trabasso, 1994; Kintsch, 1998; McNamara, Kintsch, Songer, & Kintsch, 1996; van den Broek, Rapp, & Kendeou, 2005). A situation model is comprised of the situations that take place in a text (e.g., time, space, characters, and causality) (van Dijk & Kintsch, 1983; Zwaan, Magliano, & Graesser, 1995). For instance, a reader may track causality by keeping track of the goal of the text (Trabasso & van den Broek, 1985; van den Broek, Lynch, Naslund, Ievers-Landis, & Verduin, 2003). The following example describes a causal connection: “Jimmy wanted to buy a bike. He got a job and earned enough money. He went to the store to buy the bike. Jimmy was happy.” In this example, a reader could make a causal connection by generating an inference that Jimmy was happy because he reached his goal and bought a bike.

Researchers have found that many poor comprehenders (i.e., readers with adequate word reading skills but with poor comprehension skills compared to peers with similar word reading skills) fail to make causal inferences while reading, which may stem from failure to track causal relations and goals in a text (e.g., Cain & Oakhill, 1999, 2006; McMaster et al., 2012; Rapp, van den Broek, McMaster, Kendeou, & Espin, 2007; van den Broek, 1997). To provide appropriate instruction to improve such inference generation, it is important that reading comprehension assessments identify the specific processes with which poor comprehenders struggle.

Researchers have assessed reading comprehension processes to understand how readers build connections (i.e., inferences) and track relations during reading to develop a coherent representation of a text, and have assessed reading comprehension products to evaluate the result of the representation of the text. The products are the ‘end result’ of reading, or what the reader learned or stored in memory from the text after reading (i.e., offline). Reading products are typically assessed using recall, questioning activities, and traditional multiple-choice assessments.

In contrast, reading processes occur during the act of reading (i.e., online) and can be somewhat more difficult to assess because the examiner must infer what is taking place during reading. Methods to assess online reading comprehension processes include eye-tracking methods, reading time measures, and think-aloud tasks (e.g., Ericsson & Simon, 1993; Kaakinen, Hyönä, & Keenan, 2003; Linderholm, Cong, & Zhao, 2008). Think-aloud tasks, for example, are used to identify specific reading comprehension processes (e.g., causal, bridging, and elaborative inferences; paraphrases) that readers use during reading (Ericsson & Simon, 1993). Findings from think-aloud studies indicate that readers use different types of comprehension processes during reading to develop coherent situation models (e.g., Laing & Kamhi, 2002; Trabasso & Magliano, 1996a,b; van den Broek, Lorch, Linderholm, & Gustafson, 2001). Although think-aloud data provide fruitful information about the processes that readers use during comprehension, they are laborious, time consuming, and impractical for practitioners to use to identify reading comprehension differences among their students for instructional purposes.

1.2. Identifying comprehension differences

Researchers who have assessed reading comprehension processes using think-aloud methods have identified individual processing differences among readers at different levels of comprehension skill (McMaster et al., 2012; Rapp et al., 2007). Specifically, McMaster et al. (2012) administered a think-aloud task to fourth grade readers at different levels of comprehension skill (i.e., good, average, and poor). They identified two types of poor comprehenders: (1) paraphrasers, poor comprehenders who mostly paraphrased during reading; and (2) elaborators, poor comprehenders who elaborated about the text, including information that was connected to background knowledge but was not always relevant to the text. These findings were consistent with previous research that found similar types of poor comprehenders, and support other researchers' conclusions that poor comprehenders may struggle with reading in different ways (Cain & Oakhill, 2006; Nation, Clarke, & Snowling, 2002; Perfetti, 2007; Rapp et al., 2007).

McMaster et al. (2012) also found that the two types of poor comprehenders responded to intervention in different ways. Specifically, they compared two questioning interventions: one that prompted readers to answer causal questions (Why questions that prompted readers to make causal connections during reading), and one that prompted readers to answer general questions (questions that prompted readers to make any kind of connection during reading). The researchers found that paraphrasers benefited more from the general questioning intervention than elaborators did, whereas elaborators benefited more from the causal questioning intervention than paraphrasers did. These findings suggest that different types of poor comprehenders may respond differently to intervention.

Though researchers have employed methods to assess reading comprehension processing differences among readers (e.g., think-aloud tasks), most traditional school-based reading comprehension assessments (e.g., reading proficiency assessments, standardized measures) have not been designed to detect such processes or to identify individual comprehension differences. In addition, many of these methods assess the product of reading comprehension rather than the process, limiting the types of conclusions that can be drawn about how readers comprehend differently. For example, Keenan, Betjemann, and Olson (2008) found that commonly used standardized reading comprehension assessments measure aspects of reading such as decoding and word recognition, but not necessarily reading comprehension, and that what is measured varies depending on the age of the reader. Thus, such traditional assessments may be insufficient for identifying specific reading comprehension differences; yet, educators often make instructional decisions based on their outcomes (Keenan et al., 2008).

Researchers have begun to develop other methods to help address the shortcomings of traditional reading assessments and measure how readers comprehend text rather than only assessing the product of comprehension. For instance, Magliano and colleagues developed the Reading Strategy Assessment Tool (RSAT; Magliano, Millis, Development Team, Levinstein, & Boonthum, 2011), which measures a subset of the comprehension processes found to lead to a coherent representation of a text. RSAT is an automated computer-based assessment in which readers read texts one sentence at a time, and are asked either indirect questions (i.e., “What are your thoughts regarding your understanding of the sentence in the context of the passage?”) or direct questions (i.e., Why questions related to a target sentence). Readers type their responses, which are later analyzed for types of comprehension processes (e.g., paraphrases, bridging inferences, elaborations) and content words (e.g., nouns, verbs, adjectives, adverbs) used during reading.

Magliano et al. (2011) identified unique types of comprehension processes that readers used during reading using RSAT, and also found that RSAT predicted scores on measures of reading comprehension. However, the measure is limited in several ways. First, RSAT uses an open-ended response format in which participants type their responses to questions, limiting its use to older participants who have developed appropriate typing skills. Second, the linguistic algorithms used to identify the types of comprehension processes produced in responses may be limited in capturing the quality of responses and identifying individual profiles of readers. Finally, like think alouds, the open-ended response task used in RSAT can produce a large amount of variability in how readers interpret the task instructions, especially the instructions for answering the indirect question, which could be interpreted differently from reader to reader. Thus, it seems useful to develop an assessment that capitalizes on the strengths of RSAT (e.g., identifying comprehension processes during reading), but is also familiar to readers in terms of testing format, efficient for educators to administer and score, and usable for making instructional decisions with children.

Other recently developed assessments, such as the Diagnostic Assessment of Reading Comprehension (DARC; August et al., 2006) and The Bridging Inferences Test, Picture Version (Bridge-IT, Picture Version; Pike, Barnes, & Barron, 2010), measure individual differences in reading comprehension processes for readers in Grades 2–6. The DARC requires readers to remember newly read text, connect to and integrate relevant background knowledge, and generate bridging
inferences (August et al., 2006). Despite its usefulness for identifying certain types of comprehension processes, the DARC uses unfamiliar pseudo-word relational statements embedded in texts. Readers are only asked to judge if such statements are true or false, and the assessment does not identify whether readers build a coherent representation of a text. The Bridge-IT, Picture Version also assesses children's ability to generate bridging inferences during reading, as well as the ability to suppress irrelevant text information (Pike et al., 2010). In addition, this assessment involves a task in which readers choose the last sentence of a narrative text, and each text is accompanied by a related picture, an inconsistent picture, or no picture. Similar to the DARC, the Bridge-IT, Picture Version is limited in its utility for distinguishing between different comprehension processes used to develop a coherent representation of a text, and for identifying individual comprehension differences.

In sum, researchers have developed assessments that target identifying comprehension processes; however, few reading comprehension assessments are available for educators and practitioners to use to easily assess differences in readers' comprehension processes among children at various levels of comprehension skill. The limitations of previously developed assessments provide a rationale for developing new assessments that address the needs of readers who struggle with reading comprehension in different ways. Furthermore, developing reading comprehension assessments that focus on efficiently identifying specific reading comprehension processes used to develop a coherent representation of a text may be useful for identifying different types of comprehenders for the purposes of instruction.

1.3. Designing a reading comprehension assessment

In addition to variation in purpose and utility for educational decision making, reading comprehension assessments vary across many dimensions, including response format (e.g., cloze, multiple-choice, open-ended), presentation format (e.g., paper–pencil and computer-based), and the components of reading comprehension measured (e.g., literal comprehension, inferential processes, main idea identification) (Eason & Cutting, 2009; Keenan et al., 2008). Each dimension presents a challenge for assessment development.

In designing an assessment, the developer must make decisions about each dimension, which requires careful consideration of the benefits and drawbacks of the options under each dimension. For instance, multiple-choice tests are efficient to administer in group settings and are familiar to readers; however, traditional multiple-choice tests require readers to choose only one correct choice, and the alternative choices are mainly distracters without diagnostic meaning (Cutting & Scarborough, 2006). Additionally, multiple-choice questions are traditionally presented after an entire text, thus measuring the product of comprehension rather than the processes used to build a coherent representation of the text. Open-ended questions allow readers to demonstrate comprehension processes used to build a coherent text representation; however, open-ended assessments, like think alouds, are time consuming and difficult to score (e.g., Magliano et al., 2011). Modified cloze tests, such as the maze task in which every nth word is deleted and replaced with three options for the reader to select, are efficient to administer and score, and have been demonstrated to provide a general indicator of reading proficiency (Deno, 1985; Espin & Foegen, 1996; Fuchs & Fuchs, 1992; Wayman, Wallace, Wiley, Ticha, & Espin, 2007). In addition, maze tasks are often timed, which does not allow the reader to build a complete and coherent mental representation of the text. In fact, researchers have provided evidence that such approaches assess decoding or sentence-level comprehension, rather than discourse-level comprehension (Francis, Fletcher, Catts, & Tomblin, 2005; Keenan et al., 2008; Nation & Snowling, 1997). Further, maze tasks were designed primarily for progress monitoring in reading rather than for assessing processes that take place during reading, and are thus limited in their diagnostic utility for comprehension (Wayman et al., 2007).

In the present study, we developed and evaluated an assessment to measure comprehension processes that readers use during reading (i.e., online), capitalizing on the benefits of existing measures (e.g., efficient and familiar presentation formats), but also addressing the shortcomings of existing measures (i.e., identifying specific online reading comprehension processes and individual processing differences used to develop a coherent representation of a text). The resulting tool is the Multiple-choice Online Cloze Comprehension Assessment (MOCCA). MOCCA is a paper and pencil assessment that consists of short narrative texts (seven sentences long). For each text, the sixth sentence is deleted and readers are required to choose among four multiple-choice responses to complete the sixth sentence of the text. The best response requires the reader to make a causal inference that results in a coherent representation of the text. Unlike traditional multiple-choice assessments, MOCCA was designed with alternate response types that represent specific reading comprehension processes used during reading (i.e., causal inferences, paraphrases, local bridging inferences, and lateral connections). Fig. 1 provides an item from MOCCA, with each response type labeled for the comprehension process it identifies. Instructions and additional items from MOCCA can be found in Appendix A.

Fig. 1. Sample MOCCA item.

1.4. Study purpose and research questions

The purposes of this study were to evaluate the initial technical adequacy of MOCCA and examine its capacity to identify reading comprehension processing differences among readers. Included in this examination was also whether MOCCA can be used to identify subtypes of poor comprehenders similar to those identified in previous research using think-aloud approaches (McMaster et al., 2012; Rapp et al., 2007). Our research questions were: (1) Does MOCCA produce scores that are reliable (internally consistent) depending on the amount of time provided during test administration (i.e., timed vs. untimed) and depending on the difficulty and discrimination levels of the items? (2) Does MOCCA produce scores that are valid (in terms of criterion validity)? and (3) To what extent does MOCCA distinguish among the comprehension processes of good, average, and poor comprehenders, including subtypes of poor comprehenders, during reading, depending on the amount of time provided during test administration?

2. Methods

2.1. Participants

To address our research questions, data were collected across two years. Specifically, 92 third, fourth, and fifth grade students in Year 1 and 98 third, fourth, and fifth grade students in Year 2 completed the MOCCA (timed version) and a full battery of additional reading-related assessments (as described under Measures). In Year 2, an additional 94 third, fourth, and fifth grade students, along with the other 98 students (N = 192), were provided additional testing time (untimed version) to complete as many items as possible. This additional time was provided to address limitations from initial findings indicating that many MOCCA items were left incomplete during Year 1 because insufficient time was provided to complete all MOCCA items. Thus, additional time was provided to conduct more accurate item analyses. We also used the additional time from participants in Year 2 who were also administered additional reading measures (N = 98) to address whether adding time would provide more accurate validity and comprehension processing information among the comprehension groups who took the MOCCA. Participant demographic information is presented in Table 1.

Participants (N = 92 in Year 1; N = 98 in Year 2) were also screened and divided into good, average, and poor comprehender groups using percentile scores from three measures: (1) the Computerized Achievement Levels Tests (CALT; Northwest Evaluation Association, 2001); (2) Dynamic Indicators of Basic Early Literacy Skills (DIBELS), 6th Ed., Oral Reading Fluency (ORF) (Good & Kaminski, 2002); and (3) the Curriculum Based Measurement (CBM) Maze Task (Deno, 1985; Espin & Foegen, 1996; Fuchs & Fuchs, 1992). Specifically, we used percentile ranges from the CALT to determine the groups, and DIBELS ORF and CBM Maze scores to corroborate the CALT. Poor comprehenders (Year 1 n = 25; Year 2 n = 24) were at the 25th percentile; average comprehenders (Year 1 n = 22; Year 2 n = 26) were at the 50th percentile; and good comprehenders (Year 1 n = 45; Year 2 n = 48) were at the 75th percentile on the CALT.

2.2. Measures and materials

2.2.1. Screening measures

The CALT (Northwest Evaluation Association, 2001) is a standardized, computer-adaptive reading comprehension assessment that is administered to students one to two times each year. The CALT measures literal, inferential, and vocabulary components of reading comprehension. The CALT has a reported reliability range of r = .76 to .87 (Northwest Evaluation Association, 2001). Percentile scores were provided by the school district.

DIBELS ORF (Good & Kaminski, 2002) is an individually administered assessment of students' accurate and fluent reading of connected text. ORF is typically used to identify readers who may need additional instructional support, and to monitor progress in oral reading fluency. ORF consists of a set of standardized passages that participants read aloud for 1 min each. Words omitted, substituted, or hesitated upon by the reader for longer than 3 s are counted as errors. The score is the number of words read correctly in 1 min. Reliability coefficients for ORF have been reported as r = .65 to .98 (Good & Kaminski, 2002). Participants who read below 75 words per min were not included in the current study, to ensure reading comprehension problems did not stem from fluency difficulties. ORF scores were provided by the school district.

The CBM Maze task is a modified cloze task used to index overall reading proficiency (Deno, 1985; Espin & Foegen, 1996; Fuchs & Fuchs, 1992). The CBM Maze task is group administered. For each text, every seventh word is deleted and replaced with three options, only one of which makes syntactic and semantic sense. The reader is to select the word that best fits the sentence, and the score is the number of correct word selections. In this study, participants read three texts used in previous research (e.g., McMaster et al., 2012). Participants were given 1 min for each text. The total number of correct words selected in 1 min was summed for each participant, and scores from the three passages were averaged. The CBM Maze has a reported reliability

Table 1
Participant demographic information.

Variable Year 1 (Primary sample; N = 92) % Year 2 (Primary sample; N = 98) % Year 2 (Exploratory sample; N = 192) %

Grade
Third 8 8.7 5 5.1 13 6.8
Fourth 40 43.5 41 41.8 95 49.5
Fifth 44 47.8 52 53.1 84 43.8
Sex
Female 65 70.6 66 67.3 111 57.8
Male 27 29.4 32 32.7 81 42.2
Disability type
Autism 1 1.1 1 1.0 4 2.0
EBD 1 1.1 3 3.0 3 1.6
OHD 0 0.0 1 1.0 3 1.6
SLD 1 1.1 1 1.0 8 4.2
SLI 5 5.4 5 5.1 6 3.1
None 84 91.3 87 88.9 168 87.5
ELL 19 20.7 14 14.3 34 17.7
Free/reduced lunch 45 48.9 39 39.8 78 40.6
Race/ethnicity
Missing 0 0.0 0 0.0 1 0.6
Native American 1 1.1 0 0.0 0 0.0
Asian 11 12.0 17 17.3 25 13.0
Black 21 22.8 13 13.3 28 14.6
Hispanic 16 17.4 12 12.1 31 16.1
White 43 46.7 56 56.6 107 55.7

Note. EBD = emotional/behavioral disorders, OHD = other health disabilities, SLD = specific learning disabilities, SLI = speech/language impairment, ELL = English language learners.
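As a concrete illustration of the screening rule described in Section 2.1, the assignment of readers to comprehender groups from CALT percentile scores can be sketched in code. This is a minimal sketch, not the authors' implementation: the function name and the exact (half-open) cutoff boundaries are our own assumptions, and in the study itself DIBELS ORF and CBM Maze scores were used to corroborate the CALT-based assignment.

```python
def comprehender_group(calt_percentile: float) -> str:
    """Assign a comprehender group from a CALT percentile score.

    Hypothetical cutoffs based on the percentiles reported in Section 2.1:
    poor comprehenders at the 25th percentile, average at the 50th, and
    good at the 75th. The boundary handling here is an assumption.
    """
    if calt_percentile <= 25:
        return "poor"
    if calt_percentile <= 50:
        return "average"
    if calt_percentile >= 75:
        return "good"
    # Scores falling between the reported cutoffs would need corroboration
    # from the other screening measures (DIBELS ORF, CBM Maze).
    return "unclassified"


# Example: group a small set of hypothetical CALT percentile scores.
scores = [20, 50, 60, 80]
groups = [comprehender_group(s) for s in scores]
```

Under these assumed cutoffs, the four example scores map to poor, average, unclassified, and good, respectively; any real replication would need the study's actual boundary definitions.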

range from r = .60 to .80 for elementary school participants (Wayman et al., 2007).

2.2.2. Measure under development

MOCCA was developed as a paper and pencil, multiple-choice, online (i.e., during reading), cloze comprehension assessment designed to identify comprehension processes used during reading of narrative texts. Narrative text items were written so that the causal structure of each text (plot, nature of events, tense of language) was centered on a main goal that motivated subgoals and events in the text (e.g., Trabasso & van den Broek, 1985). Instead of deleting every nth word, as in traditional cloze tasks (e.g., CBM Maze), one line of each text was deleted. The deleted line was always the sixth sentence and occurred before the last line of the text. Participants were required to choose one of four response types to complete the deleted line of each text. Response types were developed for each short narrative based on previous findings regarding the types of cognitive processes produced by different comprehenders (good, average, and two types of poor comprehenders; McMaster et al., 2012; Rapp et al., 2007) during a think-aloud task. First, typical cognitive processes produced during think-aloud tasks were identified. Of these processes, the most frequently produced types were bridging inferences, elaborative inferences, associations, and paraphrases or text repetitions. Thus, for each text, an inferential response type was developed to best complete the missing sentence with a causally coherent bridging statement (a goal completion statement; e.g., Trabasso & van den Broek, 1985). The following response types were developed: (1) causally coherent inference: best completes the missing sentence with a causally coherent statement (goal completion statement; e.g., Trabasso & van den Broek, 1985), thus closing the causal chain; (2) local bridging inference: a semantically correct lure that connects to information in the sentence immediately preceding the missing sentence and is related to the goal of the story, but does not close the causal chain; (3) lateral connection: a lure with semantic overlap with the content in the narrative (e.g., the action), but unrelated to relevant text information. This response type combined the associations, elaborations, and self-explanations found during think-aloud studies, which involve readers accessing background knowledge but do not always close the causal chain of the text. The final response type was (4) paraphrase: a lure designed to paraphrase or repeat the earlier goal statement, or a combination of the goal and subgoal statements presented in the text.

MOCCA includes 40 narrative text items, each with four multiple-choice responses. The 40 items are divided into two separate sections of 20 items each, to provide a break for participants. Participants answer MOCCA items during two 10-min testing periods. The average Flesch Kincaid (Kincaid, Fishburne, Rogers, & Chissom, 1975) grade level for the items (i.e., texts and responses) is 4.4. Each item has a title, is seven sentences long, and averages 80.5 words. MOCCA is coded and scored for the number of times the reader selects each response type. Specifically, each response type (causally coherent inference, local bridging inference, lateral connection, and paraphrase) is coded with a 1 or a 0 to indicate whether or not the response type was chosen. Total scores for each response type are then calculated.

2.2.3. Additional reading measures

Participants completed additional measures to examine other aspects of reading (decoding, comprehension processes, and working memory) during individual testing sessions. Scores from these measures were used to establish the relation between MOCCA response types and other reading-related measures.

To measure word reading or decoding, participants were administered the Woodcock Johnson (WJ) III Word Identification (ID) and Word Attack subtests. The score for each subtest is the number of words read correctly. Both tasks take approximately 5 min to complete. Reported reliabilities for the WJ III subtests range from r = .80 to .90 and higher, and the subtests have been normed from Kindergarten through Grade 12 (Woodcock et al., 2001).

To measure comprehension processes used during reading, participants were administered a think-aloud task. This task involves asking the reader to read a text out loud one sentence at a time. After each sentence, the reader is asked to say whatever comes to mind, even if it seems obvious, because there is no right or wrong answer. After reading the entire text, the reader is asked two yes/no questions about the story. Participants read two narrative texts for the think-aloud task administered in the current study. For each text, each sentence was printed on a separate index card. This task was first demonstrated by the experimenter with a practice text, and then any questions were answered before moving on to the experimental texts. Each text (practice and experimental) was 15–21 sentences long, with a Flesch Kincaid (Kincaid et al., 1975) reading level of 4.0. Each story during the think-aloud task took approximately 10 min to complete. Scoring for the think-aloud task was conducted using a coding protocol and is described in more detail in the Think-aloud coding section. The texts used for the think-aloud task are presented in Appendix B.

To measure working memory, participants were administered a sentence span task, which measures both the storage and processing components of working memory. In this task, participants listen to unrelated sets of sentences and are asked to remember different aspects of each (Daneman & Carpenter, 1980). The sentence span task used in the current study was modified for American children by Swanson, Cochran, and Ewers (1989). First, participants listen to sets of sentences (ranging from 2 to 5 sentences) read by an experimenter. Second, participants answer a comprehension question about one of the sentences. Third, participants are asked to recall the last word from each sentence in order from the first to the last sentence. All components (recall of the last words and the question answer) must be answered correctly to move on to the next difficulty level. The task materials consist of 6 practice sentences and 28 test sentences. Each sentence is between six and 10 words long. All of the final words and answers to the comprehension questions are nouns, and none of the words are repeated in any of the sets. Words recalled and responses to the questions are given out loud and scored for accuracy. The task lasts approximately 10 min.

2.3. Procedures

In both Years 1 and 2, there were two testing phases. In Phase 1, participants completed two group-administered assessments (CBM Maze and MOCCA) during scheduled class periods. In Phase 2, participants were individually administered additional assessments (WJ Word ID and Attack, think aloud, and sentence span) in designated areas of the school during regular school hours or during an after-school program. Participating teachers helped arrange individual testing schedules based on availability. Students received a $5.00 gift card from Target for participating in this study, and teachers were compensated for their time with gift cards from Amazon.com to supplement their classroom libraries.

2.3.1. Group testing

Trained project staff administered the CBM Maze and MOCCA. Staff consisted of two doctoral candidates in Educational Psychology and 11 undergraduate Psychology majors, all of whom had previous experience working with children in school settings. Staff members were trained to administer and score the CBM Maze and MOCCA during three 1-hour sessions. Project staff practiced administering and scoring each assessment with each other and project supervisors until each was
Word Attack subtests (Woodcock, McGrew, & Mather, 2001). During correct and consistent.
these subtests, participants read lists of increasingly difficult real In Years 1 and 2, project staff administered both assessments during
and nonsense words aloud. Words read correctly are scored as 1, and one session that lasted approximately 45–60 min. First, they adminis-
words read incorrectly are scored as 0. The ceiling for both the WJ ID tered the CBM Maze by reading the directions, and providing two
and Attack is six consecutive 0 s. The score for each subtest is the total sample items. Participants were told they would be reading three
S.E. Carlson et al. / Learning and Individual Differences 32 (2014) 40–53 45

narrative texts for 1 min each, and whenever they encountered three words in parentheses within a sentence, they would need to circle the word that belonged in the sentence, even if they were unsure of the answer. All questions about this task were answered when practicing with the sample items.

Second, project staff administered the MOCCA by reading the directions, providing two sample items, and answering any questions. Participants were told they would be reading short narrative stories and that each story was missing a sentence. The participants were instructed to read each story silently and choose, from the four choices below each story, the missing sentence that best completed the story. Participants were given 10 min to complete up to 20 items in the first section of the assessment. Then, participants were asked to stop, take a short break, and then turn to the second section and complete up to 20 more items for another 10 min. The MOCCA took approximately 30 min to administer.

During Year 2 only, participants were given additional time to complete all MOCCA items. This time was added because findings from Year 1 indicated that few participants completed all MOCCA items within the allotted time limit. Thus, after completing the initial timed version of the MOCCA (used to replicate findings from Year 1), Year 2 participants were instructed to return to unfinished items in both sections and complete as many items as possible. The untimed version lasted approximately 20 min. These data were used for item analyses, including item discrimination and difficulty.

2.3.2. Individual testing

Project staff also administered and scored all individually administered assessments. Staff members were trained to administer and score each assessment with each other and the project supervisors until each was correct and reliable. Project staff individually administered assessments during one session that lasted approximately 30 min. Each session took place in designated locations in the participants' schools arranged before the testing day. Individual testing was audio recorded for future data coding, scoring, and entering.

2.4. Think-aloud coding

Trained project staff transcribed the think-aloud sessions for coding. Think-aloud protocols were scored for types of comprehension processes used during reading. First, think-aloud responses were parsed into idea units (generally, phrases including a subject and verb). Second, each idea unit was coded using a scheme adapted from previous research to examine types of comprehension processes used during reading to build a coherent representation of a text (e.g., McMaster et al., 2012; Rapp et al., 2007; van den Broek et al., 2001). Coded variables included: associations (concepts from background knowledge brought to mind by the text); evaluative comments (opinions about the text); bridging inferences (connecting contents of the current sentence with local/near or global/distant text information); valid elaborative inferences (explanations of the contents of the current sentence using background knowledge relevant to the text); invalid elaborative inferences (explanations of the contents of the current sentence using background knowledge irrelevant to the text); valid predictive inferences (anticipations of what will occur next in the text that are connected to relevant text information); invalid predictive inferences (anticipations of what will occur next in the text that are not connected to relevant text information); metacognitive responses (reflections of understanding or agreement with the text); paraphrases/text repetitions (putting the current sentence or part of the current sentence into his/her own words, or restating the text verbatim); and affective responses (emotions related to contents of the text).

For Year 1, there were eight independent judges, and for Year 2, twelve independent judges who coded the think-aloud responses. During both years, judges were paired into dyads to assess interrater agreement of the think-aloud coding. Interrater agreement was calculated using a randomly selected 20% of the transcripts. There was an average of 90% agreement for Year 1 and an average of 93% agreement for Year 2. Disagreements between judges were resolved by discussion.

3. Results

Data were analyzed separately for participants in Years 1 and 2 to address each of our research questions. First, separate analyses were conducted to assess the internal consistency of the MOCCA (both timed and untimed data) as well as the difficulty and discrimination levels of the items. Second, we assessed the criterion validity of the MOCCA using the Year 1 (timed) and Year 2 (timed and untimed) datasets. Third, separate analyses were conducted to identify different types of comprehenders during Years 1 and 2, to determine whether Year 1 results replicated in Year 2, and to determine whether providing additional time for completing more MOCCA items better distinguishes between the types of comprehenders, including subgroups of poor comprehenders.

3.1. Reliability and validity of the MOCCA

3.1.1. Internal consistency

Cronbach's alpha was used to determine the internal consistency of each of the MOCCA response types (i.e., causally coherent inference, paraphrase, local bridging inference, lateral connection). Specifically, we were interested not only in determining the internal consistency of the correct response type (i.e., causally coherent inference), but also in determining whether responses for each item cohere with the overall response for each type. Thus, Cronbach's alpha tests were used to measure the reliability of each response type. Cronbach's alphas range between 0 and 1 (Cronbach, 1951), and can be interpreted as follows: excellent = α ≥ .9; good = .89 > α ≥ .8; acceptable = .79 > α ≥ .7; questionable = .69 > α ≥ .6; poor = .59 > α ≥ .5; and unacceptable = .49 > α (Streiner, 2003).

Across years and timed vs. untimed test administrations, we found that the causally coherent inference and paraphrase response type reliabilities fell in the good to excellent range of internal consistency, whereas the local bridging inference and lateral connection response type reliabilities did not.

To determine whether the low alphas for the local bridging inference and lateral connection response types were due to a lack of internal consistency, or to lower variances because these response types were chosen less often than the causally coherent inference and paraphrase response types, additional analyses were conducted. Specifically, correlation analyses were conducted between the mean of each response type chosen by the participants (i.e., the average proportion of the total items chosen for a particular response type) and participants' response type total score means (i.e., the average proportion of the total responses chosen for a particular response type). Analyses were conducted for both Years 1 and 2 (timed) and Year 2 (untimed). Coefficient alphas, the proportions of the total response type means, and variances are provided in Table 2.

The average proportion of each response type total was highly correlated with the corresponding average participant response type total score. For instance, the average proportion of times the causally coherent inference was chosen was highly correlated with the total score for the causally coherent inference (Year 1 timed r = .42; Year 2 timed r = .42; Year 2 untimed r = .52; ps < .01). In addition, the patterns of correlations for Year 1 (timed) showed similar patterns as the coefficient alphas (i.e., the correlations for the causally coherent inference > paraphrase > local bridging inference > lateral connection). In Year 2 (timed and untimed), the correlations for the lateral connection were slightly higher than the correlations for the local bridging inference. Correlation coefficients are provided in Table 3.

Finally, as shown in Table 2, the variances for the average participant total score and average proportion of the total score across the response types showed similar patterns as found between the coefficient alphas, as well as the correlations. That is, the variance of the causally coherent inference total score was greater than any of the other three total scores;
Table 2
Coefficient alphas, means, and variances for the MOCCA response types across years and administration times.

Year and administration time Causally coherent inference Paraphrase Local bridging inference Lateral connection

Year 1 (timed) Coefficient α .92 .85 .63 .62
Proportion of total response type M .52 .08 .06 .03
Proportion of total response type σ² .25 .07 .05 .03
Participant response type total score M 20.79 3.36 2.35 1.35
Participant response type total score σ² 71.57 16.65 5.48 3.13
Year 2 (timed) Coefficient α .93 .86 .32 .43
Proportion of total response type M .50 .07 .06 .03
Proportion of total response type σ² .25 .06 .05 .03
Participant response type total score M 20.02 2.71 2.20 1.05
Participant response type total score σ² 71.91 14.14 2.79 1.69
Year 2 (untimed) Coefficient α .94 .87 .65 .68
Proportion of total response type M .61 .13 .10 .05
Proportion of total response type σ² .24 .15 .09 .05
Participant response type total score M 24.22 5.27 4.13 2.14
Participant response type total score σ² 103.02 29.16 9.55 5.90
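As a concrete illustration of the reliability computation summarized in Table 2, Cronbach's alpha can be computed directly from a persons × items matrix of 0/1 response-type indicators. The sketch below is illustrative only: the function name and toy data are hypothetical and are not part of the MOCCA materials or the study's analysis code.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """Cronbach's alpha for a persons-by-items score matrix (list of lists).

    alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # transpose: one tuple per item
    item_vars = sum(pvariance(col) for col in items)
    total_var = pvariance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - item_vars / total_var)

# Hypothetical 0/1 response-type indicators for 6 readers on 4 items.
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(demo), 2))  # alpha for this toy matrix: 0.66
```

Because alpha depends only on the ratio of variances, using population or sample variances gives the same result as long as one is used consistently.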

the variance for the paraphrase total score was greater than the variance for the local bridging inference and lateral connection response types; and so on. Thus, the initial coefficient alpha patterns appear to be due to increasing restriction of range on the total scores rather than to a lack of internal consistency.

3.1.2. Item discrimination and difficulty levels

To explore discrimination and difficulty levels for the MOCCA items (N = 40), we used MOCCA data from the Year 2 (untimed) dataset, because these participants were allowed additional time to complete more of the MOCCA items (N = 192 participants) and thus provide more accurate item information; in the timed administrations of MOCCA (Years 1 and 2), few participants completed all of the items. Item discrimination and difficulty data are listed in Table 4.

3.1.2.1. Discrimination levels. An item analysis was conducted on the MOCCA items (N = 40 items) from the Year 2 (untimed) dataset using Iteman, Version 3.5 (1989), a software program designed to provide item analyses using classical test theory (CTT) (Iteman for Windows, Version 3.5, Assessment Systems Corp.). Discrimination and point-biserial correlation indices range between −1 and +1. Values of r = .40 and above are considered desirable, and values of r = .20 and below are considered low (Ebel, 1954); such items should be revised. Discrimination and point-biserial reliabilities for the MOCCA items ranged from r = .15 to .89.

3.1.2.2. Difficulty levels. Item difficulty was calculated as the proportion of participants who answered the item correctly (i.e., chose the causally coherent response type). Item difficulty indices range between 0 and 100 — the higher the value, the easier the question. Thus, items 85% and above are classified as easy, items 51–84% are classified as moderate, and items below 50% are classified as difficult (Lord, 1952). Item difficulty ranged from 36% to 81% for the MOCCA items.

3.1.3. Validity

We calculated Pearson's r correlation coefficients to examine the criterion-related validity of the MOCCA response types during Year 1 (timed) and Year 2 (timed and untimed). Correlations were computed between each of the MOCCA response types (causally coherent inference; paraphrase; local bridging inference; lateral connection) and the criterion measures: the CALT, CBM Maze, DIBELS ORF, WJ ID, WJ Attack, and WM Words. Correlation coefficients between the MOCCA response type scores for Years 1 and 2 and scores on the criterion measures ranged from r = −.37 to .75 (ps < .05 and < .001). Correlation coefficients for the validity analysis are provided in Table 5.

3.2. Capacity of the MOCCA: Distinguishing among comprehension processes

3.2.1. Comprehension skill

To determine whether MOCCA can distinguish among comprehension processes of different types of readers, depending on the administration time, we compared the MOCCA response types selected by good, average, and poor comprehenders. Comprehension groups were identified using scores from the CALT, a state standardized reading comprehension assessment. Specifically, we conducted repeated measures analyses of variance (RM-ANOVAs) with MOCCA response type (causally coherent inferences; paraphrases; local bridging inferences; lateral connections) as the within-subjects variable and comprehension group

Table 3
Correlation coefficients between the average proportion of totals by MOCCA response type across years and administration times.

Year and administration time Causally coherent inference Paraphrase Local bridging inference Lateral connection
MOCCA response type

Year 1 (timed) Causally coherent inference .423⁎⁎


Paraphrase −.172⁎⁎ .368⁎⁎
Local bridging inference −.083⁎⁎ .122⁎⁎ .249⁎⁎
Lateral connection −.092⁎⁎ .099⁎⁎ .152⁎⁎ .245⁎⁎
Year 2 (timed) Causally coherent inference .424⁎⁎
Paraphrase −.166⁎⁎ .374⁎⁎
Local bridging inference −.056⁎⁎ .027 .183⁎⁎
Lateral connection −.154⁎⁎ .012 .066⁎⁎ .203⁎⁎
Year 2 (untimed) Causally coherent inference .519⁎⁎
Paraphrase −.323⁎⁎ .399⁎⁎
Local bridging inference −.226⁎⁎ .080⁎⁎ .254⁎⁎
Lateral connection −.276⁎⁎ .106⁎⁎ .140⁎⁎ .270⁎⁎

Note.
⁎⁎ p < .01.
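The item indices of the kind reported in Table 4 can be reproduced in a few lines: difficulty is the proportion of correct (causally coherent inference) responses, and discrimination can be expressed as the point-biserial correlation between the item score and the total score. This sketch is illustrative only — the data are hypothetical, the item-total correlation here includes the item itself, and Iteman's exact computations may differ.

```python
from statistics import mean, pstdev

def pearson_r(x, y):
    """Pearson correlation; with a dichotomous x this is the point-biserial r."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / (pstdev(x) * pstdev(y))

def item_stats(scores, item):
    """Difficulty (proportion correct) and point-biserial discrimination of
    one item, computed against the total number-correct score."""
    col = [row[item] for row in scores]
    totals = [sum(row) for row in scores]
    return mean(col), pearson_r(col, totals)

# Hypothetical 0/1 correct-response matrix for 6 readers on 4 items.
demo = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
difficulty, discrimination = item_stats(demo, 0)
print(difficulty, round(discrimination, 2))  # 0.5 0.87 for this toy matrix
```

Multiplying the difficulty by 100 gives the 0–100 difficulty index scale used in the text.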
Table 4
Discrimination and difficulty indices for MOCCA items (N = 40 items) for Year 2 untimed test administration (N = 192 participants).

Item Difficulty index (proportion correct) Discrimination index Point biserial index

1 .69 .18 .15
2 .63 .49 .39
3 .63 .46 .34
4 .73 .70 .60
5 .69 .51 .45
6 .74 .66 .59
7 .81 .57 .59
8 .36 .50 .37
9 .80 .53 .54
10 .47 .43 .33
11 .74 .53 .53
12 .74 .62 .59
13 .61 .67 .55
14 .46 .61 .51
15 .50 .65 .51
16 .60 .72 .63
17 .60 .76 .60
18 .61 .76 .60
19 .45 .61 .46
20 .58 .87 .69
21 .74 .55 .47
22 .69 .55 .48
23 .78 .62 .62
24 .72 .44 .47
25 .80 .44 .49
26 .54 .68 .57
27 .76 .64 .63
28 .66 .76 .62
29 .56 .63 .54
30 .54 .70 .57
31 .60 .65 .53
32 .56 .78 .64
33 .54 .83 .70
34 .58 .74 .65
35 .50 .89 .74
36 .59 .77 .64
37 .37 .74 .56
38 .43 .69 .57
39 .43 .56 .48
40 .40 .67 .53

(good, average, poor) as the between-subjects variable, as well as simple main effects to assess the interaction between response type and comprehension group. Descriptive statistics (means and SDs) and simple main effects for Year 1 (timed) and Year 2 (timed and untimed) for the MOCCA response types selected by good, average, and poor comprehenders are presented in Table 6.

Results from the RM-ANOVA revealed statistically significant interactions of response type by comprehension group across years and administration times. Simple main effects of response type by comprehension skill revealed that good comprehenders chose more causally coherent inferences than did the average and poor comprehenders across both years and administration times. In addition, both the poor and average comprehenders varied slightly in the other types of MOCCA response types chosen (i.e., paraphrase, local bridging inference, lateral connection); however, they chose each more often than did the good comprehenders. Although these findings were promising for distinguishing between readers at different levels of comprehension skill, we were also interested in determining whether MOCCA could distinguish between subtypes of poor comprehenders as seen in previous research using other, more laborious methods (i.e., think-alouds).

3.2.2. Poor comprehenders

To determine whether MOCCA can distinguish between subtypes of poor comprehenders found in previous research (i.e., paraphrasers and elaborators; McMaster et al., 2012), we first analyzed outcomes from the think-aloud task.

3.2.2.1. Think-aloud subtypes. A cluster analysis was conducted to determine whether poor comprehenders cluster into two types (paraphraser and elaborator) as seen in previous research (e.g., McMaster et al., 2012). We used Ward's method (Ward & Hook, 1963), a hierarchical method that groups a larger number of participants into smaller groups with similar characteristics. Groups are formed by minimizing the sum of squared within-group deviations from the group mean of each variable, simultaneously for all variables in each group (Ward & Hook, 1963). For the current study, we used a two-cluster solution with the following think-aloud variables: associations; evaluations; connective inferences (local
Table 5
Correlation coefficients between MOCCA response types and other reading measures across years and administration times.

Year and administration time Measure MOCCA variable

Causally coherent inference Paraphrase Local bridging inference Lateral connection

Year 1 (timed) CALT total .719⁎⁎⁎ −.309⁎⁎ −.292⁎⁎ −.322⁎⁎


CBM Maze .728⁎⁎⁎ −.110 −.102 −.101
DIBELS .589⁎⁎ −.153 −.103 −.169
WJ ID .583⁎⁎ −.232⁎ −.265⁎⁎ −.225⁎
WJ Attack .388⁎⁎ −.110 −.217⁎ −.138
WM Words .308⁎ −.035 −.054 −.036
Year 2 (timed) CALT total .747⁎⁎⁎ −.197~ −.258⁎⁎ −.371⁎⁎
CBM Maze .626⁎⁎ .066 −.035 −.210⁎
DIBELS .647⁎⁎ −.043 −.102 −.221⁎
WJ ID .523⁎⁎ −.142 −.054 −.278⁎⁎
WJ Attack .277⁎⁎ −.129 .024 −.151
WM Words .413⁎⁎ −.166 −.069 −.118
Year 2 (untimed) CALT total .636⁎⁎ −.396⁎⁎ −.475⁎⁎ −.557⁎⁎
CBM Maze .421⁎⁎ −.164 −.280⁎⁎ −.346⁎⁎
DIBELS .565⁎⁎ −.243⁎ −.347⁎⁎ −.424⁎⁎
WJ ID .431⁎⁎ −.219⁎⁎ −.212⁎ −.374⁎⁎
WJ Attack .232⁎ −.174 −.076 −.180
WM Words .346⁎ −.284⁎⁎ −.233⁎ −.287⁎⁎

Note.
~ p = .05.
⁎ p < .05.
⁎⁎ p < .01.
⁎⁎⁎ p < .001.
Table 6
Simple main effects for repeated measures analysis of variance for the MOCCA response types by comprehender group across years and administration times.

Year and administration time MOCCA response type Good Average Poor

(Yr1 n = 45) (Yr1 n = 22) (Yr1 n = 25)


(Yr2 n = 48) (Yr2 n = 26) (Yr2 n = 24)

Mean (SD) Mean (SD) Mean (SD)

Year 1 (timed) Causally coherent inferences 28.80a (6.84) 16.64b (4.88) 14.16b (4.69)
Paraphrases 2.60a (2.96) 3.27a (4.74) 5.32b (4.38)
Local bridging inferences 1.62a (1.56) 2.32b (3.08) 3.04b (2.85)
Lateral connections 1.62a (1.47) 2.95b (2.59) 2.96b (3.03)
Year 2 (timed) Causally coherent inferences 25.54a (6.28) 16.77b (7.89) 12.50b (4.88)
Paraphrases 2.08a (2.27) 3.54a (6.29) 3.08b (2.26)
Local bridging inferences 1.21a (1.18) 1.54a (1.27) 2.17b (1.97)
Lateral connections 1.31a (0.98) 2.12b (1.68) 2.17b (1.97)
Year 2 (untimed) Causally coherent inferences 32.17a (6.23) 24.92b (9.69) 21.38b (7.15)
Paraphrases 2.65a (2.76) 5.85b (7.48) 6.04b (3.62)
Local bridging inferences 2.08a (1.40) 2.85a (2.60) 4.75b (3.22)
Lateral connections 1.56a (1.05) 2.96b (2.16) 3.63b (2.45)

Note. Means in a row that do not share subscripts are significantly different at p < .05 using a Bonferroni adjustment. Year 1 timed (N = 92), and Year 2 timed and untimed (N = 98). The
F-tests for the interaction of MOCCA response types by comprehender group were F = 28.46 (Year 1 timed); F = 27.12 (Year 2 timed); and F = 15.79 (Year 2 untimed); all ps < .001.
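The descriptive cells of a table like Table 6 reduce to the mean and standard deviation of each response-type count within each comprehender group. A minimal sketch follows; the function name and counts are hypothetical and serve only to show the computation.

```python
from statistics import mean, stdev

def group_profiles(counts_by_group):
    """Mean (SD) of each response-type count within each group, i.e. the
    descriptive cells of a group-by-response-type table."""
    profiles = {}
    for group, rows in counts_by_group.items():
        columns = list(zip(*rows))  # one tuple per response type
        profiles[group] = [(round(mean(c), 2), round(stdev(c), 2))
                           for c in columns]
    return profiles

# Hypothetical (causally coherent, paraphrase) counts per participant.
demo = {
    "good": [(28, 2), (30, 3), (26, 4)],
    "poor": [(14, 5), (12, 6), (16, 4)],
}
profiles = group_profiles(demo)
print(profiles["good"][0])  # mean and SD of the first response type: (28.0, 2.0)
```

The inferential step reported in the text (the RM-ANOVA with Bonferroni-adjusted simple main effects) would then be run on the same person-level counts with dedicated statistical software.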

and global connectives); paraphrases (paraphrases and verbatim text repetitions); elaborative inferences (elaborative and predictive inferences); and metacognitive comments (understandings and text agreements).

Results from the cluster analysis replicated findings from previous research and yielded two groups of poor comprehenders for both Years 1 and 2 (paraphrasers and elaborators) (McMaster et al., 2012; Rapp et al., 2007). A RM-ANOVA comparing the think-aloud response types used by the good, average, and two types of poor comprehenders revealed a statistically significant interaction of response type by comprehender type in Years 1 and 2. Descriptive statistics, simple main effects, and F statistics for think-aloud response types by comprehender group in Years 1 and 2 are found in Table 7.

Simple main effects revealed statistically significant differences between the two types of poor comprehenders with each other and with the average and good comprehenders when comparing different types of think-aloud responses (McMaster et al., 2012; Rapp et al., 2007) across each year. That is, the most unique feature of elaborators was that they made reliably more invalid elaborative and predictive inferences than did paraphrasers, average, and good comprehenders during both Years 1 and 2. Paraphrasers, on the other hand, made more paraphrases and text repetitions than did the elaborators, average, and good comprehenders during both Years 1 and 2.

3.2.2.2. MOCCA subtypes. To examine whether MOCCA can identify similar subtypes of poor comprehenders as those identified by the think-aloud results, an additional cluster analysis was conducted with the MOCCA results. We used a two-cluster solution with the following MOCCA response type variables: causally coherent inference, paraphrase, local bridging inference, and lateral connection response types. This analysis also yielded two groups of poor comprehenders for Years 1 (timed) and 2 (timed and untimed): paraphrasers (poor comprehenders who chose the causally coherent inference response type, but not as consistently as the other comprehender groups, and who also chose the paraphrase response type more than other response types) and elaborators (poor comprehenders who chose the causally coherent inference type, but not as consistently as the average and good comprehenders, and who also chose the lateral connection response type some of the time). The same labels for the subtypes of poor comprehenders were used for the MOCCA findings to be consistent with the think-aloud findings above.

A RM-ANOVA comparing MOCCA response types used by the good, average, and two types of poor comprehenders revealed a statistically significant interaction of response type by comprehender type in Year 1 (timed) and Year 2 (timed and untimed). Descriptive statistics, simple main effects, and F statistics for the MOCCA response types chosen by good, average, and two types of poor comprehenders in Years 1 and 2 are listed in Table 8.

Simple main effects revealed statistically significant differences between the two types of poor comprehenders with each other and with the average and good comprehenders when comparing the different MOCCA response types. Specifically, one unique feature was that the elaborators chose the lateral connection response type more often than did the good comprehenders during Year 2 when additional time was provided to complete more MOCCA items (i.e., Year 2 untimed). However, the elaborator groups across Years 1 and 2 did not choose this response type more often than the other distractor response types (i.e., paraphrase and local bridging inference) when not choosing the causally coherent inference.

Table 7
Means, SDs, and simple main effects for repeated measures analyses of variance for the think-aloud response types chosen by good, average, and two types of poor comprehenders across
years.

Year and administration time Think aloud response Poor: Poor: Average Good
Elaborators Paraphrasers (Yr1 n = 22) (Yr1 n = 45)
(Yr1 n = 19) (Yr1 n = 6) (Yr2 n = 26) (Yr2 n = 48)
(Yr2 n = 14) (Yr2 n = 10)

Mean (SD) Mean (SD) Mean (SD) Mean (SD)

Year 1 (timed) Valid elaborative and predictive inferences 29.39a (6.05) 16.83a (11.44) 30.09a (10.83) 28.42a (14.12)
Invalid elaborative and predictive inferences 3.58a (3.72) 0.83b (0.75) 1.36b (1.94) 1.51b (2.81)
Paraphrases and text repetitions 4.74a (4.56) 23.67b (8.07) 8.18a (9.66) 7.49a (9.38)
Year 2 (timed) Valid elaborative and predictive inferences 24.00a (5.90) 15.30a (3.53) 20.92a (10.92) 18.17a (13.06)
Invalid elaborative and predictive inferences 5.57a (5.30) 2.00b (1.76) 2.31b (3.17) 1.40b (2.67)
Paraphrases and text repetitions 2.43a (2.21) 14.90b (6.52) 6.92a (9.96) 5.42a (8.01)

Note. Means in a row that do not share subscripts are significantly different at p < .05 using a Bonferroni adjustment. Year 1 timed (N = 92) and Year 2 timed (N = 98). The F-tests for the
interaction of think-aloud response types by comprehender group were F = 4.77, p = .001 (Year 1) and F = 3.46, p = .006 (Year 2).
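Ward's criterion, as used in the cluster analyses above, can be sketched compactly: start with each participant as a singleton cluster, then repeatedly merge the pair of clusters whose union produces the smallest increase in the total within-cluster sum of squares. The following is an illustrative toy implementation with hypothetical data, not the software used in the study.

```python
def sse(cluster):
    """Within-cluster sum of squared deviations from the centroid."""
    k = len(cluster[0])
    centroid = [sum(p[d] for p in cluster) / len(cluster) for d in range(k)]
    return sum((p[d] - centroid[d]) ** 2 for p in cluster for d in range(k))

def ward_clusters(points, n_clusters):
    """Agglomerative clustering with Ward's criterion: at each step merge
    the two clusters whose union least increases total within-cluster SSE."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = clusters[i] + clusters[j]
                increase = sse(merged) - sse(clusters[i]) - sse(clusters[j])
                if best is None or increase < best[0]:
                    best = (increase, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical (paraphrase count, invalid-elaboration count) profiles:
# three paraphraser-like readers and three elaborator-like readers.
demo_points = [(20, 2), (22, 3), (18, 2), (3, 9), (4, 11), (2, 10)]
clusters = ward_clusters(demo_points, 2)
print(clusters)  # two clusters separating the two profile types
```

With well-separated profiles like these, a two-cluster solution recovers the two hypothetical subgroups, mirroring the paraphraser/elaborator split described in the text.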
Table 8
Means, SDs, and simple main effects for repeated measures analyses of variance for MOCCA response types chosen by good, average, and two types of poor comprehenders across years and
administration times.

Year and administration time MOCCA response Poor: Poor: Average Good
Elaborators Paraphrasers (Yr1 n = 22) (Yr1 n = 45)
(Yr1 n = 20) (Yr1 n = 5) (Yr2 n = 26) (Yr2 n = 48)
(Yr2 n = 15) (Yr2 n = 9)

Mean (SD) Mean (SD) Mean (SD) Mean (SD)

Year 1 (timed) Causally coherent inferences 14.25a (3.87) 13.80a (7.76) 16.64a (4.88) 25.80b (6.84)
Paraphrases 4.65a (4.21) 8.0b (4.42) 3.27a (4.74) 2.60a (2.96)
Local bridging inferences 2.65a (2.46) 4.60b (4.04) 2.32a (3.08) 1.62a (1.56)
Lateral connections 2.50a (2.86) 4.80b (3.34) 2.95a (2.59) 1.62a (1.47)
Year 2 (timed) Causally coherent inferences 13.33a (4.72) 11.11a (4.99) 16.77a (7.89) 25.54b (6.28)
Paraphrases 2.80a (2.31) 3.56a (2.24) 3.54a (6.29) 2.08a (2.27)
Local bridging inferences 2.27a (1.98) 2.0a (2.06) 1.54a (1.27) 1.21a (1.18)
Lateral connections 2.07a (1.57) 2.56a (2.65) 2.12a (1.68) 1.31a (0.98)
Year 2 (untimed) Causally coherent inferences 22.13a (7.71) 20.11a (6.33) 24.92a (9.69) 32.17b (6.23)
Paraphrases 5.33a (2.66) 7.22b (4.76) 5.85a (7.48) 2.65c (2.76)
Local bridging inferences 5.20a (3.26) 4.0b (3.20) 2.85c (2.60) 2.08c (1.40)
Lateral connections 3.67a (2.47) 3.56a (2.56) 2.96a (2.16) 1.56b (1.05)

Note. Means in a row that do not share subscripts are significantly different at p < .05 using a Bonferroni adjustment. Year 1 timed (N = 92) and Year 2 timed and untimed (N = 98). The
F-tests for the interaction of MOCCA response types by comprehender group were F = 19.14 (Year 1 timed); F = 18.21 (Year 2 timed); and F = 10.68 (Year 2 untimed); all ps < .001.

comprehenders during Year 1 (timed) and Year 2 (untimed). In addition, the paraphrasers did choose the paraphrase response type more often than the other distractor response types (i.e., local bridging inference and lateral connection) when not choosing the causally coherent inference.

4. Discussion

In this study, we examined a new reading comprehension assessment (MOCCA) designed to identify individual comprehension processing differences. The MOCCA was developed to assess the processes of reading comprehension used when reading narrative texts. Assessing reading comprehension processes has been useful in previous research for identifying individual comprehension differences among readers (e.g., McMaster et al., 2012; Rapp et al., 2007), which may in turn be useful for identifying appropriate instructional methods needed to address specific comprehension needs (e.g., McMaster et al., 2012). The MOCCA was developed because few reading comprehension assessments are currently available to practitioners for assessing reading comprehension processes for instructional purposes. Overall, this study provides initial evidence that the MOCCA can produce reliable and valid information that can be used to identify the types of processes that readers at different levels of comprehension skill use during reading.

4.1. Reliability and validity of the MOCCA

Findings from the current study indicate that the MOCCA demonstrates moderate to high internal consistency as a reading comprehension assessment. Specifically, the MOCCA yielded reliable responses for the causally coherent inference and paraphrase response types. Thus, the findings for the causally coherent inference response type, developed to complete the missing sentences in the MOCCA with a causal, goal-completing statement, support previous research on how readers develop a coherent representation of a text (e.g., Graesser et al., 1994; Trabasso & van den Broek, 1985). In addition, these findings extend previous research by demonstrating that MOCCA may be used to identify such comprehension processes used during reading instead of using other, more laborious methods (i.e., think-aloud tasks). However, it would be worth replicating these findings with a much larger national

were chosen less often or not at all during the test administrations, regardless of whether additional time was provided.

Participants may have chosen these response types less often because they are not the response types that best reflect the comprehension processes readers engage in during reading. For instance, the lateral connection response type may have shown low reliability because of the inherent challenge of detecting when readers elaborate, draw on associations, or provide self-explanations during reading (Magliano et al., 2011; Millis, Magliano, Wiemer-Hastings, Todaro, & McNamara, 2007). That is, such lateral connections, especially elaborations that connect to irrelevant background knowledge, are less constrained by semantic context when compared with other textual processes. Thus, the MOCCA lateral connection response types may not accurately reflect the processes readers engage in during reading. Future studies that develop new response types that more reliably reflect readers' comprehension processes are warranted.

However, these preliminary data provide a positive step toward developing a new way to identify and assess the types of processes that readers use during reading. Future studies are needed to collect additional data to refine the current MOCCA items and response types and to examine other response modes. For instance, a study that uses the MOCCA texts without the multiple-choice response types, but rather asks participants to complete the missing sentence, could be one approach for identifying other potential MOCCA response types that represent readers' comprehension processes.

In addition to the reliability analyses, the data from the untimed administration of the MOCCA allowed us to conduct an item analysis to examine the discrimination and difficulty levels of the items. Item discrimination and difficulty indices were calculated using the total number of correct (i.e., causally coherent inference) response types. We found that many of the items fit in acceptable ranges for both discrimination and difficulty. However, some of the items were limited in their range of difficulty, presenting a limitation in the scope to discriminate between items at the low and high ends; thus, revisions should be conducted on items that did not fit within acceptable ranges. For instance, items that had low discrimination levels may not have functioned as well as the other items because the response types were not strongly aligned to their corresponding definitions. That is, the
representative sample to confirm such findings. response types may have overlapped with one another which may
The results for the local bridging inference and lateral connection have confused the reader, and thus, became hard to choose the best
response types, however, were not reliable. However, additional analy- response to complete the missing sentence. For instance, in some
ses revealed that the lower alpha coefficients were due to an increased cases, the paraphrase response type paraphrases or repeats the goal
restriction of the range of total scores because these response types sentence of the text; where as in other cases, the paraphrase response
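As a concrete illustration of the item statistics discussed above, classical difficulty (proportion of students choosing the causally coherent inference option) and discrimination (corrected item–total correlation) indices can be computed as in the following sketch; the function and the sample data are illustrative and are not drawn from the study's materials.

```python
import numpy as np

def item_analysis(correct):
    """Classical item statistics for a 0/1 matrix of shape (students, items),
    where 1 means the causally coherent inference option was chosen."""
    correct = np.asarray(correct, dtype=float)
    difficulty = correct.mean(axis=0)              # proportion correct per item
    totals = correct.sum(axis=1)                   # each student's total score
    discrimination = np.empty(correct.shape[1])
    for j in range(correct.shape[1]):
        rest = totals - correct[:, j]              # total score excluding item j
        discrimination[j] = np.corrcoef(correct[:, j], rest)[0, 1]
    return difficulty, discrimination

# Illustrative data: four students, three items of increasing difficulty.
diff, disc = item_analysis([[1, 1, 1],
                            [1, 1, 0],
                            [1, 0, 0],
                            [0, 0, 0]])
# difficulty indices: 0.75, 0.50, 0.25; all discrimination indices positive
```

Note that items with very high or very low difficulty values compress score variance, which is one way a restricted range can depress both discrimination indices and alpha coefficients.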
This inconsistency for some of the paraphrase response types, and potentially others, across MOCCA items warrants additional revision and refinement of the low-functioning items.

Finally, the MOCCA produced scores that were significantly correlated with other reading measures. Specifically, the causally coherent inference response type was significantly positively correlated with all of the other reading measures administered in this study. These results provide initial support for the expectation that readers must be able to decode as well as comprehend to perform well on the MOCCA. Additionally, the other MOCCA response types (paraphrases, local bridging inferences, lateral connections) were negatively correlated or uncorrelated with the other reading measures. These findings suggest that the better students perform on other reading measures, the less likely they are to choose the less desirable response types on the MOCCA (i.e., paraphrase, local bridging inference, lateral connection). In other words, performing well on other reading measures is negatively associated with choosing the non-causally coherent response types on the MOCCA. However, additional administrations of the MOCCA and other reading measures should be conducted with a much larger sample of participants to further examine the validity of the measure.

4.2. Identifying comprehension differences among readers

Our findings also indicate that the MOCCA identifies reading comprehension processing differences between comprehenders. Specifically, we found that, overall, good comprehenders chose the causally coherent inference response type more than did the average and poor comprehenders, and average and poor comprehenders chose the other response types (paraphrases, local bridging inferences, lateral connections) more than did the good comprehenders. These findings extend previous research indicating that different comprehenders use different comprehension processes during reading (e.g., McMaster et al., 2012; Rapp et al., 2007). The response types developed for the MOCCA are important because they are analogous to the types of comprehension processes that readers use to develop a representation of a text (e.g., Graesser & Clark, 1985; Graesser et al., 1994; Kintsch, 1998; Magliano et al., 2011; McMaster et al., 2012; McNamara et al., 1996; Rapp et al., 2007; van den Broek et al., 2001, 2006). Such processes have traditionally been identified with methods, such as think-aloud tasks, that are time consuming and inefficient.

In addition, many traditional reading comprehension assessments (e.g., standardized comprehension measures) fail to identify specific comprehension processes among readers; instead, they measure total reading comprehension scores, or the product of comprehension, and do not provide information about why a reader struggles with comprehension. Like other unique assessments (e.g., RSAT; Magliano et al., 2011), the MOCCA was developed using theories of reading comprehension to identify how (i.e., the processes by which) readers develop a coherent representation of a text, or situation model, during reading rather than after reading. However, unlike other assessments, MOCCA was developed to expend fewer resources during test administration and scoring; provide readers with a familiar testing format (i.e., multiple choice); provide educators with shorter administration times compared to other, more laborious methods (i.e., think-aloud tasks); and provide an assessment that can be used in a variety of academic settings (e.g., Magliano et al., 2011). In addition, we have initial evidence that the MOCCA can be used instead of think-alouds to efficiently identify individual reading comprehension differences (e.g., types of poor comprehenders) during reading for the purpose of making instructional decisions (e.g., August et al., 2006; Magliano et al., 2011; McMaster et al., 2012; Pike et al., 2010).

In addition, in this study, two subtypes of poor comprehenders emerged from both the think-aloud and MOCCA data. First, our findings support previous research that has identified subtypes of poor comprehenders using a think-aloud task: (1) paraphrasers, or poor comprehenders who generally repeat the text and make fewer responses that are inferential, associative, or metacognitive; and (2) elaborators, or poor comprehenders who make more inferential responses that connect to background knowledge but do not appear to support overall comprehension of a text (McMaster et al., 2012; Rapp et al., 2007). These data support the notion that poor comprehenders may struggle with comprehension in different ways (e.g., Cain & Oakhill, 2006; Nation et al., 2002; Perfetti, 2007) and further raise the question of whether poor comprehenders respond to interventions in different ways (e.g., McMaster et al., 2012). Therefore, it is important to assess reading comprehension processes using appropriate measures to identify comprehenders who may benefit from different types of instructional approaches.

We also found subtypes of poor comprehenders using the MOCCA: (1) paraphrasers, poor comprehenders who chose the paraphrase response type more than the local bridging and lateral connection response types when they were not choosing the causally coherent inference response type; and (2) elaborators, poor comprehenders who chose the lateral connection response type much of the time when they were not choosing the causally coherent inference type. These preliminary findings extend previous research that used other types of methods to identify subtypes of poor comprehenders (e.g., Magliano et al., 2011; McMaster et al., 2012); however, the findings did not consistently yield identical groups of poor comprehenders or similar patterns across both years of this study. One limitation of these findings is the small sample size of the poor comprehender subgroups. Sample sizes for the poor comprehender group were small due to limited access to a population of poor readers (i.e., participants were selected from only one suburban school district). De Ayala (2009) recommends a minimum sample size of about 10 respondents per estimated parameter to detect group differences with a new assessment. A nominal model has 2(A − 1) parameters for each item; thus, if I is the number of items and A is the number of alternatives for each item, the model has 2I(A − 1) parameters in total, and the sample size should be at least 10 × 2I(A − 1). For example, with I = 20 items and A = 4 alternatives, this means 10 × 2 × 20 × 3 = 1200 (De Ayala, 2009). Thus, future research with a much larger sample size may provide a better estimate of whether the MOCCA can be used to predict subtypes of poor comprehenders.

Another limitation for identifying subtypes of poor comprehenders using the MOCCA is that the MOCCA and the think-aloud task may measure reading comprehension processes differently. For instance, during the MOCCA, readers are required to choose one of four response types that mimic reading comprehension processes to complete a missing sentence in a text. In a think-aloud task, readers are prompted to talk aloud during reading, and their responses are later coded for the types of comprehension processes used during reading. Thus, it may be difficult to compare the MOCCA and a think-aloud task for identifying subtypes of poor comprehenders. Further research examining other MOCCA response types that are analogous to other types of think-aloud responses may provide useful information for continuing to develop the MOCCA as an appropriate measure for identifying subtypes of poor comprehenders.

5. Conclusion

The purposes of this study were to examine MOCCA, a new reading comprehension assessment designed to identify specific comprehension processes used during reading, and to identify individual differences in the types of processes that different comprehenders, in particular poor comprehenders, use during reading. The results from this study support our purpose for developing a reading comprehension assessment around how readers use cognitive processes to build a coherent representation of a text. That is, the development of MOCCA supports the need for assessments that are developed based on cognitive theories of reading comprehension (e.g., Johnson-Laird, 1983; Kintsch & van Dijk, 1978; Oakhill & Cain, 2007; Perfetti, Landi, & Oakhill, 2005; van den Broek, Young, Tzeng, & Linderholm, 1999).
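The sample-size rule of thumb from De Ayala (2009) discussed in Section 4.2 reduces to simple arithmetic; the sketch below is our illustration of that rule (the function name and the factor-of-10 default are assumptions for the example, not code from the study).

```python
def min_sample_size(n_items, n_alternatives, per_parameter=10):
    """Rule-of-thumb minimum N for a nominal response model: each item
    contributes 2(A - 1) parameters, and De Ayala (2009) suggests roughly
    10 respondents per estimated parameter."""
    parameters = n_items * 2 * (n_alternatives - 1)  # 2I(A - 1) parameters
    return per_parameter * parameters                # 10 x 2I(A - 1)

# The worked example from the text: 20 items with 4 alternatives each.
print(min_sample_size(20, 4))  # prints 1200
```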
For instance, previous research has shown that readers engage in different types of processes to develop a coherent representation of a text, or situation model, during reading, and processes that help readers track causality in a text are particularly strong in this development (e.g., Graesser et al., 1994; Trabasso & van den Broek, 1985). Thus, the causally coherent inference response type developed for the MOCCA provides initial evidence to support our purpose.

In addition, a unique feature of MOCCA is the qualitative nature of the differences between the response types for each item. Results from this study provide preliminary support for MOCCA as a reliable and valid reading comprehension assessment that not only identifies the types of cognitive processes that readers use during reading, but also distinguishes between the types of comprehension processes that good, average, and subtypes of poor comprehenders use during reading. MOCCA items were designed with alternative answers based on findings from psychological studies of reading comprehension to help identify such differences (e.g., Kintsch & van Dijk, 1978; Magliano et al., 2011; Trabasso & van den Broek, 1985; van den Broek et al., 1999, 2001).

These results extend the literature in the areas of the cognitive processes of reading comprehension and reading comprehension assessment. That is, the MOCCA differs from more traditional school-based reading comprehension assessments because of its diagnostic qualities. These findings have implications for educational research focused on using appropriate assessments to develop appropriate instruction and interventions for improving struggling readers' comprehension.

Acknowledgments

We wish to thank the undergraduate and graduate students who took part in this research project and provided support with collecting, scoring, and coding data. We would also like to acknowledge and thank Dr. Mark Davison and Dr. Gina Biancarosa for their ongoing support and suggestions for data analyses, and Dr. Susan Goldman for her suggestions for future revisions of the MOCCA.

Appendix A. MOCCA practice item and corresponding response types

NOTE: Instructions and practice items follow the format used during the MOCCA administration. The following page includes a practice item and two additional MOCCA items. Children see the items one at a time during the MOCCA administration.

Practice Story: Janie and the Trip to the Store
Janie's dad was heading to the store.
Janie wanted to go with him.
She wanted to get a treat at the store.
Janie had saved up some money.
At the store, there was lots of candy to choose from.
MISSING SENTENCE
Janie was happy.

CHOICES:
A) She picked out her favorite candy bar.
[Causally coherent inference]
B) Janie worked at the store.
[Local bridging inference]
C) Janie's dad was upset with her choice.
[Lateral connection]
D) Janie wanted to go to the store.
[Paraphrase]

The Gift
Maria was still too young to work and earn money.
Whenever birthdays came up, she relied on her older siblings to buy the gift.
Maria's mother was having a birthday soon, and Maria wanted to buy her own gift.
Because she was good at doing chores, Maria decided to ask some neighbors if they would pay her to clean.
She went to all of the neighbors' houses to offer her help, but no one wanted her help.
MISSING SENTENCE
Her mother said, "The best gift of all is to know that you appreciate me."

CHOICES:
A) One nice neighbor offered to give Maria money to buy a present.
[Local bridging inference]
B) So, Maria decided to do all her mother's chores for her birthday.
[Causally coherent inference]
C) Maria wanted to buy her own gift for her mother's birthday party.
[Paraphrase]
D) Maria walked downtown to look for a job to earn money for the gift.
[Lateral connection]

Kayla at the Restaurant
Kayla is going to meet a friend for lunch at her favorite restaurant.
Kayla loves French fries at this restaurant and plans to order them.
Rachel, Kayla's friend, arrives at the restaurant.
Kayla and Rachel hadn't seen each other in months and were happy to have lunch.
Kayla and Rachel sat at their table and began to talk.
MISSING SENTENCE
Kayla is happy with her choice and hands back the menu to the waiter.

CHOICES:
A) Kayla loves the French fries served at this restaurant.
[Paraphrase]
B) Rachel tells Kayla that she just bought a new house.
[Local bridging inference]
C) Kayla likes how the restaurant is decorated with flowers.
[Lateral connection]
D) Kayla orders the French fries when the waiter arrives.
[Causally coherent inference]

Appendix B. Think-aloud texts

NOTE: Both think-aloud texts are administered one sentence at a time.
B.1. Brian's magical skill

Brian liked to perform for people. His teacher said that she was going to choose someone to be in a school show that would be in the spring. Brian wanted to be chosen. He sang a song for his teacher. It was not very good. His teacher did not choose him. Brian decided to put on a magic act. He bought some fancy magic cards and a hat. He tried to do some tricks on a table. The tricks were difficult to do. Brian wanted to learn other kinds of activities. He asked a magician if he would teach him. The magician agreed to teach him. Brian visited every day for a month. He watched how each routine was done and practiced a lot. Brian learned how to perform different kinds of magic. He selected tricks that he could do best. He put together a good act, showing his teacher. He made some pretty flowers come out of her ear and then made the flowers disappear. The magic act was a hit and was selected for the show.

Comprehension Questions:
1) Does Brian like to perform? _______
2) Did Brian learn magic by himself? _______

B.2. Candle crafts for school

One day, Sally's class had show-and-tell. Her best friend showed a picture that she had painted. She told the class how she had made the picture and everyone liked it. Then Sally decided that she wanted to make something special for show-and-tell. So, she went to the library and checked out a book. Sally read that candles could be made by melting crayons and pouring them into a cup. Finally, she decided she wanted to make a red candle. Sally found some crayons and a cup at home. She put the crayons into the cup. She melted the wax in the cup and held a string to make a wick. The wax hardened quickly. At last, the beautiful candle was finished. She put her new candle in a holder and began decorating it with ribbons. The next day, Sally carried the candle to school. When she arrived, she asked her teacher if she could be in show-and-tell. She won the show-and-tell grand prize for her candle. Her friend was happy for her and they celebrated after school.

Comprehension Questions:
1) Did Sally go to the library? ______
2) Did Sally dislike show-and-tell? ______

References

August, D., Francis, D. J., Hsu, H.-Y. A., & Snow, C. E. (2006). Assessing reading comprehension in bilinguals. The Elementary School Journal, 107, 221–238.
Cain, K., & Oakhill, J. (1999). Inference making ability and its relation to comprehension failure in young children. Reading and Writing: An Interdisciplinary Journal, 11, 489–503.
Cain, K., & Oakhill, J. (2006). Assessment matters: Issues in the measurement of reading comprehension. British Journal of Educational Psychology, 76, 697–708.
Chall, J. S. (1996). Stages of reading development (2nd ed.). Fort Worth, TX: Harcourt-Brace.
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297–334.
Cutting, L. E., & Scarborough, H. S. (2006). Prediction of reading comprehension: Relative contributions of word recognition, language proficiency, and other cognitive skills can depend on how comprehension is measured. Scientific Studies of Reading, 10, 277–299.
De Ayala, R. J. (2009). The theory and practice of item response theory. New York: The Guilford Press.
Daneman, M., & Carpenter, P. A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450–466.
Deno, S. L. (1985). Curriculum-based measurement: The emerging alternative. Exceptional Children, 52, 219–232.
Eason, S. H., & Cutting, L. E. (2009). Examining sources of poor comprehension in older poor readers: Preliminary findings, issues, and challenges. In R. K. Wagner, C. S. Schatschneider, & C. Phythian-Sence (Eds.), Beyond decoding: The behavioral and biological foundations of reading comprehension (pp. 263–283). New York, NY: Guilford.
Ebel, R. L. (1954). Procedures for the analyses of classroom tests. Educational and Psychological Measurement, 14, 352–364.
Ericsson, K. A., & Simon, H. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.
Espin, C. A., & Foegen, A. (1996). Validity of general outcome measures for predicting secondary students' performance on content-area tasks. Exceptional Children, 62, 497–514.
Fletcher, J. M. (2006). Measuring reading comprehension. Scientific Studies of Reading, 10, 323–330.
Francis, D. J., Fletcher, J. M., Catts, H. W., & Tomblin, J. B. (2005). Dimensions affecting the assessment of reading comprehension. In S. G. Paris, & S. A. Stahl (Eds.), Children's reading comprehension and assessment (pp. 369–394). Mahwah, NJ: Erlbaum.
Fuchs, L. S., & Fuchs, D. (1992). Identifying a measure for monitoring student reading progress. School Psychology Review, 58, 45–58.
Good, R. H., & Kaminski, R. A. (Eds.). (2002). Dynamic indicators of basic early literacy skills (6th ed.). Eugene, OR: Institute for the Development of Educational Achievement.
Graesser, A. C., & Clark, L. F. (1985). Structures and procedures of implicit knowledge. Norwood, NJ: Ablex.
Graesser, A. C., Singer, M., & Trabasso, T. (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101, 371–395.
Johnson-Laird, P. N. (1983). Mental models: Towards a cognitive science of language, inference, and consciousness. Cambridge: Cambridge University Press.
Iteman, Version 3.5 (1989). Conventional item analysis program. St. Paul, MN: Assessment Systems Corporation.
Kaakinen, J. K., Hyönä, J., & Keenan, J. M. (2003). How prior knowledge, working memory capacity, and relevance of information affect eye-fixations in expository text. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 447–457.
Keenan, J. M., Betjemann, R. S., & Olson, R. K. (2008). Reading comprehension tests vary in the skills they assess: Differential dependence on decoding and oral comprehension. Scientific Studies of Reading, 12, 281–300.
Kincaid, J. P., Fishburne, R. P., Rogers, R. L., & Chissom, B. S. (1975). Derivation of new readability formulas (Automated Readability Index, Fog Count, and Flesch Reading Ease formula) for Navy enlisted personnel (Research Branch Report 8-75). Memphis, TN: Chief of Naval Technical Training, Naval Air Station Memphis.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. New York: Cambridge University Press.
Kintsch, W., & van Dijk, T. A. (1978). Toward a model of text comprehension and production. Psychological Review, 85, 363–393.
Laing, S. P., & Kamhi, A. G. (2002). The use of think-aloud protocols to compare inferencing abilities in average and below-average readers. Journal of Learning Disabilities, 35, 436–447.
Linderholm, T., Cong, X., & Zhao, Q. (2008). Differences in low and high working memory capacity readers' cognitive and metacognitive processing patterns as a function of reading for different purposes. Reading Psychology, 29, 61–85. http://dx.doi.org/10.1080/02702710701568587.
Lord, F. M. (1952). The relationship of the reliability of multiple-choice test to the distribution of item difficulties. Psychometrika, 18, 181–194.
Magliano, J. P., Millis, K. K., The RSAT Development Team, Levinstein, I., & Boonthum, C. (2011). Assessing comprehension during reading with the Reading Strategy Assessment Tool (RSAT). Metacognition and Learning, 6, 131–154. http://dx.doi.org/10.1007/s11409-010-9064-2.
McMaster, K. L., van den Broek, P., Espin, C. A., White, M. J., Rapp, D. N., Kendeou, P., Bohn-Gettler, C. M., & Carlson, S. (2012). Making the right connections: Differential effects of reading intervention for subgroups of comprehenders. Learning and Individual Differences, 22(1), 100–111. http://dx.doi.org/10.1016/j.lindif.2011.11.017.
McNamara, D. S., Kintsch, E., Songer, N. B., & Kintsch, W. (1996). Are good texts always better? Interactions of text coherence, background knowledge, and levels of understanding in learning from text. Cognition and Instruction, 14, 1–43.
Millis, K. K., Magliano, J. P., Wiemer-Hastings, K., Todaro, S., & McNamara, D. (2007). Assessing and improving comprehension with latent semantic analysis. In T. K. Landauer, D. S. McNamara, S. Dennis, & W. Kintsch (Eds.), Handbook of latent semantic analysis (pp. 207–225). Mahwah, NJ: Erlbaum.
Nation, K., & Snowling, M. (1997). Assessing reading difficulties: The validity and utility of current measures of reading skill. British Journal of Educational Psychology, 67, 359–370.
Nation, K., Clarke, P., & Snowling, M. J. (2002). General cognitive ability in children with reading comprehension difficulties. British Journal of Educational Psychology, 72, 549–560.
Northwest Evaluation Association (2001). Computerized Achievement Levels Tests (CALT). Lake Oswego, OR: Independent School District 271, Bloomington, MN.
Oakhill, J., & Cain, K. (2007). Introduction to comprehension development. In K. Cain, & J. Oakhill (Eds.), Children's comprehension problems in oral and written language: A cognitive perspective (pp. 3–40). New York: Guilford Press.
Perfetti, C. A. (2007). Reading ability: Lexical quality to comprehension. Scientific Studies of Reading, 11, 357–383.
Perfetti, C., Landi, N., & Oakhill, J. (2005). The acquisition of reading comprehension skill. In M. J. Snowling, & C. Hulme (Eds.), The science of reading: A handbook (pp. 227–247). Oxford: Blackwell.
Pike, M. M., Barnes, M. A., & Barron, R. W. (2010). The role of illustrations in children's inferential comprehension. Journal of Experimental Child Psychology, 105, 243–255.
Rapp, D. N., van den Broek, P., McMaster, K. L., Kendeou, P., & Espin, C. A. (2007). Higher-order comprehension processes in struggling readers: A perspective for research and intervention. Scientific Studies of Reading, 11(4), 289–312. http://dx.doi.org/10.1080/10888430701530417.
Shanahan, T., & Shanahan, C. (2008). Teaching disciplinary literacy to adolescents: Rethinking content-area literacy. Harvard Educational Review, 78, 40–59.
Shanahan, T., & Shanahan, C. (2012). What is disciplinary literacy and why does it matter? Topics in Language Disorders, 32, 1–12.
Streiner, D. L. (2003). Starting at the beginning: An introduction to coefficient alpha and internal consistency. Journal of Personality Assessment, 80, 99–103.
Swanson, H. L., Cochran, K. F., & Ewers, C. A. (1989). Working memory in skilled and less skilled readers. Journal of Abnormal Child Psychology, 17, 145–156.
Trabasso, T., & van den Broek, P. (1985). Causal thinking and the representation of narrative events. Journal of Memory and Language, 24, 612–630. http://dx.doi.org/10.1016/0749-596X(85)90049-X.
Trabasso, T., & Magliano, J. P. (1996a). Conscious understanding during text comprehension. Discourse Processes, 21, 255–288.
Trabasso, T., & Magliano, J. P. (1996b). How do children understand what they read and what can we do to help them? In M. Graves, P. van den Broek, & B. Taylor (Eds.), The first R: A right of all children (pp. 160–188). New York: Columbia University Press.
van den Broek, P. W. (1997). Discovering the cement of the universe: The development of event comprehension from childhood to adulthood. In P. W. van den Broek, P. Bauer, & T. Bourg (Eds.), Developmental spans in event comprehension and representation: Bridging fictional and actual events (pp. 321–342). Hillsdale, NJ: Erlbaum.
van den Broek, P., Rapp, D. N., & Kendeou, P. (2005). Integrating memory-based and constructionist processes in accounts of reading comprehension. Discourse Processes, 39, 299–316. http://dx.doi.org/10.1080/0163853x.2005.9651685.
van den Broek, P., Young, M., Tzeng, Y., & Linderholm, T. (1999). The landscape model of reading: Inferences and the online construction of memory representation. In S. R. Goldman, & H. van Oostendorp (Eds.), The construction of mental representations during reading (pp. 71–98). Mahwah, NJ: L. Erlbaum Associates.
van den Broek, P., Lorch, R. F., Linderholm, T., & Gustafson, M. (2001). The effects of readers' goals on inference generation and memory for texts. Memory and Cognition, 29, 1081–1087.
van den Broek, P., Lynch, J. S., Naslund, J., Ievers-Landis, C. E., & Verduin, C. J. (2003). Children's comprehension of main ideas in narratives: Evidence from the selection of titles. Journal of Educational Psychology, 95, 707–718.
van den Broek, P., McMaster, K. L., Rapp, D. N., Kendeou, P., Espin, C. A., & Deno, S. L. (2006, June). Connecting cognitive science and educational practice to improve reading comprehension. Paper presented at the Institute of Education Sciences Research Conference, Washington, DC.
van Dijk, T. A., & Kintsch, W. (1983). Strategies of discourse comprehension. New York: Academic Press.
Ward, J. H., & Hook, M. E. (1963). Application of an hierarchical grouping procedure to a problem of grouping profiles. Educational and Psychological Measurement, 23, 69–81.
Wayman, M., Wallace, T., Wiley, H. I., Ticha, R., & Espin, C. A. (2007). Literature synthesis on curriculum-based measurement in reading. Journal of Special Education, 41, 85–120.
Woodcock, R., McGrew, K. S., & Mather, N. (2001). Woodcock–Johnson III tests of cognitive abilities and achievement. Itasca, IL: Riverside Publishing.
Zwaan, R. A., Magliano, J. P., & Graesser, A. C. (1995). Dimensions of situation model construction in narrative comprehension. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 386–397.
