
Language Testing 2009 26 (1) 101–122

Lexical patterns in L2 textual gist identification assessment
Kyoko Yamada Kyoto, Japan

Gist identification or coherent situation model construction performance is an
important criterion not only in L1 and L2 reading comprehension assessment
but also in every aspect of discourse processing. Whereas most previous L2
research has investigated schematic knowledge of readers about relations
within a text, more recent studies have used models of comprehension that
also pay attention to external reference and world knowledge. The present
study proposes a way to assess gist identification by L2 readers using Hoey’s
(1991) theory of networks of repetition of lexical items including
morphologically similar words and paraphrases of keywords containing the
gist of a text. Based on Hoey’s taxonomy, two models were tested. One,
following Kintsch (e.g. 1998), assessed the ability to identify the gist of a text
using minimal lookbacks. The other encouraged multiple lookbacks. The
second model, which required more reader involvement in finding referential
relationships within a text, resulted in more detailed understanding of gist.

Keywords: coherence, gist, pragmatics, reading, referential, signs, strategies, synonym

An important goal, perhaps the most important one, for readers is to
construct the original gist of a text (Fitzgerald, 1995; Swaffar, 1988).
Some authors would say that the reader must build a coherent situ-
ation model (Graesser, Millis & Zwaan, 1997; Kintsch, 1998; Raney,
2003). There is now some consensus that being able to construct the
gist of a text – a process of constructing the main idea of a text and
activating relevant world knowledge – is what separates a skillful
reader from a less skillful one (Hyönä, Lorch & Kaakinen, 2002;
Kintsch, 1998; Pressley & Afflerbach, 1995) and some discourse
processing theories, and especially ones based on a general and
abstract theory of signs (e.g., Oller, et al., 2006), go further in

Kyoko Yamada, email: wsedikol@hotmail.com

© 2009 SAGE Publications (Los Angeles, London, New Delhi and Singapore) DOI:10.1177/0265532208097338
arguing that there must be a dynamic interplay between the reader
and writer of the text and the context relevant to them.
In L2 research, numerous studies have dealt with the discourse
processing involved in reading comprehension (e.g., Carrell, 1985;
Chan, 2006; Connor, 1984; Horiba, 1990; Kobayashi, 2002). Most of
these have focused on how much of the textual information readers
can recall or paraphrase off-line by activating their relevant world
knowledge (e.g., schemas, frames, scripts, associations, episodic
memory). Recently, however, some L2 researchers have turned to
Kintsch’s (1998) text comprehension model (Barry & Lazarte, 1998;
Heinz, 2004; Nassaji, 2002; Pulido, 2004) widely used in L1 discourse
comprehension research (e.g., Dell, McKoon & Ratcliff, 1983;
Kintsch, 1998; McKoon & Ratcliff, 1980; Hyönä & Nurminen,
2006). The Kintsch model assumes that a coherent situation model is
constructed beginning with printed word recognition. This triggers
development of text propositions which are then matched with other
propositions already in the readers’ long-term working memory or
world knowledge to formulate a coherent situation model (i.e., the
gist) of the text.
One common feature of Kintsch’s model and previous discourse
models (e.g., schemas, frames, scripts, associations, episodic mem-
ory, rhetorical organization as applied by Anderson and Pearson
(1984) and Meyer (1975)) is that all of them regard world knowledge
(i.e., long-term working memory) as playing an important role in
constructing coherent textual situation models (Barsalou, 1999;
Zwaan, 2004). Yet, they differ on two points. The first difference has
to do with word recognition skills (Kintsch, 1998). Whereas fluent
and automatic word recognition skill was not crucial to the previous
models, in Kintsch’s model it plays a significant role in situation
model construction. The second difference is in the treatment of
working memory resources (Kintsch, 1998). Unlike previous
models, Kintsch’s model focuses on the human memory resources available for
text processing, especially the role of long-term memory (i.e., world
knowledge) in situation model construction. However, as Nassaji
(2002) reminds us, when exploiting Kintsch’s model in L2 research,
the more fundamental problem might be the effect of L2 readers’
short-term working memory resources on this process. There are in
fact two inherent problems in Kintsch’s model. The first is the lack
of evidence of how word recognition might lead to propositional
formation and word meaning uptake. Even if words play a role in
meaning formation in Kintsch’s model, they do so merely as cues
that allow activation and retrieval of semantic propositions (Kintsch,
1998; Raney, 2003). Due to such a minimalist assumption of text
processing, most studies that follow Kintsch have focused on readers’
performance in connecting a single surface form such as a pronoun,
demonstrative, or determiner with the referent of that
element somewhere else in the text, usually in close proximity to it
(e.g., Dell et al., 1983; Ehrlich & Rayner, 1983; Kintsch, 1998;
McKoon & Ratcliff, 1980; Rayner, Raney & Pollatsek, 1997).
Recently, however, evidence has been accumulating in L1 and
L2 discourse studies that challenges Kintsch’s minimalist approach
methodologically and empirically. Methodologically, the focus is
on greater varieties of referential relations such as repeated lexical
items, synonyms, and paraphrases in longer stretches of texts
(Linderholm, Virtue, Tzeng & van den Broek, 2004; Pretorius,
2005; Raney, Therriault & Minkoff, 2000) sometimes spanning
several pages of text (e.g., Hyönä et al., 2002) as well as pragmatic
ones that link real or imagined persons, events, and the like (Oller,
et al., 2006). Empirically, doubts have been cast on the necessity of
Kintschian type of propositions, which will be illustrated below, as
a mediatory variable in the human text information uptake process
(Barsalou, 1999; Graesser, Millis, et al., 1997; Sadoski, 1998;
Zwaan, 2004). Although, as Oller, et al. (2006) have noted, all
models that try to dispense with abstract propositions and their
abstract constituents fail just as miserably as ones that try to do
away with the need to refer to persons, events, and relations in the
real world (e.g., Pylyshyn, 2002), evidence is accumulating that
shows situation model construction does not have to depend on
Kintschian type propositions. In fact, textual input mediated by the
readers’ world knowledge and experience has been shown to have
a direct relationship with situation model construction (Glenberg &
Robertson, 1999; Zwaan, 2004). For instance, a phrase such as ‘an
eagle flying in the sky’ is likely to trigger the image of an eagle
with its wings spread rather than one with its wings folded (Zwaan,
2004) and when people hear or read about a certain bodily motion
(e.g. a hand motion), they tend to produce that motion (Glenberg &
Kaschak, 2002; Zwaan, 2004). Similarly, a study investigating
think-aloud protocols of L2 readers solving cloze tests found that a
single encounter with a familiar word in text was all it took for L2
readers’ textual meaning uptake process to be activated
(Yamashita, 2003).
The second problem in Kintsch’s model is its assumption that
effective short-term working memory resources are necessary to suc-
cessful situation model construction. Kintsch’s position is, in fact,
backed by L1 research that suggests that non-fluent word recognition
skills cause inefficient use of working memory resources (e.g., Just &
Carpenter, 1992; Koda, 2005), which results in unsuccessful situation
model construction. Yet, recent research suggests that with training
this limitation can be partially overcome (McNamara & Scott, 2001;
Walczyk & Taylor, 1996). In fact, working memory limitation has
been found to fluctuate depending on the genre of the text and the
strategy readers employ. Graesser, Bowers, et al. (1997), for example,
report that following referential connections in a narrative is easier
than in expository texts. Yet, more recent research using eye-tracking
behavior shows that looking back to previous lexical information in
the expository text to confirm a text’s referential connections is a hall-
mark of very skillful L1 readers (Hyönä et al., 2002). Unsurprisingly,
it was such skillful readers who produced high-quality summaries
of the text despite previous claims that such lookbacks are a feature
of unskillful reading behavior (e.g., Nassaji, 2002). In line with the
results of these L1 studies is an L2 study, where Yamada (2005)
reports that L2 intermediate learners of English engaging in multiple
lookbacks of an English journalistic text, a genre that offers multiple
referential connections in text, were able to guess synonyms of
words that make up the main idea of the text found in the first sen-
tence of the text.
The findings presented above have provided a rationale for start-
ing to investigate the construction of L2 gist identification perform-
ance measures through analyzing readers’ raw textual information
uptake performance, unmediated by Kintschian type propositions.
To test this hypothesis, a new framework is needed that investigates
gist identification performance while readers are processing a text
with multiple – rather than few – referential connections as they
engage in lookbacks (i.e., reading to identify referential relation-
ships) of the text. This is because lookbacks are said to eliminate the
impact of readers’ working memory limitation and would allow a
more direct access to readers’ gist identification performance.

I Lexical patterns in L2 reading instruction


The present study proposes that a discourse analytical scheme pres-
ented in Hoey’s (1991) book Patterns of Lexis in Text can serve as
a framework for investigating readers’ gist identification perform-
ance patterns. In his book, Hoey discusses two gist processing
assessment models, both of which involve readers’ looking back at
previous sections of extensive text to identify referential networks of
paraphrases (which Hoey uses interchangeably with synonyms – a
practice followed in the present study) and morphologically identi-
cal or similar words in English non-narrative texts, which make up
the main idea of the text. The first model is the ‘link’ (Hoey, 1991,
p. 51), which is a chain of words in a text consisting of semantically
similar words that readers identify when they are looking back in the
text. The second is the ‘bond’ (Hoey, 1991, p. 265), which is a ‘con-
nection that exists between a pair of sentences’ (Hoey, 1991, p. 265)
wherein the sentences share three or more links that readers identify.
How such referential networks are built is illustrated using the fol-
lowing excerpts from a journalistic article:
1. A new report by the UN Food and Agriculture Organization, (FAO)
projects that deaths caused by HIV/AIDS in the ten most affected
African countries will reduce the labour force … by 2020.
3. An estimated 16 million more deaths are reported likely in the next two
decades.
19. According to the FAO report, the loss of able-bodied adults
affects the entire society’s ability to maintain and reproduce itself. (Food
and Agricultural Organization of the United Nations, 2001)
The six underlined words (report, FAO, projects, deaths, affected,
countries) in sentence 1 are repeated or paraphrased in sentences 3 and
19 of this passage. Links and bonds are identified by readers when
they can spot these repeated items and paraphrases. One unambiguous
example of a link here is FAO repeated in sentences 1 and 19. Other
examples of links in sentences 1, 3, and 19 consist of semantically
similar lexical items including morphologically identical or similar
words and paraphrases (report-reported-report, projects-estimated/
likely, deaths-deaths-loss, affected-affects, and countries-society’s).
On the other hand, this text has two bonds, one between sentences 1
and 3 (report-reported, projects-estimated/likely, deaths-deaths) and
another between sentences 1 and 19 (FAO-FAO, report-report,
deaths-loss, affected-affects, countries-society’s). Notice here that these
lexical patterns include three kinds of relationships: morphologically
identical repetitions (FAO-FAO, report-report), morphologically
similar repetitions (affected-affects), and paraphrases
(projects-estimated, deaths-loss).
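
The relation between links and bonds in the FAO excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure given by Hoey or used in the present study: the dictionary layout and the names links, shared_links, and find_bonds are hypothetical, and the word pairings are those listed in the passage above. A bond is taken to hold between two sentences that share three or more links.

```python
from itertools import combinations

# Links from the FAO excerpt, keyed by the sentence 1 keyword. Each link maps
# a sentence number to the word a reader would identify in that sentence.
# (Illustrative layout; the pairings follow the passage above.)
links = {
    "report": {1: "report", 3: "reported", 19: "report"},
    "FAO": {1: "FAO", 19: "FAO"},
    "projects": {1: "projects", 3: "estimated/likely"},
    "deaths": {1: "deaths", 3: "deaths", 19: "loss"},
    "affected": {1: "affected", 19: "affects"},
    "countries": {1: "countries", 19: "society's"},
}

BOND_THRESHOLD = 3  # a bond requires three or more shared links


def shared_links(links, s1, s2):
    """Return the keywords whose link reaches both sentence s1 and sentence s2."""
    return sorted(k for k, occurrences in links.items()
                  if s1 in occurrences and s2 in occurrences)


def find_bonds(links, threshold=BOND_THRESHOLD):
    """Return {(s1, s2): shared keywords} for every sentence pair forming a bond."""
    sentences = sorted({s for occurrences in links.values() for s in occurrences})
    bonds = {}
    for s1, s2 in combinations(sentences, 2):
        shared = shared_links(links, s1, s2)
        if len(shared) >= threshold:
            bonds[(s1, s2)] = shared
    return bonds


bonds = find_bonds(links)
for pair, keywords in bonds.items():
    print(pair, keywords)
```

Under these assumptions the sketch finds the two bonds described above: sentences 1 and 3 share three links and sentences 1 and 19 share five, whereas sentences 3 and 19 share only two (report and deaths) and therefore do not form a bond.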
Hoey advises L2 learners of English to develop sensitivity to lex-
ical patterns as it would help them process L2 texts more smoothly.
Despite the fact that in real-life text processing, such a surface or
lexical-level sensitivity would need to be synchronized with a deeper
or innate syntactic sensitivity (Oller, Oller & Badon, 2006), Hoey’s
claim nonetheless stands due to the fact that sensitivity to referential
cues in text such as paraphrases is a reliable measure of native
English speakers’ reading comprehension skill (Carver, 2000; Kuo &
Anderson, 2006).
Hoey suggests two ways in which lexical pattern identification
can be incorporated in L2 reading instruction: either by instructing
readers to identify lexical patterns of designated keywords in the
text or by encouraging them to find the patterns on their own. One
previous study (Yamada, 2005) tested the applicability of the first
method qualitatively by having EFL readers identify designated rep-
etitions or paraphrases of keywords in the topic sentence of a short
journalistic text. The results were then compared against a modified
version of Hoey’s (1991) results of lexical pattern searches of the
same text, which confirmed that readers were able to identify an
average of 60 percent of words constituting lexical patterns and at
least one bond.
Of the two models of text processing – namely, bonds and links –
Hoey assumes that bond identification would make a greater contri-
bution to L2 learners because ‘[n]ot only would this give the learner
ample opportunity to see the items in operation and learn how they
are most typically used, but it would give them quicker access to the
content of the text by enabling them to make principled selections of
sentences from the text’ (1991, p. 241) whereas links are judged less
reliable.
If interpreted in light of the present study, Hoey’s lexical patterns
can be applied as measures that test L2 readers’ textual gist identifi-
cation performance. This is because they offer (1) a framework that
deals with gist identification performance in a context where readers
are processing extensive text with multiple referential connections,
rather than a few sentences with a few discourse cues and (2) two
models of gist identification (i.e., links and bonds) while readers are
engaging in lookbacks of the text.
To illustrate, links can be regarded as a gist identification model
that is similar to the Kintschian proposition construction in two
ways. First, both are about constructing overlapping semantic repre-
sentations extracted from a text reduced to the minimum. In Hoey’s
model, each link is a string of semantically related words (e.g., deaths-
deaths-loss) which, once in the head of the reader, converges into one
mental representation. This representation then merges with other
links to form a gist of the text. In Kintsch’s model, it is propositions
that are said to create this mental representation. However, unlike
links, they are abstract entities. Thus, when exploited, they must be
translated into tangible entities (i.e., the textbases), which are special
notations consisting of a verb (i.e., predicate) plus one or more nouns
and/or adjectives (i.e., arguments). For example, the sentences John
liked rock music. When he entered high school, he joined a rock
band, would cause construction of propositions expressed in the fol-
lowing seven textbases:
P1 [LIKE, JOHN, MUSIC]
P2 [ROCK, MUSIC]
P3 [ENTER, JOHN, SCHOOL]
P4 [HIGH, SCHOOL]
P5 [JOIN, JOHN, BAND]
P6 [ROCK, BAND]
P7 [WHEN, P3, P5]
According to this analysis, P1 is a textbase, with LIKE as the predicate
and JOHN and MUSIC as the arguments. Moreover, arguments in P1
(JOHN, MUSIC) overlap with arguments in P2 (MUSIC), P3 (JOHN),
and P5 (JOHN) and these argument overlaps result in constructing a
hierarchical network of propositions. Although textbases do not always
retain the features of the original written text verbatim, when they are
employed in investigating L2 readers’ gist identification performance
(e.g. Pulido, 2004), argument overlaps having similar
shape, identical meaning, or both are used as a means of gaining
access to the abstract entities. In fact, it is such overlapping features of
a textbase that make propositions comparable to Hoey’s links.
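
The argument overlap described above can be made concrete with a short Python sketch. It is an illustration only, not Kintsch’s own procedure: the tuple representation of a textbase and the function name argument_overlaps are assumptions introduced here, while the seven propositions are those listed above.

```python
from itertools import combinations

# The seven textbases above, stored as (predicate, arguments).
# P7 takes other propositions (P3, P5) as its arguments.
propositions = {
    "P1": ("LIKE", ["JOHN", "MUSIC"]),
    "P2": ("ROCK", ["MUSIC"]),
    "P3": ("ENTER", ["JOHN", "SCHOOL"]),
    "P4": ("HIGH", ["SCHOOL"]),
    "P5": ("JOIN", ["JOHN", "BAND"]),
    "P6": ("ROCK", ["BAND"]),
    "P7": ("WHEN", ["P3", "P5"]),
}


def argument_overlaps(props):
    """Return {(Pa, Pb): shared arguments} for each pair sharing an argument."""
    overlaps = {}
    for (a, (_, args_a)), (b, (_, args_b)) in combinations(props.items(), 2):
        shared = set(args_a) & set(args_b)
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


overlaps = argument_overlaps(propositions)

# Propositions whose arguments overlap with those of P1
p1_partners = sorted(b for (a, b) in overlaps if a == "P1")
print(p1_partners)
```

In this sketch P1 overlaps with P2 (MUSIC), P3 (JOHN), and P5 (JOHN), as stated above. Only arguments are compared, so the shared predicate ROCK in P2 and P6 does not count as an overlap.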
The second common feature between propositions and links is
that they are units based on a minimum of one pair of decontextual-
ized referential relationship. To illustrate, a link is minimally consti-
tuted when one word in a text is identified by the readers as referring
to or overlapping another word elsewhere in the text as its morpho-
logically similar or identical lexical item or its paraphrase regardless
of the original textual context. Similarly, the minimum constituent of
Kintsch’s propositional network is one overlap of lexical items, rep-
resented as a textbase, formed in the mind of the readers.
On the other hand, Hoey’s bonds can be regarded as a gist identifi-
cation model different from links and Kintsch’s propositions in one
important respect. Whereas links and propositions are based on read-
ers making a minimum of a single lookback, bonds are based on
readers making multiple (i.e., a minimum of three) synchronized look-
backs, which require more extensive effort on the part of the readers
to associate their world knowledge with the textual information at
hand (Hoey, 1991). Thus by comparing readers’ bond and link identi-
fication performances, we can investigate which of the two discourse
processing models serves as a better measure of L2 readers’ textual
gist identification performance.

II The objectives of the present study


The present study builds on a previous study (Yamada, 2005), which
confirmed EFL readers’ ability to identify some elements of lexical pat-
terns as originally identified by Hoey (1991). The study had three lim-
itations. First, because it was a qualitative study, it did not offer
quantitative assessment of EFL readers’ gist identification performance
of the text, making its results inconclusive. Second, it did not compare
the efficacy of the two ways in which lexical patterns could be used in
a gist identification task as proposed in Hoey (1991): learners identifying
lexical patterns of designated keywords in the text or learners identifying
the patterns on their own. Third, and most importantly, it did not ask
which of the two text processing models – i.e., bonds or links – is more
appropriate as a measure of L2 learners’ gist identification performance,
a question that amounts to an empirical test of Hoey’s (1991) hypothesis
that bonds are indeed superior to links. The present study aims to
overcome these three limitations.

III Participants
Ninety-nine Japanese EFL college students participated in this study.
They were in one of six parallel EFL classes that the present
researcher taught and were intermediate to low-advanced learners of
English. Previous investigation of the participants’ TOEIC scores
confirmed that the six classes that participated in this study had no
significant variance in their levels of English language proficiency.
Each class was randomly assigned to either of the two treatment
groups (i.e., Group One or Two).

IV Materials
As discussed earlier, because the present study builds on a previous
study (Yamada, 2005), it uses the same text employed there: a
five-sentence journalistic text (‘Green Piece’, 1984) in Hoey (1991)
(Appendix 1). Based on this journalistic text, two worksheets were
created. In the first worksheet, each sentence in the text was num-
bered 1 to 5 and each of the seven keywords in sentence 1 (i.e., drug,
produce, humans, used, sedating, grizzly, and bears), based on
Hoey’s (1991) original taxonomy, was highlighted with a circle
around it. The second worksheet was similar to the first except that
there was no word highlighting in sentence 1.

V Procedure
In the last 10 minutes of a 90-minute regular class meeting, students
completed a lexical pattern search task by working on either of the two
worksheets described above. Group One participants, who were given
the first worksheet, were instructed to look at the seven
highlighted keywords in sentence 1 and find and mark with a pencil
repetitions or paraphrases of these keywords elsewhere in the text.
Group Two participants, who were given the second worksheet,
were told to focus on words in sentence 1 and spot and mark with a
pencil any repetitions or paraphrases elsewhere in the text on their
own. Participants in both groups were reminded that the words in the
text could be repeated or paraphrased in different parts of speech or
tenses and might be pluralized, singularized, or both. They were not
permitted to use their dictionary and were encouraged to guess if unsure
of their answers. To ensure that participants understood the task, prior
to the actual task they did a mini lexical pattern search exercise printed
on the reverse side of the worksheet with the instructor.

VI Data coding
All participants’ answers were coded by the present researcher; 10
percent of them were coded by a second coder, a native-speaker EFL
instructor. Both followed a coding system which involved two main
criteria. First, any content word that Hoey identified as an element
constituting lexical patterns was treated as correct and was awarded
one point.
Hoey’s (1991) original lexical patterns consisted of four of the
nine categories of repetitions and paraphrases discussed in his book:
‘Simple lexical repetition’ (1991, p. 52) (e.g., bear-bear (sentences 1
and 2)); ‘complex lexical repetition’ (1991, p. 55), (e.g., [D]rugging-
drug (sentences 1 and 4)); ‘simple paraphrase’ (1991, p. 62)
(e.g., produce-causes-was responsible for (sentences 1, 2, and 5));
‘complex paraphrase’ (1991, p. 64) (e.g., drug-tranquillized (sen-
tences 1 and 2)). Based on these categories, the following 13 words
were identified as constituting links, bonds, or both deriving from
the seven words of sentence 1: bear, tranquillized, causes, and user
(sentence 2); bears and human (sentence 3); humans, animals, and
drugging (sentence 4); and drug, is responsible for, grizzly and bear
(sentence 5).
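
The first coding criterion amounts to counting how many of these 13 items a participant marked. A minimal Python sketch follows, assuming that each marked answer is recorded as a (sentence number, item) pair. That record format, the function name score_first_criterion, and the sample answers are hypothetical and are not taken from the study’s data.

```python
# Hoey's 13 target items listed above, as (sentence number, item) pairs.
HOEY_TARGETS = {
    (2, "bear"), (2, "tranquillized"), (2, "causes"), (2, "user"),
    (3, "bears"), (3, "human"),
    (4, "humans"), (4, "animals"), (4, "drugging"),
    (5, "drug"), (5, "is responsible for"), (5, "grizzly"), (5, "bear"),
}


def score_first_criterion(marked):
    """Award one point for each marked item that is one of Hoey's targets."""
    return len(set(marked) & HOEY_TARGETS)


# Hypothetical answers from one participant (not real data); the last pair
# stands for a marked word that is not among Hoey's items.
marked = [(2, "bear"), (2, "user"), (3, "bears"), (4, "drugging"),
          (5, "drug"), (3, "not-a-target")]

score = score_first_criterion(marked)
print(score, "of", len(HOEY_TARGETS))
```

Items outside Hoey’s list score nothing here; they are handled by the additional criteria described next.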
Second, because some participants identified associations that
were not included in Hoey’s original lexical patterns, two additional
criteria were established to identify acceptable constituents of lexical
patterns, awarding one point for each item that met these criteria. First,
a word identified in sentences 2 to 5 was coded as a sentence 1
paraphrase if the word (or its morphologically similar word) was found
in a thesaurus entry of a sentence 1 word or vice versa. Roget’s 21st
century thesaurus in dictionary form (Kipfer, 1992) was the main
thesaurus used to make this judgment. For example, gives (sentence
2) and giving (sentence 5) were treated as paraphrases of produce
(sentence 1). This is because the thesaurus entry of produce included
give, which is morphologically related to gives and giving. Second,
a word in sentences 2 to 5 was coded as a sentence 1 semi-paraphrase
if the thesaurus entries of the former (or its morphologically similar
word) and the latter both included a common constituent. For example,
violent in sentence 1 was regarded as a semi-paraphrase of dangerous
(sentence 4) because both entries shared the word urgent.
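
The two additional criteria can be expressed as a small decision rule. The Python sketch below is illustrative only: the toy thesaurus contains just the relations mentioned above (produce lists give; violent and dangerous both list urgent), and the lemma table and the function names are hypothetical stand-ins for consulting the printed thesaurus by hand.

```python
# Toy thesaurus with only the relations stated in the text. Real entries are
# far longer; everything else here is a hypothetical stand-in.
THESAURUS = {
    "produce": {"give"},
    "violent": {"urgent"},
    "dangerous": {"urgent"},
}

# Minimal table mapping inflected forms to their base form (hypothetical).
LEMMAS = {"gives": "give", "giving": "give"}


def lemma(word):
    """Reduce a word to the morphologically related form used for lookup."""
    return LEMMAS.get(word, word)


def entry(word):
    """Return the thesaurus entry for a word, or an empty set if none exists."""
    return THESAURUS.get(lemma(word), set())


def code_relation(sentence1_word, later_word):
    """Code a word from sentences 2 to 5 against a sentence 1 word."""
    a, b = lemma(sentence1_word), lemma(later_word)
    if b in entry(a) or a in entry(b):   # found in either entry
        return "paraphrase"
    if entry(a) & entry(b):              # entries share a constituent
        return "semi-paraphrase"
    return None


print(code_relation("produce", "gives"))
print(code_relation("violent", "dangerous"))
```

With these toy entries, gives is coded as a paraphrase of produce and dangerous as a semi-paraphrase of violent, mirroring the two examples above.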
Three countermeasures were taken to cope with special cases.
First, if the thesaurus did not carry a word’s exact entry, its closest
entry was used. For example, because there was no entry for humans
(i.e., the plural form entry), its closest entry human (its singular form
entry) was used. Second, if there was no thesaurus entry for an
encyclopedic word (e.g., angel dust), a second thesaurus, Macquarie
encyclopedic thesaurus (Bernard, 1990), which was more encyclopedic
than the Roget’s, was referred to. Third, special steps were taken
to treat two items in the text (i.e., junkies and effects) as
paraphrases of words in sentence 1 (i.e., used, produce). According to
Macquarie’s, junkies is a paraphrase of drug user. Because sentence
1 had both drug and used, junkies was coded as a paraphrase of either
of the two but not of both. For example, if participants identified a
link between used and junkies, junkies was interpreted as a para-
phrase of used. However, if a link between used and user (sentence 2)
was identified, junkies was not interpreted as a paraphrase of used. In
such a case, user was identified as a paraphrase of used. On the other
hand, because the thesaurus entry of effect as a verb included
produce and the entry of effect as a noun included reaction, effect was
interpreted as a paraphrase of either produce or reactions but not of
both.
To facilitate the coding procedure, colored pens were used to highlight
the constituents of each link. To illustrate this procedure, readers
are referred to Appendix 2, which presents coded data for five links and
two bonds. The appendix shows that phencyclidine-drugging-drug-phencyclidine
constituted a link based on the sentence 1 keyword drug; in the actual
coding, these words were marked in the same color of pen (i.e., green).
Similarly, the sentence 1 keyword known and its repetition in sentence
2, known, were marked with a gray pen and the sentence 1 keyword
violent and its paraphrase in sentence 4, dangerous, were marked with
a blue pen. The two coders worked according to the same coding
procedure, which resulted in an intercoder reliability of .95. Differences
in interpretation were resolved by consultation. Later, the number of (1)
paraphrases and (2) morphologically similar repetitions were tallied
by the present researcher. Following Pretorius (2005), the data were
analyzed a second time after a long interval, for the present study,
about three months later. The intracoder reliability was .95.

VII Statistical analyses


Five variables were created from the data: (1) the number of para-
phrases (PARA), (2) the number of morphologically similar (but not
identical) words (MORPH), (3) the number of links (LINK), (4) the
number of bonds (BOND), and (5) treatment (TASK). Although
morphologically identical words were not investigated as a separate
variable due to their being easy to identify (Buck, Tatsuoka & Kostin,
1997), morphologically similar words (inflections and derivatives)
were, because developing readers have been reported to have diffi-
culty with them (Kuo & Anderson, 2006; Pretorius, 2005; Schmitt &
Meara, 1997). The variables were entered into an SPSS/PC spreadsheet,
and descriptive statistics, Pearson correlations, one t-test, two
standard multiple regression analyses, and one Wilcoxon Signed-Rank
Test were conducted. A Bonferroni correction was used for
the two comparisons, with the alpha set at .025 (.05/2).
The descriptive statistics analyzed the 99 cases, which are lexical
pattern scores for both treatment groups. The results indicated that (1)
the number of paraphrases (PARA), (2) the number of morphologi-
cally similar words (MORPH), and (3) the number of links (LINK)
were non-normal; these variables were normalized following Tabachnick and
Fidell (2001).
First, a series of tests was conducted to determine which of the two
ways of using lexical patterns in L2 reading, as proposed by Hoey, is
better: learners instructed to identify lexical patterns of designated
keywords in the text or learners identifying the patterns on their own.
Pearson correlations were calculated first, which resulted in two main
findings (see Table 1). The first finding is that morphologically similar
word search performance (MORPH) was the only variable that
significantly correlated with treatment outcomes. A t-test was run to
identify the direction of this relationship, which revealed that the
instructed group outperformed the non-instructed group [t(97) =
4.727, p < .001]. The means and standard deviations were .25 and .23
for Group One and .45 and .18 for Group Two. The second set
of findings is moderate but statistically significant relationships
between participants’ search performances of paraphrases and
morphologically similar words (i.e. PARA and MORPH) and
their bond and link identification. Two standard multiple regression
analyses were conducted to investigate which of the two models of
lexical patterns would better predict the participants’ paraphrase and
morphologically similar word search performance variables.
Table 2 shows that the two search performance variables (PARA
and MORPH) significantly accounted for 57% of the bond variance.
Semipartial correlations revealed that they contributed to 40% and
17% of bond identification predictions, respectively.

Table 1 Descriptive statistics and correlations of the five variables

Variable    M      SD     1       2       3       4       5

1 PARA      4.68   2.82   –
2 MORPH     3.49   1.34   .00     –
3 LINK      5.44   1.36   .47**   .49**   –
4 BOND      2.10   1.13   .63**   .41**   .72**   –
5 TASK      1.57   .50    .08     .43**   .18     .10     –

** p < .01.
PARA = paraphrases; MORPH = morphologically similar words; LINK = links;
BOND = bonds; TASK = treatments.
Table 3 shows that the two search variables significantly
accounted for 46% of the link variance. Semipartial correlations
revealed that they contributed to 22% and 24% of link identification
predictions, respectively.
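
The reported shares of variance can be checked against the correlations in Table 1, because PARA and MORPH are uncorrelated there (r = .00). The Python sketch below uses the standard two-predictor formulas for R² and for squared semipartial correlations. It is a reader’s consistency check, not the SPSS analysis itself; the function names are illustrative, and the inputs are the rounded values printed in Table 1.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """R^2 for a criterion regressed on two predictors, from correlations."""
    return (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)


def squared_semipartials(r_y1, r_y2, r_12):
    """Squared semipartial correlation of each predictor with the criterion."""
    denominator = (1 - r_12**2) ** 0.5
    sr1 = (r_y1 - r_y2 * r_12) / denominator
    sr2 = (r_y2 - r_y1 * r_12) / denominator
    return sr1**2, sr2**2


R_PARA_MORPH = 0.00  # correlation between the two predictors (Table 1)

# BOND as criterion: r with PARA = .63, r with MORPH = .41 (Table 1)
bond_r2 = r2_two_predictors(0.63, 0.41, R_PARA_MORPH)
bond_sr2 = squared_semipartials(0.63, 0.41, R_PARA_MORPH)

# LINK as criterion: r with PARA = .47, r with MORPH = .49 (Table 1)
link_r2 = r2_two_predictors(0.47, 0.49, R_PARA_MORPH)
link_sr2 = squared_semipartials(0.47, 0.49, R_PARA_MORPH)

print(bond_r2, bond_sr2)
print(link_r2, link_sr2)
```

With uncorrelated predictors, each squared semipartial correlation is simply the squared zero-order correlation, and R² is their sum. The sketch gives about .565 for bonds (.397 + .168) and about .461 for links (.221 + .240), which agrees with the reported 57% (40% and 17%) and 46% (22% and 24%) within rounding.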
Finally, a Wilcoxon Signed-Rank Test was conducted to test
whether links and bonds were different models of gist identification.
A one-tailed test revealed that bonds and links are,
as predicted by the present study and Hoey, distinct models
(z = –4.37, p < .001). Taken together, the results of the present
analyses indicate that bonds better predicted the search performances
of paraphrases and morphologically similar words than links.

VIII Discussion
The first goal of the present study was to conduct a series of quantitative
assessments of EFL readers’ gist identification performance of
written text, which had been left uninvestigated in a previous study
(Yamada, 2005). These assessments resulted in two main findings that were
dealt with as the second and third goals of the present study.

Table 2 Regression analysis summary for PARA and MORPH variables predicting BOND

Variable    B      SE B    β

PARA        .96    .10     .63*
MORPH       2.0    .33     .41*

Note: R² = .57 (N = 99, p < .001). * p < .001.
PARA = paraphrases; MORPH = morphologically similar words; BOND = bonds.

Table 3 Regression analysis summary for PARA and MORPH variables predicting LINK

Variable    B      SE B    β

PARA        .67    .10     .49*
MORPH       .20    .03     .47*

Note: R² = .46 (N = 99, p < .001). * p < .001.
PARA = paraphrases; MORPH = morphologically similar words; LINK = links.
The second goal of the present study was to identify whether it is
better for L2 readers to do lexical pattern searches based on desig-
nated keywords in the text or to do so by finding out constituents of
the patterns on their own. The results showed that there was no dif-
ference in these two conditions as far as link, bond, and paraphrase
searches were concerned. However, as for the morphologically simi-
lar item searches, the instructed group performance surpassed that of
the non-instructed group, which indicates that instruction can con-
tribute to enhancement of L2 learners’ awareness of morphological
relationships in text, which is an area that has been regarded as diffi-
cult to teach (e.g., Schmitt & Meara, 1997). This result is in line with
Walczyk and Taylor’s (1996) L1 research that reading deficiencies
can partially be overcome by instructing readers to carry out look-
backs. The absence of group differences in the link, bond, paraphrase
search performances suggests that there was more involved to such
searches than being alert to the surface textual information, which
leads us to discuss the third goal of the present study: identifying
which of the two text-processing models – i.e., bonds and links –
functions as a more appropriate L2 gist identification performance
measure. This was also an attempt to empirically test Hoey’s (1991)
hypothesis that bonds are indeed superior to links. The results of
the two multiple regression analyses suggest that of the two text-
processing models, the bonds proved to be a better L2 gist identifi-
cation assessment model, offering empirical support to Hoey’s claim
that bonds are more superior to links as a means of conveying and
identifying textual gist, which no other study has achieved to date.
These analyses revealed two areas in which bonds are superior. The
first is paraphrase search performance. As mentioned earlier, the
number of paraphrases and morphologically similar words identified
predicted bond identification performance (57%) better than it did
link identification performance (46%). More important, however, are
the semipartial correlations, which revealed that paraphrases
contributed nearly twice as much to bond identification as to link
identification (40% versus 22%). That bond identification was better
predicted by paraphrase searches than link identification was suggests
that bonds contributed to constructing a better situation model than
links did. This claim is supported by a body of L1 testing research
that has found a close relationship between paraphrase recognition
skill and L1 reading comprehension skill (Anderson, 1972; Carver,
2000), and by L2 studies reporting that successful use and awareness
of synonyms in text contribute to building improved text coherence or
appropriate situation models (Demel, 1990; Reynolds, 1995; Tyler,
1992).

There are two reasons why bond-based reading may have contributed
to creating a better situation model. First, because it involves more
frequent synonym processing, it offers readers greater opportunities
to trigger inferential processes that can lead to a greater activation
of their world knowledge. Although synonyms have often been treated as
purely textual or surface-form entities in many previous L2 studies
(e.g., Buck, Tatsuoka & Kostin, 1997; Kostin & Freedle, 1993;
Pretorius, 2005; Rupp et al., 2006), they have been shown to possess a
supratextual property (Hasan, 1984; Hoey, 1991). Their presence in a
text is evidence that the author made a pragmatic effort to connect
the textual information with his or her ‘experiential’ (Hasan, 1984,
p. 201) world knowledge. Studies comparing the quality and quantity of
L1 and L2 speakers’ essays (Reynolds, 1995) and monologue speeches
(Tyler, 1992) have found that the discourse of proficient L1 speakers
and writers was characterized by the use of synonyms to elaborate on
the main argument, which improved coherence for their readers and
listeners. Such sensitivity to synonyms was weak in the discourse of
L2 users, which was often characterized by a high frequency of
repetition that caused comprehension problems for L1 readers and
listeners.
Applying these findings to lexical pattern processing, bond- and
link-based reading can be likened to readers’ participation in the
text author’s inferential processes of connecting the textual
information with his or her experiential world knowledge. Bond-based
reading is considered superior to link-based reading in this respect
because it activates readers’ world knowledge more. It directs L2
readers’ attention to a greater number of synonyms in a text, which
demands their greater participation in the text author’s activation of
world knowledge and in turn evokes a situation model that is more in
line with the content of that text (Zwaan, 2004). Link-based reading,
by contrast, often demands minimal reader participation in the text
author’s inferential processes, which often results in a failure to
construct any situation model at all (Enright et al., 2000; Perkins &
Jones, 1985; Rupp et al., 2006). While there could be cases in which a
single encounter with a word in a text is enough to evoke a
text-appropriate situation model (e.g., Yamashita, 2003; Zwaan, 2004),
there is evidence that text processing involving fewer words carries a
greater risk of evoking a situation model that does not match the
given textual content (Demel, 1990; Oller, 1994; Yamashita, 2003), no
matter how successfully the lower-level word processing is carried out
(e.g., Bourassa, Levy, Dowin & Casey, 1998; Levy, Abello & Lysynchuk,
1997).
The second reason why bond-based reading may have contributed
to stronger situation-model construction is that it would have
involved a greater degree of semantic and syntactic processing.
Previous L2 studies have reported that identifying synonymous
relations in text requires synchronization of both of these processes
(Jonz, 1994; Perkins & Jones, 1985). This suggests that in order to
process a synonym-rich discourse unit such as a bond, readers would be
forced to depend on such synchronization. Such semantico-syntactic
processes, on the other hand, are not necessarily required in
link-based reading (Kostin & Freedle, 1993; Pretorius, 2005; Rupp et
al., 2006).
The second area of superiority of bonds over links is in the
number of textual lookbacks. Whereas a link involves a minimum of one
lookback, a bond involves a minimum of three. As discussed earlier,
previous L1 research has confirmed that more lookbacks contribute
to creating a denser textual gist (Hyönä et al., 2002; Walczyk &
Taylor, 1996). L2 reading strategy research has reported that L2
reading is often hampered by text-boundness (Cziko, 1980; Oller, 1972;
Kostin & Freedle, 1993; Jonz, 1994; Pretorius, 2005; Rupp et al.,
2006) and inefficient strategy use (e.g., Zhang, 2001). In keeping
with the L1 findings, it has also shown that lookbacks help L2 readers
treat their text more globally and improve their L2 situation model
building (Anderson, 1991; Jonz, 1994; Pretorius, 2005; Zhang, 2001).
In light of the above evidence, bond-based reading, in contrast to
link-based reading, is believed to have had a stronger effect on
improving inexperienced L2 readers’ global text processing. One
reason is that bond-based reading requires more lookbacks, which in
turn led readers to develop a fuller situation model than link-based
reading did. There are two possible reasons why more lookbacks in bond
identification could support better situation model construction.
First, frequent lookbacks could alleviate the effect of working-memory
limitations on text processing and, as a result, afford readers more
freedom to engage in situation model construction (Hyönä et al., 2002;
Walczyk & Taylor, 1996). Second, because bond identification allows
readers to deal with more text data than link identification, it
allows them to activate their world knowledge more smoothly. This is
in line with Oller’s (1994) finding that L2 readers’ performance on
cloze tasks improved when they were working with longer texts with
more contextual information than with shorter ones. In light of this
evidence, it can be argued that L2 reading is facilitated or better
assessed when learners have greater freedom to interact with a greater
variety of referential relations.

IX Conclusion
This exploratory study had three goals. First, it attempted to
approach L2 lexical pattern search performance quantitatively. Second,
it asked whether there was any difference between instructing L2
readers to identify designated lexical patterns and encouraging them
to identify the patterns by themselves. The results revealed that, at
least in a relatively short text such as the one used in this study,
there was no group difference except in morphologically similar word
search performance. The third goal of the study was to investigate
whether multiple lookbacks (bond searches) contribute to better L2
gist identification performance than less frequent lookbacks (link
searches). The results demonstrated the superiority of bonds,
confirming Hoey’s (1991) hypothesis.
Because the study was exploratory in nature, it also had
limitations. First, the unique features of the text used, including
its length, may have had an unexpected impact on the results. In light
of Hoey’s (1991) suggestion that the true strength of bond-based
analyses will show up in stretches of text that reach across
paragraphs, book chapters, and even books, future studies will need to
investigate these wide reaches of bonds. Second, the lexically focused
nature of the present study prevented us from investigating the
connection between readers’ ability to spot lexical patterns and their
innate syntactic sensitivity, and how such sensitivity can make an
additional contribution to situation model construction. These
questions need to be explored in future studies. Third, further
investigation is needed into whether lexical pattern searches can
serve as a more radical index of L2 pragmatic mapping (Oller, Oller &
Badon, 2006), that is, whether lexical pattern searches can be
associated with the extratextual experience that readers bring to text
processing. Fourth, the results of this study are based on
cross-sectional data. Future studies may need to look at longitudinal
changes in the effects of bonds in L2 reading within the same group of
subjects. Such a study would require the type of repeated-measures
design recommended by Oller and Jonz (1994).

Acknowledgements
I would like to thank Professors J. D. Brown and J. W. Oller, Jr. as
well as the previous and present Editors and the Reviewers of
Language Testing for their valuable comments and suggestions on
earlier versions of this article. Any remaining errors are my own.

X References
Anderson, N. J. (1991). Individual differences in strategy use in second lan-
guage reading and testing. Modern Language Journal, 75(4), 460–472.
Anderson, R. C. & Pearson, D. P. (1984). A schema-theoretic view of basic
processes in reading. In D. P. Pearson (Ed.), Handbook of reading
research (pp. 255–291). New York: Longman.
Anderson, R. C. (1972). How to construct achievement tests to assess com-
prehension. Review of Educational Research, 42(2), 145–170.
Barry, S. & Lazarte, A. A. (1998). Evidence for mental models: How do prior
knowledge, syntactic complexity, and reading topic affect inference gen-
eration in recall task for nonnative readers of Spanish? Modern
Language Journal, 82(2), 176–193.
Barsalou, L. W. (1999). Perceptual symbol systems. Behavioral and Brain
Sciences, 22(4), 577–660.
Bernard, J. R. A. (1990). The Macquarie encyclopedic thesaurus: The book
of words. Melbourne: Macquarie Library.
Bourassa, D. C., Levy, B. A., Dowin, S., & Casey, A. (1998). Transfer effects
across contextual and linguistic boundaries: Evidence from poor readers.
Journal of Experimental Child Psychology, 71, 45–61.
Buck, G., Tatsuoka, K. & Kostin, I. (1997). The subskills of reading: Rule-
space analysis of a multiple-choice test of second language reading com-
prehension. Language Learning, 47(3), 423–466.
Carrell, P. L. (1985). Facilitating ESL reading by teaching text structure.
TESOL Quarterly, 19(4), 727–752.
Carver, R. P. (2000). The causes of high and low reading achievement. Mahwah,
NJ: Lawrence Erlbaum.
Chan, Y. (2006). On the use of the immediate recall task as a measure of
second language reading comprehension. Language Testing, 23(4), 520–543.
Connor, U. (1984). Recall of text: Differences between first and second lan-
guage readers. TESOL Quarterly, 18(2), 239–256.
Cziko, G. A. (1980). Language competence and reading strategies: A
comparison of first- and second-language oral reading errors. Language
Learning, 30(1), 101–114.
Dell, G. S., McKoon, G., & Ratcliff, R. (1983). The activation of antecedent
information during the processing of anaphoric reference in reading.
Journal of Verbal Learning and Verbal Behavior, 22(1), 121–132.
Demel, M. C. (1990). The relationship between overall reading comprehension
and comprehension of coreferential ties for second language readers of
English. TESOL Quarterly, 24(2), 267–292.
Ehrlich, K., & Rayner, K. (1983). Pronoun assignment and semantic integra-
tion during reading: Eye movements and immediacy of processing.
Journal of Verbal Learning and Verbal Behavior, 22(1), 75–87.
Enright, M. K., Grabe, W., Koda, K., Mosenthal, P., Mulcahy-Ernt, P., &
Schedl, M. (2000). TOEFL 2000 reading framework: A working paper
(TOEFL Monograph Series MS-17). Princeton, NJ: Educational Testing
Service.
Fitzgerald, J. (1995). English-as-a-second-language learners’ cognitive read-
ing processes: A review of research in the United States. Review of
Educational Research, 65(2), 89–108.
Food and Agriculture Organization of the United Nations. (2001). January 30:
HIV/AIDS devastating rural labour force in many African countries, says
FAO. Retrieved February 2008, from http://www.fao.org/WAICENT/
OIS/PRESS_NE/PRESSENG/2001/pren0130.htm
Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action.
Psychonomic Bulletin & Review, 9(3), 558–565.
Glenberg, A. M., & Robertson, D. A. (1999). Indexical understanding of
instructions. Discourse Processes, 28(1), 1–26.
Graesser, A. C., Bowers, C., Bayen, U. J., & Hu, X. (1997). Who said what?
Who knows what? Tracking speakers and knowledge in narratives. In W.
Van Peer & S. Chatman (Eds.), New perspectives on narrative perspec-
tive (pp. 258–272). NY: State University of New York Press.
Graesser, A. C., Millis, K. K., & Zwaan, R. A. (1997). Discourse compre-
hension. Annual Review of Psychology, 48, 163–189.
Green Piece. (1984). BBC Wildlife 2, 160.
Hasan, R. (1984). Coherence and cohesive harmony. In J. Flood (Ed.),
Understanding reading comprehension (pp. 181–219). Newark, DE:
International Reading Association.
Heinz, P. J. (2004). Towards enhanced second language reading comprehen-
sion assessment: Computerized versus manual scoring of written recall
protocols. Reading in a Foreign Language, 16(2), 97–124.
Hoey, M. (1991). Patterns of lexis in text. Oxford: Oxford University Press.
Horiba, Y. (1990). Narrative comprehension processes: A study of native and
non-native readers of Japanese. Modern Language Journal, 74(2),
188–202.
Hyönä, J., Lorch, R. F., Jr., & Kaakinen, J. K. (2002). Individual differences
in reading to summarize expository text: Evidence from eye fixation pat-
terns. Journal of Educational Psychology, 94(1), 44–55.
Hyönä, J. & Nurminen, A. (2006). Do adult readers know how they read?
Evidence from eye movement patterns and verbal reports. British
Journal of Psychology, 97(1), 31–50.
Jonz, J. (1994). The effects of textual cohesion and prior knowledge on native
and nonnative cloze test scores. In J. W. Oller, Jr. & J. Jonz (Eds.), Cloze
and coherence (pp. 269–285). Cranbury, NJ: Bucknell University Press.
Just, M. A. & Carpenter, P. A. (1992). A capacity theory of comprehension:
Individual differences in working memory. Psychological Review, 99(1),
122–149.
Kintsch, W. (1998). Comprehension: A paradigm for cognition. New York:
Cambridge University Press.
Kipfer, B. A. (1992). Roget’s 21st century thesaurus in dictionary form.
New York: Laurel.
Kobayashi, M. (2002). Method effects on reading comprehension test per-
formance: Text organization and response format. Language Testing,
19(2), 193–220.
Koda, K. (2005). Insights into second language reading. Cambridge:
Cambridge University Press.
Kostin, I., & Freedle, R. (1993). The prediction of TOEFL reading item
difficulty: Implications for construct validity. Language Testing, 10(2),
133–170.
Kuo, L., & Anderson, R. C. (2006). Morphological awareness and learning to
read. Educational Psychologist, 4, 161–180.
Levy, B. A., Abello, B., & Lysynchuk, L. (1997). Transfer from word training
to reading in context: Gains in reading fluency and comprehension.
Learning Disability Quarterly, 20(3), 173–188.
Linderholm, T., Virtue, S., Tzeng, Y., & van den Broek, P. (2004).
Fluctuations in the availability of information during reading: Capturing
cognitive processes using the landscape model. Discourse Processes,
37(2), 165–186.
McKoon, G. & Ratcliff, R. (1980). The comprehension processes and mem-
ory structures involved in anaphoric reference. Journal of Verbal
Learning and Verbal Behavior, 19(6), 668–682.
McNamara, D. S., & Scott, J. L. (2001). Working memory capacity and strat-
egy use. Memory & Cognition, 29(1), 10–17.
Meyer, B. J. F. (1975). The organization of prose and its effects on memory.
Amsterdam: North-Holland Publishing.
Nassaji, H. (2002). Schema theory and knowledge-based processes in second
language reading comprehension: A need for alternative perspectives.
Language Learning, 52(2), 439–481.
Oller, J. W., Jr. (1972). Scoring methods and difficulty levels for cloze tests of
proficiency in ESL. Modern Language Journal, 56(3), 151–158.
Oller, J. W., Jr. (1994). Cloze, discourse, and approximation to English. In
J. W. Oller, Jr. & J. Jonz (Eds.), Cloze and coherence (pp. 119–134).
Cranbury, NJ: Bucknell University Press.
Oller, J. W., Jr., Chen, L., Oller, S. D., & Pan, N. (2006). Empirical predictions
from a general theory of signs. Discourse Processes, 40(2), 115–144.
Oller, J. W., Jr., & Jonz, J. (Eds.). (1994). Cloze and coherence. Cranbury,
NJ: Bucknell University Press.
Oller, J. W., Jr., Oller, S. D., & Badon, L. C. (2006). Milestones: Normal
speech and language development across the life span. San Diego, CA:
Plural Publishing.
Perkins, K., & Jones, B. (1985). Measuring passage contribution in ESL
reading comprehension. TESOL Quarterly, 19(1), 137–153.
Pretorius, E. J. (2005). English as a second language learner differences in
anaphoric resolution: Reading to learn in the academic context. Applied
Psycholinguistics, 26(4), 521–539.
Pressley, M., & Afflerbach, P. (1995). Verbal protocols of reading. Mahwah,
NJ: Lawrence Erlbaum.
Pulido, D. (2004). The relationship between text comprehension and second
language incidental vocabulary acquisition: A matter of topic familiarity?
Language Learning, 54(3), 469–523.
Pylyshyn, Z. W. (2002). Mental imagery: In search of a theory. Behavioral and
Brain Sciences, 25(2), 157–238.
Raney, G. E. (2003). A context-dependent representation model for explaining
text repetition effects. Psychonomic Bulletin & Review, 10(1), 15–28.
Raney, G. E., Therriault, D. J., & Minkoff, S. R. B. (2000). Repetition
effects from paraphrased text: Evidence for an integrated representation
model of text representation. Discourse Processes, 29(1), 61–81.
Rayner, K., Raney, G. E., & Pollatsek, A. (1997). Eye movements and dis-
course processing. In R. F. Lorch, & E. J. O’Brien (Eds.), Sources of
coherence in reading (pp. 9–35). Mahwah, NJ: Lawrence Erlbaum.
Reynolds, D. W. (1995). Repetition in nonnative speaker writing: More than
quantity. Studies in Second Language Acquisition, 17(2), 185–209.
Rupp, A. A., Ferne, T., & Choi, H. (2006). How assessing reading compre-
hension with multiple-choice questions shapes the construct: A cognitive
processing perspective. Language Testing, 23(4), 441–474.
Sadoski, M. (1998). Comprehending comprehension. Reading Research
Quarterly, 34(4), 493–500.
Schmitt, N. & Meara, P. (1997). Researching vocabulary through a word
knowledge framework. Studies in Second Language Acquisition, 19(1),
17–36.
Swaffar, J. K. (1988). Readers, texts, and second languages: The interactive
process. Modern Language Journal, 72(2), 123–149.
Tabachnick, B. G. & Fidell, L. S. (2001). Using multivariate statistics.
Boston: Allyn and Bacon.
Tyler, A. (1992). Discourse structure and the perception of incoherence in
international teaching assistants’ spoken discourse. TESOL Quarterly,
26(4), 713–729.
Yamada, K. (2005). Lexical patterns in the eyes of intermediate EFL readers.
RELC Journal, 36(2), 177–188.
Yamashita, J. (2003). Process of taking a gap-filling test: Comparison of
skilled and less skilled EFL readers. Language Testing, 20(3), 267–293.
Walczyk, J. J. & Taylor, R. W. (1996). How do the efficiencies of reading
subcomponents relate to looking back in text? Journal of Educational
Psychology, 88(3), 537–545.
Zhang, L. J. (2001). Awareness in reading: EFL students’ metacognitive
knowledge of reading strategies in an acquisition-poor environment.
Language Awareness, 10(4), 268–288.
Zwaan, R. A. (2004). The immersed experiencer: Toward an embodied theory
of language comprehension. In B. H. Ross (Ed.), The psychology of
learning and motivation (pp. 35–62). New York: Academic Press.

Appendix 1
A drug known to produce violent reactions in humans has been used
for sedating grizzly bears Ursus arctos in Montana, USA, according
to a report in The New York Times.
After one bear, known to be a peaceable animal, killed and ate a
camper in an unprovoked attack, scientists discovered it had been
tranquillized 11 times with phencyclidine, or ‘angel dust’, which
causes hallucinations and sometimes gives the user an irrational feel-
ing of destructive power.
Many wild bears have become ‘garbage junkies’, feeding from
dumps around human developments.
To avoid potentially dangerous clashes between them and
humans, scientists are trying to rehabilitate the animals by drugging
them and releasing them in uninhabited areas.
Although some biologists deny that the mind-altering drug was
responsible for uncharacteristic behaviour of this particular bear, no
research has been done into the effects of giving grizzly bears or
other mammals repeated doses of phencyclidine.

Appendix 2
Table A.1: Sample coding

Links

Bonds
Sentence 1 drug known violent reactions bears
Sentence 2 phencyclidine known bear
Sentence 3 bears
Sentence 4 drugging dangerous
Sentence 5 drug/phencyclidine effects bear

Note: The underlined words are constituents of bonds in Sentence 2 and Sentence 5.
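
The relation between links and bonds in the sample coding can be sketched in code. The sketch assumes, following the discussion's statement that a bond involves a minimum of three lookbacks, that two sentences are bonded when they share at least three links. The item boundaries are one reading of the flattened table, and the grouping of items into repetition classes (for example, treating ‘effects’ as a paraphrase of ‘violent reactions’) is a hypothetical illustration, not the author's coding.

```python
from itertools import combinations

# Lexical items per sentence, transcribed from the sample coding above.
CODED_ITEMS = {
    1: ["drug", "known", "violent reactions", "bears"],
    2: ["phencyclidine", "known", "bear"],
    3: ["bears"],
    4: ["drugging", "dangerous"],
    5: ["drug", "phencyclidine", "effects", "bear"],
}

# Hypothetical repetition classes: items in the same class are treated as
# repetitions, morphological variants, or paraphrases of one another.
REPETITION_CLASS = {
    "drug": "DRUG", "phencyclidine": "DRUG", "drugging": "DRUG",
    "known": "KNOWN",
    "violent reactions": "REACTION", "effects": "REACTION",
    "dangerous": "DANGER",
    "bear": "BEAR", "bears": "BEAR",
}

BOND_THRESHOLD = 3  # assumed minimum number of links needed for a bond


def count_links(items_a, items_b):
    """Count repetition classes shared by two sentences (one link per class)."""
    classes_a = {REPETITION_CLASS[item] for item in items_a}
    classes_b = {REPETITION_CLASS[item] for item in items_b}
    return len(classes_a & classes_b)


def find_bonds(coded, threshold=BOND_THRESHOLD):
    """Return the sentence pairs that share at least `threshold` links."""
    return [
        (a, b)
        for a, b in combinations(sorted(coded), 2)
        if count_links(coded[a], coded[b]) >= threshold
    ]


link_counts = {
    (a, b): count_links(CODED_ITEMS[a], CODED_ITEMS[b])
    for a, b in combinations(sorted(CODED_ITEMS), 2)
}
bonds = find_bonds(CODED_ITEMS)
print("Link counts:", link_counts)
print("Bonded sentence pairs:", bonds)
```

Under these assumed classes, Sentence 1 bonds with Sentences 2 and 5, while the remaining pairs share too few links. A different grouping of paraphrases would change the outcome, which is why the coding scheme matters for scoring.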
