Noriko Iwashita
To cite this article: Noriko Iwashita (2006) Syntactic Complexity Measures and Their Relation
to Oral Proficiency in Japanese as a Foreign Language, Language Assessment Quarterly: An
International Journal, 3:2, 151-169, DOI: 10.1207/s15434311laq0302_4
The study reported in this article is a part of a large-scale study investigating syntactic
complexity in second language (L2) oral data in commonly taught foreign languages
(English, German, Japanese, and Spanish; Ortega, Iwashita, Rabie, & Norris, in prepa-
ration). In this article, preliminary findings of the analysis of the Japanese data are re-
ported. Syntactic complexity, also referred to as syntactic maturity or the use of a
range of forms with varying degrees of sophistication (Ortega, 2003), has long been of interest
to researchers in L2 writing. In L2 speaking, researchers have examined syntactic com-
plexity in learner speech in the context of pedagogic intervention (e.g., task type, plan-
ning time) and the validation of rating scales. In these studies complexity is examined
using measures commonly employed in L2 writing studies. It is assumed that these
measures are valid and reliable, but few studies explain what syntactic complexity
measures actually examine. The language studied is predominantly English, and little
is known about whether the findings of such studies can be applied to languages that are
typologically different from English. This study examines how syntactic complexity
measures relate to oral proficiency in Japanese as a foreign language. An in-depth anal-
ysis of speech samples from 33 learners of Japanese is presented. The results of the
analysis are compared across proficiency levels and cross-referenced with 3 other pro-
ficiency measures used in the study. As in past studies, the length of T-units and the
number of clauses per T-unit are found to be the best predictors of learner proficiency;
these measures also had a significant linear relation with independent oral proficiency
measures. These results are discussed in light of the notion of syntactic complexity and
the interfaces between second language acquisition and language testing.
in-depth analysis of learner language (e.g., Crookes, 1989; Ellis, 1987; Ortega,
1999; Robinson, 1995; Skehan & Foster, 1999). To investigate these three aspects
of language development, a number of measures have been devised. These mea-
sures have also been widely employed in language testing research, especially in
the validation of holistic evaluation that uses rating scales (for speaking: Halleck,
1995; Iwashita, Brown, McNamara, & O’Hagan, in press; for writing: Cooper,
1976; Flahive & Snow, 1980; Homburg, 1984; Monroe, 1975; Perkins, 1980).
This type of analysis represents an obvious cross-fertilisation between SLA and language
testing research. As the foci of research in SLA and language testing are different, it
remains to be seen whether measures used in SLA to understand the developmen-
tal process of learner language can serve to describe language development in lan-
guage testing research. The study reported in this article is a part of a large-scale
cross-linguistic investigation of syntactic complexity that examined how the vari-
ous syntactic complexity measures typically used in SLA and language testing
studies predict oral proficiency (Ortega et al., in preparation). Whereas four lan-
guages (English, German, Japanese, and Spanish) were studied in the large study,
in this study only Japanese was investigated.
SYNTACTIC COMPLEXITY
¹An index measure is based on a formula that yields a numerical score (e.g., Flahive & Snow, 1980;
Perkins, 1980).
ORAL PROFICIENCY
In the area of oral proficiency, several studies have also employed objective mea-
sures of syntactic complexity and investigated relations between test scores
awarded using rating scales and objective measures of learner discourse. Halleck
(1995) examined the relation between holistic judgments of oral proficiency and
objective measures of syntactic maturity in oral proficiency interviews of 107 stu-
dents of English as a foreign language in China. The comparison was based on
three separate tasks. Halleck found learner performance was different according to
proficiency level and task type. The syntactic complexity measures used in the
study were mean T-unit length, mean error-free T-unit length, and percentage of er-
ror-free T-units. Based on L1 research findings (e.g., Witte & Sodowsky, 1978), it
was assumed that syntactic complexity increases as L2 learners gain experience in
the L2 (Farhady, 1979; Scott & Tucker, 1974). Halleck also discussed the rationale
of using T-unit length as a complexity measure and referred to the studies con-
ducted in languages such as French, German, Spanish, and Arabic as well as Eng-
lish as a second language (ESL).
Ortega et al. (in preparation) examined the relation of a variety of syntactic
complexity measures with three independent proficiency measures (i.e., elicited
imitation [EI] task, institutional version of Test of English as a Foreign Lan-
guage, and self-assessment). The data drawn from 40 learners of ESL at college
level in Japan were analysed using a variety of syntactic complexity measures.
The results showed that a significant difference between the high and low profi-
ciency groups was found for only one of the length measures (i.e., number of
words per T-unit). The three independent proficiency measures had a significant
relation with a few syntactic complexity measures under study, but each inde-
pendent proficiency measure had a significant relation with a different syntactic
complexity measure (e.g., number of clauses and independent clauses per T-unit
for the EI task, number of words per T-unit for the institutional version of Test of
English as a Foreign Language, and number of verb phrases per T-unit with
self-assessment measures).
Iwashita et al. (in press) investigated speaking proficiency in ESL in the con-
text of a larger project to develop a rating scale. Spoken test performances repre-
senting five different tasks and five different proficiency levels (200 perfor-
mances in total) were analysed using a range of measures of grammatical
accuracy and complexity, vocabulary, pronunciation, and fluency. Complexity
measures used in the study were the number of clauses per T-unit (the T-unit
complexity ratio), the ratio of dependent clauses to the total number of clauses
(the dependent clause ratio), the number of verb phrases per T-unit (the verb
phrase ratio), and the mean length of utterance. The first three of these measures
were identified in a review of L2 writing studies by Wolfe-Quintero et al. (1998)
as the measures that best capture grammatical complexity. They have also been
used in studies involving the analysis of learner speech in both pedagogic and
testing contexts (e.g., Iwashita et al., 2001; Skehan & Foster, 1999). Iwashita et
al. found the expected gradient of complexity per proficiency level in only one
of the measures (length of utterance). They expressed concern that writing mea-
sures, particularly ratios, are not always applicable considering the nature of
spoken language (short rather than extended discourse; e.g., Foster et al., 2000).
Although the ratio measures used in the study by Iwashita et al. had been recom-
mended on the basis of previous studies as among the most useful measures of
complexity (Wolfe-Quintero et al., 1998), the volume of clauses and T-units
produced at the higher levels was in sharp contrast to that of the lower levels, but
this difference was cancelled when ratios were used. It is possible that these
measures are useful only with longer stretches of text, such as written discourse.
Other researchers have voiced similar concerns about the use of ratio measures
(Richards, 1987; Vermeer, 2000).
The aim of the Iwashita et al. (2001) study was to validate rating scales, not
syntactic measures. In other words, the syntactic complexity measures used in
the studies were considered valid but were chosen based on empirical studies
that examined written discourse rather than speech samples. In addition, little
explanation was offered as to what these measures specifically examined in rela-
tion to syntactic complexity. The production unit used in the studies investigat-
ing syntactic complexity (e.g., Halleck, 1995; Iwashita et al., in press) was pre-
dominantly the T-unit. Subordination, coordination, and unit length have been
used as indexes of syntactic complexity, but T-unit length seems to be the only
measure found in both written and oral language to discriminate proficiency
levels satisfactorily.
Several studies have taken error frequency into consideration in their analyses
(e.g., Halleck, 1995; Larsen-Freeman, 1983). However, error occurrence indicates
accuracy rather than complexity, as explained by Polio (1997). Research in L2
writing and speaking has focussed on the analysis of data from English and a few
other European languages. In relation to the Japanese language, several studies
have attempted to analyse oral discourse of learners of Japanese as a foreign lan-
guage using T-units (e.g., Harrington, 1986; Kanakubo, Kim, Honda, &
Matsuzaki, 1993; Ishida, 1991; Shimura, 1989). Harrington examined speaking
performances of 14 learners of Japanese using T-units to see whether T-unit analy-
sis is a reliable measure of Japanese as a foreign language oral proficiency. The re-
sults showed that the average T-unit length (the number of words per T-unit) and
average length of error-free T-unit (the number of words per error-free T-unit)
serve in some degree to discriminate between learner levels. However, significant
differences were obtained only after the number of levels was reduced. Harrington
concluded the T-unit measures were of only limited usefulness as an index of oral
proficiency. Tamaru, Yoshioka, and Kimura (1993) used various T-unit measures
(e.g., the number of words per T-unit, the number of clauses per T-unit) to investi-
gate the development of oral proficiency of 6 Japanese-as-a-foreign-language
learners and found significant improvement in terms of the length
and complexity of learner speech over 18 months. Although the result is promising
in terms of the use of T-unit measures and syntactic complexity (measured in terms
of the number of words per T-unit, the number of clauses per T-unit, the number of
words per error-free T-unit, and the number of error-free clauses per T-unit), the
sample size is small and there was no mention of the relation with independent pro-
ficiency measures.
METHODOLOGY
The Data
Data were drawn from oral performances by 33 learners of Japanese at two levels
of proficiency (low N = 13, high N = 20). The low-proficiency learners had com-
pleted three semesters at the time of data collection. The learners in the
high-proficiency group had completed four semesters. The learners were native
speakers of English learning Japanese at the tertiary level in the United States.
There were 17 men and 16 women. The mean age was 23.4 with a standard devia-
tion of 5.24. Most of the students had studied Japanese at high school (M = 4.84
years, SD = 2.49) and had spent some time in Japan (M = 1.23 years, SD = 2.5). De-
tailed biographical information about the participants was collected at the time of
data collection using the questionnaire shown in the appendix.
Three oral narrative story-telling tasks typically used in oral proficiency inter-
views, language classrooms in SLA, and language testing research were chosen to
elicit learner speech. In the first task, learners were asked to retell the story after
viewing Chaplin’s short silent film Alone and Hungry. In the second task, learners
listened to a story in their first language (English) on tape and recounted the story
in Japanese using the accompanying pictures. The last task was divided into three
sections. In each section, students told a story from pictures. These tasks had all
been used in other studies and had proved satisfactory in eliciting a range of lan-
guage from learners (e.g., Ortega, 1999). Learners were given approximately 15
min to tell each story. Three minutes of planning time for the first task and 2 min
for the second and third tasks were given to ensure that learners produced at their
maximum level of complexity. All speech samples were collected in a language
laboratory, audiorecorded, and transcribed using guidelines developed for the
larger study (Ortega et al., in preparation).
To address the research questions and to establish a comparison framework across mea-
sures, three independent criteria were used together with a biodata survey: an EI task, an
oral proficiency interview, and self-assessment.
²These questions are addressed in the large study (Ortega et al., in preparation).
EI tasks have been shown to correlate with other external criterion measures such as
oral proficiency ratings (e.g., Bley-Vroman & Chaudron, 1994; Cartier, 1980; Clark,
1980; Hendricks, Scholz, Spurling, Johnson, & Vandenburg, 1980; Henning, 1983;
Radloff, 1991). The EI
task consisted of 30 sentences varying in length and syntactic complexity. The par-
ticipants were asked to repeat as much of each sentence as they could after listen-
ing to it on tape. The sentences that students were asked to repeat are listed in the
appendix. A tape-mediated oral proficiency test, the Simulated Oral Proficiency
Interview (SOPI) developed by the Center for Applied Linguistics (1992) in
Washington, DC, was administered. All SOPI performances were assessed by ac-
credited SOPI raters. A self-assessment instrument, also developed by the Center
for Applied Linguistics (n.d.), was used.
TABLE 1
Production Units and Syntactic Complexity Measures Used
in the Present Study
past studies, it was assumed that learners combine short, simple sentences into
longer, more complex sentences as their language develops.
There are three types of complexity measures: general complexity, coordina-
tion, and subordination. The general complexity measure (the number of clauses
per T-unit) considers the depth of a T-unit. It is assumed that the more clauses per
unit, the more complex the production. In this study, the degree of coordination
was measured by calculating the number of independent clauses per T-unit and
also the proportion of independent clauses to the total number of clauses. It is as-
sumed that the greater the number of independent clauses per T-unit and the larger
the proportion of independent clauses to the total number of clauses, the greater the
degree of coordination. However, it is not clear whether the instances of coordina-
tion have a trade-off effect on subordination. Coordination is expected to decrease
as learners become more proficient and move toward subordination (Wolfe-
Quintero et al., 1998). Subordination concerns the degree of embedding. It is as-
sumed that more advanced learners will produce more dependent clauses and verb
phrases per T-unit and that the proportion of dependent clauses to the total number
of clauses will increase. It is also expected that learners will use more verb phrases
to reduce clauses as their proficiency develops. In deciding the production unit of
the analysis, we considered using the AS-unit (Foster et al., 2000), but due to the
complex procedure involved in coding AS-units we decided to use the T-unit in-
stead. In coding T-units, we included features of spoken language mentioned in
the Foster et al. study (e.g., independent subclausal units, minor utterances) and
nontarget-like features. More details of our coding are given later.
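The length, coordination, and subordination ratios just described reduce to simple arithmetic over unit counts. The sketch below is illustrative only: the unit counts are invented, whereas in the study they come from hand-coded T-units, clauses, and verb phrases.

```python
# Sketch of the syntactic complexity ratios described above.
# The unit counts below are invented for illustration; in the study
# they come from hand-coded T-units, clauses, and verb phrases.

def complexity_measures(words, t_units, clauses, ind_clauses, dep_clauses, verb_phrases):
    """Return the length, general complexity, coordination, and subordination ratios."""
    return {
        "words_per_t_unit": words / t_units,        # length
        "words_per_clause": words / clauses,        # length
        "clauses_per_t_unit": clauses / t_units,    # general complexity
        "ic_per_t_unit": ind_clauses / t_units,     # coordination
        "ic_per_clause": ind_clauses / clauses,     # coordination
        "dc_per_t_unit": dep_clauses / t_units,     # subordination
        "dc_per_clause": dep_clauses / clauses,     # subordination
        "vp_per_t_unit": verb_phrases / t_units,    # subordination
        "vp_per_clause": verb_phrases / clauses,    # subordination
    }

# Hypothetical speech sample: 120 words, 10 T-units, 15 clauses,
# 2 coordinated independent clauses, 4 dependent clauses, 1 verb phrase.
m = complexity_measures(120, 10, 15, 2, 4, 1)
print(m["words_per_t_unit"], m["clauses_per_t_unit"])  # 12.0 1.5
```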
The definition of each unit and examples are given in the following paragraphs.
All examples are drawn from the data in this study. Prior to coding the data, guide-
lines for coding Japanese data were developed by adopting the guidelines for other
languages in the large study and also by consulting empirical studies on Japanese
in the field.
A word in Japanese is called a tango. The definition of a word in Japanese dif-
fers from that in a language such as English in that what would be a bound
morpheme in English is treated as a word in its own right. For example, 'hanasu'
(speak) is one word, but 'hanas-eru' (be able to speak) consists of two words. In
this study, sentences were divided into words following Harrington (1986) and
Tamaru and Yoshioka (1994). Expressions consisting of more than one word but
learned as a single unit were treated as one word, for example, otoko-no-ko3
(boy), hoka-no (other), onna-no-ko (girl).
Following the definition by Harrington (1986), “A T-unit is defined as a nuclear
sentence with its embedded or related adjuncts” (p. 53). A nuclear sentence can be
as short as a single verb or an adjectival stem plus affix. Examples of T-units are
given in the following:
³“Otoko-no-ko” contains three Japanese words and is translated as “boy” in English, but the literal
meaning of the words is “a male child” (i.e., otoko: male, no: possessive particle, ko: child).
{demo tatoe ookina kazoku de konna semai ie de sunde i te ironna mondai ya yana
koto ga okot temo [dependent clause]} {kitto kore ga hontoo no shiawase da [depend-
ent clause]} to omoi masu. (No. 201)
“But even if a big family lives in such a small house and various problems and
unpleasant things happen, I think this is surely real happiness.”
Verb phrase refers to all nonfinite verbs (i.e., bare infinitives, to infinitives, ger-
unds, and gerundives; Wolfe-Quintero et al., 1998).
{Onaka ga suki [verb phrase]} soo na onna no ko ga panya no mado no tokoro o jitto
mi te i mashi te (No. 202).
“The girl, looking very hungry, was staring at the bakery window.”
The data were coded by two coders, and interrater reliability was calculated as the
percentage of agreement between them. A high level of agreement was
reached for all measures (T-unit, 95%; independent clause, 92%; dependent
clause, 90.8%; verb phrase, 88%).
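Percentage agreement of this kind is simply the share of coding decisions on which the two coders match. A minimal sketch (the coding sequences below are invented):

```python
# Interrater reliability as percentage agreement: the share of coded
# units on which two coders made the same decision. The coding
# sequences below are invented for illustration.

def percent_agreement(coder_a, coder_b):
    """Percentage of units coded identically by both coders."""
    assert len(coder_a) == len(coder_b)
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)

coder_a = ["T", "T", "DC", "IC", "T", "DC", "T", "IC", "T", "DC"]
coder_b = ["T", "T", "DC", "IC", "T", "T",  "T", "IC", "T", "DC"]
print(percent_agreement(coder_a, coder_b))  # 90.0
```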
RESULTS
To answer the first question, we compared the two proficiency groups on the
frequencies of the production units (T-unit, independent clause, dependent clause,
and verb phrase) and word tokens, on the length of T-units and clauses, and on the
three complexity measures. The descriptive statistics and the t-test statistics are
summarised in Tables 2 and 3. As is shown in Table 2, the frequencies of the production
units differed significantly between the high- and low-proficiency groups only for
clauses (both dependent and independent clauses) and word tokens. No sig-
nificant difference was observed for verb phrases and T-units. This means that
high-proficiency learners produced a significantly larger number of smaller units
and words, but when the speech samples were measured with a larger unit (T-unit),
the difference was not significant. Also, the use of verb phrases (nonfinite verbs)
was not significantly different between the two proficiency groups. It should be
noted that the mean frequency of verb phrases in both proficiency groups was very
small compared with other production units.
Second, T-unit and clause lengths were compared between the two proficiency
groups. A significant difference was observed only for the number of words per
T-unit. Although the number of T-units was not significantly different between the
TABLE 2
High–Low Comparison of Production Units
Prof N M SD t df p
TABLE 3
High–Low Comparison of Syntactic Complexity Measures

Measure                       Prof   N    M      SD     t      df   p
Length
  Words/T-unit                High   20   13.45  2.70   2.40   31   0.02
                              Low    13   11.45  1.60
  Words/Clause                High   20   8.06   0.81   0.03   31   0.98
                              Low    13   8.05   0.95
General complexity measure
  Clause/T-unit               High   20   1.67   0.26   2.64   31   0.01
                              Low    13   1.44   0.22
Coordination measure
  IC/T-unit                   High   20   0.20   0.07   2.32   31   0.03
                              Low    13   0.13   0.08
  IC/Clause                   High   20   0.76   0.10  –1.66   31   0.11
                              Low    13   0.82   0.08
Subordination measure
  DC/T-unit                   High   20   0.41   0.23   1.93   31   0.06
                              Low    13   0.27   0.16
  DC/Clause                   High   20   0.24   0.10   1.67   31   0.11
                              Low    13   0.18   0.08
  VP/T-unit                   High   20   0.16   0.11   1.13   31   0.27
                              Low    13   0.13   0.07
  VP/Clause                   High   20   0.10   0.06   0.45   31   0.65
                              Low    13   0.09   0.05
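The df = 31 reported in Table 3 (20 + 13 − 2) is consistent with a pooled-variance two-sample t test; that assumption, which the text does not state explicitly, lets a t value be reproduced from the reported means and standard deviations. For the Words/T-unit row:

```python
import math

# Reproduce the Words/T-unit t value in Table 3 from its reported means
# and SDs, assuming a pooled-variance two-sample t test (an assumption
# inferred from the reported df = 20 + 13 - 2 = 31).
n_hi, m_hi, sd_hi = 20, 13.45, 2.70   # high-proficiency group
n_lo, m_lo, sd_lo = 13, 11.45, 1.60   # low-proficiency group

pooled_var = ((n_hi - 1) * sd_hi**2 + (n_lo - 1) * sd_lo**2) / (n_hi + n_lo - 2)
se = math.sqrt(pooled_var * (1 / n_hi + 1 / n_lo))
t = (m_hi - m_lo) / se
print(round(t, 2))  # 2.4, matching the 2.40 reported in Table 3
```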
To examine the relation of the syntactic complexity measures to the three in-
dependent proficiency measures, Pearson correlations were computed. As is
shown in Table 4, the scores of two of the three proficiency measures (i.e., EI task
and SOPI) were significantly different between the high- and low-proficiency
groups. As is shown in Table 5, only the number of clauses per T-unit and the num-
ber of dependent clauses per T-unit had a significant relation with all three profi-
ciency measures. T-unit length (words/T-unit) was significantly correlated with EI
task and SOPI. The number of dependent clauses per clause was found to have a
significant relation with SOPI only.
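Pearson's r for one measure-score pairing can be computed directly from its definition; the sketch below uses invented values for clauses per T-unit and EI scores, purely for illustration.

```python
import math

# Pearson's r, the statistic used to relate each syntactic complexity
# measure to the independent proficiency scores. The paired values
# below are invented for illustration.

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

clauses_per_t_unit = [1.2, 1.4, 1.5, 1.7, 1.9]   # hypothetical learners
ei_scores = [10, 14, 13, 18, 21]                 # hypothetical EI scores
print(round(pearson_r(clauses_per_t_unit, ei_scores), 3))  # 0.976
```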
Summary of Results
In light of the research questions, T-unit length, the number of clauses per T-unit,
and the number of independent clauses per T-unit were found to predict learner
proficiency significantly better than did other measures. Accordingly, scores
yielded by the EI task
had a significant correlation with all T-unit measures that discriminate proficiency
except one (i.e., the number of verb phrases per T-unit). Similarly, SOPI scores had
a significant relation with several measures, especially subordination measures.
TABLE 4
High–Low Comparison of Independent Proficiency Measures
Prof N M SD t df p
TABLE 5
Relation of SC Measures With Proficiency Measures

Measure          EI Task   SOPI     SA
Length
  Words/T-unit   .346*     .437*    .333
  Words/Clause   –.34      –.013    –.121
Complexity measures
  Clause/T-unit  .491**    .517**   .408*
  IC/T-unit      .400*     .343     .254
  IC/Clause      –.316     –.408    –.348
  DC/Clause      .343      .410*    .315
  DC/T-unit      .393**    .458**   .371*
  VP/Clause      –.023     .261     .07
  VP/T-unit      .112      .393*    .07
TABLE 6
Summary of the Results
Note. IC = independent clause; DC = dependent clause; VP = verb phrase; SOPI = simulated oral
proficiency interview.
The self-assessment measure had a weaker relation with syntactic complexity than
did the EI task and SOPI. A summary of the results is presented in Table 6.
DISCUSSION
In this study, speech samples from three narrative tasks were analysed with a vari-
ety of syntactic measures from empirical studies in an attempt to find the most
valid and reliable measure of syntactic complexity in the context of oral Japanese
language. The findings were similar to those of past studies. The findings of the
study are discussed in the light of syntactic complexity using speaking data and of
the interface between language testing and SLA research.
As reported previously, oral production, as measured by T-unit, extends with
learner proficiency; but when speech is measured by smaller units such as clauses,
differences between proficiency levels are not significant. Although these results
support the findings of past studies, we need to consider the notion of syntactic
complexity in relation to the definitions introduced earlier. Syntactic maturity is
expressed not only by length but also by the range and sophistication of structures,
for example, instances of subordination and coordination. Although no significant
difference was found in the subordination and coordination
measures, when subordination and coordination measures were combined and as-
sessed in terms of general complexity, the number of clauses per T-unit was signifi-
cantly different between the high- and low-proficiency groups. As the number of
dependent clauses shows (Table 2), the high-proficiency group produced a
significantly larger number of dependent clauses than did the low-proficiency group.
However, when the subordination and coordination measures were presented with
the ratio data, the difference was not found to be significant. This may be explained
by the problematic nature of ratio data explained earlier (Iwashita et al., in press;
Richards, 1987; Vermeer, 2000). In this regard, methodological refinement will be
required in future investigation. As for complexity in relation to verb phrases, little
difference was found between high- and low-proficiency learners. Wolfe-Quintero
et al. (1998) noted that as proficiency increases, learners use verb phrases more fre-
quently than clauses, but this was not the case in this study.
Although the EI task and the SOPI were found to have a significant relation with
several syntactic measures that discriminate proficiency well, such as the length of
T-unit and the number of clauses per T-unit, and to have a significant correlation with
all three proficiency measures, the self-assessment measure had little relation to the
syntactic complexity measures. The significant correlation found between SOPI and
the syntactic complexity measures is somewhat surprising. As pointed out by
Iwashita et al. (in press), following Brindley (1998), global rating scales describing
features of “real-life” performance in specific language-use contexts and based on
systematic observation and documentation of language performance are not based
on a theory of L2 learning. In other words, there is no linguistic basis for positing the
proposed hierarchies in those scales (Brindley, 1998). This issue is further explained
by Ortega (2003), who noted that more complex can mean more developed in many dif-
ferent ways, and that the nature of L2 development cannot be sufficiently investigated by
means of global measures alone. Ortega further argued that more complex does not
necessarily mean better and that although progress in a learner’s ability to use lan-
guage may include syntactic complexification, it also entails the development of dis-
course and sociolinguistic repertoires that the language user can adapt appropriately
to particular communication demands. The significant correlation found between
the global proficiency scale (SOPI) and syntactic complexity measures may be due
to the two distinct proficiency levels of learners from whom the data were elicited. If
the data had been collected from a range of proficiencies, the relation between the
global proficiency measures and syntactic complexity measures might not be as lin-
ear as in the study.
The relation between results yielded by the analysis of syntactic complexity and
global proficiency measures can be further discussed in terms of different foci and
approaches in language testing and SLA research. As Bachman (1989) and
Bachman and Cohen (1998) stated, SLA research tries to explain language devel-
opment and describe learner language by focusing on smaller aspects of language
such as syntactic complexity. On the other hand, a significant amount of research
in language testing has been devoted to describing a model of language ability
that can provide a basis for describing and assessing this ability. The fine-tuned
analyses in SLA research do not always reveal distinctions between adjacent proficiency
levels and may not have linear relations with proficiency derived from the use of rating
scales (Larsen-Freeman, 1983; Wolfe-Quintero et al., 1998). Obviously, there is a
CONCLUSIONS
This study reported the preliminary findings of the analysis of the Japanese data in
a large-scale cross-linguistic study. As in past studies, the length of T-units and the
number of clauses per T-unit were found to be the best predictors of learner
proficiency. These measures also had a significant linear relation with independent
oral proficiency measures. These results were discussed in light of syntactic
complexity and the interfaces between SLA and language testing. The study not
only provides further insights into the study of syntactic complexity in oral perfor-
mance but also points to difficulties in the cross-fertilisation of SLA and language
testing research. Finally, it points to the importance of cross-linguistic investigation.
REFERENCES
Bachman, L. F. (1989). Language testing–SLA research interfaces. Annual Review of Applied Linguis-
tics, 9, 193–209.
Bachman, L., & Cohen, A. (1998). Language testing–SLA interfaces: An update. In L. Bachman & A.
Cohen (Eds.), Interfaces between second language acquisition and language testing research (pp.
29–31). Cambridge, England: Cambridge University Press.
Berman, R. A., & Slobin, D. I. (1994). Filtering and packaging in narrative. In R. A. Berman & D. I.
Slobin (Eds.), Relating events in narrative: A crosslinguistic developmental study (pp. 551–554).
Mahwah, NJ: Lawrence Erlbaum Associates, Inc.
Bley-Vroman, R., & Chaudron, C. (1994). Elicited imitation as a measure of second-language compe-
tence. In E. E. Tarone, S. M. Gass, & A. D. Cohen (Eds.), Research methodology in second-language
acquisition (pp. 245–261). Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.
Brindley, G. (1998). Describing language development? Rating scales and SLA. In L. Bachman & A.
Cohen (Eds.), Interfaces between second language acquisition and language testing research (pp.
112–140). Cambridge, England: Cambridge University Press.
Cartier, F. A. (1980). Alternative methods of oral proficiency assessment. In J. R. Frith (Ed.), Mea-
suring spoken language proficiency (pp. 7–14). Washington, DC: Georgetown University Press.
Center for Applied Linguistics. (1992). Japanese Speaking Test. Washington, DC: Author.
Center for Applied Linguistics. (n.d.). Self-Assessment Questionnaire. Washington, DC: Author.
Clark, J. L. D. (1980). Toward a common measure of speaking proficiency. In J. R. Frith (Ed.), Mea-
suring spoken language proficiency (pp. 15–25). Washington, DC: Georgetown University Press.
Cooper, T. C. (1976). Measuring written syntactic patterns of second language learners of German.
Journal of Educational Research, 69, 176–183.
Crookes, G. (1989). Planning and interlanguage variation. Studies in Second Language Acquisition, 11,
367–383.
Ellis, R. (1987). Interlanguage variability in narrative discourse: Style shifting in the use of the past
tense. Studies in Second Language Acquisition, 9, 1–20.
Flahive, D., & Snow, B. G. (1980). The use of objective measures of syntactic complexity in the evalu-
ation of compositions by EFL students. In J. Oller & K. Perkins (Eds.), Research in language testing
(pp. 171–176). Rowley, MA: Newbury House.
Farhady, H. (1979). The disjunctive fallacy between discrete-point and integrative tests. TESOL Quar-
terly, 13, 347–357.
Foster, P., & Skehan, P. (1996). The influence of planning and task type on second language perfor-
mance. Studies in Second Language Acquisition, 18, 299–323.
Foster, P., Tonkyn, A., & Wigglesworth, G. (2000). Measuring spoken language: A unit for all reasons.
Applied Linguistics, 21, 354–374.
Halleck, G. B. (1995). Assessing oral proficiency: A comparison of holistic and objective measures.
The Modern Language Journal, 79, 223–234.
Harrington, M. (1986). The T-unit as a measure of JSL oral proficiency. Descriptive and Applied Lin-
guistics, 19, 49–56.
Hendricks, D., Scholz, G., Spurling, R., Johnson, M., & Vandenburg, L. (1980). Oral proficiency test-
ing in an intensive English language program. In J. W. Oller & K. Perkins (Eds.), Research in lan-
guage testing (pp. 77–90). Rowley, MA: Newbury House.
Henning, G. (1983). Oral proficiency testing: Comparative validities of interview, imitation and com-
pletion methods. Language Learning, 33, 315–332.
Homburg, T. J. (1984). Holistic evaluation of ESL compositions: Can it be validated objectively?
TESOL Quarterly, 18, 87–107.
Hunt, K. W. (1970). Syntactic maturity in school children and adults. Monographs of the Society for Re-
search in Child Development, 35 (Serial No. 134).
Ishida, T. (1991). Learning process in Japanese of French speaking university students. Nihongo
kyooiku: Journal of Japanese Language Teaching, 75, 64–79.
Ishikawa, S. (1995). Objective measurement of low-proficiency EFL narrative writing. Journal of Sec-
ond Language Writing, 4, 51–69.
Iwashita, N., Brown, A., McNamara, T., & O’Hagan, S. (in press). What features of language distin-
guish levels of learner proficiency? Applied Linguistics.
Iwashita, N., McNamara, T., & Elder, C. (2001). Can we predict task difficulty in an oral proficiency
test? Exploring the potential of an information processing approach to task design. Language
Learning, 51, 401–436.
Kanakubo, N., Kim, I., Honda, A., & Matsuzawa, H. (1993). The usage of Japanese in university
classes. Nihongo kyooiku: Journal of Japanese Language Teaching, 80, 74–90.
Larsen-Freeman, D. (1983). Assessing global second language proficiency. In H. W. Seliger & M. H.
Long (Eds.), Classroom oriented research in second language acquisition (pp. 287–304). Rowley,
MA: Newbury House.
Monroe, J. H. (1975). Measuring and enhancing syntactic fluency in French. The French Review, 48,
1023–1031.
Ortega, L. (1999). Planning and focus on form in L2 oral performance. Studies in Second Language Ac-
quisition, 21, 109–148.
Ortega, L. (2003). Syntactic complexity measures and their relationship to L2 proficiency: A research
synthesis of college-level L2 writing. Applied Linguistics, 24, 492–518.
Ortega, L., Iwashita, N., Norris, J., & Rabie, S. (2002, October). An investigation of elicited imitation
tasks in crosslinguistic SLA research. Paper presented at Second Language Research Forum, To-
ronto, Ontario, Canada.
Ortega, L., Iwashita, N., Norris, J., & Rabie, S. (in preparation). A multi-language comparison of syntactic
complexity measures and their relationships to foreign language proficiency. Manuscript in preparation.
Perkins, K. (1980). Using objective methods of attained writing proficiency to discriminate among ho-
listic evaluations. TESOL Quarterly, 14, 61–69.
Polio, C. (1997). Measures of linguistic accuracy in second language writing research. Language
Learning, 47, 101–143.
Radloff, C. F. (1991). Sentence repetition testing for studies of community bilingualism. Arlington, TX:
Summer Institute of Linguistics and the University of Texas at Arlington.
Richards, B. (1987). Type/token ratios: What do they really tell us? Journal of Child Language, 14,
201–209.
Robinson, P. (1995). Task complexity and second language narrative discourse. Language Learning,
45, 99–140.
Scott, M. S., & Tucker, G. R. (1974). Error analysis and English-language strategies of Arab students.
Language Learning, 24, 69–97.
Shimura, A. (1989). Foreigner talk in Japanese as a foreign language. Nihongo kyooiku: Journal of Jap-
anese Language Teaching, 68, 204–215.
Shohamy, E. (1998). How can language testing and SLA benefit from each other? The case of dis-
course. In L. Bachmann & A. Cohen (Eds.), Interfaces between second language acquisition and
language testing research (pp. 156–176). Cambridge, England: Cambridge University Press.
Skehan, P., & Foster, P. (1999). The influence of task structure and processing conditions on narrative
retellings. Language Learning, 49, 93–120.
Tamaru, Y., & Yoshioka, K. (1994). Some problems surrounding the units of analysis for spoken data of
Japanese as a second language. The Language Programs of the International University of Japan
Working Papers, 5, 84–100.
Tamaru, Y., Yoshioka, K., & Kimura, S. (1993). A longitudinal development of sentence structures: A
study of JSL adult learners. Nihongo kyooiku: Journal of Japanese Language Teaching, 81, 43–54.
Vermeer, A. (2000). Coming to grips with lexical richness in spontaneous speech data. Language
Testing, 17, 65–83.
Witte, S. P., & Sodowsky, R. E. (1978, March). Syntactic maturity in the writing of college freshmen.
Paper presented at the Conference on College Composition and Communication, Denver, CO.
(ERIC Document Reproduction Service No. ED 163 460)
Wolfe-Quintero, K., Inagaki, S., & Kim, H. (1998). Second language development in writing: Mea-
sures of fluency, accuracy and complexity. Honolulu: University of Hawai’i, Second Language
Teaching and Curriculum Center.
APPENDIX
Background Information
1. Gender Male/Female
2. Age _______________
3. Country of birth _______________
4. The name of the language course you are studying (e.g., beginners’
Spanish) _______________
5. Amount of time studied (English/German/Japanese/Spanish)
_______________
6. Have you been to the country of the target language? (e.g., Germany, Ja-
pan, Spain, USA) Yes/No
7. If yes, amount of time spent in the country.
_____ weeks
_____ months
_____ years
8. The purpose of the stay (e.g., vacation, work, study at a language school)
_______________
9. Did you study (English/German/Japanese/Spanish) at high school?
Yes/No
If yes, how long? _____ years
Japanese Sentences