EXPERIMENTAL SYNTAX
OXFORD HANDBOOKS IN LINGUISTICS
...........................................................................................................
Edited by
JON SPROUSE
Great Clarendon Street, Oxford OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© editorial matter and organization Jon Sprouse 2023
© the chapters their several authors 2023
The moral rights of the authors have been asserted
First Edition published in 2023
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2022935323
ISBN 978–0–19–879772–2
DOI: 10.1093/oxfordhb/9780198797722.001.0001
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
Contents
..............................
Preface ix
List of figures and tables xiii
The contributors xvii
PART I
JUDGMENT METHODS IN SYNTACTIC THEORY
1. Acceptability judgments 3
Jon Sprouse
PART II
ACQUISITION METHODS IN SYNTACTIC THEORY
5. Behavioral acquisition methods with infants 137
Laurel Perkins and Jeffrey Lidz
PART IV
NEUROLINGUISTIC METHODS IN SYNTACTIC THEORY
15. Electrophysiological methods 533
Jon Sprouse and Diogo Almeida
Index 665
Preface
........................
The field of syntax has always been interdisciplinary. Part of this is simply the nature of
cognitive science—the immensity of the problem posed by human cognition requires
a concerted effort from multiple disciplines. And part of this is the nature of syntactic
theory: It mediates between sound and meaning, it is a theory of the representations
constructed during sentence-processing, and it is a theory of the end state for language
acquisition. As technology has advanced, so too have the methods that syntacticians
have brought to bear on the central questions of the field. The past two decades in par-
ticular have seen an explosion in the use of various experimental methods for probing
the syntax of human languages. This Handbook is an attempt to bring these strands
of research together into a single volume. I have three goals for this Handbook: (i) to
provide high-level reviews of the experimental work that has been driving the field of
experimental syntax, (ii) to inspire new research that will push the boundaries of the
theory of syntax, and (iii) to provide high-level methodological guidance for researchers
who wish to incorporate experimental methods into their own research. I hope readers
will agree that the contributors to this volume have created chapters that succeed in all
three goals.
For this handbook, I have intentionally defined experimental syntax in the broadest
possible terms—as the use of any (and all) experimental methods in service of syntactic
theory. I am aware that the term experimental syntax is sometimes used in a narrower
sense that is more or less synonymous with formal acceptability judgment experiments
(I have used it that way myself in my own work), but I believe this synonymy is merely
a symptom of the important role that acceptability judgments play in syntactic theory,
and not a meaningful delimiter of the types of methods that syntacticians can profitably
employ in their research. The space of possible methods is large—too large for any single
volume. In assembling this Handbook, I have chosen to focus on methods that are (i)
relatively well-understood, (ii) relatively practical in terms of the equipment required,
and (iii) relatively likely to yield information that is relevant to syntactic theory. All of
these choices are subjective. I do not intend the exclusion of any given method to mean
that it does not, or could not, fall under the broad definition of experimental syntax.
In fact, what I hope this Handbook shows is that this broader definition of experimen-
tal syntax is still in its infancy. We have not yet explored all of the methods that could
potentially contribute to theories of syntax, nor have we seen the full potential of the
methods that we have explored. As such, this handbook is aspirational—it is simulta-
neously a snapshot of the knowledge we have collected to date and a pointer to the kind
of work that will be possible in the future.
I would like to thank the contributors to this volume for their hard work and dedication, both in writing their chapters and in doing the kind
of research that pushes the boundaries of the field. Finally, I would like to thank every-
one who has supported me throughout my career—advisors, collaborators, colleagues,
students, family, and friends. Science is a community effort. And I am grateful beyond
words for the community that I have somehow been given in this life.
List of figures and tables
...............................................................................
Figures
1.1 The two predictions of the 2×2 design for whether-islands (left panel
and center panel), and the observed results of an actual experiment
(right panel) 17
1.2 Three demonstrations of the continuous nature of acceptability
judgments 21
2.1 Scene verification display from the experiment by Kaiser et al. (2009) 43
2.2 Picture selection display from the experiment by Kaiser et al. (2009) 44
3.1 Outcome of example test story from Conroy et al.’s (2009) TVJT task 87
7.1 Model of the acquisition process adapted from Lidz and Gagliardi (2015) 213
11.1 SAT function for one condition, illustrating the three phases of
processing 369
11.2 Idealized differences in the three phases of the SAT functions for two
conditions 370
11.3 Idealized differences in the finishing time distributions corresponding to
the SAT differences shown in Fig. 11.2 370
12.1 We can use surprisal to formulate a linking hypothesis which, taken
together with a probability distribution over sentences, produces
empirical predictions about sentence comprehension difficulty 401
12.2 Since surprisal can act as a test of probability distributions and
probability distributions can be seen as consequences of hypothesized
grammars, surprisal can act as a test of hypothesized grammars 402
12.3 Graphical illustration of lc-predict and lc-connect 431
13.1 Example item from Breen et al. (2010) designed to elicit naturalistic
productions 459
14.1 A sample itemset from Sussman and Sedivy (2003) 494
14.2 Some culturally specific illustrations created for the Chamorro
Psycholinguistics na Project 497
16.1 The BOLD signal 562
Tables
3.1 Percentage of surface scope response for comprehension question 64
3.2 Summary of experimental findings of the language acquisition studies
reviewed in this paper 74
4.1 Experimental paradigm for studying subject preference,
morphologically ergative languages 108
7.1 The qualitative fit Yang discovered between the unambiguous data
advantage (Adv) perceived by a VarLearner in its acquisitional intake
and the observed age of acquisition (AoA) in children for six parameter
values across different languages 236
7.2 Optional infinitive examples in child-produced speech in different
languages, and their intended meaning 236
8.1 Summary of key artificial language learning methods 276
12.1 A first illustration of bottom-up parsing 417
12.2 The effect of center-embedding on bottom-up parsing 420
12.3 The effect of left-embedding on bottom-up parsing 421
12.4 The effect of right-embedding on bottom-up parsing 423
12.5 A first illustration of top-down parsing 424
12.6 The effect of center-embedding on top-down parsing 426
12.7 The effect of left-embedding on top-down parsing 427
12.8 The effect of right-embedding on top-down parsing 428
12.9 A first illustration of left-corner parsing 430
12.10 The effect of center-embedding on left-corner parsing 433
12.11 The effect of left-embedding on left-corner parsing 434
12.12 The effect of right-embedding on left-corner parsing 435
psychology from the University of Pennsylvania. Her research focuses on the processes
and representations involved in comprehension and production, especially in domains
involving multiple aspects of linguistic representation (syntax, semantics, pragmatics),
such as reference resolution. She has investigated multiple languages (e.g. Finnish, Es-
tonian, French, German, and Dutch, including collaborative work on Bangla/Bengali,
Hindi, Italian, Korean, Chinese, and Vietnamese).
Dave Kush is an Assistant Professor of Linguistics at University of Toronto. He is
interested in sentence-processing, syntactic theory, and cross-linguistic variation.
Jeffrey Lidz is Distinguished Scholar-Teacher and Professor of Linguistics at the Uni-
versity of Maryland. His research explores language acquisition from the perspective
of comparative syntax and semantics, focusing on the relative contributions of experi-
ence, extralinguistic cognition, and domain-specific knowledge in learners’ discovery
of linguistic structure and linguistic meaning.
Andrea E. Martin is a Lise Meitner Research Group Leader at the Max Planck Institute
for Psycholinguistics, and a Principal Investigator at the Donders Centre for Cogni-
tive Neuroimaging at Radboud University in Nijmegen, the Netherlands. Her work
has spanned structural and semantic aspects of sentence processing. She has used the
speed–accuracy trade-off procedure and cognitive neuroimaging to study the role of
memory in sentence processing via ellipsis, a line of research begun with Brian McElree,
who pioneered the application of SAT to psycholinguistic issues. The current focus of her
lab, Language and Computation in Neural Systems, is on developing theories and mod-
els of language representation and processing which harness the computational power
of neural oscillations, such that formal properties (viz., constituency, compositionality)
can be realized in biological and artificial neural networks.
William Matchin is an Assistant Professor of Communication Sciences and Disorders
in the Arnold School of Public Health at the University of South Carolina. As part of
the Center for the Study of Aphasia Recovery, he directs the NeuroSyntax lab, using
functional neuroimaging and lesion–symptom mapping and incorporating insights of
linguistic theory to understand the architecture of language in the brain. He is currently
investigating the nature of grammatical deficits in aphasia, including paragrammatism
and agrammatism.
Lisa S. Pearl is a Professor in the Department of Language Science at the University
of California, Irvine. Her research lies at the interface of language development, com-
putation, and information extraction, including both cognitively oriented research and
applied linguistic research that combines theoretical and computational methods. Her
cognitively oriented research focuses on child language acquisition, with a particular
focus on theory evaluation via acquisition-modeling, and how children’s input affects
their linguistic development.
Laurel Perkins is an Assistant Professor in the Department of Linguistics at the Uni-
versity of California, Los Angeles. She earned her PhD in linguistics from the University
Jeffrey Runner is a Professor of Linguistics and Brain & Cognitive Sciences, Dean of
the College, and Vice Provost and University Dean for Undergraduate Education at
the University of Rochester. He earned a BA in linguistics at the University of Califor-
nia, Santa Cruz, in 1989 and a PhD in linguistics at the University of Massachusetts at
Amherst in 1995. He joined the department of Linguistics at the University of Rochester
in 1994. His research uses experimental methodologies to investigate natural language
syntax. In 2017, he became dean of the College in Arts, Sciences and Engineering,
and is responsible for the curricular, co-curricular, and extra-curricular undergraduate
experience.
Jon Sprouse is a Professor of Psychology at New York University Abu Dhabi. He re-
ceived an AB in linguistics from Princeton University (2003) and a PhD in linguistics
from the University of Maryland (2007). His research focuses on the use of experi-
mental syntax techniques, including acceptability judgments, EEG, and computational
modeling, to explore fundamental questions in syntax. He has authored over forty jour-
nal articles and book chapters on experimental syntax. His work has been recognized
by the Best Paper in Language award, the Early Career award, and the C. L. Baker
mid-career award from the Linguistic Society of America.
PART I
JUDGMENT METHODS IN SYNTACTIC THEORY
...................................................................................................
CHAPTER 1
ACCEPTABILITY JUDGMENTS
JON SPROUSE
1.1 Introduction
..........................................................................................................................
Before delving into the primary content of this chapter, I would like to briefly men-
tion a few assumptions (and/or decisions) that I am making. The first is that I assume,
following many working syntacticians, that acceptability judgments are in principle
valuable for the construction and evaluation of syntactic theories. I will, therefore, not
attempt to motivate the use of acceptability judgments in general (see Schütze 1996
for a comprehensive discussion of this). The second is that I will assume a relatively
minimal linking hypothesis between acceptability judgments and the cognitive prop-
erties of sentence processing. Under this linking hypothesis, an acceptability judgment
is a relatively automatic behavioral response that arises when a speaker comprehends
a sentence, and that this behavioral response is impacted by a large number of cog-
nitive factors, such as the grammaticality of the sentence, the processing dynamics of
the sentence, the sentence-processing resources required by the sentence, the meaning
of the sentence, the plausibility of the sentence relative to the real world, and even the
properties of the specific task that is given to the speaker. I believe wholeheartedly that
a more precise linking hypothesis would be helpful for using judgments as evidence in
syntax; however, I also believe that the minimal linking hypothesis above is more than
sufficient to begin to explore the value of formal acceptability judgment experiments
in syntax. My third assumption is that there is no substantive qualitative difference
between “informal” and “formal” judgment experiments. Both are experiments in the
sense that they involve the manipulation of one variable (syntactic structure) to re-
veal a causal relationship with another variable (acceptability). Therefore both involve
all of the components that typify psychology experiments: a set of conditions, a set of
items in each condition, a set of participants, a task for the participants to complete
using the items, and a process for analyzing the results of the task. The difference ap-
pears to me to be primarily quantitative, in that “formal” experiments tend to involve
more conditions, more items per condition, more participants, and more complex anal-
ysis processes. To my mind, the labels “informal” and “formal” simply point toward
different ends of this quantitative spectrum. In practice, when I say that formal ex-
periments are valuable in some way, what I mean is that increasing the number of
conditions, items, or participants, and/or increasing the complexity of the analysis,
can yield insights that fewer conditions, items, participants, and/or less complex anal-
yses cannot. The labels “informal” and “formal” are a more concise way to express
this idea. My fourth assumption is that Schütze 1996 already provides a comprehen-
sive review of experimental syntax work that was published before 1996. Therefore,
in order to provide something new for the field, I will focus here on work published
after 1996.
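The components listed above—conditions, items per condition, participants, a task, and an analysis process—can be made concrete with a small simulation. The sketch below is purely illustrative: the 2×2 factor labels, the "true" condition means, and the counts of participants and items are all hypothetical, chosen only to show how the pieces of a judgment experiment fit together.

```python
import random
import statistics

random.seed(0)

# Hypothetical 2x2 factorial acceptability design (e.g., dependency
# length x structure), 20 participants x 8 items per condition,
# ratings on a 7-point scale. All numbers are invented for illustration.
conditions = ["short|non-island", "long|non-island",
              "short|island", "long|island"]
true_means = {"short|non-island": 6.0, "long|non-island": 5.2,
              "short|island": 5.5, "long|island": 2.8}

def simulate_rating(cond):
    # One rating: the condition's assumed mean plus participant/item
    # noise, clipped to the 1-7 response scale.
    r = random.gauss(true_means[cond], 1.0)
    return min(7.0, max(1.0, r))

# 20 participants x 8 items = 160 simulated ratings per condition.
ratings = {c: [simulate_rating(c) for _ in range(20 * 8)]
           for c in conditions}

for c in conditions:
    print(f"{c:18s} mean rating = {statistics.mean(ratings[c]):.2f}")
```

The descriptive analysis here (condition means) is the simplest possible; a "formal" experiment in the sense used above would typically add inferential statistics over the same structure.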
Finally, this chapter is not a how-to for constructing formal judgment experiments.
The goal is for this to be the chapter one reads, either before or after reading a how-
to, for inspiration about the types of questions one can ask with the method. I will
provide some references for learning acceptability judgment methods in the annotated
bibliography for Part I of this Handbook.
Perhaps the most frequently asked question in the experimental syntax literature is to
what extent the informally collected judgments that have been published in the liter-
ature can be trusted to form the empirical basis of syntactic theory. This question has
arisen since the earliest days of generative grammar (Hill 1961; Spencer 1973); it played
a central role in the two books that ushered in the most recent wave of interest in ex-
perimental syntax (Schütze 1996; Cowart 1997); and it has given rise to a number of
high-level debates in the experimental syntax literature over the past 15 years or so
(see Edelman and Christiansen 2003; Ferreira 2005; Wasow and Arnold 2005; Feather-
ston 2007; Gibson and Fedorenko 2013 for some concerns about informal methods; see
Marantz 2005 and Phillips 2009 for some rebuttals, and Myers 2009 for a proposal that
attempts to split the difference between informally collected judgments and full-scale
formal experiments). The existence of this question is understandable. First, informally
collected judgments form the vast majority of the data points published in the (gener-
ative) syntax literature. Second, the properties of informal collection methods are not
identical to the properties of the formal experimental methods that are often used in
other domains of cognitive science: Informal methods often involve a smaller num-
ber of participants, those participants are often professional linguists instead of naïve
participants, the participants are often presented a smaller number of items, and the re-
sults are often only analyzed descriptively (without inferential statistics). If one believes
that the properties of formal experiments are what they are to ensure the quality of the
data, then it is logically possible that the differences between informal methods and formal
experiments could lead to lower-quality data. The consequences of this cannot be
overstated. If there are systemic problems with informally collected judgments, then
there are likely to be systemic problems with (generative) syntactic theories.
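The small-sample concern—that random variation will exert undue influence on the measured judgment—can be illustrated with a quick simulation. The numbers below (a "true" acceptability of 4.0, participant noise of 1.5 scale points) are hypothetical; the point is only that the spread of the estimated mean shrinks as the sample grows.

```python
import random
import statistics

random.seed(1)

def sample_mean(n):
    # One simulated informal/formal collection: n participants rate a
    # sentence whose assumed "true" acceptability is 4.0 on a 7-point
    # scale, with participant-level noise (sd = 1.5).
    return statistics.mean(random.gauss(4.0, 1.5) for _ in range(n))

# Repeat each "experiment" 1000 times to see how much the estimated
# mean bounces around at each sample size.
small = [sample_mean(5) for _ in range(1000)]
large = [sample_mean(50) for _ in range(1000)]

# The standard deviation of the estimate falls roughly by
# sqrt(50/5) ~ 3.2 as the sample grows from 5 to 50.
print(f"sd of estimated mean, n=5:  {statistics.stdev(small):.3f}")
print(f"sd of estimated mean, n=50: {statistics.stdev(large):.3f}")
```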
This question touches upon a number of issues in psychometrics and the broader
philosophy of measurement. The first question is: What do we mean when we say that
data can be “trusted” to form the basis for a theory? Psychometric theories have iden-
tified a number of properties that good measurement methods should have. Here I will
mention two (and only in a coarse-grained way, setting aside subtypes of these proper-
ties): validity and reliability. A measurement method is valid if it measures the property
it is purported to measure. A measurement method is reliable if it yields consistent re-
sults under repeated measurements (with unchanged conditions). The concerns about
informal methods that have figured most prominently in the literature appear to be a
combination of concerns about validity and reliability, such as the concern that small
sample sizes will lead to an undue influence of random variation, the concern that a
small number of experimental items will lead to an undue influence of lexical proper-
ties, and the concern that the participation of professional linguists will lead to theo-
retical bias. In each case, the concern seems to be that informally collected judgments
will not reflect the true acceptability of the sentence (validity), and furthermore that the
judgments themselves will be inconsistent over repeated measurements (reliability).
This leads to a second question: How does one establish validity for the measurement
of a cognitive property like acceptability? The direct method for establishing validity is
to compare the results of the measurement method with a second, previously validated,
measurement method. This is obviously unavailable for most cognitive properties—if
cognitive scientists had a method to directly measure the cognitive property of interest,
we would not bother with the unvalidated measurement method. That leaves only indi-
rect methods of validation. One indirect method is to ask whether the theory that results
from the data has the properties of a good scientific theory. This, of course, interacts
with broader issues in the philosophy of science about what properties a good theory
would have, so I will not attempt to provide an exhaustive list. But two possible criteria
are: (i) making potentially falsifiable predictions, and (ii) explaining multiple phenom-
ena with a relatively small number of theoretical constructs. In the case of acceptability
judgments, I would argue that the resulting theory of syntax does, indeed, have these
properties. Another indirect method is to ask whether other data types provide corrobo-
rating evidence, modulo the linking theories between the data types and the underlying
theory. In the case of acceptability judgments, we can ask whether the resulting syntac-
tic theory can be linked to a sentence-processing theory in a way that makes potentially
falsifiable predictions about other psycholinguistic measures, such as reading times, eye
movements, or EEG, and ultimately whether these measures corroborate the syntactic
theory. I would argue that the current results in the literature connecting syntactic theo-
ries and sentence-processing theories are promising. That said, indirect methods cannot
guarantee validity. It is logically possible that acceptability judgments could give rise to
a theory that has all of the hallmarks of a good theory, but that does not ultimately
explain human syntax (perhaps the resulting theory is actually about probability, or
plausibility, or even prescriptive grammatical rules).
This leads to the final question: How does one establish reliability? In principle, estab-
lishing reliability is relatively straightforward, as it simply entails replicating the mea-
surement. The exact replication can vary based on the type of reliability one is interested
in: Between-participant (or inter-rater) reliability asks whether the same judgments are
obtained with different sets of participants; within-participant (or test–retest) reliability
asks whether one set of participants will give the same judgments at two different times;
between-task reliability asks whether different judgment tasks will yield the same judg-
ments (either between-participant or within-participant). In practice, establishing the
reliability of informal methods is complicated by their informality. By definition, in-
formal methods control the various properties of the judgment collection process less
strictly than formal methods, making a strict replication difficult if not impossible. One
way to circumvent this problem is to compare the results of informal methods, perhaps
as reported in the syntactic literature, with the results of formal experiments. This would
be a type of between-task reliability for informal and formal methods, and to the extent
that the two sets of results converge, it would establish a kind of reliability for informal
methods. Many of the results reported below test precisely this kind of reliability. But