
Studies in Hispanic and Lusophone Linguistics 2022; 15(1): 67–109

Steffen Heidinger*
Corpus Data and the Position of Information
Focus in Spanish
https://doi.org/10.1515/shll-2022-2056

Abstract: The syntactic position of information foci is the most vividly discussed
issue in recent literature on focus in Spanish. An interesting aspect of this dis-
cussion is that the diverging views typically correlate with diverging methods:
Authors who rely on their intuitions as native speakers typically assume that
information foci are limited to the final position, while authors using experimental
methods typically argue (based on their experimental data) that information foci
are not limited to the final position. The present paper contributes to this debate by
adding a new data type, namely corpus data. The main empirical finding of our
corpus study is that information foci appear most frequently in final position but
are not limited to the final position. The latter finding is in line with comparable
experimental studies, but the preference for the final position in our corpus data is
not found in all experimental studies. Further, our results challenge the common
view in the introspection-based literature according to which the information focus
needs to be in final position in Spanish. In addition to this empirical contribution
we offer a reflection on the merits of corpus data in this domain of linguistic
research.

Keywords: Spanish, information focus, corpus data, word order

1 Introduction1
This paper contributes to the ongoing debate about the syntactic position of in-
formation focus in Spanish, a language in which sentence form and focus structure

1 I wish to thank two Studies in Hispanic and Lusophone Linguistics referees for their constructive
feedback and helpful comments on an earlier version of this paper. I am also grateful to Vicky
Leonetti and Ramón González Torres who provided feedback on some of the corpus examples. All
remaining errors are my own.

*Corresponding author: Steffen Heidinger, University of Graz, Graz, Austria,
E-mail: steffen.heidinger@uni-graz.at

Open Access. © 2022 Steffen Heidinger, published by De Gruyter. This work is licensed
under the Creative Commons Attribution 4.0 International License.

have a strong impact on each other:2 Sentence form depends to a certain degree on
the focus structure, and the interpretation of a sentence’s focus structure is often
guided by sentence form. Over the last decades, an impressive body of research on
the relation between sentence form and focus structure in Spanish has been
created which has considerably improved our understanding of how focus shapes
sentence form (cf. recent overviews in Hoot and Leal 2020; Hoot et al. 2020; Uth and
García García 2018). However, one of the most basic questions in this domain is still
heavily debated: What is the syntactic position of information focus in Spanish?
At the heart of this debate is the question whether information focus is
limited to the final position (as in (1B)) or can also appear in non-final positions
(as in (1B′)).3

(1) A: Who bought the car?


B: Lo compró [Juan]F.
it bought Juan
B′: [Juan]F lo compró.
Juan it bought
‘Juan bought it.’

While some authors claim that information focus is limited to the sentence final
position (as in (1B)), others have put forward evidence suggesting that information
focus can also appear in non-final positions (as in (1B′)). Interestingly, methodological
choices are a strong predictor of an author’s view in this matter: Experimental
research generally suggests that information focus is not limited to the final
position, while introspection-based studies often assume such a limitation to the
final position. This paper contributes to this debate by adding a new data type,
namely corpus data.
The main empirical claim to be defended in this paper is that the final position
is the preferred position of information focus, but information focus is not limited
to the final position (cf. the presentation of the results in Section 5). The latter
finding agrees with current experimental studies: The results from comparable
experimental studies and from our corpus study accordingly indicate that in both

2 The close relation between sentence form and focus structure is, of course, not a peculiarity of
Spanish (cf. Büring 2009; Drubig 2003; Drubig and Schaffar 2001, and the contributions in Krifka
and Musan 2012b for crosslinguistic evidence). Further, focus structure is not the only level of
information structure that influences sentence form in Spanish (cf. Casielles-Suárez 2004; Feld-
hausen 2016a, 2016b; Fernández Lorences 2010; Hidalgo Downing 2003; Villalba 2011 on top-
icalization and Cruttenden 2006; Frota and Prieto 2015; Hualde and Prieto 2015 on the prosodic
weakening of given constituents).
3 We use non-final as a cover term independently of whether the non-final constituent is in situ or
not.

data types focused subjects are not limited to the sentence final position. However,
the two data types differ in that the preference for the final position in our corpus
data is not found in all experimental studies. Further, our results challenge the
common view in the introspection-based literature according to which the infor-
mation focus needs to be in final position in Spanish. In the discussion of the
empirical results in Section 6 we will first compare our findings to the existing
literature and then highlight some merits of corpus data in the current debate on
the position of information focus in Spanish.
Before that, Section 2 introduces information focus as the central
information-structural category of this paper. In Section 3 we provide an overview
of how information focus shapes sentence form in Spanish; we will pay special
attention to the diverging views that can be found in the literature on the syntactic
position of information focus in Spanish.

2 Functional Characteristics of Information Focus


The focus-background partition (focus structure in this paper) is one of the levels of
information structure. In line with Rooth (1985, 1992), Krifka (2007: 18) defines
focus as follows: “Focus indicates the presence of alternatives that are relevant for
the interpretation of linguistic expressions.” Thus, in the example in (2) a new car is
focus because, for this part of the sentence, alternatives that are relevant for the
interpretation of the sentence exist: the focus a new car specifies that among all the
things that John might have bought, he actually bought a new car.

(2) (Context: What did John buy?)


John bought [a new car]IF.
The idea that focus interpretation is linked to alternatives is widespread (cf. Rooth
1985, 1992, 2016; further Krifka 2001, 2006, 2007; Krifka and Musan 2012a; Onea
and von Heusinger 2009; von Stechow 1991; Zimmermann and Onea 2011). This
success of an alternative-based concept of focus follows – according to Matić and
Wedgwood (2013: 155) – from the fact that alternatives are relevant to various
phenomena, such as new information in question-answer pairs or overt contrast.
The existence of alternatives is linked to assertion and, for that matter, to a
fundamental communicative function. An assertion only has value for a communicative
act if there are alternatives to what is asserted. As a consequence, the
existence of alternatives is a fundamental aspect of what determines the
communicative relevance of an assertion (cf. Matić and Wedgwood 2013: 154).

Several types of focus can be distinguished in terms of various parameters. As
for the size of the focus, one can distinguish between sentence focus, VP
focus, and narrow focus, as in (3).

(3) a. (Context: What happened?)                 sentence focus
       [John bought a new car]IF.
    b. (Context: What did John do yesterday?)    VP focus
       He [bought a new car]IF.
    c. (Context: What did John buy?)             narrow (argument) focus
       He bought [a new car]IF.

A further distinction concerns the types of focus based on the relation that the
focused constituent has to its context. A new car in (4a) clearly has a different
relation to the context than a new car has in (4b). While the focus in (4b) contrasts
with an element of the preceding context, no such relation holds in (4a): the focus
just contributes new information to the discourse (in this case, information that is
explicitly requested in the preceding question). Based on these distinct relations
with the preceding context, information focus as in (4a) and corrective focus as in
(4b) are distinguished.

(4) a. (Context: What did John buy?) information focus


He bought [a new car]IF.
b. (Context: John bought a house, right?) corrective focus
No, he bought [a new car]CF.

Within Rooth’s (1985, 1992) Alternative Semantics, the distinction between infor-
mation and corrective focus can be stated in terms of the size or properties of the
alternative set, i.e., the set of alternatives to the focused constituent. In the case of
the corrective focus in (4b), the set of alternatives for the focus a new car is a
closed set and the alternatives are explicitly given; the alternative set consists of
one element only, namely a house. In the case of the information focus in (4a), the
set of alternatives is an open set, which may contain a house, a dog, a bike, etc.
Cruschina (2012, 2021), however, argues that this binary distinction between two
types of focus is not sufficient and proposes a four-way distinction with exhaustive
focus and mirative focus as additional types (cf. (5)). Note that these additional
types also rely on an alternative-based definition of focus (cf. also Repp [2010: 1335]
on the relation between different types of focus and the alternative set).
(5) a. information focus: a contextually open set (only pragmatically
       delimited) (= (4a))
    b. exhaustive focus: exhaustive identification or the exclusion by
       identification with respect to a set of alternatives
    c. mirative focus: the proposition asserted is more unlikely or unexpected
       with respect to the alternative propositions4
    d. corrective focus: correction of explicitly given alternatives (= (4b))
    (Cruschina 2021: 5; modified)

The identification of and distinction between the first three types is not trivial, since all
three types can appear in answers to wh-questions. We assume that information focus
is the most neutral focus type in that – besides the specification of a variable – it does
not involve any semantic or pragmatic components. Hence, we analyze foci in an-
swers to wh-questions as information foci as long as there are no contextual cues that
suggest otherwise (cf. Section 4.1 on coding decisions in the corpus study).

3 How Focus Shapes Sentence Form in Spanish


3.1 Information Focus and Main Stress in Spanish

Spanish is one of many languages for which it has been claimed that the main
stress needs to be within the focus (e.g., Zubizarreta 2016: 166). In (6) we see a
general representation of this condition, which we henceforth refer to as the focus
prominence rule (FPR). If the main stress does not fall within the focus, the
respective sentence is ungrammatical or pragmatically infelicitous (as in (6b)).

(6) a. [… XX …]F XX = position of the main stress


b. #[…]F … XX …
(Bosque and Gutiérrez-Rexach 2009: 681)
Focus structure would thus shape sentence form in that one prosodic aspect of sen-
tence form (position of main stress) would be determined (fully or partly) by focus
structure. In the case of narrow focus, the position of the main stress would be fully
determined by focus structure: The main stress must fall on the narrowly focused
constituents. In the case of wide focus, focus structure would still limit the possible
sentence forms since the main stress needs to be on one of the focused constituents.
There are several formulations which capture the idea that in a well-formed
sentence, the main prosodic prominence must be within the focus. An early

4 An example for a mirative focus (adapted from Cruschina 2021: 6) is given in (i):
(i) They had told us that we would only see zebras and lions and instead we saw [a tiger]F
yesterday at the zoo.

formulation can be found in Jackendoff (1972) who not only states that the main
prominence must be within the focus, but also claims that stress rules determine to
which element of the focus the main prominence will be assigned: “If a phrase P is
chosen as the focus of a sentence S, the highest stress in S will be on the syllable of P
that is assigned highest stress by the regular stress rules” (Jackendoff 1972: 237). A
shorter version of Jackendoff’s formulation can be found in Truckenbrodt (1995: 152),
who leaves aside the question of to which element of the focus the main prominence
will be assigned. The following quotes from Zubizarreta (1998, 2016) show that single
authors may provide slightly different formulations of the focus prominence rule,
highlighting different aspects of the relation between prosodic prominence and focus.
Focus prominence rule.
Given two sister categories Ci (marked [+F]) and Cj (marked [−F]), Ci is more
prominent than Cj. (Zubizarreta 1998: 21)
Focus prosody correspondence principle.
The F-marked constituent of a phrase must contain the rhythmically most
prominent word of that phrase. (Zubizarreta 1998: 38)
The focused constituent must contain the rhythmically most prominent word,
i.e., the word that bears the Nuclear Stress (NS). (Zubizarreta 2016: 166)

Within optimality theory (Prince and Smolensky 1993) the focus prominence rule
has a corresponding constraint which again can be formulated in different ways.
Gutiérrez-Bravo’s (2006) constraint states that focus is most prominent and regu-
lates the relation between a focus feature in the input and the main stress in the
output. Samek-Lodovici’s (2005) constraint captures in addition the idea that the
focus must be most prominent within its focus domain (typically a sentence).
Gabriel’s (2010) formulation states that the focused phrase bears the nuclear stress.
FOCUSPROMINENCE: Focus is most prominent.
(Gutiérrez-Bravo 2006: 111)
STRESS-FOCUS: For any XPf and YP in the focus domain of XPf, XPf is prosodically
more prominent than YP.
(Samek-Lodovici 2005: 696)
STRESSFOC: [XP]F bears nuclear stress.
(Gabriel 2010: 203)

Note that in the above formulations of the FPR prominence is referred to as stress,
prosodically more prominent or rhythmically most prominent word. In the following
the term main stress is used to refer to prominence.
As to Spanish, the language of interest of this paper, most authors assume the
validity of the FPR (diverging statements will be discussed below). The following
example from Bosque and Gutiérrez-Rexach (2009: 682) illustrates the assumed
validity of the FPR in Spanish. The answer in (7) is judged pragmatically infelici-
tous by the authors, because the main stress is outside the focus.

(7) (Context: What did Pepín do?)


# [Llegó tarde]F PEPÍN.
arrived late Pepín
‘Pepín arrived late.’
(Bosque and Gutiérrez-Rexach 2009: 682)

Examples for the claim that the FPR holds for Spanish, which we have already seen
above, include Zubizarreta’s (1998) formulation of the FPR, which she extends to
Spanish, Gabriel’s (2010) STRESSFOC constraint and Bosque and Gutiérrez-Rexach’s
(2009: 681) representation in (6). Further, Olarrea (2012: 606) states that in Spanish
“[…] the highest syntactic node marked as focus must dominate the constituent
that contains the Nuclear Stress […].” Feldhausen and Vanrell (2014: 124–125) and
Heidinger (2015: 118) propose constraint rankings for Spanish, in which STRESSFOCUS
is dominated by no other conflicting constraint. This suggests that there is indeed a
close relation between focus and prosodic prominence in Spanish.
Nevertheless, there are also findings in the literature which suggest that ex-
ceptions to the FPR may occur in Spanish. Feldhausen and Vanrell (2015) present
data on the prosodic realization of Spanish clefts, where the main stress does not
fall within the focus.5 It must be kept in mind, however, that the cases of a
mismatch between focus and main stress reported by Feldhausen and Vanrell
(2015) concern cleft sentences and not simple (i.e., non-cleft) sentences. Data on
the violation of the FPR in simple declaratives are presented in Calhoun et al.
(2018). The authors conducted a production experiment on the sentence form
of sentences with intransitive verbs. A striking result of their prosodic analysis of
the participants’ responses is that the main stress quite frequently does not fall
within the narrow focus of the sentence (Calhoun et al. 2018: 18). Such violations
have the sentence form and focus structure as in (8): The focused subject is in
initial position, but the main stress falls on the sentence final verb.

(8) [subject]F-VERB

5 An example from Feldhausen and Vanrell (2015: 15; adapted) for such a mismatch in Spanish
cleft sentences is given in (i) where the main stress falls on vecino.

(i) No, fue [el coche]F lo que María trajo a su VECINO.
    no it.was the car that María brought her neighbor
    ‘No, it was the car that María brought her neighbor.’

The general view in the literature maintains that Spanish is one of the languages
for which the FPR correctly characterizes the relation between main stress and
focus structure.

3.2 The Position of Information Focus in Spanish

Spanish is a language with a relatively free word order, but word order is strongly
influenced by information structure. Specific syntactic constructions such as
fronting, movement to final position or clefting strongly interact with focus
structure. The most vividly discussed issue in recent literature on focus in Spanish
is the syntactic position of information focus, and specifically in simple declarative
sentences. Basically, there are two opposing viewpoints: A first group of authors
holds that in Spanish information focus needs to appear in sentence final position
(cf. Büring 2009; Büring and Gutiérrez-Bravo 2001; Cruschina 2021; Escandell-Vidal
and Leonetti 2019; Fábregas 2016; Feldhausen and Vanrell 2015; Gutiérrez-Bravo
2002, 2008; Helfrich and Pöll 2011; Leonetti 2014; Martín Butragueño 2005; Revert
Sanz 2001; Rodríguez Ramalle 2005; Zubizarreta 1998, 1999). A second group of
authors shows that information focus is not limited to the sentence final position,
but can also appear in non-final positions (cf. Brunetti 2009; Calhoun et al. 2018;
Domínguez and Arche 2014; Gabriel 2007, 2010; Gupton 2017; Hertel 2003; Hei-
dinger 2013, 2014, 2018a, 2018b; Hoot 2012, 2016; Hoot and Leal 2020; Hoot et al.
2020; Jiménez-Fernández 2015a, 2015b; Leal et al. 2018; Muntendam 2013; Olarrea
2012; Ortiz López 2009; RAE and ASALE 2009; Roggia 2018; Uth 2014; Vanrell and
Fernández-Soriano 2013, 2018). To illustrate the difference between the two
groups, let us take a look at the data in (9). While for the first group of authors only
(9B) would be grammatical/felicitous, the second group assumes that non-final
information focus as in (9B′) is also possible.

(9) A: What does María buy at the kiosk?


B: María compra en el kiosco [el diario]F.
María buys at the kiosk the newspaper
B′: María compra [el diario]F en el kiosco.
María buys the newspaper at the kiosk
‘María buys the newspaper at the kiosk.’
(Gabriel 2010: 213)

There are several subtypes of non-final positions that have been attested for in-
formation focus in Spanish: postverbal prefinal (cf. (9B′)), preverbal in situ
(cf. (10B)), and preverbal fronted (cf. (11B)).

(10) A: Who bought the newspaper?


B: [Juan]F compró el periódico.
Juan bought the newspaper
‘Juan bought the newspaper.’
(Olarrea 2012: 605)

(11) A: What did the sailor give to the old man?


B: [La carta]F le dio el marinero al viejo.
the letter him gave the sailor to.the old.man
‘The sailor gave the letter to the old man.’
(Vanrell and Fernández-Soriano 2013: 261)

The reason why information focus might be drawn to the final position has
to do with the focus prominence rule that we saw in Section 3.1. Zubizarreta
(1998) distinguishes two types of main stress (nuclear accents in her terms): A
neutral accent that is used in the case of information focus and an emphatic one
which is used in the case of contrastive focus. Crucially, according to Zubi-
zarreta (1998, 1999), the neutral accent has to appear in sentence final position
and the nuclear accent also has to fall within the focus. As a consequence of
these two constraints, which we label STRESSFOCUS and RIGHTMOSTSTRESS, infor-
mation focus needs to be in final position. The limitation to the final position
implies deviations from the unmarked word order whenever a constituent that
has a non-final unmarked position is the information focus (cf. (12B)). Since this
type of word order variation is prosodically motivated, Zubizarreta (1998: 124)
labels it as p-movement (prosodically motivated movement). In the literature
which also considers non-final information focus as an option, the non-final
information foci are commonly analyzed as cases where RIGHTMOSTSTRESS is
outranked by a constraint which penalizes movement (e.g., STAY; cf. Grimshaw
[1997: 374]).
(12) A: What does María buy at the kiosk?
B: María compra en el kiosco [el diario]F.
María buys at the kiosk the newspaper
RIGHTMOSTSTRESS >> STAY
B′: María compra [el diario]F en el kiosco.
María buys the newspaper at the kiosk
‘María buys the newspaper at the kiosk.’
 STAY >> RIGHTMOSTSTRESS
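
The interaction of the two rankings in (12) can be made explicit with a small evaluation sketch. It is only an illustration under simplifying assumptions, not an implementation taken from the cited literature: STRESSFOCUS is treated as undominated (in line with the rankings mentioned in Section 3.1), STAY is counted as one violation per displaced constituent, and the candidate labelled C (unmarked order with final stress outside the focus) is a hypothetical competitor added for completeness. All identifiers in the code are our own.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Candidate:
    label: str               # name of the candidate, as in example (12)
    words: Tuple[str, ...]   # constituents in linear order
    focus: int               # index of the focused constituent
    stress: int              # index of the constituent bearing main stress
    moved: int               # constituents displaced from their unmarked position


Constraint = Callable[[Candidate], int]


def stress_focus(c: Candidate) -> int:
    """STRESSFOCUS: one violation if main stress falls outside the focus."""
    return 0 if c.stress == c.focus else 1


def rightmost_stress(c: Candidate) -> int:
    """RIGHTMOSTSTRESS: one violation if main stress is not sentence final."""
    return 0 if c.stress == len(c.words) - 1 else 1


def stay(c: Candidate) -> int:
    """STAY: one violation per displaced constituent (simplifying assumption)."""
    return c.moved


def optimal(candidates: Sequence[Candidate], ranking: Sequence[Constraint]) -> Candidate:
    # Strict domination amounts to comparing violation profiles lexicographically.
    return min(candidates, key=lambda c: tuple(con(c) for con in ranking))


CANDIDATES = [
    Candidate("B", ("María", "compra", "en el kiosco", "el diario"), focus=3, stress=3, moved=1),
    Candidate("B'", ("María", "compra", "el diario", "en el kiosco"), focus=2, stress=2, moved=0),
    # Hypothetical competitor: unmarked order, main stress on the final non-focused phrase.
    Candidate("C", ("María", "compra", "el diario", "en el kiosco"), focus=2, stress=3, moved=0),
]

final_grammar = [stress_focus, rightmost_stress, stay]      # RIGHTMOSTSTRESS >> STAY
nonfinal_grammar = [stress_focus, stay, rightmost_stress]   # STAY >> RIGHTMOSTSTRESS

print(optimal(CANDIDATES, final_grammar).label)     # prints B
print(optimal(CANDIDATES, nonfinal_grammar).label)  # prints B'
```

Under these assumptions, reranking STAY and RIGHTMOSTSTRESS is sufficient to switch the winner between final and prefinal information focus, while candidate C loses under both rankings because it violates the top-ranked STRESSFOCUS.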
An interesting aspect of the debate about the position of information focus in
Spanish is that the diverging views typically correlate with diverging methods (cf.
Table 1).

Table 1: Position of information focus in Spanish (methods and viewpoints).

Only final
  Introspection: Büring 2009; Büring and Gutiérrez-Bravo 2001; Cruschina 2021;
    Escandell-Vidal and Leonetti 2019; Fábregas 2016; Gutiérrez-Bravo 2002, 2008;
    Helfrich and Pöll 2011; Leonetti 2014; Martín Butragueño 2005; Revert Sanz
    2001; Rodríguez Ramalle 2005; Zubizarreta 1998, 1999
  Experiment: Feldhausen and Vanrell 2015
Final and non-final
  Introspection: Olarrea 2012; RAE and ASALE 2009
  Experiment: Brunetti 2009; Calhoun et al. 2018; Domínguez and Arche 2014;
    Gabriel 2007, 2010; Gupton 2017; Heidinger 2013, 2014, 2018a, 2018b; Hertel
    2003; Hoot 2012, 2016; Hoot and Leal 2020; Hoot et al. 2020;
    Jiménez-Fernández 2015a, 2015b; Leal et al. 2018; Muntendam 2013; Ortiz
    López 2009; Roggia 2018; Uth 2014; Vanrell and Fernández-Soriano 2013, 2018

Authors who rely on their intuitions as native speakers (or at least do not
mention any other source of data) typically assume that information focus is
limited to the final position. Authors using experimental methods typically argue
based on their experimental data that information focus is not limited to the final
position;6 the noteworthy exception is Feldhausen and Vanrell (2015) who have
not found non-final information foci in simple declaratives (nevertheless their data
show considerable variation in the expression of information focus, especially
with regard to clefts).
Most of the experimental studies cited in Table 1 have in common that they put
forward evidence for non-final information foci. Besides, it must be noted that
these studies are a rather heterogeneous group. This concerns the variety of
Spanish that is studied (the list covers several varieties of European and American

6 Studies on English have shown that in the case of core grammatical phenomena introspection
and experimental data typically align (Sprouse and Almeida 2012). The discrepancy in Table 1
clearly shows that such an alignment does not generally hold for the pragmatics-syntax interface
in Spanish (cf. also Linzen and Oseki 2018 who report differences between the acceptability
judgements of authors and the acceptability judgments collected in experiments in Hebrew and
Japanese).

Table 2: Focused direct objects in the context of a locative adjunct
(Gabriel [2010: …]; our own calculation).

Order Frequency (%)

S-V-[dO]F-LOC .
S-V-LOC-[dO]F .
Total .

Table 3: Focused direct objects in the context of an indirect object
(Gabriel [2010: …]; our own calculation).

Order Frequency (%)

S-V-[dO]F-iO .
S-V-iO-[dO]F .
Total .

Spanish),7 and the methods that are applied in the experiments. We find pro-
duction tasks (both written and spoken), forced choice tasks (with written and
auditory stimuli), judgment tasks (with written and auditory stimuli), and self-
paced reading. However, the experiments also differ with respect to the preferred
position of information focus. While some studies detect a preference for the
sentence final position (e.g., Heidinger 2014), others show a preference for non-
final positions (e.g., Gabriel 2010). Many differences among the results found in
experimental studies can be explained by the slightly varying stimuli. For
example, focused subjects are preferred in preverbal position when the verb is
transitive with a lexical object, or intransitive unergative, but they are preferred in
final position when the verb is intransitive unaccusative or transitive with a clitic
object (cf. the acceptability judgment tasks in Hoot 2012, 2016 and Gupton 2017;
represented in (13)).

(13) Graded acceptability judgments


a. [S]F-V-dO > V-dO-[S]F (Hoot 2012, 2016)
b. Cl.V-[S]F > [S]F-Cl.V (Gupton 2017)

Sometimes even comparable structures yield different preferences. This can be
illustrated with data on the order of postverbal constituents in oral production
experiments. In Gabriel (2010) we find data on the frequency of final or prefinal

7 Note that we find speakers from both American and European varieties also among the authors
who rely on introspection-based data and assume that information focus is limited to the final
position.

Table 4: Focused direct objects in the context of a depictive
(Heidinger 2014: …).

Order Frequency (%)

S-V-[dO]F -DEP .


S-V-DEP-[dO]F .
Total .

focused direct objects in the context of a locative adjunct or an indirect object. As
Tables 2 and 3 show, the focused direct object appears more often in prefinal than
in final position.
Heidinger (2014) also presents data from a production study. Those results
indicate that the information focus appears more often in final than in prefinal
position (cf. Table 4). Even in cases where the focused constituent has a
prefinal unmarked position (e.g., [dO]F in the context of DEP), the information
focus is more often in final than in prefinal position.
Although the frequency of non-final positions may vary among the different
experimental studies, they suggest that information focus is not limited to the
sentence final position in Spanish. However, the validity of these experimental
data has been challenged. Uth (2014) raises the question whether participants
indeed assume an open alternative set with no further interpretational features
when answering questions in production tasks. She argues that the experimental
design used in production tasks, as in Gabriel (2007, 2010), Vanrell and Fernández-
Soriano (2013, 2018) or Heidinger (2014), invites an interpretation of the commu-
nicative situation in which the answers go beyond neutral information focus, and
also include pragmatic/semantic components such as exhaustivity or obvious-
ness: The pictures that the participants are asked about already contain the asked
information and therefore participants might answer as if the imagined questioner
already knows the answer (which is an unnatural situation for question-answer
pairs). Uth’s (2014) concerns are shared by Escandell-Vidal and Leonetti (2019)
who call into question the validity of experimental data with non-final information
focus in Spanish. In addition to Uth’s (2014) general critique of the design of picture
tasks in production experiments, Escandell-Vidal and Leonetti (2019: 206) also
argue that information-packed questions such as ¿Quién le dio el diario a su
hermano? ‘Who gave the newspaper to his/her brother?’ suggest that the questioner is
highly informed about the situation in question. Further, they argue that the use of
written stimuli in some experimental studies leaves the position of the main stress
unclear (cf. Escandell-Vidal and Leonetti 2019: 204). Their final concern is that
these experiments generally deal with rather unnatural structures since the most
natural way to answer a wh-question is supposedly a fragment only containing the
focus (cf. (14)).

(14) A: Who bought the car?


B: [John.]F

According to Escandell-Vidal and Leonetti (2019) participants are therefore confronted
with stimuli that they would not produce in a natural communicative
situation (either because participants are urged to use full sentences as answers in
production studies or because the stimuli in acceptability judgment tasks or forced
choice tasks are full sentences); according to Escandell-Vidal and Leonetti (2019: 207)
this again calls into question the validity of the data collected with such methods.
In the controversy outlined above, corpus data have played a very minor role
so far. Crucially, we are not aware of a corpus study which systematically inves-
tigated the expression of information focus in Spanish.8 Nonetheless, we find
some data relevant to the present discussion in corpus studies of specific con-
structions such as focus fronting or focused constituents in preverbal positions.
Silva-Corvalán (1984), who investigated fronting in spoken language corpora,
presents examples of fronted information foci. In (15) and (16) the foci correspond
to a wh-word in the preceding question.

(15) A: ¿Pero qué tratamiento le dan a la presión baja,


What treatment they.give to the pressure low
fuera del café con cognac?
besides of.the coffee with cognac
‘What do they give against low blood pressure besides
coffee with cognac?’
B: [Effortil]F me dieron a mí.
Effortil me they.gave to me
‘They gave me effortil.’
(Silva-Corvalán 1984: 13; modified)

(16) A: ¿Y cuántas inyecciones te pusiste?


and how.many injections you gave
‘And how many injections did you give yourself?’
B: [Dos]F parece que me puse.
two seems that me gave
‘It seems that I gave myself two.’
(Silva-Corvalán 1984: 13; modified)

8 At the last proofing stage of this paper, an unpublished doctoral dissertation has come to our
attention: Cassarà (2021). Although we cannot do justice to this work at this late stage, it shall at
least be mentioned since it does include a systematic analysis of Spanish corpus data with respect
to subject focus.

Brunetti (2009) also looks at the discourse function of fronted constituents and
finds cases where the fronted focus is not contrastive, but answers an implicit (cf.
(17)) or explicit wh-question (cf. (18)).

(17) BEA: No está mal tener actividades de ocio […]


‘It’s not bad to have leisure activities.’
VIT: Sí, como el aerobic, por ejemplo.
‘Yes, like aerobics, for instance.’
BEA: Que se nos acaba. Tendremos que buscarnos otra cosa, no? […]
‘which is about to end. We’ll have to look for something else, don’t
you think?’
Sí que tendremos que buscar algún sitio … a mí sí que apetece
seguir …
‘We definitely should look for some place … I do want to continue
…’
[Ir a nadar]F me=gustaría.
go:INF to swim:INF CL= please:3.SG.PRS.COND
‘I would like to go SWIMMING.’
(C-Oral-Rom, cited after Brunetti 2009: 60)

(18) ALM: Y ahora mismo, cuál es la que menos oposición tiene por parte de
la gente?
‘And right now, which is the one that encounters less opposition
by the people?’
JAV: Yo no sé cuál será, probablemente [la energía solar]F
I not know which will.be probably the energy solar
será la que menos oposición tenga.
will.be the.one that less opposition has
‘I don’t know which one; probably solar energy will encounter less
opposition.’
(C-Oral-Rom, cited after Brunetti 2009: 60)

Finally, Hülsmann (2019) analyzes the discourse function (topic and focus status)
of sentence initial constituents in spoken European Spanish (data from C-Oral-Rom).
Besides contrastive and mirative foci in initial position, he also finds cases of
information foci, as in (19). In this example, the focus el año pasado corresponds to
the wh-word cuándo of the preceding question.

(19) A: ¿Y cuándo lo h[izo]?


and when it he/she.did
‘And when did he/she do it?’
B: ¡Ah!, pues [el año paSAdo]F fue.


ah well the year passed it.was
‘Ah, well, it was the last year.’
(C-Oral-Rom, cited after Hülsmann (2019: 136))

To sum up these findings, several types of data support the assumption that in-
formation focus is not limited to the final position in Spanish:9 introspection of
authors (cf. Olarrea 2012), various types of experimental studies (see list in Table 1),
and sporadic corpus data (cf. Brunetti 2009; Hülsmann 2019; Silva-Corvalán 1984).
Since many authors maintain the view that information focus is limited to the final
position in Spanish, the controversy has yet to be resolved.

9 Although the debate on the position of information foci in Spanish is mainly concerned with
non-complex declaratives it should be noted that cleft sentences appear in the context of infor-
mation foci as well. The three subtypes of clefts which are commonly distinguished, i.e., clefts
(sensu stricto), pseudo clefts and inverted pseudo clefts, are also attested in Spanish. As concerns
the expression of information foci, Moreno Cabrera (1999: 4299) states that clefts (sensu stricto) are
not an option and are limited to contrastive contexts. However, experimental data from Gabriel
(2007, 2010), Uth (2014), and Feldhausen and Vanrell (2015) suggest that all three types can be
used with information foci (cf. (i)).

(i) a. Fue [Blancanieves]F quien lo entregó aquí. (Uth 2014: 100)


it.was snow white who it.put here
‘It was snow white who put it here.’
b. Quien habla es [Juan]F. (Moreno Cabrera 1999: 4296)
who speaks is Juan
‘It is Juan who speaks.’
c. [Aruma]F es quien le da el diario al hermano. (Uth 2014: 99)
Aruma is who him gives the newspaper to.the brother
‘It is Aruma who gives the newspaper to the brother.’

In addition to these three types, some American varieties have a construction with focalizing ser
(ser focalizador). In this construction the focus is separated from the background via the inflected
verb ser (cf. Bosque 1999; Camacho 2006; Méndez Vallejo 2009, 2015; Mora-Bustos 2009; Sedano
2003; Zubizarreta 2014). In Bosque (1999) and Mora-Bustos (2009) this construction is not analyzed
as a subtype of clefts, but as a non-complex sentence where ser functions as a focus marker. As
shown in (ii) this construction can also be used for information foci.

(ii) A: What did John study?


B: Juan estudió fue [lingüística]F.
Juan studied was linguistics
‘John studied linguistics.’
(Méndez Vallejo 2009: 10–11; modified)

4 Empirical Study
4.1 Method and Data Basis

4.1.1 Preliminary Remarks

In this paper we investigate the syntactic position of information focus in European
Spanish based on corpus data. The main motivation for looking at corpus data is
that they constitute an important source of linguistic evidence, but have so far been
underrepresented in the research on the position of information focus in
Spanish, and we expect novel insights from considering this data type. The lack of
corpus-based studies in the recent discussion could be explained by the fact that
focus and background are information structural categories which are difficult to
annotate in corpus data (unlike, for example, the information status (given vs.
new) of constituents, which can be derived rather easily from the previous
context).
As shown in Section 3.2, while corpus studies are not totally absent in the
literature on focus in Spanish, they take specific syntactic constructions as a
starting point, such as fronting or clefts (e.g., Brunetti 2009; Hülsmann 2019; Silva-
Corvalán 1984 on fronting). These corpus studies also provide valuable insights
into the information structural properties of these constructions (e.g., can fronted
constituents be information focus or are they always contrastive?). Since they go
from form to function, they cannot provide the full picture on how information
focus is expressed in Spanish and how often the various formal means are used. In
our study, we start from the information structural function information focus and
then search for its formal expression, mainly in terms of the syntactic position of
the focus within the sentence. Hence, before we can determine the syntactic po-
sition of the information focus in a given sentence, we need to know whether the
sentence contains an information focus and, if so, which part of the sentence is the
information focus. Only once this is settled can the syntactic position of the
information focus be analyzed.
As mentioned above, the first methodological challenge we face is how to
identify information focus in text corpora (cf. Lüdeling et al. 2016 on information
structure annotations in corpus data). A common heuristic in the study of information
focus is to look at question-answer pairs, as in (20), where the question
contains a wh-word and the part of the answer which corresponds to the wh-word
is the information focus (cf. The Questionnaire on Information Structure, QUIS
[Götze et al. 2007]; El Zarka and Heidinger 2014; Skopeteas 2012; van der Wal 2014).

(20) A: Who bought the car?


B: [John]F.
B′: [John]F did.
Bʺ: [John]F bought it.
etc.

This heuristic gives rise to the question whether all foci in answers to wh-questions
are in fact information focus. Taking the open alternative set as the defining
property of information focus, wh-questions, such as (21), which are followed or
preceded by a choice among the alternatives do not elicit information focus.10

(21) E1: A ver, y la boda quién la pagaba, ¿los padres o …?
to see and the wedding who it paid the parents or
‘Let’s see, and the wedding, who paid for it? The parents or …?’
I1: Pos la paga- | la boda la pagaban los padres.
Well it paid the wedding it paid the parents
‘Well, the wedding, the parents paid for it.’
(COSER: Higueruela, COSER-0211_01)

Another issue is whether information focus is limited to cases without any special
interpretative effect such as exhaustivity or mirativity (cf. Cruschina 2012, 2019 for
the latter, and further Uth and García García [2018: 11–12] and Escandell-Vidal and
Leonetti 2019). In order to have an operationalizable criterion, we take the focus in
answers to wh-questions where the alternative set is not explicitly restricted by the
preceding context to be information focus.
Once the information focus of the sentence is identified, the second step
consists in determining the syntactic position of the information focus. This aspect
will be detailed below when we discuss the coding decisions applied in this study
(Section 4.1.4). Before that we introduce the text corpus (Section 4.1.2) and describe
how relevant data was extracted from this corpus (Section 4.1.3).

4.1.2 The Corpus COSER

As shown above, a suitable text corpus for the study of information focus should
contain dialogical structures with a large number of wh-questions and the answers
to these questions. A text corpus for European Spanish which has such a large
number of dialogical structures and wh-questions is the Corpus Oral y Sonoro del
Español Rural (Fernández-Ordóñez 2005–), henceforth referred to as COSER.

10 In our presentation of data from the corpus COSER we kept the abbreviations E (interviewer)
and I (informant/interviewee).

Besides the large number of dialogical structures with wh-questions, COSER has the
advantage that these dialogues are authentic, because they are not simulated
spoken language as in dramatic plays.
The main characteristics of COSER are the following (cf.
http://www.corpusrural.es/descripcion.php; version March 5 2020): The corpus consists of
spontaneous speech for which members of the COSER project (typically students)
interviewed speakers of European Spanish on various topics of their everyday
lives. At the point of data extraction from the corpus (March 2020) 2,684 informants
(47.6% male and 52.3% female) had been interviewed, amounting to 1,851 h of recorded
speech (the interviews started in 1990 and are still continuing). The informants for
the recordings came from 1,383 different rural villages covering all territories of
European Spanish (for a detailed map see http://www.corpusrural.es/map.php;
version March 5 2020). So far, 193 interviews have been transcribed by the COSER
project, corresponding to 258 h of recorded speech and a searchable corpus of
5,045,684 words. It is this corpus we have searched in this study. Since the main
goal of COSER is to provide a corpus for dialects of European Spanish, the research
group compiling the corpus chose the informants according to criteria of traditional
dialectology: speakers from rural areas, if possible elderly persons (the average
age of all 2,684 informants in COSER is 73.4 years), with low institutional education,
natives of the place where they are interviewed. The interviews focused on topics of
traditional country life with the goal of giving the informant some informative
authority over the young urban interviewer (cf. Fernández-Ordóñez 2011: 174). This
latter aspect is relevant for the present study because it creates a natural setting for
dialogical structures with question-answer pairs.

4.1.3 Data Extraction

In order to extract relevant data from COSER, we limit our interest to information
focus on subjects.11 This choice is motivated by the fact that the position of focused
subjects in Spanish is a highly debated issue with diverging results in experimental
and introspection-based studies. With most verbs, subjects have a preverbal
unmarked position. Focus-induced word order variation could be visible even in
rather reduced answers consisting only of the verb and the subject. Further, the
wh-word quién allows for relatively accurate searches in the corpus.

11 Note that in our corpus study the terms focused subject or subject focus are not limited to foci
which have the syntactic function subject (as John in [John]F bought the car). Instead, we consider
all constituents that specify the variable of a question in which the interrogative pronoun is the
subject (e.g., Who bought the car?). This allows us to include, for example, focus-only sentences
and verbless sentences where the syntactic function of the focused constituent is not obvious.

We searched COSER for the form quién in the simple search (consulta básica
and búsqueda exacta). The 1,324 hits of this search contained both lower case quién
and upper case Quién. Unfortunately, not all hits are relevant for the study and
relevant results had to be manually separated from irrelevant ones. First, we dis-
carded cases where quién is not the wh-word in an interrogative sentence with the
grammatical role subject (e.g., con quién, a quién). Second, we excluded cases
where the question is either a yes-no question or an open question that is preceded
or followed by some restriction of the alternative set. In (21), for example, la boda
quién la pagaba ‘the wedding, who paid for it’ is an open question, but the
alternative set is restricted by the follow up los padres ‘the parents’. Such cases
were excluded since the focus in the answer does not meet our definition of in-
formation focus: Information focus is the most neutral focus type in that – besides
the specification of a variable – it does not involve any semantic or pragmatic
components. Hence, we analyze foci in answers to wh-questions as information
foci as long as there are no contextual cues that strongly suggest contrast,
exhaustivity or mirativity. We left out questions that are copula sentences (¿Quién
era alcalde? ‘Who was the mayor?’) and only considered questions that displayed
real dialog structures, i.e., we excluded cases with echo questions that were
caused by mishearing during the interviews, questions during reported speech,
rhetorical questions or questions that were answered by the same person who
posed the question. We only considered questions that were answered by the
informants and we did not consider the rare cases of questions that were answered
by the interviewers. Finally, we ignored cases where the question remains unan-
swered either because the informant states that he/she does not know the answer
or because the informant continues talking about another topic.
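
The selection criteria described above can be summarized as a checklist. The following Python sketch is purely illustrative: the selection in this study was carried out manually, and the record type `QuienHit` with all of its field names is a hypothetical representation of the manual decisions, not an annotation layer provided by COSER.

```python
from dataclasses import dataclass


@dataclass
class QuienHit:
    """One corpus hit for 'quién'; every field is a hypothetical manual judgment."""
    quien_is_subject: bool         # False for 'con quién', 'a quién', etc.
    is_open_wh_question: bool      # False for yes-no questions
    alternatives_restricted: bool  # True if the context restricts the alternative set, as in (21)
    is_copula_question: bool       # True for questions such as '¿Quién era alcalde?'
    is_real_dialog: bool           # False for echo, reported, rhetorical or self-answered questions
    answered_by_informant: bool    # False if the interviewer answers or no answer is given


def is_relevant(hit: QuienHit) -> bool:
    """Return True if the hit passes all exclusion criteria listed in this section."""
    return (
        hit.quien_is_subject
        and hit.is_open_wh_question
        and not hit.alternatives_restricted
        and not hit.is_copula_question
        and hit.is_real_dialog
        and hit.answered_by_informant
    )


# A question like the one in (21) is excluded: the follow-up restricts the alternatives.
example_21 = QuienHit(True, True, True, False, True, True)
print(is_relevant(example_21))
```

Expressing each criterion as a separate boolean keeps the exclusion reasons inspectable, which would make it possible to report how many hits were discarded for each reason.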

4.1.4 Data Annotation

After the manual selection according to the above criteria we obtain a total of 258
relevant quién-questions. The answers to these questions were further annotated
and, in a first step, classified as focus only versus focus + background depending on
whether the answer only consists of the focus or also contains background ma-
terial. Note that discourse markers, such as pues, bueno, hombre, were ignored.
Hence, in (22) the answer is classified as focus only despite the presence of various
discourse markers in addition to the focus el marido ‘the husband’.

(22) E1: ¿Y quién se encargaba de dar de comer a las vacas?


and who REFL took.care of give to eat to the cows
‘And who took care of feeding the cows?’
I1: Bueno, pues [el marido]F, hombre.


good well the husband man
‘Well, the husband.’
(COSER: Narros del Puerto, COSER-0614_01; modified)

The answers with focus and background fall into several subclasses: Cleft sen-
tences (23), copula sentences with ser (24), existential sentences with haber (25),
sentences with a lexical verb (26), and verbless sentences (27).

(23) E: ¿Quién solía ser el que se encargaba?


who used be the.one that REFL took.care
‘Who was the one who used to take care?’
I: El que más se encargaba era [uno que
the.one that most REFL took.care was one that
 estaba de sacristán]F, pero a ese señor necesitaba
 was of verger but to this man needed
 que lo ayudaran.
 that him helped
 ‘The one who most took care was one who was a verger, but he needed
help.’
 (COSER: Villaconejos de Trabaque, COSER-1636_02; modified)

(24) E4:  Y, y, y ¿quién ha construido las casas que hay


 and and and who has built the houses that exist
 en este pueblo?
 in this village
 ‘And who built the houses in this village?’
I1: Esta era | esto era [la yeguada militá]F …
that was that was the stud.farm military
‘That was the military stud farm.’
(COSER: San José de Malcocinado (Medina-Sidonia),
COSER-1116_01; modified)

(25) E1: ¿Y quién, quién ayudaba entonces?


and who who helped back.then
‘And who helped back then?’
I1: Pues había [una comadrona]F.
well had a midwife
‘Well, there was a midwife.’
(COSER: Huércanos, COSER-2506_01; modified)

(26) E2: ¿Quién compraba el traje de la novia?


who bought the dress of the fiancée
‘Who bought the fiancée’s dress?’
I2: […] Los trajes de la novia los compraba [el novio]F.
the dresses of the fiancée them bought the fiancé
‘The fiancé bought the fiancée’s dress.’
(COSER: Cifuentes de Rueda (Gradefes), COSER-2606_01; modified)

(27) E2: ¿Quién hizo el vestido?


who made the dress
‘Who made the dress?’
I1:  El vestido …, […] [una modista de aquí]F.
 The dress a dressmaker from here
 ‘The dress, a dressmaker from around here.’
 (COSER: Miranda de Arga, COSER-3222_01; modified)

Although we consider all of the above types in our quantitative analysis, our main
interest is the analysis of answers with a lexical verb. These answers are further
annotated with respect to the syntactic position of the subject, and we distinguish
between a postverbal final position (cf. (26) above) and a preverbal position (cf. (28)).

(28) E1: ¿Quién iba a por la leña?


who went PREP for the firewood
‘Who went for the firewood?’
I2: Pos [mi marido]F la traía.
well my husband it brought
‘Well my husband brought it.’
(COSER: Bacares, COSER-0404_01; modified)
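
The position coding can be sketched as follows. This is a minimal illustration under the assumption that each answer is represented as an ordered list of constituent labels with discourse markers already removed; the labels `focus`, `verb` and `other` and the function name `focus_position` are hypothetical and not part of the annotation used in this study.

```python
def focus_position(constituents: list[str]) -> str:
    """Code the focus position from an ordered list of constituent labels.

    Each label is one of 'focus', 'verb' or 'other' (hypothetical labels).
    """
    focus = constituents.index("focus")
    verb = constituents.index("verb")
    if focus < verb:
        return "preverbal"
    if focus == len(constituents) - 1:
        return "final"
    # Postverbal but not last in the list; this is not a category used in the paper.
    return "other"


# (28) Pos [mi marido]F la traía: focus, clitic, verb
print(focus_position(["focus", "other", "verb"]))
# (26) Los trajes de la novia los compraba [el novio]F
print(focus_position(["other", "other", "verb", "focus"]))
```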

Finally, the informants sometimes responded to a given question in several ways.
In (29), for example, there is a succession of a focus-only answer followed by an
answer with a lexical verb and the focus in final position. In such cases we have
only considered the first answer and would thus count this case as a focus-only
answer.

(29) E:  ¿Y quién les pagaba?


 and who them paid
 ‘And who paid them?’
I:  [El dueño]F. No, el dueño no, no, no los pagaba [nadie]F.
 the owner no the owner not no no them paid nobody
 ‘The owner. No, the owner did not pay them. Nobody did.’
 (COSER: Navalmoral de la Mata, COSER-1015_01; modified)
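
The answer categories and the first-answer rule can be expressed compactly. The sketch below is illustrative only: the enumeration `AnswerType` and the function `code_response` are hypothetical names, and the two coded responses simply restate examples (29) and (26) from this section.

```python
from collections import Counter
from enum import Enum


class AnswerType(Enum):
    FOCUS_ONLY = "focus only"
    LEXICAL_VERB = "declarative with lexical verb"
    HABER = "haber existential"
    SER = "ser copula sentence"
    CLEFT = "cleft sentence"
    VERBLESS = "verbless sentence"


def code_response(answers: list[AnswerType]) -> AnswerType:
    """Apply the first-answer rule: only the first answer to a question is counted."""
    if not answers:
        raise ValueError("unanswered questions are excluded before annotation")
    return answers[0]


# (29): a focus-only answer followed by an answer with a lexical verb.
example_29 = [AnswerType.FOCUS_ONLY, AnswerType.LEXICAL_VERB]
# (26): a single answer with a lexical verb.
example_26 = [AnswerType.LEXICAL_VERB]

counts = Counter(code_response(r) for r in (example_29, example_26))
for answer_type, n in counts.items():
    print(answer_type.value, n)
```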

Table : Frequency of focus-only answers and focus + background


answers.

abs. %

Focus only  .


Focus + background  .
Total  .

5 Results
The corpus search in COSER has yielded a total of 258 relevant examples,
i.e., examples in which the information focus corresponds to the subject wh-word
quién of a preceding open question. We describe these 258 examples with respect to
their syntactic and information structural properties in three consecutive steps.
Firstly, we look at how many answers are focus-only answers and how many
answers contain both focus and background. Secondly, we further classify those
answers which have both a focus and a background. Finally, we analyze the 58
answers formed with a lexical verb and look at the syntactic position of the focus in
terms of sentence final and non-final positions.
We begin with the distinction between focus-only answers and answers which contain
both the focus and background material. Table 5 shows that the majority of
wh-questions about the subject are answered with focus-only answers. Examples of
focus-only answers are given in (30)–(32).12

(30) E1: ¿Y quién encendía el horno?


and who turned.on the oven
‘And who turned on the oven?’

12 Additionally, we also categorized as focus only the following case where the question that is
answered is actually Who paid whose clothes? (given that in the context of a wedding the clothes of
two persons are relevant). The subject focus is thus one of two narrow information foci in the
answer of I1.

(i) E1: […] ¿Quién pagaba el vestido? ¿Quién …?


who paid the dress who
‘Who paid the dress? Who …?’
I1: Pos [cada uno]F [el suyo]F.
well every one the his/hers
‘Well, everyone paid for himself.’
(COSER, Porzuna, COSER-1417_01; modified)

I1:  [Mi padre]F.


 my father
 ‘My father.’
 (COSER, Menagarai (Ayala), COSER-0109_02; modified)

(31) E1: ¿Quién le hizo el traje?


who him made the suit
‘Who made his suit?’
I:   Pos [una muchacha que cosía aquí mu[y] bien]F.
  well a woman that sewed here very well
  ‘Well, a woman from here who sewed very well.’
  (COSER, Algar, COSER-1102_01; modified)

(32) E: Y, ¿quién la hacía? [la =meal at wedding]


and who it made
‘And who made it?’
I:  Pues [las madres y señoras que sabían guisar
 well the mothers and ladies who knew cook
 muy bien]F.
 very well
 ‘Well, the mothers and the ladies who cooked very well.’
 (COSER, Cubillejo del Sitio (Molina de Aragón), COSER-1907_01;
modified)

Regarding the answers which include both the focus and background material,
Table 5 shows that such answers are less frequent than the focus-only answers. In order
to further classify the answers with both focus and background, we use the
following categories: declarative sentences with a lexical verb, existential con-
structions with haber, copula constructions with ser, clefts and, finally, verbless
sentences. The absolute and relative frequencies of these categories are given in
Table 6.

Table : Types of answers with focus and background.

abs. %

Declaratives with lexical verb  .


haber existentials  .
ser copula sentences  .
Cleft sentences  .
Verbless sentences  .
Total  .

The most frequent type of answers are sentences which contain a lexical verb
in addition to the focused subject (and possibly some further backgrounded
constituents). They make up more than half of the cases with both a focus and a
background. Example (33) shows that the focused subject los padres ‘the parents’ is
in sentence final position. The sentence further contains the verb pagar ‘pay’ in its
compound past form, the pronoun esto ‘that’, the adverb siempre ‘always’ and the
discourse marker pues ‘well’. The decisive property for this class is, however, the
presence of a lexical verb; below we further analyze the members of this class with
respect to the syntactic position of the focus.

(33) E1: Claro. ¿Y quién pagaba eso?


of.course and who paid that
‘Of course. And who paid for that?’
I3: Pues esto siempre ha pagao [los padres]F.
well that always have paid the parents
‘Well, the parents have always paid for that.’
(COSER, Luzuriaga (San Millán/Donemiliaga), COSER-0107_01;
modified)

The second type of answers with both focus and background are constructions
with haber ‘have’ (or rather ‘there is/was’). They make up about 10% of the answers
with both focus and background. For the most part, these impersonal uses of haber
show the focus in final position, as in (34). In only one out of 10 examples with
haber does the focus appear in preverbal position. This example is shown in (35), with
the focus being split up by the verb haber and the locative aquí.

(34) E1: ¿Y quién ayudaba al cura a celebrar la misa?


and who helped the priest PREP celebrate the mass
‘And who helped the priest celebrating the mass?’
I2: Siempre había [un monaguillo]F.
always had a altar.boy
‘There was always an altar boy.’
(COSER, Bacares, COSER-0404_01; modified)

(35) E1: ¿Y quién, y quién ayudaba entonces?


and who and who helped then
‘And who helped back then?’
I2: Pues, [una mujer]F había aquí [que, que lo ha | lo hac-]F.
well a woman had here who who it have it did
‘Well, there was a woman here who did it.’
(COSER, Ledantes (Vega de Liébana), COSER-1212_01; modified)

Another type of answers with both focus and background is formed with the copula
verb ser ‘be’. An example of this construction is given in (36). As the verb’s
agreement in person and number shows, the focus is also the subject of the copula
verb.

(36) E1: Y, ¿cómo | quién se ocupaba de eso?


And how who REFL cared of that
‘And who took care of that?’
I1:  Ay. Eso más bien eran [las señoras, la mujer]F.
 Well that rather were the women the wife
 ‘Well, that were rather the women, the wife.’
 (COSER, Narros del Puerto, COSER-0614_01; modified)

As in the case of haber, the focus in the ser-construction is typically in postverbal
position (although the absolute numbers are rather low: four postverbal vs. one
preverbal). The only preverbal focus in a construction with ser deviates from the
other examples in that the copula sentence is a subordinate clause (cf. (37)).

(37) E3: ¿Quién decidía cuando se gastaba?
Who decided when REFL spent.money
‘Who decided when money was spent?’
I3: […]pues yo creo que [los dos]F serán, ¿no?
Well I think that the two will.be no
‘Well, I think both of them do.’
(COSER, Luzuriaga (San Millán/Donemiliaga), COSER-0107_01;
modified)

A further interesting point of difference among the ser-constructions is whether the
complement of the copula verb is overtly expressed or not. While in (36) the
pronoun eso, referring back to quién se ocupaba de eso is the complement of the
copula verb, no such overt complement can be found for the copula verbs in (37)
and (38). The anaphoric interpretation of the null complement is nevertheless the
same as in (36), namely referring back to the preceding question.13 With regard
to actual cleft sentences, our corpus study suggests that they are quite infrequent
in the expression of information focus in spoken European Spanish. Among all 258
relevant examples only one is a cleft sentence (cf. (39), repeated from Section 4.1).
It is striking that in this case not only the answer, but also the question is formu-
lated as a cleft sentence.

13 With their anaphoric reference to the preceding question the ser-constructions somewhat
resemble wh-clefts, or at least can be easily transformed into wh-clefts once we substitute the
anaphoric complement of the copula sentence by its antecedent, i.e., the wh-question.

(38) E: ¿Pero quién viene?


But who comes
‘But who comes?’
I: Son [los de aquí, de los pueblos vecinos]F, […]
are the.ones from here from the villages neighboring
‘Those from the neighboring villages.’
(COSER, Almajano, COSER-3901_01; modified)

(39) E: ¿Quién solía ser el que se encargaba?


Who used be the.one that REFL took.care
‘Who was the one who used to take care?’
I:  El que más se encargaba era [uno que
 the.one that most REFL took.care was one that
 estaba de sacristán]F, pero a ese señor necesitaba
 was of verger but to this man needed
 que lo ayudaran.
 that him helped
 ‘The one who most took care was one who was a verger, but he
needed help.’
 (COSER: Villaconejos de Trabaque, COSER-1636_02; modified)

The last category shown in Table 6 is that of verbless sentences. In these cases, the
background is typically marked as such by dislocation or detachment. Still, the
members of this class show interesting variations. In some verbless sentences, we
see the same anaphoric reference back to the question as discussed above for the
copula sentences (cf. eso ‘that’ in (40)). Other types of background material are
frame setters such as antes ‘in the former days’ in (41) or frequency adverb(ial)s
such as normalmente ‘normally’ in (42).

(40) E3: ¿Cómo se mantenía una huerta antes?


How REFL maintained a orchard before
¿Quién se encargaba?
Who REFL took.care
‘How was an orchard maintained back then? Who took care?’
I1: Eso, [el marido]F. […]
that the husband
‘The husband.’
(COSER, Miranda de Arga, COSER-3222_01; modified)

(41) E1: Y antes, ¿quién podía cazar?


and before who could hunt
‘And before, who was allowed to hunt?’

Table : Focus position in declaratives with lexical verb.

abs. %

Preverbal  .
Final  .
Total  .

I: Antes [todos]F.
before all
‘Before, everybody.’
(COSER, Valle de Cerrato, COSER-3426_01; modified)

(42) E1: ¿Quién hacía para sacar las vacas o …?


who made for take.out the cows or
‘Who got the cows out?’
I:  Pues, normalmente, [los hombres]F.
 well normally the men
 ‘Well, usually, the men.’
 (COSER, Azcona (Valle de Yerri), COSER-3203_01; modified)

We now return to the first subclass in Table 6: declarative sentences which contain
a lexical verb in addition to the focused subject (and possibly some further
backgrounded constituents). We further analyze this subclass because it is most
relevant for the ongoing debate about the syntactic position of information focus in
Spanish (cf. Section 3.2). The corpus search yielded a total of 58 such examples. We
distinguish two syntactic positions in the linear order of constituents: preverbal
and final. As shown in Table 7, these two positions differ considerably in terms of
frequency. The final position is by far the more frequent position in our data: in two
thirds of the cases the information focus is in final position. However, the preverbal
position is attested in a considerable number of cases (one third). The results thus
show that the final position is the preferred position for information focus, but at
the same time information focus is not limited to this position.14
We conclude our presentation of the main results with data exemplifying the
two positions. Starting at the beginning of the sentence, the examples in (43) and
(44) show cases where the information focus is in preverbal position. Examples for
the more frequent sentence final information focus are given in (45) and (46). An

14 Bear in mind that these results are about subjects only (which typically have a non-final in situ
position). In the case of constituents with a final in situ position (e.g., objects in the absence of
sentence final adjuncts) we can expect an even higher share of sentence final foci.

in-depth comparison of our empirical results with the existing literature and a
reflection on the merits of corpus data in this debate follow in Section 6.

(43) E1: Y, ¿quién iba a por tamujo?


and who went PREP for tamujo
‘And who went to get tamujo?’
I2: Pues [mi marido]F iba y mis hijos.
well my husband went and my sons
‘Well, my husband went and my boys.’
(COSER, Madrigal de las Altas Torres, COSER-0609_01; modified)

(44) E1: ¿Y quién, quién sacaba los | las ovejas?


and who who take.out the the sheep
‘And who took the sheep out?’
I: Ah, [mis hermanos]F solían andar, […]
well my brothers used go
‘Well, my brothers used to go.’
(COSER, Leitza, COSER-3214_01; modified)

(45) E1: ¿Quién te trae el periódico a casa? […]


who you bring the newspaper to home
‘Who delivers the newspaper to your home?’
I:  […] Nos deja [el …, el señor de los periódicos]F,
us leaves the the man of the newspapers
 nos deja en el portal.
 us leaves at the portal
 ‘The man of the newspapers brings them. He leaves it at the portal.’
 (COSER, Ermua, COSER-4503_01; modified)

(46) E1: ¿Y quién mandaba todo eso?


and who ruled all that
‘And who ruled all that?’
I1: En el pueblo mandaban [tres o cuatro]F:
in the village ruled three or four
el cura, el médico, secretario y, y. […]
the priest the doctor the secretary and and
‘In the village there were three or four who gave orders: the priest,
the doctor, the secretary and …’
(COSER, Lizartza, COSER-2005_01; modified)

6 Discussion
6.1 Comparison of Results with Literature on Other Data Types

The results presented in Section 5 are based on a new data type in the debate on the
position of information focus. An obvious question to address is therefore how our
results fit with the existing literature. We first compare our results with the
introspection-based literature and then move on to a comparison with experi-
mental studies. Note that we will concentrate on declaratives with a lexical verb
and leave aside focus-only answers and other strategies that we have found in our
corpus data.
Recall from Section 3.2 that authors who rely on introspection fall into two
groups as to the syntactic position of information focus in simple declaratives:
Most authors assume that the information focus has to be in final position, while
only a small number of authors mention that the information focus can also
appear in a non-final position (cf. Table 1). Since we found a considerable number
of examples of the information focus in preverbal, i.e., a non-final, position, our
results align with the minority and not the majority view from the introspection-
based literature. Our results thus challenge the common view in the introspection-
based literature according to which the information focus needs to be in final
position in Spanish. Nonetheless, we did find the final position to be the most
frequent position for information focus in declaratives. Hence, the syntactic po-
sition that is considered the only position by the vast majority of introspection-
based authors is the preferred position in our corpus data.15 This is, however, just a
preference and non-final information foci do occur in our corpus data.
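
The figure given in footnote 15 can be checked with a short tally. The sketch below only restates counts reported in Section 5 and in footnote 15; it assumes that the postverbal foci in the ser-constructions and the focus in the single cleft are counted as final, which is what the total in the footnote implies.

```python
from fractions import Fraction

# (final, total) per construction, as reported in Section 5 and footnote 15.
# Assumption: postverbal foci in ser-constructions count as final.
counts = {
    "haber existentials": (9, 10),   # one out of 10 examples is preverbal
    "ser copula sentences": (4, 5),  # four postverbal vs. one preverbal
    "cleft sentences": (1, 1),       # the only attested cleft, (39)
}

final = sum(f for f, _ in counts.values())
total = sum(t for _, t in counts.values())
share = Fraction(final, total)

print(f"{final} of {total} foci are final ({float(share):.1%})")
# prints "14 of 16 foci are final (87.5%)"
```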

15 If we consider not only non-complex declaratives with a lexical verb but also the ser- and
haber-constructions and the only attested cleft, the tendency to put the information focus in final
position is even stronger; in these three constructions 14 out of 16 cases show the focus in
final position.

We now turn to the comparison of our results with the experimental literature.
Recall from Section 3.2 that the experimental studies on the position of information
focus have shown that the information focus is not limited to the final position in
Spanish. In this very general sense, our results are in line with the experimental
literature since we also found many instances of non-final information focus. For
more specific comparisons with the experimental literature we must carefully
consider what exactly is compared. For example, it has been shown in several
studies that in declarative sentences with a backgrounded lexical direct object (as
part of the core sentence) the preferred position of the focused subject is the
preverbal position (cf. Gabriel 2010; Hoot 2012, 2016). In our corpus data, however,
sentences with a backgrounded lexical argument of any kind as part of the core
sentence are very rare. This is, on the one hand, because of the frequent use of verbs
which do not have any arguments besides the subject and, on the other, because the
objects of transitive verbs are typically pronominalized (with the possibility of left
or right dislocation). In (47) the direct object in the answer is pronominalized; in
(48) it is pronominalized and additionally expressed lexically in a left dislocated
position.

(47) E1: ¿Y quién la lleva?
         And who it carries
         ‘And who carries it?’
     I2: Pues hombre, la lleva [quien quiere]F.
         Well man it carries who wishes
         ‘Well, whoever wishes carries it.’
     (COSER, Quintana de los Prados (Espinosa), COSER-0939_01; modified, our italics)

(48) E2: ¿Quién compraba el traje de la novia?
         Who bought the dress of the fiancée
         ‘Who bought the fiancée’s dress?’
     I2: […] Los trajes de la novia los compraba [el novio]F.
         the dresses of the fiancée them bought the fiancé
         ‘The dresses of the fiancée, the fiancé bought them.’
     (COSER, Cifuentes de Rueda (Gradefes), COSER-2606_01; modified, our italics)
In only two of the 58 answers that are declaratives with lexical verbs do we find
a lexical argument besides the subject as part of the core sentence. Interestingly, in
both cases the focused subject is in preverbal position (cf. (49) and (50)).

(49) E1: ¿Quién llamaba a ese señor?
         Who called that man
         ‘Who called for that man?’
     I1: Hombre, los no- | el novio y la novia, o sea,
         Man the the fiancé and the fiancée that is
         [los, los familiares]F llamaban a aquel hombre […]
         the the family called that man
         ‘Well, the fiancé and the fiancée, that is, the family called for that man.’
     (COSER, Palencia de Negrilla, COSER-3610_01; modified)

(50) E1: ¿Quién puso la casa?
         Who put the house
         ‘Who provided the house?’
     I1: Pue- … la casa, pues [mi suegro]F
         Well the house well my father-in-law
         nos dio una cas-, una casita.
         us gave a house a small.house
         ‘Well, the house, well, my father-in-law gave us a house, a small house.’
     (COSER, Villalba de Lampreana, COSER-4611_01; modified)
Comparing the more frequent case in our data, namely the answers without lexical
arguments in the background of the core sentence, with the existing experimental
literature makes more sense due to the larger amount of data. In cases without
lexical arguments in the background of the core sentence, information focus appears
much more often in final position than in preverbal position in our corpus
data (about 66 vs. 34%).
We start our comparison with evidence from oral production experiments.
Calhoun et al. (2018: 18) report for focused subjects in intransitives in Venezuelan
Spanish that three types of syntactic-prosodic structures were produced: subject-
verb with the main stress on the verb ([S]F-V), subject-verb with the main stress on
the subject ([S]F-V), and verb-subject with the main stress on the subject (V-[S]F).
Table 8 shows the relative frequencies of these structures separately for unergative
and unaccusative verbs. In most cases the participants produce structures with the
focused subject in initial position. It is striking that the focus prominence rule is
frequently violated (56 and 42% of the cases respectively). Sentence initial focus
dominates even if cases where the focus prominence rule is violated are ignored.

Table 8: Information focus on subject in intransitives (relative frequency; column headers
indicate the position of nuclear stress) (Calhoun et al. 2018; modified).

               [S]F-V          [S]F-V          V-[S]F
               (stress on V)   (stress on S)   (stress on S)
Unergative     56%                                             (N = )
Unaccusative   42%                                             (N = )

Gabriel’s (2010: 202) production experiment with speakers of Argentinean
Spanish shows that in non-complex declaratives with transitive verbs the position
of the focused subject strongly depends on whether or not the object is a clitic or
expressed as a lexical argument. While preverbal subjects dominate whenever the
direct object is lexical, the situation is different when the object is realized as a
clitic: Out of the 31 utterances of this type, 19 show the focused subject in sentence
final position (Cl.V-[S]F) and only 12 in preverbal position ([S]F-Cl.V). Roggia (2018)
tested in his oral production experiment the position of subjects with six types of
intransitive verbs (ranging from “unaccusative core” to “unergative core”); par-
ticipants were native speakers of Mexican Spanish. Lumping all verb types
together, Roggia’s data show a slight preference for putting the focused subject in
final position (56% V-[S]F vs. 44% [S]F-V; calculations based on Roggia [2018: 90,
Table 3]). Hertel (2003) conducted a written production experiment and reports
that native speakers (“from a variety of Spanish speaking countries” Hertel [2003:
285]) produced [S]F-V order more often than V-[S]F order (roughly 2/3 of the time)
for both unaccusative and unergative verbs (cf. Hertel 2003: 291).
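Some of the production studies above report raw counts rather than percentages, which makes them harder to compare with the relative frequencies given elsewhere. The following minimal sketch in Python converts such counts into percentages. The function name relative_frequencies and the dictionary layout are illustrative assumptions of ours and are not taken from any of the cited studies; the only data used are the counts quoted above from Gabriel (2010).

```python
def relative_frequencies(counts):
    """Convert raw counts per word order into percentages (one decimal place)."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations")
    return {order: round(100 * n / total, 1) for order, n in counts.items()}


# Counts quoted above from Gabriel (2010): transitive verbs with a clitic object.
gabriel_clitic = {"Cl.V-[S]F": 19, "[S]F-Cl.V": 12}

for order, share in relative_frequencies(gabriel_clitic).items():
    print(f"{order}: {share}%")
```

For Gabriel's clitic-object utterances this yields roughly 61% sentence final versus 39% preverbal focused subjects, which can be set directly beside the percentages reported for Roggia (2018).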
Turning to forced choice experiments, Gupton (2017) reports for focused
subjects in sentences with a transitive verb and a clitic object that Spanish-Catalan
bilinguals preferred [S]F-Cl.V over Cl.V-[S]F, while it is the other way around with
Spanish-Galician bilinguals (Cl.V-[S]F > [S]F-Cl.V). Hoot and Leal (2020) conducted
a forced choice experiment with written stimuli and two groups of participants:
Native speakers of Mexican Spanish (from Merida), and Spanish-Catalan bilingual
native speakers of Spanish (from Barcelona, with Spanish as the dominant lan-
guage). Their results show a strong preference for [S]F-Cl.V over Cl.V-[S]F in the
Merida group (62 vs. 38%) and an even stronger preference for the other order,
Cl.V-[S]F, in the Barcelona group (78 vs. 22%) (cf. Hoot and Leal 2020: 17). Domínguez
and Arche (2014) present data from 20 monolingual speakers of Spanish which
“was collected in Spain” (Domínguez and Arche 2014: 252). They tested the posi-
tion of focused subjects with unergative verbs, unaccusative verbs and transitive
verbs with clitic objects. They found that subject final position is strongly preferred
with transitive verbs (and clitic objects) and unaccusative verbs, and that both
subject final and subject initial orders are chosen equally often with unergative
verbs (Domínguez and Arche 2014: 254).
The last experiment type we consider is the judgment experiment. Jiménez-Fernández
(2015b) tested the acceptability of focused subjects in final and initial
position in the context of unaccusative verbs and transitive verbs with clitic objects
(written stimuli; participants are native speakers of Standard Spanish and
Southern Peninsular Spanish). In sentences with unaccusative verbs he found that
sentence final focused subjects are judged grammatical more frequently than
sentence initial ones (the difference in acceptability is bigger in Standard Spanish
(94 vs. 18%) than in Southern Peninsular Spanish (96 vs. 63%)) (cf. Jiménez-Fernández
2015b: 129). In sentences with transitive verbs and clitic objects he
found again that sentence final focused subjects are judged grammatical more
often than sentence initial ones, but in this case the difference in acceptability is
similar in Standard Spanish (82 vs. 52%) and in Southern Peninsular Spanish (78
vs. 48%) (cf. Jiménez-Fernández 2015b: 128). Focused subjects in sentences with
transitive verbs and clitic objects are further tested in Gupton (2017). Participants
had to judge Cl.V-[S]F and [S]F-Cl.V on a 4-point Likert scale. Both structures
received scores above 3, but sentence final subjects received higher scores from
both Spanish-Catalan bilinguals and Spanish-Galician bilinguals (cf. Gupton
2017).
In sum, the experimental studies show that in declaratives without lexical
arguments as part of the background, focused subjects are produced, chosen, and
judged acceptable both in final and non-final position. Beyond this rather general
statement, the experimental evidence does not show a clear picture. While some
production and forced choice studies found the final position to be preferred over
the preverbal one, others show a preference for the preverbal position over the final
one. As for judgment experiments, both studies that we looked at suggest that the
final position is more acceptable than the preverbal one (but neither of them is
unacceptable). Table 9 gives a summary ordered by experiment type.16

Table : Position of focused subjects in experimental studies (intransitive verbs and verbs with
clitic objects).

Experiment type Preference

Calhoun et al. () Oral production [S]F-V > V-[S]F Non-final


Gabriel () Oral production Cl.V-[S]F > [S]F-Cl.V Final
Roggia () Oral production V-[S]F > [S]F-V Final
Hertel () Written production [S]F-V > V-[S]F Non-final
Gupton () Forced choice [S]F-Cl.V > Cl.V-[S]F* Non-final
Cl.V-[S]F > [S]F-Cl.V** Final
Hoot and Leal () Forced choice Cl.V-[S]F > [S]F-Cl.V* Final
[S]F-Cl.V > Cl.V-[S]F*** Non-final
Domínguez and Arche () Forced choice Cl.V-[S]F > [S]F-Cl.V Final
Vunacc-[S]F > [S]F-Vunacc Final
[S]F-Vunerg ≈ Vunerg-[S]F
nez-Fernández (b)
Jime Acceptability Cl.V-[S]F > [S]F-Cl.V Final
V-[S]F > [S]F-V Final
Gupton () Acceptability Cl.V-[S]F > [S]F-Cl.V Final
(*Spanish-Catalan bilinguals, **Spanish-Galician bilinguals, ***Mexican Spanish).

16 As noted by a reviewer it would be interesting to link the differences between varieties found in
Table 9 to more general differences with respect to verb-subject versus subject-verb order. For
example, the preference of Merida Spanish for [S]F-Cl.V over Cl.V-[S]F might be linked to the
general preference of this variety for subject-verb order (possibly due to its being a partial null subject
language/variety, cf. Frascarelli and Jiménez-Fernández [2019]).

In general, the results from comparable experimental studies and from our
corpus study both indicate that focused subjects are not limited to the sentence
final position. The data types differ in that the preference for the final position in
our corpus data is not found in all experimental studies. Still, there are more
experimental studies that show a preference for the final position than studies with
a preference for a non-final position.

6.2 The Merits of Corpus Data in the Present Debate

The obvious motive to include corpus data in the debate on the position of infor-
mation focus is the general principle of methodological triangulation. The appli-
cation of various methods in the investigation of a linguistic phenomenon is
desirable for at least two reasons: It increases the validity of the empirical obser-
vations and it decreases the overall risk of methodological bias (Mackey and Gass
2015).17 Since systematic corpus studies on the position of information focus in
Spanish were missing, the inclusion of corpus data constitutes a step forward in
this direction. Despite the general goal of methodological triangulation, it is
crucial to ask whether a certain method is appropriate for a given research ques-
tion. In fact, there is a long-standing discussion on the merits and shortcomings of
corpus data and corpus-based linguistics (cf. e.g., Aarts 2000). Of course, not all
linguistic research questions are equally prone to be answered by corpus data (e.g.,
discussions about syntactic island constraints often rely on rather specific exam-
ples that are hardly found in corpus data). However, the syntactic position of
information focus in terms of linear order is a phenomenon that can be observed
in corpus data. Based on such data, probing questions about optionality,
constraint ranking (cf. the brief comments on STAY, STRESSFOCUS and RIGHTMOST-
STRESS in Section 3) or the syntax-information structure interface in general could
be tackled. But the fulfillment of methodological triangulation as a general
desideratum is not the only merit of corpus data. In fact, the relevance of corpus
data manifests itself in several other ways.

17 Cf. Hoot et al. (2020) for an application of methodological triangulation using different types of
experimental studies.

Despite the similarities with the comparable results from experimental
studies, our results are also relevant for the interpretation of some of the experimental
data. Our corpus data clearly show that experimental studies often test
structures that are rather uncommon in terms of frequency in natural discourse
(e.g., sentences with lexical arguments as part of the background and the core
sentence). Linguistic studies definitely should not be limited to the most frequent
and common constructions, but it should be considered that participants are asked
to judge, choose between or produce options that are highly marked in a given
discourse situation (e.g., the instruction to answer in full sentences repeating
elements from the question). The frequency and acceptability of preverbal focused
subjects in sentences with lexical objects come to mind. They are produced and
judged as acceptable in experiments (often even as the preferred option), but are
rejected by the vast majority of authors who rely on their native speaker intuitions
(cf. Section 3.2). The fact that such constructions rarely appear in spoken language
corpus data may be important for the interpretation of this mismatch; recall from
Section 3.2 that the unnaturalness of some experimental stimuli made Escandell-
Vidal and Leonetti (2019) call into question the validity of experimental results.
Another point raised by Escandell-Vidal and Leonetti (2019) in their criticism
of (certain) experimental studies is that the most natural way to answer a wh-
question is a focus-only answer. Thanks to the corpus data compiled for the present
study we can evaluate this claim. Our corpus data indeed confirm (at least for
focused subjects) that focus-only sentences are the most frequent (and hence
probably the most natural) way to respond to an open wh-question: 60% of the
answers only contain the focus (cf. Table 5, Section 5). While the general preference
for focus-only answers might not be surprising, it is the remaining 40% that is
really interesting in the context of the discussion of the appropriateness of
experimental studies which include stimuli with both focus and background. The
relatively high frequency of 40% suggests that experimental stimuli with both
focus and background are not per se problematic, as they also occur in natural
discourse.
The consideration of corpus data also enables us to investigate the relation
between acceptability and frequency, which is still an understudied issue (cf. Adli
2015: 174). Corpus data provide true frequency data in the sense of frequency of
occurrence in natural discourse (unlike the frequency data used in Heidinger
[2018a: 111] where frequencies from production experiments are compared to
acceptability judgments collected in experiments). As to the relation between
frequency and acceptability our corpus data lead to some expected and some
rather surprising findings. As one might expect, frequency data and data from
acceptability experiments align in that more frequent structures are more
acceptable than less frequent ones. Recall from Section 6.1 that in sentences
without lexical arguments in the background, the sentence final position is
judged more acceptable than the preverbal position (cf. (51)). Frequency and acceptability
align since, in our corpus data, the structures with the focus in final position are also
more frequent than the ones with the focus in preverbal position.

(51) a. Cl.V-[S]F > [S]F-Cl.V (Jiménez-Fernández 2015b; Gupton 2017)
     b. V-[S]F > [S]F-V (Jiménez-Fernández 2015b)

Surprisingly, some structures which receive rather high acceptability scores are
missing from our corpus data. Cases in point are sentences with lexical arguments as
part of the background, which receive high acceptability rates (Hoot 2012, 2016) but
are missing from our corpus data (e.g., [S]FVO-sentences with lexical objects). Adli
(2011: 398) uses the term latent construction for such acceptable but rare or absent
constructions.
Finally, the unbiased nature of the present corpus study may also be
mentioned as a merit. Recall that no restrictions on the answer to the wh-questions
were imposed. Such an approach might reveal constructions which have been so
far overlooked in the experimental and introspection-based literature. The ser- and
haber-constructions presented in Section 5 might be such a case (at least we are not
aware of any discussion of them as focus marking devices in Spanish). The syntactic and
information structural properties of such newly found constructions might then be
investigated in subsequent studies.

7 Conclusions
The position of information focus in Spanish is a heavily debated topic in the
current literature on the syntax-information structure interface. The frontline in
this debate runs along methodological choices. Introspection-based studies typi-
cally argue that the information focus is limited to the final position while
experimental studies suggest that the information focus can also appear in non-
final positions. In this paper we have contributed to this debate by analyzing a new
data type, namely corpus data. The main empirical results show that information
foci appear most frequently in final position, but they are not limited to the final
position. The latter finding is in line with comparable experimental studies, but the
preference for the final position in our corpus data is not found in all experimental
studies. Our results challenge the common view in the introspection-based liter-
ature according to which the information focus needs to be in final position in
Spanish.
In addition to the empirical contribution, we offer a reflection on the merits of
corpus data in this domain of linguistic research. An obvious argument for including
corpus data in the debate is the general principle of methodological triangulation.
More methods increase the validity of the empirical observations and decrease the
overall risk of methodological bias. Further, the results from the corpus study have
proven instructive for the interpretation of some of the experimental data and
material (e.g., the frequency and naturalness of focus-only answers). Corpus data
are also a prerequisite for analyzing the relation between acceptability and fre-
quency in grammar (still an understudied issue). The unbiased nature of corpus
studies has the potential to reveal constructions which have been so far overlooked
in the experimental and introspection-based literature. In the study of the
expression of information focus, ser- and haber-constructions might be such cases
that appeared in the corpus data and deserve further attention.
A challenge that could be tackled in future research is to apply more fine-grained
distinctions to the corpus data, such as the ones proposed in Cruschina (2021)
between information focus, exhaustive focus, mirative focus and contrastive
focus, in order to measure inter-annotator agreement and investigate possible
differences between the focus types with respect to focus realization (syntax and
prosody). Another obvious topic for future research is diatopic variation, and
whether the corpus data align with dialectal differences identified in the experi-
mental literature on European Spanish (cf. Jiménez-Fernández 2015a; Vanrell and
Fernández Soriano 2013, 2018).

References
Aarts, Bas. 2000. Corpus linguistics, Chomsky and fuzzy tree fragments. In Christian Mair &
Marianne Hundt (eds.), Corpus linguistics and linguistic theory: Papers from the twentieth
international conference on English language research on computerized corpora, 5–13.
Amsterdam: Rodopi.
Adli, Aria. 2011. On the relation between acceptability and frequency. In Esther Rinke &
Tanja Kupisch (eds.), The development of grammar: Language acquisition and diachronic
change, 383–404. Amsterdam: John Benjamins.
Adli, Aria. 2015. What you like is not what you do: Acceptability and frequency in syntactic
variation. In Aria Adli, Marco García García & Göz Kaufmann (eds.), Variation in language:
System- and usage-based approaches, 173–200. Berlin: de Gruyter.
Bosque, Ignacio. 1999. On focus versus wh-movement. The case of Caribbean Spanish. Sophia
Linguistica 44–45. 1–32.
Bosque, Ignacio & Javier Gutiérrez-Rexach. 2009. Fundamentos de sintaxis formal. Madrid: Akal.
Brunetti, Lisa. 2009. Discourse functions of fronted foci in Italian and Spanish. In Andreas Dufter &
Daniel Jacob (eds.), Focus and background in Romance languages, 43–81. Amsterdam: John
Benjamins.
Büring, Daniel. 2009. Towards a typology of focus realization. In Malte Zimmermann &
Caroline Féry (eds.), Information structure: Theoretical, typological, and experimental
perspectives, 177–205. Oxford: Oxford University Press.
Büring, Daniel & Rodrigo Gutiérrez-Bravo. 2001. Focus-related word order variation without the
NSR: A prosody-based crosslinguistic analysis. In Séamas Mac Bhloscaidh (ed.), Syntax and
semantics at Santa Cruz, vol. 3, 41–58. Santa Cruz: UC Santa Cruz.
Calhoun, Sasha, Erwin La Cruz & Ana Olssen. 2018. The interplay of information structure,
semantics, prosody, and word ordering in Spanish intransitives. Laboratory Phonology:
Journal of the Association for Laboratory Phonology 9(1). 1–30.
Camacho, José. 2006. In situ focus in Caribbean Spanish: Towards a unified account of focus. In
Nuria Sagarra & Almeida J. Toribio (eds.), Selected proceedings of the 9th hispanic linguistics
symposium, 13–23. Somerville: Cascadilla Proceedings Project.
Casielles-Suárez, Eugenia. 2004. The syntax-information structure interface: Evidence from
Spanish and English. New York: Routledge.
Cassarà, Alessia Caterina. 2021. Subject focus in French and Spanish. Köln: Universität zu Köln
Dissertation.
C-Oral-Rom = Cresti, Emmanuela & Massimo Moneglia. 2005. C-ORAL-ROM: Integrated reference
corpora for spoken Romance languages. Amsterdam: John Benjamins.
COSER = Fernández-Ordóñez, Inés. 2005. Corpus Oral y Sonoro del Español Rural.
http://www.corpusrural.es/ (accessed March 2020).
Cruschina, Silvio. 2012. Discourse-related features and functional projections. Oxford: Oxford
University Press.
Cruschina, Silvio. 2019. Focus fronting in Spanish: Mirative implicature and information structure.
Probus 31(1). 119–146.
Cruschina, Silvio. 2021. The greater the contrast, the greater the potential: On the effects of focus
in syntax. Glossa: A Journal of General Linguistics 6(1). 1–30.
Cruttenden, Alan. 2006. The de-accenting of given information: A cognitive universal? In
Giuliano Bernini & Marcia L. Schwartz (eds.), Pragmatic organization of discourse in the
languages of Europe, 311–355. Berlin: Mouton de Gruyter.
Domínguez, Laura & María J. Arche. 2014. Subject inversion in non-native Spanish. Lingua 145.
243–265.
Drubig, Hans Bernhard. 2003. Toward a typology of focus and focus constructions. Linguistics
41(1). 1–50.
Drubig, Hans Bernhard & Wolfram Schaffar. 2001. Focus constructions. In Martin Haspelmath,
Ekkehard König, Wulf Oesterreicher & Wolfgang Raible (eds.), Language typology and
language universals: An international handbook (HSK20.2), 1079–1104. Berlin: Walter de
Gruyter.
El Zarka, Dina & Steffen Heidinger. 2014. Introduction [to Special issue: Methodological issues in
the study of information structure]. Grazer Linguistische Studien 81. 5–13.
Escandell-Vidal, María Victoria & Manuel Leonetti. 2019. Una nota sobre el foco informativo en
español. In Ramón González Ruiz, Inés Olza Moreno & Óscar Loureda Lamas (eds.), Lengua,
cultura, discurso. Estudios ofrecidos al profesor Manuel Casado Velarde, 203–227.
Pamplona: EUNSA.
Fábregas, Antonio. 2016. Information structure and its syntactic manifestation in Spanish: Facts
and proposals. Borealis: An International Journal of Hispanic Linguistics 5(2). 1–109.
Feldhausen, Ingo. 2016a. Inter-speaker variation, optimality theory and the prosody of clitic left-
dislocations in Spanish. Probus 28(2). 293–333.
Feldhausen, Ingo. 2016b. The relation between prosody and syntax: The case of different types of
left-dislocations in Spanish. In Meghan E. Armstrong, Nicholas Henriksen &
Maria del Mar Vanrell (eds.), Intonational grammar in Ibero-Romance: Approaches across
linguistic subfields, 153–180. Amsterdam: John Benjamins.
Feldhausen, Ingo & Maria del Mar Vanrell. 2014. Prosody, focus and word order in Catalan and
Spanish. An optimality theoretic approach. In Susanne Fuchs, Martine Grice, Anne Hermes,
Leonardo Lancia & Doris Mücke (eds.), Proceedings of the 10th international seminar on
speech production (ISSP), Köln (Germany), 122–125. Köln: Universität zu Köln.
Feldhausen, Ingo & Maria del Mar Vanrell. 2015. Oraciones hendidas y marcación del foco
estrecho en español: una aproximación desde la teoría de la optimidad estocástica. Revista
Internacional de Lingüística Iberoamericana 13(2). 39–60.
Fernández Lorences, Teresa. 2010. Gramática de la tematización en español. Oviedo: Universidad
de Oviedo.
Fernández-Ordóñez, Inés. 2005–. Corpus Oral y Sonoro del Español Rural. www.corpusrural.es
(accessed March 2020).
Fernández-Ordóñez, Inés. 2011. Nuevos horizontes en el estudio de la variación gramatical del
español: el Corpus Oral y Sonoro del Español Rural. In Germà Colón Domènech &
Lluís Gimeno Betí (eds.), Noves tendències en la dialectologia contemporània, 173–203.
Castelló de la Plana: Universitat Jaume I.
Frascarelli, Mara & Ángel L. Jiménez‐Fernández. 2019. Understanding partiality in pro-drop
languages: An information-structure approach. Syntax 22(2–3). 162–198.
Frota, Sónia & Pilar Prieto. 2015. Intonation in Romance: Systemic similarities and differences. In
Sónia Frota & Pilar Prieto (eds.), Intonation in Romance, 392–418. Oxford: Oxford University
Press.
Gabriel, Christoph. 2007. Fokus im Spannungsfeld von Phonologie und Syntax: Eine Studie zum
Spanischen. Frankfurt/M.: Vervuert.
Gabriel, Christoph. 2010. On focus, prosody, and word order in Argentinean Spanish: A minimalist
OT account. Revista Virtual de Estudos da Linguagem – ReVEL Special edition 4. 183–222.
Götze, Michael, Thomas Weskott, Cornelia Endriss, Ines Fiedler, Stefan Hinterwimmer,
Svetlana Petrova, Anna Schwarz, Stavros Skopeteas & Ruben Stoel. 2007. Information
structure. In Stefanie Dipper, Michael Götze & Stavros Skopeteas (eds.), Information structure
in cross-linguistic corpora: Annotation guidelines for phonology, morphology, syntax,
semantics and information structure, 147–187. Potsdam: Universitätsverlag Potsdam.
Grimshaw, Jane. 1997. Projections, heads, and optimality. Linguistic Inquiry 28(3). 373–422.
Gupton, Timothy. 2017. Early minority language acquirers of Spanish exhibit focus-related
interface asymmetries: Word order alternation and optionality in Spanish-Catalan, Spanish-
Galician, and Spanish-English bilinguals. In Lauchlan Fraser & María C. Parafita Couto (eds.),
Bilingualism and minority languages in Europe, 212–239. Newcastle upon Tyne: Cambridge
Scholars Publishing.
Gutiérrez-Bravo, Rodrigo. 2002. Focus, word order and intonation in Spanish and English: An OT
account. In Caroline R. Wiltshire & Joaquim Camps (eds.), Romance phonology and variation:
Selected papers from the 30th linguistic symposium on Romance languages (Gainesville,
Florida, February 2000), 39–53. Amsterdam: John Benjamins.
Gutiérrez-Bravo, Rodrigo. 2006. Structural markedness and syntactic structure. A study of word
order and the left periphery in Mexican Spanish. London: Routledge.
Gutiérrez-Bravo, Rodrigo. 2008. La identificación de los tópicos y de los focos. Nueva Revista de
Filología Hispánica 56(2). 363–401.
Heidinger, Steffen. 2013. Information focus, syntactic weight and postverbal constituent order in
Spanish. Borealis: An International Journal of Hispanic Linguistics 2(2). 159–190.
Heidinger, Steffen. 2014. El foco informativo y la posición sintáctica de los depictivos orientados
al sujeto en español. Verba: Anuario galego de filoloxia 41. 51–74.
Heidinger, Steffen. 2015. Optionality and preferences in Spanish postverbal constituent order: An
OT account without basic constituent order. Lingua 162. 102–127.
Heidinger, Steffen. 2018a. Acceptability and frequency in Spanish focus marking. In
Marco García García & Melanie Uth (eds.), Focus realization in Romance and beyond, 99–128.
Amsterdam: John Benjamins.
Heidinger, Steffen. 2018b. Sekundäre Prädikation und Informationsstruktur: Fokus und
Informationsstatus bei spanischen Depiktiven. Berlin: Peter Lang.
Helfrich, Uta & Bernhard Pöll. 2011. Wortstellung und Informationsstruktur. In Joachim Born,
Robert Folger, Christopher F. Laferl & Bernhard Pöll (eds.), Handbuch Spanisch: Sprache,
Literatur, Kultur, Geschichte in Spanien und Hispanoamerika. Für Studium, Lehre, Praxis,
340–345. Berlin: Schmidt.
Hertel, Tammy J. 2003. Lexical and discourse factors in the second language acquisition of
Spanish word order. Second Language Research 19(4). 273–304.
Hidalgo Downing, Raquel. 2003. La tematización en el español hablado: Estudio discursivo sobre
el español peninsular (Biblioteca románica hispánica 2, Estudios y ensayos 429). Madrid:
Gredos.
Hoot, Bradley. 2012. Presentational focus in heritage and monolingual Spanish. Chicago:
University of Illinois at Chicago Dissertation.
Hoot, Bradley. 2016. Narrow presentational focus in Mexican Spanish: Experimental evidence.
Probus 28(2). 335–365.
Hoot, Bradley & Tania Leal. 2020. Processing subject focus across two Spanish varieties. Probus
32(1). 93–127.
Hoot, Bradley, Tania Leal & Emilie Destruel. 2020. Object focus marking in Spanish: An
investigation using three tasks. Glossa: A Journal of General Linguistics 5(1). 1–32.
Hualde, José I. & Pilar Prieto. 2015. Intonational variation in Spanish: European and American
varieties. In Sónia Frota & Pilar Prieto (eds.), Intonation in Romance, 350–391. Oxford: Oxford
University Press.
Hülsmann, Christoph. 2019. Tópicos y focos iniciales en el español hablado: funciones
pragmáticas y correlatos formales. In Valeria A. Belloro (ed.), La interfaz sintaxis-pragmática,
121–142. Berlin: de Gruyter.
Jackendoff, Ray. 1972. Semantic interpretation in generative grammar. Cambridge, MA: MIT Press.
Jiménez-Fernández, Ángel L. 2015a. Towards a typology of focus: Subject position and
microvariation at the discourse-syntax interface. Ampersand 2. 49–60.
Jiménez-Fernández, Ángel L. 2015b. When focus goes wild: An empirical study of two syntactic
positions for information focus. Linguistics Beyond and Within 1(1). 119–133.
Krifka, Manfred. 2001. For a structured meaning account of questions and answers. In
Caroline Féry & Wolfgang Sternefeld (eds.), Audiatur vox sapientiae. A Festschrift for Arnim
von Stechow, 287–319. Berlin: Akademie Verlag.
Krifka, Manfred. 2006. Association with focus phrases. In Valéria Molnár & Susanne Winkler
(eds.), The architecture of focus, 105–136. Berlin: Mouton de Gruyter.
Krifka, Manfred. 2007. Basic notions of information structure. In Caroline Féry, Gisbert Fanselow &
Manfred Krifka (eds.), The notions of information structure, 13–55. Potsdam:
Universitätsverlag Potsdam.
Krifka, Manfred & Renate Musan. 2012a. Information structure: Overview and linguistic issues. In
Manfred Krifka & Renate Musan (eds.), The expression of information structure, 1–43. Berlin:
De Gruyter Mouton.
Krifka, Manfred & Renate Musan (eds.). 2012b. The expression of information structure. Berlin: De
Gruyter Mouton.
Leal, Tania, Emilie Destruel & Bradley Hoot. 2018. The realization of information focus in
monolingual and bilingual native Spanish. Linguistic Approaches to Bilingualism 8(2).
217–251.
Leonetti, Manuel. 2014. Gramática y pragmática en el orden de palabras. Lingüística en la red XII.
1–25.
Linzen, Tal & Yohei Oseki. 2018. The reliability of acceptability judgments across languages.
Glossa: A Journal of General Linguistics 3(1). 1–25.
Lüdeling, Anke, Julia Ritz, Manfred Stede & Amir Zeldes. 2016. Corpus linguistics and information
structure research. In Caroline Féry & Shinichiro Ishihara (eds.), The Oxford handbook of
information structure, 599–617. Oxford: Oxford University Press.
Mackey, Alison & Susan M. Gass. 2015. Second language research: Methodology and design. New
York: Routledge.
Martín Butragueño, Pedro. 2005. La construcción prosódica de la estructura focal en español. In
Gabriele Knauer & Valeriano Bellosta von Colbe (eds.), Variación sintáctica en español: un
reto para las teorías de la sintaxis, 117–144. Tübingen: Niemeyer.
Matić, Dejan & Daniel Wedgwood. 2013. The meanings of focus: The significance of an
interpretation-based category in cross-linguistic analysis. Journal of Linguistics 49(1).
127–163.
Méndez Vallejo, Dunia Catalina. 2009. Focalizing ser (‘to be’) in Colombian Spanish. Bloomington,
IN: Indiana University Dissertation.
Méndez Vallejo, Dunia Catalina. 2015. Ser focalizador: variación dialectal y aceptabilidad de uso.
Revista Internacional de Lingüística Iberoamericana 26. 61–79.
Mora-Bustos, Armando. 2009. Marcación explícita de foco estrecho en español. Nueva Revista de
Filología Hispánica 57(2). 489–511.
Moreno Cabrera, Juan C. 1999. Las funciones informativas: Las perífrasis de relativo y otras
construcciones perifrásticas. In Ignacio Bosque & Violeta Demonte (eds.), Gramática
descriptiva de la lengua española. Vol. 3. Entre la oración y el discurso. Morfología,
4245–4302. Madrid: Espasa Calpe.
Muntendam, Antje. 2013. On the nature of cross-linguistic transfer: A case study of Andean
Spanish. Bilingualism: Language and Cognition 16(1). 111–131.
Olarrea, Antxon. 2012. Word order and information structure. In José I. Hualde, Antxon Olarrea &
Erin O’Rourke (eds.), The handbook of Hispanic linguistics, 603–627. Malden, MA: Wiley-
Blackwell.
Onea, Edgar & Klaus von Heusinger. 2009. Grammatical and contextual restrictions on focal
alternatives. In Andreas Dufter & Daniel Jacob (eds.), Focus and background in Romance
languages, 281–308. Amsterdam: John Benjamins.
Ortiz López, Luis A. 2009. El español del Caribe: orden de palabras a la luz de la interfaz léxico-
sintáctica y sintáctico-pragmática. Revista Internacional de Lingüística Iberoamericana 7(2).
75–93.
Prince, Alan & Paul Smolensky. 1993. Optimality theory: Constraint interaction in generative
grammar [ROA Version, 8/2002]. New Brunswick & Boulder: Rutgers University & University
of Colorado.
RAE and ASALE = Real Academia Española & Asociación de Academias de la Lengua Española.
2009. Nueva gramática de la lengua española. Madrid: Espasa Libros.
Repp, Sophie. 2010. Defining ‘contrast’ as an information-structural notion in grammar: Contrast
as an information-structural notion in grammar. Lingua 120(6). 1333–1345.
Revert Sanz, Vicente. 2001. Entonación y variación geográfica en el español de América. València:
Universitat de València.
Rodríguez Ramalle, Teresa M. 2005. Manual de sintaxis del español. Madrid: Castalia.
Roggia, Aaron B. 2018. An investigation of unaccusativity and word order in Mexican Spanish.
Spanish in Context 15(1). 77–102.
Rooth, Mats. 1985. Association with focus. Amherst, MA: University of Massachusetts
Dissertation.
Rooth, Mats. 1992. A theory of focus interpretation. Natural Language Semantics 1(1). 75–116.
Rooth, Mats. 2016. Alternative semantics. In Caroline Féry & Shinichiro Ishihara (eds.), The Oxford
handbook of information structure, 19–40. Oxford: Oxford University Press.
Samek-Lodovici, Vieri. 2005. Prosody-syntax interaction in the expression of focus. Natural
Language and Linguistic Theory 23. 687–755.
Sedano, Mercedes. 2003. Seudohendidas y oraciones con verbo ser focalizador en dos corpus del
español hablado de Caracas. Revista Internacional de Lingüística Iberoamericana 1(1).
175–204.
Silva-Corvalán, Carmen. 1984. Topicalización y pragmática en español. Revista Española de
Lingüística 14(1). 1–20.
Skopeteas, Stavros. 2012. The empirical investigation of information structure. In Manfred Krifka
& Renate Musan (eds.), The expression of information structure, 217–247. Berlin: De Gruyter
Mouton.
Sprouse, Jon & Diogo Almeida. 2012. Assessing the reliability of textbook data in syntax: Adger’s
Core Syntax. Journal of Linguistics 48(3). 609–652.
Truckenbrodt, Hubert. 1995. Phonological phrases: Their relation to syntax, focus, and
prominence. Cambridge, MA: MIT Dissertation.
Uth, Melanie. 2014. Spanish preverbal subjects in contexts of narrow information focus: Non-
contrastive focalization or epistemic-evidential marking? Grazer Linguistische Studien 81.
87–104.
Uth, Melanie & Marco García García. 2018. Introduction: Core issues of focus realization in
Romance. In Marco García García & Melanie Uth (eds.), Focus realization in Romance and
beyond, 1–30. Amsterdam: John Benjamins.
van der Wal, Jenneke. 2014. Tests for focus. Grazer Linguistische Studien 81. 105–134.
Vanrell, Maria del Mar & Olga Fernández Soriano. 2013. Variation at the interfaces in Ibero-
Romance: Catalan and Spanish prosody and word order. Catalan Journal of Linguistics 12.
253–282.
Vanrell, Maria del Mar & Olga M. Fernández Soriano. 2018. Language variation at the prosody-
syntax interface: Focus in European Spanish. In Marco García García & Melanie Uth (eds.),
Focus realization in Romance and beyond, 33–70. Amsterdam: John Benjamins.
Villalba, Xavier. 2011. A quantitative comparative study of right-dislocation in Catalan and
Spanish. Journal of Pragmatics 43(7). 1946–1961.
von Stechow, Arnim. 1991. Current issues in the theory of focus. In Arnim von Stechow &
Dieter Wunderlich (eds.), Semantik: Ein internationales Handbuch der zeitgenössischen
Forschung (HSK 6), 804–824. Berlin: Walter de Gruyter.
Zimmermann, Malte & Edgar Onea. 2011. Focus marking and focus interpretation. Lingua 121.
1651–1670.
Zubizarreta, Maria L. 1998. Prosody, focus, and word order. Cambridge, MA: MIT Press.
Zubizarreta, Maria L. 1999. Las funciones informativas: Tema y foco. In Ignacio Bosque &
Violeta Demonte (eds.), Gramática descriptiva de la lengua española. Vol. 3. Entre la oración
y el discurso. Morfología, 4215–4244. Madrid: Espasa Calpe.
Zubizarreta, Maria L. 2014. On the grammaticalization of the assertion structure: A view from
Spanish. In Andreas Dufter & Álvaro S. Octavio de Toledo (eds.), Left sentence peripheries in
Spanish: Diachronic, variationist and comparative perspectives, 253–282. Amsterdam: John
Benjamins.
Zubizarreta, Maria L. 2016. Nuclear stress and information structure. In Caroline Féry &
Shinichiro Ishihara (eds.), The Oxford handbook of information structure, 165–184. Oxford:
Oxford University Press.
