
See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/248114381

Topic Continuity in Discourse: A quantitative cross-language study

Article · January 1983


DOI: 10.1075/tsl.3


1 author:

Thomas / Givón
University of Oregon

Some of the authors of this publication are also working on these related projects:

"The Story of Zero" (book; Amsterdam: J. Benjamins, 2017) View project

All content following this page was uploaded by Thomas / Givón on 03 October 2017.




THE STORY OF ZERO


T. Givón

TABLE OF CONTENTS:

PART I: NATURAL ZERO

1. The communicative logic of zero anaphora

2. The grammar of referential coherence as mental processing instructions

3. The diachrony of pronominal agreement

4. Early-stage pronominal agreement: A case-study in Ute

5. Is zero anaphora a typological exotica?

6. Verbal zero anaphora: Verbless clauses

7. Cataphoric zero: Passive and antipassive voice

PART II: STRUCTURAL ZERO

8. Co-reference in relative clauses

9. Co-reference in verb complements

10. Co-reference in adverbial clauses

11. Zero, pronouns and the diachrony of clause-chaining

12. Promiscuous ill-governed zeros?

13. Zero and the puzzle of stranded adposition



PREFACE

The genesis of this book harkens back to the early 1980s, when my good friend, the late Ken
Hale, showed up at the Mid-America Linguistics Conference at Lawrence, Kansas with a fancy new
name for flexible-order languages: word-star languages (Hale 1980). Under the onslaught of Jimmy
Huang's (1984) dissertation on Mandarin Chinese, the new category soon mutated into non-
configurational languages (Hale 1982), in the process acquiring three associated features: empty
nodes, verb-indexed subject/object, and pro-drop. A number of things bothered me about the
enterprise from the very start. To begin with, the alleged typological association between flexible
word-order and either zero anaphora or verb-indexing of nominal arguments--a.k.a. pronominal
agreement on the verb--was shot full of counter-examples. What is more, the data on which the
typological generalization was based was the traditional Generative competence data--isolated
clauses dreamed up by the linguist or arm-twisted out of naive native informants. Natural data from
language use--performance, actual communication, discourse--were conspicuously absent. And the
intimate causal dependency between synchronic typology and diachronic change was not part of the
agenda.
At the time, on the Southern Ute rez, I was 5 years into describing Ute, a language with
flexible word-order but a strange 'performance' preference for its old OV order. In recorded Ute
texts, continuous anaphoric nominal referents were 70% zero-marked, with the remaining 30%
coded by optional clitic pronouns. Those pronouns were suffixed to the first word of the clause
('second position clitics'), which 70% of the time turned out to be the verb. And to confound it all,
Ute clitic pronouns could correspond to ('index') either the subject, object or genitive. The grammar
of Ute was, manifestly, in the midst of multiple, complex diachronic changes in word-order and
morphogenesis.
Back on campus at UCLA, my last seminar revolved around a massive cross-language
comparison of the text-distribution of referent-coding grammatical devices. Among those, the most
frequent in text and the most widely distributed cross-linguistically turned out to be zero anaphora
or its functional equivalent, obligatory pronominal agreement. The resulting published volume,
Topic Continuity in Discourse (1983), came out just as the non-configurationality/pro-drop fad had
swept the linguistic landscape.
The thematic thread of this book runs as follows. Part I deals first with the communicative
naturalness, cross-language ubiquity, and extreme well-governedness of 'discourse' zero anaphora
(ch. 1), the so-called ungoverned zero. It goes on to investigate the cognitive foundations of zero-
marking of information (ch. 2). The next two chapters (3,4) describe the diachronic rise of the
functional equivalent of zero anaphora, obligatory pronominal agreement on the verb, and how it
gradually invades the functional niche of zero anaphora. The term pro-drop, it turns out, is an
upside-down misnomer, given that in diachrony the exact opposite seems to occur--pro-add. The
next chapter (5) zeroes in (no pun intended) on the alleged typological correlation between flexible
word-order and zero or pronominal agreement. It exposes non-configurationality as a false empirical
entity, based on the mis-analysis of both zero anaphora and pronominal agreement and ill-supported
by the cross-language distribution of its three core features. Next comes a chapter (6)
describing verbal zero-anaphora or verbless clauses, showing how it parallels in both structure and
function nominal zero anaphora. Part I then closes with a chapter (7) on the rarely-discussed but
just as natural and ubiquitous cataphoric zero, most prominently seen in passive and antipassive
constructions. Inevitably, the role of grammatical relations in marking cataphoric referential
coherence looms over the discussion.
Part II deals with zero-marked nominal arguments in complex syntactic environments, the
so-called governed zeros that can be described configurationally. It shows how such syntactic
zeros or their pronominal equivalents in REL-clauses (ch. 8), V-complements (ch. 9) and ADV-
clauses (ch. 10) arise diachronically from either paratactic--discourse, ungoverned--zeros or from
various pronominal precursors. The natural use of zero in clause chaining, and the subsequent
grammaticalization of complex clause-chaining systems, is investigated in ch. 11.
We conclude the story of zero by examining two syntactic phenomena that can be
considered, on the face of it, typological exotica: first the allegedly ungoverned zero in non-finite
verb phrases (ch. 12), then verb-affixed adpositions (ch. 13). I try to show how both phenomena are
natural, predictable consequences of the story of zero--provided one broadens one's myopic
synchronic-structural perspective and examines the two phenomena in terms of their discourse-
functional distribution and diachronic development.
There is a story that goes here, there always seems to be one, and it may prove instructive.
In the fall of 1986, right after I came back from New Guinea, at the height of the configurationality
boom, Ken Hale invited me to a conference on clause chaining at MIT. Both of us had massive
data-bases on clause chaining, Ken's from Chibchan and Misumalpan, mine from the New Guinea
highlands. So we presented our data, using standard discourse-functional arguments to suggest that
those quaint DS-or-SS chain-medial clauses were the only way those languages had of expressing
clausal conjunction in natural discourse. But the MIT dogma at the time, GB, didn't allow for
considering such chain-medial clauses as conjoined. In order to deal with any clause-type that did
not look like a well-constructed English main clause, that is, any clause-type that didn't resemble
a finite clause with full-NP subject and object and fully-expressed tense-aspect-modality, you had
to define such a clause configurationally as subordinate.
As it happens, reduced finiteness--the zero-coding of subjects and T-A-M marking--is not
a function of configurational subordination, but rather of referential and T-A-M continuity and
predictability in connected discourse, a fact both Ken and I knew from perusing natural texts. And
since equi-topic and equi-T-A-M chain-medial clauses are the most frequent ones in natural
discourse, the most common clause-type in natural discourse is therefore a non-finite clause with
zeroed-out subject and reduced T-A-M marking. Now, since subordinate clauses (REL-clauses,
V-complements, ADV-clauses) most commonly display the same high referential and T-A-M continuity
in discourse as chain-medial clauses, their less-finite syntactic properties have nothing to do with
their subordinate configurational status, but rather with their referential and T-A-M continuity.
After Ken and I finished presenting our data, the brand-new house expert on clause chaining,
a bright, articulate and largely innocent kid who had just finished his dissertation on how to 'handle'
clause chaining in the GB formalism, got up and said: "Well, I don't know how we're going to handle
the clause-chaining facts Ken and Tom have just presented. I suppose we could lower such clauses
into the grammar with a helicopter".

White Cloud Ranch


Ignacio, Colorado
April 2016

CHAPTER 1: THE COMMUNICATIVE ECOLOGY OF ZERO ANAPHORA

1. Introduction

Zero anaphora is one of the most natural, universal, ancient and functionally-coherent
grammatical devices in the tool-kit of natural language. Not only is it an integral part of all mature
grammars, but it is also one of the most robust pre-grammatical communicative devices found in the
language of early childhood, second-language pidgin and Broca's aphasia.[FN 1] The interpretation
of zero anaphora as a typological exotica--pro-drop or empty-node (Hale 1980, 1982; Huang 1984;
Xu 1986)--is a fascinating tale of factual misrepresentation and theoretical confusion, a discussion
that will be deferred to a later chapter (ch. 5).
The systematic zero-coding of clausal constituents as a grammatical device is best viewed
in the context of two universal communicative principles, the first pertaining to the anaphoric
context of informational predictability, the second to the cataphoric context of informational
importance:[FN 2]

(1) The communicative logic of zero:


a. Anaphoric: "Predictable information need not be mentioned".
b. Cataphoric: "Unimportant information need not be mentioned".

The anaphoric use of zero will be discussed extensively in this and several subsequent chapters. The
cataphoric use of zero is seen most conspicuously in two grammatical constructions, the zero-agent
passive and the zero-patient antipassive (ch. 7), as in:

(2) a. Passive: Two months later, she was fired [by Ø]


b. Antipassive: He eats [Ø] regularly

The communicative logic of zero anaphora is best understood when studied in its natural
discourse (usage) context, where zero contrasts with the whole inventory of referent-coding devices.
The most universal of those are:[FN 3]

(3) Major referent-coding devices:


a. zero anaphora
b. unstressed/bound anaphoric pronouns
c. stressed independent pronouns
d. definite NPs
e. indefinite NPs
f. modified NPs

In turn, the use of these grammatical devices can only be understood in the context of the overall
organization of information processing in discourse.

2. Discourse structure and referential coherence

2.1. Overview

Human discourse is typically multi-propositional. That is, we string together verbal
event-or-state clauses in coherent sequences, ones that maintain a high degree of continuity. The
sub-elements--strands--of discourse coherence tend to persist from one clause to the next across
stretches of discourse or clause-chains. The overall thematic coherence of human discourse is then
the tapestry-like product of the multiple strands, the most concrete and easiest to track of which are:

(4) Main strands of discourse coherence


a. referents
b. spatiality
c. temporality
d. tense-aspect-modality
e. action routines

Most commonly, these individual strands of discourse coherence maintain their continuity together,
breaking together at the end of coherence units. And those coherence units are organized
hierarchically, with lower units combining into higher ones; schematically:[FN 4]

(5) Hierarchic structure of discourse


lower
==========
clause chain
paragraph
episode
story
=============
higher

The lowest and most basic unit of discourse-coherence above the atomic clause is the clause
chain (a.k.a. sentence), the arena in which the bulk of grammatical devices perform their assigned
communicative functions. The overall structure of clause chains can be given as, schematically:

(6) Structure of clause chain


...# RD, CI, CM, CM, CM, CM, (.....), CF #...
RD = reorientation device
CI = chain-initial clause
CM = chain-medial clause
CF = chain-final clause
# = chain boundary

Prosodically, a clause tends to come under a unified intonation contour. Within-clause
(between-words) intonation breaks tend to be ca. 50 msecs long. Between-clause--chain-medial--
intonation breaks tend to be up to 100 msecs long. And between-chain intonation breaks tend to be
longer than 100 msecs.[FN 5] Inter-clausal intonation breaks correspond, roughly, to comma
punctuation [,] in written discourse, and inter-chain breaks to period [.] or semi-colon [;].
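
The prosodic figures above can be read as a rough decision rule. The following is a minimal sketch, not part of the original study: the function name `classify_break` and the hard cut-offs at 50 and 100 msecs are simplifying assumptions, since the text gives only approximate values ("ca.", "up to").

```python
def classify_break(pause_msecs: float) -> str:
    """Map a pause duration (in msecs) to the juncture type it most likely marks.

    The hard cut-offs are assumptions derived from the approximate
    figures in the text; real pause data would overlap between classes.
    """
    if pause_msecs > 100:
        return "between-chain"    # period or semi-colon in writing
    if pause_msecs > 50:
        return "between-clause"   # comma in writing
    return "within-clause"        # break between words


# Hypothetical pause durations, for illustration only.
pauses = [30, 80, 45, 250]
labels = [classify_break(p) for p in pauses]
print(labels)
```

Treating the boundaries as sharp is a convenience of the sketch; the text describes tendencies, not categorical thresholds.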
The major referent-coding devices listed in (3), above, can be ranked in terms of their degree
of referential continuity:[FN 6]

(7) Referent-coding devices and referential continuity


lowest referential continuity
=========================
a. indefinite NPs
b. definite NPs
c. stressed independent pronouns
d. unstressed anaphoric pronouns
e. zero anaphora
===========================
highest referential continuity

Grammatical relations--subject vs. direct object vs. oblique--also play an important role in
the coding of referential coherence, intersecting with and enriching the referent-coding devices in
(7). All other things being equal, a referent marked as subject tends to be more continuous and more
important; one marked as direct object tends to be less continuous and less important; and one
marked as oblique tends to be less continuous and less important yet. Likewise, word-order can also
play an important role in coding referential coherence, most likely along the cataphoric dimension
of referential importance.[FN 7]
In spite of the seemingly strong statistical association between referential continuity
('accessibility', 'predictability') and referential importance ('topicality'), these two dimensions of
referential coherence are distinct and can be dissociated. Thus, for example, an indefinite NP (7a)
codes, by definition, an anaphorically discontinuous referent which may nevertheless be highly
topical cataphorically, as in e.g. existential-presentative clauses.

2.2. High-continuity devices

Consider first the contrast between zero anaphora and unstressed anaphoric pronouns in
English:

(8) Unstressed anaphoric pronoun vs. zero:


John went to the mirror, [Ø] examined his hair, [Ø] sighed and [Ø] turned.
a. Then he walked out.
b. *Then [Ø] walked out

Both the unstressed anaphoric pronoun in (8a) and anaphoric zero in (8b) signal maximal referential
continuity. Yet (8b) is an inappropriate continuation, because zero anaphora cannot be used in
English across chain boundaries, only in chain-medial junctures.
Consider next the contrast between unstressed ('anaphoric') and stressed ('independent')
pronouns:

(9) Unstressed/anaphoric vs. stressed/independent pronouns:


Mary talked to Marcie for a while.
a. Then she left. (= Mary left)
b. Then SHE left. (= Marcie left)

The unstressed anaphoric pronoun in (9a) signals referential continuity (SS). The stressed
independent pronoun in (9b) signals referential discontinuity or switch reference (DS). This use of
stressed independent pronouns also applies to objects. Thus, consider the complex subject-object
switches in (10) below, all of them in chain-medial contexts:

(10) John slapped Marcie, then SHE slapped HIM, then HE left in a huff and SHE left too.

In Spanish, where subject pronominal agreement is obligatory, the two highest-continuity
devices, anaphoric pronouns (7d) and zero anaphora (7e), have merged into a single device, subject
pronominal agreement, which can be used at both chain-medial and cross-chain boundaries. Thus
compare the continuation in (11a,b) below with (9a,b) above:

(11) Juan volvi-ó a la casa y comi-ó su cena.


J. returned-3s to the house and ate-3s his dinner
'John went back to the house and ate his dinner.

a. Luego sali-ó de nuevo.


then got.out-3s of new
Then he went out again'.

b. *Luego él sali-ó de nuevo.


then 3s got.out-3s of new
*Then HE went out again'.

The infelicity of (11b), in both Spanish and English, is due to the fact that it implies switch
reference (and contrast) where none is warranted by the context. Such a contrast, now used
appropriately, is seen in (12b) below, motivated there by the context and fully corresponding to the
English usage in (9b) above:

(12) Maria habl-ó con Mercedes.


Mary talked-3s with Mercedes.
'Mary talked with Mercedes.

a. Luego volvi-ó a la casa.


Then returned-3s to the house
'Then she went home' (she = Mary)

b. Luego ella volvi-ó a la casa.


Then she returned-3s to the house
'Then SHE went home' (she = Mercedes)

A similar functional distribution, with obligatory grammatical agreement collapsing the function of
zero anaphora and unstressed/anaphoric pronouns of English, can be seen in other languages with
well-marked obligatory subject-agreement paradigms, such as Hebrew or Swahili.
In languages such as Japanese or Chinese, which have no unstressed anaphoric pronouns,
zero anaphora codes both chain-medial and cross-chain referential continuity, thus corresponding
to pronominal agreement in Spanish. Ute (No. Uto-Aztecan) is roughly in this typological ball-park,
since its unstressed clitic pronouns are optional, and roughly 70% of continuous referents are still
zero-coded.[FN 8] As an illustration, consider the following story-initial sequence:[FN 9]

(13) a. yoghovu-chi 'u, [Ø] pagha'ni-na-pu-ga-'ura,


Coyote/S the/S walk.about-HAB-REM-be
'Coyote, he kept wandering about,

b. kach [Ø] 'ini-a-sapa paqha-na-pu--a, [Ø] 'o-o--'ay-kwa-pu-ga,


NEG WH-O-MOD kill-HAB-REM-NEG bone-be-go-REM
he hadn't killed anything (for a long time), he became bone-skinny,

c. ka-'ini-aa-sapa [Ø] paqha-na-pu--a,


NEG-WH-O-MOD kill-HAB-REM-NEG
he hadn't killed anything (for a long time),

d. [Ø] tu-gu-y-whqa-vo-ro--na-pu-ga-'ura...
hungry-search-walk-HAB-REM-be
he was walking about searching hungry...

A second participant is now introduced as the subject of a presentative construction,
with a hedge (14e) below, then as an object (14f). And an independent pronoun is used in
(14f) for switch-subject to the new referent, as in English and Spanish. Such switching is
repeated several times in succession (14g,h). Thus, with Coyote still the topical referent:[FN 10]

(14) e. ...'ú-vway-aqh-'ura 'ú-vwaa-tu--'ura 'íni-kway 'ura-pu-ga...


there-at-it-be there-at-DIR-be WH-MOD be-REM
...Then, right there, there was what's-his-name...

f. mukwapi [Ø] maay-pu-ga, 'uwas-kway pacha'ay-kyay-ku.


spider/O see-REM 3s/S-TOP stick-ANT-SUB
he saw a spider, as HE (spider) was stuck (there).

g. 'ú-vway-aqh-'ura 'uwas magu-ni-pu-ga, [Ø] tu-ka-vaa-chi-'u.


there-at-it-be 3s/S pounce-REM eat-IRR-NOM-3s
so right away HE (Coyote) pounced, intending to eat it (spider).

h. 'u-vyay-aqh-'ura 'uwas-'ura 'áy-pu-ga...


there-at-it-be 3s/S-be say-REM
so then HE (Spider) said...'

2.3. Low continuity--discontinuity--devices

We have already seen how stressed independent pronouns function as switch-reference
devices. Such a use of these pronouns is most typically found in chain-medial contexts, where two
participants alternate as the topical referent. By using the pronoun alone in such contexts, the
speaker signals to the hearer: "Go back to the previous occurrence of a different referent and
reinstate it", as in (14f,g,h) above. As a result, the anaphoric distance between the current and
previous occurrence of the referent in such mid-chain switches tends to be 2-3 clauses.[FN 11]
Full NPs, in contrast to stressed independent pronouns, are used either to introduce into the
discourse brand new ('indefinite') referents, or to re-introduce old ('definite') referents after a
considerable gap of absence. When an indefinite NP is slated to be topical/important, and thus
persist in the subsequent discourse, most commonly some presentative device is used in its first
introduction. Such devices most typically code the new topical referent as the subject of a
presentative clause, as in English existential clauses. In Ute, the equivalent of such presentative
devices involves the use of an independent pronoun in combination with the full NP. Thus compare:

(15) a. English:
Once there was a wizard, he lived in Africa, he went to China to get a lamp....
b. Ute:
'uwas-'ura yoghovu-chi 'ura-pu-ga; khura tu-gu-y-naru'a-puga, tu-kua-tu-gu-y-narua-pu-ga...
3s/S-be coyote/S be-REM then hunger-buy-REM meat-hunger-buy-REM
'There was once Coyote; well he got hungry, he got meat-hungry...'

But new referents are commonly introduced into discourse as indefinite objects, and only
later are upgraded into higher topicality--and re-introduced as definite subjects. This is the Ute
strategy in (14f) above, where 'spider' is introduced first as an indefinite object and then immediately
upgraded to subject in the next clause, coded now by a stressed independent subject pronoun:

(14) f. mukwapi [Ø] maay-pu-ga, 'uwas-kway pacha'ay-kyay-ku...


spider/O see-REM 3s/S-TOP stick-ANT-SUB
he saw a spider, as HE (spider) was stuck (there)...'

When old referents are re-introduced into the discourse after a gap of absence greater than
2-3 clauses, they are most commonly re-introduced as definite NPs. When the old referent is
brought back across a chain or paragraph boundary, with a gap of absence--anaphoric distance--of
10-20 clauses, special chain-initial reorientation devices (RD; see (6) above) are used, most often
with a pause (intonation break) that renders the construction paratactic rather than syntactic. Such
a re-orientation device may be an L-dislocation construction, a long conjunction, an ADV-phrase
or an ADV-clause. And these devices can be ranked in terms of the anaphoric distance (AD) to the
previous mention of the referent, or the depth and complexity of the preceding context vis-a-vis
which the re-orientation proceeds. That is:

(16) Common chain-initial re-orientation devices:


Shorter-distance re-orientation
a. Subject L-dislocation:
...Now the other guy, he quit, just took off and vanished...
b. Object L-dislocation:
...Now the other guy, we saw him just once, then he took off, just vanished...
c. Conjunction:
...But then the other guy took off and vanished...
d. Adverbial phrase:
...The next minute, the other guy took off, just vanished...
e. ADV-clause:
...After she finished talking, the other guy took off...
Longer-distance re-orientation

L-dislocation (16a,b) is of considerable interest in studying the diachrony of pronominal
agreement. At least prima facie, it displays two features that can overlap with pronominal
agreement--once the paratactic L-dislocation clause is condensed into a simple syntactic clause:
• The L-dislocated NP is co-referent to the following anaphoric pronoun.
• That unstressed anaphoric pronoun is adjacent to the verb and can readily cliticize to it.
This topic will be discussed in some detail in ch. 3, below.[FN 12]

3. Quantitative distribution of major referent-coding devices

3.1. Preliminaries

In the preceding section we identified three clusters of major referent-coding devices in
terms of their anaphoric continuity:

(17) Expected anaphoric distance of referent-coding devices:


continuity        devices                 anaphoric distance
==============    ====================    ==================
highest           zero                    1 clause
(chain-medial)    unstressed pronouns
                  pronominal agreement
------------------------------------------------------------
intermediate      stressed pronouns       2-3 clauses
(chain-medial)
------------------------------------------------------------
lowest            full NPs                > 3 clauses
============================================================
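
The measure behind (17), anaphoric distance (AD), can be sketched computationally as the number of clauses separating a mention from the previous mention of the same referent. The sketch below is illustrative only: the toy clause list, the function names `anaphoric_distances` and `expected_cluster`, and the treatment of first mentions as `None` are assumptions, not the coding procedure used in the studies cited in this chapter.

```python
from typing import Dict, List, Optional, Set, Tuple


def anaphoric_distances(clauses: List[Set[str]]) -> List[Tuple[int, str, Optional[int]]]:
    """For each mention, return (clause index, referent, AD in clauses).

    AD is the gap in clauses back to the previous mention of the same
    referent. A first mention has no antecedent and gets None (an assumption).
    """
    last_seen: Dict[str, int] = {}
    out = []
    for i, referents in enumerate(clauses):
        for ref in sorted(referents):
            prev = last_seen.get(ref)
            out.append((i, ref, None if prev is None else i - prev))
            last_seen[ref] = i
    return out


def expected_cluster(ad: Optional[int]) -> str:
    """Map an AD value onto the three clusters of (17)."""
    if ad is None or ad > 3:
        return "full NP"
    if ad >= 2:
        return "stressed pronoun"
    return "zero / unstressed pronoun / pronominal agreement"


# Toy text, one set of referents per clause (hypothetical data).
toy = [{"Mary", "Marcie"}, {"Mary"}, {"Mary"}, {"Marcie"}]
result = anaphoric_distances(toy)
for clause_index, referent, ad in result:
    print(clause_index, referent, ad, expected_cluster(ad))
```

In the toy text, the last mention of Marcie comes three clauses after her previous mention, so the sketch places it in the intermediate (stressed pronoun) cluster.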

In this section we will present quantitative evidence, obtained from the study of written or oral
discourse across a number of languages, that would back up these general predictions.

3.2. English

English is a rigid SVO language using four major referent-coding anaphoric devices: zero,
unstressed/anaphoric pronouns, stressed/independent pronouns and full definite NPs. In Table (18)
below a comparison is given of the mean anaphoric distance (AD) values for these four devices
in written English narrative, re-computed from Brown (1983).

(18) Mean AD values of major referent coding devices in written English


category          N        mean AD value
==============    =====    =============
zero                314     1.00
unstressed PRO    1,162     1.72
stressed PRO         27     2.27
definite NP       1,023    16.66
========================================

The comparable values for spoken English narrative are given in Table (19) below, re-computed
from Givón (1983b).

(19) Mean AD values of major referent coding devices in spoken English

category          N      mean AD value
==============    ===    =============
zero              117     1.0
unstressed PRO    336     1.0
stressed PRO       75     3.75
definite NP        69    10.15 [FN 13]
======================================

Within bounds, both written and spoken English conform to the expected values in (17). What is
more, the high text-frequency of zero and unstressed pronouns underscores their use as high-
continuity devices.
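
One way to make "conform to the expected values" concrete is to check that mean AD never decreases along the scale from zero through unstressed and stressed pronouns to definite NPs. The sketch below does this for the figures in Tables (18) and (19); the dictionary layout and the helper name `is_monotonic` are assumptions introduced here for illustration.

```python
# Mean AD values copied from Tables (18) and (19), ordered from the
# highest-continuity device to the lowest.
SCALE = ["zero", "unstressed PRO", "stressed PRO", "definite NP"]

written = {"zero": 1.00, "unstressed PRO": 1.72, "stressed PRO": 2.27, "definite NP": 16.66}
spoken = {"zero": 1.0, "unstressed PRO": 1.0, "stressed PRO": 3.75, "definite NP": 10.15}


def is_monotonic(mean_ad: dict) -> bool:
    """True if mean AD is non-decreasing along SCALE (ties are allowed)."""
    values = [mean_ad[device] for device in SCALE]
    return all(a <= b for a, b in zip(values, values[1:]))


written_ok = is_monotonic(written)
spoken_ok = is_monotonic(spoken)
print(written_ok, spoken_ok)
```

Ties are allowed because zero and unstressed pronouns share the same mean AD in the spoken sample.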

3.3. Ute

Ute is a flexible-order ex-SOV language with a high text-frequency of anaphoric zeros. It
also employs optional, low-frequency, unstressed anaphoric pronouns, and those can cliticize on any
word-type, often on the first word in the clause (so-called '2nd position clitics'), most commonly on
the verb.[FN 14] Table (20) below, re-computed from Givón (1983c), summarizes the mean AD
values of the major referent-coding devices in spoken Ute narrative.

(20) Mean anaphoric distance values of major referent-coding devices in spoken Ute

category          order    N      mean AD value
==============    =====    ===    =============
zero                       321     1.21
unstressed PRO              42     1.54
stressed PRO      SV        75     2.80
                  VS        61     1.95
                  OV        12     2.41
                  VO         1     1.00
definite NP       SV        39    10.84
                  VS        25     1.48
                  OV        34     9.67
                  VO        13     4.46
===============================================

Within bounds, the AD figures for Ute conform to the predictions made in (17) above, but with one
crucial exception--the low AD value for post-verbal (VS) subject NPs and, to a lesser extent, of post-
verbal (VO) object NPs. This effect of flexible word-order will be discussed further below.
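
The word-order effect can be made visible by pooling the definite-NP rows of Table (20). The sketch below computes N-weighted mean AD values for pre-verbal vs. post-verbal definite NPs. The data structure and the function name `pooled_mean` are illustrative assumptions, and the pooled figures are derived here rather than reported in the source.

```python
# (N, mean AD) for definite NPs in spoken Ute, copied from Table (20).
definite_np = {"SV": (39, 10.84), "VS": (25, 1.48), "OV": (34, 9.67), "VO": (13, 4.46)}


def pooled_mean(rows):
    """N-weighted mean of per-row means, i.e. the mean over all pooled tokens."""
    total_n = sum(n for n, _ in rows)
    return sum(n * mean for n, mean in rows) / total_n


pre_verbal = pooled_mean([definite_np["SV"], definite_np["OV"]])
post_verbal = pooled_mean([definite_np["VS"], definite_np["VO"]])
print(f"pre-verbal: {pre_verbal:.2f}, post-verbal: {post_verbal:.2f}")
```

Weighting by N matters here because the four word-order cells differ considerably in size.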

3.4. Biblical Hebrew

Early Biblical Hebrew (EBH) is a VO language with flexible subject position (VS vs. SV)
and a strong statistical tendency to VSO. The two major verbal conjugations, the suffixal perfect
and the prefixal perfective/irrealis, have obligatory subject pronominal agreement. Object
pronominal agreement on the verb is optional, and alternates with unstressed object pronouns written
as separate words (as in English). Since subject pronominal agreement is obligatory in the main
conjugations (perfect, perfective, irrealis), zero anaphora is rare, found mostly in non-verbal
(nominal, participial) clauses. Table (21) below, re-computed from Fox (1983), summarizes the
anaphoric distance values for the major reference-coding devices in Early Biblical Hebrew.

(21) Mean anaphoric distance values of major referent-coding devices in Biblical Hebrew

category          role/order    N      mean AD value
==============    ==========    ===    =============
pro-AGR           S             295     1.10
                  O              57     1.10
stressed PRO      S              87     2.87
                  O              52     1.17
definite NP       SV            142     9.86
                  VS            357     6.51
                  OV             12    25.08
                  VO            267    12.30
====================================================

The AD figures for pronominal agreement and stressed subject pronouns conform, in the main, to
the predictions in (17), above. The effect of the pragmatically-controlled word-order on the AD
values of definite NPs will be discussed further below.

3.5. Spoken Spanish

Spanish is a rigid VO language with a flexible subject position (SV vs. VS) and obligatory
subject agreement in all verbal conjugations. It is thus typologically similar to Biblical Hebrew,
above. Unstressed anaphoric object pronouns are cliticized to the verb, pre-verbally (OV) in most
finite conjugations and post-verbally (VO) in the infinitive and imperative conjugations. The mean
anaphoric distance values for the various referent-coding devices in spoken Venezuelan Spanish are
given in Table (22) below, re-computed from Bentivoglio (1983).

(22) Mean anaphoric distance values of major referent-coding devices in spoken Spanish

category          role/order    N      mean AD value
==============    ==========    ===    =============
pro-AGR           S             328     1.30
                  O             137     1.65
                  DAT           112     1.50
stressed PRO      SV            133     1.90
                  VS             11     1.64
                  VO              6     1.50
definite NP       SV             34     4.20
                  VS             10     2.50
                  VO             20     8.57
====================================================

Within bounds, these results conform to the predictions given in (17), above. As in Biblical
Hebrew, a word-order effect is also discernible in Spanish, with post-verbal subject (VS) coding
more continuous referents--lower AD values--than pre-verbal subjects (SV).

3.6. Japanese

Japanese is a rigid SOV language with no unstressed anaphoric pronouns or verb pronominal
agreement. The AD values reported below, re-computed from Hinds (1983), cover oral narrative,
female-female conversation, and male-male conversation. Table (23) below summarizes the results
for spoken Japanese narrative.

(23) Mean AD values of major referent-coding devices in Japanese spoken narrative

category          N      mean AD value
==============    ===    =============
zero               50     1.10
stressed PRO        /     /
definite NP       147     6.87
======================================

Table (24) below summarizes the results for the female-female conversation.

(24) Mean AD values of major referent-coding devices in Japanese female-female conversation

category          N      mean AD value
==============    ===    =============
zero              108     1.55
stressed PRO       11     4.35
definite NP        25    13.5
======================================

Table (25) below summarizes the results for the male-male conversation.

(25) Mean AD values of major referent-coding devices in Japanese male-male conversation

category          N      mean AD value
==============    ===    =============
zero              114     3.10
stressed PRO       27     5.27
definite NP        65    10.5
======================================

The results of the Japanese AD measures for narrative and female-female conversation
conform, in the main, to the prediction in (17). The results for the male-male conversation stand out
in two categories--zero anaphora and stressed pronouns. Both seem to be used in contexts of much
lower referential continuity--higher AD values--than expected. Such usage may be due to the higher
informational predictability in face-to-face conversation between intimate interlocutors in this
particular dyad. It may also be due to a more careless style of verbal interaction among males.

3.7. Mandarin Chinese

Mandarin Chinese is a rigid SVO language, with an extensive use of zero anaphora and no
unstressed anaphoric pronouns, in this respect rather similar to Japanese. The correlation between
grammatical role--subject vs. direct object--and frequency of zero anaphora, stressed pronouns and
full NPs in Mandarin was studied by Pu (1997). Her results are reproduced in Table (26) below.

(26) Grammatical role and frequency of zero anaphora in Mandarin oral narrative

          full NP           stressed PRO      ZERO              TOTAL
          ==============    ==============    ==============    ==============
role      N       %         N       %         N       %         N       %
======    ====    ====      ====    ====      ====    ====      ====    =====
S         822     40.2      398     19.4      829     40.4      2046    100.0
DO        648     85.3      65      8.5       47      6.2       760     100.0
others    525     97.9      /       0.0       11      2.1       563     100.0
==============================================================================
total                                         887

The bulk of zero anaphors in the Mandarin text--829 out of 887 or 93.5%--code the subject
participant, the most topical and most continuous in discourse. Conversely, 40.4% of all subjects are
zero-coded, as compared to only 6.2% of direct objects and 2.1% of other roles.
Pu (1997) also studied the cataphoric persistence of the referents occupying the subject vs.
object grammatical role, expressed in terms of 0-2 occurrences in the subsequent 10 clauses vs. >2
occurrences. The pooled results are reproduced in Table (27) below.

(27) Grammatical role and the cataphoric persistence of subjects vs. objects
     in Mandarin oral narrative

           0-2 occur.        >2 occur.          TOTAL
          ============     ============     ============
role          N      %         N      %         N      %
======    =====  =====     =====  =====     =====  =====
S           430   21.0      1616   79.0      2046  100.0
DO          659   86.7       101   13.3       760  100.0
========================================================

Subject referents in Mandarin, which account for 93.5% (829 of 887) of the zero anaphors in the
text, exhibit higher cataphoric persistence--thus higher topicality--in 79.0% of their occurrences
in text. In contrast, direct objects, only 6.2% of which are zero-coded, exhibit lower cataphoric
persistence--thus lower topicality--in 86.7% of their occurrences in text. The effect of grammatical
relations on the cataphoric continuity of referents will be discussed further in ch. 7, below.
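
The persistence measure described above (occurrences of a referent in the 10 clauses following a mention, binned as 0-2 vs. >2) can be sketched as follows. This is a minimal illustration under stated assumptions, not Pu's (1997) actual procedure: each clause is represented as a set of referent labels, an "occurrence" is taken to mean a clause that mentions the referent, and the toy text and function names are hypothetical.

```python
def cataphoric_persistence(clauses, index, referent, window=10):
    """Count how many of the next `window` clauses after `index` mention `referent`."""
    following = clauses[index + 1 : index + 1 + window]
    return sum(1 for clause in following if referent in clause)

def persistence_class(count):
    """Bin a persistence count into the two classes used in Table (27)."""
    return ">2" if count > 2 else "0-2"

# Hypothetical toy text: each clause is the set of referents it mentions.
toy_text = [
    {"coyote"},
    {"coyote", "skunk"},
    {"coyote"},
    {"skunk"},
    {"coyote"},
    {"coyote"},
]

coyote_count = cataphoric_persistence(toy_text, 0, "coyote")  # mention in clause 0
skunk_count = cataphoric_persistence(toy_text, 1, "skunk")    # mention in clause 1
print(coyote_count, persistence_class(coyote_count))
print(skunk_count, persistence_class(skunk_count))
```

In the toy text the first referent recurs in four of the following clauses and falls in the ">2" class, while the second recurs only once and falls in the "0-2" class.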

3.8. Word order and referential continuity

As noted earlier, several of the languages considered above deploy some word-order
variation--SV vs. VS or OV vs. VO--as part of the inventory of devices used to code referential
continuity or topicality. In this section we will consider briefly three languages: spoken English
(rigid SVO), spoken Ute (flexible word-order), and Early Biblical Hebrew (rigid VO, flexible VS-
SV).[FN 15]

3.8.1. Word-order and referential continuity in spoken English

In Table (28) below we recapitulate the AD figures listed in Table (18) above for spoken
English narrative (Brown 1983), adding for comparison the values for L-dislocated (fronted) and
R-dislocated (post-posed) definite NPs from another study (Givón 1983b).

(28) Mean anaphoric distance values of major referent-coding devices
     in spoken English

category                 N         %      mean AD value
=================    =====    ======    =============
zero                   117      18.1         1.0
unstressed PRO         336      52.1         1.0
stressed PRO            75      11.6         3.75
definite NP (SVO)       69      10.7        10.15
L-dislocated NP         44       6.8        15.34
R-dislocated NP          4       0.62        1.00
=====================================================
TOTAL:                 645     100.0

Several things are striking about these recapitulated results. First the combined high-
continuity devices--zero anaphora and unstressed pronouns--constitute 70.2% of the total sample
of nominal referents in the text. This underscores the use of these two devices to code maximally-
continuous referents, as is also suggested by their identical 1.0--one clause back--AD values.
The average AD value for definite NPs in the most common SVO order of English,
comprising 10.7% of the total referents in the text, is 10.15 clauses back. L-dislocated NPs, at 6.8%
of the total sample, display an even higher AD value--15.34 clauses back. That is, L-dislocation
is used in spoken English to code referents that are brought back into the discourse after a large gap
of absence, easily transcending the length of the current clause-chain or even the current paragraph.

Lastly, R-dislocated NPs, at a minuscule 0.62% of the total sample, code referents with the
same high referential continuity--1.0 AD--as zero anaphora and unstressed pronouns. Whatever the
communicative function of R-dislocation may be, it has little to do with referential continuity.
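
The AD measure behind Table (28) counts the number of clauses back to a referent's most recent prior mention, so that a value of 1.0 means "one clause back". A minimal sketch of such a count is given below. It rests on assumptions that are not stated in this section: clauses are represented as sets of referent labels, the look-back is capped at 20 clauses, and the cap value is returned when no antecedent is found. The function name and toy data are hypothetical.

```python
def anaphoric_distance(clauses, index, referent, cap=20):
    """Return the number of clauses back from `index` to the nearest prior mention.

    If no mention is found within `cap` clauses, return `cap` (an assumed convention).
    """
    for distance in range(1, min(index, cap) + 1):
        if referent in clauses[index - distance]:
            return distance
    return cap

# Hypothetical toy text: each clause is the set of referents it mentions.
toy_text = [{"a"}, {"b"}, {"b"}, {"a", "b"}]

ad_a = anaphoric_distance(toy_text, 3, "a")      # last mentioned three clauses back
ad_b = anaphoric_distance(toy_text, 3, "b")      # mentioned in the preceding clause
ad_first = anaphoric_distance(toy_text, 0, "a")  # no antecedent at all
print(ad_a, ad_b, ad_first)
# 3 1 20
```

A mean AD value for a coding device is then simply the average of these distances over all tokens of that device in the text.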

3.8.2. Word order and referential continuity in spoken Ute

Table (29) below recapitulates the AD values of the various referent-coding devices in
spoken Ute narrative given in (20) above. The recapitulation highlights the contrast between
pre-verbal (SV, OV) and post-verbal (VS, VO) referents.

(29) Mean AD values of major referent-coding devices in spoken Ute

category                   N         %      mean AD value
===================    =====    ======    =============
zero                     321      51.5         1.21
unstressed PRO            42       6.7         1.54
stressed PRO    SV        75      12.0         2.80
                VS        61       9.8         1.95
                OV        12       1.9         2.41
                VO         1       0.16        1.00
definite NP     SV        39       6.2        10.84
                VS        25       4.0         1.48
                OV        34       5.4         9.67
                VO        13       2.1         4.46
=======================================================
TOTAL:                   623     100.0

As in English, referents that are placed post-verbally (VS, VO) have a much lower AD value
than those placed pre-verbally (SV, OV). That is, post-verbal position marks referents with much
higher referential continuity, with AD values--1.95, 1.00, 1.48, 4.46--approximating those of zero
anaphora and unstressed clitic pronouns (1.21-1.54).
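
One way to summarize the pre-verbal vs. post-verbal contrast is to pool the category means in Table (29), weighting each by its token count. The pooled figures below are derived here for illustration and are not reported in the source; the names `PRE_VERBAL`, `POST_VERBAL` and `pooled_mean` are hypothetical.

```python
# (N, mean AD) pairs for stressed pronouns and definite NPs, copied from Table (29).
PRE_VERBAL = [(75, 2.80), (12, 2.41), (39, 10.84), (34, 9.67)]   # PRO SV, PRO OV, NP SV, NP OV
POST_VERBAL = [(61, 1.95), (1, 1.00), (25, 1.48), (13, 4.46)]    # PRO VS, PRO VO, NP VS, NP VO

def pooled_mean(rows):
    """Weight each category's mean AD by its token count N."""
    total_n = sum(n for n, _ in rows)
    return sum(n * ad for n, ad in rows) / total_n

pre = pooled_mean(PRE_VERBAL)
post = pooled_mean(POST_VERBAL)
print(f"pre-verbal: {pre:.2f}  post-verbal: {post:.2f}")
# pre-verbal: 6.19  post-verbal: 2.15
```

On this pooling, post-verbal referents sit much closer to the zero and clitic-pronoun range (1.21-1.54) than pre-verbal ones do.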
Table (30) below lists the distribution of various referent-marking devices in contexts of high
thematic continuity (paragraph-medial) vs. low thematic continuity (paragraph-initial) in spoken
Ute narrative, re-computed here from Givón (1983c).

(30) Distribution of the various referent-coding categories in contexts of high
     thematic continuity (paragraph-medial) vs. discontinuity (paragraph-initial)
     in spoken Ute

                    paragraph-initial    paragraph-medial         total
                    =================    ================    ================
category                 N       %           N       %           N       %
==============      ======  ======      ======  ======      ======  ======
zero                     1     0.4         320    99.6         321   100.0
clitic PRO               /       /          42   100.0          42   100.0
indep-PRO   SV          26    34.0          49    66.0          83   100.0
            VS           6     9.0          55    91.0          61   100.0
DEF-NP      SV          15    38.0          24    62.0          39   100.0
            VS           3    12.0          22    88.0          25   100.0
==========================================================================

First, the overwhelming distribution of the high-continuity referent-coding devices, zero and
unstressed clitic pronouns, in paragraph-medial contexts--99%-100%--demonstrates vividly how
referential and thematic continuity march hand in hand.
Second, both independent subject pronouns and full subject NPs placed post-verbally (VS)
appear much more frequently in the paragraph-medial contexts of high thematic continuity--88%-
91%--than pre-verbal subject NPs (SV; 62%-66%). This underscores the fact that referential and
thematic continuity march in tandem.

3.8.3. Word-order and referential continuity in Early Biblical Hebrew[FN 16]

Early Biblical Hebrew (EBH) is a rigid VO language with the pre-verbal position (SV, OV)
reserved for discontinuous referents. This word-order device interacts with the tense-aspect
system, so that full-NP continuous referents, overwhelmingly post-verbal (VS, VO), tend to appear
in clauses marked with the perfective (prefixal) conjugation. In contrast, discontinuous referents,
most commonly pre-verbal (SV, OV), tend to appear in clauses marked with the perfect or
imperfective conjugations. As an example, consider the opening episode of Genesis. The first 4
clauses (31a,b,c,d) introduce new referents in rapid succession, first in perfect-marked clauses
(31a,b), then the non-verbal (31c), then the imperfective (31d):[FN 17]

(31) a. bə-re'shit bara' elohim 'et-ha-shamayim we-'et-ha-'arets,  (ADV-V)
        at-beginning create/PERF/3sm God ACC-the-heaven and-ACC-the-earth
        'In the beginning God created the heaven(s) and the earth,

     b. we-ha-'arets hay-ta tohu va-vohu,  (S-V)
        and-the-earth be/PERF-3sf chaos and-confusion
        and the earth was all chaos and confusion,

     c. və-ħoshekh ʕal pney ha-tə'om,  (S-V)
        and-darkness on face/of the-precipice
        and darkness over the precipice,

     d. wə-ruaħ 'elohim məraħf-et ʕal pney ha-mayim;  (S-V)
        and-spirit/of God hover/IMPFV-sf on face/of the-water
        and the spirit of God (was) hovering over the water;

Once the scene has been set, the continuous narrative with a recurring referent switches to
the VS order and the perfective tense-aspect:

(32) e. wa-yo-'mar 'elohim: "yə-hi 'or!",  (V-S)
        and-3sm-say/PFV God 3sm-be/IRR light
        and God said: "Let there be light!",

     f. wa-yə-hi 'or;  (V-S)
        and-3sm-be/PFV light
        and there was light;

     g. wa-ya-r' 'elohim 'et-ha-'or ki-ţov  (V-S)
        and-3sm-see/PFV God ACC-the-light SUB-good
        and God saw that the light was good,

     h. wa-ya-vdel 'elohim beyn ha-'or u-veyn ha-ħoshekh,  (V-S)
        and-3sm-divide/PFV God between the-light and-between the-dark
        and God divided the light from the dark,

     i. wa-yi-qra' 'elohim l-a-'or yom,  (V-S)
        and-3sm-call/PFV God to-the-light day
        and God named the light day,

Next, a new object is contrasted with the preceding object, precipitating a switch to the OV
order and the perfect tense-aspect:

(33) j. wə-l-a-ħoshekh qara' layla;  (O-V)
        and-to-the-dark call/PERF/3sm night
        and the dark he named night;

The episode then closes in the continuous mode once again, with VS order and the
perfective tense-aspect, even though the two subjects ('evening', 'morning') are new, if
unimportant:

(34) k. wa-yə-hi ʕerev  (V-S)
        and-3sm-be/PFV evening
        and there came the evening,

     l. wa-yə-hi boqer yom 'eħad.  (V-S)
        and-3sm-be/PFV morning day one
        and there came the morning of day one'. (Genesis, 1:1-5)

Table (35) below summarizes the frequency distribution of the main tense-aspect
conjugations in two EBH books (Genesis, Kings-II). The prefixal conjugation, strongly associated
with the VS word-order, is a merger of the perfective and irrealis tense-aspects, both used to carry
the bulk of in-sequence new information, the foregrounded backbone of the narrative. The
suffixal conjugation, strongly associated with the SV word-order, carries mostly the perfect tense-
aspect function, with some subjunctive use (see further below). The nominal/participial conjugation,
rather infrequent in the text, carries the imperfective tense-aspect function, and is also strongly
associated with the discontinuous SV word-order.[FN 18]

(35) Overall frequency distribution of tense-aspects in EBH

                     Genesis               Kings-II
                ================      ================
tense-aspect        N        %            N        %
============    =======  =======      =======  =======
prefixal            480    69.7%          912    74.8%
suffixal            181    26.2%          209    17.8%
imperfective         28     5.1%           98     7.4%
============    =======  =======      =======  =======
TOTAL:              689                 1,219

As is to be expected, the prefixal conjugation, associated with referential and thematic continuity,
comprises 70%-75% of the total sample.
Consider now the numerical association, given in Table (36) below for the Genesis text,
between the tense-aspect conjugations and word-order.[FN 19]

(36) Subject position and tense-aspect in Genesis

                                     tense-aspect conjugation
                        ==================================================
                           prefixal          suffixal        imperfective
                        ==============    ==============    ==============
category                    VS      SV        VS      SV        VS      SV
====================    ======  ======    ======  ======    ======  ======
Main clause:
  no fronted non-S         168       /         1      21         5      76
  fronted non-S              /       /        13       /         2      16
  PRO-obj                    9       /         /       /         /       /
  PRO-subj                   /       5         /       4         3      21
  negative                   4       /         1       /         1       2
  irrealis                   8       7         3       /         /       /
====================    ======  ======    ======  ======    ======  ======
total main clause          189      12        18      25        11     115
%                           94.0%             58.1%             91.1%
Subordinate clause:
  OBJ-REL-clause             2       /        12       /         /       /
  ADV/V-COMP                 2       1        13       /        12       1
  OBJ-WH-question            2       /         1       /         /       /
====================    ======  ======    ======  ======    ======  ======
total subord. clause         6       1        26       /        12       1
%                           85.7%             96.2%             92.3%
==========================================================================
TOTAL:                     195      13        44      25        23     116
%                           93.7%             63.7%             83.4%

The main facets of the association between tense-aspect and word-order in EBH may be
summarized as follows:
• In main clauses marked by the prefixal (mostly perfective) conjugation, 94.0% of the full-NP
subjects come in the VS word order.
• In contrast, in main clauses marked by the suffixal (mostly perfect) conjugation, 58.1% of the
full-NP subjects come in the SV order. The figure is even higher in the nominal/participial
(imperfective) conjugation--91.1% SV.
• In subordinate clauses, which constitute a much smaller part of the sample and tend to code
discontinuous backgrounded information, the VS word order predominates in all three conjugations
(85.7%, 96.2%, 92.3%).

4. Summary: From typology to diachrony

The facts discussed above suggest the following cross-language functional-typological
clustering of the four highest-continuity referent-coding devices--zero, unstressed/clitic pronouns,
pronominal agreement and stressed/independent pronouns:

(37) Functional-typological clustering of high-continuity referent-coding devices

                     (a)             (b)                      (c)
                     Japanese        Ute                      Spanish
REF continuity:      Mandarin        English                  Hebrew
===============      ============    =====================    ==============
highest              zero            zero                     --------------

                     ------------    unstressed/clitic PRO    pronominal AGR

lower                stressed PRO    stressed PRO             stressed PRO
============================================================================

While the functional and distributional arguments for the three typological clusters are well
supported by the facts surveyed above, these three clusters in fact represent three diachronic stages,
involving four diachronic developments:
• The evolution of stressed independent pronouns from demonstrative pronouns;
• The evolution of unstressed/clitic anaphoric pronouns from stressed independent pronouns;
• The evolution of obligatory pronominal agreement from unstressed/clitic anaphoric pronouns;
and
• The decay of pronominal agreement, returning the language to the beginning of this diachronic
cycle--zero.
These diachronic developments are the subject of the two subsequent chapters (3,4).
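
The four developments listed above form a chain that can be written down as a simple mapping from each device to the device it gives rise to. The sketch below is only a schematic restatement of that list; the labels paraphrase the bullets, and the names `DEVELOPMENTS` and `pathway` are hypothetical.

```python
# Each entry maps a source device to the device it evolves (or decays) into,
# following the four developments listed in the text.
DEVELOPMENTS = {
    "demonstrative pronoun": "stressed independent pronoun",
    "stressed independent pronoun": "unstressed/clitic anaphoric pronoun",
    "unstressed/clitic anaphoric pronoun": "obligatory pronominal agreement",
    "obligatory pronominal agreement": "zero",
}

def pathway(start):
    """Follow the chain of developments from `start` until no further step is listed."""
    path = [start]
    while path[-1] in DEVELOPMENTS:
        path.append(DEVELOPMENTS[path[-1]])
    return path

full_path = pathway("demonstrative pronoun")
print(" > ".join(full_path))
```

Starting from a demonstrative pronoun, the chain passes through all four developments and ends at zero.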

ABBREVIATIONS OF GRAMMATICAL TERMS

ACC     accusative           3s     3rd person singular
ANT     anterior             3sf    3rd person singular feminine
DIR     directional          3sm    3rd person singular masculine
HAB     habitual
IMPFV   imperfective
IRR     irrealis
MOD     modal
NEG     negative
NOM     nominal
O       object
PERF    perfect
PFV     perfective
REM     remote
S       subject
SUB     subordinator
TOP     topic
WH      question marker

Footnotes

1
For a discussion of early childhood pidgin, second language pidgin and Broca's aphasic speech see
Givón (2009, chs 6,9,10,12). Brian MacWhinney (i.p.c.) points out that there is a considerable
variety in the linguistic behavior of Broca's Aphasia patients, so that only some of them show
classical agrammatism. The problem may be that several major language-processing centers are
crowded near the classical Broca's area (B45, B46), most conspicuously B47/12, which is strongly
implicated in lexical processing (Bookheimer 2002; Martin and Chao 2001). As a result, lesions in
the general Broca's area are seldom fully localized functionally.
2
For an earlier discussion, see Givón (1983a; 1988). Ultimately, maximal anaphoric predictability
may be translated into cognitive terms as continuing mental activation (see ch. 2, below). Linguists
have traditionally focused on the much-more-frequent anaphoric zero, ignoring the just-as-natural
if less-frequent cataphoric grounding of zero (see ch. 7).
3
One referential device left out here is pragmatic word-order. While interacting with anaphoric
predictability/continuity in a number of discourse contexts, word-order turns out to be also highly
sensitive to referential importance (Givón 1988). For some discussion of the anaphoric dimension
of word-order, see section 3.8, below.
4
While the hierarchic organization of discourse coherence is most conspicuous in narrative, it is not
fundamentally different in conversation. That is, in spite of the fact that conversation involves
changes of perspective ('turns'), coherent conversation still has a hierarchic structure roughly similar
to that of narrative, albeit more complex. This becomes clear when coherence is studied across
multiple turns. For an extensive discussion of this, see Chafe (1997), Coates (1997), Ervin-Tripp and
Kuntay (1997), and Linell and Korolija (1997).
5
For discussion and text-based measurements, see Givón (1991a; 2015, ch. 23).
6
For discussion and quantified cross-language studies, see Givón (ed. 1983).
7
For extensive discussion and quantified cross-language comparison, see Givón (ed. 1983, ed.
1997b). Pragmatic ('flexible', 'free') word-order is also an important referential device, interacting
with referent accessibility but sensitive primarily to cataphoric referent importance (Givón 1988).
See sec. 3.8, below.
8
See ch. 4.
9
"Hungry Coyote races Skunk for the prairie dogs", as told by Mollie B. Cloud (Givón ed. 2013)
10
Ibid.

11
See text counts further below.
12
Another potential paratactic precursor to pronominal agreement is R-dislocation, as in:
a. Subject R-dislocation: ... and he disappeared, John, I mean...
b. Object R-dislocation: ...and they saw him there, John, I mean...
The probability of R-dislocation being the diachronic precursor of subject agreement is lower,
however, since R-dislocation is typically a chain-final device, recapitulating a recurrent referent
that was marked by zero or pronominal agreement in the preceding clause.
13
Indefinite NPs were not counted here since they have no anaphoric antecedent.
14
See ch. 4.
15
For a more extensive discussion of the pragmatics of word-order flexibility see Givón (1988).
16
The description of Early Biblical Hebrew grammar here is taken from Givón (1977), revised in
Givón (2015, ch. 9).
17
The first clause is a presentative construction, fronting the time adverb 'in the beginning' and
precipitating the post-posing of the subject, i.e. an OVS order (TVX; Vennemann 1973).
18
See Givón (1977), revised in Givón (2015, ch. 9). See also Hopper (1979).
19
Ibid.

References

Bentivoglio, P. (1983) "Topic continuity and discontinuity in Discourse: A study in spoken Latin-
American Spanish", in T. Givón (ed. 1983)
Bookheimer, S. (2002) "Functional MRI of language: New approaches to understanding the cortical
organization of semantic processing", Ann. Review of Neuroscience, 25
Brown, C. (1983) "Topic continuity in written English", in T. Givón (ed. 1983)
Chafe, W. (1997) "Polyphonic topic development", in T. Givón (ed. 1997a)
Coates, J. (1997) "The construction of collaborative floor in women's friendly talk", in T. Givón (ed.
1997a)
Ervin-Tripp, S. and A. Küntay (1997) "The occasioning and structure of conversational stories", in T.
Givón (ed. 1997a)
Fox, A. (1983) "Topic continuity in Biblical Hebrew", in T. Givón (ed. 1983)
Givón, T. (1977) "The drift from VSO to SVO in Biblical Hebrew: The pragmatics of Tense-
Aspect", in C. Li (ed. 1977)
Givón, T. (ed. 1979) Discourse and Syntax, Syntax and Semantics 12, NY: Academic Press
Givón, T. (1983a) "Topic continuity in discourse: Introduction" in T. Givón (ed. 1983)
Givón, T. (1983b) "Topic continuity in spoken English" in T. Givón (ed. 1983)
Givón, T. (1983c) "Topic continuity in Ute", in T. Givón (ed. 1983)
Givón, T. (ed. 1983) Topic Continuity in Discourse: Quantitative Cross-Language Studies, TSL #3,
Amsterdam: J. Benjamins
Givón, T. (1988) "The pragmatics of word-order flexibility: Predictability, importance and
attention", in M. Hammond et al. (eds 1988)
Givón, T. (1991a) "Serial verbs and the mental reality of 'event' ", in E. Traugott and B. Heine (eds
1991, vol. I)
Givón, T. (ed. 1997a) Conversation, TSL #34, Amsterdam: J. Benjamins
Givón, T. (ed. 1997b) Grammatical Relations: A Functionalist Perspective, TSL #35, Amsterdam:
J. Benjamins
Givón, T. (2009) The Genesis of Syntactic Complexity, Amsterdam: J. Benjamins
Givón, T. (ed. 2013) Ute Texts, Amsterdam: J. Benjamins
Givón, T. (2015) The Diachrony of Grammar, 2 vols, Amsterdam: J. Benjamins
Hale, K. (1980) "Word-star languages", Mid-America Linguistics Conference, Lawrence:
University of Kansas (ms.)
Hale, K. (1982) "Preliminary remarks on configurationality", Proceedings of the Northeastern
Linguistics Society, pp.86-96
Hammond, M., E. Moravcsik and J. Wirth (eds 1988) Studies in Syntactic Typology, TSL #17,
Amsterdam: J. Benjamins
Hinds, J. (1983) "Topic Continuity in Japanese", in T. Givón (ed. 1983)
Hopper, P. (1979) "Aspect and foregrounding in discourse", in T. Givón (ed. 1979)
Huang, C. J. (1984) "On the distribution and reference of empty pronouns", Linguistic Inquiry, 15.4

Li, C.N. (ed. 1977) Mechanisms of Syntactic Change, Austin: University of Texas Press
Linell, P. and N. Korolija (1997) "Coherence in multi-party conversation", in T. Givón (ed. 1997a)
Martin, A. and L.L. Chao (2001) "Semantic memory and the brain: Structure and processes",
Current Opinion in Neurobiology, 11
Pu, M.-M. (1997) "Zero anaphora and grammatical relations in Mandarin", in T. Givón (ed. 1997b)
Traugott, E. and B. Heine (eds 1991) Approaches to Grammaticalization, TSL #19:1,2,
Amsterdam: J. Benjamins
Vennemann, T. (1973) "Topics, subjects and word-order: From SXV to SVX via TVX", paper read
at the First International Congress on Historical Linguistics, Edinburgh, Sept. 1973 (ms.)
Xu, L. (1986) "Free empty category", Linguistic Inquiry, 17.1
