Analysing Parallel Texts with ParaConc

Michael Barlow

Department of Linguistics, Rice University, Houston TX 77005, USA

KEYWORDS: parallel corpora concordance
AFFILIATION: Rice University, USA
E-MAIL: barlow@ruf.rice.edu
FAX NUMBER: 713-523-6543
PHONE NUMBER: 713-630-8761

Much of the current research on parallel corpora concerns the problem of automatic alignment of two texts that are translations of each other (Gale and Church 1994, Kay and Roscheisen 1994, Johansson and Hofland 1993). This paper, however, focusses on the analysis of aligned parallel corpora rather than on the aligning process itself.

In order to analyse a parallel corpus, a suitable text analysis program is needed. ParaConc is a simple parallel text concordance program, available in Macintosh and Windows versions, which was created by the author as a tool for linguistic research. This program allows the user to search for a word or phrase, in the way typical of concordance programs. However, the result of the search is displayed in two windows rather than one. The topmost window displays numbered lines containing each instance of the search term in the first language, along with its context. The lower window displays numbered sentences in the second language which correspond to the text displayed in the first language in the upper window. The results of a search can be sorted, printed or saved. To obtain a list of words from each text that correspond, as illustrated below for English line and French ligne, the results of a search are first saved as a text-only file and then loaded into a word-processor for further formatting.

The use of parallel corpora presents very interesting research opportunities in a variety of disciplines including linguistics, literary studies, translation, and language teaching. While these different areas may be touched upon, the focus of the present paper is on the use of parallel corpora in linguistic analysis. This project is similar in spirit to a variety of parallel corpus projects such as Intersect, Contragram, ENPC, and TRIPTIC, among others.

Taking a language to consist of form-meaning links, what we have in parallel corpora are two sets of form-meaning linkings, one for each language. And since the two texts are translations, the meaning part can be assumed to be approximately the same in both texts. Thus we are able to see how two different languages encode equivalent meanings. The art of translation is undeniably complex and involves many different kinds of processes, but we can consider three main aspects of translation, namely, language-particular encodings of (i) event structure, (ii) discourse structure, and (iii) lexis. Each of these areas can be profitably analysed using parallel corpora.

1. Event Structure

Event structure simply refers to those actions occurring in the world that are of interest to humans, such as a transitive event in which one object acts on another object in some way, which is typically encoded using a transitive clause. Since we can assume that the translations are about the same events, we can use parallel corpora to examine how languages code events in general, in other words, how aspects of an event are expressed grammatically or lexically in different languages.

An objection that could be raised here is that the particular choices made by a translator will introduce distortions into the data. It is true that some apparently random choices occur in translations, but the accretion of motivated translation choices allows the general patterns to be perceived using a concordance program.

For example, we can examine the coding of causative events in English and French by searching for the lemma make and examining patterns such as "X makes Y do Z" and then observing the patterns used in French to refer to these same events. A concordance search reveals that causative make in English covers a wide variety of situations, for instance, causing a change in state (make something possible), causing someone or something to perform an action (make a dog go away), and causing some kind of transformation expressed as make followed by two contiguous noun phrases (make John the president). Having searched for English causative constructions involving make, we can investigate how these different causative event structures are coded in French. And, in fact, we find a rather different set of patterns for French. For the construction of the type make John president, the equivalent occurs in French with faire in most cases. However, the corpus data shows that other causative uses are often not translated by faire in French. A variety of constructions are used instead, including verbs such as rendre, as shown in (1).

(1) a. The American blockage makes life very difficult for us.
       Le blocus rend nos conditions de vie très rudes.

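The kind of search described in this section can be sketched roughly in Python. This is only an illustration under stated assumptions, not ParaConc's own implementation: the aligned pairs are the sentence pairs quoted in examples (1) and (2) of this paper, the hand-listed verb forms stand in for a proper lemmatiser, and the names ALIGNED, concordance and tally_equivalents are hypothetical.

```python
import re
from collections import Counter

# Aligned (English, French) sentence pairs, copied from examples (1) and (2)
# of this paper. A real corpus would be read from aligned text files.
ALIGNED = [
    ("The American blockage makes life very difficult for us.",
     "Le blocus rend nos conditions de vie très rudes."),
    ("It is a behaviour which makes you think of France...",
     "Un comportement qui fait penser à la France ..."),
    ("... their parents had made them lose their French nationality.",
     "... leurs parents, ...., leur ont fait perdre la nationalité française."),
]

# Assumption: a short list of surface forms approximates a lemma search.
MAKE_FORMS = re.compile(r"\b(make|makes|made|making)\b", re.IGNORECASE)
FRENCH_VERBS = {
    "faire": re.compile(r"\b(faire|fait|font|faisait|fera)\b", re.IGNORECASE),
    "rendre": re.compile(r"\b(rendre|rend|rendent|rendu|rendait)\b", re.IGNORECASE),
}


def concordance(pairs, pattern):
    """Return (number, source sentence, aligned sentence) for each source hit."""
    return [
        (number, source, target)
        for number, (source, target) in enumerate(pairs, start=1)
        if pattern.search(source)
    ]


def tally_equivalents(hits, candidates):
    """Count which candidate verbs appear in the aligned sentence of each hit."""
    counts = Counter()
    for _, _, target in hits:
        matched = [lemma for lemma, pat in candidates.items() if pat.search(target)]
        counts.update(matched or ["other"])
    return counts


hits = concordance(ALIGNED, MAKE_FORMS)
for number, source, target in hits:
    print(f"{number:>3}  EN: {source}")   # upper window: hit in the first language
    print(f"     FR: {target}")           # lower window: aligned sentence
print(tally_equivalents(hits, FRENCH_VERBS))
```

On these three pairs the tally is two for faire and one for rendre. The numbered source and target lines mirror the two-window display only loosely, and the tally would have to be checked by hand in a real study, since a surface form such as fait can occur for reasons unrelated to the causative.
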
On the other hand, uses of make expressing a causative event in which an agent acts on an animate causee to bring about an event are more likely to be translated with faire, as exemplified in (2).

(2) a. It is a behaviour which makes you think of France...
       Un comportement qui fait penser à la France ...

(2) b. ... their parents had made them lose their French nationality.
       ... leurs parents, ...., leur ont fait perdre la nationalité française.

This example shows how ParaConc can be used to investigate fairly subtle cross-linguistic distinctions in the expression of causative events.

2. Discourse structure

Parallel corpora can also be used to highlight the way in which different languages transform the bare bones of event structure into discourse structures appropriate for each language. There are many interesting questions related to the structuring of discourse in different languages and the use of parallel corpora offers one avenue of research in this area. As an example, we can consider how English and French discourse signals the fact that two events occur concurrently (or are alike in some other way). In English the conjunction while is used both to link clauses that refer to events that overlap in time and to indicate that the speaker is contrasting two events. The two types, the temporal use and the contrastive use, are shown in (3) and (4).

(3) a. That means that while we’re shooting one film we can start dreaming about the next.
(3) b. That’s the way to get the economy going again while at the same time discouraging looters.

(4) a. While it never misses an opportunity to blame the Socialists for the worsening job situation, the right appears to be just as helpless in the face of rising unemployment.
(4) b. While saleswomen remain as surly as ever, shop windows have become much more attractive.

Using parallel texts, it is possible to search for English while and investigate how temporal and contrastive structures are represented in French discourse. Searching for while produces a variety of equivalent items in French including: tout en, alors que, tandis que, pendant que, contre, and si, among others. To provide a complete analysis of these conjunctions it is necessary to examine the results of the search in some detail and also to examine the translations in English of the different expressions: tout en, alors que, etc. One result that we can identify is that French si is used to indicate a contrastive meaning. Thus the equivalent sentences to (4) are those given in (5), which use si for while.

(5) a. Si elle ne manque aucune occasion de verser au débit des socialistes la détérioration de la situation de l’emploi, la droite paraît tout aussi désarmée devant la montée du chômage.
(5) b. Si les vendeuses sont toujours aussi peu amènes, les vitrines, en revanche, se font plus alléchantes.

3. Lexis

ParaConc has several uses in investigating the meaning of lexical items and collocations in two languages. The program can take advantage of the information-on-demand aspect of concordance searching and provide equivalences that may be incompletely captured or not captured at all in bilingual dictionaries. For example, a parallel corpus based on computer texts will allow the user to see how modern computer terms such as information highway, email, and home shopping are being translated. Thus a parallel corpus can be used to reveal both the latest usage and also the variation in usage that occurs.

Rather than explore these lexicographic uses of ParaConc, in this final section I will again pursue a linguistic investigation and indicate how ParaConc can reveal metaphorical and other extensions of a concept occurring in two languages. The word line in English, for example, has a variety of uses, some of which are based on extensions of the prototypical meaning. Tied in with these extensions is the existence of certain collocations such as hard line, firm line, etc. Some of these extensions are also present in French; others are not. We find, for instance, that the in line with uses do not appear to have a French equivalent based on ligne.

In (6) a small sample of correspondences is given. (Examples of ligne and their equivalents in English are omitted from this abstract for reasons of space.)

(6)
a line                 une réplique
communication line     ligne de communication
cultural line          ligne culturelle
dedicated line         ligne spécialisée

a democratic line      une ligne démocratique
dividing line          ligne de partage
dividing line          ligne de fracture
drain line             canalisation d’écoulement
took a firm line       apporté un soutien d’une fermeté
the following lines    cette formule
the front line         au front
front line             front
hard-liners            l’intransigeance des
in line with           à l’image de
kept in line           on encadre
not in line with       pas correspondre au
In line with           Comme l’indiquait
in line with           s’inscrit dans
into line with         en accord avec
line                   positions
our line of conduct    notre conduite
our line               notre principe
the poverty line       le seuil de pauvreté

Given these sets of data, it is possible to map out how the semantic domains of line and ligne resemble each other and how they differ in terms of semantic extensions and usages. The undertaking of this kind of investigation can play a part in linguistic investigations of grammaticalisation (Hopper and Traugott 1993) and in the study of general constraints on form-meaning mappings (Barlow and Kemmer 1994).

4. Conclusion

These analyses provide an illustration of how the common content of parallel corpora can be exploited to gain linguistic insights into the structure and function of languages. The technique of investigating pairs of languages is promising for a variety of research areas. One advantage is that a two-way analysis of a domain, from language A to language B and from language B to language A, provides clues to the different meanings/uses of each language form.

In sum, in this paper I describe the analysis of parallel texts using ParaConc, a parallel concordancer, and outline some fruitful areas of corpus-based research that are opened up by the use of such a program.

References

Barlow, M. To appear. Parallel Texts for Linguistic Analysis. In M. Barlow and S. Kemmer (eds) Usage-Based Models of Language.
Barlow, M. 1995. A Guide to ParaConc. Athelstan: Houston.
Barlow, M. and S. Kemmer. 1994. A Schema-based Approach to Grammatical Description. In S. Lima, R. Corrigan and G. Iverson (eds) The Reality of Linguistic Rules. Amsterdam: Benjamins.
Gale, W. and K. Church. 1994. A Program for Aligning Sentences in Bilingual Corpora. In S. Armstrong (ed) Using Large Corpora. MIT Press: Cambridge.
Hopper, P. and E. Closs Traugott. 1993. Grammaticalization. Cambridge: CUP.
Johansson, S. and K. Hofland. 1993. Towards an English-Norwegian parallel corpus. Paper from the Fourteenth International Conference on English Language Research on Computerized Corpora, Zürich, May 19-23, 1993. In U. Fries, G. Tottie, and P. Schneider (eds.), Creating and Using English Language Corpora. Rodopi: Amsterdam.
Kay, M. and M. Roscheisen. 1994. Text-Translation Alignment. In S. Armstrong (ed) Using Large Corpora. MIT Press: Cambridge.
Moon, R. 1987. The Analysis of Meaning. In J.M. Sinclair (ed) Looking Up. Collins: London.
Noel, Jacques. 1992. Collocation and Bilingual Text. In G. Leitner (ed) New Directions in English Language Corpora. Mouton de Gruyter: Berlin.
