
Translation Divergence in Sanskrit-Hindi Pronouns

Madhav Gopal
Centre for Linguistics, SLL & CS,
J.N.U., New Delhi
mgopalt@gmail.com
Keshav Niranjan
Department of Computer Science, Deshbandhu College,
University of Delhi, New Delhi, India
keshav.niranjan@gmail.com
Abstract
In this paper we present the language divergence in Sanskrit to Hindi Machine Translation, concentrating on the
first, second and third person pronouns. Sanskrit uses many strategies for encoding information; a single idea
can be expressed in many ways, and our study of pronouns bears this out. Sanskrit and Hindi, though closely
related, show many typological and lexical differences in these usages. We attempt to map these differences in
this article to facilitate Machine Translation of Sanskrit pronouns into Hindi. As the scope of our study is
limited to pronouns, our focus is on the structural divergence that is prevalent in this regard.
Keywords: language divergence, pronouns, Machine Translation, pronoun compounding.

1. Introduction
Language divergence (Dorr, 1993; Dave et al., 2002) refers to the differences in lexical and syntactic
choices that languages make in expressing ideas. It occurs when the underlying concept of a sentence
gets manifested differently in different languages. Sanskrit and Hindi differ in many respects,
presenting a rich source for the study of language divergence in developing MT systems. Some
interesting examples of language divergence are given ahead in the context of bilingual MT between
these two languages.

It is widely accepted that the occurrence of divergence causes major difficulty for any MT system. In
this work we explore the translation patterns of pronoun constructions between Sanskrit and Hindi to
identify the divergence in this language pair. This will enable us to come up with strategies to handle
the divergent situations and to arrive at correct translations. In the second section of the paper we
discuss Sanskrit pronouns, as it is necessary to know their nature in order to find their corresponding
constructions in the target language, i.e. Hindi. The divergence issues will be discussed in the third
section and finally we will conclude the paper in the fourth section. An appendix at the end of the
paper lists the abbreviations used in the text body.

2. Sanskrit Pronouns and Their Hindi Rendering

Sanskrit is a pronominally very rich language. All the personal pronouns encode the person and number
of a referent. Persons are first (speaker), second (addressee) or third (person other than speaker or
addressee). However, bhavān (masculine form) and bhavatī (feminine form) refer to addressees but behave
as third person pronouns, taking third person verb conjugations. The first and second person pronouns
have no distinction of gender, but the third person pronouns distinguish three genders: masculine,
feminine and neuter. All kinds of pronouns qualify for compounding and make up the first constituent of
the compound. The examples in this paper are taken from the text Panchatantram. Our account is by no
means exhaustive; we focus on those points which require special attention in translation.

2.1. First Person Pronoun

Sanskrit first person pronouns inflect for all the cases except vocative. Sanskrit pronominal
morphology is very rich. From the root asmad, the possessive adjectives asmadīya and madīya are formed
by adding the suffix -īya (this process is common to the other persons, yuṣmad and tad, as well). Such
an adjective has the role of a genitive in the sentence and can be regarded as a substitute for the
genitive forms of all the above-mentioned pronouns. Hindi has no counterpart to these forms. The forms
asmadīya and madīya differ with respect to the number of the referent. When a speaker wants to refer
to himself only, the form madīya is used; when the speaker wants to be inclusive (that is, there is
more than one referent), the form asmadīya is used. These possessive adjectives agree with the case,
gender and number features of the possessed noun or pronoun, as adjectives do in Sanskrit. For instance,

(1) ayam              asmadīya-ḥ      bāndhava-ḥ.
    DEM.PRX.M.SG.NOM  1POSS-M.SG.NOM  brother-M.SG.NOM
    This is our brother. <mūrkhapaṇḍita-kathā, apkr, PT>

Herein, the form asmadīya agrees with the possessed noun bāndhava in its case, gender and number
features.
    yaha hamara bhai hai.
In this example the Sanskrit pronoun is grammatically an adjective with a masculine attribute, whereas
its Hindi counterpart is a pronoun in its genitive inflection with a masculine feature.

(2) yat          asmadīya-m    na   hi   tat          par-eṣām.
    REL.NEU.3SG  1POSS-NEU.SG  not  DEF  that.NEU.SG  other-MAS.3PL.GEN
    What is ours cannot be others'. <vaṇikputra-kathā, mtsp, PT>

Herein, asmadīya is in the neuter gender due to its agreement with the neuter antecedent yat.
    jo hamara hai vo dusare ka nahi ho sakata.
In this example the pronominal adjective of Sanskrit with neuter gender changes into a cardinal pronoun
with genitive and masculine attributes. The neuter gender is not realized in Hindi; consequently this
divergence takes place.

(3) madīya-ḥ    namaskāra-ḥ    vācya-ḥ            bhagavataḥ.
    1POSS-M.SG  greeting-M.SG  communicable-M.SG  lord/M.SG.GEN
    Convey my greeting to the Lord.

In this example also, the agreement is maintained between madīya and namaskāra.
    bhagavana ko hamara namaskara boliyega.
Here the adjectival pronoun changes into a cardinal pronoun form of Hindi. In all the above examples
the divergence is at the part-of-speech level and the grammatical category level.
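
The mapping seen in examples (1) to (3) can be pictured as a lookup followed by an agreement rule. The Python sketch below is only an illustration under stated assumptions: the table HINDI_GENITIVE_STEM, the function hindi_genitive and the three-way ending rule (-a, -e, -i) are hypothetical names and simplifications introduced here, not part of any system described in this paper. The Hindi forms are written in the same ASCII transliteration as the example sentences, and the neuter gender is mapped to masculine because, as noted above, neuter is not realized in Hindi.

```python
# Hypothetical lookup: Sanskrit possessive adjective stem -> Hindi genitive pronoun stem.
HINDI_GENITIVE_STEM = {
    "asmadīya": "hamar",    # rendered as 'hamara' in examples (1) and (2)
    "madīya": "mer",        # assumption: singular 'mera'; example (3) itself uses 'hamara'
    "yuṣmadīya": "tumhar",  # assumption: second person 'tumhara'
}


def hindi_genitive(stem: str, gender: str, number: str, oblique: bool = False) -> str:
    """Return a Hindi genitive pronoun agreeing with the possessed Hindi noun.

    gender is 'M', 'F' or 'N' and number is 'SG' or 'PL'; both describe the
    possessed noun, not the possessor.
    """
    base = HINDI_GENITIVE_STEM[stem]
    if gender == "N":
        gender = "M"  # neuter is not realized in Hindi
    if gender == "F":
        return base + "i"
    if number == "PL" or oblique:
        return base + "e"
    return base + "a"


if __name__ == "__main__":
    print(hindi_genitive("asmadīya", "M", "SG"))  # possessed noun: bhai
    print(hindi_genitive("asmadīya", "N", "SG"))  # Sanskrit neuter, Hindi masculine
```

The point of the design is that the case, gender and number carried by the Sanskrit adjective are discarded, and agreement is recomputed from the Hindi head noun.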

2.1.1. First Person Pronoun Compounding

Pronoun compounding is a unique feature of Sanskrit. All kinds of pronouns undergo such word
formations. In first person pronoun compounding the roots asmat (used to refer to more than one) and
mat (used for the singular) are used, and they serve as the first components of such compounds. These
kinds of pronouns are not found in Hindi. In the process of compounding the final sound /t/ of these
roots is assimilated to the initial sound of the following component, thus creating an instance of
regressive voicing assimilation. Interestingly, these forms convey the meaning of the possessive
adjectives only and can be replaced by them without disturbing the meaning. The abbreviations used in
these examples are expanded at the end of this paper.

(4) adya   asmat-svāmī      piṅgalaka-ḥ    bhīta-ḥ     bhīta-parivāra-ḥ   ca   vartate.
    today  1POSS.PL-master  Pingalaka-NOM  scared-NOM  scared-family-NOM  and  is
    Today our master Pingalaka is scared, and so is his family. <kīlotpāṭivānara-kathā, mtbd, PT>

In this example asmad and svāmī form a compound and, due to the [s] sound of svāmī, the [d] sound of
asmad has become [t].
    aja hamara swami pingalaka aura usaka pariwara dara hua laga raha hai.
Here the compound of the Sanskrit construction has been explicitly split in its Hindi counterpart.
Hindi pronouns do not participate in compounding. So, the cases of pronoun compounding in Sanskrit
will unavoidably diverge while manifesting in Hindi.

(5) avivekaḥ   asmad-bhūpateḥ      yat   purīṣa-utsarga-m     ācar-an
    ignorance  1POSS-king.SG.GEN   that  stool-excretion-ACC  conduct-PRPL
    cirbhaṭī-bhakṣaṇam  karoti.
    cucumber-eating     do-3SG.PRS
    It is our king's ignorance that he eats cucumber while defecating. <dantilagorambhayoḥ kathā, mtbd, PT>

    hamare raja ka yaha agyana hai ki vaha shaucha kriya ke samaya kakadxi khata hai.
Likewise here too, the Sanskrit compounding could not be sustained in its Hindi rendering.

(6) atra  ca   mad-da-ttām               vṛttim
    here  and  1POSS.SG-give-PSPL.F.ACC  stipend.F.ACC
    bhuñjānā-nām     paṇḍitā-nām    pañca-śatī    tiṣṭha-ti.
    consumer-PL.GEN  pundit-PL.GEN  five-hundred  sit-SG.PRS
    There are five hundred pundits present who consume the stipend given by me. <kathāmukham, mtbd, PT>

    mere dwara di hui vritti ko bhogane wale pancha sau paNdita hain.
Here the Sanskrit compounding is between the root of the pronoun and the past participle of the verb.

(7) madīya-bhār-eṇa           ati-śrānta-ḥ     tvam.
    1POSS.SG-burden-M.SG.INS  very-tired-M.SG  you.NOM
    You are very tired with my burden. <bakakarkaṭaka-kathā, mtbd, PT>

    mere bhara se tuma bahuta thaka gaye ho.
In this example the compounding is between the pronominal possessive adjective and a noun. The form
madīya is just a replacement for mad, without contributing anything special to the meaning.
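
The voicing assimilation described in this section can be sketched as a rule that picks [t] or [d] for the root-final stop depending on the first sound of the following component. The Python sketch below is a simplified illustration, not a full sandhi engine: the name compound_form and the list VOICELESS_INITIALS are assumptions introduced here, only the voiced/voiceless contrast discussed above is modelled, and other sandhi outcomes (for example before nasals) are ignored.

```python
# Voiceless initial sounds in transliteration; aspirates such as 'kh' or 'ph'
# are covered because they begin with the same letter.
VOICELESS_INITIALS = ("k", "c", "ṭ", "t", "p", "ś", "ṣ", "s")


def compound_form(root: str, following: str) -> str:
    """Join a pronoun root ending in d/t to the next component of a compound.

    The root-final stop surfaces as 't' before a voiceless sound and as 'd'
    elsewhere (regressive voicing assimilation).
    """
    if root[-1] not in ("d", "t"):
        raise ValueError("expected a pronoun root ending in d or t")
    final = "t" if following.startswith(VOICELESS_INITIALS) else "d"
    return root[:-1] + final + "-" + following


if __name__ == "__main__":
    for root, nxt in [("asmad", "svāmī"), ("asmad", "bhūpateḥ"), ("mad", "dattām")]:
        print(compound_form(root, nxt))
```

The three calls reproduce the first members seen in examples (4), (5) and (6). A segmenter for translation would need the inverse mapping, treating the surface forms in -t and -d as the same root.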

2.2. Second Person Pronoun

Like first person pronouns, second person pronouns also inflect for all cases except vocative.
Morphologically, they are similar to first person pronouns. The second person possessive pronouns are
yuṣmadīya and tvadīya, which are derived from the roots yuṣmad and tvad respectively by suffixing
-īya. These possessive adjectives agree with the case, gender and number features of the possessed
noun, as the first person possessive adjectives do. However, when yuṣmadīya and tvadīya are compounded
with the possessed noun, their number, gender and case features are dropped. Their Hindi counterparts
are also not found. The form yuṣmadīya has been used for the plural and tvadīya for the singular in
the text PT. The agreement is shown by an underline.

(8) param  vayam  vana-carāḥ          yuṣmadīya-m    ca   jalānte       gṛha-m.
    but    we     forest-dweller.M.PL  2POSS.PL-N.SG  and  water-inside  home-N.SG
    But we stay in the forest and your abode is in the water. <prastāvanā-kathā, ldpr, PT>

    kintu hama vanachara hain aura tumhara ghara jala men hai.
Here the pronominal possessive adjective of Sanskrit has been translated into a cardinal pronoun in
Hindi.

In the examples given in the following sections the behavior of Sanskrit pronouns is exhibited
further. Their Hindi translation is not offered, since the explanation provided for the above examples
applies to them as well and we wish to avoid redundancy.

(9) tat   rātrau        api   tvadīya-m          maṭha-m
    then  night\SG.LOC  EMPH  2POSS.SG-M.SG.ACC  monastery-M.SG.ACC
    tyaktvā     anyatra     maṭhe             yā-syāmi.
    leave.PSPL  else-where  monastery\SG.LOC  go-1SG.FUT
    Even at night, leaving your monastery, I will go to another one. <hiraṇyakatāmracūḍa-kathā, mtsp, PT>

(10) na   asti  sā         tvadīy-ā           tulā.
     not  is    3SG.DEF.F  2POSS.SG-F.SG.NOM  scale.F.SG.NOM
     That scale of yours is no more. <lohatulāvaṇikputra-kathā, mtbd, PT>

2.2.1. Second Person Pronoun Compounding

Like first person pronouns, the roots of second person pronouns also undergo compounding. The root
forms yuṣmad and tvad are used in compound constructions for the plural and the singular respectively.
Interestingly, these forms convey the meaning of the possessive adjective. As in first person pronoun
compounding, voicing assimilation takes place here also. These kinds of formations are not available
in Hindi.

(11) yuṣmad-darśana-mātra-anurakta-yā           mayā
     2POSS.PL-appearance-only-infatuated-F.INS  1SG.INS
     ātmā  prada-ttaḥ   ayam.
     self  give-PSPL.M  this
     I became infatuated just by seeing you and have given myself to you. <vaṇikputra-kathā, mtsp, PT>

(12) tvad-varjam       anyaḥ  bhartā   manas-i      api   me  na   bhav-iṣyati  iti.
     2POSS.SG-barring  other  husband  mind-SG.LOC  EMPH  my  not  be-3SG.FUT   QUOT
     I can't have another husband, even in my mind, except you. <vaṇikputra-kathā, mtsp, PT>
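
Since Hindi must split these compounds, a translation system first has to recognise the pronominal first member. The Python sketch below shows one naive way to do this by longest-prefix matching on an unsegmented word. It is a hedged illustration only: the table PRONOUN_FIRST_MEMBERS and the function split_pronoun_compound are hypothetical, the table covers only the second person forms discussed in this section, and no lexicon check is made on the remainder, so a free inflected form or an unrelated word beginning with the same letters would be split wrongly.

```python
from typing import Optional, Tuple

# Hypothetical table: surface form of the first member -> (person, number).
# Both the voiced and the devoiced variants of each root are listed.
PRONOUN_FIRST_MEMBERS = {
    "yuṣmadīya": ("2", "PL"),
    "tvadīya": ("2", "SG"),
    "yuṣmad": ("2", "PL"),
    "yuṣmat": ("2", "PL"),
    "tvad": ("2", "SG"),
    "tvat": ("2", "SG"),
}


def split_pronoun_compound(word: str) -> Optional[Tuple[str, str, str]]:
    """Return (person, number, remainder) if word starts with a pronoun member.

    Longer forms are tried first so that 'tvadīya' wins over 'tvad'.
    Returns None when no pronominal first member is found.
    """
    for form in sorted(PRONOUN_FIRST_MEMBERS, key=len, reverse=True):
        if word.startswith(form) and len(word) > len(form):
            person, number = PRONOUN_FIRST_MEMBERS[form]
            return person, number, word[len(form):]
    return None


if __name__ == "__main__":
    print(split_pronoun_compound("tvadvarjam"))
    print(split_pronoun_compound("tulā"))
```

Once the person and number of the first member are known, the remainder can be translated as an ordinary nominal and the pronoun rendered as a separate Hindi genitive form.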

(13) kintu  tvat-prārthanā-siddhy-artham    sarasvati-vinodam        kar-iṣyāmi.
     but    2POSS.SG-prayer-completion-for  Saraswati-entertainment  do-1SG.FUT
     But, for the completion of your prayer, I will be entertained with goddess Saraswati.
     <kathāmukham, mtbd, PT>

(14) aho  gajaḥ     ayam  yuṣmat-kula-śatruḥ.
     oh!  elephant  this  2POSS.PL-family-enemy
     Oh! This elephant is the enemy of your family. <siṃhaśṛgālaputrayoḥ kathā, ldpr, PT>

The second person possessive pronouns yuṣmadīya and tvadīya also make compounds in the language, and
in the process of compounding they lose their grammatical features of case, number and gender, as was
the case with first person possessive pronouns.

(15) yadyapi  tvadīya-vacanam         na   karoti,
     though   2POSS.SG-word-N.SG.ACC  not  does,
     tathāpi  svāmī   sva-doṣa-nāśā-ya               vācyaḥ.
     still    master  self-fault-destruction-SG.DAT  speakable
     Though the master does not pay any heed to your words, still for the destruction of your own
     faults you should speak to him. <dantilagorambhayoḥ kathā, mtbd, PT>

2.3. Third Person Pronouns

Third person pronouns have distinct forms for the categories of number, gender, and proximity, and
they inflect for case. To express the honorific status of the referent their plural forms are used.
They are also used as demonstratives and definites. They occur, sometimes, with determiners and
pre-nominal adjectives as nouns do.

(16) saḥ            te       mātula-ḥ              api   na   āyātaḥ?
     3SG.DEF.M.NOM  2SG.GEN  maternal_uncle-M.NOM  EMPH  not  came
     Even your uncle didn't come? <bakakarkaṭaka-kathā, mtbd, PT>

2.3.1. Third Person Pronoun Compounding

The compounding of third person possessive pronouns with nominals is attested in the text of PT. The
roots of the third person pronouns (tad and bhavat) are used in the compound constructions. The base
forms of the possessive adjectives tadīya and bhavadīya have been used in compound constructions, as
was the case with first and second person possessive adjectives.

(17) bhavadīya-sāhas-ena      aham     tuṣṭaḥ.
     3POSS.SG-courage-SG.INS  1SG.NOM  satisfied
     I am satisfied with your courage. <mandabhāgyasomilaka-kathā, mtsp, PT>

3. Conclusion
In this paper we have presented Sanskrit pronouns and the divergence that takes place during their
translation into Hindi. Hindi pronouns, though morphologically less sophisticated, have complex
inflections, and the choice of appropriate postpositions requires care while training the machine. In
Sanskrit there are many special rules that assign a case other than the one that would otherwise
(semantically) be expected. In making rules for handling the divergence, Aṣṭādhyāyī sūtras (like
akathitam ca, adhiśīṅsthāsām karma, kālādhvanoratyantasaṃyoge, rucyarthānām prīyamāṇaḥ, spṛherīpsitaḥ
and so on) can be very helpful in getting generalizations in this language pair. Typological
differences should also be kept in mind while designing an algorithm for such a purpose. We do not
claim that the account of pronoun divergence presented here is exhaustive. More literature could be
explored to attest different kinds of divergent constructions in order to develop a robust translation
system.

To handle Sanskrit pronoun compounding we would need a compound processor for automatic segmentation
and type identification of a compound, to establish the relationship between the constituents of the
compound. Once this relationship is established, it would be easy to transfer them into Hindi or any
other language for that matter.

References
Abbi, Anvita (2000), A Manual of Field Linguistics and Structures of Indian Languages, Munich:
Lincom-Europa.
Chandrashekar, R. (2007), POS tagging for Sanskrit, unpub. Ph.D. Thesis, New Delhi: Jawaharlal Nehru
University.
Dave, S. and Parikh, J. and Bhattacharya, P. (2002), Interlingua Based English-Hindi Machine
Translation and Language Divergence, Journal of Machine Translation (JMT), 17.
Dorr, B. (1994), Classification of Machine Translation Divergences and a Proposed Solution,
Computational Linguistics 20(4), 597-633.
Gopal, M.: Anaphor Resolution in the Sanskrit Text Panchatantra: A Rule Based Approach to Resolve
Lexical Anaphors in Sanskrit, LAP LAMBERT Academic Publishing, Verlag (2012).
Gopal, M.: Resolving Anaphors in Sanskrit, (co-author Girish Nath Jha), pp. 172-176. In: 5th Language
and Technology Conference: Human Language Technologies as a challenge for Computer Science and
Linguistics, November 6-8, 2011, Poznan, Poland: Fundacja Uniwersytetu im. A. Mickiewicza, ul. Rubiez.
Goyal, P., Sinha, R.M.K. (2008), A Study towards English to Sanskrit Machine Translation System.
SISSCL.
Habash, N. and Dorr, B. (2002), Handling Translation Divergences: Combining Statistical and Symbolic
Techniques in Generation-Heavy Machine Translation, Technical Report, LAMP 88.
Kale, M.R. (1961), A Higher Sanskrit Grammar, New Delhi: MLBD Publishers.
Lust, Barbara C., Kashi Wali, James W. Gair, K.V. Subbarao, eds (2000), Lexical anaphors and pronouns
in selected south Asian languages: a principled typology, New York: Mouton de Gruyter.
Speijer, J.S. (1886, repr. 2006), Sanskrit Syntax, New Delhi: Motilal Banarsidass Pvt. Ltd.

Appendix 1
Abbreviations used:

ACC: Accusative Case, DAT: Dative Case, PRS: Present, 1: First Person, 2: Second Person, 3: Third
Person, ABL: Ablative, AUX: Auxiliary, F: Feminine, FUT: Future, GEN: Genitive, GRN: Gerund, IMP:
Imperative, IMPF: Imperfect, INDEF: Indefinite marker, INF: Infinitive, INS: Instrumental, EMPH:
Emphatic, LOC: Locative, M: Masculine, NEG: Negative, N: Neuter, NOM: Nominative, PST: Past, PERF:
Perfect, PCPL: Participle, PSPL: Past participle, PRPL: Present participle, PL: Plural, POSS:
Possessive, PRX: Proximate/Proximal, QUOT: Quotative, REL: Relative, SG: Singular, PT: Panchatantram,
mtbd: mitrabhedam, mtsp: mitra-samprāptikam, kkly: kākolūkīyam, ldpr: labdhapraṇāśam, apkr:
aparīkṣitakārakam.
