Challenges in Machine Translation
Pablo Lizcano Sánchez-Cruzado
A word is not the basic unit of meaning
• What is a word? Word boundaries are not always well defined
• Spaces?
• Spanish: a penas vs apenas
• German: trennbare Verben (separable verbs)
• No spaces. Chinese: 你叫什么名字. Is it 你叫|什|么|名|字 or 你叫|什|么名|字?
• Lexical mismatch. A word may have to be replaced by a phrase so as
not to lose the original meaning.
• Spanish derived words: Limonero = lemon tree, Melocotonero = peach tree,
Caballito = small horse
• English compound words: Airline = línea aérea, Headache = dolor de cabeza,
Sunglasses = gafas de sol
• Different levels of morphological richness. Italian verb inflection
reduces the number of words. Aiuterò (IT) = I will help (EN)
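The word-boundary problem above can be made concrete with a small sketch. The function below lists every way to split an unspaced string into words from a lexicon. The lexicon entries are toy assumptions chosen for illustration, not a real dictionary, and real segmenters rank the candidates with statistics rather than just listing them.

```python
def segmentations(text, lexicon):
    """Return every way to split `text` into words that appear in `lexicon`."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        prefix = text[:end]
        if prefix in lexicon:
            # Keep this prefix as a word and segment the remainder recursively.
            for rest in segmentations(text[end:], lexicon):
                results.append([prefix] + rest)
    return results


# Toy lexicons (assumptions for illustration only).
ZH_LEXICON = {"你", "叫", "什么", "名字", "名", "字"}
ES_LEXICON = {"a", "penas", "apenas"}

for words in segmentations("你叫什么名字", ZH_LEXICON):
    print(" | ".join(words))

for words in segmentations("apenas", ES_LEXICON):
    print(" | ".join(words))
```

Even with this tiny lexicon, each input has more than one valid split, so a translation system needs context to pick the right one.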
Word ambiguity
• Polysemy: a word has different meanings.
• interest (bank) vs interest («topic of interest»)
Language  Word   Financial institution  Bench (to sit down, park)  Edge of a river
German    Bank   Yes                    Yes                        No
Spanish   Banco  Yes                    Yes                        Yes
Italian   Banca  Yes                    No                         No
English   bank   Yes                    No                         Yes
• Homonymy: words unrelated in origin are written/pronounced the
same.
• left (past of verb leave) vs left (side)
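The polysemy table above can be read as a set of senses per word. The sketch below encodes it that way and checks which senses survive when one word is replaced by its look-alike in another language. The sense labels and function names are illustrative assumptions; the Yes/No values are copied from the table.

```python
# Senses covered by each word, copied from the table above.
# Keys are (language code, word); sense labels are illustrative.
SENSES = {
    ("de", "Bank"): {"financial institution", "bench"},
    ("es", "banco"): {"financial institution", "bench", "river edge"},
    ("it", "banca"): {"financial institution"},
    ("en", "bank"): {"financial institution", "river edge"},
}


def shared_senses(src, tgt):
    """Senses for which the target word is a valid translation of the source word."""
    return SENSES[src] & SENSES[tgt]


def lost_senses(src, tgt):
    """Senses of the source word that the target word cannot express."""
    return SENSES[src] - SENSES[tgt]


print(sorted(shared_senses(("de", "Bank"), ("en", "bank"))))
print(sorted(lost_senses(("de", "Bank"), ("en", "bank"))))
```

German Bank and English bank overlap only on the financial sense, so a system must identify the sense in context before it can choose the target word.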
Word order
• Subject / Verb / Object order
• Io mangio carne solo la domenica. (SVO)
• Mi piacciono gli gnocchi. (OVS)
• Modification of word order to change emphasis
• L’unica cosa che dovevi fare te la sei dimenticata. (OV)
• Adjective / Noun order
• È un uomo grande. = He is a big man.
• Fu un grand’uomo. = He was a great man.
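The word-order examples above suggest why translating one word at a time fails. The sketch below glosses an Italian sentence token by token with a toy dictionary. The dictionary and the function name are assumptions for illustration and do not represent any real system.

```python
# Toy Italian-to-English glossary (an assumption for illustration).
GLOSS = {"mi": "me", "piacciono": "please", "gli": "the", "gnocchi": "gnocchi"}


def word_by_word(sentence, gloss):
    """Replace each token with its gloss, keeping the source word order."""
    tokens = sentence.lower().rstrip(".").split()
    # Unknown tokens are kept in angle brackets so the gap stays visible.
    return " ".join(gloss.get(tok, f"<{tok}>") for tok in tokens)


print(word_by_word("Mi piacciono gli gnocchi.", GLOSS))
```

The gloss keeps the Italian OVS order and reads "me please the gnocchi", whereas natural English needs SVO: "I like gnocchi."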
Co-reference
• Anaphora: pronouns (he, she, it)
• Example: If the baby does not thrive on raw milk, boil it.
• Translation to Italian: Se il bambino non prospera con il latte crudo,
fatelo bollire.
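• Translation to Spanish: the pronoun must agree in gender with its antecedent, so the translator has to decide what "it" refers to.
• Si el bebé no prospera con la leche cruda, hiérvela. (la = la leche, the intended reading)
• Si el bebé no prospera con la leche cruda, hiérvelo. (lo = el bebé, the wrong reading)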
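The Spanish example can be sketched as a two-step decision: resolve the antecedent of "it", then pick the clitic that agrees with its grammatical gender. The tables and the function name below are illustrative assumptions. Deciding which antecedent is correct is the hard part and is not modelled here.

```python
# Grammatical gender of the two candidate antecedents in Spanish.
GENDER = {"bebé": "m", "leche": "f"}

# Direct-object clitic that agrees with each gender.
CLITIC = {"m": "lo", "f": "la"}


def boil_it(antecedent):
    """Build the Spanish imperative 'boil it' for the chosen antecedent."""
    return "hiérve" + CLITIC[GENDER[antecedent]]


print(boil_it("leche"))  # pronoun refers to the milk
print(boil_it("bebé"))   # pronoun refers to the baby
```

In Italian both bambino and latte are masculine, so fatelo bollire leaves the ambiguity in place, while Spanish forces the translator to resolve it.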
Language is developing
• New words:
• Brexit
• To adult: to behave responsibly
• Contactless
• Words from other languages:
• English: apply (job offer) -> Spanish: aplicar (job offer)
• Social media: acronyms
• WFH: Working from home
• PPE: personal protective equipment
Other challenges
• Categorical divergence.
• I am hungry. (adj) = Ho fame. (noun)
• Some nuances in the original text are difficult to convey, and the target language may require extra information to choose the right translation.
• Subjunctive vs indicative:
• Cerco un libro che spieghi la guerra civile spagnola dal punto di vista di un italiano. (I don’t know if the book exists)
• Cerco un libro che spiega la guerra civile spagnola dal punto di vista di un italiano. (I know that the book exists)
• Gender of the speaker:
• (SP, not needed) Me he comprado una casa
• (IT, needed) Mi sono comprata una casa OR Mi sono comprato una casa.
• Modal particles in German
Summary
• Word-by-word translation produces low-quality output.
• Context is essential to choose the right translation.
• The natural word order must be taken into account.
• Co-reference must be kept coherent.
• Vocabulary evolves over time.
• Some nuances are almost impossible to preserve in the translation.
• The target language may require extra information to produce a good translation.