
NMT vs SMT

Dragos Munteanu
MT approaches

• Rule-based: explicitly model all aspects of language

• SMT: learn from, and copy, past experience

• NMT: learn “concepts”, generalize


Machine Learning

[Diagram labels: data, model, input, output, Machine, Training + Decoding]
SMT: Training
parallel data (french, english) -> Statistical Analysis -> Translation Model, P(s|t)
monolingual data (english) -> Statistical Analysis -> Language Model, P(t)

Translation Model

source      target      probability
la          the         80%
la          a           12%
la          (empty)     8%
capitale    capital     70%
capitale    death       30%
de          of          53%
de          from        47%
france      france      100%
est         is          75%
est         was         25%
paris       paris       100%

Language Model

phrase                  probability
the death of            54%
the capital of          34%
a capital of            11%
capital of france       41%
capital from france     9%
of france is            45%
of the france           2%
france is paris         23%
france was paris        22%
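
The "Statistical Analysis" step for the language model can be illustrated with plain counting. This is a minimal sketch, assuming a maximum-likelihood trigram estimate with no smoothing; the function name `train_trigram_model` and the three-sentence corpus are hypothetical, and the translation model (which also needs word alignment) is not covered.

```python
from collections import Counter


def train_trigram_model(sentences):
    """Estimate P(w3 | w1 w2) by relative frequency (no smoothing)."""
    trigram_counts = Counter()
    context_counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words) - 2):
            trigram = tuple(words[i:i + 3])
            trigram_counts[trigram] += 1
            # Count how often the two-word context was followed by any word.
            context_counts[trigram[:2]] += 1
    return {t: n / context_counts[t[:2]] for t, n in trigram_counts.items()}


# Hypothetical monolingual corpus, for illustration only.
corpus = [
    "the capital of france is paris",
    "the capital of spain is madrid",
    "the capital from abroad",
]
model = train_trigram_model(corpus)
print(model[("the", "capital", "of")])
```

Here "the capital" is followed by "of" in two of its three occurrences, so the estimate for "the capital of" is two thirds. A real system would train on far more text and smooth the counts.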


SMT: Decoding
Translation Model and Language Model: the same tables as on the SMT: Training slide.

Input: la capitale de la france est paris

Statistical Search

Translation                         Score
the capital of france is paris      94%
capital of france is paris          71%
a capital of france is paris        65%
...                                 …
a death from france was paris       3%
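
How the two models combine during search can be sketched as follows. This is a toy under stated assumptions, not the decoder behind the slide:

- Translation is word-for-word and in source order.
- Every candidate is enumerated exhaustively; a real decoder searches.
- Trigrams missing from the table get an assumed floor probability, `UNSEEN`.
- The rows "Is est" and "Is was" are read as est -> is and est -> was.

The product it computes ranks candidates but does not reproduce the slide's percentage scores.

```python
import itertools
import math

# Table values copied from the slide; "" means the source word is dropped.
TRANSLATION_MODEL = {
    "la": {"the": 0.80, "a": 0.12, "": 0.08},
    "capitale": {"capital": 0.70, "death": 0.30},
    "de": {"of": 0.53, "from": 0.47},
    "france": {"france": 1.00},
    "est": {"is": 0.75, "was": 0.25},  # assumption: slide rows "Is est" / "Is was"
    "paris": {"paris": 1.00},
}
LANGUAGE_MODEL = {
    ("the", "death", "of"): 0.54, ("the", "capital", "of"): 0.34,
    ("a", "capital", "of"): 0.11, ("capital", "of", "france"): 0.41,
    ("capital", "from", "france"): 0.09, ("of", "france", "is"): 0.45,
    ("of", "the", "france"): 0.02, ("france", "is", "paris"): 0.23,
    ("france", "was", "paris"): 0.22,
}
UNSEEN = 0.01  # assumed floor for trigrams that are not in the table


def score(source_words, choices):
    """Translation-model product times language-model product."""
    tm = math.prod(TRANSLATION_MODEL[s][t] for s, t in zip(source_words, choices))
    words = [t for t in choices if t]
    lm = math.prod(
        LANGUAGE_MODEL.get(tuple(words[i:i + 3]), UNSEEN)
        for i in range(len(words) - 2)
    )
    return tm * lm


def decode(sentence):
    """Return every candidate translation, best first."""
    source_words = sentence.split()
    options = [list(TRANSLATION_MODEL[w]) for w in source_words]
    ranked = sorted(
        itertools.product(*options),
        key=lambda choices: score(source_words, choices),
        reverse=True,
    )
    return [" ".join(w for w in choices if w) for choices in ranked]


ranking = decode("la capitale de la france est paris")
print(ranking[0])
```

The translation model alone would prefer rendering both occurrences of "la" as "the"; the language model penalizes "of the france", which is why the second "la" ends up dropped in the best candidate.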


SMT Models

• Translation Model
• Part of Speech
• Lexicalized Reordering
• Distortion
• Language Model
• Alignment
• Reordering
• Syntax Model
• Morphology
• Smoothing
• Preordering
• Capitalization
• Transliteration
• Word Deletion
NMT Decoding
[Diagram: Input Text -> ENCODER -> vectors of numbers -> DECODER -> Output Text]
A Neural Network

PARAMETERS
A Deep Neural Network
A Deep Recurrent Neural Network
Word representations

Sparse
All words are equally different

Dense
Similar words have similar vectors
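
The contrast between the two representations can be made concrete with cosine similarity. This is a minimal sketch: the three-word vocabulary and the dense vector values are hand-picked for illustration, not learned.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


vocabulary = ["king", "queen", "apple"]

# Sparse (one-hot): one dimension per word, a single 1.0 in each vector.
sparse = {
    word: [1.0 if i == j else 0.0 for j in range(len(vocabulary))]
    for i, word in enumerate(vocabulary)
}

# Dense: hypothetical hand-picked values, not trained embeddings.
dense = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.0, 0.9],
}

print(cosine(sparse["king"], sparse["queen"]), cosine(sparse["king"], sparse["apple"]))
print(cosine(dense["king"], dense["queen"]), cosine(dense["king"], dense["apple"]))
```

With one-hot vectors every pair of distinct words has similarity zero, so "king" is no closer to "queen" than to "apple". With the dense vectors the first pair scores much higher than the second.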
Word Embeddings

X_King - X_Man + X_Woman ≈ X_Queen
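
The vector arithmetic can be tried on toy data. This is a sketch under loud assumptions: the two-dimensional vectors are hand-made (roughly "royalty" and "maleness"), and the helper `analogy` is hypothetical. It uses Euclidean distance for simplicity, whereas trained embeddings are usually compared by cosine similarity.

```python
import math

# Hypothetical toy embeddings: [royalty, maleness]. Not learned from data.
embeddings = {
    "king": [0.9, 0.9],
    "queen": [0.9, 0.1],
    "man": [0.1, 0.9],
    "woman": [0.1, 0.1],
    "apple": [0.0, 0.5],
}


def analogy(a, b, c):
    """Return the word whose vector is nearest to X_a - X_b + X_c."""
    target = [
        x - y + z
        for x, y, z in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # The three query words are excluded from the search, as is customary.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return min(candidates, key=lambda w: math.dist(embeddings[w], target))


print(analogy("king", "man", "woman"))
```

With real embeddings the result is only approximately equal to the vector for "queen", which is why a nearest-neighbor lookup is needed.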


Training: backpropagation

Predict

Update
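
The predict/update cycle can be sketched on the smallest network that still needs the chain rule: two weights in series, with no activation function. The function name `train`, the starting weights, the learning rate and the two training samples are all assumptions chosen for illustration.

```python
def train(samples, learning_rate=0.1, epochs=100):
    """Fit output = w2 * (w1 * x) to the samples by gradient descent."""
    w1, w2 = 0.5, 0.5
    for _ in range(epochs):
        for x, target in samples:
            # Predict: forward pass through both layers.
            hidden = w1 * x
            output = w2 * hidden
            # Backpropagate the squared error (output - target) ** 2.
            d_output = 2.0 * (output - target)
            grad_w2 = d_output * hidden
            d_hidden = d_output * w2
            grad_w1 = d_hidden * x
            # Update: move each weight against its gradient.
            w1 -= learning_rate * grad_w1
            w2 -= learning_rate * grad_w2
    return w1, w2


# Hypothetical data following target = 2 * x.
w1, w2 = train([(1.0, 2.0), (0.5, 1.0)])
print(w1 * w2)
```

The product of the two weights converges toward 2, the slope of the data. The gradient for the first weight is obtained by passing the error back through the second, which is the step that gives backpropagation its name.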
Training: Dropout
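
The slide gives only the name of the technique. One common formulation is "inverted" dropout, sketched below as an assumption rather than as the variant used in any particular system; the function `dropout` and its parameters are hypothetical.

```python
import random


def dropout(activations, drop_probability, rng, training=True):
    """Zero each activation with the given probability during training."""
    if not training:
        # At translation time every unit is kept and nothing is rescaled.
        return list(activations)
    keep_probability = 1.0 - drop_probability
    return [
        # Survivors are scaled up so the expected value stays the same.
        a / keep_probability if rng.random() < keep_probability else 0.0
        for a in activations
    ]


rng = random.Random(0)
activations = [1.0] * 1000
dropped = dropout(activations, 0.5, rng)
print(dropped.count(0.0), "of", len(dropped), "units dropped")
```

Because a different random subset of units is silenced on every training step, the network cannot rely on any single unit.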
Long Short Term Memory Units
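
A single LSTM step can be written out for scalar values. This sketch follows the standard gate equations; the function names and the parameter values are hypothetical, chosen to force the forget gate open and the input gate shut so that the memory effect is visible.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell; params maps gate name to (w, u, b)."""
    def gate(name, squash):
        w, u, b = params[name]
        return squash(w * x + u * h_prev + b)

    forget = gate("forget", sigmoid)        # how much old memory to keep
    write = gate("input", sigmoid)          # how much new content to store
    expose = gate("output", sigmoid)        # how much memory to reveal
    candidate = gate("candidate", math.tanh)
    c = forget * c_prev + write * candidate
    h = expose * math.tanh(c)
    return h, c


# Hypothetical parameters: forget gate saturated open, input gate shut.
remember = {
    "forget": (0.0, 0.0, 100.0),
    "input": (0.0, 0.0, -100.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 1.0, 0.0),
}
h, c = 0.0, 0.7
for x in [5.0, -3.0, 8.0]:
    h, c = lstm_step(x, h, c, remember)
print(c)
```

With these gate settings the cell state stays at its initial value however the inputs vary. That ability to carry information across many steps is what the unit is designed for.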
Attention
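
Attention lets the decoder weight the encoder's states differently for each output word. The sketch below uses dot-product scoring, a simple variant chosen for brevity and not necessarily the one behind the slide. The function names and the example vectors are hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, encoder_states):
    """Return attention weights and the weighted sum of encoder states."""
    scores = [
        sum(q * h for q, h in zip(query, state)) for state in encoder_states
    ]
    weights = softmax(scores)
    size = len(encoder_states[0])
    context = [
        sum(w * state[i] for w, state in zip(weights, encoder_states))
        for i in range(size)
    ]
    return weights, context


# Hypothetical encoder states (one per source word) and decoder query.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [2.0, 0.0]
weights, context = attend(query, encoder_states)
print(weights)
```

The state that best matches the query receives the largest weight, so the context vector passed to the decoder is dominated by the most relevant source position.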
NMT Advantages

• Improved quality

• Easy to adapt or re-train

• Opens new opportunities


Investment Effects on Quality

NMT Opportunities: Multilingual translation
NMT Opportunities: Low resource translation

[Diagram: two panels, TRAINING and TRANSLATING, each labeled Alignment]
