Professional Documents
Culture Documents
cs224n Lecture2 WordAlign
cs224n Lecture2 WordAlign
Christopher Manning
“l” to “a”
transition:
25 ms
...
Result:
10ms Acoustic Feature Vectors
(after transformation,
a1 a2 a3 numbers in roughly R14)
Spectral Analysis
• Frequency gives pitch; amplitude gives volume
– sampling at ~8 kHz phone, ~16 kHz mic (kHz=1000 cycles/sec)
s p ee ch l a b
amplitude
source channel
w a
P(w) P(a|w)
best observed
decoder
w a
P(a | w)P(w)
ŵ = argmax w P(w | a) = argmax w
P(a)
= argmax w P(a | w)P(w)
MT System Components
Language Model Translation Model
source channel
e f
P(e) P(f|e)
best observed
decoder
e f
P(f | e)P(e)
ê = argmax e P(e | f ) = argmax e
P(f )
= argmax e P(f | e)P(e)
Other Noisy-Channel Processes
• Handwriting recognition
P(text | strokes ) ∝ P(text ) P( strokes | text )
• OCR
P(text | pixels ) ∝ P(text ) P( pixels | text )
• Spelling Correction
P(text | typos ) ∝ P(text ) P(typos | text )
Statistical MT
Suppose f is de rien
P(you’re welcome | de rien) = 0.45
P(nothing | de rien) = 0.13
P(piddling | de rien) = 0.01
P(underpants | de rien) = 0.000000001
A Division of Labor
Spanish/English English
Bilingual Text Text
Decoding algorithm
Que hambre tengo yo argmax P(f|e) * P(e) I am so hungry
e
What hunger have I,
Hungry I am so,
I am so hungry,
Have I that hunger …
Lecture
Plan
1. A
bit
more
course
overview
[5
mins]
2. Briefly
learn
the
history
of
Machine
Transla?on
[10
mins]
3. Learn
about
transla?on
with
one
example
[10
mins]
4. Speech
recogni?on
&
Applying
it
to
MT:
The
noisy
channel
model
[10
mins]
[Stretch!
Emergency
?me
reserves:
5
mins]
5. Parallel-‐text
word
alignments:
the
IBM
models
[30
mins]
Alignments
We can factor the translation model P(f | e )
by identifying alignments (correspondences)
between words in f and words in e
nouveaux
séismes
“spurious”
secoué
Japon
word
deux
par
Le
Le
Japan Japon Japan
shaken secoué shaken
by par
two deux by
new nouveaux two
quakes séismes new
quakes
Alignments: harder
“zero fertility” word
programme
not translated
application
mis
été
Le
en
And Le
a
the programme And
program a the
has été
program
been mis
implemented en has
application been
implemented
one-to-many
alignment
Alignments: harder
autochtones
appartenait
The Le
reste
balance reste
aux
Le
was
the appartenait The
territory balance
of aux was
the
the
aboriginal autochtones
people territory
of
the
many-to-one
alignments aboriginal
people
Alignments: hardest
démunis
pauvres
sont
Les
The Les The
poor pauvres
poor
don’t sont
have démunis don’t
any have
money any
money
many-to-many
alignment phrase
alignment
Alignment as a vector
application
mis Choose length m for French sentence
été
Le
en
For each i in 1 to m:
a
And
the – Choose ai uniformly from 0, 1, … l
program
has – Choose fi by translating eai
been
implemented
aj 2 3 4 5 6 6 6 We want to learn
how to do this
IBM Model 1 parameters
programme
application
mis
été
Le
en
a
And
the
program
has
been
implemented
ai 2 3 4 5 6 6 6
Applying Model 1*
As translation model
As alignment model
(pigeonhole principle)
Unsupervised Word Alignment
[
– For each French position i
• Calculate posterior over English positions P(ai | e, f)