Preprocessing Video Images For Neural Learning of Lipreading
Abstract
W
Ricoh California Research Center Technical Report #93-26
and vice versa. Thus, for example, /mi/ ↔ /ni/ are highly confusable acoustically but are easily distinguished based
on the visual information of lip closure. Conversely, /bi/ ↔ /pi/ are highly confusable visually ("visemes"), but are
easily distinguished acoustically by the voice-onset time (the delay between the burst sound and the onset of vocal fold
vibration). Th
[Figure: plot with axis label "Gray Level"; panel label "A-B"]
[Figure: vertical mouth_gap (pixels; axis ticks 5 to 25) versus time (frame number; axis ticks at 1, 10, 33, 50)]
[Figure labels: p-units, x-units]