NETtalk

History
Old model: DECtalk — text to speech using rules plus a look-up table for exceptions.
● Motivation: make libraries accessible to the blind.
● English has about 70 phonemes but only 26 letters, so each letter is roughly three ways ambiguous: "c" can be hard as in "card", soft as in "circuit", or like "ch" as in "cherry".
● Context-sensitive exceptions: "a" in "brave" and "gave" is unlike "a" in "have"; "survey" has the same sound but no "a". Think of George Bernard Shaw and "fish" written as "ghoti" ("f" from "enough", "i" from "women", and "sh" from "nation").
● Uses a "context window" of seven spaces.
● The rules required many man-years of programming.
● Look-up tables have two problems. First, their size grows exponentially with the size of the window, with about 500,000 entries needed for a text of 1,000 words and a window of 7 letters. Second, some weighting scheme is required to combine evidence from different partial matches.
● Rumelhart and McClelland (1986) used a single-layer network with the perceptron learning algorithm to learn the past tense of English verbs.
● NETtalk (Terry Sejnowski and Charles Rosenberg, 1986): automated learning on a parallel network of deterministic processing units.
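The seven-letter context window above can be sketched as a simple sliding window over text (a minimal illustration, not DECtalk's or NETtalk's actual code; the "_" padding symbol standing in for the word-break mark is an assumption):

```python
def context_windows(text, size=7):
    """Yield (window, center_letter) pairs: each letter is classified
    with three letters of context on either side, padded at the edges."""
    pad = "_" * (size // 2)          # "_" stands in for the word-break unit
    padded = pad + text + pad
    for i in range(len(text)):
        yield padded[i:i + size], text[i]

for window, center in context_windows("brave"):
    print(window, "->", center)
```

Each letter of "brave" is presented once as the center of a window, which is why the table of possible windows grows so quickly with window size.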

How it worked
Maps letters to phonemes: grapheme-to-phoneme conversion.
● 203 input units encode a string of 7 letters (7 groups of 29): 26 units represent the 26 letters, plus three more for punctuation and word breaks (= 29).
● 80 hidden units.
● 26 output units (23 phoneme-feature units + three stress units). The phonemes are represented in terms of 23 articulatory features, such as point of articulation, voicing, vowel height, and so on; three additional units encode stress and syllable boundaries.
● 18,629 adjustable weights.
● Phonemic codes are sent through a commercial speech synthesizer.
● Supervised learning using the back-propagation algorithm (error is sent back through the network to adjust the weights).
● Trained on 1,024 words; tested on 429 words from the same speaker with 78% accuracy.
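A minimal NumPy sketch of this 203-80-26 architecture with one back-propagation step on squared error (sigmoid units; the learning rate, weight initialization, and example encodings are illustrative assumptions, not the published training parameters):

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Layer sizes from the text: 7 groups of 29 input units, 80 hidden, 26 output.
n_in, n_hid, n_out = 7 * 29, 80, 26
W1 = rng.normal(0, 0.1, (n_hid, n_in))   # input -> hidden weights
b1 = np.zeros(n_hid)
W2 = rng.normal(0, 0.1, (n_out, n_hid))  # hidden -> output weights
b2 = np.zeros(n_out)

def forward(x):
    h = sigmoid(W1 @ x + b1)             # hidden activations
    y = sigmoid(W2 @ h + b2)             # output (phoneme-feature) activations
    return h, y

def backprop_step(x, target, lr=0.5):
    """One gradient-descent step on squared error: the output error is
    sent back through the network to adjust the weights."""
    global W1, b1, W2, b2
    h, y = forward(x)
    delta_out = (y - target) * y * (1 - y)        # error at the output units
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # error propagated to hidden units
    W2 -= lr * np.outer(delta_out, h); b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x); b1 -= lr * delta_hid

# One-hot encode 7 letters (the active positions here are arbitrary examples).
x = np.zeros(n_in)
x[[3, 29 + 1, 58 + 17, 87 + 0, 116 + 21, 145 + 4, 174 + 28]] = 1.0
target = np.zeros(n_out); target[5] = 1.0         # desired feature pattern
for _ in range(100):
    backprop_step(x, target)
```

Repeating the step drives the output for this window toward its target pattern; the real network did this over thousands of word presentations.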

● Two texts were used to train the network: phonetic transcriptions from informal continuous speech of a child [Kit the Kid], and a 20,012-word corpus from a dictionary. A subset of 1,000 words was chosen from this dictionary, taken from the Brown corpus of the most common words in English.
● The corresponding letters and phonemes were aligned, and a special continuation symbol "-" was inserted whenever a letter was silent or part of a graphemic letter combination, as in the conversion from "phone" to the phonemes /f-on-/.
● After 10k trials, best guesses right 80%; after 50k trials, best guesses right 95%.
● Typically 15 of the 80 hidden units are highly activated.
● The network's representation of letter-to-sound is neither local nor distributed: each unit participates in some (not local) but not all (not distributed) correspondences.
● Recovers quickly after damage.
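The letter-to-phoneme alignment with the "-" continuation symbol can be illustrated for the "phone" example above (the per-letter pairing is hand-made for this one word, following the /f-on-/ transcription in the text):

```python
# "ph" -> /f/, so the "h" receives the continuation symbol "-";
# the silent final "e" also maps to "-".
word     = "phone"
phonemes = ["f", "-", "o", "n", "-"]   # one target phoneme per letter
pairs = list(zip(word, phonemes))
print(pairs)  # [('p', 'f'), ('h', '-'), ('o', 'o'), ('n', 'n'), ('e', '-')]
```

This one-target-per-letter alignment is what lets each seven-letter window be trained against a single output pattern for its center letter.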
