LING 285
Spring 2021
Mary Byram Washburn
Rule-Based Synthesis
Physical Properties:
- Harmonics
  - f0
  - Decrease in amplitude
- Formants
  - F1 and F2 the loudest
Rule-Based Speech Synthesis:
• Harmonics
  • f0
  • Decrease in amplitude
• Formants
  • F1 and F2
Pure Tone Synthesis
Pure Tone = Simple Signal
• Lots of Pure Tones
• One for each:
  • Harmonic
    • f0
    • Decrease in amplitude
  • Formant
[Figure: pure tones at 300 Hz and 2500 Hz]
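The idea of building a sound out of pure tones can be sketched in a few lines of Python. This is only an illustration, not the Praat procedure used in class: the sample rate, amplitudes, and duration are assumptions, and the helper names `pure_tone` and `mix` are made up for this example. The two frequencies are the 300 Hz and 2500 Hz shown on the slide.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed; not given on the slide)


def pure_tone(freq_hz, amplitude, duration_s, sample_rate=SAMPLE_RATE):
    """Return one pure tone (a single sine wave) as a list of samples."""
    n = round(duration_s * sample_rate)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        for i in range(n)
    ]


def mix(tones):
    """Add several equal-length tones together, sample by sample."""
    return [sum(samples) for samples in zip(*tones)]


# Two pure tones at the frequencies shown on the slide.
# The amplitudes (1.0 and 0.5) are illustrative assumptions.
low = pure_tone(300, 1.0, 0.5)
high = pure_tone(2500, 0.5, 0.5)
signal = mix([low, high])
```

Each extra harmonic or formant would be one more `pure_tone` call added to the list passed to `mix`.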
Rule-Based Synthesis
Synthesized /æ/
• F1: high, 800 Hz
• F2: high, 1800 Hz
• f0: female, 200 Hz
Rule-Based Synthesis
[Figure: natural /æ/ compared with synthesized /æ/]
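A minimal sketch of how the /æ/ rules could be turned into a signal: harmonics at multiples of the 200 Hz f0, amplitude decreasing with each harmonic, and the harmonics nearest F1 (800 Hz) and F2 (1800 Hz) made loudest. The f0 and formant values come from the slide. The 1/k amplitude drop, the boost factor, the bandwidth, the 4000 Hz ceiling, and the function names are all assumptions chosen for illustration.

```python
import math

SAMPLE_RATE = 16000      # assumed sample rate
F0 = 200                 # f0 from the slide (female voice)
FORMANTS = (800, 1800)   # F1 and F2 from the slide
BOOST = 4.0              # assumed: how much louder formant harmonics are
BANDWIDTH = 100          # assumed: how close (Hz) a harmonic must be to a formant


def harmonic_amplitude(k):
    """Amplitude of the k-th harmonic: decreases as 1/k, boosted near a formant."""
    freq = k * F0
    amp = 1.0 / k
    if any(abs(freq - f) <= BANDWIDTH for f in FORMANTS):
        amp *= BOOST
    return amp


def synthesize_vowel(duration_s=0.25, max_freq=4000):
    """Sum one pure tone per harmonic of F0, up to max_freq."""
    n = round(duration_s * SAMPLE_RATE)
    harmonics = range(1, max_freq // F0 + 1)
    return [
        sum(
            harmonic_amplitude(k) * math.sin(2 * math.pi * k * F0 * i / SAMPLE_RATE)
            for k in harmonics
        )
        for i in range(n)
    ]


vowel = synthesize_vowel()
```

With these settings the 4th harmonic (800 Hz) and the 9th harmonic (1800 Hz) sit exactly on the formants, so they are the ones that get boosted.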
Consonant Acoustics
Stops: involve 2 gestures, closure and release
• Kick /kɪk/
• Pip /pɪp/
Fricatives: just 1 gesture
• Fish /fɪʃ/
• Sis /sɪs/
Consonant Acoustics: Fricatives
mesh or mess?
[Figure: "Mesh" and "Mess", each marked at 5000 Hz]
Rule-Based Synthesis: Synthesizing Consonants
• Synthesize the acoustics of speech
• Stops:
  • pop
  • nap
3412, 3512, 3612, 3712
Consonant Acoustics: Voicing
[z] vs. [s]
Rule-Based Synthesis: Synthesizing Consonants
• Synthesize the acoustics of speech
Voicing
• Voiced: harmonics
  • Frequencies at intervals of the f0
• Unvoiced: noise
  • Frequencies at random intervals
  • Do not have harmonics
• hiss /hɪs/, his /hɪz/
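The voiced/unvoiced contrast can be sketched as two ways of choosing which frequencies go into the sound: evenly spaced multiples of f0 for a voiced sound, randomly placed frequencies for an unvoiced one. This is a simplified illustration, not a full noise model. The frequency range, component count, sample rate, and function names are assumptions.

```python
import math
import random


def voiced_frequencies(f0, count):
    """Harmonics: frequencies at regular intervals of the f0."""
    return [f0 * k for k in range(1, count + 1)]


def unvoiced_frequencies(low_hz, high_hz, count, rng):
    """Noise: frequencies at random intervals, so no harmonics."""
    return sorted(rng.uniform(low_hz, high_hz) for _ in range(count))


def synthesize(frequencies, duration_s=0.1, sample_rate=16000):
    """Sum one pure tone per frequency, all at equal amplitude."""
    n = round(duration_s * sample_rate)
    return [
        sum(math.sin(2 * math.pi * f * i / sample_rate) for f in frequencies)
        for i in range(n)
    ]


rng = random.Random(0)  # fixed seed so the example is repeatable

# [z] as in "his": built on harmonics of an assumed 200 Hz f0
z_like = synthesize(voiced_frequencies(200, 10))

# [s] as in "hiss": frequencies scattered at random (assumed 4000-7000 Hz band)
s_like = synthesize(unvoiced_frequencies(4000, 7000, 10, rng))
```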
Consonant Acoustics: Voicing
/eib/ /eip/
Rule-Based Synthesis: Synthesizing Consonants
• Synthesize the acoustics of speech
Voicing
• Voiced: harmonics
  • Frequencies at intervals of the f0
  • Long preceding vowel, short consonant
• Unvoiced: noise
  • Frequencies at random intervals
  • Short preceding vowel, long consonant
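The voicing rules above can be written as a small lookup that a synthesizer would consult before generating sound. This is a sketch only: the millisecond values are invented for illustration (the slides give no durations), and `segment_plan` is a hypothetical name.

```python
# Illustrative durations in milliseconds. These numbers are assumptions,
# not measurements from the slides; only the long/short pattern is sourced.
LONG_VOWEL_MS = 200
SHORT_VOWEL_MS = 120
LONG_CONSONANT_MS = 150
SHORT_CONSONANT_MS = 80


def segment_plan(consonant_is_voiced):
    """Return the sound source and durations for a vowel + final consonant."""
    if consonant_is_voiced:
        # Voiced: harmonics, long preceding vowel, short consonant
        return {
            "source": "harmonics",
            "vowel_ms": LONG_VOWEL_MS,
            "consonant_ms": SHORT_CONSONANT_MS,
        }
    # Unvoiced: noise, short preceding vowel, long consonant
    return {
        "source": "noise",
        "vowel_ms": SHORT_VOWEL_MS,
        "consonant_ms": LONG_CONSONANT_MS,
    }


voiced_plan = segment_plan(True)     # e.g. a word ending in a voiced consonant
unvoiced_plan = segment_plan(False)  # e.g. a word ending in an unvoiced consonant
```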
Consonant Acoustics: Voicing
Liege or Leash?
• liege /liʒ/ (voiced)
• leash /liʃ/ (unvoiced)
Rule-Based Synthesis in Praat
[s]
• 5000 Hz
• 5200 Hz
• 6000 Hz
[ʃ]
• 3000 Hz
• 5000 Hz
• 5200 Hz
Rule-Based Speech Synthesis:
• Harmonics
  • f0
  • Decrease in amplitude
• Formants
  • F1 and F2
Rule-Based Synthesis: Consonants
• Stops:
  • Silence during closure gesture
  • A lot of high energy when there's aspiration on the release gesture
• Fricatives
  • Energy is high and long
  • Sibilants are louder than other fricatives
    • /s/: starts >4000 Hz
    • /ʃ/: starts ~3000 Hz
• Nasals
• Voicing
  • Voiced: harmonics
    • Frequencies at intervals of the f0
  • Unvoiced: noise
    • Frequencies at random intervals
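The stop rules in this summary (silence during closure, then a burst of energy on an aspirated release) can be sketched as two pieces joined end to end. This is an illustration under assumptions: the durations, sample rate, fading-noise shape of the release, and the name `synthesize_stop` are not from the slides.

```python
import random


def synthesize_stop(closure_s=0.08, release_s=0.04, sample_rate=16000, seed=0):
    """Return samples for a stop: silent closure followed by a noisy release."""
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    n_closure = round(closure_s * sample_rate)
    n_release = round(release_s * sample_rate)

    # Closure gesture: the vocal tract is blocked, so the signal is silent.
    closure = [0.0] * n_closure

    # Release gesture: aspiration modeled as random noise that starts loud
    # and fades out linearly (the fade shape is an assumption).
    release = [
        rng.uniform(-1.0, 1.0) * (1 - i / n_release)
        for i in range(n_release)
    ]
    return closure + release


stop = synthesize_stop()
```

A fricative could reuse the release part alone with a longer duration, since its energy is high and long rather than a short burst after silence.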