
Rule-Based Synthesis

LING 285
Spring 2021
Mary Byram Washburn
Rule-Based Synthesis
Physical Properties:

- Harmonics
  - f0
  - Decrease in amplitude
- Formants
  - F1 and F2 the loudest
Rule-Based Synthesis

• Rule-Based Speech Synthesis:
  • Imitation of the physical properties of speech, using something other than the human voice
• Synthesize the acoustics of speech
  • Harmonics
    • f0
    • Decrease in amplitude
  • Formants
    • F1 and F2
Pure Tone = Simple Signal

Pure Tone Synthesis

• Lots of Pure Tones
• One for each:
  • Harmonic
    • f0
    • Decrease in amplitude
  • Formant
    • F1 and F2 the loudest
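
The "one pure tone for each harmonic" idea can be sketched in a few lines of Python. This is a minimal illustration, not the method used in class: the sample rate, the 1/k amplitude rule, and the function names `pure_tone` and `harmonic_stack` are all assumptions made for the example. It covers only the harmonics and the decrease in amplitude; formants are left out.

```python
import math

SAMPLE_RATE = 16000  # samples per second (assumed)

def pure_tone(freq_hz, amplitude, duration_ms):
    """One sine wave: a simple signal with a single frequency."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n)]

def harmonic_stack(f0_hz, n_harmonics, duration_ms):
    """Add pure tones at f0, 2*f0, 3*f0, ... with amplitude 1/k for harmonic k."""
    tones = [pure_tone(k * f0_hz, 1.0 / k, duration_ms)
             for k in range(1, n_harmonics + 1)]
    # Add the tones together sample by sample.
    return [sum(column) for column in zip(*tones)]

# Ten harmonics of a 200 Hz f0, half a second long.
signal = harmonic_stack(f0_hz=200, n_harmonics=10, duration_ms=500)
```

Each harmonic sits at a whole-number multiple of f0, and the 1/k factor is one simple way to make higher harmonics quieter.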



[Figure: frequency axis from 0 to 5000 Hz]
Vowel Acoustics
For Vowels:

• 2 tones (F1 and F2)

[Figure labels: 2500 Hz, 300 Hz]
Rule-Based Synthesis

Synthesized /æ/
F1: high, 800 Hz
F2: high, 1800 Hz
f0: female, 200 Hz
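
The /æ/ recipe above (f0 200 Hz, F1 800 Hz, F2 1800 Hz) can be tried out with a short Python sketch. Only those three frequencies come from the slide. The sample rate, the 1/k fall-off, the bell-shaped formant boost with its `boost` and `bandwidth_hz` values, and all function names are assumptions chosen so that the harmonics at F1 and F2 come out loudest.

```python
import math
import struct
import wave

SAMPLE_RATE = 16000          # samples per second (assumed)
F0, F1, F2 = 200, 800, 1800  # values from the slide, in Hz

def harmonic_amplitudes(f0=F0, formants=(F1, F2), max_hz=5000,
                        boost=20.0, bandwidth_hz=100.0):
    """Map each harmonic frequency to an amplitude.

    Amplitude falls off as 1/k with harmonic number k, then is raised
    near each formant by a bell-shaped gain (boost and bandwidth_hz are
    assumed values, not taken from the slides).
    """
    amps = {}
    for k in range(1, max_hz // f0 + 1):
        freq = k * f0
        gain = 1.0 + sum(boost * math.exp(-((freq - f) / bandwidth_hz) ** 2)
                         for f in formants)
        amps[freq] = gain / k
    return amps

def synthesize(amps, duration_ms=500):
    """Add one pure tone per harmonic and scale the peak to 0.9."""
    n = SAMPLE_RATE * duration_ms // 1000
    samples = [sum(a * math.sin(2 * math.pi * f * i / SAMPLE_RATE)
                   for f, a in amps.items())
               for i in range(n)]
    peak = max(abs(s) for s in samples)
    return [0.9 * s / peak for s in samples]

def write_wav(path, samples):
    """Save samples in the range -1..1 as 16-bit mono audio."""
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(SAMPLE_RATE)
        out.writeframes(b"".join(struct.pack("<h", int(s * 32767))
                                 for s in samples))

amps = harmonic_amplitudes()
vowel = synthesize(amps)
# write_wav("ae.wav", vowel)  # uncomment to listen to the result
```

Changing `F1` and `F2` while keeping `F0` fixed is a quick way to hear how the formants, not the pitch, decide which vowel is heard.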
Rule-Based Synthesis

natural /æ/
synthesized /æ/
Consonant Acoustics

Stops: involve 2 gestures, closure and release
Kick /kɪk/, Pip /pɪp/

Fricatives: just 1 gesture
Fish /fɪʃ/, Sis /sɪs/
Consonant Acoustics: Fricatives
mesh, mess?

[Figure: two panels labeled "Mesh" and "Mess", each with a 5000 Hz mark on the frequency axis]
Rule-Based Synthesis
Synthesizing Consonants

Synthesize the acoustics of speech
• Stops:
  • Silence during closure gesture
  • A lot of high energy when there’s aspiration on the release gesture
• Fricatives
  • Energy is high and long
  • Sibilants are louder than other fricatives
    • /s/: starts >4000 Hz
    • /ʃ/: starts ~3000 Hz
• Nasals
  • Nasal Murmur: loud at ~250 Hz
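
The three consonant rules above can be sketched in Python as follows. This is a rough illustration under stated assumptions: the sample rate, every duration and amplitude, the 2000 Hz lower edge of the stop burst, the 8000 Hz upper edge of the noise, and all function names are made up for the example. Only the silence-then-burst pattern, the ~4000 Hz and ~3000 Hz starting points, and the ~250 Hz murmur come from the slide. Noise is approximated by adding many sine waves at random frequencies.

```python
import math
import random

SAMPLE_RATE = 16000  # assumed; the highest usable frequency is 8000 Hz

def silence(duration_ms):
    """Closure gesture of a stop: no energy at all."""
    return [0.0] * (SAMPLE_RATE * duration_ms // 1000)

def noise_band(low_hz, high_hz, duration_ms, amplitude, rng, n_components=40):
    """Rough noise: many sine waves at random frequencies and phases."""
    n = SAMPLE_RATE * duration_ms // 1000
    parts = [(rng.uniform(low_hz, high_hz), rng.uniform(0, 2 * math.pi))
             for _ in range(n_components)]
    scale = amplitude / n_components
    return [scale * sum(math.sin(2 * math.pi * f * i / SAMPLE_RATE + p)
                        for f, p in parts)
            for i in range(n)]

def stop(rng):
    """Silence for the closure, then a short loud burst for the release."""
    return silence(80) + noise_band(2000, 8000, 30, 0.8, rng)

def fricative(name, rng):
    """Long noise whose lower edge follows the slide: "s" 4000 Hz, "sh" 3000 Hz."""
    low_hz = {"s": 4000, "sh": 3000}[name]
    return noise_band(low_hz, 8000, 200, 0.5, rng)

def nasal_murmur(duration_ms=100):
    """A single loud tone at about 250 Hz."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [0.8 * math.sin(2 * math.pi * 250 * i / SAMPLE_RATE)
            for i in range(n)]

rng = random.Random(0)  # fixed seed so the output is repeatable
stop_samples = stop(rng)
s_noise = fricative("s", rng)
murmur = nasal_murmur()
```

Here the key "sh" stands for /ʃ/. The only difference between the two sibilants in this sketch is where the noise band starts.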



Consonant Acoustics
nap, pam, pop?

[Figure: three panels labeled "pam", "pop", and "nap"]
3412, 3512, 3612, 3712

Consonant Acoustics: Voicing

[Figure: two panels labeled "[z]" and "[s]"]
Rule-Based Synthesis
Synthesizing Consonants

Synthesize the acoustics of speech

Voicing
Voiced: harmonics
• Frequencies at intervals of the f0
Unvoiced: noise
• Frequencies at random intervals
• Do not have harmonics

hiss his
/hɪs/ /hɪz/
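
The voiced versus unvoiced contrast comes down to where the component frequencies fall. A minimal Python sketch, assuming a 200 Hz f0 and a 5000 Hz ceiling (both chosen for illustration, as are the function names):

```python
import random

def voiced_frequencies(f0_hz, max_hz):
    """Harmonics: one frequency at every multiple of f0 up to max_hz."""
    return list(range(f0_hz, max_hz + 1, f0_hz))

def unvoiced_frequencies(count, max_hz, rng):
    """Noise: the same number of frequencies, placed at random."""
    return sorted(rng.uniform(0, max_hz) for _ in range(count))

def intervals(freqs):
    """Gaps between neighboring frequencies."""
    return [b - a for a, b in zip(freqs, freqs[1:])]

voiced = voiced_frequencies(200, 5000)
unvoiced = unvoiced_frequencies(len(voiced), 5000, random.Random(1))

# Every gap in the voiced list equals f0; the unvoiced gaps are irregular.
voiced_gaps = set(intervals(voiced))
unvoiced_gaps = set(intervals(unvoiced))
```

For the voiced list, `voiced_gaps` contains the single value 200, which is the f0. The unvoiced list has no such regular spacing, so it has no harmonics.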
Consonant Acoustics: Voicing

/eib/ /eip/
Rule-Based Synthesis
Synthesizing Consonants

Synthesize the acoustics of speech

Voicing
Voiced: harmonics
• Frequencies at intervals of the f0
• Long preceding vowel, Short consonant
Unvoiced: noise
• Frequencies at random intervals
• Short preceding vowel, Long consonant
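
The duration rule can be written as a small lookup. The sketch below is illustrative only: the 200 ms and 100 ms values and the function name `final_consonant_timing` are assumptions, since the slides give no actual durations.

```python
LONG_MS = 200   # assumed duration for the "long" segment
SHORT_MS = 100  # assumed duration for the "short" segment

def final_consonant_timing(voiced):
    """Return (vowel_ms, consonant_ms) for a vowel plus final consonant."""
    if voiced:
        return LONG_MS, SHORT_MS   # long preceding vowel, short consonant
    return SHORT_MS, LONG_MS       # short preceding vowel, long consonant

for word, voiced in [("/eib/", True), ("/eip/", False)]:
    vowel_ms, consonant_ms = final_consonant_timing(voiced)
    print(word, "vowel:", vowel_ms, "ms, consonant:", consonant_ms, "ms")
```

A synthesizer following this rule would stretch the vowel before a voiced consonant and stretch the consonant itself when it is unvoiced.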

Consonant Acoustics: Voicing
Liege or Leash?
/liʒ/ /liʃ/

liege (voiced)
leash (unvoiced)
Rule-Based Synthesis in Praat

[s]: 5000 Hz, 5200 Hz, 6000 Hz
[ʃ]: 3000 Hz, 5000 Hz, 5200 Hz
Rule-Based Synthesis
Formant Based Synthesis

• Rule-Based Speech Synthesis:
  • Imitation of the physical properties of speech, using something other than the human voice
• Synthesize the acoustics of speech
  • Harmonics
    • f0
    • Decrease in amplitude
  • Formants
    • F1 and F2


Rule-Based Synthesis: Consonants

• Stops:
  • Silence during closure gesture
  • A lot of high energy when there’s aspiration on the release gesture
• Fricatives
  • Energy is high and long
  • Sibilants are louder than other fricatives
    • /s/: starts >4000 Hz
    • /ʃ/: starts ~3000 Hz
• Nasals
  • Nasal Murmur: loud at ~250 Hz
• Voicing
  • Voiced: harmonics
    • Frequencies at intervals of the f0
    • Long preceding vowel, Short consonant
  • Unvoiced: noise
    • Frequencies at random intervals
