SOMETHING
THAT TALKS?
Modeling the Human Vocal Tract
[Figure: the vocal tract as a signal chain. The vocal folds act as a pulse source, formant cavities 1 and 2 act as resonant filters, and the lips, teeth, and tongue add noise and dynamics. Pitch, timing, and formant control signals drive the chain.]
Sound Path Functional Blocks
vowels: pulse source, two resonant filters, envelope (attack, decay)
consonants: noise source, resonant filter, envelope (attack, decay)
Combined, the outputs create a functional approximation of the human vocal tract!
Mixing the output of two filters approximates double-peak speech resonances.
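[Figure: frequency response plots (dB vs. frequency), each with two resonance peaks, for the vowels “ee”, “ah”, “ih”, and “oo”.]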
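The double-peak idea can be checked numerically. The sketch below is only an illustration: it assumes each filter behaves like an ideal second-order band-pass with Q = 10 (the real modules will differ), and it uses the “ah” formant values (700 Hz and 1300 Hz) from the Control Reference table. The function names are made up for this example.

```python
import math

def bandpass_response(freq_hz, center_hz, q):
    """Complex response of an ideal second-order band-pass (gain 1 at its center)."""
    x = q * (freq_hz / center_hz - center_hz / freq_hz)
    return 1 / complex(1, x)

def mixed_spectrum(f1_hz, f2_hz, q=10.0, start=100, stop=3000, step=10):
    """Level in dB of two parallel band-pass filters summed by a mixer."""
    spectrum = []
    for f in range(start, stop + 1, step):
        total = bandpass_response(f, f1_hz, q) + bandpass_response(f, f2_hz, q)
        spectrum.append((f, 20 * math.log10(abs(total))))
    return spectrum

def find_peaks(spectrum):
    """Frequencies whose level is higher than both neighbouring points."""
    triples = zip(spectrum, spectrum[1:], spectrum[2:])
    return [f for (_, a), (f, b), (_, c) in triples if b > a and b > c]

# "ah": f1 = 700 Hz, f2 = 1300 Hz (Control Reference table); Q is an assumption
peaks_ah = find_peaks(mixed_spectrum(700, 1300))
print(peaks_ah)
```

Under these assumptions the mixed response shows one peak near each formant, with a dip between them.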
Let's simplify. Our system will let us use syllables of 1 or 2 phonemes,
as long as any consonant comes first. And let's use simple consonants too.
Things like “koo”, “bah”, or “toe”.
“Koo” has 2 phonemes: /k/ and /oo/. Here's a timing diagram of it:
[Timing diagram: /k/ first, then /oo/.]
So we need to build a machine that first produces the consonant, then after
that produces the vowel. This is something we can handle. So let's discuss how
to build the control circuits of the synthesizer.
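As a small illustration of the consonant-then-vowel ordering, here is a sketch that splits a syllable into its phonemes. It assumes the phoneme spellings used in this deck's sequencer tables (“koo”, “bah”, “toh”); the function names are hypothetical.

```python
# Vowel spellings as they appear in this deck's tables
VOWELS = ("oh", "ah", "ee", "oo", "ih", "eh", "uh", "er")

def split_syllable(syllable):
    """Return (consonant, vowel); either part may be empty."""
    for vowel in VOWELS:
        if syllable.endswith(vowel):
            return syllable[: -len(vowel)], vowel
    return syllable, ""  # consonant only, e.g. "t"

def phoneme_sequence(syllable):
    """Phonemes in the order the machine must produce them: consonant, then vowel."""
    consonant, vowel = split_syllable(syllable)
    return [f"/{p}/" for p in (consonant, vowel) if p]

print(phoneme_sequence("koo"))  # ['/k/', '/oo/']
print(phoneme_sequence("bah"))  # ['/b/', '/ah/']
```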
Spectral Control
Your brain controls the resonant frequencies of formants by
changing the shape of the formant cavities in your throat and mouth.
The littleBits Synth Kit controls the resonant frequencies of
voltage controlled filters by modulating the control voltage to the filter.
The microsequencer has 4 independent steps, each having
its own knob to set the control voltage for that step.
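[Figure: microsequencer steps 1 to 4 plotted against control voltage (0V to 5V), next to a filter response plot (dB vs. frequency).]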
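A rough model of the microsequencer may help here. This sketch assumes each knob maps linearly from 0% to 100% onto 0V to 5V and that each clock pulse advances one step and wraps after step 4; neither detail is confirmed by the slides. The class name is invented, and the knob values are the LoFilt column from the “Autobahn” table.

```python
class MicroSequencer:
    """Four-step sequencer: each step has a knob that sets a control voltage."""

    def __init__(self, knob_percents, full_scale_volts=5.0):
        if len(knob_percents) != 4:
            raise ValueError("the microsequencer has exactly 4 steps")
        self.knob_percents = list(knob_percents)
        self.full_scale_volts = full_scale_volts
        self.step = 0  # index of the active step (0 means step 1)

    def control_voltage(self):
        # Assumption: linear knob-to-voltage mapping
        return self.full_scale_volts * self.knob_percents[self.step] / 100

    def clock(self):
        """Advance to the next step, wrapping from step 4 back to step 1."""
        self.step = (self.step + 1) % 4

    def run(self, pulses):
        """Control voltage seen on each of the next `pulses` clock periods."""
        voltages = []
        for _ in range(pulses):
            voltages.append(self.control_voltage())
            self.clock()
        return voltages

lo_filt = MicroSequencer([30, 15, 30, 0])  # "ah", "toh", "bah", "n"
print(lo_filt.run(5))  # [1.5, 0.75, 1.5, 0.0, 1.5]
```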
Timing Control
This circuit uses a sawtooth (ramp) wave from
the oscillator and a few logic modules to
generate a normal positive-edge-triggered
clock and a delayed clock. The length
of the delay depends on the
frequency of the oscillator, so you will need
to experiment with different frequencies
until the delay sounds right.
[Figure: waveforms of the positive-edge-triggered clock and the delayed clock.]
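To see why the delay changes with oscillator frequency, here is a simplified model. It assumes the delayed clock fires when the rising ramp crosses a fixed fraction of its peak; the actual logic-module circuit is not modeled, and the threshold value and function names are hypothetical.

```python
def ramp(t_ms, osc_hz):
    """Sawtooth level (0.0 up to 1.0) at t_ms milliseconds after the ramp resets."""
    return (t_ms * osc_hz / 1000) % 1.0

def delayed_clock_offset_ms(osc_hz, threshold):
    """First millisecond at which the ramp reaches the threshold (assumed trigger point)."""
    period_ms = int(round(1000 / osc_hz))
    for t_ms in range(period_ms + 1):
        if ramp(t_ms, osc_hz) >= threshold:
            return t_ms
    return None

# Same threshold, two oscillator frequencies
print(delayed_clock_offset_ms(2.0, 0.25))  # 125
print(delayed_clock_offset_ms(1.0, 0.25))  # 250
```

In this model, halving the oscillator frequency doubles the delay, which is why the delay has to be found by ear for each tempo.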
Control Path Functional Blocks
timing: a ramp source drives a positive-edge-triggered clock and a delayed clock
spectra: three sets of variable control voltages (0V to 5V), one to the consonant formant, one to vowel formant 2, and one to vowel formant 1
Functional approximation of the brain and nerve connections to the vocal tract!
Building the Synth
Tune All Three Filters:
[Figure label: Indicator line]
5. Now listen to each filter in succession, making small adjustments with the “cutoff” knob
until all three produce the same pitch.
6. Now don't change the “cutoff” of the filters ever again, unless you want to re-tune them.
7. Turn the “peak” control down (counterclockwise) until the filter just stops oscillating (until it
stops producing a tone).
Build and test the Clock Generator:
5. Turn on the power and observe the LEDs
6. Adjust the dimmer and oscillator knobs to set the LEDs to slow flashing.
7. The delayed clock LED should stay mostly on,
[Figure labels: wire, NOR, LED, o1 led, filter, random, speaker]
3. Set number mode to “values”
4. Set oscillator mode to “saw”
5. Set oscillator “pitch” to 30%
6. Oscillator “tune” knob does not matter
7. Set speaker volume to approximately 50%
8. Set dimmer to 0% (off)
9. Set envelope “attack” to 0%, and “decay” to max
10. Build the circuit pictured
11. Turn the dimmer on slowly to advance the sequencers
12. Check that at each step you can move the knobs on the sequencers to change the values on the number modules and that the sound changes
13. Remove the power, dimmer, and speaker modules when done.
[Figure labels: envelope, mix, number, filter, micro sequencer, split]
Word 1: Cookie
Vowel envelope:
Attack: ~20%
Decay: max
Consonant envelope:
Attack: ~5%
Decay: ~10%
Sequencer Settings:
Step  phoneme  LoFilt  HiFilt  Cons.
------------------------------------
1     "koo"    05      20      15
2     "kee"    10      85      15
3
4
[Figure: synth layout labeling the dimmer, the LoFilt and HiFilt sequencers (steps 1 to 4), the consonant and vowel envelopes, and the vowel-consonant mix.]
Word 2: Barbecue
Vowel envelope:
Attack: ~20%
Decay: max
Consonant envelope:
Attack: ~5%
Decay: ~10%
Sequencer Settings:
Step  phoneme  LoFilt  HiFilt  Cons.
------------------------------------
1     "bah"    30      60      5
2     "bee"    10      85      5
3     "koo"    05      20      15
4
[Figure: synth layout labeling the dimmer, the LoFilt and HiFilt sequencers (steps 1 to 4), the consonant and vowel envelopes, and the vowel-consonant mix.]
A Few Enhancements
Adding delays in front of the filters can add
motion to the vowels, which can sound more
lifelike. Start with “delay” at min, and turn
up the “feedback”.
Word 3: Autobahn
Vowel envelope:
Attack: ~20%
Decay: max
Consonant envelope:
Attack: ~5%
Decay: ~10%
Sequencer Settings:
Step  phoneme  LoFilt  HiFilt  Cons.
------------------------------------
1     "ah"     30      60      0
2     "toh"    15      40      25
3     "bah"    30      60      5
4     "n"      0       90      0
[Figure labels: Dimmer, Steps 1 to 4, HiFilt Delay, LoFilt Delay, Vocal, consonant and vowel envelopes, vowel-consonant mix]
Word 4: Robot
Vowel envelope:
Attack: ~20%
Decay: max
Consonant envelope:
Attack: ~5%
Decay: ~10%
Sequencer Settings:
Step  phoneme  LoFilt  HiFilt  Cons.
------------------------------------
1     "roh"    15      40      0
2     "bah"    30      60      5
3     "t"      0       0       25
4
Control Reference
All filters tuned to ~800Hz (about 45%), peak set above 50%
Vowel envelope:
Attack: ~20%
Decay: max
Consonant envelope:
Attack: ~0%, varies
Decay: ~10%, varies
Vowel Formant Frequencies:
phoneme  f1 (Hz)  f2 (Hz)  filt#1  filt#2
-----------------------------------------
"oh"     450      1000     15      40
"ah"     700      1300     30      60
"ee"     400      2500     10      85
"oo"     350      700      05      20
"ih"     350      2500     05      85
"eh"     750      2300     35      80
"uh"     420      1200     15      45
"er"     450      1400     20      55
"ll"     300      3000     00      90
Consonant Formant Frequencies:
cons.  f1 (Hz)  Cons. Seq. Knob Setting
---------------------------------------
w      290      12
y      260      07
r      310      15
l      310      15
f      340      17
v      220      05
s      320      15
Z      240      10
ch     350      19
jh     260      08
p      400      24
b      200      00
t      400      24
d      200      00
k      300      14
g      200      00
m      270      10
n      270      10
Adapted from Dennis H. Klatt p987
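The reference tables can be turned into a lookup that proposes a starting sequencer row for any consonant and vowel pair. This is only a sketch: the dictionaries copy the knob columns above, and the function name is made up. The hand-tuned word tables in this deck sometimes use slightly different values (for example 15 rather than 14 for /k/), so treat the result as a first guess to adjust by ear.

```python
# Knob settings (filt#1, filt#2) copied from the Vowel Formant Frequencies table
VOWEL_KNOBS = {
    "oh": (15, 40), "ah": (30, 60), "ee": (10, 85), "oo": (5, 20), "ih": (5, 85),
    "eh": (35, 80), "uh": (15, 45), "er": (20, 55), "ll": (0, 90),
}

# Knob settings copied from the Consonant Formant Frequencies table
CONSONANT_KNOBS = {
    "w": 12, "y": 7, "r": 15, "l": 15, "f": 17, "v": 5, "s": 15, "Z": 10,
    "ch": 19, "jh": 8, "p": 24, "b": 0, "t": 24, "d": 0, "k": 14, "g": 0,
    "m": 10, "n": 10,
}

def sequencer_row(consonant, vowel):
    """Return (LoFilt, HiFilt, Cons.) knob settings; an empty string means 'not used'."""
    lo_filt, hi_filt = VOWEL_KNOBS[vowel] if vowel else (0, 0)
    cons = CONSONANT_KNOBS[consonant] if consonant else 0
    return lo_filt, hi_filt, cons

print(sequencer_row("k", "oo"))  # (5, 20, 14)
print(sequencer_row("t", ""))    # (0, 0, 24)
```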
Further Research Reference
Vocal Synthesis History:
Voder
Vocoder
Speak-N-Spell
Speech Science:
Homer Dudley (Voder, Vocoder)
Dennis H. Klatt (Rules based synthesis)
http://www.cs.indiana.edu/rhythmsp/ASA/Contents.html
http://dspace.mit.edu/handle/1721.1/29185 *good bibliography
[...] filters, just like the vowel block, but using noise
instead of an oscillator for the source.
[Circuit diagram: complete synth wiring. Modules shown: p1 power, i6 dimmer, i31 oscillator, i32 filter, i33 envelope, i34 random, i35 delay, i36 microsequencer, i37 mix, o21 number, o24 synth speaker, w1 wire, w7 fork, w10 inverter, w15 NOR, w17 XOR, w19 split.]
Next Steps 2:
Control and timing are difficult. Using a programmable controller would
improve intelligibility. If we could program each syllable or word
individually, all the detailed timings and filter movements would make
speech more realistic. Our new Arduino module, for example, would be a
perfect fit for this job.
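As a sketch of what per-word programming might look like, the example below turns a word into a timed list of controller events. The knob values come from the “Cookie” table; the delay and length values in milliseconds, the event names, and the `Syllable` class are all hypothetical placeholders, not measured settings or a real littleBits API.

```python
from dataclasses import dataclass

@dataclass
class Syllable:
    name: str
    lo_filt: int         # knob setting, 0-100
    hi_filt: int         # knob setting, 0-100
    cons: int            # knob setting, 0-100
    vowel_delay_ms: int  # hypothetical: gap between consonant and vowel triggers
    length_ms: int       # hypothetical: total time given to the syllable

def schedule(word):
    """Flatten a word into (time_ms, action, payload) events for a controller."""
    events = []
    start_ms = 0
    for s in word:
        events.append((start_ms, "set_knobs", (s.lo_filt, s.hi_filt, s.cons)))
        events.append((start_ms, "trigger_consonant", s.name))
        events.append((start_ms + s.vowel_delay_ms, "trigger_vowel", s.name))
        start_ms += s.length_ms
    return events

COOKIE = [
    Syllable("koo", 5, 20, 15, vowel_delay_ms=60, length_ms=300),
    Syllable("kee", 10, 85, 15, vowel_delay_ms=40, length_ms=400),
]

for event in schedule(COOKIE):
    print(event)
```

Because each syllable carries its own delay and length, timing can be tuned per syllable instead of sharing one oscillator-dependent delay across the whole word.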
[Circuit diagram: the synth circuit with a littleBits Arduino module (labeled Power, Clock, CV1, CV2) overlaid on the control path. Other modules shown: p1 power, i6 dimmer, i31 oscillator, i32 filter, i33 envelope, i34 random, i35 delay, i36 microsequencer, i37 mix, o21 number, o24 synth speaker, w1 wire, w7 fork, w10 inverter, w15 NOR, w17 XOR, w19 split.]
Here you see how you could use two littleBits Arduino modules to replace
about 17 regular modules and get improved functionality.
[Circuit diagram: the simplified synth with two littleBits Arduino modules (each labeled Power, Clock, CV1, CV2). Other modules shown: i31 oscillator, i32 filter, i33 envelope, i34 random, i35 delay, i37 mix, o24 synth speaker, w19 split.]