To properly view the phonetic symbols in the text below, you must have installed either SILDoulos IPA93
or Lucida Sans
Unicode. If you are desperate to see phonetic symbols in SIL Sophia or SIL Manuscript, or some other kind of Unicode,
drop me a line.
Solution for October 2002
"Virgin sharks lay eggs that hatch."
In case you're wondering, yes, this happened. Parthenogenesis has been reported in a white spotted bamboo shark at
Belle Isle Aquarium in Detroit, MI. Check out this report from National Geographic. Belle Isle Aquarium appears to be
affiliated with the Detroit Zoological Institute but they don't seem to have their own website.
Lower-case V, IPA 129

SIL [V], Unicode [ʋ]
Very sonorant, there being resonances (formants) about, well, in the low-to-mid F1 range and the very low F2 range.
These resonances are weak, compared to the following vowel. So we're clearly looking at a sonorant consonant, and
probably not nasal, since there's no zero in the usual range. (The apparent zeroes above 2000 Hz are attributable to the
general lack of energy/amplitude, since the energy in voicing drops off something like 6dB/octave under normal
conditions.) For those of you who are noticing the extremely low F3, you may want this to be an /r/. But I've rarely seen
an utterance-initial /r/ with a) so much energy, b) a distinct steady state that is that long, and c) any energy above F3.
And check out the F4, which seems to be rising. This one is tough, but the transitions in F4 and F5 (if you can convince
yourself to see them) may indicate labiality. There's also just a trace of noise in F3 and F4, and certainly in the transition
between this thing and the following vowel, which might indicate an underlying fricative. Why on earth an initial
fricative should vocalize quite like /v/ always seems to do is quite beyond me.
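The 6 dB/octave drop-off mentioned above can be turned into rough numbers. This is only a sketch: the 100 Hz reference frequency and the function name are assumptions made for illustration, and the slope is the "something like" figure from the text, not a measurement. With those assumptions the voicing source is already roughly 26 dB down by 2000 Hz, which is why weak energy up there need not be a zero.

```python
import math

def voicing_level_db(freq_hz, ref_hz=100.0, slope_db_per_octave=6.0):
    """Level of voicing energy at freq_hz, in dB relative to the level at ref_hz.

    ref_hz is an assumed reference point; the slope is the rough
    6 dB/octave figure quoted in the text.
    """
    octaves_above_ref = math.log2(freq_hz / ref_hz)
    return -slope_db_per_octave * octaves_above_ref

# Each doubling of frequency costs another 6 dB.
for freq in (100, 200, 400, 800, 1600, 3200):
    print(f"{freq:5d} Hz: {voicing_level_db(freq):6.1f} dB")
```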
Turned R + Syllabicity Mark, IPA 151 + 431

SIL [灼], Unicode [ɹ̩]
Well, having already mentioned the F3, I'll just say there it is. F3s that low can only be North-American approximant
/r/. The rising F2/F3 are probably transitions into the following consonant.
Lower-case D, IPA 104

SIL [d], Unicode [d]
I'm still trying to decide if the noise I get in otherwise well-behaved stops is just me, my recording environment (I
usually just sit in my office for these things), my microphone (which isn't bad in the abstract), or some combination.
But there you go. This is (weird echoey noise at 1500 Hz aside) a gap, indicating a good stop closure. And note the clear
voicing striations at the bottom. Voiced stop. Definitely not labial, due to the rising F2 into and falling F2 out. Not really
velar-pinchy looking, but that might be a good guess, but I don't want to open up the post-alveolar-as-palatal-as-
fronted-velar debate at this moment. Note the fricative release, treated separately.
Yogh, IPA 135

SIL [Z], Unicode [ʒ]
Probably still voiced, although this bit of high-amplitude frication is a bit short to be certain. Note the clear burst
running top to bottom at about 320 msec, and the very high amplitude noise in the F3-F4-and-higher range. High
amplitude, high frequency frication is sibilance. The fact that the amplitude peak seems to be in the F4-F5 range (i.e.
visible to us on this frequency scale) and not above, and also that there's really no noise below 1500 Hz, suggests
post-alveolar rather than alveolar sibilance. Voiced, probably, as I say.
Barred I, IPA 317

SIL [醹, Unicode [ɨ]
Short vowel. Given that this is so short it's almost definitely unstressed, mark it with something schwa-like and go on. I
chose barred-i following Keating et al. (1994), given the F2 appears to be closer to F3 than F1.
Lower-case N, IPA 116

SIL [n], Unicode [n]
This is a nice long sonorant consonant, for the same reason the first one was. It's got resonances, but they're weak
relative to things that we know we are going to want to call vowels. This one does have a good zero, about 1200 Hz
again, with a pole at about 1500. If you know my voice, you know this is probably /n/. Note the length of this thing. This
is what is often considered a syllabic nasal, but as you can see, there's definitely a short stretch of vowel before the
nasal begins. So empirically/phonetically, not syllabic at all. Phonologically? Who knows? As a phonetician, I don't care
(much).
Esh, IPA 134

SIL [S], Unicode [ʃ]
High amplitude frication, concentrated in the F3-F4 range, and cutting out below 1500 Hz or so. Sibilance, probably
post-alveolar, and in this case absolutely voiceless. The 'downward' moving centre to the noise is probably due to
coarticulation with the following rhoticized vowel and the /r/ following that. It sort of follows the path of the F3.
Script A + Rhoticity Sign, IPA 305 + 419

SIL [A惈, Unicode [ɑ˞]
I don't want to have a fight about this. Here's what always seems to go on with vaguely backish and non-high vowels
followed by /r/. There's something like a steady state, or at least a moment when the F2 reaches its maximum/minimum,
while the F3 is moving above it, and then there's a moment when the F3 reaches its minimum, with the F2 moving
under it. The F3 steady state is harder to view here, but I've located it approximately in my segmentation. So the first
part of this has the high F1, low F2, the two straddling 1000 Hz, structure of a low back vowel (script a). It has a low F3 of
a rhoticized vowel. So that's how I've transcribed it.
Turned R, IPA 151

SIL [沘, Unicode [ɹ]
Continuing in that vein, I regard the /r/ here as separate and segmentable. Notice there's a sharp discontinuity in pitch
at about 720 msec, which is more or less the moment where the F1 and F2 start to move again. Note also that the F3
(which is kind of indistinct in the preceding segment) becomes more distinct, and is pretty flat. So even though the F2
is moving, I regard this as a separate segment. It's not my fault that the 'specifications' for /r/ include the low F3, but
do not include anything specific for F2. This is my story and I'm sticking to it.
Lower-case K, IPA 109

SIL [k], Unicode [k]
If you are clever, you spotted the velar pinch with the F3 of the /r/ dropping for no apparent reason just a bit at the
end, meeting the upward moving F2 on the way to the closure. You may also have noticed the double burst at about 900
msec, which commonly accompanies velar releases.
Lower-case S, IPA 132
SIL [s], Unicode [s]
Definitely fricative, almost definitely voiceless, and with its amplitude getting higher as you go up the frequency scale.
Definitely [s].
Lower-case L + Mid Tilde, IPA 155 + 428

SIL [l瀧, Unicode [ɫ] (hey, lookee, this is the composed Unicode symbol)
Note the low F2, starting around 1000 Hz and rising sharply. Note also the raised F3, both commonly associated with
dark (velarized) /l/ in North American English.
Lower-case E + Raising Sign, IPA 302 + 429

SIL [e3], Unicode [e̝] (NB: I've just noticed that the Lucida Sans Unicode for this diacritic is wrong)
Owing to the F2 transition from the dark /l/, I had a hard time assigning a symbol, or for that matter a segmentation, to
this vowel. So I compromised. It's mid (F1 around 500 Hz), at least once you get past the /l/ around 1090 msec. It's very
front, at least at the end, where it looks vaguely steady (at least compared to the previous 100 msec or so) for three
pulses or so around 1150 msec. So is it a diphthong? Dunno, but I have trouble calling something that has a steady state
a glide/semivowel when the supposed nucleus of the thing doesn't even seem to have a 'blip' of a steady state
anywhere. But maybe that's just my problem.
Glottal Stop, IPA 113

SIL [?], Unicode [ʔ]
See the creakiness? The temporal disruption of the pulse patterns? That's creakiness, and it should probably be
transcribed that way. But this is the phonetic reflex of a glottal stop separating otherwise adjacent vowels in different
syllables.
Epsilon, IPA 303

SIL [E], Unicode [ɛ]
Mid, though that extra harmonic or whatever it is suggests that the centre of this formant is just a hair higher than the
preceding one, indicating that this vowel is just a hair lower than the preceding one. But still very (very) front. So
lower-mid and front.
Lower-case G, IPA 110

SIL [g], Unicode [g]
Nice little gap, for a change, and nice little voicing bar. Transitions out are not helpful, but there's definitely velar pinch
heading into it.
Lower-case Z, IPA 133

SIL [z], Unicode [z]
This is the best [z] I've ever had, I think. It's basically the same as the [s] earlier, but there's evidence of voicing at the
bottom. Well, some.
Eth, IPA 131

SIL [D3], Unicode [ð̝]
Something happened here. I'm not sure what. Well, I know what it was supposed to be. It was supposed to be an eth.
And for some reason, I thought once upon a time, it was stopped. But looking at the spectrogram again, I'm not so sure
anymore. Okay, there's a change in the quality of the frication, in the 25 msec or so before 1550 msec. There's a
lessening of the noise around 1000 Hz, and there's some extra stuff in the 2500-3000 Hz range. That's the only evidence
I can see that anything happened here. Not good.
Schwa, IPA 322

SIL [侷, Unicode [ə]
Short, and not particularly distinct. Looked like a schwa at the time, although as I write this I'm leaning towards barred
i. But whatever.
Fish-hook R + Under-Ring, IPA 124 + 402

SIL [R8], Unicode [ɾ̥]
Okay, there's a gap here. And I thought it was voiceless, when I did the segmentation. Teeniest indication of a gap. Looks
like a flap to me.
Lower-case H, IPA 146

SIL [h], Unicode [h]
Fricative, actually possibly voiced, which is supposedly typical of intervocalic /h/. Note the disorganized, fricated
energy, but organized in formants. The glottal/epiglottal noise is exciting all the resonances of the vocal tract, just like
voicing in a vowel. Voilà, [h].
Ash, IPA 325

SIL [Q], Unicode [æ]
Very low vowel (very high F1). Not too back, given the vaguely neutral F2 position.
Lower-case T, IPA 103

SIL [t], Unicode [t]
Gap. Slightly rising F2 (from neutral to just above neutral). Probably slightly rising F3. Classic alveolar transitions. Take
it and run.
Esh, IPA 134

SIL [S], Unicode [ʃ]
Again. Note the similarities. The flattish quality of this one is probably because there's nothing after it to add rounding
or rhoticity.
Robert Hagiwara, Ph.D.
Linguistics Department
University of Manitoba
Winnipeg, Manitoba
CANADA R3T 5V5
To properly view the phonetic symbols on this page, you must have one of the following fonts installed on your system.
One of SIL's IPA93 fonts (SILDoulos IPA93, SILSophia IPA93, or SILManuscript IPA93)
Gentium, an IPA-enabled unicode-compliant font by Victor Gaultney
or another IPA-enabled, Unicode-compliant font, such as SILDoulosUnicodeIPA
All are available freeware. Depending on which font(s) you have installed, one or the other symbol in paragraph
headings may not display correctly.
Solution for June 2003
"Prairie folk are hardy folk."
Lower-case P + Right Superscript H, IPA 101 + 404,

[pH], [pʰ]
For what it's worth, I thought I'd start with a nice little plosive for a change. It's always hard to spot an initial voiceless
plosive because, well, it's a gap. Nothing. Silence. So there we go. On the other hand, this one has a nice little burst, a
transient at the moment of release, followed by some nice aspiration. The burst is unfiltered, it's not really stronger in
any frequency range than any other, certainly not the higher frequencies like we might expect for a /t/ or the middle
frequencies (which for me in this context is about 1500 to 2500 or so) like we might expect for a /k/. So good bet this is
a /p/. Well, good bet it's bilabial. It's clearly voiceless and aspirated, so it's a good bet it's a /p/.
Turned R, IPA 151,

[¨], [ɹ]
F1 is in the mid-range, about 500 or so. F2 starts out just low of mid, but not so low as to be that interesting, but check
out that F3. Starts right about 1750. Which is 70% of 2500, which is (about) the neutral F3. Which is right where I said F3
of /r/ is in my dissertation.
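The arithmetic in that last step is simple enough to check directly. A minimal sketch, using only the two figures quoted above (an F3 onset of about 1750 Hz and a neutral F3 of about 2500 Hz); the function name is made up for illustration.

```python
NEUTRAL_F3_HZ = 2500.0  # approximate neutral F3 quoted in the text

def f3_relative_to_neutral(f3_hz, neutral_f3_hz=NEUTRAL_F3_HZ):
    """Express a measured F3 as a fraction of the speaker's neutral F3."""
    return f3_hz / neutral_f3_hz

ratio = f3_relative_to_neutral(1750.0)
print(f"{ratio:.0%}")  # 70%
```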
Epsilon + Rhoticity Sign, IPA 303 + 419,

[EÕ], [ɛ˞]
On the other hand, while the F3 stays nice and low, it isn't flat. It rises from the onset of voicing (about 225 msec) to a
peak about 50 msec later. At that point, the F1 is still resolutely mid, the F2 is resolutely central, and the F3 is still low.
So you might be tempted to label this an r-colo(u)red schwa (mid-central), but then you'd have to explain what the
difference between an r-colo(u)red schwa and a syllabic /r/ is. So we need another mid vowel, and I nominate front--
the F2 here isn't strictly front, but it's being shoved down by the F3--see how close the F2 and F3 are? They don't have
to be that close, they just are in this vowel. So I'm thinking this vowel is as front (in the sense of having a relatively high
F2) as it can be, considering the F2 has to be lower than the F3. Those of you who believe F3 of /r/ is the second
resonance of the vocal tract and the third resonance is wiped out by the side cavity (note the apparent zero in the 2000-
3000 range, which is pretty typical, although I think it's not so much a zero as a general depression of frequencies above
F3) need to explain to me where this F2 comes from.
Turned R, IPA 151,

[¨], [ɹ]
Well, here's another one. Get used to them. I really didn't intend this spectrogram to be so /r/ loaded, but this
utterance actually came up in conversation when I needed something for a spectrogram. Anyway, once again, here's
that low F3. Notice also that the amplitude here is a little lower, which to me says this is high-constriction, although the
acoustic modelling camp believes it's just a zero.
Lower-case I, IPA 301,

[i], [i]
Okay, I think this is an [i]. But acoustically, the F1 just isn't low enough to be high. At least it's lower than it was for the
preceding mid vowel, so maybe this should be small-cap I. But I don't think I get small cap I in this environment. But I
could be wrong. Anyway, the F2 is too high to be anything but in the high front area, so either the F1 is weird in this
environment or I'm just wrong about the F2.
Lower-case F, IPA 128,

[f], [f]
Well, this is pretty definitely voiceless--nothing at all going down where the voicing bar should be. Even though I
high-pass filtered out the noise at the bottom, if you compare the clearly voiced things, this clearly isn't. It's pretty clearly
fricative. Now to back up just a second, there are only five voiceless fricatives in English--the glottal, the two sibilants,
and the two dentals (labio- and inter-). This fricative is pretty strong, but for its duration, which is over 100 msec, it's
not as strong as a sibilant. If this were sibilant, it could only be the post-alveolar--the energy doesn't get stronger in the
high frequencies. But it also goes on a little too far into the low frequencies. Esh usually has a zero below the mid-F2 or
so, so there's rarely much energy below 2000 and almost none below 1500. And the energy here goes straight down
(with a bit of an interruption at 1000 Hz) down to 500. If this were glottal (/h/), it would show resonances excited by
noise. This one seems to, I can talk myself into that energy at 500 being the F1 continuing through, and the energy
above that seems to connect the F2 on either side. But the F3 range just doesn't look continuous at all, and in general
the energy is just too diffuse, or broad band, to be excited by resonances of the vocal tract. So this must be anterior,
either [f] or theta. Exactly which I'm not sure I could tell if I didn't know. So call this an anterior nonsibilant fricative
and move on.
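The process of elimination above can be written down as a small decision procedure. This is a toy sketch, not a classifier anyone should rely on: the four yes/no cues are judgments a reader makes by eye from the spectrogram, and the function and parameter names are invented for illustration. It only encodes the order of reasoning in the paragraph: rule out /h/ (noise shaped by vocal tract resonances), rule out the sibilants (strong noise, either rising with frequency or cutting out below about 1500 Hz), and whatever is left is anterior.

```python
def classify_voiceless_fricative(strong, rises_with_frequency,
                                 energy_below_1500_hz, formant_shaped):
    """Pick among the five English voiceless fricatives from four eyeballed cues.

    strong               -- noise is as high in amplitude as a sibilant
    rises_with_frequency -- noise gets stronger toward the high frequencies
    energy_below_1500_hz -- noticeable noise below about 1500 Hz
    formant_shaped       -- noise continues the formants of neighbouring vowels
    """
    if formant_shaped:
        return "h"            # glottal: noise exciting vocal tract resonances
    if strong and rises_with_frequency:
        return "s"            # alveolar sibilant
    if strong and not energy_below_1500_hz:
        return "ʃ"            # post-alveolar sibilant, zero in the low range
    return "f or θ"           # anterior nonsibilant; not separable by these cues

# The fricative discussed above: not sibilant-strong, not top-heavy,
# energy reaching down toward 500 Hz, and too diffuse to be formant-shaped.
print(classify_voiceless_fricative(strong=False, rises_with_frequency=False,
                                   energy_below_1500_hz=True, formant_shaped=False))
```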
Lower-case O, IPA 307,

[o], [o]
Okay, so look at that F1, right around 500, i.e. mid. Look at that F2, well below the midrange for F2. So this must be fairly
back and/or round. The F3 is right about where my neutral F3 is, maybe just a little low, which suggests a bit of
rounding, if anything. So what vowel is mid, and back and round? Pretty much only /o/ for me. I note with interest that
if you really want to believe this is a diphthong, you'd have to do some hand waving. Looks pretty flat to me.
Lower-case K, IPA 109,

[k], [k]
Well, there's not a lot of transition in the preceding vowel, and nothing after, so all we really have to identify what is
going on here is the burst at 800 msec. So let's look at that burst. It's definitely got some shape to it, in the sense that it
looks filtered, and so not labial. If you had to call it a fricative, you wouldn't call it /s/, too narrow band and too
concentrated in the middle frequencies. I'd be happier if this were clearly doubled, but it just doesn't look like a clear
double burst. But what energy there is in this burst is in the F2 range (and F4/F5), and the low F2 range at that. So that
makes it seem velar, if we've eliminated coronal. That's My Story And I'm Sticking To It.
Glottal Stop, IPA 113,

[?], [ʔ]
On the other hand, there's just too much gap between the release of the /k/ and the onset of voicing, so there must be
something else in here, from 825 to 950 msec or so. There's something odd about the first couple of glottal pulses, but
they're not so odd that I can convince myself they represent a burst. So what kind of plosive doesn't have a burst? Well,
glottal stop springs immediately to mind. Typically the first few pulses of voicing after a glottal stop are irregular (i.e.
creaky), but not always, particularly combined with relatively high pitch, which this is (look at the spacing of the
pulses). The highest pitch in this utterance seems to be the syllable/vowel between 1150 and 1300 msec, the lowest that
last vowel from 1500 to almost 1700 msec.
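Reading pitch off the spacing of the pulses is just taking the reciprocal of the period. A minimal sketch, assuming you have noted the times of successive glottal pulses in milliseconds; the pulse times in the example are invented for illustration and are not measurements from this spectrogram.

```python
def f0_from_pulses(pulse_times_ms):
    """Estimate pitch (Hz) from the times of successive glottal pulses (ms)."""
    if len(pulse_times_ms) < 2:
        raise ValueError("need at least two pulses to measure a period")
    # Period = gap between neighbouring pulses; average over the stretch.
    periods_ms = [later - earlier
                  for earlier, later in zip(pulse_times_ms, pulse_times_ms[1:])]
    mean_period_s = sum(periods_ms) / len(periods_ms) / 1000.0
    return 1.0 / mean_period_s

# Hypothetical pulse times: tightly spaced pulses mean higher pitch.
close_pulses = [1150.0, 1156.0, 1162.0, 1168.0]   # 6 msec apart
wide_pulses = [1500.0, 1510.0, 1520.0, 1530.0]    # 10 msec apart
print(round(f0_from_pulses(close_pulses)), "Hz vs", round(f0_from_pulses(wide_pulses)), "Hz")
```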
Script A + Rhoticity Sign, IPA 305 + 419,

[AÕ], [ɑ˞]
The F1, which would be hard to discern without the formant track, is quite high, indicating a fairly low vowel. The F2 is
very low, even lower, if anything, than the /o/. So this is back and quite low. The F3 is dropping over the course of this
vowel, which is the hallmark of r-colo(u)ring, at least if you're not surrounded by /r/s.
Turned R, IPA 151,

[¨], [ɹ]
So if the previous vowel is r-colo(u)red, there must be an /r/ here somewhere...
Lower-case H, IPA 146,

[h], [h]
Another fricative. This one looks voiced, which will throw you if you don't know that intervocalic /h/s are almost
always voiced. The formant track goes insane through this thing, but I think you can see what I mean by 'well-shaped'
resonances here, at least compared with the other fricatives. The noise here is clearly being supported by cavity
resonances, more or less exactly continuous (that's my story) with the surrounding vowels. Which is /h/ all over the
place.
Script A + Rhoticity Sign, IPA 305 + 419,

[AÕ], [ɑ˞]
This should look familiar, but backwards. I want to point out that the F2 moves here suggesting two targets. Even
though the F3 starts out in the extremely low, probably /r/, range, the fact that the F2 has one target at the beginning
of this vowel and another at the end tells us that there's two things here. The second one must be the /r/, and the first
one must have the low F2. And open-o isn't really an option in my dialect.
Turned R, IPA 151,

[¨], [ɹ]
Once again, there must be one here somewhere.
Fish-Hook R, IPA 124,

[R], [ɾ]
This looks voiced, and although there's some noise slushing around in it, I think we can agree this looks like a gap. But
really really really short. Which is how we describe the English flap/tap thing. So that's what this is.

[i], [i]
Another vowel. This one with a mid-to-high vowel's F1, and an extremely front looking F2.

[f], [f]
Again this should look familiar. I am concerned about the obvious F2-shaping of the noise, but at least it's consistent.
Whatever the first one was, this is the same thing.
Lower-case O, IPA 307,

[o], [o]
At this point, the patterns should be looking very familiar. And except for the pitch, this one looks pretty much the
same as the previous /o/.

[k], [k]
And once again, you couldn't ask for a better match.
This page last modified: 11/08/2009 22:57:16
I'm committed to keeping my recommendations to a) freeware fonts with b) decent looking
IPA compliant symbols. If anybody has any recommendations for freeware Unicode fonts with good looking IPA
support, I'll test them out and add them to the style sheet.
Solution for September 2003
"They like iced tea with lemon."
Eth, IPA 131,

[D], [ð]
Starting just before 100 msec, there's a very clear voicing bar. Too clear, actually, but whatever. It's weak, but there's a
trace of resonant information above. Not really enough to indicate full, open, resonating chambers, so this is some kind
of close approximant or fricative. The choice of fricative comes from the 25 msec of noise or so just before 200 msec. So
this is sort of affricated, but not really. Just bear in mind this pattern as the logical output of initial fortition of an initial
Eth. It's voiced, it's vaguely fricative (perhaps mostly stopped in its first phase, but clearly fricative in its release). It
looks coronal, judging from the F2 and F3 transitions (F2 'locus'ed between 1700 and 1800 Hz, F3 high), but not at all
sibilant. The noise we see is broad band and not particularly high in amplitude (compared with other fricatives in the
spectrogram). So coronal, but not sibilant, fricative, voiced.
Lower-case E + Small Capital I, IPA 302 + 319,

[eI], [eɪ]
This is pretty typical of my /e/s, a vowel that is quite high (and notice the flat F1 contour), starting quite front and just
getting fronter. So if you really want this to be a mid vowel, sorry. It ain't. See Hagiwara (1997) for some interesting
pictures...
Lower-case L + Mid Tilde, IPA 155 + 428,

[lō], [ɫ]
This isn't the darkest /l/ I've ever produced, but it definitely ain't light, clear, or plain, (whichever you pick it still ain't
it). So I've marked it as velarized. The relatively low (or at the very least not at all raised) F2 is the factor in darkness,
and, well, I've made my choice. The amplitude discontinuity (at about 300 msec going in and about 375 going out)
suggests a degree of closure, and there's just too much energy in the lower frequencies to make a good zero (which we'd
expect with a nasal). The raised F3 (if you wish really really hard, have lived an honest life, or are pure of heart) and the
vaguely raised F4 (if you can make it out) is/are good cues to the /l/ over the other approximants.
Lower-case A + Small Capital I, IPA 304 + 319,

[aI], [aɪ]
The F1 starts out quite high, 800 Hz or so, indicating a very low vowel. F2 seems to start out rather low (although on
further consideration, it's pretty neutral). So we're looking for something that starts out as a low vowel of, well,
non-back quality. I think that describes IPA [a] (Cardinal 4) pretty well, especially in English, where the other choice for a
non-back low vowel is Ash, which hopefully would look more front. And not as low. But whatever. After about 400 msec or
so the formants transition pretty clearly, F1 dropping in the last third of the vowel duration (indicating raising) and F2
raising into the very high frequencies (for F2) indicating extreme fronting. So this is a diphthong that moves from [a] to
high and front.
Lower-case K + Right Superscript H, IPA 109 + 404,

[kH], [kʰ]
The other thing worth noting about the preceding vowel is that the F3 starts to drop towards the end, as the F2 raises.
In fact the F2 is heading up to a frequency that is well into the [i] range, as far as frontness goes, but it's heading up to
meet the F3, which is still slightly raised presumably from the preceding /l/ or even the off-glide before that. Anyway,
F2 and F3 coming together like that is pretty classic 'velar pinch', which is mirrored (in reverse) in the shaping of the
aspiration noise on this segment's release (550-600 msec or so). So this must be a velar. And pretty clearly voiceless and
strongly released, which I transcribe as aspiration, even though this isn't classic VOT-style aspiration.
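The 'velar pinch' described here is simply the gap between F3 and F2 shrinking as the closure approaches. A minimal sketch of checking for that, assuming you have formant values sampled at a few successive time points; the track values below are invented for illustration and are not measurements from this spectrogram, and the function names are hypothetical.

```python
def f3_f2_gaps(f2_track_hz, f3_track_hz):
    """Return the F3-minus-F2 gap (Hz) at each sampled time point."""
    return [f3 - f2 for f2, f3 in zip(f2_track_hz, f3_track_hz)]

def is_pinching(f2_track_hz, f3_track_hz):
    """True if the F3-F2 gap narrows at every step toward the closure."""
    gaps = f3_f2_gaps(f2_track_hz, f3_track_hz)
    return all(later < earlier for earlier, later in zip(gaps, gaps[1:]))

# Hypothetical tracks heading into a velar closure: F2 rising, F3 falling.
f2_track = [1800.0, 2000.0, 2200.0, 2350.0]
f3_track = [2700.0, 2650.0, 2550.0, 2450.0]
print(f3_f2_gaps(f2_track, f3_track))   # [900.0, 650.0, 350.0, 100.0]
print(is_pinching(f2_track, f3_track))  # True
```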

[?], [ʔ]
There's about three pulsey looking things in the trailing end of the preceding plosive release. So this might be a teeny
bit of voicing and I should have marked a tiny vowel here. But I didn't, because these pulsey thingies are just my vocal
folds flapping around as I close them. I'm not sure why I wanted a glottal stop here, since this isn't really a very strong
prosodic break, but there it is. There's also some irregularity in the voicing of the following vowel, which I interpret as
further evidence of glottality here.
Lower-case A + Small Capital I, IPA 304 + 319,

[aI], [aɪ]
Another one of these diphthongs, this one with less steady state and a much clearer F1 transition, but otherwise pretty
similar.
Lower-case S, IPA 132,

[s], [s]
Very broad-band noise. This segment is a little ambiguous, since it's not obvious where this noise is centered, i.e. it
doesn't obviously get stronger in the higher frequencies, nor does it obviously have a lot of energy in the F2-F3 region.
These two are the most positive cues to either /s/ness or Eshness (I don't really feel like switching fonts in the middle
of a sentence, so work with me here) respectively. It doesn't help that the energy drops off suddenly below 1500 Hz,
which is usually a good cue for Eshness (Esh seems to almost always have a zero in the low frequencies). So there you
go. If you guessed Esh here, I can't fault you. Although it's worth pointing out that the energy drop off is similar in
profile (if not in absolute distribution) to the following plosive release/aspiration, which is pretty clearly /s/-like. Then
again, it's not so far off from a fricative coming up later that will confuse everyone.
Lower-case T + Right Superscript H, IPA 103 + 404,

[tH], [tʰ]
The closure portion of this thing lasts about 60 msec, which is pretty long for one of my stops. This may be evidence of
gemination, but I'm not convinced I say 'iced tea' and not 'ice tea'. I definitely tried to say 'iced tea', but I'm not sure I
did. Anyway, the F2 transitions, if there are any, are consistent with an alveolar articulation, and not obviously
indicative of velar or bilabial articulations. That's my story and I'm sticking to it. The real give away is the high-
frequency center of this noise, more obviously [s]-like than anything else.
[i], [i]
Well, you couldn't ask for a better [i], unless it had a narrower-band F1. It's really broad-band, but I think the center of
the F1 is about 300 or 350 Hz or so, which is pretty low, I guess. The F2 is freakishly high, up into the neutral range for
F3. So if you don't regard this as hyperarticulated (and probably stressed), you've missed something. Not to mention
how long this thing is. Egad. Anyway low F1 (high vowel), high F2 (front vowel).
Lower-case W, IPA 170,

[w], [w]
Woof. I wouldn't want to be riding this F2 through this spectrogram. Check out that drop from about 1300 to 1350 msec.
So the F1 stays dead flat, indicating high or close articulation that doesn't vary a whole lot. The F2 goes from absurdly
high to absurdly low (relatively speaking) indicating extreme backness and rounding (probably both). The F3 doesn't
really move much out of neutral (once the F2 gets out of the way), so this can only be [w].
Schwa, IPA 322,

[Ŧ], [ə]
Well, this vowel is quite short and it's in a syllable next to something very clearly stressed. So chances are good that this
vowel is unstressed and therefore reduced, and therefore doesn't carry a lot of information. So I call it schwa and move
on.
Theta, IPA 130,

[T], [θ]
Fricative. That part is clear. Broad-band and top-heavy so this looks sort of /s/ like, though it doesn't have nearly the
amplitude that the other [s]s in this spectrogram have. So if you think this is an [s], I guess I won't argue with you.
There is, however, this little matter of the F2 transition into it, which just doesn't look alveolar (although depending on
what you think is going on on the other side, this could be transitional to that). And then there's the teeny little gap at
about 1550 msec, followed by a fairly clear release burst that doesn't really look alveolar either. So we've got something
that doesn't really look alveolar, but it doesn't really look like anything else. Sorry, that's the best I can do, but I could
go on for a while about how an interdental could look like an alveolar that isn't an alveolar. But I won't.
Lower-case L + Mid Tilde, IPA 155 + 428,

[lō], [ɫ]
Essentially, this looks the same as the last one, so there's not much point in repeating myself. This one looks more like a
nasal, with its abrupt edges and such, but the zero (if that's what that is between F2 and the very raised F3) is in the
wrong place. Nasals almost always have a zero between the F1/voicing bar and the F2/nasal pole.
Epsilon, IPA 303,

[E], [ɛ]
Well, this is about as low an Epsilon as I have ever seen. It's even lower (higher F1) than the [a] in the first diphthong.
The F2 is neutral, and the F3 tells us nothing. So I suppose this looks like an Ash, but it's sort of short and flat to be an
Ash. Which leaves Epsilon as the closest plausible vowel. Maybe I'm starting to do Canadian Shift/Lax Vowel Lowering.
Lower-case M, IPA 114,

[m], [m]
See, now *this* is a nasal. Voicing bar or F1, about 300 Hz, Zero, Pole/F2 thing about 1200 Hz or so. The pole is a little
high for my [m], but not high enough for my [n]. But the F2 and F3 (and F4) of the surrounding vowels all point
downward, which is pretty good evidence of bilabial articulation.
Schwa, IPA 322,

[Ŧ], [ə]
Schwa. Not sufficiently different from the previous vowel to be worth arguing about, so I won't.
Lower-case N, IPA 116,

[n], [n]
Another nasal, this one is complicated by the fact that my amplitude goes down at the end of the utterance anyway. But
notice the voicing bar, the zero(es) and the evidence (if you wish hard enough) of a resonance at about 1500 Hz, which
is about right for my [n]. Note that wherever you think this resonance is, it's different from the preceding nasal, so if
there's a choice to be made, they're different.
This page last modified: 11/08/2009 22:57:18

Solution for July 2003
"It left a greasy stain."

[?], [ʔ]
It's always tough to do these initial things. There's definitely some kind of transient thing at just about 100 msec, and to
account for that I need some amazing change of state to have occurred right there. So I vote for a sudden glottal
closure in preparation for the upcoming voicing. For those of you who care, this is not a 'hard glottal attack'. It's a
normal speech-ready gesture. There's nothing harsh or creaky about the onset of voicing here, except for the transient.
Small Capital I, IPA 319,

[I], [ɪ]
I've decided that the strong movement here is a combination of just stuff that happens to lax vowels, the low
information load (and therefore trend toward reduction) of this morpheme, and the fact that it's just a very high F2
interacting with a following consonant. Even if you split the difference in the F2 movement, you've still got something
well above 1800 Hz, and I think there's reason to suppose that almost-2000 Hz bit at just about 150-175 msec or so is the
F2 extremum here. So there's the very front bit. The F1 is a little low (although it doesn't move much at all through
this spectrogram), so this is something that's basically high. Front, high, and slightly reduced. There you go.
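The reasoning above (low F1 means high, F2 well above neutral means front) can be sketched as a rough rule of thumb. The neutral values of about 500 Hz for F1 and 1500 Hz for F2, the margins, the function name, and the example frequencies are all illustrative assumptions, not measurements from this spectrogram.

```python
# Rough sketch: describe a vowel from F1 and F2 relative to "neutral" values.
# NEUTRAL_F1, NEUTRAL_F2 and both margins are illustrative assumptions.
NEUTRAL_F1 = 500.0   # Hz, assumed schwa-like F1
NEUTRAL_F2 = 1500.0  # Hz, assumed schwa-like F2


def describe_vowel(f1_hz, f2_hz, f1_margin=100.0, f2_margin=300.0):
    """Return (height, backness) labels; F1 is inversely related to height."""
    if f1_hz < NEUTRAL_F1 - f1_margin:
        height = "high"
    elif f1_hz > NEUTRAL_F1 + f1_margin:
        height = "low"
    else:
        height = "mid"

    if f2_hz > NEUTRAL_F2 + f2_margin:
        backness = "front"
    elif f2_hz < NEUTRAL_F2 - f2_margin:
        backness = "back"
    else:
        backness = "central"
    return height, backness


# A lowish F1 (350 Hz is a made-up value) with F2 near 2000 Hz reads as high front.
print(describe_vowel(350, 1950))
```

Hard thresholds like these are only a starting point; coarticulation and reduction, as in this vowel, are the reason the transitions have to be factored out by eye first.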
Lower-case T, IPA 103,

[t], [t]
Well, I guess I goofed and transcribed this as a /t/, although it's pretty obviously voiced. I think this was me trying very
hard not to glottalize this stop. Sorry. As it is, I think I did, in that there's a closure transient at about 225 msec, where
I've marked the segment boundary. This could be glottal (it looks a lot like the initial transient I took to be glottal), or it
could be alveolar. I don't know. The clue that this is alveolar tho is the F2 and F3 transitions. Even though the F2 is
falling quite sharply, it's not really falling below 1600 or 1700 Hz, and my alveolar transitions usually end up at about
1750. (It may also be that it's being drawn down in coarticulatory preparation for something further on.) But the F3
isn't coming down at all--if anything, it's rising, just a little. So even though the F2 is falling (from a very high F2), this is
still consistent with an alveolar plosive.
Lower-case L + Mid Tilde, IPA 155 + 428 (209 composed),

[lō], [l ̴] ([ɫ] composed)
So starting from this nice transient at 275 msec or so, there's a stretch of about 50 msec of clear voicing (and even
resonance) of lesser amplitude than the following vowel. So something is definitely going on here. The low F1 suggests
some kind of close articulation, although don't quote me on that. The F2 is a little low, and the F3, well, this is scary. I
think the F3 looks like it starts at about 2000 Hz (in the transient), rises to about 2300 Hz in the first bit of this segment,
and then goes up to 2600 or 2700 (continuous with the F3 in the following vowel). Given where the F2 is, this just isn't low
enough for /r/, particularly for my voice. So I regard this F3 as spurious. If instead we look at that last bit, the part
continuous with the F3 in the following vowel, the F3 actually looks a little high here--it's high in the first vowel (and
not obviously 'pushed out of the way' of the F2), and stays high here. It takes a dive later but settles in the last third or
so of the spectrogram into a frequency a little lower than this. So taking the F3 average over the course of the
spectrogram, this is a high F3. And maybe the F5 starts up there near 4000 Hz and falls. The raised F3 is characteristic
of /l/ in English, and the low F2 of backing (velarization, or darkness). That's my story and I'm sticking to it.
Epsilon, IPA 303,

[E], [ɛ]
Finally, something that's sensible. Sort of. The F1 here is pretty high, certainly higher than every other F1 in this
spectrogram, so this is fairly low. The F2 is just above 1500 Hz--while that seems low (for front vowels) it's not low
enough to be at all 'back'. So we're talking central at best, if not a little front. So for my dialect, that suggests Ash,
Epsilon, or some variant of IPA /a/ (Cardinal 4), such as you might get with a very fronted Turned-V, as I showed in my
dissertation. The high F3 here bugs me a lot, but this is something moderately low, front-to-central. That's all I really
know. It's not really long enough to be a good Ash, but stranger things have happened. This is one of those cases where
you have to leave a few choices up, and let the top-down information of trying to fit words and phrases into a sentence
take over. So come back to this later to decide on Epsilon.

[f], [f]
Well, this is voiceless (well, pretty voiceless, compared to obviously voiced things), and definitely fricative. Comparing
the frication noise here to other obvious fricatives later on in the spectrogram I'd say this was pretty weak, i.e. not
really strong enough to be a sibilant. Which is lucky, because it looks sort of like an Esh, especially with that apparent
zero or whatever it is around 1000 Hz. This isn't really a place where you'd expect an /h/ in English (between a vowel
and a plosive), and there's no evidence of vocal tract filtering in the noise. So this is pretty front, either labiodental or
(inter)dental. But there's not a lot of information here to distinguish the two, unless you really want to believe the F3 is
lowering and based on that you really, really, want to believe the F1 and F2 are too, which might suggest labial over
dental. But I wouldn't place any bets.

[t], [t]
Plosive! Thank heavens, something certain in an uncertain world. Voiceless too. And almost definitely alveolar. First the
positive evidence--the release and VOT phase of this plosive has a high-frequency tilt to it--the amplitudes go up as you
go up in frequency. Like an /s/, only as the release of a plosive. Make a leap of reasoning.
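The "high-frequency tilt" idea can be made concrete with a small sketch: fit a straight line to amplitude against frequency and look at the sign of the slope. The function name and the burst values below are invented for illustration; they are not read off this spectrogram.

```python
# Sketch: estimate spectral tilt of a release burst as the least-squares
# slope of amplitude (dB) against frequency (kHz). Sample values are invented.

def spectral_tilt(points):
    """points: list of (frequency_hz, amplitude_db). Returns dB per kHz."""
    n = len(points)
    xs = [f / 1000.0 for f, _ in points]
    ys = [a for _, a in points]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical burst spectrum: amplitude rising with frequency, /s/-style.
burst = [(1000, 20.0), (2000, 26.0), (3000, 31.0), (4000, 38.0)]
tilt = spectral_tilt(burst)
print("high-frequency tilt" if tilt > 0 else "flat or falling")
```

A positive slope is what the alveolar release is described as having; a roughly flat slope would look more like the broad, even release of a bilabial.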
Barred I, IPA 317,

[ö], [ɨ]
Lower-case G, IPA 110,
[g], [g]
Last time I checked, the IPA had allowed the type-face "g" as well as the print "g", and as far as I can tell Unicode has
taken them up on their word. Anyway, the transition into this plosive is pretty classically velar--F2 and F3 coming
together. They're aiming at a very high frequency, which is a clue that this is a front velar, which is a clue that there's a
front vowel coming up shortly, but there you go. The transitions out are obscured by something else, so they're not
helpful. Apparently voiced, although the voicing dies off and once the stop is released there's some VOT. Do not be
confused by the leaky noise between 1500 and 2000 Hz, which is just how my velars are.
Turned R, IPA 151,

[Ļ], [ɹ]
There's this short VOT we mentioned above, during which F1 appears to be doing nothing in particular, F2 is sort of low
(I take that stuff between 1000-1500 Hz to be the voiceless portion of the sharply rising F2 in the following vowel). Now
look at that F3 in the vowel. It's pretty straight, so if we follow it back down, it crosses the release burst of the /g/ at
about 1900 Hz. (Actually, if we follow the curving F3 path in the preceding velar pinch, we also get a release-crossing at
about 1900 Hz, which I'll take as confirmation, but if it didn't I'd just ignore it....) Anyway, if you're one of those who
need F3 to be below 2000 Hz to count as an /r/, this one does. But it's a pretty near thing--it might not really be that
low. So the lesson is, low is low. The F3 here clearly starts lower than can easily be accounted for by anything else (lip
rounding, and so forth), so posit the /r/ here and get on with things.

[i], [i]
Ah, transitions. The F3 must be ignored here--on the left it's pulled way down by the /r/, and on the right it's hitting a
neutral frequency, so this is just transition from extremum back to neutral. F2 again is pulled (or pushed) way down by
the /r/, but it's rising way, way beyond neutral (about 1500 Hz or so), up to the absurdly high frequency of 2200 or 2300
Hz. F2 just doesn't get that high except for /i/, or possibly /e/. F1 is consistent with either, actually, so we'll need more
information to decide. Luckily, there's stuff coming up later that will narrow things down.

[s], [s]
Ah, sibilants! The amplitude! The frequency! /s/ is easy to spot. See the noise? See the broad band (basically, the noise
here is full-spectrum). See how the strongest frequencies in the noise are probably up off the top of the visual range
here, well above 4000 Hz? Classic /s/. In fact, for those of you who are new to this, we often suggest trying to spot the
/s/s first, just because they're so easy. So if you haven't yet, can you spot another /s/ later in the spectrogram?

[i], [i]
F1 hasn't changed much from last time. Neither has F2 or F3 for that matter. So whatever the last one was, this one
probably is (one) too.

[s], [s]
Look familiar? By the way, it's not amazingly often you see formant structure in /s/s, because the noise is usually so
great it just swamps anything else going on. But here the amplitude is relatively low (for an /s/), and we can see what
looks like an F2. Well, this is an F2, in the sense of being the front cavity resonance. I think.

[t], [t]
Ah, gaps. Okay, I'll stop. But this is a nice looking gap. Not a lot of useful transitional information here, i.e. nothing that
obviously says velar or bilabial, so we might choose alveolar by default. Once again, it's worth looking at the release
information, which is /s/-like and further suggests an alveolar release.
Lower-case E, IPA 302, and Small Capital I + Tilde, IPA 319 + 424
[eI)], [eɪ]̃
I love final lengthening--you can see stuff that you don't ordinarily see. It helps that this is partially nasalized, but
that's getting ahead of things. Note the F1 here is just a hair higher than it was in the preceding two or three vowels,
but nowhere near as high as for the lowish front thing in the second syllable. So this is definitely our mid vowel. For me
/e/ is rarely mid, but whatever. The F2 is quite high, indicating frontness, so we're going for something that starts in the
classic American-style /e/ range. So if this is definitely mid, the others must be high. But then this vowel moves. At
about 1350 msec, the F1 drops sharply--part of the sharpness is probably just the wide-band FFT interacting with the
changing harmonics (which you can kind of see between 1250 and 1350 msec just above F1), and probably there's a
change of state going on here. Actually two things are going on here--the change in tongue-body position, and the
introduction of nasality. You can see the nasality in the 'fuzzing' of the formant edges (i.e. the peaks in the filter are
flattening out a little), and the apparent zero creeping in about 1500 Hz. That might just be the distance between the F1
and F2 (I can convince myself this same zero is creeping into the previous two vowels), but this one seems like it's higher
in frequency, in spite of the slightly lower F2. So I think something else is going on here. So anyway, the F2 here is high,
but not as high as the things earlier we have now decided are /i/. This thing has a low F1 and a high F2, and is
connected to an /e/. So I transcribed it as a nasalized small-cap I.

[n], [n]
So just about 1500 msec, the amplitude in the vowel drops abruptly, accompanied by a sharp transition in the F2. So this
is a closure of some kind. But look at that voicing. In spite of the end-of-utterance low pitch/low airflow/low
everything, the voicing goes on and on. So this must be a sonorant, i.e. open enough to prevent any pressure from
building up above the glottis. The zero(es) here and in the preceding vowel suggest nasality (consistent with nasalizing vowels
before coda nasals and all that). What you can see of the F2 (toward the beginning of the nasal) is at about 1500 Hz,
which for my voice is almost definitely alveolar--there are no velar transitions to suggest anything in that direction and
the F2 would be lower (around 1100 or 1200 Hz) for my /m/.
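The place decision here boils down to "which of my usual nasal resonances is this closest to?" A minimal sketch of that comparison follows. The reference values are the ones quoted in the text for this one voice (about 1500 Hz for [n], 1100-1200 Hz for /m/, taken here as 1150 Hz); no figure is given for the velar, so it is left out. The dictionary and function names are hypothetical.

```python
# Sketch: pick the nasal place whose typical pole frequency is nearest.
# Reference values are speaker-specific figures quoted in the text;
# 1150 Hz is an assumed midpoint of the stated 1100-1200 Hz range.
NASAL_POLES_HZ = {"alveolar [n]": 1500.0, "bilabial [m]": 1150.0}


def guess_nasal_place(pole_hz):
    """Return the label of the reference pole closest to pole_hz."""
    return min(NASAL_POLES_HZ, key=lambda place: abs(NASAL_POLES_HZ[place] - pole_hz))


print(guess_nasal_place(1480))
```

The numbers would not carry over to another speaker, and the transitions into the nasal still need checking for velar pinch before trusting the result.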

Solution for April 2004
"They need to fix that pothole."
Eth + Raising Sign

[ð]̝ , IPA 131 + 429
Starting from about 100 msec for about 75 msec there's some fairly serious voicing. And it doesn't look like stop voicing,
but there's no real evidence of anything happening higher up until at least halfway through, and then only up around where
F4 should be (up above 3500 Hz or so). But there's a little bit of noise up there, so this may be a voiced fricative.
Variations on Eth are often a good guess in this kind of case, just because it's English. (Work out the reasoning for
yourself).
Lower-case E + Small Capital I

[eɪ], IPA 302 + 319
Diphthong! Ends with a very front vowel, so we're looking at something fronting. The F2 is way too high to be [a] or
anything like a standard diphthong nucleus, so we're looking at something diphthongized. The F1 is low of neutral, so
this is a fairly high vowel to begin with (even if it does seem to get higher after 250 msec or so), and the F2 is quite high,
so we're looking at something front. So sorta high, quite front, and diphthongized (to the front). Knowing this is
English, this is probably /e/. Since 'they' is a pronoun, and subject to a lot of reduction (due to frequency or whatever),
I'm agnostic about whether there's an underlying glide at the end of this word. I'm inclined to believe there is, because
frankly my /e/ F2 isn't usually that low. But we'd need to check this.
Lower-case N
[n], IPA 116
Well, that sharply falling F2 transition might suggest labial, but a) it starts out so high anyway, b) F3 doesn't seem to
be doing anything, and c) it doesn't really fall that far. Taken individually, a) where is it supposed to go from that high,
b) if it's labiality that is affecting the F2, it ought to be affecting F3 as well, and c) the transitions stay well above 1500 Hz
or so, and the 'locus' of alveolar transitions, if you believe in such a thing, is about 1700-1800 Hz more or less depending
on the voice and who you read. So the transitions here are really suggesting an alveolar. The resonances (and zeroes)
suggest a nasal, and the alveolarness (for my voice) is further confirmed by that F2 thing just a hair below 1500 Hz.
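The locus argument can be sketched as a simple check: a falling F2 is not automatically labial if it starts above the alveolar locus and stops near it. The 1750 Hz default is the rough alveolar figure from the text; the tolerance, the function name, and the example frequencies are assumptions for illustration.

```python
# Sketch: is an F2 transition consistent with heading toward a consonant locus?
# The 1750 Hz locus is the rough alveolar figure from the text; the tolerance
# and the example frequencies are assumptions.

def heads_toward_locus(f2_vowel_hz, f2_edge_hz, locus_hz=1750.0, tolerance_hz=250.0):
    """True if F2 at the consonant edge is no farther from the locus than
    F2 in the vowel, and lands within the tolerance of the locus."""
    moved_closer = abs(f2_edge_hz - locus_hz) <= abs(f2_vowel_hz - locus_hz)
    near_locus = abs(f2_edge_hz - locus_hz) <= tolerance_hz
    return moved_closer and near_locus


# A high front vowel (F2 around 2200 Hz) falling to about 1600 Hz at the edge.
print(heads_toward_locus(2200, 1600))   # falls, but stops near the locus
print(heads_toward_locus(2200, 1100))   # falls well past it, more labial-looking
```

This only covers F2; as point b) says, a labial reading would also want F3 to be moving.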
Lower-case I
[i], IPA 301
Transitions aside (notice that the F2 looks alveolar on either side, although the F3 transition out of this vowel is a little
ambiguous), this has the same low F1 and extremely high F2 of (in fact higher than) the previous offglide. So there you
go. Must be [i].
Lower-case D
[d], IPA 104
Well, this gap, starting about 475 msec and probably continuing to about 575 msec is kind of long, so it might be two
things. So looking just at the left half (or so), it has alveolar transitions in (although the F3 looks like it's coming down,
which might lead you to suppose this is bilabial--but the lowering in F3 doesn't look 'transitional', so much as it looks
like it's 'modifying' from a higher general position). The first 50 msec or so of the gap is clearly voiced, so this is
probably [d].
Lower-case T
[t], IPA 103
On the other hand, the second half of this gap is probably voiceless. It's got a very strong, sibilant ([s]-shaped) release,
which again suggests alveolar (which is consistent with the transitions out). In spite of the apparent quick onset of
voicing, it is incredibly aspirated. There seems to be some noise going on for almost 200 msec, which is just impossible. So I
think this is actually pretty heavily aspirated. But since we usually define aspiration as VOT, rather than noise, I didn't
transcribe it that way. I might change my mind about this next time.
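Since the point turns on defining aspiration by VOT rather than by noise, here is a minimal sketch of that definition: VOT is just the lag from the release to the onset of voicing. The 30 msec threshold and the example times are assumptions for illustration, not values taken from this spectrogram.

```python
# Sketch: VOT as the lag between stop release and voicing onset.
# The 30 ms aspiration threshold and the example times are assumptions.

def vot_ms(release_ms, voicing_onset_ms):
    """Positive values mean voicing starts after the release."""
    return voicing_onset_ms - release_ms


def looks_aspirated(release_ms, voicing_onset_ms, threshold_ms=30.0):
    """Call a stop aspirated when its VOT reaches the (assumed) threshold."""
    return vot_ms(release_ms, voicing_onset_ms) >= threshold_ms


print(vot_ms(600, 615), looks_aspirated(600, 615))   # short lag
print(vot_ms(600, 660), looks_aspirated(600, 660))   # long lag
```

By this measure a stop with a quick voicing onset counts as unaspirated no matter how long the noise carries on afterwards, which is exactly the tension described above.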
Schwa
[ə], IPA 322
Well, there's almost 100 msec of voicing in here, although the F2 is still fairly front and moves to neutral (or just a little
lower), so following Keating et al (1994) as I usually do I probably should have transcribed this as a barred-i. But it
doesn't make a lot of difference. Note the undifferentiated noise above 2000 Hz that just goes on and on.
Lower-case F
[f], IPA 128
More noise, but this time it's voiceless. My first guess at this would be /h/, since there seems to be some resonance--F3
goes straight through, and you can see F2 moving from its low at about 700 msec up to where it is when the voicing
kicks on at about 775 msecs. But there's no F1. Which is possible, but atypical of /h/. Hmm. The noise is
undifferentiated, once you get past the frequencies below 1000 Hz, so there's very little in the way of high-frequency
filtering going on, in spite of the apparent resonance. Hmm. Then there's that F2. Why is the F2 falling to that point
around 700 msec, and then suddenly rising again after? There must be *something* there that is a target causing that. It
can't be alveolar, because an alveolar fricative of any kind would be more [s] shaped. I suppose it could be postalveolar,
given the absence of low-frequency energy, but it doesn't look [ʃ]-like, really. It just isn't loud enough, for one thing. So,
getting back to the F2, we're looking for something with a low F2 target. And lower than an alveolar target. Frankly,
that's as far as I can get. I'd ask you to *consider* [f], and move on.
Small Capital I
[ɪ], IPA 319
Well, at least a typical vowel. Okay. F1 is low, but not incredibly low, so we're looking at a higher than mid vowel, but
probably not just plain high. The F2 is moving, but it starts high and travels slightly higher. It never gets anywhere near
the range of the [i] vowel or the front offglide we've already seen. But it's definitely front. So this could be [e] or [ɪ], and
it just ain't long enough to be [e]. Now look at those F2 and F3 transitions and move on to the next sound.
Lower-case K
[k], IPA 109
Ya gotta love velar pinch. And even though it doesn't really look like a stop (I think the reason my velars never look
stoppy may be my overlarge uvula--don't get me started), it's got to be velar, which doesn't leave many choices. Pretty
clearly voiceless. Burst is a little low in frequency, if that's what that is just shy of 900 msec, but whatever.
Lower-case S
[s], IPA 132
Okay, what was that I was saying about 'not being loud enough'? Obviously I was mistaken. Here we've got broad band,
relatively (I guess) high amplitude noise, so this is probably a fricative. And voiceless, of course. I'd guess [ʃ] due to the
apparent low-frequency zero, but the energy doesn't seem concentrated in the visible/present low frequencies the way
I expect [ʃ] to look. So if it's sibilant, it must be [s]. Which is consistent with the distribution of energy, but I'd be
happier if the higher frequencies (above 4000 Hz) were clearly higher in amplitude than the lower frequencies. I mean, I
think they are, but it's arguable.
Eth
[ð], IPA 131
Well, there's something here. It looks slightly gappy, but there's some noise in the higher frequencies, and there's
some...something at 1500 Hz and something weirdly burst-transient like just after 1000 msec. I don't know what to
make of this, except if it's fricative, it ain't sibilant, and it doesn't really look like anything else. I mean, a post-fricative
plosive ought to have more 'plosion' to it, just because of the airflow. I guess. So bear in mind this is English, and play
the odds.
Ash
[æ], IPA 325
Well, a nice high F1, indicating a rather low vowel (though I think the vowel coming up is lower), but with a mostly
neutral-looking F2. So it's not amazingly front, but then lowish front vowels aren't.
Superscript Glottal Stop + Lower-case T

[ʔt], IPA 113 (superscripted) + 103
Well, technically, the last bit of the preceding vowel is creaky voiced, but I think that's really a manifestation of
final-stop glottalization. I suppose it could be a low boundary tone of some kind, but then I'd expect it to be less creaky and
jitter-y. Anyway, if this is glottalization of the stop rather than creakiness of the vowel, I've decided to transcribe this as
pre-glottalization. For which there technically is no IPA character, so I improvised. The F1 transitions in the previous
vowel indicate approaching closure, and the F2 looks like it's falling. Which would suggest bilabial, except that the F3 is
just sitting there. I don't know why that is. It might be that labialization really does only affect F2, or that it's stronger
on F2 than F3, if there's something else affecting F3. Or maybe it's just a fluke. I think I hear a release, but I can't see it on
the spectrogram now. Maybe the alveolar closure is a figment of my imagination, or wishful thinking, cuz there ain't
much evidence for it here.
Lower-case P + Right Superscript H

[pʰ], IPA 101 + 404
On the other hand, there's definitely a gap somewhere in here, and it's got a very broad-band release. Note that it's
pretty even in amplitude across the frequency range, and has none of the [s]-shaped frication that the release around
600 msec had. So no double burst, apparently down-pointing out-transition in at least F3 and possibly F2, depending on
how you read that aspiration, so I'd guess bilabial. There's close to 50 msecs before the voicing kicks on (from a hair
before 1275 to a hair after 1300 msecs at the voicing bar, and a bit later for the energy in F2, which is usually what I use
to mark the beginning of the vowel). So this is aspirated [p].
Script A
[ɑ], IPA 305
Ah, high F1, and an F2 as low as it can get. My kind of [ɑ].
Fish-hook R
[ɾ], IPA 124
Well, there's something there. It's very short, and kind of noisy, but there's something that has a vaguely [s]-shaped
release phase as it approaches 1500 msec. Anything that short can only be some kind of flap, so there you go.
Lower-case H
[h], IPA 144
Well, it's noisy, but it's organized into bands like a vowel. Like a voiceless vowel. Like an [h].
Lower-case O
[o], IPA 307
Well, the F1 is not high. It's a little high of neutral, but it's pretty mid-looking. The F2 is nice and low. So this is middish
and roundish, and I have a western US voice so I really only have one vowel back there this could be. Not a heckuva lot of
diphthongization either, although that might be the backing environment of the following dark [l].
Lower-case L + Mid Tilde

[l ̴] , IPA 155 + 428 (209 composed, Unicode character precomposed as [ɫ])
Well, there's change in the amplitude somewhere around 1700 msecs. The F1 is, well, wherever it is. The F2 is low,
indicating either backness or rounding (or both, as in the preceding [o]), and that F3 is, well, sort of high. So since this
isn't likely to be [w] (the F1 and F2 would both presumably lower as the offglide in a word like 'hoe'), I'll take the
slightly high F3 as an indicator of the lateral. Okay, technically, the IPA regards this as an [l], with velarization
(darkness), which is marked by the Mid Tilde diacritic. It even uses the dark-l symbol as the example for using the Mid
Tilde diacritic. But for some bizarre reason, the IPA assigned the composed dark-l symbol a number (209), instead of
leaving it as a combination of two independent symbols. And the Unicode standards people assigned a Unicode number
to every numbered IPA symbol. So there you go. In my systems, both on-screen and printed, the precomposed symbol
always looks better than the composed one, but it doesn't seem like it should.
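The composed/precomposed distinction is easy to poke at with Python's standard `unicodedata` module. This is just a sketch for inspection: it shows that the two spellings are different code point sequences, and that the precomposed dark-l has no canonical decomposition, so normalization does not turn one into the other.

```python
import unicodedata

composed = "l\u0334"      # lower-case l + COMBINING TILDE OVERLAY
precomposed = "\u026b"    # LATIN SMALL LETTER L WITH MIDDLE TILDE

# List the code points behind each spelling.
for label, text in (("composed", composed), ("precomposed", precomposed)):
    names = [unicodedata.name(ch) for ch in text]
    print(label, len(text), names)

# U+026B has no canonical decomposition, so normalization does not
# convert one spelling into the other.
print(unicodedata.decomposition(precomposed) == "")
print(unicodedata.normalize("NFC", composed) == precomposed)
```

That may be part of why the two look different: the precomposed symbol is a single glyph drawn by the font designer, while the composed one depends on the font positioning an overlay mark on top of a plain l.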
Whew. Another hard one.


Solution for November 2004
Well, nobody mentioned that I didn't change the header last month to read "Solution for October 2004", but I assume
people have been paying attention (Sara, Grant... ;-). Thanks for the feedback. Now on to the spectrogram.
"Several awards were given out."
Lower-Case S
[s], IPA 132
Well, this is sort of weak, but you can see the fricative, starting at about 100 msec and going on to almost 250 msec.
It's broad band (rather than organized into narrower formant-like organization), and concentrated in the very high
frequencies. Toward the beginning, where the overall amplitude is much less, the frequencies we can see are very high.
So this is a pretty typical sibilant, almost definitely [s].
Epsilon
[ɛ], IPA 303
So the vowel starts just before 250 msec and goes on for about 100 msec. The F1 is a little low of mid, suggesting a
slightly higher mid vowel, which is weird for me if this is [ɛ]. But anyway, this vowel looks quite central, and so this
looks very schwa-like. But judging from the amplitude it must be stressed, and if it's stressed, this can't be my [ʌ],
which is typically low. So this is probably mid or high, and otherwise non-descript. Oh well. The falling formants are
clearly transitional, since they mostly all do it, so they don't help. Not long and not tense.
Lower-Case V
[v], IPA 129
Well, it's very weak, but there's frication throughout this gap up to 400 msec. There's also voicing, so this is either a
very weak voiced stop or a weak voiced fricative. The transitions in and out all suggest bilabial (although the F3 doesn't
help--more later), so, since bilabial fricatives are not an option, as this is my English, labiodental is not a stretch.
Turned R
[ɹ], IPA 151
So there's that F3, clearly transitioning way down in the previous vowel, and there it is here, down at about 1600 Hz or
so. So this must be an /r/ of some kind. Nuff said, I guess.
Schwa
[ə], IPA 322
Transcribing this as a vowel is merely a convenience. There seems to be 'something' between the /r/ and the following
segment, but exactly what is open to interpretation.
Tilde L (Dark L)
[ɫ], IPA 209
So from about 475 to 550 msec or so, there's a dip in amplitude, accompanied by an apparent zero in F2, and a relatively
high F3. Very high considering the previous /r/. I'm not quite sure what's going on in F2, but the raised F3 is usually a
good indicator of the lateral. And the F2, if it's anywhere, is down there below 1000 Hz, so it must be dark.
Barred I
[ɨ], IPA 317
Again, this is a bit of vowel. I made a mistake transcribing it as barred-i--I think I must have misread the F3 as an F2, but
that's idiotic, since even the highest F2 can't get up that high. But it's a transitional vowel more than anything else. So
there.
Lower Case W
[w], IPA 170
So here's another attenuated, presumably consonantal articulation, but fully and clearly voiced. So this is almost
undoubtedly a sonorant, but very, very close. The low F2 is consistent only with something very round and very back,
and the following F2 transition is typical of [w], so there you go.
Open O + Rhoticity Sign

[ɔ˞], IPA 306 + 419
Again, a transcriptional convenience more than anything else. I needed something mid and fairly round. This might be
better as an [o], but whatever. The /r/ colo(u)ring is fairly clear, with that low F3 again, but the vowel is again mostly
transitional. To the degree that the F1 indicates something mid and the F2 indicates something mostly back, take your
pick.
Turned R
[ɹ], IPA 151
Well, here's another /r/. Low F3, though not as low as previously. I've been noticing that the bandwidths of initial /r/s
are very narrow, but that may just be me doing stuff to do that. Anyway, this looks like a typical /r/ in coda position,
with the higher (closer to F3) F2 than in other positions.
Lower-Case D
[d], IPA 104
Gap. Probably a plosive of some kind. Very voiced, which is interesting. There seems to be a falling F4 (or something),
but the F3, if anything, has a rising transition into this gap. But then it would be, since it starts so low. The F2 is
ambiguous to say the least. So on balance, I think the alveolar guess is just a default thing. The release is even weak,
so it's not clear if that fricative coming up is just release (which would tell us a lot about the place of this plosive) or if
it's a fricative.
Lower-Case Z
[z], IPA 133
Well, if there's a fricative, it must be a sibilant. Look at that frequency. And it must be alveolar. Same reason. And
voiced.
Lower Case W + Rhoticity Sign

[w˞], IPA 170 + 419
I went nuts with the rhoticity this time, I guess. There's some labial shaping to the tail of the fricative, which is the only
decent indication of anything other than just the /r/. If you missed it, I don't blame you. There's some weird overlap of
the fricative and the following /r/, but the only thing I latched onto was the attenuation of the voicing before the F3
minimum coming up. The rhoticity sign is just because the F3 is so low.
Turned R + Syllabicity Mark

[ɹ]̩ , IPA 151 + 431
Okay, well, there's clearly something we want to call a vowel here, and it's got the typical low F3, hence the syllabic /r/
transcription.
Lower-Case G
[g], IPA 110
Well, there's another gap here, with voicing. Nice loud, but noisy release. Transitions in and out have F2 and F3 close
together, so velar is probably the best guess. Transitioning from back to front velar is apparent from the frequency of
the 'pinch' on either side.
Small Capital I
[ɪ], IPA 319
So if it ends up as a front velar, this vowel must be a front vowel. And it is. Quite front, at least at the beginning. And quite
high, judging from the low F1. The F2 transitions down, whereas I'd expect an [i] to have more of a steady state or
trend upward, at least until it starts to transition into a following consonant. So this is probably lax/short/whatever you
want to call it.
Script V
[ʋ], IPA 150
Well, this looks short, and vaguely flap-like, being a short, fully voiced 'gap' looking thing. But while the amplitude
attenuation is appropriate for a flap, the sonorousness is not. The formants may dip away, but they don't 'stop' the way
they would/might in a proper flap. Then there are the transitions. All falling. So this looks (bi)labial again. Not really a
good fricative like the previous one, but an approximant-y looking fricative. And again probably labiodental over
bilabial, just because this is English.
Schwa
[ə], IPA 322
Okay, now this looks like a schwa. Formants at 500, 1500, 2500 and--well, short of 3500, but what the heck.
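Those round numbers are what the textbook uniform-tube model predicts for a neutral vocal tract: a tube closed at the glottis and open at the lips resonates at odd multiples of a quarter wavelength. A minimal sketch, assuming a 17.5 cm tract and a speed of sound of 35000 cm/s (both conventional textbook values, not measurements of this speaker):

```python
# Sketch: resonances of a uniform tube closed at one end (glottis) and open
# at the other (lips): F_n = (2n - 1) * c / (4 * L).
# The speed of sound and the 17.5 cm length are textbook assumptions.

def neutral_formants(length_cm=17.5, speed_cm_per_s=35000.0, count=4):
    """Return the first `count` quarter-wavelength resonances in Hz."""
    return [(2 * n - 1) * speed_cm_per_s / (4.0 * length_cm) for n in range(1, count + 1)]


print(neutral_formants())            # [500.0, 1500.0, 2500.0, 3500.0]
print(neutral_formants(length_cm=16.0)[:2])
```

A shorter tube pushes every resonance up and a longer one pulls them down, which is why the "neutral" frequencies differ from voice to voice.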
Lower-Case N
[n], IPA 116
Another segment that has roughly the duration of a flap, although it might be just a tad long. And considering the
length, it's fully sonorant (with resonances), so probably not a tap. The attenuation therefore is probably close
articulation, and the discontinuity in the frequency/bandwidth of the formants, not to mention the apparent zero
below "F2" suggest a nasal. No evidence of velar pinch or a single F2/F3 range pole, and the pole is up around 1500 Hz,
too high to be my bilabial. Only one choice left.
Lower-Case A + Upsilon
[aʊ], IPA 304 + 321
Well, abstracting away from the first 50 msec or so, the F1 here is fairly high, indicating a fairly low vowel. The F2 starts
in the central range, and goes down, indicating increasing rounding and/or backing. So this is probably a diphthong
[aʊ]. I wish I could see the F1 dropping a little, to suggest going from low-to-high, vowel height-wise, but whatever. I
don't regard /aɔ/ as a likely diphthong in this case.
Lower-Case T + Right Superscript H

[tʰ], IPA 103 + 404
Well, I probably should have marked some preglottalization on the previous vowel, but I guess I just read it as low pitch
when I was doing the transcribing. But looking at it now, it looks creaky, sort of. Anyway, without the release, it would
be hard to tell anything was going on here. But the release at about 1800 msec, is clearly alveolar-looking. Sharp and
abrupt, broad band, followed by something that looks very sibilant. Typical of alveolar plosion noise. Especially if you're
used to seeing my voice in these things.

PLEASE NOTE: I've become disillusioned with Lucida Sans Unicode--I've done some checking, and some of the diacritics
are just wrong, and the linespacing is weird--some styles are lower than others. So enough of that. I'm now touting the
virtues of Gentium, a freeware font by Victor Gaultney that (despite some minor kerning problems) is quite beautiful,
legible, readable, etc. Failing that, I'm also testing the Beta of SIL's SILDoulosUnicodeIPA (all one word), the Unicode-
compliant layout of their popular Doulos. The advantage is the Unicode organization. The disadvantage is that you
*must* use Keyman or Insert Symbol or something to get it (or for that matter any Unicode special characters) into a
word-processing document. And at least in Word (2000, ver 9.0.somethingsomething, SR something something), while
the non-spacing characters look fine (even beautiful) on screen, for some reason when you print them, they come out
spaced way too far to the left. It seems that Unicode 'smart font' features are getting outsmarted by Word.
On the other hand, these things look GORGEOUS on screen, and print fine for me off of the browser when coded in
HTML. So, since this is coded in HTML, I'm switching to them. This month I'm leaving up the regular SILDoulos IPA93 as
before, but have switched the defaults for the Unicode to Gentium (first) and SILDoulosUnicodeIPA. At some point I'll
probably stop using the regular Doulos altogether. So please please please download either Gentium or
SILDoulosUnicodeIPA from the above links, so you can see the symbols as they were meant to be seen.
If you are desperate to see phonetic symbols in SIL Sophia or SIL Manuscript, or some other kind of Unicode, drop me a
line.
Solution for May 2003
"The damage reduces the value."
Eth, IPA 131,

[D], [ð]
Okay, there's not much going on here, and now that I'm looking at the print of the spectrogram and not the original,
there's not a lot of evidence of friction for a lot of it. Starting just shy of 100 msec there's 25-50 msec of voicing (three or
four pulses), there's also a couple of very noisy pulses (look at the high frequencies) suggesting either frication, or the
fricated release of a stop. Okay, so maybe I should have marked this as 'raised'. But at least it's fully voiced. No useful
transition information, so go for coronal by default. Voiced, coronal, probably plosive or fricative. Moving on.
Schwa, IPA 322,

[«], [ə]
Well, from 125 msec to 175 msec or so there's a vowel. The F1 is just a little low of mid, the F2 just a little high of central,
and the F3 just plain high. Must have been particularly short that day, if you follow. But look at the distribution of the
four visible formants. Dead even, all the way up. Must be schwa, although the F1 is a little low.
Lower-case D, IPA 104,

[d], [d]
Well, once again, not a lot of useful transition information, so if you're guessing, guess coronal. Definitely voiced for
most of its duration, it's got a nice sharp release burst with some very high frequency frication. This is pretty classic for
coronal plosives.
Ash, IPA 325,

[Q], [æ]
It's worth noticing that this vowel is pretty long, and things this long tend to be low vowels. So it's probably safe to
ignore the F1 transition from the preceding consonant. On the other hand, the F2 is definitely diving down into the
following consonant, so we have to factor that part of the F2 transition out. So we have a vowel whose F1 (excluding the
first third or so as transition) is *very, very* high, and whose F2 (ignoring the last third or so as transition) is also
very, very high. F3 is not doing anything useful, so we'll ignore it. So the F1 says this vowel is low, and the F2 says this
vowel is front. How many low, front vowels can you find in English? Well, maybe three, but ASH is the most obvious
guess.

[m], [m]
So there's 50 msec (or so) between 400 and 500 msec where there's definitely strong voicing (note how it doesn't die out
like in the earlier [d]), but of lower overall amplitude than either flanking vowel. So we definitely have a sonorant of
some kind. Given the abruptness of its edges, not to mention the discontinuity in the first formant, I'm inclined to say
'nasal' over oral approximant, although it's not really perfect for a nasal. The F2 and F3 (and F4 if you're paying
attention, and even the F1 if you look close) are all coming down into this segment, and rising out of it. When your
formants are pointing down, that's usually labial, so this is probably [m]. Unfortunately, there's not a lot of zero/pole
information in the middle to add to this guess.
Schwa, IPA 322,

[«], [ə]
Once again, you have to ignore the transitions, but this is pretty classic schwa otherwise.

[d], [d]
Sometimes I wonder about my plosives. Okay, there's a gap (ignoring the junk in the F2 range, which just seems to be
me and phrase-internal plosives), with clear, if dying, voicing at the bottom. The transitions here (F2 and F3 at least)
seem to point up, if they point anywhere, which is typical of coronals. Of course, it helps that the F2 of the flanking
vowels is relatively low to the 1700-1800Hz 'locus' of alveolar transitions.
Yogh, IPA 135,

[Z], [ʒ]
Okay, this is the release of the preceding plosive, so phonologically we have an affricate here. Which narrows down the
possibilities for English. But in case it wasn't English, or you weren't sure, this looks like a Yogh. It's fricative, it's quite
high amplitude, it's very broad band, and it dies in the low frequencies. The big difference between 's' and 'sh' is a) the
frequency of the peak amplitude of the frication (high for 's' and in the F2-F3 or occasionally F3-F4 range for 'sh'), and
b) the shape of the noise in the lower frequencies (sharp cut off below 1500 Hz or so for 'sh', where it's more gradual for
's'). I thought there was evidence of voicing during this, but now I don't see it. But knowing this is an affricate and
believing the preceding plosive to have been voiced ....
Turned R, IPA 151,

[¨], [ɹ]
There's that low F3. There must be an /r/ in here somewhere.
Barred I, IPA 317,

[ö], [ɨ]
I went with Barred I, because I couldn't decide what else this would be. It's not really high (the F1 is low of mid, but not
really Low). It's vaguely front (the F2 is high of central, but not really High--it doesn't help that the F3 is pushing down).
The F3 is useless, since we know there's an /r/ in there somewhere. And listening to it (come back when this gets
archived), it just sounds reduced (probably due to the /r/ colo(u)ring). Moving on.

[d], [d]
Again. Nothing new to add, really.
Barred U, IPA 318,

[¬], [ʉ]
I'm not sure I've ever used this symbol in a spectrogram before. This vowel was definitely round, contrary to all
expectations (it's my western US speech, it follows a coronal). The vowel is moderately high (as high as vowels get in
this stupid spectrogram). The F2 is not high enough for this to be front, even if it is round. So call it central, and round,
and move on.

[s], [s]
Classic, if slightly weak, [s]. The highest amplitude of the frication is in the very high frequencies, it's sort of broad
band, and it dies (relatively) slowly as you go down in frequency. Okay, so it zeroes out abruptly at 1300 Hz or so, like the
previous Yogh. But there's just no way to read the center of the energy in the F2/F3/F4 range in this one.
Schwa, IPA 322,

[«], [ə]
Another one of these too, albeit with more classical-looking formants.
Lower-case Z, IPA 133,

[z], [z]
Well, if the preceding consonant was [s], this is just the same, but with voicing.
Eth + Raising Sign, IPA 131 + 429,

[D3], [ð]̝
Well, this looks slightly voiced, but that might just be the background noise. But it's got a short VOT, whatever it is.
Okay, so if you think it's a stop, fine. That's why I used the raising sign. It's not bilabial looking or velar looking, so it's
probably coronal. It could be a [d], I guess, but it certainly isn't a /d/.
Schwa, IPA 322,

[«], [ə]
I'm not sure I ever had this many reduced vowels in a spectrogram, let alone actual schwas. And this is a little long. But
there you go.
Lower-case V, IPA 129,

[v], [v]
Well, it's voiced. There's stuff going on above, which looks like pulsing, but if you look really, really closely (and wish
really, really hard), you can see that the apparent pulses in the middle of this don't line up at all with the pulses in the
voicing bar. So either this is mid-frequency mushy stuff in the middle of one of my plosives, evidence of noise, or
evidence of 'double clunking'--the kind of thing you get with velar releases (and occasionally 'distributed' coronal
releases). Okay, so there's no release here (it's in the middle, after all), so this may be some structure, like my fat lower lip,
flapping against some other structure, like my upper lip or my upper incisors. You pick, based on what kinds of words
you can get out of the strings.
Ash, IPA 325,

[Q], [æ]
I love it when the F1 and F2 steady states don't overlap at all. As in the previous [ae]. Well, here's another one, although
the F2 isn't really front at all, and the F1 is tough to reckon given the apparent coarticulation with the following thing,
whatever that is. So it could be another schwa, but it's just too long and loud. And high pitched (if you can see the
pulses). So something that ends up a low vowel, not at all back. Again, depending on what the voice is like you have
your choice of three vowels. Again, you pick.
Lower-case L + Mid Tilde, IPA 155+428 (composed as 209),
[lò], [ɫ]
Well, it would be easy to miss this, but there's this pretty abrupt 'thing' in the F2, around 1400 or 1425 msec, a
discontinuity. This moment is when the F1 stops moving, the F2 lowers abruptly, and the amplitude of F3 suddenly
drops. And something weird happens in F4, but I don't know what that's about. So something is happening here. F2 is
low, so whatever it is, it's back. So it's either [w] or [l].
Lower-case J, IPA 153,

[j], [j]
Okay once and for all, see how the F2 of /j/ here is much much much higher than the F2 during the thing that turns out
to be an [l]. So explain to me how this [l] can be anything but dark (velarized) when its F2 is nowhere anywhere near
front. Sorry, folks, while onset /l/ may not be as dark as coda /l/, US (and Canadian) English /l/s are never anything but
velarized. In my experience. And certainly in my voice. Furthermore, the need for it to be very, very back is so at odds
with the need to produce a front glide into the following vowel, that the front glide gets displaced way way way later
rather than risk fronting the [l]. At least that's my story.
Okay, so anyway, the F1 is pretty low (again, as low as it ever gets), the F2 clearly has a very high target, but no steady
state. So this is probably a glide of some kind, and it must be front.
Barred U, IPA 318,

[¬], [ʉ]
Well, setting aside the end-of-utterance low amplitude problem, the 'targets' for this thing appear to be moderately
high (lowish F1), and not front at all, but either sort of back but not round, or sort of round and not really far back. So I
used the barred U again.

Welcome to the Monthly Mystery Spectrogram webzone. These pages are Rob Hagiwara's professional web-space. For
personal musings, please see Rob's blog.
This is the How To page of the mystery spectrogram webzone. Contents for this page:
How do I read a spectrogram?

Sources and filters
Formants and vowels
Manner and place of articulation (plosives)
Fricatives
Nasals
Approximants
Etc.
May 2009: fixed broken rollovers, miscellaneous text cleaning
September 2007: updated menus and navigation
May 2006: Commentary about stuff that I plan to change, or would like some input on, is interspersed throughout this version of
the page, in this goofy text. Depending on your browser, I think this is rendered in colo(u)r. General stuff changing throughout:
Fix Unicode IPA symbol calls to uniform style (hex?)

Re-do all figures with better original recordings and cleaner spectrograms
Rollover formant markings for all figures
Incorporate other rollover information (nasal poles/zeroes, duration) into text
Clean up text
Separate sections onto separate pages for faster loading(? maybe vowels, obstruents and sonorant consonants? or vowels,
consonants, and allophonic/prosodic stuff?)
Beef up allophony section to include flapping, glottal stops and glottalization, nasality (nasal vowels and nasal taps?), and
...?
How do I read a spectrogram?

The same way you get to Carnegie Hall: practice, practice, practice!
First, read the chapter on acoustic analysis in Ladefoged's A Course in Phonetics, or better yet take a course based on
Ladefoged's Elements of Acoustic Phonetics or Johnson's Acoustic and Auditory Phonetics. Or you can just read this summary,
but bear in mind there's going to be a lot left out, especially in the 'why' realm. Then (as usual) learn by doing!
The goal of this page is to provide just enough basic information for the novice to begin, perhaps with some guidance, the
process of decoding the monthly mystery spectrogram. This page is not intended to be the last word in spectrographic analysis in
general, nor even the last word on spectrogram reading. However, reasoning your way through a mystery spectrogram is very
instructive, especially in relating acoustic events with (presumed) articulatory ones. That is, in relating physical sounds with
speech production.
If you're reading this, I assume you are familiar with basic articulatory phonetics, phonetic transcription, the International Phonetic
Alphabet, and the surface phonology of 'general' North American English (i.e. phonemes and basic contrasts, and major
allophonic variation such as vowel nasalization, nasal place assimilation, and so forth). I try to keep in mind that I have an
international audience, but there are some details I just take to be 'given' for English. Someday if we do spectrograms of other
languages, we'll have to adjust.
I really recommend that beginners find someone to discuss spectrographic issues with. If you're doing spectrograms as part of a
class, form a study group. If you're a 'civilian', form a club. Or something. I'm toying with the idea of starting a Yahoo group or
something for us to do some discussions as 'community'. Strong opinions anyone? Unfortunately, I don't have time to answer in
detail every e-mail I receive about specific spectrograms or sounds or features, but if you have a general question or suggestions,
please feel free to contact me.
Please note: My style sheet calls for this page to be rendered in either Victor Gaultney's Gentium font, or in SIL's
SILDoulosIPAUnicode. These fonts are (in my opinion) the best available freeware fonts for IPA-ing in Unicode for the web.
Please see my list of currently supported fonts for justification and links to download these fonts.
So what is a spectrogram anyway?
A sound spectrogram (or sonogram) is a visual representation of an acoustic signal. To oversimplify things a fair amount, a Fast
Fourier transform is applied to an electronically recorded sound. This analysis essentially separates the frequencies and
amplitudes of its component simplex waves. The result can then be displayed visually, with degrees of amplitude (represented
light-to-dark, as in white=no energy, black=lots of energy), at various frequencies (usually on the vertical axis) by time
(horizontal).
Depending on the size of the Fourier analysis window, different levels of frequency/time resolution are achieved. A long window
resolves frequency at the expense of time—the result is a narrow band spectrogram, which reveals individual harmonics
(component frequencies), but smears together adjacent 'moments'. If a short analysis window is used, adjacent harmonics are
smeared together, but with better time resolution. The result is a wide band spectrogram in which individual pitch periods appear
as vertical lines (or striations), with formant structure. Generally, wide band spectrograms are used in spectrogram reading
because they give us more information about what's going on in the vocal tract, for reasons which should become clear as we go.
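To put rough numbers on that tradeoff, here is a minimal Python sketch. It assumes the usual rule of thumb that a window of duration T resolves frequencies about 1/T apart (window-shape factors are ignored). The 4 ms and 40 ms windows and the 100 Hz voice are values picked for illustration, not the settings used for the figures on this page.

```python
def window_tradeoff(window_ms, f0_hz):
    """Rule-of-thumb resolution for an analysis window of window_ms milliseconds."""
    freq_res_hz = 1000.0 / window_ms     # bandwidth is roughly 1 / duration
    pitch_period_ms = 1000.0 / f0_hz     # time between successive glottal pulses
    return {
        "freq_res_hz": freq_res_hz,
        # Harmonics sit f0 apart, so they separate only if the resolution is finer.
        "resolves_harmonics": freq_res_hz < f0_hz,
        # Pulses show up as striations only if the window is shorter than one period.
        "resolves_pulses": window_ms < pitch_period_ms,
    }

# Assumed example values: a 100 Hz voice, a short and a long window.
wide = window_tradeoff(window_ms=4.0, f0_hz=100.0)
narrow = window_tradeoff(window_ms=40.0, f0_hz=100.0)
print("wide band:  ", wide)
print("narrow band:", narrow)
```

With these assumed values the short window separates individual pulses but not harmonics, and the long window does the reverse, which is the wide band versus narrow band difference described above.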
Sources and filters

We often talk about speech in terms of source-filter theory. Put simply, we can view the vocal tract like a musical instrument.
There's a part that actually makes sound (e.g. the string, the reed, or the vocal folds), and the part that 'shapes' the sound (e.g.
the body of the violin, the horn of the clarinet, or the supralaryngeal articulators). In speech, this source of sound is provided
primarily by the vibration of the vocal folds. From a mathematical standpoint, vocal fold vibration is complex, consisting of both a
fundamental frequency and harmonics. Because the harmonics always occur as integral multiples of the fundamental (x1, x2,
x3, etc.—which phenomenon was mathematically proven by Fourier, hence "Fourier's Theorem" and "Fourier Transform"), it turns
out that the sensation of pitch of voice is correlated to both the fundamental frequency, and the distance between harmonics.
The point is that vocal source isn't just one frequency, but many frequencies ranging from the fundamental all the way up to
infinity, in principle, in integral multiples. Just as white light is many frequencies of light all mixed up together, so is the vocal
source a spectrum of acoustic energy, going from low frequencies (the fundamental) to high frequencies. In principle, there's
some energy at all frequencies (although unless you're talking about an integral multiple of the fundamental, the amount will be
zero).
The energy provided by the source is then filtered or shaped by the body of the instrument. In essence, the filter sifts the energy
of some harmonics out (or at least down) while boosting others. The analogy to light again is apt. If you pass a white light through
a red filter, you end up removing (or lessening) the energy at the blue end of the spectrum, while leaving the red end of the
spectrum untouched. Depending on the filter, you might pass a band of energy in the red end and a band of energy in the green
band, and something else. The 'color' of light that results will be different depending on which frequencies exactly get passed, and
which ones get filtered.
In speech, these different tonal qualities change depending on vocal tract configuration. What makes an [i] sound like an [i] is not
something to do with the source, but the shape of the filter, boosting some frequencies and damping others, depending on the
shape of the vocal tract. So the 'quality' of the vowel depends on the frequencies being passed through the acoustic filter (the
vocal tract), just as the 'color' of light depends on the frequencies being passed through the light filter.
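The source-filter idea can be sketched in a few lines of Python. This is a toy model, not how anybody actually models a vocal tract: the bell-shaped resonance curve, the 100 Hz bandwidth, the 0.5 threshold, and the two [i]-like resonance frequencies are all assumptions made up for the illustration.

```python
import math

def harmonics(f0_hz, max_hz=4000.0):
    """Source: energy only at integer multiples of the fundamental."""
    return [k * f0_hz for k in range(1, int(max_hz // f0_hz) + 1)]

def filter_gain(freq_hz, resonances_hz, bandwidth_hz=100.0):
    """Filter: a crude sum of bell-shaped resonance peaks (illustrative only)."""
    return sum(math.exp(-((freq_hz - r) / bandwidth_hz) ** 2) for r in resonances_hz)

def boosted(f0_hz, resonances_hz, threshold=0.5):
    """Harmonics that the filter passes at more than `threshold` gain."""
    return [h for h in harmonics(f0_hz) if filter_gain(h, resonances_hz) > threshold]

ASSUMED_RESONANCES = [300.0, 2300.0]  # hypothetical [i]-like filter, in Hz

print(boosted(100.0, ASSUMED_RESONANCES))  # lower-pitched source
print(boosted(150.0, ASSUMED_RESONANCES))  # higher-pitched source, same filter
```

Changing the fundamental changes which harmonics exist, but the ones that survive still cluster around the same two resonances. The filter, not the source, decides where the energy ends up.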
So, we can manipulate source characteristics (the relative frequency and amplitude of the fundamental—and some properties of
some of the harmonics) at the larynx independently of filter characteristics (vocal tract shape). Figure 1 is a
spectrogram of me saying [ i ɑ i ɑ ] (i.e. "ee ah ee ah") continuously on a steady pitch. On the left, a wide band spectrogram
shows the formants (darker bands running horizontally across the spectrogram) changing rapidly as my vocal tract moves
between vowel configurations. (Take a moment to notice that the wide band spectrogram is striated, and the horizontal formants
are 'overlaid' over the basic pattern of vertical striations.) On the right, a narrow band spectrogram reveals that the harmonics
—the complex frequencies provided by the source—are steady, i.e. the pitch throughout is flat. Because some harmonics are
stronger than others at any given moment, you can make out the formant structure even in the narrow band spectrogram. The
filter function (the formant structure) is superimposed over the source structure.
If you're still not sure what I mean by 'band' or 'formant', pass your mouse cursor over the figure. I've marked the center
frequency, more or less, of each visible formant in the figure. Look in the captions of spectrograms for extra information like
this. Depending on your hardware/software configuration, you should also be able to play the audio clip, by pressing the 'play'
button in the figure caption.
Figure 1. wide band (left) and narrow band (right) spectrograms, illustrating changing vowel quality with level pitch.
The other side of the source-filter coin is that you can vary the pitch (source) while keeping the same filter. Figure 2 shows
wide and narrow band spectrograms of me going [aː], but wildly moving my voice up and down. The formants stay steady in the
wide band spectrogram, but the spacing between the harmonics changes as the pitch does. (Harmonics are always evenly
spaced, so the higher the fundamental frequency —the pitch of my voice—the further apart the harmonics will be.)
Figure 2. wide band (left) and narrow band (right) spectrograms of me saying [aː], but with wild pitch changes.
A word on sources
I like to divide the kinds of sources in speech into three categories: periodic voicing (or vibration of the vocal folds), non-voicing
(which most people don't consider, but I like to distinguish it from my third category), and aperiodic noise (which results from
turbulent airflow).
Voicing is represented on a wide band spectrogram by vertical striations, especially in the lowest frequencies. Each vertical 'line'
represents a single pulse of the vocal folds, a single puff of air moving through the glottis. We sometimes refer to a 'voicing bar',
i.e. a row of striated energy in the very low frequencies, corresponding to the energy in the first and second harmonics (typically
the strongest harmonics in speech). For men, this is about 100-150 Hz, for women it can be anywhere between 150-250 Hz, and
of course there's lots of variation both within and between individuals. In a narrow band spectrogram, voicing results in harmonics,
with again the lowest one or two being the strongest.
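Since each striation is one glottal pulse, the spacing between striations gives a quick estimate of pitch. A minimal Python sketch, assuming you have read the pulse times off the spectrogram by eye (the times below are invented for the example):

```python
def f0_from_pulse_times(pulse_times_ms):
    """Estimate fundamental frequency (Hz) from striation times in msec."""
    # Gaps between successive pulses are the pitch periods.
    intervals = [later - earlier
                 for earlier, later in zip(pulse_times_ms, pulse_times_ms[1:])]
    mean_period_ms = sum(intervals) / len(intervals)
    return 1000.0 / mean_period_ms

# Hypothetical reading: four striations, 8 msec apart.
estimate = f0_from_pulse_times([0.0, 8.0, 16.0, 24.0])
print(estimate)  # 125.0 Hz, inside the 100-150 Hz range mentioned above
```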
Non-voicing is basically silence, and doesn't show up as anything in a spectrogram. So while there isn't a lot going on during
silence that we can see in a spectrogram, we can still tell the difference between voiced sounds (with a striated voicing bar) and
voiceless sounds (without). And usually there's still air moving through the vocal tract, which can provide an alternative source of
acoustic energy, via turbulence or 'noise'.
On the other hand, it's worth distinguishing several glottal states that lead to non-voicing. Typically, active devoicing results from
vocal fold abduction. The vocal folds are held wide apart and thus movement of air through the glottis doesn't cause the folds to
vibrate. If the vocal folds are tightly adducted (brought together in the midline) and stiffened, the result is no air movement through
the glottis, due to glottal closure. Ideally, this is how a 'glottal stop' is produced. Finally, the vocal folds may be in 'voicing
position', loosely adducted and relatively slack. But if there is insufficient pressure below the glottis (or too much above the glottis)
the air movement through the glottis won't be enough to drive vibration, and passive devoicing occurs.
Noise is random (rather than striated or harmonically organized) energy, and usually results from friction. In speech this friction is
of two types. There's the turbulence generated by the air as it moves past the walls of the vocal tract, usually called 'channel
frication'. This is just 'drag', resistance to the free flow of air. If the air is blown against (instead of across) an object, you get even
more turbulence, which we sometimes call 'obstacle frication'. For instance, when we make an [s], a jet of air is blown against the
front teeth—the sudden displacement results in a lot of turbulence, and therefore noise. In spectrograms, noise is 'snowy'. The
energy is placed in frequency and amplitude more randomly rather than being organized neatly into striations or clear bands. (Not
to say there aren't or can't be bands. They just usually don't have 'edges' to the degree that formants do. Or may.)
We'll return to voicing and voicelessness below, after we deal with vowels.
So what's the deal with formants?

A formant is a dark band on a wide band spectrogram, which corresponds to a vocal tract resonance. Technically, it represents a
set of adjacent harmonics which are boosted by a resonance in some part of the vocal tract. Thus, different vocal tract shapes will
produce different formant patterns, regardless of what the source is doing. Consider the spectrograms in Figure 3, which
represents the simplex vowels of American English (at least in my voice). In the top row are "beed, bid, bade, bed, bad" (i.e. [bid]
[bɪd] [beɪd] [bɛd] [bæd]). Notice that as the vowels get lower in the 'vowel space', the first formant (formants are numbered from
the bottom up) goes up. In the bottom row, the vowels raise from "bod" to "booed"—the F1 starts relatively high, and goes down
indicating that the vowels start low and move toward high. The first formant correlates (inversely) to height (or directly to
openness) of the vocal tract.
Now look at the next formant, F2. Notice that the back, round vowels have a very low F2. Notice that the vowel with the highest
F2 is [i], which is the frontmost of the front vowels. F2 corresponds to backness and/or rounding, with fronter/unround vowels
having higher F2s than backer/rounder vowels. It's actually much more complicated than that, but that will do for the beginner. If
you're picky about facts or the math, take a class in acoustic phonetics.
Figure 3. Wide band spectrograms of the vowels of American English in a /b__d/ context.
Top row, left to right: [i, ɪ, eɪ, ɛ, æ]. Bottom row, left to right: [ɑ, ɔ, o, ʊ, u].
There are a variety of studies showing various acoustic correlates of vowel quality, among them formant frequency, formant
movement, and vowel duration. Formant frequency (and movement) are probably the most important. So we can plot vowels in
an F1xF2 vowel space, where F1 corresponds (inversely) to height, and F2 corresponds (inversely) to backness and we'll end up
with something like the standard 'articulatory' vowel space.
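Here is a rough Python sketch of reading a vowel off an F1xF2 space. Everything numeric in it is an assumption: the reference formants are approximate textbook-style values for an adult male voice, not measurements from Figure 3, and the height and backness cut-offs are arbitrary round numbers. Plain distance in Hz is also a crude measure, since F2 differences swamp F1 differences.

```python
# Assumed reference values in Hz as (F1, F2); illustrative only.
REFERENCE = {
    "i": (270, 2290), "ɪ": (390, 1990), "ɛ": (530, 1840), "æ": (660, 1720),
    "ɑ": (730, 1090), "ɔ": (570, 840), "ʊ": (440, 1020), "u": (300, 870),
}

def describe(f1_hz, f2_hz):
    """Low F1 means a high vowel; high F2 means a front vowel (assumed cut-offs)."""
    height = "high" if f1_hz < 400 else "low" if f1_hz > 600 else "mid"
    backness = "front" if f2_hz > 1700 else "back" if f2_hz < 1200 else "central"
    return height, backness

def nearest_vowel(f1_hz, f2_hz):
    """Pick the reference vowel closest to the measured point in the F1xF2 plane."""
    return min(REFERENCE, key=lambda v: (REFERENCE[v][0] - f1_hz) ** 2
                                        + (REFERENCE[v][1] - f2_hz) ** 2)

print(describe(700, 1800), nearest_vowel(700, 1800))
print(describe(300, 900), nearest_vowel(300, 900))
```

A very high F1 with a high F2 comes out low and front, and a low F1 with a low F2 comes out high and back, which is the same reasoning used in the mystery spectrogram solutions.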
Note that some of the vowels in Figure 3 ([eɪ] and [ʊ] especially) show more movement during the vowel (beyond just the
transitions). Whether that makes them diphthongs (or should be represented like diphthongs) I'll leave for somebody else to
argue. But before we get too far, what would you imagine an [aɪ] or [aʊ] diphthong would look like?
It's worth pointing out now that all the formants show consonant transitions at the edges. Remember that the frequency of any
given formant has to do with the size and shape of the vocal tract—as the vocal tract changes shape, so do the formants change
frequency. So the way the formants move into and out of consonant closures and vowel 'targets', is an important source of
information about how the articulators are moving.
Manner (and place) of articulation

Plosives (oral stops) involve a total occlusion of the vocal tract, and thus a 'complete' filter, i.e. no resonances being contributed
by the vocal tract. The result is a period of silence in the spectrogram, known as a 'gap'. A voiced plosive may have a low-frequency
voicing bar of striations, usually thought of as the sound of voicing being transmitted through the flesh of the vocal tract.
However, due to passive devoicing, it may not. And due to perseverative voicing even a 'voiceless' plosive may show some
vibration as the pressures equalize and before the vocal folds fully separate. But let's not get lost in too many details.
Generally we can think about the English plosives as occurring at three places of articulation—at the lips, behind the incisors, and
at the velum (with some room to play around each). The bilabial plosives, [p] and [b] are articulated with the lower lip pressed
against the upper lip. The coronal plosives [t,d] are made with the tongue blade pressing against the alveolar ridge (or
thereabouts). [k] and [g] are described as 'dorsal' (meaning 'articulated with the tongue body') and 'velar' (meaning 'articulated
against or toward the velum'), depending on your point of view. (I tend to use 'dorsal' and 'velar' interchangeably, which is
very bad. I use 'coronal' because it's more accurate than 'alveolar', in the sense that everybody uses their tongue blade (if not the
apex) for [t,d], but not everybody uses only their alveolar ridge.)
That controversy aside, the thing to remember is that during a closure, there's no useful sound coming at you—there's basically
silence. So while the gap tells you it's a plosive, the transitions into and out of the closure (i.e. in the surrounding vowels) are
going to be the best source of information about place of articulation. Figure 4 contains spectrograms of me saying 'bab' 'dad' and
'gag'.
Figure 4. Spectrograms of "bab" "dad" and "gag".

There's no voicing during the initial closure of any of these plosives, confirming what your teachers have always told you: "voiced"
plosives in English aren't always fully voiced during closure. Then suddenly, there's a burst of energy and the voicing begins,
goes for a couple hundred milliseconds or so, followed by an abrupt loss of energy in the upper frequencies (above 400 Hz or so),
followed by another burst of energy, and some noise. The first burst of energy is the release of the initial plosive. Notice the
formants move or change following the burst, hold more or less steady during the middle of the vowel, and then move again into
the following consonant. We know there's a closure because of the cessation of energy at most frequencies. The little blob of
energy at the bottom is voicing, only transmitted through flesh rather than resonating in the vocal tract. Look closely, and you'll
see that it's striated, but very weak. The final burst is the release of the final plosive, and the last bit of noise is basically just
residual stuff echoing around the vocal tract.
Take a look at those formant transitions out of and into each plosive. Notice how the transitions in the F2 of 'bab' point down (i.e.
the formant rises out of the plosive and falls into it again), where the F2 of 'gag' points up? Notice how in 'gag' the F2 and F3 start
out and end close together? Notice how the F3 of 'dad' points slightly up at the plosives? Notice how the F1 always starts low,
rises into the vowel, and then falls again?
Okay, these aren't necessarily the best examples, but basically, labials have downward pointing transitions (usually all visible
formants, but especially F2 and F3), dorsals tend to have F2 and F3 transitions that 'pinch' together (hence 'velar pinch'), and the
F3 of coronals tends to point upward. The direction any transition points obviously is going to depend on the position of the
formant for the vowel, so F2 of [t,d] might go up or down. A lot of people say coronal transitions point to about 1700 or 1800 Hz,
but that's going to depend a lot on speaker-individual factors. Generally, I think of coronal F2 transitions as pointing upward
unless the F2 of the vowel is particularly high.
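Those rules of thumb can be written down as a small decision procedure. This Python sketch is only a caricature of the reasoning above: it compares F2 and F3 at the edge of the closure with their values in the middle of the vowel, and the 500 Hz 'pinch' threshold and all the example frequencies are assumptions, not measurements from Figure 4.

```python
def guess_place(f2_edge, f3_edge, f2_vowel, f3_vowel, pinch_hz=500.0):
    """Guess plosive place from formants (Hz) at the closure edge vs. mid-vowel."""
    gap_edge = f3_edge - f2_edge
    gap_vowel = f3_vowel - f2_vowel
    if gap_edge < pinch_hz and gap_edge < gap_vowel:
        return "dorsal"    # F2 and F3 converge at the closure: 'velar pinch'
    if f2_edge < f2_vowel and f3_edge < f3_vowel:
        return "labial"    # both transitions point down toward the closure
    if f3_edge >= f3_vowel:
        return "coronal"   # F3 points up at the closure
    return "unclear"       # real spectrograms often land here

# Hypothetical readings around a vowel with F2 = 1700 Hz and F3 = 2500 Hz.
print(guess_place(1400, 2300, 1700, 2500))  # labial-looking
print(guess_place(1750, 2700, 1700, 2500))  # coronal-looking
print(guess_place(2100, 2400, 1700, 2500))  # dorsal-looking
```

The order of the checks is a design choice: the pinch is tested first because it is the most distinctive pattern, and anything that fits none of the rules is reported as unclear rather than forced into a category.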
Another thing to notice is the burst energy. Notice that the bursts for "dad" are darker (stronger) than the others. Notice also that
they get darker in the higher frequencies than the lower. The energy of the bursts in "gag" are concentrated in the F2/F3 region,
and less in the higher frequencies. The burst of [b] is sort of broad—across all frequencies, but concentrated in the lower
frequencies, if anywhere. So bursts and transitions also give you information about place.
Figure 4 also illustrates that in initial position, phonemic /b, d, g/ tend to surface with no voicing during the closure, but a short
voice onset time, i.e. as unaspirated [p, t, k]. In final position, they tend to surface as voiced, although there's room for variation
here too.
Fricatives
Frankly, fricatives are not my favorite. They're acoustically and aerodynamically complex, not to mention phonologically and
phonetically volatile. There's not a lot you can say about them without getting way too complicated, but I'll try.
Fricatives, by definition, involve an occlusion or obstruction in the vocal tract great enough to produce noise (frication). Frication
noise is generated in two ways, either by blowing air against an object (obstacle frication) or moving air through a narrow channel
into a relatively more open space (channel frication). In both cases, turbulence is created, but in the second case, it's turbulence
caused by sudden 'freedom' to move sideways (Keith Johnson uses the terrific analogy of a road suddenly widening from two to
four lanes, with a lot of sideways movement into the extra space), as opposed to air crashing around itself having bounced off an
obstacle (Keith's freeway analogy of a road narrowing from four lanes to two works here, but I don't really want to think about
serious sibilance in this respect....)
Sibilant fricatives involve a jet of air directed against the teeth. While there is some (channel) turbulence, the greater proportion of
actual noise is created by bouncing the jet of air against the upper teeth. The result is very high amplitude noise. Non-sibilant
fricatives are more likely 'pure' channel fricatives, particularly bilabial and labiodental fricatives, where there's not a lot of stuff in
front to bounce the air off of.
In Figure 5, there are spectrograms of the fricatives, extracted from a nonce word ("uffah", "ussah", etc.).
Figure 5: Top row, left to right: f, theta, s, esh. Bottom row, left to right: v, eth, z, yogh.
Let's start with the sibilants "s" and "sh", in the upper right of Figure 5. They are by far the loudest fricatives. The darkest part of
[s] noise is off the top of the spectrograms, even though these spectrograms have a greater frequency range than the others on
this page. [s] is centered (darkest) above 8000 Hz. The postalveolar "sh", on the other hand, while almost as dark, has most of its
energy concentrated in the F3-F4 range. Often, [s]s will have noise at all frequencies, where, as here, the noise for [ʃ] seems to
drop off drastically below the peak (i.e. there's sometimes no noise below 1500 or 2000 Hz.) [z] and [ʒ] are distinguished from
their voiceless counterparts by a) lesser amplitude of frication, b) shorter duration of frication and c) a voicing bar across the
bottom. (Remember, however, that a lot of underlyingly voiced fricatives in English have voiceless allophones. What other cues
are there to underlying voicing? Discuss.) Take a good look at the voicing bar through the fricatives in the bottom row. You may
never see a fully voiced fricative from me again.
It's worth noting that F2 transitions are greater and higher with [ʃ] than with [s], and I seem to depress F4 slightly in [ʃ], but I don't
know how consistent these markers are.
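One rough way to put a number on where the noise is "centered" is the spectral centroid (center of gravity) of a spectrum slice taken through the frication. The sketch below is not how the figures here were made. The function name and both example slices are invented for illustration, and the amplitudes are arbitrary linear units, not measurements.

```python
def spectral_centroid(spectrum):
    """Return the amplitude-weighted mean frequency (Hz) of a spectrum slice.

    `spectrum` is a list of (frequency_hz, amplitude) pairs with
    non-negative linear amplitudes.
    """
    total = sum(amp for _, amp in spectrum)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    return sum(freq * amp for freq, amp in spectrum) / total


# Hypothetical slices, invented for illustration only.
# [s]-like: energy keeps growing toward the top of the range.
s_like = [(1000, 1), (2000, 1), (4000, 3), (6000, 6), (8000, 9), (10000, 8)]
# [ʃ]-like: energy peaks lower down, with little noise at the bottom.
esh_like = [(1000, 0), (2000, 2), (3000, 9), (4000, 7), (6000, 3), (8000, 1)]

print(spectral_centroid(s_like))    # the higher of the two centroids
print(spectral_centroid(esh_like))  # the lower of the two centroids
```

With slices like these the [s]-like centroid comes out well above the [ʃ]-like one, which is the same contrast the eye picks up as "darkest near the top" versus "darkest in the F3-F4 range".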
Labiodental and (inter)dental (nonsibilant) fricatives are notoriously difficult to distinguish, since they're made at about the same
place in the vocal tract (i.e. the upper teeth), but with different active articulators. Having established (in a mystery spectrogram)
that a fricative isn't loud enough to be a sibilant, you can sometimes tell from transitions whether it is labiodental or interdental—
labiodental will have labial-looking transitions, interdentals might have slightly more coronal looking transitions. But that's poor
consolation—often underlying labiodental and interdental fricatives don't have a lot of noise in the spectrogram at all, looking
more like approximants. Sometimes they lenite into approximants, or fortisize to stoppy-looking things. I hate fricatives.
Before moving on, we need to talk about [h]. [h] is always described as a glottal fricative, but since we know about channels and
such, it's not clear where the noise actually comes from. Aspiration noise, which is also [h]-like, is produced by moving a whole lot
of air through a very open glottis. I heard a paper once where they described the spectrum of [h]-noise as 'epiglottal', implying that
the air is being directed at the epiglottis as an obstacle. Generally speaking, we don't think of the vocal cords moving together to
form a 'channel' in [h], although breathy-voicing and voiced [h]s in English (as many intervocalic [h]s are produced) may be
produced this way. So I don't know. What I do know about [h]s is that the noise is produced far enough back in the vocal tract that
it excites all the forward cavities, so it's a lot like voicing in that respect. It's common to see 'formants' excited by noise rather than
harmonics in spectrograms of [h]. Certainly, the noise will be concentrated in the formant regions. Compare the spectrograms in
Figure 6.
Figure 6. Spectrograms of "hee" "ha" and "who".
Notice how different the frication looks in each spectrogram. In "hee", the noise is concentrated in F2, F3 and higher, with very
little in the 1000 Hz range. In "ha", in which F1 and F2 straddle 1000 Hz, the [h] noise is right down there. In "who", there is a lot
less amplitude to the noise between 2000 and 3000 Hz, but around F2 (around 1000 Hz) and lower, there's a great deal.
You can even see F2 really clearly in the [h] of "who". So that's [h]. Don't ask me. It's not very common in my spectrograms....
Nasal stops
Nasals have some formant structure, but are better identified by the relative 'zeroes' or areas of little or no spectral energy. In
Figure 7, the final nasals have identifiable formants that are lesser in amplitude than in the vowel, and the regions between them
are blank. Nasality on vowels can result in broadening of the formant bandwidths (fuzzying the edges), and the introduction of
zeroes in the vowel filter function. Nasals can be tough, and I hope to get someone who knows more about them than I do to say
something else useful about them. You can sometimes tell from the frequency of the nasal formant and zero what place of
articulation was, but it's usually easier to watch the formant transitions. (This is particularly true of initial nasals; final nasals I
usually don't worry about--if you can figure out the rest of the word, there's only three possible nasals it could end with.) (Actually,
being loose with the amount of information you actually have before you start trying to fit words to the spectrogram is one of the
tricks to the whole operation.)
Figure 7. Spectrograms of "dinner", "dimmer", "dinger".
The real trick to recognizing nasal stops is a) formant structure, but b) relatively lower-than-vowel amplitude. Place of articulation
can be determined by looking at the formant transitions (they are stops, after all), and sometimes, if you know the voice well, the
formant/zero structure itself. Comparing the spectrograms above, we can see that 'dinger' (far right) has an F2/F3 'pinch'—the
high F2 of [ɪ] moves up and seems to merge with the F3. In the nasal itself, the pole (nasal formant) is up in the neutral F3 region.
'Dinner' (middle) has a pole about 1500 Hz and a zero (a region of low amplitude) below it until you get down to about 500 Hz
again. The pole for [m] in 'dimmer' is lower, closer to 1000 Hz, but there's still a zero between it and what we might call F1. Note
also that the transitions moving into the [m] of 'dimmer' are all sharply down-pointing, even in the higher formants, a very strong clue
to labiality, if you're lucky enough to see it.
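The pole comparison above can be sketched as a nearest-reference lookup. This is only an illustration: the [m] and [n] reference values are the rough figures quoted for this one voice, the [ŋ] value is an assumed stand-in for "the neutral F3 region", and the function name and tolerance are hypothetical. The pole is only one cue, so transitions should still get the final say.

```python
# Reference pole frequencies in Hz. The [m] and [n] values follow the
# rough figures quoted above for one speaker; the [ŋ] value is an
# ASSUMED stand-in for "the neutral F3 region".
REFERENCE_POLES_HZ = {"m": 1000.0, "n": 1500.0, "ŋ": 2500.0}


def guess_nasal_place(pole_hz, max_distance_hz=250.0):
    """Return the nasal whose reference pole is nearest to `pole_hz`.

    Returns None when no reference pole is within `max_distance_hz`
    (an assumed tolerance), meaning the pole is not a usable cue.
    """
    symbol, ref = min(REFERENCE_POLES_HZ.items(),
                      key=lambda item: abs(item[1] - pole_hz))
    if abs(ref - pole_hz) > max_distance_hz:
        return None
    return symbol


print(guess_nasal_place(1050))  # m
print(guess_nasal_place(1450))  # n
print(guess_nasal_place(3500))  # None
```

Returning None rather than forcing a guess matches the advice above: when the pole is weak or in an odd place, fall back on the formant transitions.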
Approximants
In case you're not familiar with the term (generally attributed to Ladefoged's Phonetic Study of West African Languages or as
modified in Catford's Fundamental Problems in Phonetics), the approximants are non-vowel oral sonorants. In English, this
amounts to /l, r, w, j/. They are characterized by formant structure (like vowels), but constrictions of about the degree of high
vowels or slightly closer. Generally there's no friction associated with them, but the underlying approximants can have fricative
allophones, just as fricative phonemes can occasionally have frictionless (i.e. approximant) allophones.
Canonically, the English approximants are those consonants which have obvious vowel allophones. The classic examples are the
[j-i] pair and the [w-u] pair. I have argued that [ɹ] is basically vowel-like in structure, i.e. that syllabic /r/ is the most basic
allophone, but there are those who disagree. Syllabic [l]s are all at least plausibly derived from underlying consonants, but I'm
guessing that'll change in the next hundred years.
Figure 8. Spectrograms of 'ball', 'bar', 'bough', 'buy'.
In Figure 8, the approximants are presented in coda/final position, where the formant transitions are easiest to discern. Note that
in all four words, the F1 is mid-to-high, indicating a more open constriction than with a typical high vowel. For /l/, the F2 is quite
low, indicating a back tongue position—velarization of 'dark l' in English. The F3, on the other hand, is very high, higher than one
ever sees unless the F2 is pushing it up out of the way. In "bar", the F3 comes way down, which is characteristic of [ɹ] in English.
Compare the position of the F3 in "bar" with that in "bough" and "buy", where the F3 is relatively unaffected by the constriction.
In "bough", the F2 is very low, as the tongue position is relatively back and the lips are relatively rounded. Note that this has
no effect on F3, so let it be known that lip rounding has minimum effect on F3. Really. The next reviewer who brings up lip
rounding without having some data to back it up is going to get it between the eyes. It's worth noting that the nuclear part of the
diphthong is more front (as indicated by the F2 frequency in the first half of the diphthong) in [aʊ] than in [aɪ]. In 'buy',
the offglide has a clearly fronting (rising) F2.
Common allophonic variation

One of the absolutely characteristic features of American English is "flapping". This is when an underlying /t/ (and sometimes /d/)
is replaced by something which sounds a lot like a tapped /r/ in languages with tapped /r/s. I refer the reader to Susan
Banner-Inouye's M.A. and Ph.D. theses on the phonological and phonetic interpretations of flappy/tappy things in general. But the easiest
thing to do is compare them. The spectrograms in Figure 9 are of me reading "a toe", "a doe" and "otto", with an aspirated /t/,
voiced /d/ and a flap respectively.
Figure 9. Spectrograms of "a toe", "a doe" and "otto".
Note that for both proper plosives, there's a longish period of relative silence (with a voicing bar in the case of /d/), on the order of
100 ms. The actual length varies a lot, but notice how short the 'closure' of the flapped case is in comparison. It's just a slight
'interruption' of the normal flow, a momentary thing, not something that looks very forceful or controlled. It doesn't even really
have any transitions of its own. The interruption is something on the order of three pulses long, between 10 and 30 ms. That's
basically the biggest thing. Sometimes they're longer, sometimes they're voiceless (occasionally even aspirated), but basically a
flap will always be significantly shorter than a corresponding plosive.
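The duration argument can be sketched in a few lines. The 10-30 ms and roughly 100 ms figures come from the paragraph above; the 50 ms boundary between them, the function names and the 120 Hz pitch in the example are assumptions made up for illustration.

```python
def pulses_to_ms(n_pulses, f0_hz):
    """Duration in ms spanned by `n_pulses` glottal pulses at pitch `f0_hz`."""
    return 1000.0 * n_pulses / f0_hz


def classify_closure(duration_ms, boundary_ms=50.0):
    """Label a closure 'flap-like' or 'plosive-like' by duration alone.

    `boundary_ms` is an ASSUMED cut-off placed between the 10-30 ms flaps
    and the roughly 100 ms plosive closures described above.
    """
    return "flap-like" if duration_ms < boundary_ms else "plosive-like"


print(pulses_to_ms(3, 120))      # 25.0
print(classify_closure(25.0))    # flap-like
print(classify_closure(100.0))   # plosive-like
```

Three pulses at an assumed 120 Hz come to 25 ms, which sits inside the 10-30 ms range given for the flap. Duration alone will misfire on the longer or voiceless flaps mentioned above, so treat it as a first pass.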
Okay, so let's turn back to the proper plosives. Notice the aspiration following the /t/, and the short VOT following the /d/. Note the
dying-off voicing during the /d/ closure, presumably due to a build up of supralaryngeal pressure. (Frankly, we're lucky to get any
real voicing during the closure at all.)
(Other big allophonic categories I want to cover are nasalized vowels and rhoticized vowels, but I'm wondering how important
they are at this level. Remember that this is a primer, not the be-all and end-all work on spectrogram reading. Also worth doing is
some prosodic stuff, pitch and duration, amplitude and that kind of thing, as it relates to finding word and phrase boundaries in
spectrogram reading. Comments?)
Is that it?
Well, obviously not. But it should be enough to get you started reading the monthly mystery spectrogram. We could go on and on
about various things, but that's not the point right now. Remember, identify the features you can, try to guess some words,
hypothesize, and then see if you can use your hypotheses to fill in some of the features you're unsure about. Do some lexical
access, try some phrases, and see how well you do. Reading spectrograms, like transcription and so many other things, can be
taught in a short time, but takes a long time and experience to learn. But then that's why we're here, right?
Last modified: 11/19/2009 03:33:28

Dept. of Linguistics, University of Manitoba, Winnipeg, Manitoba, CANADA R3T 5V5
Solution for March 2003
"Some find shiny things here."
This month's high-pedagogy mode spectrogram is heavy with fricatives (hence the extended frequency scale) and nasals. So pay
attention. Things aren't going to stay this "easy" for long...
[s], IPA 132

Lower-case S
So from about 75 msec to 200 msec, there's a voiceless fricative. Note the absence of any voicing striations at the bottom, and
the 'snowy' 'random' noise. The noise seems to be composed of a single wide, wide band, with no (or at least very little) formant-
like 'shaping'. The band seems to be centered up above the top of this spectrogram, which goes up to about 8000 Hz. If we see
this as a single huge band, centered up there somewhere, then what we're seeing is the bottom half of the bell curve--the greatest
energy near the putative center, and falling away fairly quickly, but with a long tail, all the way down to the very low frequencies.
Noise that loud (at its center) is characteristic of sibilants. So compare the energy here (and also in the segment from 750-900
msec, and even 1500-1600 msec) with the noise in the 400-500 msec segment, and the one around 1200 msec. That's sibilance,
folks. Very high frequency, and very high amplitude, noise. Centered up above 8000 like this is characteristic of alveolars, so this
is [s].
[ɐ], IPA 324

Turned A
From 200 to not quite 300 msecs, we've got voicing (note the 'voicing bar' between 100-200 Hz or so, consistent with my
probable F0/fundamental/first harmonic). The F1 is up around 800 Hz, the F2 just above it around 1200 Hz, and the F3, hmm, I
guess around 3700 Hz. But it's kind of fuzzy. Luckily for vowels we don't usually consider F3.... So an F1 that high must be a
relatively low vowel. An F2 that low should be back (or round), and I recall this was backer than I usually produce this vowel.
Probably should have transcribed it as back, but I was in a hurry. This one is closer to my (now) preferred phonemic transcription
for this vowel, which you'll need to find this word before you'll understand. So it's down there and not at all front. That's the main
thing.
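The F1/F2 reasoning used here (high F1 means a low vowel, low F2 means back or round) can be sketched as a small lookup. Every threshold below is an assumed round number chosen for illustration, not a value from this page, and a real voice would need its own calibration. The function name is hypothetical.

```python
def describe_vowel(f1_hz, f2_hz):
    """Return rough (height, backness) labels from F1 and F2 in Hz.

    All four thresholds are ASSUMED round numbers for illustration;
    they are not taken from the text and will not suit every voice.
    """
    # High F1 means a low vowel, and vice versa.
    if f1_hz >= 700:
        height = "low"
    elif f1_hz <= 400:
        height = "high"
    else:
        height = "mid"
    # High F2 means a front vowel; low F2 means back (or round).
    if f2_hz >= 1800:
        backness = "front"
    elif f2_hz <= 1300:
        backness = "back"
    else:
        backness = "central"
    return height, backness


print(describe_vowel(800, 1200))  # ('low', 'back')
```

Feeding in the F1 of about 800 Hz and F2 of about 1200 Hz read off above gives "low" and "back", in line with "down there and not at all front".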
[m], IPA 114

Lower-case M
Then at 300 msec the amplitude suddenly dies off. Where F1 was, suddenly there's nothing, the thing that looks like it used to be
the F2 is now weaker, and all the way up there's just less energy. Typical of nasals (if you've been through any acoustic
phonetics, you know that side cavities suck energy out of spectra--as antiresonances--rather than adding energy to it). So this
almost has to be a nasal. Note the transitions in the previous vowel. F4 falls sharply; F3 isn't really doing much, or if it is, it's
interrupted by the zero. F2 starts out low and if anything falls. F1 always falls into closure, so that's not really indicative of
anything. So we have mostly falling formants, usually correlated with bilabiality. And if you know my voice, you know my nasal
pole (the formant above the lowest zero) is usually around 1000 Hz, where it's closer to 1300-1500 Hz for an alveolar. So
everything points to bilabial, or at least away from anything else.
[f], IPA 128

Lower-case F
Another fricative. This one is much, much weaker overall than the sibilant (it's hard to tell that from the 'below the dotted line'
frequencies, unless you have a lot of experience with this sort of thing, but then that's why I've provided 'above-the-dotted-line'
frequencies in this spectrogram). If you look, it's not obvious it's stronger anywhere up above 8000 Hz. While the noise in the [s]
was distributed in sort of a curve, this noise is sort of flat across most frequencies. There's some shaping into formants in F4, I
guess, but for the most part, this looks like a non-sibilant fricative that doesn't have any formant 'shaping' to it. That suggests it's
produced in front of any useful resonating cavities, so if this is English, it's probably labiodental or (inter)dental. The F4 seems to
be falling, which is sort of unaccountable. F3 might be flat. But notice that the F2 starts, if anything, below 1000 Hz and rises. Now
it rises sharply throughout the following vowel, but there's nothing like anything but a labial transition into the following vowel. So
this might be a labial. I suppose it could be something else, but labial is probably the best guess for now.
[ɑɪ], IPA 305 + 319

Script A + Small Capital I
So, abstracting away from the first 25-30 msec following the 500 msec mark (which I take to be mostly transitional), we've got an
F1 that reaches its steady state (if you believe in steady states, or its maximum if you don't) at about 800-900 Hz. So it starts very
low, but starts to transition towards the higher space (lower F1 frequency--try to keep 'vowel height' and 'formant frequency'
straight in your heads at all times in these discussions) in the last half of the vowel (from before 600 msec to the vowel end at
about 650 msec). When the F1 is 'steady' at its maximum, F2 is transitioning, but is still quite low, so at the beginning of this
vowel, it's pretty low and quite back. The F2 rises up to almost 1900 Hz and then suddenly transitions sharply down (to about
1750 Hz) in the last 25-30 msec of the vowel, which again I take to be transitional. So where the F2 reaches its maximum, near
650 msec, the F1 is around 500 Hz, so sort of mid. Note the F1 is getting fuzzy in the second half of this vowel. This will be
important later. So this is a diphthong. The nucleus is low and back, and the offglide is toward the high front (as opposed to the
higher-back) space. The transcription reflects the 'reality' of the nucleus, but only the 'direction' of the offglide, which is sort of
combining transcription conventions. So I'm explaining it here.
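The direction-of-offglide reasoning can be sketched by comparing two formant measurements, one at the nucleus and one at the offglide. The function name and the 100 Hz "steady" threshold are assumptions. In the example, the F1 values and the offglide F2 are the rough figures read off above, while the nucleus F2 of 1200 Hz is an assumed stand-in for "quite low".

```python
def describe_glide(start, end, min_change_hz=100.0):
    """Describe formant movement between two (f1_hz, f2_hz) measurements.

    `min_change_hz` is an ASSUMED threshold below which a formant is
    treated as steady.
    """
    d_f1 = end[0] - start[0]
    d_f2 = end[1] - start[1]
    labels = []
    if d_f1 <= -min_change_hz:
        labels.append("raising")           # F1 falls, vowel gets higher
    elif d_f1 >= min_change_hz:
        labels.append("lowering")          # F1 rises, vowel gets lower
    if d_f2 >= min_change_hz:
        labels.append("fronting")          # F2 rises
    elif d_f2 <= -min_change_hz:
        labels.append("backing/rounding")  # F2 falls
    return labels or ["steady"]


# Nucleus (F1, F2) then offglide (F1, F2), in Hz.
print(describe_glide((850, 1200), (500, 1900)))  # ['raising', 'fronting']
```

A falling F2 with a falling F1 would instead come out as "raising" plus "backing/rounding", which is the other English low-nucleus diphthong pattern.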
[nᵈ], IPA 116 + 104

Lower-case N + Right Superscript D
Again, making up symbols on the fly will require explanation. But work with me. We've got another nasal here. See the sudden
drop of amplitude? See the zero? See how there's no energy right at 1000 Hz, but some up at 1500 Hz or so? Must be alveolar.
This is consistent with the transitional information. Even though the F2 is heading down, it's pointed at 1750 Hz or so, which is
generally the 'locus' (if you believe in loci) of alveolar transitions. Somewhere in there. If this were a velar, there'd be more
evidence of 'velar pinch' in the approaching transitions, and a bilabial would have a sharper fall (one presumes) in F2, and
something like falling transitions in the upper formants. So everything points to an alveolar nasal. The fuzziness in F1 in the
previous vowel I noted before is a sign of nasalization on the vowel. In spectrograms, I rarely mark contextual nasalization of
vowels, unless a) it's really, really obvious (with creeping zeroes and whatnot) and/or b) the following nasal stop isn't obvious.
This is, so English phonology being what it is, the vowel must nasalize. Compare my decision to mark rhoticity later. The 'right
superscript d' diacritic is my ad hoc way of marking oral plosion. There's no real 'oral stop' phase to this, unless you count the last
10-15 msec or something right before the onset of the noise. If it ain't long enough to segment out, I'm not wasting a lot of time
trying to. So I've just marked an oral release rather than a separate segment. Take that for whatever it's worth, which I don't
expect is much. Anyway, this is how homorganic nasal-stop coda clusters seem to look in my voice.
[ʃ], IPA 134

Esh
Now this is obviously another fricative. But while the initial [s] in this spectrogram was 'tilted' toward the high frequencies, this one
is much flatter. Still very broad band, and very high amplitude, so we're still talking sibilant. Which pretty much just leaves [ʃ], but
let's suppose we didn't have an [s] to compare to. We could still identify this, a) because it isn't tilted toward the high frequencies,
b) it's much stronger in the F2/F3 region than a typical [s], and c) just below the F2 region, the amplitude suddenly drops off really
sharply. All of which point to [ʃ]. The noise at the bottom is pretty noisy. It's not striated into (fairly) clean vertical pulses. So this is
voiceless.
[ɑɪ], IPA 305 + 319

Well, it's back to formants for this stretch between 900-1000 msec. F1 is kind of fuzzy, but it seems to be centered between 750
and 800 Hz. F2 is also kind of fuzzy, but it seems to start around 1300 Hz. At about 950 msec, the fuzziness starts to leave off,
slightly, and the F1 may be dropping slightly. F2 is clearly rising. So we've got a vowel that moves from fairly low and more back
than central to something slightly higher and definitely front. I really only have two fronting diphthongs (under normal
circumstances) and only one starts anything like low.
[n], IPA 116

Lower-case N
So here's our next nasal. It's really short, just about 50 msec, but oh well. See the zero? See what would have been the F1 die?
Now, where's the pole? Well, it's not at 1000 Hz. It's not at 1500 Hz. There's definitely something at 2500 Hz or so, but that's not
what we're looking for. So our pole is weak, and we need other cues. So let's look at the transitions. The F2 transition in particular
seems to point down into the closure, but to somewhere around 1700-1800 Hz. Alveolar locus. So that's our best cue. And the
shortness is sort of consistent with that--it's verging on flapping. TMSAISTI.
[i], IPA 301

Lower-case I
F1 is fairly low, certainly lower than we've seen anywhere else, so we're dealing with a fairly high vowel. F2 is way the heck up
above 2200 Hz, which is about as high as I've ever seen my F2. Which tells us this is massively front. So we've got a high,
outrageously front vowel.
[θ], IPA 130
Theta
Another fricative, probably voiceless, from just before 1200 msec and lasting about 100 msec. The noise in the lower (normally
visible) frequencies is very light. The noise that we can see is tilted to the high frequencies, but even up there it's not loud enough
to be sibilant. So this ain't sibilant. It isn't shaped like a vowel (i.e. with noise running through the vocal tract and filtered by
resonances) so it ain't [h]. So that leaves the labiodental and interdental again. So now it's time to compare the transitions with
the previous non-sibilant. The F2 in the preceding vowel is too high to do anything but fall but the F2 coming out of the fricative is
just flat. And it's even near the alveolar locus. Now look at that low F2 coming out (but rising) of the [f]. It definitely starts lower
than it 'needs' to. Where this one has room to start lower if it wanted to. So it doesn't. Probably interdental.
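The fricative reasoning in this solution boils down to a short chain of yes/no judgments, which can be sketched as below. This is an illustration of the decision order only: the function name and argument names are hypothetical, each argument is something you decide by eye, and real spectrograms will often refuse to cooperate.

```python
# Voiced counterparts of the voiceless English fricatives.
VOICED = {"s": "z", "ʃ": "ʒ", "f": "v", "θ": "ð", "h": "ɦ"}


def guess_fricative(loud, high_tilt, formant_shaped, labial_transitions,
                    voiced=False):
    """Walk the decision steps described above and return an IPA symbol.

    Every argument is a yes/no judgment made by eye from the spectrogram.
    """
    if loud:
        # Loud enough to be a sibilant: tilt separates [s] from [ʃ].
        base = "s" if high_tilt else "ʃ"
    elif formant_shaped:
        # Weak noise shaped by the vowel resonances looks like [h].
        base = "h"
    else:
        # Weak, unshaped noise: fall back on the transitions.
        base = "f" if labial_transitions else "θ"
    return VOICED[base] if voiced else base


# Weak noise, tilted high, no formant shaping, no labial transitions.
print(guess_fricative(False, True, False, False))  # θ
```

The order matters: loudness is checked first because a sibilant's amplitude is the most reliable cue here, and the labiodental versus interdental call is left for last because it rests on the weakest evidence.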
[ɪ], IPA 319

Small Capital I
Now here's a vowel. F1 is hard to read, but it's definitely not very high. And it's not as low as the previous vowel. So we're talking
high, but not highest. F2 is quite high, but under 2000 Hz, so it's very front, but not nearly as front as the previous vowel. So if that
was [i], we need to find something not quite as high and not quite as front. Could be [e], but, well, it's not.
[ŋ], IPA 119

Eng
So we've got another nasal. This one is quite long, and definitely doesn't have a pole at 1000 or 1500 Hz. It also has a pole
above 2000 Hz, like the last one, but it's looking like they all do. So all we have going for us is that pinchy transition into it. Look at
that F2 and F3 come together! Doesn't get any pinchier than that.
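The transition cues used for the three nasals in this solution can be sketched as a rule of thumb on F2 and F3 at the edge of the closure. Only the alveolar locus of about 1750 Hz comes from the discussion above. The tolerance, the pinch gap, the 1500 Hz labial ceiling and the function name are all assumptions for illustration, and the sketch is meant for stops and nasals only (a low F3 from [ɹ] would fool it).

```python
def guess_place(f2_edge_hz, f3_edge_hz,
                alveolar_locus_hz=1750.0, locus_tolerance_hz=150.0,
                pinch_gap_hz=300.0, labial_ceiling_hz=1500.0):
    """Guess stop/nasal place from F2 and F3 at the edge of the closure.

    Apart from the alveolar locus, every default is an ASSUMED value.
    """
    if f3_edge_hz - f2_edge_hz <= pinch_gap_hz:
        return "velar"      # F2 and F3 pinching together
    if abs(f2_edge_hz - alveolar_locus_hz) <= locus_tolerance_hz:
        return "alveolar"   # F2 pointing at the alveolar locus
    if f2_edge_hz < labial_ceiling_hz:
        return "labial"     # F2 heading low
    return "unclear"


print(guess_place(1750, 2600))  # alveolar
print(guess_place(1300, 2400))  # labial
print(guess_place(2100, 2300))  # velar
```

The pinch test comes first because a close F2 and F3 is the most distinctive pattern of the three; anything that matches none of the rules is reported as "unclear" rather than forced into a category.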
[z], IPA 133

Lower-case Z
But then there's a discontinuity. An alveolar looking pole kind of comes in just before 1500 msec. The voicing also kind of dies
away for a bit, but doesn't go away completely. It's atypical, but the nasal resonance changes around this moment too, which
suggests something odd is happening with the coordination of my soft palate raising and the alveolar closure. Which isn't
technically a closure since we're dealing with a fricative. I've tried slowing down my [ŋz] sequences and I do get some nasality
over the frication. As improbable as that sounds. Anyway, it's good that I increased the visible frequencies for this spectrogram,
so you can see what's really going on with the noise. The noise is quite high amplitude, but mostly at the highest frequencies.
That is, it's obviously sibilant, tilted to the high frequencies, but less loud than the initial [s], so the noise dies off in the 'normally
visible' frequencies much faster than for the voiceless [s].
[h], IPA 146

Lower-case H
The voicing never quite dies off, but since there's no evidence of striated formant stuff, I decided not to bother transcribing this as
voiced. This in spite of the apparent resonances that creep in really early. It's nice though, because you can see the transitions
into the high, front tongue position, but excited by noise rather than voicing.
[i˞], IPA 301 + 419

Lower-case I + Rhoticity Sign
I don't usually mark rhoticity on vowels, since Keating et al. (1994) suggested it was entirely redundant. But I needed something
to indicate/account for the movement in the fully voiced section of the vowel. So actually the clear high, front position is hit really
early, during the noise, and by the time full voicing and resonance kicks in (where I've marked the 'beginning' of the vowel) the
F3 is already transitioning to its final position (more below) dragging the F2 along with it. So even though it's moving, it was still
clearly an [i] target. In the center of the vowel the F2 is closer to [ɪ]. But this is a rhotacized [i], not an allophonic [ɪ] selected before
an [ɹ]. TMSAISTI.
[ɹ], IPA 151

Turned R
So following the transitions, that blob around 1750 Hz is both F2 and F3. An F3 that low can only be a [ɹ]. Nuff said?
So remember how these fricatives really look, so that when you can only see them up to 4000 Hz you can hypothesize how they
are supposed to look.
Last modified: 11/08/2009 22:57:48

"They found us a side table."
Eth + Raising Sign

[ð̝], IPA 131 + 429
Well, there are those pulses at 100msec, then not much, then some fairly significant voicing and a little noise in the resonances
approaching 200 msec. So there you go. Sort of typical of my initial Eths, but whatever. If you didn't know that, you'd have to fake
it somehow. Well, there's something voiced going on. And it's a little noisy, and doesn't look at all plosive. So a tightened fricative
that's voiced is probably a good bet. Can't be sibilant (wrong energy pattern, and wrong pattern of strengthening) so that leaves
relatively few choices. /h/ doesn't go gappy like this, at least not usually. And the transitions don't look labial.
Lower-Case E + Small Capital I

[eɪ], IPA 302 + 319
Well, we've got some vowel going on here, at least for about 75 msec starting at about 200 msec. The voicing goes on for a
while after that, but it's definitely fricative above the F1, so I discount that as vowel. The F1 is fairly mid, possibly falling. The F2 is
high and moving higher. The F3 is just sitting there. So we have a mid-ish, front vowel, possibly moving fronter and higher.
Lower-Case F
[f], IPA 128
So here's a fricative. It has very little in the way of resonance-y-looking organization. It *may* be sibilant--it has vaguely the same
profile as the following fricatives (cent(e)red around 700 and 900 msec). But you may notice that it isn't quite as strong in the high
frequencies as the ones that follow. So if one of them isn't a sibilant, it would be this one. And if it isn't, this one looks more labial
than the others. After all, all three formants rise (to different degrees) starting with the voicing onset around 350 or 375 msecs. So
this seems to have labial transitions out. It's strong, I have to say, for a labiodental fricative. But there you go.
[aʊ], IPA 304 + 321
So, abstracting away from the transitions (so starting around 400 msec or so), we've got a quite high F1, a middling F2 and an F3
we don't expect to tell us much. There is a funny discontinuity in the amplitude of the striations above about 2000 Hz, which might
tell us something, but since the overall amplitude seems fairly constant until 525 msec or so, it looks like this is a classic,
vocalic-throughout, diphthong. So after that amplitude change, the F1 seems to be falling. The F2 is decidedly low. So this vowel starts
rather low and vaguely central, and moves (sort of) up and definitely backer and/or rounder.
Lower-Case N
[n], IPA 116
Starting from 525 msec or so there's something voiced, but of reduced overall intensity compared to the preceding vowel. It's
clearly got periodic energy in the resonances, mostly separated by zeroes. So this is almost definitely a nasal. Now here's where
things fall apart. The first pole above the voicing bar seems to be about 1100 Hz. This is not where I expect any of my nasal
resonances to be. If it were at 1000 Hz, it would look labial. If it were about 1500 Hz, it would look coronal. There's no hint of velar
pinch so don't even go there. So this looks closest to being labial. But as it turns out, it's not. On the other hand, the immediately
preceding vowel is round (or at least the offglide of the diphthong is probably round) which may be pulling down the locus of the
transition. Why it should have any effect on the nasal pole I have no idea, and I doubt that it does. So exactly why this looks like
this is beyond me. If you think it's an [m], try to find a word that fits. See?
Lower-Case D
[d], IPA 104
I'd like to talk to someone about these nasal-plosive sequences. They always look like this, and it seems to me we have to start
transcribing such sequences as orally-released nasals or something. (I think Keating et al, 1994, used the distinction between
plosive closure-durations and plosive releases to do this sort of thing.) Anyway, there's this distinct oral release at just about 600
msec. It's quite sharp, with limited transitional information in the following vowel. So it's probably the same place as the preceding
nasal, just on general principle, and the absence of useful transitions is weakly suggestive of something coronal.
Schwa
[ə], IPA 322
Basically, we've got a mid-looking F1, possibly just low of mid, so this is a mid or higher-mid vowel. The F2 is pretty much central.
The F3 is as neutral as it gets. So this is pretty classically schwa-looking.
Lower-Case S
[s], IPA 132
Ignoring whatever we thought of the earlier fricative consonant (the one that turns out to be [f]), this one is fairly clearly an [s]. It's
very broad band, having energy at all visible frequencies. The greatest energy is in the highest frequencies, at least up around
4000 Hz if not higher. Classic [s]-shaped sibilant noise.
Schwa
[ə], IPA 322
This vowel looks just like the previous vowel, except it's a little shorter and possibly a little weaker. So this one is definitely a
schwa.
Lower-Case S
[s], IPA 132
And another [s]. This one is longer though. So it's either the end of a word or phrase and undergoing final lengthening (but then
that schwa might be a little longer too--in English, final lengthening applies to the entire final syllable or at least rime) or this is the
beginning of a phrase and lengthened due to strengthening. Or it could be a geminate. Hmm.
Lower-Case A + Small Capital I

[aɪ], IPA 304 + 319
Well, again abstracting away from the transitions, the steady part of this vowel has a very high F1, a lower-than-central F2 (so this
vowel may be slightly back), and nothing much going on in F3. So it starts low and backish. After 1100 msec, the F1 transitions
down, so the vowel moves from low to higher, and the F2 transitions sharply up, so the vowel trends sharply frontward. So we've
got another diphthong here, this one going the other way. (I tried to work an [oi] into this one, but just couldn't. Maybe another
time.)
Lower-Case D
[d], IPA 104
Longish gap. The F1 transitions down, but that's consistent both with the raising of the offglide of the diphthong and the
approaching closure. The F2 however turns sharply in the last couple of glottal pulses, and starts to head downward. The F3
doesn't do much. Bilabial or velar transitions, in the best case, would pull F3 down a little, so the transitions are most consistent
with a coronal closure. The gap is at least 150 msec long, which is fairly long, and the voicing lasts close to 100 msec into the
closure. That's too long (and too strong) to just be perseverative voicing, so this stop, at least at the beginning, has to be
underlyingly voiced.
Lower-Case T + Superscript Lower-Case H

[tʰ], IPA 103 + 404
On the other hand, the release is almost definitely voiceless. The release is sharp and noisy, and [s]-shaped, if you see what I
mean, which suggests that this side of the gap must be coronal. The noise persists for close to 75-100 msec--note that from
release to the obvious periodicity in F2 is a loooong time. So even though the voicing technically kicks in fairly early, this is pretty
definitely aspirated. So this side has to be underlyingly voiceless (and syllable-initial), which tells us we're definitely dealing with a
second plosive here.
Lower-Case E
[e], IPA 302
So the vowel has a nice flat F1, in the mid range. The F2 is slightly rising, but very high throughout, so this is front and getting
fronter. The F3 is trending down throughout this vowel, but not sharply enough to really mean anything. So we're looking at a
mid-ish or slightly higher vowel, with a very front (and possibly getting fronter) tongue position. So again this looks like an [e].
Whether this one is 'as diphthongal' as the first vowel in this sentence is something I'm looking at finding an answer to. But it looks
less diphthongal than the other one to me, so I transcribed it as a monophthong.
Lower-Case B
[b], IPA 102
Gap. Voiced throughout. The F1 transitions don't tell us much. The F2 transition in the preceding vowel is definitely falling to below
1500 Hz, which is characteristic of labial transitions. The falling trend in F3 might be due to the labialization as well. Someone will
have to look into that.
Lower-Case L + Syllabicity Mark

[l̩], IPA 155 + 431
Well, this is interesting. It looks longish and the lower frequencies are quite strong, but above about 1200 Hz the energy dies off
almost completely. So we're dealing with something with less overall energy than your typical vowel. Although at the end of an
utterance my energy and pitch fall away sort of sharply. So what do we know about this thing. It's either a vowel or an
approximant--nasals would have a better-formed zero below 1000 Hz, I guess, and any real obstruent wouldn't look so sonorant.
The F1, if it's anywhere, is in the mid range, just a hair higher than the previous vowel, and pretty much dead-on at 500 Hz. The
F2 is very low, starting at around 1000 Hz and trending downward. So if it's a vowel, we're dealing with something very back and/or
round. So this could be [o]. Which frankly would have been my guess if I'd just been reading this with absolutely no inside
information. Fortunately, I do have other information, which leads me to reconsider the frequency of the F3. It's pretty neutral. The
F4 (if that's what that blip at about 3600 Hz is) is a little high. So if it's not an [o], it's an approximant. It can't be [r] or [j]. Which
leaves [w] or [l]. Dark [ɫ] as it turns out. And [bw] is not what I would call a conspicuous possibility as an utterance final sequence.
But [bu]/[bo] is still possible. This is where sounding out the possibilities would be useful. But there's a discussion point to be
made here, about the relationship of final syllabic dark [l] and dialects and accents that vocalize dark [l], especially finally. So
discuss.

Winnipeg, Manitoba
CANADA R3T 5V5
"I tried to fix the window."

[ɑɪ], IPA 305 + 319
This utterance might start properly with a glottal stop. This would explain the funny pulsing in the beginning. For
some reason, I decided that this wasn't a glottal stop at the beginning. As I recall, I was really trying to do a soft attack
without it being breathy, and obviously I failed. Abstracting away from the raggedness of the pulsing, we can see that F1
starts very high (800-900 Hz) and transitions downward (so this vowel starts very low and transitions upward), and that
the F2 starts very low (down around 1100-1200 Hz) and transitions upward. So we've got something that starts lowish
and backish and moves frontish and highish. In other words, a diphthong with a front off-glide, presumably therefore
either /aj/ or /oj/. I'll have to make an /oj/ sometime to contrast the two, but since this is an English declarative, I
think it's likely to start with "I".
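The reasoning above (F1 falls as the vowel gets higher, F2 rises as it gets fronter) can be put into a few lines of Python. This is only a rough sketch: the function name, the 100 Hz threshold, and the end-point values in the usage line are illustrative assumptions, not measurements from the spectrogram. Only the starting values (F1 around 850 Hz, F2 around 1150 Hz) come from the description above.

```python
def describe_glide(f1_start, f1_end, f2_start, f2_end, min_change_hz=100.0):
    """Describe vowel movement from start/end formant values (Hz).

    F1 is inversely related to vowel height; F2 tracks frontness
    (a falling F2 may mean backing and/or rounding).
    min_change_hz is an arbitrary illustrative threshold.
    """
    moves = []
    d1 = f1_end - f1_start
    d2 = f2_end - f2_start
    if d1 <= -min_change_hz:
        moves.append("raising")    # F1 falls, so the vowel gets higher
    elif d1 >= min_change_hz:
        moves.append("lowering")   # F1 rises, so the vowel gets lower
    if d2 >= min_change_hz:
        moves.append("fronting")   # F2 rises
    elif d2 <= -min_change_hz:
        moves.append("backing")    # F2 falls (backer and/or rounder)
    return moves or ["steady"]


# Start values from the text; end values are hypothetical.
print(describe_glide(850, 450, 1150, 2000))
```

A low, backish start that raises and fronts is the pattern read here as a diphthong with a front off-glide.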

[tʰ], IPA 103 + 404
Well, this is as gappy as my gaps ever get. So this is undoubtedly (or indubitably) a plosive. And look at that release.
Very sharp and instantaneous (I'm looking right at 300 msec), followed by some very high energy noise concentrated in
the high frequencies (i.e. sibilant /s/-shaped). The VOT continues for a while, in fact close to 100 msec after the
release, which clearly indicates aspiration. The shape of the first 20-30 msec of [s]-shaped noise suggests coronal.
Voiceless, of course.
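For anyone who wants the VOT argument spelled out, here is a minimal sketch. The release time (300 msec) and the roughly 100 msec lag are from the description above; the 40 msec boundary between short-lag and long-lag is an assumed, illustrative cutoff, and the function name is made up for this example.

```python
def classify_vot(release_ms, voicing_onset_ms, long_lag_from_ms=40.0):
    """Return (VOT in ms, rough category) for a plosive.

    long_lag_from_ms is an assumed illustrative boundary, not a
    value taken from the text.
    """
    vot = voicing_onset_ms - release_ms
    if vot < 0:
        label = "prevoiced"    # voicing starts before the release
    elif vot < long_lag_from_ms:
        label = "short-lag"    # voiceless unaspirated, or devoiced
    else:
        label = "long-lag"     # aspirated
    return vot, label


# Release at about 300 msec, voicing at about 400 msec.
print(classify_vot(300, 400))
```

With a lag that long, the exact cutoff hardly matters: this plosive lands well inside the aspirated range.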
Turned R + Under-Ring
[ɹ̥], IPA 151 + 402
Well, the F3 is sort of obscured, as are, frankly, the F1 and F2, by the aspiration noise. But judging from the transitions,
the F1 starts high, the F2 starts low, and the F3 looks like it starts very low and rapidly transitions upward to almost
neutral. Which for me is about 2400-2500 Hz. I'm looking at that curved bit of energy just surrounding the first few
glottal pulses around 400 msec, which start about 1750 Hz and rise sharply from there. Such a low F3 can only be an /r/
in English. Marked as voiceless, due to the aspiration.

[ɑɪ], IPA 305 + 319
So, ignoring the F3, we've got basically the same formant patterns as before, except the F2 transition seems to be
displaced slightly to the right, i.e. the F2 'steady state' is actually a steady state, probably lasting about 25 msec into the
voicing, where in the previous vowel, the F2 seems to start slightly higher and to be already moving when voicing clicks
in. This may be a function of lengthening preceding a voiced coda (wink wink), but I'm working on a hypothesis about
'transition points', i.e. moments where you go from steady-state to transition... Which I mention
only because it's interesting that the result of the longer duration of this vowel, is not that the transitions are slower,
but that the steady states are longer. Nuff sed.
Lower-case D
[d], IPA 102
Well, here's another gap, this one more obviously voiced. But it's very long to be a voiced stop, so, it may be two stops.
Considering the first (voiced) bit, the voicing clearly persists into the closure, further suggesting underlying voicing (as
opposed to simply perseverative voicing). As for place, well, the transitions from the preceding vowel don't really look
velar, nor do they look bilabial. So coronal is probably a good bet. That and 'tried' is a better word after "I" than either
'trige' or 'tribe'. Sometimes your top-down look-ahead really is your best advantage.

[tʰ], IPA 103 + 404
Well, here we are again. The release has many of the same characteristics as the previous release like this, and the
transitions out (when not obscured by the aspiration) are more likely to be coronal than anything else.
Barred I
[ɨ], IPA 317
Somewhere around 650 msec there's some voicing starting that lasts, well, about 75 msec. Which is quite short for a
vowel. And it doesn't really get very resonant. There's some F2 harmonic energy, but above that it's mostly noise. Noise
that's mostly constant from the release of the /t/. So this is a really short vowel, mostly being hidden by the
voicelessness and/or frication around it. So pick a reduced vowel symbol (as always I follow Keating et al (1994) in
choosing barred-i if the F2 is closer to the F3 than the F1) and get on with things.
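The barred-i rule of thumb is simple enough to write down. This is a sketch of the rule as stated here, not of anything in Keating et al. (1994) itself; the function name and the example formant values are hypothetical, and falling back to schwa when the rule doesn't apply is my reading of the usual practice in these solutions.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick a symbol for a reduced vowel from its formants (Hz).

    Barred-i if F2 is closer to F3 than to F1; otherwise schwa
    (the schwa fallback is an assumption for this sketch).
    """
    if abs(f3_hz - f2_hz) < abs(f2_hz - f1_hz):
        return "barred-i"
    return "schwa"


# Hypothetical values: a high F2 sitting near F3, then a central F2.
print(reduced_vowel_symbol(400, 1900, 2500))
print(reduced_vowel_symbol(500, 1200, 2500))
```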
Lower-Case F
[f], IPA 128
Well, if I didn't know better, I'd swear this is some kind of weak Esh. It has that obvious zero below 1500 Hz, and the
energy above that is contiguous with F2/F3 supported energy. With broad band. But if it were an Esh, and this long, I'd
*really* want to see more amplitude. I mean come on. So probably not Esh. Obviously not [s], so we're running out of
voiceless fricatives. /h/ doesn't usually go that voiceless or that long between vowels. Which leaves [f] and Theta. And
there's not a lot to tell us which. (I think this was in response to someone's question about the 'dental' fricatives in
English, and how to tell them apart. The answer was, well, if there isn't transitional information, you pretty much can't.
At least not consistently. At least in a way generalizable within and across speakers. As far as I know. Hmm. If I had my
dissertation to do over again, maybe I'd concentrate on fricative noise. Hmm.)
Small Capital I
[ɪ], IPA 319
Well, this is another short vowel, but since it at least looks like a real vowel, I decided to treat it as such. Lowish F1 (the
harmonics around 500 Hz seem stronger than immediately below, but that little thing can't be a formant--too narrow
band, so I take it to be the top edge of the F1), very high (but not outlandishly high) F2. So high and front. And short.
Pretty much leaves [ɪ].
Lower-Case K
[k], IPA 109
And the preceding vowel very nicely provides us with some very velar-pinch-y looking transitions. This coupled with
that nice low double-burst on the other side can only lead to 'velar' as a conclusion. And voiceless, of course.
Lower-Case S
[s], IPA 132
Well, if the other one couldn't be an Esh, I *really* don't see how this could be an [s], but there it is. The noise *does*
have that weird zero (although it's not really obvious that it has anything to do with the F2 being above it), and the
noise, such as it is, is *really* broad band. The only giveaway that this *might* be an [s] is that the noise gets a *little*
stronger and a *little* more organized in the very high frequencies. So maybe it's a devoiced /z/, except that it's way
too long. So frankly I'd guess whatever I guessed for the other one. And I'd be wrong. Hey, I never said I knew any more
about this stuff than you do.
Eth + Raising Sign
[ð̝], IPA 131 + 429
Well, okay. One day I'll produce a real, honest-to-goodness *fricative* dental. This one clearly isn't. Even though it's
fully voiced, there's no way to get that, well, *release* thing at 1100 msec without building up some fairly serious
pressure behind the closure. Which fricatives do, but not like that. So the answer to how you tell the difference between
Eth and [v], which no one asked, but here you go, is that the Eth is more likely to present like a stop, and the [v] is more
likely to present as an approximant. So there. Anyway, the very strong and even voicing (and no VOT) tell us this is
voiced, and there's not really any transitional information to tell us anything, and the burst is sort of misleading but
looks more coronal than anything else. So I might guess /d/ again. And having made it that far, I could top-down to Eth
once I'd worked out what the rest of the sentence looked like.
The spectrograms are getting *hard*.
Schwa
[ə], IPA 322
Well, here's a shortish vowel that's mostly transition. Judging from its starting frequencies, it looks highish and
central-to-ever-so-slightly-back (but that might just be the influence of the transition). F3 is slightly raised, which is alarming. F4
is not helping, since it's very obviously transitioning into the following deal. So I don't know. I called it a schwa, and
even if it isn't, it's not going to be overwhelmingly informative.
Lower Case W
[w], IPA 170
Well, look at that F1. Indicates a fairly high articulation. The dip in amplitude just after 1200 msec tells us that there's
a very minute change toward closure (though of course it never reaches closure), so even without the transitions, there's a
consonant-like moment here between what would otherwise be just a sequence of vowels. The F2, at that moment, is low
low low. So, oddly enough is the F4. So at least for this utterance, the Maeda model that coupled the F4 to the F2 looks
like it was right after all. The F3, on the other hand, is damn flat throughout. So, this has got to be [w].
Small Capital I
[ɪ], IPA 319
Ah, and here's another shortish vowel that I'd be inclined to ignore, except that I know what the answer is. Abstracting
away from the [w] transition, it looks fronter than the previous schwa thing, for what that's worth. But high. But short,
in spite of being high. So on the balance, I called it [ɪ], but it doesn't really ever get nearly front enough to warrant it.
Lower-Case N
[n], IPA 116
Voicing, longish, but not fully resonant like a vowel. But clearly sonorant. With a nice little zero around 1000 Hz. Must
be a nasal, and must be an [n], in that with those transitions into and out of it it can't possibly be an Eng, and my [m]s
have a *resonance* around 1000 Hz, with the zero lower (or higher, depending on how you think of what the zero is
doing).
Lower-Case D
[d], IPA 104
But then there's that momentary (like one glottal pulse) worth of dipped energy, and the following thing that looks like
an (oral) release transient. Which is how voiced stops following homorganic nasals tend to look in my voice.
Lower-Case O + Upsilon
[oʊ], IPA 307 + 321
Well, this is a vowel. With transitions, so it's probably some kind of diphthong. In another life I'd have analyzed this as
two separate segments, but now I'm not so sure. Ennyhoo, starting at the beginning, we've got a mid-looking F1 which
is transitioning upward, meaning this vowel is going from mid to low. Or something like that. The F2 starts, well,
central, or front of central, and transitions rapidly downward, indicating increased backness or rounding. F3 is just
sitting there, F4 is basically mirroring F2. Good strong voicing for a looong time, when it sort of peters out of the higher
frequencies until, by the time you get about 300 msecs in, you've just got a voicing bar. But the amplitude change is
very even. The clunk in the F3/F4 range at about 1575 msecs doesn't line up with anything else. The moment the F1/F2
finally give up to the noise doesn't really line up well with anything else. So this is very smooth. No specific consonant
moment here.
So we've got something mid going to low, and something central going to back. Which just can't happen in my dialect.
[əɑ] is just *not* a vowel where I come from. Especially not in an open syllable. But I don't know what else to say. Maybe
it's an artifact of the analysis. Some weird interaction between the bandwidth windowing and the fundamental
frequency. I don't know. So once again, go for the top down.
I tried tuhshuhksduhwunduhn
I tried to shucks dawinda
I tried to shucks the winda
Shucks the window? Fix the window. I tried to f*x the window.
Yeah, that'll work.


Solution for October/November 2008
"Let's try to find the ghost instead."
Inspired by the Scooby-Doo toy I got out of a gumball machine.
[ɫ], IPA 209

Tilde L (Dark L)
Well, you might have thought this was a nasal, but the noise in the upper formants probably should be a clue against
that. Otherwise, it does look nasaly--zeroes between weakish formants, relatively flat structure. So if it were a nasal, it
would be [m], probably, because the 'pole' looks like it's at about 1100 Hz at best. But it turns out not to be a nasal, as
mentioned earlier. F1 is very, very low, disappearing into the voicing bar, typical of very high vowels and approximants.
F2, that pole thing, is on the low end (though not low enough to be really back and round, as in a real back [u]). But back.
Now take a look at the F3. It's actually raised, to about 2800 Hz (or thereabouts), from a more reasonable 2500 or just
below (check out the longish vowels, especially that last one). That raised F3 is usually a good cue for laterality. I'll let
Carol Espy-Wilson and Frank Guenther try to explain why. So what we have here is a nice back (velarized, i.e. dark) /l/.
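As a rough illustration of how F3 sorts the English approximants in my voice, here's a small sketch. The neutral value (about 2500 Hz) and the raised value (about 2800 Hz) are from this description, and the very low 1750 Hz figure is from the /r/ in the previous solution. The 200 Hz margin and the function name are assumptions made up for the example, and none of this is meant to generalize across speakers.

```python
def approximant_from_f3(f3_hz, neutral_f3_hz=2500.0, margin_hz=200.0):
    """Guess 'r' or 'l' from F3 relative to a speaker's neutral F3.

    neutral_f3_hz is speaker-specific; margin_hz is an assumed,
    illustrative tolerance band around it.
    """
    if f3_hz < neutral_f3_hz - margin_hz:
        return "r"        # a very low F3 is the classic /r/ cue
    if f3_hz > neutral_f3_hz + margin_hz:
        return "l"        # a raised F3 is a good cue for laterality
    return "unclear"      # near neutral: look at F1, F2, transitions


print(approximant_from_f3(2800))  # the raised F3 described here
print(approximant_from_f3(1750))  # the low F3 from the earlier /r/
```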
[ɛ], IPA 303

Epsilon
So we have a vowel here, about 100 msec long. It's moving throughout, which is usually an indicator of schwa-iness, but
this is too strong to be unstressed. So if we just kind of average out the movement (and especially the lower starting
point of F2 and the raised starting point of F3) we have a mid-to-lowish kind of vowel (as indicated by the 'average' F1
of, um, let's say around 700 Hz, which is higher than 'neutral', but not 'really high'). F2 is around 1500-1600 Hz or so,
which is, in the scheme of things, 'not low'. So this is a centralish-to-frontish sort of midish-to-lowish vowel. Work with
me.
[t], IPA 103

Lower-case T
So, starting at about 225 msec or so, to just about 300 msec, we've got a gap. A silence. Which is pretty clearly telling us
that we've got a stop/plosive. (I'm intrigued by comparing the terms 'stop' and 'plosive', one referring primarily to the
closure, and the other to the quality of the release, but both referring to the same underlying 'object'. But anyway....) As
to place, the F1 transition into it seems to be rising, which is anomalous. F2 may also be rising slightly, which is a good
indicator of not-bilabialness. The F3 is coming down, which taken with the F2 might be velar pinch, but it's not really
'pinchy', and anyway the movement may still be transitional from the frequencies forced by the initial /l/. So go fig.
Alveolar is probably a better guess in general.
[s], IPA 132

Lower-case S
Fricative. This probably isn't just 'release' from the previous plosive, since it's not noticeably stronger at the 'release'.
Looks like a release into a fricative rather than a 'fricated release', if you follow. But anyway, no resonance shaping so
it's one broad band, apparently strongest off the top of the spectrogram (by which I only mean 'above 4000 Hz or so').
Classic [s].
[tʰ], IPA 103 + 404

Lower-case T + Right Superscript H
So there's another plosive here, and note the difference in the release--sharper transient and stronger/st noise in the
release rather than later. So this is probably aspirated (and strongly fricated)--being [s] shaped, the most likely
candidate is [t].
[ɹ̥], IPA 151 + 402

So even if you're not convinced of the segmentation, I invite you to look at the starting formants in the following vowel
(and as an exercise, the noise in the aspiration). The F1 seems to start very high, F2 very low. And now look at the F3.
Very low. Very, very low. Must be an [ɹ].
[ɑɪ], IPA 304 + 319

So as we were saying, the F1 starts high and drops. The F2 starts low and rises. So we've got a nice fronting
diphthong here.
[ɾ], IPA 124

Fish-Hook R
You may think this looks like a small, short, little stop. And it is. Classic definition of a flap. See Inouye (1989). Who, by
the way, was the first person to explain what a formant was to me. Thanks, Sue!
[ɨ], IPA 317

Barred I
Short little vowel, relatively weak and almost definitely unstressed. F2 is closer to F3 than F1, so following Keating et al.
(1994), it's a barred-i.
[f], IPA 128

Lower-case F
So here's another fricative. It's tilted to the high frequencies, but is really very weak compared to the obvious [s] earlier.
Note also the higher frequency noise isn't really well organized. There's also a lot of stuff down at the very low
frequencies that doesn't connect up (literally or logically) with that upper noise if it's a sibilant. So it probably ain't.
It's not [h]y looking and (it's obviously voiceless so) that leaves only two real possibilities--the labiodental and the
(inter)dental. Exactly how one tells the difference is not clear. The low F2 starting frequency in the following vowel
could be labial, reinforced by the rising F4, but the F3 is just sort of sitting there. But labiodental isn't quite 'bilabial' so
it isn't clear what to expect here. So probably [f], but it could go either way.
[ɑɪ], IPA 304 + 319

F1 starting at around 900 Hz and falling sharply. F2 starting at about 1200 Hz and rising. Falling diphthong (in
traditional terms 'falling from the syllable peak (i.e. peak sonority)'). What I would call the low, fronting diphthong.
Transcribed here with a starting script-a since it's clear the F1 and F2 starting frequencies are in that range, rather than
central.
[nd], IPA 116 + 104

Well, this is what I mean by something that looks like a nasal. Fully voiced, zeroes, flat resonances, and no noise. Pole up
around 1500Hz is usually an [n] in my voice. The 'oral release' is marked by the burst transient, but there isn't a lot in
the way of separate oral 'closure duration'. If you follow me. So I've been transcribing this as an orally released nasal,
although it's phonemically a sequence.
[ð̝], IPA 131 + 429

Eth + Raising Sign
This looks like a stop, and perhaps it is. I tried to convince myself there was meaningful noise in the lower frequencies,
but I don't think it's really meaningful at all. It's a little short to be an onset plosive, but then the following vowel is
definitely stressless, so that's a wash. I'd suggest the noisy release meant something but I'd be talking out of my *ss.
(Sorry, I don't usually work blue, but as I'm typing this it's very late and I'm very tired.)
[ɨ], IPA 317

Barred I
Short, obviously stressless, almost transitional, vowel. Moving on.
[k], IPA 109

Lower-case K
Well, there's actually a fair amount of voicing in the closure, so I probably should have transcribed this as voiced. Sorry.
The release is nicely doubled (double bursts are usually described as involving the longer (sagittally) closure of velars
and Bernoulli's force (which explains their occurrence in some dental/laminal releases) but I'm of the opinion that
velar releases are ubiquitously doubled because the release happens separately on both sides of the uvula). The rising F3
and falling F2 may be evidence of velar pinch, but the double release centered in the F2/F3 frequencies is the dead
giveaway of velarity here.
[oʊ], IPA 307 + 321

Lower-case O + Upsilon
Well, this is another diphthong. F1 isn't moving much at all but is comfortably mid-looking throughout. F2 starts sort of,
well, almost front, but moves sharply downward, i.e. back and round-ward. So this is what I would call a mid, backing
diphthong, a realization of /o/.
[s], IPA 132

Lower-case S
Another sibilant. Longer than the previous one, probably due to not being trapped in between plosives. Note the spectral
similarities.
[t], IPA 103

Lower-case T
On l'autre main, there's a funny gap in the noise here. It even has a release burst and a bit of a delay before the noise
starts up again. So this is tough to ignore, and we won't. Could be anything, but the release is too [s] shaped to be
anything except /t/.
[ɨ], IPA 317

Barred I
Another short little tiny vowel. Moving on.
[n], IPA 116

Lower-case N
Another nasal. Nice and flat. Weak little pole at 1500 again.
[s], IPA 132

Lower-case S
Do I have to keep saying it?
[t], IPA 103

Lower-case T
Yes, yes.
[ɛ], IPA 303

Epsilon
Ah, finally. F1 around 600 Hz, so a tiny bit lower than my /o/, but not 'low'. F2 well above 1500 Hz, but not absurdly
high. So this is mostly front and middish or lower-middish.
[d], IPA 104
Lower-case D
And lastly, a quite strongly voiced (even I couldn't ignore that voicing) sound, with absolutely nothing above the
voicing bar--must be a voiced plosive (or something else so close as to be functionally closed), with a sharpish release,
tilted (sort of) to the high frequencies. So sort of /t/ looking, but voiced.

Solution for February 2005
"We ate toast with jam."
I was thinking about intonation when I did this one, and in my memory I kept hearing Mary Beckman's voice intoning ToBI
sentences. Since I can't use proper names in these things (by rule set down long ago by Peter Ladefoged), I couldn't use my
favo(u)rite "Marianna" sentences, and anyway, since I wasn't doing intonation with this one it wasn't that important. But there are
a lot of 'jam' sentences as I recall. But we will be doing some intonation stuff later on this year.
Lower Case W
[w], IPA 170
Well, starting at about 75 msec or so, there's some voicing going on. You'll notice that there's not much in the way of energy
above 1000 Hz. And the F1, such as it is, is weaker than the F1 of the following vowel (or whatever that is). So we're looking at
something sonorant (voiced and open enough to have some serious periodicity to it), but not open enough to really be even a high
vowel. So this must be some kind of approximant, by traditional definition. The F1 is hard to see, but it's lowish, whatever it is,
which is consistent with a tighter-than-open constriction. The F2 is tough to make out, if in fact you can see it at all, but it's clear
the F2 transition into the following vowel starts around 900-1000 Hz or so. So it must be quite back and/or round. Probably and.
So how many back, round approximants can you think of?
Lower-Case I
[i], IPA 301
So abstracting away from the transition, this vowel thing has a low F1, lower than mid-range anyway, and an absurdly high F2 up
around 2300-2400 Hz. That's just freaking high. So we're dealing with something high and exceptionally front. Again, how many
such vowels can you think of? Good.
So at this point, knowing we've got an English sentence, we could probably make a good guess at the first word. Or at least the
first syllable. And further, if we feel like we have a word/syllable that could plausibly be the subject of a sentence (someday I'm
going to put a weird adverbial or something at the front of a sentence and derail this whole line of reasoning), we might guess that
the next bit has to be some kind of verb. Or we could be wrong about [wi] being a subject, or even about it being [wi]. But it's a
working hypothesis.
Lower-Case E
[e], IPA 302
I'm trying to be consistent about marking movement in vowels, but I'm not sure what I was thinking here. But we have something
here that is separate from the preceding vowel--there's a sharp change in frequency, as well as a sudden change in F1 frequency.
The F1 frequency is higher than the previous vowel, approaching the mid-range, but not quite. So this vowel is still quite high, but
not as high as [i]. The F2 is similarly not-quite-as-high as the preceding vowel, so this is not quite as front but still quite
radically front. So not as high or as front as [i], but still high or high-of-mid, and very front. Possibly moving back towards [i], at
least as we approach 400 msec or so. So possibly diphthongal. Sound familiar? If you're wondering about the height, check out
the height of /e/ in my 1997 JASA paper.
Glottal Stop
[ʔ], IPA 113
Well, as we approach 400 msec and a bit beyond, the periodicity, or the regularity of the voicing striations starts to fall apart. So
either there's a very abrupt and very extreme drop in F0, or there's some creak going on here. Creak in the sense of
glottalization. Glottalization as might result from a glottal stop. Hint hint.
Lower-Case T + Superscript Lower-Case H

[tʰ], IPA 103 + 404
Well, glottal stop aside, there's a longish gap here. 75 msec or so. Well, not quite, but long enough to probably be a plosive.
There's some indication, in all that glottality of falling transitions in the lower three formants, so we might be thinking bilabial. But
look at the release on the other side. Sharp release concentrated in the high frequencies. Strong noise, again concentrated in the
higher frequencies. And a longish VOT, 50-75 msecs again. But most of that is clearly aspiration with formants running through it
and everything. So let's look at that release. Almost nothing in the low frequencies. And no indication of bilabiality in the
transitions. And that noise in the high frequencies like it was a really short [s] or something. Hmm. Something with an [s]-shaped
release. Maybe an alveolar stop? Voiceless and aspirated, as it turns out.
[oʊ], IPA 307 + 321
Now this is a diphthong. The F1 starts a little high of the mid range and moves downward. So this starts a little lower than mid
and moves toward a high vowel. The F2 starts well, the F2, when the voicing kicks in at about 500 msec, is low of the mid range,
so this is sort of back and/or round, but the F2 again drops in frequency reaching a min at about 700 msec. So it's getting backer
and/or rounder. So middish to highish, and backish/roundish to backer/rounder.
Lower-Case S
[s], IPA 132
So this next bit is definitely voiceless (no periodicity, no striations, no low-frequency "voicing bar" energy). There's very little in the
way of formant structure (except possibly some in F2, almost definitely the front cavity), and fairly high amplitude noise in the very
high frequencies (very high at least in the sense of being at the top of the frequency range in this spectrogram, which goes up to
about 4400 Hz). So this is probably an [s].
Lower-Case T
[t], IPA 103
Well, here's another gap. This one is shorter than the previous one. It's voiceless, but it's hard to tell if it's aspirated. There's some
periodic looking things in the low frequencies that could be voicing. But during the closure it's voiceless. The release is sort of
sharp, but doesn't have a strong transient to it, suggesting that the closure was sort of weak without a lot of pressure building up
behind it. The release noise is concentrated in the F3 range and higher. The F2 is a little lower. So there's no obvious velar pinch
in the release. The noise is consistent with an alveolar, but not great. But as it turns out there's a reason for the noise to be a little
lower than in the previous [s] or [t]...
Lower Case W + Under-Ring

[w̥], IPA 170 + 402
So, you might have noticed that the vowel starts out lower, in the aspiration, or whatever that is, than in the voiced portion. For
something so weakly released, that apparent voicelessness/aspiration/absence of periodicity in the F2 and F3 range goes on for
an awfully long time. Maybe there's something else here to time the voicing (or lack thereof) with. Something that would
otherwise have a very low F2. Hmm.
Schwa
[ə], IPA 322
Well, there's a teeny tiny bit of real voicing in here, with formants and everything. Certainly a local sonority peak, worthy of being
called a vowel, but otherwise not worth worrying about. Schwa. Done.
Theta + Raising Sign

[θ̝], IPA 130 + 429
So there may be a little short gap before the fricative thing, but as it turns out that will be a red herring. So paying attention to the
noise, it looks noisy. If you were misled by the blip of energy at the very low frequencies which otherwise might be consistent with
voicing, you were misled. With that much energy down there, we should see definite striations, and given that there is formant-like
energy above, I'd expect it to look more periodic up there too. So this is just noisy and voiceless. There's some formanty stuff, and
it's not loud enough or broad-band enough to be a sibilant. It could be an [h], but then there should be more in the F1. With gaps
on both sides, it's not like there's a lot of transitional information, but if we look at the transitions, they don't look particularly velar
or bilabial. So it's some kind of front, and maybe coronal fricative.
Lower-Case D + Under-Ring
[d̥], IPA 104 + 402
Well, this is a voiceless gap, and probably coronal for the same reasons as the preceding. And I mean that literally. The release
transition isn't followed sharply by the high amplitude noise, like I'd expect with a simple /t/ release, but what do I know?
Yogh + Over-Ring
[ʒ̊], IPA 135 + 402
So if you notice the earlier [s] and [t] bursts, this doesn't really look quite the same. This is a period of high amplitude noise. It's
broad-band, but centered a little lower than the [s]s earlier, and it's pretty dead below 1500 Hz. Typical of [ʃ]. But if this were a
syllable-initial [tʃ], I'd expect it to look, well, more aspirated. So I transcribed it as a devoiced [dʒ], but whatever.
Ash + Length Mark

[æː], IPA 325 + 503
Well, I marked these last two segments as long, but I'll probably stop doing that since phrase-final lengthening is so totally
predictable in these things. I was in a mood, I guess. So, we've got an F1 that starts middish (although that may be transitional)
and moves upward, so this is a mid-to-low sort of vowel. The F2 starts very high (so this is quite front) and seems to transition
down to at least the mid-neutral range (the last bit, after 1400 msec or so I'd ignore since there's an amplitude change there and
things definitely start to transition at that point). So this is very front and moves centrally or backish. Which is not what I would call
stereotypical English vowel behavio(u)r. So let's think a second. The preceding sound is a close fricative in the post-alveolar
region, so very front high transitions in the vowel are okay. The next sound is obviously a sonorant consonant, probably a nasal.
So there's probably some nasalization covering the transitions. So I'll concentrate on the middle portion of this, rather than treat it
as a diphthong. And it's mostly a lowish vowel, and vaguely front. This narrows the choices down a bit.
Lower-Case M + Length Mark

[mː], IPA 114 + 503
So this is probably a nasal. It's got weak resonances, but the main one is either at 1000 Hz or so, or at 1500. Which is not
helping, since depending on which it would be a different nasal. So there's two things about the transitions in the preceding to
consider. The first is that the F2 transition neither pinches up with the F3, nor points toward 1700-1800 Hz. In fact it falls much
further than that, which is consistent mostly with bilabial. The second thing is that the F2 transition is clearly contiguous with the
1000 Hz resonance. So that's probably the one to pay attention to. And in my voice, the 1000 Hz pole is usually for the bilabial
[m], and the 1500 Hz pole is usually for [n]. Voilà.
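If you wanted to mechanize that last rule of thumb, a minimal Python sketch might look like the following. The two pole values are just the approximate figures quoted above for this one voice; the function name and the tolerance are hypothetical, and another speaker would need different numbers.

```python
# Approximate nasal pole frequencies quoted above for this speaker (Hz).
# These are speaker-specific; other voices will have different values.
NASAL_POLES_HZ = {"m": 1000.0, "n": 1500.0}


def guess_nasal_place(pole_hz, tolerance_hz=150.0):
    """Return the nasal whose typical pole is nearest, or None if ambiguous.

    tolerance_hz is an illustrative assumption, not a measured value.
    """
    label, target = min(
        NASAL_POLES_HZ.items(), key=lambda item: abs(item[1] - pole_hz)
    )
    if abs(target - pole_hz) > tolerance_hz:
        return None  # in between: fall back on the transitions instead
    return label
```

Anything that lands in between the two poles comes back as None, which is the cue to go look at the transitions in the neighbouring vowel.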

Winnipeg, Manitoba
CANADA R3T 5V5
"On the right night there's a white light."

Three guesses what I was thinking about when I did this one.
Script A
[ɑ], IPA 305
This vowel is short enough that it might be worth ignoring. On the other hand, it's initial. So we'd better learn from it
what we can. F1 is hard to see. It might be really really low, lost in the voicing bar, or it might be that little blip at about
750 Hz. F2, probably, is that thing that starts around 1000 Hz and rises slightly. I'm not sure about the F3 or F4, unless
it's that mess around 2700-3500 Hz. But we'll ignore that for the moment. So we've got something lowish (or extremely
high) and very back and/or round.
Lower-Case N
[n], IPA 116
So we're looking at this thing between 150 and 250 msec (or so). There's a nice little zero around 700-800 Hz, and
another above approaching 2000 Hz. So we've got something very clearly fully voiced and sonorant, with overall weaker
amplitude compared with the vowels around it, clear zeroes, and flat(tish) formant structure. The pole (think F2) is
ambiguous for my voice, being above 1000 Hz (clearly [m] territory) and below 1400-1500 Hz (clearly [n] territory). So if you
had to choose, I'm not sure which you'd choose. Luckily there's another interpretation. Something in between labial
and alveolar. Hmm. Dental? And how might we get a dental nasal? Maybe by place assimilation with a dental
consonant? Hmm?
Okay, I don't expect you to have noticed that. Frankly I've only noticed it now that I'm trying to figure out why I
thought there was a fricative in here, besides knowing the text. But if you spotted it, you could have a future in this
business.
Eth
[ð], IPA 131
So there is this funny discontinuity in the upper formants, and this 'noisy' release, if that's what you want to call it. So
there's something at the end of that nasal we have to account for. And since the following vowel is mostly transition, I'd
say this is a function word. And whether we want to read this as a separate thing, it's reasonable to take an eth ([ð]) as a
source for the dentality of the nasal.
Schwa
[ə], IPA 322
Well, the F1 and F2 are both moving--the F1 from middish to highish and the F2 from backish/roundish to
backer/rounder. The F3 is really cruising, as is the F4. So there's no 'moment' here where the F1 and F2 are just doing
what they want to do, suggesting a 'targetless' vowel, i.e. one whose targets have been removed or expanded beyond
'targetness', i.e. one that is reduced, i.e. schwa.
Turned R
[ɹ], IPA 151
So while we're ignoring the preceding vowel, we should be noticing that diving F3. For about 50 msecs from just after
300 msec to not quite 375 msec, there's another dip in amplitude. And while there's attenuation in the higher
frequencies, there's no obvious zero in the lower frequencies. So we're looking at an approximant, with an absurdly low
F3.

[aɪ], IPA 304 + 319
Well, this spectrogram is all about the /ai/ diphthong, all in the voiceless environment, and all in my American accent.
Okay, I don't know what I was really thinking. I think I was just desperate. Anyway, there's a fair amount of variation
among the /ai/ diphthongs in this spectrogram, but we'll concentrate on the similarities. Once you get past the
transitions (around 400 msec) we've got a fairly high F1, around 750 Hz or higher. The F2 starts sort of low, so we've got
something lowish and backish. The F1 transitions downward, indicating a movement toward a higher vowel; the F2
transitions sharply upward indicating something moving far forward in the space. Pretty classic.
Lower-Case T
[t], IPA 103
Well, there's a short gap approaching the 600 msec mark. The transitions in the preceding vowel look very front-velar,
but that would be wrong. There's an advantage to knowing the text before you start this sort of thing. Okay, there's some
falling apart of the periodicity into this gap, which might indicate glottalization. Unfortunately, this is probably a coda
plosive of some kind, and I can glottalize any final plosive. But most likely I'd glottalize a coronal. But this is sort of
ambiguous.
Lower-Case N
[n], IPA 116
Well, there's evidence of voicing, lots of it, but nothing much above it. So we're looking for something weak, but
sonorant. It could just be a weak approximant, but the total absence of energy above the voicing bar suggests a zero, as
well as general attenuation of upper frequencies. So I'd say this was a nasal, on the balance. The pole sort of sneaks in
at about 1300-1400 Hz as the upper frequency energy starts to creep in, which is the best guess at place information.

[aɪ], IPA 304 + 319
Here's another one. The F1 is less obvious in this, but once you make it out, it starts out even higher than the previous
one. Once you get the phrase put together, it's clear this is a nominal head and the previous one is a modifier, so this
one is less prone to reduction than the previous one. So even though it has less amplitude, it might be more
'prominent', whatever that means. So there's this interaction between acoustic amplitude, peripherality/reduction,
and prominence. Discuss.
Lower-Case T
[t], IPA 103
Another one of these gaps with no information in it. Leave it and get on with your life.
Eth + Raising Sign

[ð̝], IPA 131 + 429
On the other hand, there's this noisy release. Which could either be a very short VOT of the preceding gap, or it could
actually be a thing on its own. Which limits the possibilities.
Epsilon + Rhoticity Sign

[ɛ˞], IPA 303 + 419
This was trouble for me. The F3, if you can tell, is not that rising thing in the F3 zone, but that little shadow of
something just above the F2. Which just goes to show you that you have to know what you're looking at before you look
at it. The F1 is not amazingly helpful. It's sort of mid, or higher mid. The F2 is frontish but moving backer, probably
under the influence of the F3. If you believe me, then the F3 is falling, i.e. this is rhoticized. Or rhotacised. Or something
like that. If you don't, then the F3 is a little low but rising. I think that's technically the F4. But anyway, this is a frontish,
rhoticized vowel.
Lower-Case Z
[z], IPA 133
Well, okay, we've got some serious attenuation of amplitude, but full voicing from 975 msec to about 1050 msec. In the
higher frequencies, we've got a whole buncha noise. So we've got something voiced and noisy. The high-frequency of
the noise suggests sibilance more than anything else, so we've got some kind of voiced sibilant. There are only two in
English. 50-50.
Schwa
[ə], IPA 322
Here again a vowel that's mostly transition. If you must know, it's mid-to-high, and very back or round. But the
rounding is probably coarticulatory with....
Lower Case W
[w], IPA 170
Again, sharply attenuated in the upper frequencies (by which I mean above 1000 Hz in this case), but the F1, while not
'sharp', isn't particularly weak (compared for instance with the voicing in the [z]). The F1 is apparently so low as
to disappear into the voicing bar, the F2 is as low as it ever gets. The F3 is a little low of neutral, what you can see of it,
so this is something quite high and close, quite back and round, and almost definitely not lateral.

[aɪ], IPA 304 + 319
And here's another one of these. Also slightly 'reduced'. Does anybody hear these as raised?
Lower-Case T
[t], IPA 103
And here's another one of these. This one has more obvious glottalization to it, although I swear I still hear it. Oh well.
Even if you didn't know, by now it should be clear that all of these words are supposed to rhyme.
Tilde L (Dark L)
[ɫ], IPA 209
So this is fully voiced, and apparently sonorant, with no obvious low-frequency zero. So the attenuation of the higher
frequencies is probably just the tightness of the constriction. F1 is quite low. F2 is fairly low. F3 is downright high. So the
difference between [w] and dark l?

[aɪ], IPA 304 + 319
And here we are again.
Lower-Case T
[t], IPA 103
Finally, a [t] worth talking about. Nice long gap (except for whatever that stuff is at 1750 msec). Nice sharp release with
s-shaped release noise. Classic stuff.

Solution for August 2008 (originally from December 2001)
"My feet get cold at night."
Minimally revised from the original. Wow, my style has changed. And not just in 'official' ways--like treating
aspiration (properly) as a diacritic and not a separate thing.
[m], IPA 114

Lower-case M
It would have been easy to miss this consonant, but there are three or four good glottal pulses starting at about 75 msec
telling you that there's *something* there. It's voiced, it's resonant, and it has a zero (actually several), suggesting a
nasal, and the pole near 1000 Hz suggests /m/ for my voice. /n/ has a higher pole, closer to 1500 Hz. But that may just
be me.
[aɪ], IPA 304 + 319

Lower-case A + Small Capital I
More than anything else, I wanted my students (this spectrogram was offered as extra credit on their take-home final)
to notice this pattern. F1 starts high and drops (this is tough to see, but once you know it's there it's pretty obvious).
F2 starts incredibly low, indicating backness, and goes very high. The combination is classic for the /ai/ diphthong,
although the appearance of steady states is a matter of some dispute. There really aren't any here.
[f], IPA 128

Lower-case F
I've got to do something about my fricatives. This is centered too low to be a /s/, it doesn't have enough structure to be
/h/. It sort of has the structure of a post-alveolar [ʃ], but it isn't really strong enough to be sibilant. That leaves the
labiodental and the (inter)dental (for English voiceless fricatives in my normal speech), and your guess is as good as
mine.
[i], IPA 301

Lower-case I
If I've said it once, I've said it a thousand times. Super-high F2, mostly low F1. So we're looking for the frontest vowel we
can find. And it's relatively high.
[t], IPA 103

Lower-case T
I worked hard to make sure there was a release here to tell the relatively inexperienced reader there had to be two
things in this gap. How exactly you tell that this is alveolar as opposed to anything else is beyond me. The burst looks
double (although I think that's really the burst of the /t/ and the closure of the /k/), and the transitions look labial
(but who can tell when the F2 and F3 are that high in the vowel). All I can say is notice the energy in the (first) release.
It has the 'slope' of an /s/. So there.
[k], IPA 109

Lower-case K
In fact, this is a front /k/, owing to the frontness of the following vowel. Note the F2 F3 pinch in the release, and the
noise concentrated in the F2 F3 conjoined region. Fairly classic for velars. This is voiceless but unaspirated (in the
chinchilla-VOT sense), so this is probably phonemically a /g/.
[ɛ], IPA 303

Epsilon
Vowel. I take it that isn't controversial. Check the F1. It's higher than the /i/ preceding, and a lot lower than the /a/ at
the beginning of the /ai/ diphthong. So it's mid. Note the F2. Though falling, it's still high, indicating front. Mid and
front. If anything, it's centralizing rather than raising or fronting. So probably 'lax'. Review your vowel features and
you're there.
[t], IPA 103

Lower-case T
Okay, this looks more like a flap, but this is so not a flapping environment. So either the flapping environment is wrong,
or it's not a flap. I don't know the answer. The duration and amplitude of the 'release' noise suggests that a bit of
pressure built up. So let's call it a stop. Note also the shape of the noise in the release, i.e. that it's stronger in the higher
frequencies than otherwise. I.e. looks like /s/. So what stop-like thing is most likely to look like /s/?
[k], IPA 109

Lower-case K
Long gap, followed by a low-frequency double burst. Note also the transitions in the following aspiration. Not very
pinchy, but in the pinchy directions.
[ʰ], IPA 404

Right Superscript H
Aspiration. Loong. (Review the aerodynamics of aspiration and place of articulation.) The double burst and the loooong
aspiration suggest velar for the previous stop.
[o], IPA 307

Lower-case O
Note the F1, in the same range as the mid vowel previous. Note the incredibly low F2. Not [u], due to the mid-ish F1. But
very back and round. Must be [o].
[l], IPA 155

Lower-case L
It would have been easy to miss this, but post hoc I can convince myself that there's a change in bandwidth of all the
formants, but particularly F3 and F4, about 2/3 through the 'vowel' here. The F3 and F4 take a teeny little jump here as
well, but otherwise this looks quite back (and round). So we're looking for some kind of approximant which is back.
This could be /w/ or /l/, if you catch it at all. Turns out to be /l/ (velarized), and the little jump in F3 and F4 (okay,
mostly F4, but it's supposed to be in F3...) might help us decide it was /l/ and not /w/.
[d], IPA 104

Lower-case D
Another goofy stop. Short, and this one is arguably in a flapping environment. Okay, so maybe this one is really a flap.
Sue me.
[æ], IPA 325

Ash
I swear I tried to make this a full vowel. But it looks awfully schwa-like. It looks mid-ish, but it looks pretty
nondescript F2 and F3 wise. Okay, call it a schwa and it won't matter in the long run. Turns out to be a function word
anyway.
[t], IPA 103
Lower-case T
Stop, good release, not really useful in the transitions into it. Looks kind of labial. But it's a /t/. Sue me.
[n], IPA 116

Lower-case N
Fully voiced, resonant, nice little zero, F2 near the neutral range, which for me always means alveolar.
[aɪ], IPA 304 + 319

Another one. Same reasons, but this one has a better steady state in the 'a' portion.
[t], IPA 103

Lower-case T
Well, there's a buncha cues for this, but since it's a stop, they're mostly in the transitions into it and the aspiration out
of it. Utterance final, so there's probably a low boundary tone floating around here, in addition to the general tendency
for me to glottalize final /t/.
[ʰ], IPA 404

Right Superscript H
Okay aspiration. Note the shape of the release noise. Looks like an [s]. But the rest of the noise is organized into
formants like a good /h/.

Solution for (mid-)May 2005
"I'm leaving tomorrow night."
There are different styles of reading--this left-to-right business is just how I do it for convenience. As time goes on, I'll be
introducing other styles. One of the things that I always forget about, at least when sitting down to do these is the 'big' picture
stuff. For instance, how many syllables (or at least vowels) are there in this? What evidence of segmentation do you see? Where?
Can you see anything suggesting pitch peaks/lows, correlates of stress like amplitude, length, or pitch excursions? Once you've
done that sort of thing, usually you go through and mark all the things that are obvious--the sibilants, the nasals, if you can see
them, things that are obviously [i] or [a], that sort of thing. Then, once you've got the big picture, then you start in on specific cues.

[aɪ], IPA 304 + 319
Well, from 100 msec to between 225 and 250 msec, there's a longish vowel, but with moving formants. The F1, the lowest one,
starts around 800 or 900 Hz, and moving downward to near 500 Hz by the time you get to the end. So this vowel moves from very
low to at least mid. The F2 starts very low as F2s go, especially relative to the high F1. So about 1250 Hz or so at the beginning,
rising all the way to almost 2000 Hz by the end. So from relatively back, or at least back of central, moving very far forward. So at
this point, you should have a pretty good idea of what this diphthong is. It's sort of interesting that the F1 has a fairly steady state
at the beginning, where the F2 doesn't, and the F2 has a fairly steady state at the end.
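A rough Python sketch of the formant-to-vowel-space reasoning above. The cut-off frequencies and the function name are illustrative assumptions, loosely based on the values mentioned in these readings; they are not measured category boundaries and would shift from voice to voice.

```python
def describe_vowel(f1_hz, f2_hz):
    """Map F1/F2 (Hz) to rough height and backness labels.

    High F1 means a low vowel; high F2 means a front vowel.
    All thresholds below are assumptions chosen for illustration.
    """
    if f1_hz >= 700:
        height = "low"
    elif f1_hz >= 450:
        height = "mid"
    else:
        height = "high"

    if f2_hz >= 1800:
        backness = "front"
    elif f2_hz >= 1300:
        backness = "central"
    else:
        backness = "back"
    return height, backness


# Approximate readings for the start and end of the diphthong described above.
start = describe_vowel(850, 1250)
end = describe_vowel(500, 2000)
print(start, "->", end)
```

The point is only the direction of travel: the start point lands low and back of central, the end point lands at least mid and far forward, which is the pattern the text reads as /ai/.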
Lower-Case M
[m], IPA 114
From the end of the previous diphthong to somewhere past 300 msec, there's another segment. Fully voiced throughout,
suggesting [+voice]. Fully sonorant, i.e. with formants all the way up, suggesting something [+sonorant]. Sharp discontinuities on
both sides, which is classically nasal (as the side (oral) cavity closes, leaving only the nasal cavity as the open channel, you
suddenly get a sharp change in the acoustics. If you're lucky.) So this is probably nasal. It's got what is probably a zero around
1750 Hz or so (I'm not sure about the apparent narrow zero around 700 Hz, just because it continues for quite a ways--it looks
like an artifact). But it seems to separate a pole around 1000 Hz, which if you know my voice is about right for a bilabial pole.
There is an inkling of something higher up, about 1300 Hz or something like that, which might be an indicator of a partial
occlusion at the alveolar ridge or so, but I'll defer that to someone who actually knows something about the acoustics of these
sorts of things.
Tilde L (Dark L)
[ɫ], IPA 209
I'll say it again, all North American /l/s are dark (velarized). True, some of them are darker than others, especially domain finally
(syllable, word, phrase, etc.), but there ain't nothing light about this lateral. Which we know is a lateral because of the F3, but I'm
getting ahead of myself. 300-something msec to about 400 msecs, where the F2 sharply changes, the amplitudes all weird out,
and the F3 really kicks in. That's the extent of this thing we're looking at. Okay, F1, if that's what you want to call it, fairly low,
indicating a fairly close constriction, so while not nasal, we're talking about something that is arguably a high vowel or closer. The
F2 is moderately low, lower than for the beginning of the diphthong, so this is very back. As in velar. So we've got two
possibilities, something in the area of [w], and something in the area of [ɫ]. The difference, according to some sources, is the F3
or sometimes the F4. Typically, rounding will lower formants, so if we believe this is round, we'd want to see some lowering or at
least not raising, of the upper formants. But the F3 of this is just high. At the edge of the nasal it's quite high, just about 3800-3900
Hz, and although it falls, it's still well above 3600 Hz before the amplitude starts to kick in for the next vowel. So this ain't round.
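The [w]-versus-dark-l decision above comes down to one comparison, sketched here in Python. The neutral F3 reference (2500 Hz) and the function name are assumptions for illustration; the only claims taken from the text are that rounding tends to lower the upper formants and that this lateral shows a high F3.

```python
def w_or_dark_l(f3_hz, neutral_f3_hz=2500.0):
    """Guess between [w] and dark l, given that F1 and F2 are already low.

    A raised F3 points to the lateral; an F3 at or below neutral is
    compatible with rounding, i.e. [w]. neutral_f3_hz is an assumed
    reference value, not a measurement.
    """
    return "dark l" if f3_hz > neutral_f3_hz else "w"
```

With the F3 readings quoted above (well over 3600 Hz) this returns "dark l"; an F3 a little below neutral would return "w".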
Lower-Case I
[i], IPA 301
Vowel. Starting from the release of the lateral, if that's what you want to call that moment up to 500 msec. But the last 50 msec or
so are clearly transitional, so let's just worry about the apparent F2 extremum area. The F1 is pretty flat and fairly low throughout.
The F2, as I've said has a maximum around 425 msec, at around 2250 Hz. That's freaking high for an F2, so this is about as front
as this can get. So relatively high and outrageously front.
Lower-Case V
[v], IPA 129
Those diving transitions in F2, F3 and F4 (mirrored by rising transitions on the other side)! This is clearly labial. But not bilabial.
English only has three bilabials, one is voiceless, and one is a nasal, so this could only be [b]. First of all, even though the overall
amplitude is reduced here, the voicing is very strong and even throughout. And although a lot of my plosive closures are noisy,
they're not fricated the way this thing is. This is a fricative. But labial. Which for my variety of English only really leaves [v].
Small Capital I + Tilde

[ɪ̃], IPA 319 + 424
Well, the F1 seems almost mid-looking here, but at least this is not a low vowel. The F2 is really high again, but not so high as for
the [i]. It's actually quite in the same range as the offglide of that initial diphthong. So.
Eng
[ŋ], IPA 119
Well, some zero-ness starts to creep in really early, and then at about 700 msec the upper formants, or at least F4, just shut off
completely. And the other formants flatten out. So the zero is evidence enough of nasality, I suppose. The zero is right where
you'd hope to see a pole, so the pole is either that shadow thing at about 800 Hz, which I'm suspicious of, just because there
seems to be a lot of stuff just about that frequency, especially something that might be a harmonic in the [i]. The other candidate,
which is much stronger, is up around 2400-2500 Hz. Which is pretty high. But notice that the F2 and F3 transitions in the
preceding vowel point right at it. So that's probably our puppy. The joint F2/F3 thing looks like velar pinch, albeit very front velar
pinch, so this is probably velar.
Lower-Case T
[tʰ], IPA 103 + 404
Well, there's 25-50 msec of gap, which is plenty to count, after a nasal. Now if you're expecting this to be velar based on a blind
faith in nasal place assimilation you're going to be disappointed. Because the release of this thing is not in the least velar-looking.
It's broad band, quite sharp, and strongest in the highest frequencies (which in this context means between 3000 and something
higher). The spectrum is strongest in the higher frequencies, well above the 'formant zone'. So this is basically an [s]-shaped
release. Which makes this an alveolar. And voiceless, and probably aspirated, due to a VOT of about 50 msec. At least. Which is
not outrageously long for a VOT, but long enough to count as aspirated. TMSAISTI.
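The VOT judgement above is simple arithmetic, sketched here in Python. The function names, the 40 msec threshold, and the example time stamps are all hypothetical; the text itself only says that a VOT of about 50 msec is long enough to count as aspirated.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT: the lag from the stop release to the start of voicing."""
    return voicing_onset_ms - release_ms


def is_aspirated(vot_ms, threshold_ms=40.0):
    """Treat a long-lag VOT as aspirated.

    threshold_ms is an illustrative assumption, not a fixed boundary.
    """
    return vot_ms >= threshold_ms


# Made-up time stamps for illustration: release at 775 msec, voicing at 825 msec.
vot = voice_onset_time_ms(775, 825)
print(vot, is_aspirated(vot))
```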
Schwa
[ə], IPA 322
Then there's a pulse or two of vowel. <snooze>
Lower-Case M
[m], IPA 114
Hey, another one of these. This one starts at about 850 msec, and lasts to about 925 msec. But otherwise it's pretty much the
same as the previous [m], though a little weaker.
You'll notice there's full voicing and sonorance for 300 msecs starting around 925 msec. Which is just too long to be one
segment. And the formants are too mobile to be reflecting a single target. So before going further, think about how many
segments there are here, and if you can't find the edges of each, can you find moments you want to call the 'center/re' of each?
Go ahead. I'll wait.
Okay, so there's the F1 peak near the beginning; the moment where the F3 is lowest and the F2 is highest, around 1050 msec;
and there's the funny dip in F2 at about 1125 msec. Those are the 'moments' I'm going to consider as evidence of at least three
things in this stretch. So, on to the first.
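Picking out 'moments' like these amounts to looking for turning points in a formant track. Here is a minimal Python sketch, assuming you already have a formant track as parallel lists of times and frequencies; the function name is hypothetical and the F2 values are made up, shaped only to put a peak near 1050 msec and a dip near 1125 msec as described above.

```python
def local_extrema(times_ms, values_hz):
    """Return (time, value, kind) for interior local maxima and minima."""
    found = []
    for i in range(1, len(values_hz) - 1):
        prev, cur, nxt = values_hz[i - 1], values_hz[i], values_hz[i + 1]
        if cur > prev and cur > nxt:
            found.append((times_ms[i], cur, "max"))
        elif cur < prev and cur < nxt:
            found.append((times_ms[i], cur, "min"))
    return found


# Hypothetical F2 track (made-up values, for illustration only).
times = [950, 1000, 1050, 1100, 1125, 1150, 1200]
f2 = [900, 1100, 1300, 1100, 950, 1050, 1500]
print(local_extrema(times, f2))
```

Each turning point is a candidate 'centre' for a segment; a real track would need smoothing first, since raw formant estimates are jittery.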
Turned Script A
[ɒ], IPA 313
I never get to use this vowel. But there it is. It's just possible this is round, and not just transitioning from-and-to roundness. So I
transcribed it as round. Sue me. The F1 is high, the F2 is very, very low.
Turned R
[ɹ], IPA 151
Okay, so the F3 is dipping below 2000 Hz. Barely, but there you go, if that's good enough for you.
Lower-Case O
[o], IPA 307
Okay, so the F1 has been falling fairly steadily since that moment in the [ɒ]. So this ain't low. But it doesn't seem to be heading
down to well below 500 Hz as the high vowels earlier on did. So this ain't particularly high. So probably mid. At least round here. F2
is low, as indicated at that dip. F3 doesn't tell us much. So mid and back. Only a couple of possibilities, and even on a good day
one is probably better.
Lower-Case N
[n], IPA 116
So we've got something fully voiced and sonorant, but with zeroes, as before. So this is probably another nasal. But notice the
pole. The bilabial pole was around 1000 Hz or just lower. This is definitely higher. And that harmonic/shadow thing just above it
maybe means it's even a little higher--maybe there's just a harmonic space there that makes it look like the edge of the pole is
lower. But I don't know. Spectrally, this just ain't the same animal as any of the preceding nasals, which were bilabial, bilabial, and
velar, respectively. So this must be something else. Few options, at least for English.

[aɪ], IPA 304 + 319
There's another of these. What's odd is that although spectrally this is very similar to the initial diphthong, dynamically it's
completely different. There's a nice clear steady state in both F1 and F2, and the movement just doesn't make it as far, at least
not until after the voicing starts to go. Maybe it does. Maybe it don't. But it's still something near [a], or [a], followed by something
near [ɪ].
Lower-Case T
[t], IPA 103
The cruddy voicing towards the end of the preceding vowel is probably a combination of low pitch and glottalization. The
glottalization probably tells us that this sound is probably a voiceless plosive underlyingly, and the release (and maybe the
transitions) suggest a nice alveolar again.

Solution to the mystery
"Happy Birthday, Peter Ladefoged!"
Lower-case H
[h], IPA 146
For about 100 msec, starting about 50 msec, there's some noise up in the region of F3 and F4 of the following vowel. No voicing
to accompany it, this looks like a voiceless fricative. It's not quite strong enough to be sibilant, and it doesn't have the "unfiltered"
quality of an [f] or [θ]. Also the transitions into the following vowel are all wrong for that. In fact, there don't seem to be any.
Coupled with the F3/F4 range the noise appears in, I'd say [h] is the most likely candidate.
Ash
[æ], IPA 325
This vowel has a very, very high F1, indicating a very low vowel. But the F2 is basically neutral, or a trifle higher than neutral. This
can't be a back vowel. Which leaves precious few options, at least in my dialect.
Lower-case P
[p], IPA 101
Beginning around 225 msec and going on until the release at about 300 msec, there's a nice little gap--no resonances, no noise
to speak of. Also no voicing (there's no striated organization to that noise, whatever it is, at the bottom). Check the transitions in
the surrounding vowels. All point down toward the closure, classically bilabial.
Lower-case I
[i], IPA 301
This vowel is a trifle short, but it has a very distinctive spectrum. F1 is quite low, F2 is about as high as one is ever likely to see,
and for those of you who believe in formant distances, F2 is 'tight' to F3. So we've got the F1 of a high vowel, and the F2 of a very
front vowel. The F2 being well above 2000 Hz, and the total trajectory being toward the front, this looks like an [i] more than
anything else.
Lower-case B
[b], IPA 102
There's another gap here but this one is clearly voiced through most of the closure duration. It's clearer on the left, transitioning
from the preceding vowel, that the F2 and F3 transitions point downward, i.e. in the bilabial direction again. It's harder to tell on
the right side, what with the noise and the short segment of voicelessness, but if you wish really, really hard, you might convince
yourself. There's not a lot indicating anything else, at least.

[ɹ̩], IPA 151 + 431
Perhaps my favo(u)rite American English vowel. The critical thing to notice is not the F1 frequency, but the fact that there are two
clear bands of energy in the F2 range. Those are F2 and F3, and an F3 that low can really only be an American-style approximant
[ɹ̩]. Well, not approximant, at least not in the usual sense of a consonant type, since this is a sonorant between two fairly obvious
obstruents, and therefore a sonority peak (if you believe in those) and presumably therefore a vowel, at least functionally.
Theta
[θ], IPA 130
Well, this is a little confusing, but bear with me. The transitions into this fricative are consistent with a coronal. The F3 is returning
to its neutral value, the F2 seems to head towards 1750 Hz or thereabouts. And the noise, such as it is, is present mostly in the
high frequencies. So this is basically [s]-shaped. But there's no way there's enough energy for it to be an [s]. There's even less
amplitude to this noise than there is in the [h] at the beginning. It's even less than the release noise of the stop that follows. So if
this isn't sibilant, what is it? Frankly, this is the best-looking [θ] I've produced in a long time. Except that it's [s]-shaped rather than
more clearly unfiltered.
Lower-case T
[t], IPA 103
Okay, first things first. Ignore the 'clunk' (that's a technical term) at 750 msec. I have an explanation for it, but it's a long shot. So
ignoring the clunk, there's a gap here. There's no useful transitional information in the preceding fricative, so we're stuck with
those into the following vowel. These don't tell us much, except that they don't look obviously bilabial or velar, and they are
consistent with alveolar. Or coronal. So that's a working hypothesis, helped along by the clearly [s]-shaped (skewed to the high
frequencies) noise of the release burst. (That's what I meant about the preceding fricative just not being [s]-like). So this is a
voiceless plosive, almost definitely alveolar. Now that clunk. I think it has something to do with the transition between the theta
(which for me is truly interdental) and the release of the plosive (which for me is apical). There's probably a point
where just a little bit of air escapes in between the laminal closure (since my tongue tip is busy between my teeth) and the apical
release. TMSAISTI.
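One way to put a number on '[s]-shaped' (skewed to the high frequencies) is a spectral centroid, i.e. the magnitude-weighted mean frequency of a slice through the burst. The Python sketch below is only an illustration: the function name is hypothetical and the magnitudes are made up, not read off this spectrogram.

```python
def spectral_centroid_hz(freqs_hz, magnitudes):
    """Magnitude-weighted mean frequency of one spectrum slice."""
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total


# Hypothetical burst slices (made-up magnitudes, for illustration only).
freqs = [500, 1500, 2500, 3500, 4500]
s_shaped = [0.1, 0.2, 0.5, 1.0, 1.5]  # energy rises with frequency
flat = [1.0, 1.0, 1.0, 1.0, 1.0]      # roughly unfiltered noise

print(spectral_centroid_hz(freqs, s_shaped) > spectral_centroid_hz(freqs, flat))
```

A centroid only captures where the energy sits, not how much of it there is, so it would not by itself separate a weak [s]-shaped fricative from a real [s]; overall amplitude still has to be judged separately, as the text does.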

[eɪ], IPA 302 + 319
Well, if anyone asks if western US [e]s are really higher than mid, this is a great example. While not quite as low as for the
previous [i], the F1 is still a little low. But it does move quite distinctly lowerward. The F2 starts out quite high, though not as high
as the previous [i], and moves distinctly higher. So this looks like a very front not quite high diphthong.

[pʰ], IPA 101 + 404
Another voiceless gap. A good look at those transitions into it reveals those bilabial looking transitions. It's easier to see the
transitions into the following vowel on this one, and they are also consistent with a bilabial. I've marked this one as aspirated,
based on the fairly long VOT.
Lower-case I
[i], IPA 301
The F1 has gone low again, and the F2 is way up above 2000 Hz (at its peak) again. This pattern should be familiar.
Lower-case T
[t], IPA 103
This probably isn't phonetically a plosive so much as a voiceless and ever so slightly aspirated flap. Or tap, or whatever it is in
English. The voiceless portion is close to 50 msec long, but the entire duration is a bit noisy in the higher frequencies. I'm
suspicious of the change in the noise about half-way through the voiceless duration. I think we've got a very short, incomplete
contact (hence the noise) followed by a portion of 'regular' voicelessness, i.e. VOT. But given that this is North American English,
a voiceless flappy thing can't really be anything except a phonemic /t/.

[ɹ̩], IPA 151 + 431
There's that low F3 again. Not quite as low as last time, but plenty low enough. With both the duration and the local amplitude to
count as a vowel.
Tilde L (Dark L)
[ɫ], IPA 209
This is interesting. If I didn't know better, I'd have thought this was a nasal. It's fully voiced (compare the striated, low-frequency
energy here with the noisy low-frequency band in the flappy thing and the first [p]). So this is voiced, and while there isn't much
energy above, what there is is organized in formant-like bands. Overall amplitude is low, and there are largish bands with no
energy in them. And there don't seem to be any other nasals in this spectrogram to compare them with. But there are clues that
this isn't a nasal. First, there's no hint of nasality on the preceding vowel, which by itself is not strong evidence. But the decrease
in energy is so great, it's odd that the 'edge' of the supposed nasal isn't 'sharper'. Usually nasals either look very 'abrupt' or look
more coarticulated with at least the preceding vowel. The 'pole' is just above 1000 Hz, but the 'zero' below it isn't very well
defined. Which again is not strong evidence, but something to be explained. So entertaining a hypothesis that this is something
other than a nasal, what would it be? F1, if that's what you want to call it, is low. The F2 is that thing just above 1000 Hz (note the
continuity with the F2 in the surrounding vowels). And the F3. The F3, in spite of being radically low for the preceding [r], rises
sharply in the transition into this segment. On the other side, the F3 seems to fall from something higher than the 2500 Hz or so it
ends up at in the following vowel. And this raised-from-neutral F3 is consistent with the noisy energy we see in this segment. So
we have an oddly and otherwise inexplicably high F3 to contend with. And high F3s (and to a lesser extent F4s) are typically
associated with laterals in English. So this is a dark /l/. By 'dark', I mean velarized, as all my English laterals seem to be, without
intending anything about syllabic position.
Ash
[æ], IPA 325
So here we have a relatively high F1, so we have a relatively low vowel. The F2 isn't doing much of anything. If it were lower, this
vowel would look back. But it's not, so the other choice is front. Or frontish. As front as a quite low vowel can ever really get?
Lower-case D
[d], IPA 104
Again, this probably is more of a flap than a proper plosive [d], but I tried. At least this one is a more standard looking flap, and
actually voiced. It looks basically like a very, very short plosive, consistent with alveolar.
Barred I
[ɨ], IPA 317
I've transcribed this as a reduced vowel, given its relative duration, amplitude, and pitch to the previous vowel, which is definitely
longer (though not a lot, given how low it is), stronger (darker) and higher in pitch (closer striations). If I had to give it a standard
vowel symbol, I'd say [ɪ], that is something in between cardinal 1 and cardinal 2. The F1 indicates something which is higher than
mid, but not exactly as high as it could get. The F2 something rather front, but nowhere near the frontness of [i]. But as I said, this
vowel is probably reduced, so spending too much energy trying to do something with it is probably unwarranted.
Lower-case F
[f], IPA 128
Well, starting around 1700 msec and going on until almost 1800 msec, we've got a fricative, surely, and mostly voiceless. It's got
some formant structure to it, suggesting that it's resonating through the vocal tract. It's definitely not sibilant. So the most likely
candidate is [h]. But I'd expect an [h] to have at least some energy in F1, and that's the one place where this fricative has no
energy. So while [h] is probably the most obvious guess, it does need some explaining. So entertaining the possibility that it is
something else, we're only really left with a couple of choices--labiodental and (inter)dental. And I can just convince myself that
the transitions are more labial looking (pointing down toward the closure) than coronal looking. Although interdentals don't
always look particularly alveolar. Given the more downward pointing transitions, I suppose [f] is the runner up. Keep this in mind
when you try to make some kind of word--or name--out of this part of the spectrogram.
[oʊ], IPA 307 + 321
Well, the F1 is pretty close to neutral (around 500 Hz) or just low of that, which tells us that we've got a fairly mid or just high of
mid vowel. It's pretty flat, height-wise, over its almost 150 msec duration. The F2 is less flat. It starts at about 1250-1300 Hz,
where it sits for about half the vowel or so, and then starts to lower just a little. Now that doesn't seem to coincide with the
transition in the F1, so I think that's a separate thing, and not just transition. So we've got something that is of constant height, mid
or just above, that starts vaguely back and moves slightly further back. Or round. There's only a couple of choices, and the other
one is more likely to move forward (i.e. towards central).
Lower-case G
[g], IPA 110
Another (weakly) voiced gap. At first glance it looks like it has a fairly clean burst, but if we look closer the noise seems to
precede the burst ever so slightly. This is different from the preceding bursts, which are pretty sharp on their left sides. Also notice
the low frequencies, which are weaker. The preceding bilabial bursts all had some energy down there. And the transitions into the
next vowel are all wrong for bilabial. In fact it looks like F2 and F3 are awfully close together (I hesitated to say 'pinched' together)
and move apart into the following vowel. So the absence of low-frequency release burst energy and the 'velar pinch' in the
transitions suggest a velar.
Barred I
[ɨ], IPA 317
Well, this is another weak vowel. Given that it's the last vowel, and presumably lengthened by final lengthening, it's not amazingly
long. So whatever you thought of the vowel before the last one, you might think here.
Lower-case D
[d], IPA 104
So we come to the end at last. A good gap as far as resonances go. Some decent voicing, considering the low frequency and
amplitude of the striations (considering it's utterance-final, quite good amplitude and duration of voicing during the closure). And a
noisy release burst with energy biased to the very high frequencies. Definitely looks like an alveolar burst.
With thanks and admiration. Happy Birthday, Peter! -- RH

"I hang those keys on a hook."

[aɪ], IPA 304 + 319
Well, that's how I transcribed it. I've been having fights with various bits of literature having to do with the nucleus of this (and
similar) diphthong(s). But I've noticed two things looking at this again. One, there's clearly a glottal 'attack' to this vowel/utterance,
as evidenced by the first two pulses being out of sync with the others, so I probably should have put a glottal stop at the
beginning. B (I just saw a rerun of Mad About You and Paul does this), this vowel clearly starts out more like a classic 'script a'.
The F1 couldn't be higher, so this vowel is as low as it gets--the F2 couldn't be lower, so this vowel is as back as it gets. The two
formants, straddling 1000 Hz like that are classic for 'script a', rather than [a] as traditionally transcribed in this diphthong. Okay,
so anyway, this is a diphthong, the F1 drops (the vowel rises) and the F2 rises (the vowel fronts). So this is [aɪ].
Hook-top H
[ɦ], IPA 147
Well, if I'm not trying to be literal about transcribing my diphthongs, I am trying to be literal about voicing, and the continuous
striations here indicate that this is voiced. On the other hand, it's very definitely fricative, and its spectrum is very clearly shaped by
the surrounding resonances. Which is pretty classic for [h], except this is voiced. In spite of the usual description, it's not unusual
to get a voiced [h], especially between vowels. For trivia value, I seem to be one of those speakers who allows [h] word-medially only
if it is initial to the stressed-syllable (as in pro[h]ibit, but pro[*h]ibition). Mostly. So this either has to be word initial, pre-stress, or
both. As it turns out ...
Ash
[æ], IPA 325
Well, once the frication subsides, we've once again got something quite low (the F1, though weak, is just a little lower than it was
in the preceding vowel nucleus), but very definitely front. Frankly, fronter (higher F2) than I think I usually produce this vowel.
Maybe I'm reacting against the general trend of centralizing this vowel, which I found (as others have reported) in California, and has
been demonstrated for some Canadian speakers.
Eng
[ŋ], IPA 119
So here we have a segment that's fully voiced, and apparently sonorant, but of greatly reduced amplitude compared to a real
vowel. It's got a nice little resonance just below 1500 Hz, which would make you think of [n], if you were used to my voice, except
that doesn't jibe with the serious velar pinch going on in the preceding transition. So this most likely is velar.
Eth
[ð], IPA 131
There's a sudden change at about 625 msec, suggesting the end of the preceding eng and the onset of something else. It's still
fully voiced, but is again of reduced amplitude compared to the nasal. So this might be some kind of voiced obstruent. And given
the slushiness of my stop closures, you might be tempted to suggest this is a plosive. But there's that funny F2 thing going on,
and the noise at the top of the spectrogram. So maybe this isn't plosive, but fricative. So it's a voiced fricative. There's not a lot of
useful transition information, I don't think, to tell us which. Let's rule out the sibilants due to weakness. [h] is possible, since there
seems to be some formant-like organization to the noise, but this looks nothing like the preceding [h]. Which leaves us with [v]
and Eth. Probably that's the best we can do at this point.
[oʊ], IPA 307 + 321
Well, it looks like there's a serious discontinuity around 775 msec, but since there's vowel on either side, I don't know what it could
possibly be, unless it has something to do with the sudden pitch change in this vowel. But the good thing is that the discontinuity,
whatever it is, clearly tells us that the vowel goes from basically mid, 500 Hz, to just a little higher (lower F1). The F2 starts sort of
neutral-to-lower than neutral, and gets lower. So this long vowel goes from mid and sort of back to higher than mid and definitely
back.
Lower-case S
[s], IPA 132
Now this is what a sibilant should look like, except it's a little short. Spectrally, this is a classic [s], with very broad band energy,
quite high in amplitude, and concentrated well above the 4000 Hz range, with very little formant-like organization.
Lower-case K + Right Superscript H

[kʰ], IPA 109 + 404
Okay, I know there was an aspiration mark in my transcription. I must have dropped it somewhere. Oops. Well, here's what I
meant by mushy. I don't think that moving noise around 1000 msec is really air moving anywhere. Maybe it's reverberation. But if
you ignore it, you can explain the double burst (approaching 1050 msec) as the release of a velar. A pretty front velar, judging from
the concentration of energy, not to mention the formants of the following vowel. So if this is a frontish velar plosive, it is definitely
voiceless and probably aspirated, having a VOT of something like, oh, 75 msec (max). This isn't the longest VOT one might see
for a velar aspirate in English, but it'll do. Don't ask me what happened to the aspiration mark.
Lower-case I
[i], IPA 301
Well, F1 as low as it ever seems to get for me, maybe 300 Hz or so, with an F2 well above 2000 Hz. That's about as high as my
F2 can go. So this is the highest, frontest vowel I can produce.
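The height and backness reasoning used for this [i] (low F1 means high vowel, high F2 means front vowel) can be written down as a rough rule of thumb. This is only a sketch: the 500 Hz "neutral" F1 is the figure used in these notes, but the 1500 Hz neutral F2, both margins, and the function name `describe_vowel` are assumptions made up for illustration, and real cutoffs depend on the speaker.

```python
# Reference points for a "neutral" vocal tract. The 500 Hz F1 figure is the
# one used in these notes; the 1500 Hz F2 figure is a textbook assumption.
NEUTRAL_F1_HZ = 500.0
NEUTRAL_F2_HZ = 1500.0


def describe_vowel(f1_hz, f2_hz, f1_margin_hz=100.0, f2_margin_hz=300.0):
    """Return a rough (height, backness) label from F1 and F2 in Hz.

    The margins are hypothetical cutoffs chosen for illustration only.
    """
    if f1_hz < NEUTRAL_F1_HZ - f1_margin_hz:
        height = "high"   # low F1 means a high (close) vowel
    elif f1_hz > NEUTRAL_F1_HZ + f1_margin_hz:
        height = "low"    # high F1 means a low (open) vowel
    else:
        height = "mid"

    if f2_hz > NEUTRAL_F2_HZ + f2_margin_hz:
        backness = "front"          # high F2
    elif f2_hz < NEUTRAL_F2_HZ - f2_margin_hz:
        backness = "back or round"  # low F2
    else:
        backness = "central"
    return height, backness


# F1 around 300 Hz with F2 above 2000 Hz, as in the [i] here.
print(describe_vowel(300, 2100))
# F1 and F2 straddling 1000 Hz, as in 'script a' (values are illustrative).
print(describe_vowel(900, 1100))
```

A label like "back or round" is deliberately left ambiguous, since a low F2 by itself doesn't separate backing from rounding.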
Lower-case Z
[z], IPA 133
Another sibilant, spectrally very similar to the previous [s], except if anything the noise is slightly broader band, extending all the
way down to the low frequencies. But this one is voiced. Barely. But there you go.
Script A
[ɑ], IPA 305
Well, look at that F1 and F2, as I said before, straddling about 1000 Hz.
Lower-case N
[n], IPA 116
This is the duration of a flap, and I'm not 100% sure why I decided it wasn't a nasal flap rather than the straight nasal I
transcribed. But it's definitely not a straight flap--it's got very clear resonant energy (formants) all the way up. But it's weak
compared to the flanking vowels, so it must be a nasal. Its shortness may tell us it's flappy, which makes it most likely
underlyingly alveolar.
Schwa
[ə], IPA 322
Short vowel. Actually, it looks just like the previous [ɑ] but it's short, and its F1 is more ambiguous. TMSAISTI. I'm a little
concerned that the F3 seems to split into the F3 and F4 of the following vowel (the intervening noise notwithstanding), but, as I'm
so fond of saying, oh well.
Hook-top H
[ɦ], IPA 147
Well, here's another one of these. Definitely noisy, though apparently voiced, and with energy clearly organized in the formant
pattern of the surrounding vowels.
Upsilon + Right Superscript Glottal Stop

[ʊ + ???], IPA 321 + ???
Okay, there's no such diacritic, and if there were, it should probably go on the following plosive. But here goes. We've got a vowel
somewhere in the mid-to-high range, with a very, very low F2, suggesting both back and round. It also gets creaky towards the
end, probably due to glottalization of the following stop--it's a little rough to just be the F0 drop.
Lower-case K
[k], IPA 109
Well, there's not a lot of velar pinch, although the F3 is definitely falling, and the F2 is definitely rising. But the rising F2 rules out
bilabial, and the falling F3 probably rules out alveolar. Which only leaves one option. And that would explain the apparently double
burst at about 1825 msec as well.
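The process of elimination used for this [k] can be sketched as a tiny function. It encodes only the two cues named in this entry (a rising F2 into the closure rules out bilabial, a falling F3 probably rules out alveolar). The function name and the string labels are hypothetical, and real transitions are rarely this clean.

```python
def eliminate_places(f2_into_closure, f3_into_closure):
    """Narrow down plosive place from the direction of F2 and F3 into the closure.

    Each argument is "rising", "falling", or "flat". Only the two cues from
    this entry are encoded; this is not a general place classifier.
    """
    candidates = ["bilabial", "alveolar", "velar"]
    if f2_into_closure == "rising":
        # Bilabials pull the formants down into the closure.
        candidates.remove("bilabial")
    if f3_into_closure == "falling":
        # A falling F3 probably rules out alveolar.
        candidates.remove("alveolar")
    return candidates


# The final plosive of "hook": rising F2, falling F3.
print(eliminate_places("rising", "falling"))
```

With flat transitions nothing is ruled out, which is the honest answer when there are no usable cues.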

Winnipeg, Manitoba
CANADA R3T 5V5
"I envy hibernation."
I'm not sure how grammatical this is in my idio/dialect. I did check and there are plenty of examples of "envy [abstract nominal]"
but the more usual thing is to "envy [sentient]" where [sentient] can do or experience something enviable. Discuss. (I'd prefer
'covet' over 'envy', but I'm not sure I can "covet [abstract nominal]", although I can clearly "covet [concrete nominal]". Hmm.) Or
more obviously "envy hibernating species/animals/individuals" in the sense of "envy [things (that hibernate)]".
[ʔ], IPA 113

Glottal Stop
I don't often mark initial glottal stops, but the first couple of pulses here are just so different from the more modal voicing that
happens later that I thought I had to do something. The irregularity in amplitude (shimmer) and (if it were more obviously
present) irregularity in timing (jitter) are usually correlates of creak, or glottalization, and can be attributed to a glottal attack to a
vowel-initial form. Since English doesn't have 'underlying' glottal stops.
[ɑɪ], IPA 305 + 319

I'm having a fight with one of my students about raising and shortening of diphthongs. Well, a fight, in that there's my view, and
there's the way his data are coming out. So I'm going to be careful with my diphthongs. So at the beginning here, we've got an
F1/F2 fighting for the same frequency but, due to the magic of coupling, are separated. So the F1 here is about as high as
possible, indicating something very low/open. F2 is about as low as possible (given the F1) indicating something very, very back
(and/or round). Since I so rarely have rounded low back vowels, I presume back. Then the vowel moves--the F1 drops sharply,
the F2 rises sharply into the front space. So we have what I call a low fronting diphthong, for lack of anything better to call it. (If
you think I'm going to get into falling and rising diphthongs at the same time as fronting, backing and rounding, you've never had
to explain diphthongs to somebody else.)
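The way the movement in this diphthong is read (F1 dropping means the vowel rises, F2 rising means the vowel fronts) can be sketched as follows. The function name, the 100 Hz minimum change, and the formant values in the example are all assumptions for illustration, not measurements from this spectrogram.

```python
def describe_movement(f1_start, f1_end, f2_start, f2_end, min_change_hz=100.0):
    """Describe diphthong movement from start and end formant values in Hz.

    min_change_hz is a hypothetical threshold below which a formant is
    treated as flat.
    """
    moves = []
    if f1_start - f1_end >= min_change_hz:
        moves.append("raising")              # F1 drops, the vowel rises
    elif f1_end - f1_start >= min_change_hz:
        moves.append("lowering")             # F1 rises, the vowel lowers
    if f2_end - f2_start >= min_change_hz:
        moves.append("fronting")             # F2 rises, the vowel fronts
    elif f2_start - f2_end >= min_change_hz:
        moves.append("backing or rounding")  # F2 falls
    return moves


# Illustrative values for a low back start moving up into the front space.
print(describe_movement(900, 400, 1100, 2000))
```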
[ʔ], IPA 113

Glottal Stop
So it's worth noticing from 300 to about 350/375 msec there's a sudden change in timing (although still vaguely regular, at the
very least there's a sudden drop in frequency of pulses), and a change in amplitude. If it were just the amplitude, I might suggest a
nasal, but the low frequency again suggests glottalization. So we're looking at another syllable (and probably word) that starts
with a vowel.
[ɛ], IPA 303

Epsilon
This vowel is hard to read, but the F1 is either around 700 Hz and lowers to about 500, or it starts even higher than that and
maybe doesn't move much (i.e. what I'm interpreting as some movement is just some weirdness happening within the broadish
bandwidth of this F1). But whatever, this is mid-to-low either way. The F2 is high and moving downward; that could be transitional,
or it could be inward (centralizing) movement typical of my lax vowels. So this is some kind of lowish frontish vowel. It's also
pretty short, considering the very high pitch excursion (so presumably lengthened under stress), so I'm inclined to treat this as an
epsilon rather than an ash.
[n], IPA 116

Lower-case N
Now this is a nice nasal, from 450 msec to almost 600 msec. Nice full voicing bar, but not quite as strong as with an obvious
vowel. Zeroes above the voicing bar (and between any visible formants), lower amplitude in the formants/poles than in the
obvious vowels, and flat formant structure. And a sharp change in amplitude on both sides. Can't ask for more nasal cues than
that. Okay, so ignoring the energy below 1000 Hz, the first real pole is around 1500 Hz, which is very typical of my alveolar
nasals. (My labial nasals typically have a more obvious pole closer to 1000 Hz, and my velar nasals usually show velar pinch in
the surrounding transitions.)
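The rule of thumb in this entry (pole near 1500 Hz for alveolar, nearer 1000 Hz for labial, pinch in the transitions for velar) can be sketched like this. The pole frequencies are the ones given for this one speaker; the 200 Hz tolerance and the function name are assumptions, so treat it as an illustration rather than a general classifier.

```python
def guess_nasal_place(pole_hz, velar_pinch=False, tolerance_hz=200.0):
    """Guess nasal place from the first real pole above the voicing bar.

    pole_hz: frequency of the first clear pole, ignoring energy below 1000 Hz.
    velar_pinch: True if F2 and F3 pinch together in the surrounding transitions.
    tolerance_hz: hypothetical slack around the speaker-specific targets.
    """
    if velar_pinch:
        # The pinch outranks the pole, as with the eng in "hang".
        return "velar"
    if abs(pole_hz - 1500.0) <= tolerance_hz:
        return "alveolar"
    if abs(pole_hz - 1000.0) <= tolerance_hz:
        return "bilabial"
    return "unclear"


print(guess_nasal_place(1500))
print(guess_nasal_place(1450, velar_pinch=True))
```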
[v], IPA 129

Lower-case V
The voicing never really leaves off, but there's a change in the amplitude again, just around 600 msec, suggesting a more
obstruent-y thing. The noise on both sides suggests a fricative (which ought to be noisy all the way through, but noise on both
edges suggests I'm at least trying to produce noise, as opposed to a nice clean closure of some kind). No sibilance, so probably
either labiodental or (inter)dental. The F2 transition into the following vowel is ambiguous, but the upper formants are all rising into
the following vowel--slightly more evidence of labial rather than dental. On the other hand, the clearly alveolar nasal might be
leading you in the other direction. But that would end up being wrong...
[i], IPA 301

Lower-case I
The F1 here is about as low as it ever gets, strongest well below 500 Hz, so we're talking about something quite high. Ignoring the
transitions, the F2 tops out at about 2100-2200 Hz or so, which typically can only be [i].
[ɦ], IPA 147

Hook-top H
There's some nice voicing, but all the frequencies above are noisy--even in the formants, the energy is snowy rather than nicely
striated. So we've got here something with clear formants, and noisy, but voiced. So all those descriptions of [h] as 'voiceless
vowel' sort of leave us flummoxed to describe this, a voiced, noisy vowel. But there it is. Call it a voiced [h] or breathy voice, or
whatever. But as commonly happens between vowels, [h] gets some voicing. Hence [ɦ].
[aɪ], IPA 304 + 319

Very, very short vowel, and moving. But this isn't really a schwa. Well, it could be, but it ain't. So if it ain't, what is it? F1 moves
from quite high to lower (so we've got something moving from low to high) and F2, such as it is, is moving from roughly neutral to
higher (so we've got something going central-to-front). So that's how I transcribed it. Notice the F2 starting frequencies, even in
the /h/, are nowhere near as low as they were in the first vowel of this utterance. So a central, low, fronting diphthong, rather than
a back, low, fronting diphthong. If you need that degree of specificity.
[b], IPA 102

Lower-case B
Voiced plosive (clear voicing, lowered amplitude, and nothing above the voicing bar). All formant transitions point down into it (i.e.
down into the plosive and rising out of it), so this must be a labial.
[ɹ̩], IPA 151 + 431

On the other hand, the F3 here is well below 2000 Hz, and seems to have a minimum extremum at about 1700 Hz. Can only be
/r/. And given its position between a plosive and a nasal, really can only be syllabic.
[n], IPA 116

Lower-case N
Another nasal. This one is a little harder to read than the previous one, since the pole is fainter, but it seems to start sort of high
and lower to about 1500 Hz into the following vowel. So this is likely another alveolar--nothing really to suggest anything else.
[eɪ], IPA 302 + 319

The F1 is sort of low again, so this is kind of high. The F2 goes from below 2000Hz to just above, so I think this is an [eɪ].
[ʃ], IPA 134

Esh
Nice, loud noise, very broad band. Some shaping into formants, but really too loud to be anything but a sibilant. The noise isn't
loudest in the highest frequencies, so unlikely to be [s], and dies sharply below the F2 range, typical of [ʃ].
[ə], IPA 322

Schwa
Nice short little vowel. Very short, considering it presumably is undergoing final lengthening. Also low energy and indistinct
formant-wise. So probably reduced, and therefore unstressed, and therefore not worth wasting a lot of time on figuring out.
[n], IPA 116

Lower-case N
Final nasal. Too long and strong to be an obstruent of any kind. The lack of transitions in the preceding vowel may suggest alveolar,
but then again it may not. But almost certainly a nasal rather than some kind of approximant, given the abrupt change in the
upper frequencies. But given the phonotactics and prosody of this last couple of syllables, really there's no choice to be made
here.
Last modified: 11/08/2009 22:57:55

"He plays oboe and clarinet."
First the segmental, then the prosodic.
[h], IPA 146

Lower-case H
So for about 50 msec 'around' 100 msec, if you follow, there's some noise at the bottom, then nothing, then some noise at about
2200 Hz, more or less, then some more at about 2600 Hz, more or less, then a little more at 3400 and so on. The noise at the
bottom isn't striated, as in periodic voicing, so this is actual noise, i.e. voicelessness, i.e. airflow through the glottis uninterrupted by
glottal pulsing. But the noise up above indicates an open vocal tract. So this is a glottal fricative. Note the energy up about 2000
Hz is more or less in line with the formants in the following vowel, as if (actually, there's no 'as if' about it) the resonances of the
vocal tract are being excited, not by periodic energy but by the noise. This is why [h] is often equated with 'voiceless vowel'.
Vowel, in the sense of an open vocal tract with resonances, but voiceless. And in English, in onsets.
[i], IPA 301

Lower-case I
So on to the vowel. F1 is low, sort of low enough to get lost in the voicing bar, so we're dealing with a high vowel. F2 is very high,
up around 2200 Hz, and really only [i] (and analogous glides) ever have an F2 that high.
[pʰ], IPA 101 + 404

So the gap starts around 200 msec, or slightly after, with a few pulses of perseverative voicing into the closure. Note right at 200
msec that F2 is pulling way, way down, all the way to about 1500 Hz, and F3 (and F4) are also pointed down into the closure.
When all the formants are pointing down like that, you're likely heading into a bilabial closure. The release seems to occur at
about 300 msec, and is followed by at least 50 msec of aspiration. So we've got an aspirated bilabial plosive going on here. That's
plenty of aspiration to count as aspirated, but it doesn't seem to be that long, all things considered. But as it turns out, it's not just
long enough to be aspirated, but lengthened.
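The VOT arithmetic behind calling this plosive aspirated is just the delay from release to the onset of voicing. A minimal sketch follows; the 30 msec cutoff is an assumption (the text only says that 50 msec is "plenty"), and the function names are made up for illustration.

```python
# Hypothetical cutoff between unaspirated and aspirated plosives. The text
# gives no threshold, only that about 50 msec is plenty to count as aspirated.
ASPIRATION_THRESHOLD_MS = 30.0


def vot_ms(release_ms, voicing_onset_ms):
    """Voice onset time: delay from the release burst to the start of voicing."""
    return voicing_onset_ms - release_ms


def is_aspirated(release_ms, voicing_onset_ms,
                 threshold_ms=ASPIRATION_THRESHOLD_MS):
    """True if the VOT meets or exceeds the (assumed) aspiration threshold."""
    return vot_ms(release_ms, voicing_onset_ms) >= threshold_ms


# This [pʰ]: release at about 300 msec, voicing starting about 50 msec later.
print(vot_ms(300.0, 350.0))
print(is_aspirated(300.0, 350.0))
```

Note that a single cutoff ignores place of articulation; velars typically run longer than bilabials, so a per-place threshold would be a natural refinement.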
[ɫ̥], IPA 209 + 402

Tilde L (Dark L) + Under-Ring
See how the F3 is raised coming out of the aspiration? How to account for that, you might ask? Why bother? Cuz that aspiration
is longer than we might expect, which suggests that there's something going on--either the approximant degree of closure
increases the duration of the noise, or the voicelessness spreads into the duration of the approximant segment, depending on
what kind of explanation you're looking for. So we can account for both the raised F3 and the relatively long aspiration by positing
an approximant. With a raised F3 like a lateral. Consistent also with the low F2 (dark/velarized English /l/, even in an onset).
[eɪ], IPA 302 + 319

So, just so we can all agree, the only really steady state portion of this is the F2 at its highest point, right? Okay, the low F2
transition is partly coming from the low F2 position of the approximant, but the fact that it doesn't immediately skip up, but sort of
moves in a straight line up to the steady-state position, is suggestive of an intermediate 'target', i.e. a frontish but not
as-front-as-the-glide Thing at the front of this vowel. The F1 is clearly at about 500 Hz, middish, and then suddenly it moves a little
lower. So we have something that is middish and front that becomes higher and fronter, i.e. a diphthong [eɪ].
[z], IPA 133

Lower-case Z
So the noise starts at about 425 msec, and there's 25 msec or so of voicing accompanying the noise. So this is probably a voiced
fricative. This also might explain the relative length of the preceding vowel, especially the offglide, but I don't know much about
that. The noise forms a single broad band that seems to get stronger as it goes up in frequency. Even though it seems to die off
below the F2, it doesn't die away completely below the F2, and there's no accompanying 'pole' or amplitude band in the F2/F3
range. Had they been there, it would suggest a post-alveolar, but since they're not, we're talking alveolar.
[ʔ], IPA 113

Glottal Stop
There's a looong duration between the offset of the noise, at about 600 msec and the onset of regular voicing, at almost 700
msec. Too long to just have nothing, and anyway, it's not nothing. It's a buncha creaky voice. Creaky voice, especially at the
beginning of a vowel like that, is usually a correlate of glottalization, and hence glottal stop. Technically it's not a stop, and
eventually I'll get weird about transcribing creaky voice instead of glottal stop the way I've gotten weird about [ɐ] rather than [ʌ].
But anyway, it's the beginning of a phrase of some kind.
[o], IPA 307

Lower-case O
Diphthongized? Dunno. But it doesn't have a clear 'moment' when it changes, as opposed to just a slow and steady slide into the
closure. So I didn't transcribe it that way. I could be wrong. Judg(e)ment call. Anyway, look at the F1. Mid mid mid. F2, low. Mid
and back/round.
[b], IPA 102

Lower-case B
Nice longish gap with a nice clear voicing bar at the bottom. Voiced plosive. Transitions into it are F1-falling (consistent with
closure), F2-falling (consistent with a bilabial), and F3 and F4 just sitting there. On the opposite side, they're the opposite. So we
have a fairly good cue for bilabial and nothing pointing specifically anywhere else. Bilabial.
[oʊ], IPA 307 + 321

Longish stretch of vowel, with a funny dip in the F2 between 1050 and 1100 msec. So I think from 950 msec to 1150 msec, we've
got three things to consider. F1 is pretty mid throughout, flat and not particularly interesting, except maybe as the amplitude
changes a little. F2 starts low as before, so the first bit of this is clearly [o] again. Then there's a dip in the F2, almost as if the lips
started to close or tighten or something. This is what I'm counting as the off-glide. It may not be an offglide so much as a
transitional thing between the [o] and the moving bit that follows, but whatever. For just a moment, it gets more round. Or maybe
back. The loss of the F3 I think points to round--smaller apertures just transmit less, so I think rounding accounts for the total drop
in amplitude. And the weirdness in F4, which I just don't want to think about.
[ə], IPA 322

Schwa
So the last bit, from 1100 to 1150 msec or so. It's really short, and the F3 is moving throughout. So let's just call it a schwa and
move on.
[nd], IPA 116 + 104

50 msec or so of strong, sonorant voicing (compare the relatively strong amplitude of voicing here with that during the previous [b]). Not
much happening in terms of resonance until we get up to 2500 Hz, but there it is, a resonance. So we've definitely got something
sonorant. Probably nasal, given the zeroes (absence of energy). The F3 transition into it is hanging there. F4 I'm going to continue
to ignore. F2 is rising through the schwa. So we're not getting any bilabial cues, and not much in the way of velar cues. So this is
probably alveolar. It would be nice if there were a resonance at F2, but oh well. The mushy release burst is an oral release, so, in
my increasing IPA weirdness, I'm now transcribing these as orally-released nasals, but this is just standard for my homorganic
nasal-stop sequences. So this is an underlying /nd/.
[kʰ], IPA 109 + 404

There's a funny, broad, high amplitude clunk here, which I'd be tempted to ignore or declare a closure transient, since it's another
25 msec or so before we get any decent release noise going. But then there's a double-bursty looking thing in F2 at the same
moment. Double bursts are usually associated with velar releases but then I couldn't explain the delay in the noise. So I'm going
to suggest that this is a rare velar closure transient, consistent with a short delay between closure(s) on each side of the velum
(with the uvula in the middle), which is my (admittedly minority) explanation of the prevalence of double bursts with velars. So this
is velar. Although there's nothing really useful in terms of transition on either side. Aspirated, of course. Just look at that VOT.
[ɫ̥], IPA 209 + 402

But once again, there's something weird happening in the upper formants, and an otherwise unexplained low F2 during all that
VOT noise to be accounted for. So again I'm jamming in an approximant /l/ and moving on.
Egad this is a long spectrogram. Pushing on: There's a funny discontinuity at about 1425 msec, which I take to be the 'moment' if
there is one, when the tongue approaches the minimum coming up. So this stretch from 1375 to 1575 msec (or so) I'm dividing
into three bits again. The first bit, before the discontinuity, the bit around the F3 minimum, and the bit after.
[ɛ], IPA 303

Epsilon
Okay, vowel. F1 is mid or a little bit high of mid, so we're dealing with a middish or lower-middish kind of vowel. F2 is neutral, but
not nearly as low as with the [o]s previous, so I'm going to declare this central/frontish. F3 I'm going to ignore even though it's flat
because it's heading somewhere. So we've got something not at all back or round (by virtue of having been declared
central/frontish--Peter Ladefoged always said you had to know what you were looking at before you could look at it, and this kind
of circular reasoning is what he meant). And mid or mid-low. Admittedly, there's a couple of English vowels down in that part of the
vowel space but since there's some serious coarticulatory (or allophonic) issues with the upcoming segment, it hardly matters
which you pick....
[ɹ], IPA 151

Turned R
So the critical thing here is to notice the F3 is way low. Not as low as it sometimes gets, but way low. Really, the only thing that
pulls F3 down like that is /r/.
[ɨ], IPA 317

Barred I
Another short, moving vowel. Moving on.
[n], IPA 116

Lower-case N
Okay, this is a more canonical looking nasal. Not quite 100 msec around 1600 msec. Look at those edges. Look at that nice
voicing bar. Those fabulous zeroes. And best of all, there's just a hint of resonance at 1500 Hz or so, telling us that this is an
alveolar nasal. Also the transitions are consistent with that, but that 1500 Hz resonance is just beautiful.
I've been looking at these things too long.
[ɛ], IPA 303

Epsilon
F1 is relatively high, so we're dealing with something lowish. F2 is a little above neutral, so this is front. So we're dealing with a
lowish frontish vowel. This one is the stressed one, so it can be a little lower and fronter compared with the previous [ɛ], and it
doesn't have the same contextual difficulty. So again, we've got two vowels down in the lower fronter part of the space, and here
it matters which one it is. Although how you'd tell the difference I'm not sure at this point. Hmm. Anyway, there's some
glottalization at the end of this, which is partly phrase-final low pitch ...
[t], IPA 103

Lower-case T
... and partly allophonic glottalization of this consonant. So what we can see of the transitions during the glottalization is consistent
with alveolar closure, which is lucky because alveolars glottalize more than velars and bilabials. The release noise is 'tilted' toward
the high frequencies, like a sibilant (well, like [s]) which is consistent with an alveolar release. I'd expect a velar release to have
more noise in the F2/F3 range, and a bilabial release to not have that very high frequency noise component to it. Usually. So on
the balance, almost undoubtedly alveolar.
So let's talk prosody. This utterance didn't work out quite the way I expected.
First the break indices. Zeroes mark the ends of reduced words, think 'clitic' although that word is loaded. Ones mark the ends of
prosodic words. Anything 1 or above should have a lexical (*) accent. Twos are for something in between word and phrase, either
that has a boundary tone but doesn't otherwise exhibit the features of a boundary, or as in this case something that seems to
have the timing of a boundary but doesn't seem to have a tonal mark. Threes, if there were any, would mark phrase boundaries,
and the four the utterance boundary. Or at least that's what they're doing here.
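The break-index scheme just described can be written down as a small lookup table plus one consistency check. This is only a sketch of the conventions as they are used on this page, not official ToBI; the names BREAK_INDEX and missing_accents and the sample labels are made up for illustration.

```python
# Break indices as described in the text above (this page's usage, not official ToBI).
BREAK_INDEX = {
    0: "end of a reduced (clitic-like) word",
    1: "end of a prosodic word",
    2: "between word and phrase: boundary tone or boundary timing, but not both",
    3: "phrase boundary",
    4: "utterance boundary",
}


def missing_accents(words):
    """Return the words that end in a break index of 1 or more but carry no * accent.

    `words` is a list of (word, break_index, accent) tuples, where accent is a
    string such as "H*" or None.
    """
    return [word for (word, bi, accent) in words if bi >= 1 and not accent]


# Hypothetical labels, purely for illustration.
sample = [("the", 0, None), ("shiny", 1, "H*"), ("things", 1, None)]
print(missing_accents(sample))
```

A check like this only flags candidates; a deaccented word may legitimately end up on the list.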
Tonewise, there are two clear highs. Whether they're H*s or H*+L I'm not 100% sure, so I just marked them as H*. Probably the
first one should be an H*+L, since otherwise there's no real reason for the low pitch on the following syllable. Unless you think
that's really a 3], which is possible. The last H* is placed on the stressed syllable, which happens also to be final in the utterance,
so it's getting kind of squished in with the low boundary tone.
That 2] was a compromise in my head. It felt (and sounded) like there was some lengthening there, but there didn't seem to be a
phrasal tone. It looks like the H* there is deaccented, and I don't know how to deal with this in ToBI. Suggestions appreciated. But
following (my) ToBI rules, the lexical word there is supposed to get some kind of mark. Again, I could be wrong, as far as real
ToBI goes.
I obviously need to brush up on my ToBI. So this'll be it for the intonation for a while.
Last modified: 11/08/2009 22:57:49

CANADA R3T 5V5
"Some find shiny things here."
This month's high-pedagogy mode spectrogram is heavy with fricatives (hence the extended frequency scale) and nasals. So pay
attention. Things aren't going to stay this "easy" for long...
[s], IPA 132

Lower-case S
So from about 75 msec to 200 msec, there's a voiceless fricative. Note the absence of any voicing striations at the bottom, and
the 'snowy' 'random' noise. The noise seems to be composed of a single wide, wide band, with no (or at least very little) formant-
like 'shaping'. The band seems to be centered up above the top of this spectrogram, which goes up to about 8000 Hz. If we see
this as a single huge band, centered up there somewhere, then what we're seeing is the bottom half of the bell curve--the greatest
energy near the putative center, and falling away fairly quickly, but with a long tail, all the way down to the very low frequencies.
Noise that loud (at its center) is characteristic of sibilants. So compare the energy here (and also in the segment from 750-900
msec, and even 1500-1600 msec) with the noise in the 400-500 msec segment, and the one around 1200 msec. That's sibilance,
folks. Very high frequency, and very high amplitude, noise. Centered up above 8000 like this is characteristic of alveolars, so this
is [s].
[ɐ], IPA 324

Turned A
From 200 to not quite 300 msecs, we've got voicing (note the 'voicing bar' between 100-200 Hz or so, consistent with my
probable F0/fundamental/first harmonic). The F1 is up around 800 Hz, the F2 just above it around 1200 Hz, and the F3, hmm, I
guess around 3700 Hz. But it's kind of fuzzy. Luckily for vowels we don't usually consider F3.... So an F1 that high must be a
relatively low vowel. An F2 that low should be back (or round), and I recall this was backer than I usually produce this vowel.
Probably should have transcribed it as back, but I was in a hurry. This one is closer to my (now) preferred phonemic transcription
for this vowel, which you'll understand once you've found this word. So it's down there and not at all front. That's the main
thing.
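The F1-to-height and F2-to-backness reasoning above can be sketched as a tiny function. The neutral values are the ones given elsewhere on this page for this one voice (F1 around 500 Hz, F2 around 1500-1600 Hz); the 150 Hz tolerance band and the name describe_vowel are assumptions made purely for illustration.

```python
NEUTRAL_F1 = 500.0   # Hz; neutral F1 quoted on this page for this voice
NEUTRAL_F2 = 1550.0  # Hz; midpoint of the quoted 1500-1600 Hz neutral range
BAND = 150.0         # Hz; arbitrary tolerance around neutral (an assumption)


def describe_vowel(f1, f2):
    """Map F1/F2 (Hz) onto rough height and backness labels.

    High F1 means a LOW vowel; high F2 means a FRONT vowel.
    """
    if f1 > NEUTRAL_F1 + BAND:
        height = "low"
    elif f1 < NEUTRAL_F1 - BAND:
        height = "high"
    else:
        height = "mid"
    if f2 > NEUTRAL_F2 + BAND:
        backness = "front"
    elif f2 < NEUTRAL_F2 - BAND:
        backness = "back"
    else:
        backness = "central"
    return height, backness


# The vowel discussed above: F1 around 800 Hz, F2 around 1200 Hz.
print(describe_vowel(800, 1200))
```

Hard thresholds like these are much cruder than reading the spectrogram, and rounding lowers F2 too, so "back" here really means "back or round".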
[m], IPA 114

Lower-case M
Then at 300 msec the amplitude suddenly dies off. Where F1 was, suddenly there's nothing, the thing that looks like it used to be
the F2 is now weaker, and all the way up there's just less energy. Typical of nasals (if you've been through any acoustic
phonetics, you know that side cavities suck energy out of spectra--as antiresonances--rather than adding energy to it). So this
almost has to be a nasal. Note the transitions in the previous vowel. F4 falls sharply; F3 isn't really doing much, or if it is, it's
interrupted by the zero. F2 starts out low and if anything falls. F1 always falls into closure, so that's not really indicative of
anything. So we have mostly falling formants, usually correlated with bilabiality. And if you know my voice, you know my nasal
pole (the formant above the lowest zero) is usually around 1000 Hz, where it's closer to 1300-1500 Hz for an alveolar. So
everything points to bilabial, or at least away from anything else.
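As a rough sketch, the nasal-pole rule of thumb can be coded as a lookup. The centre values (about 1000 Hz for bilabial, 1300-1500 Hz for alveolar) are the ones quoted above for this one voice; the extra 100 Hz of slack on each side and the function name are assumptions.

```python
def nasal_place_from_pole(pole_hz):
    """Guess nasal place from the nasal pole frequency (Hz), for this one speaker.

    The ranges follow the remarks in the text; the 100 Hz of slack on
    either side is an assumption added for illustration.
    """
    if 900 <= pole_hz <= 1100:
        return "bilabial"
    if 1200 <= pole_hz <= 1600:
        return "alveolar"
    return "unclear: fall back on the formant transitions"


print(nasal_place_from_pole(1000))
print(nasal_place_from_pole(1500))
```

Anything outside those bands is deliberately left undecided, since a weak or missing pole means the transitions have to carry the argument.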
[f], IPA 128

Lower-case F
Another fricative. This one is much, much weaker overall than the sibilant (it's hard to tell that from the 'below the dotted line'
frequencies, unless you have a lot of experience with this sort of thing, but then that's why I've provided 'above-the-dotted-line'
frequencies in this spectrogram). If you look, it's not obvious it's stronger anywhere up above 8000 Hz. While the noise in the [s]
was distributed in sort of a curve, this noise is sort of flat across most frequencies. There's some shaping into formants in F4, I
guess, but for the most part, this looks like a non-sibilant fricative that doesn't have any formant 'shaping' to it. That suggests it's
produced in front of any useful resonating cavities, so if this is English, it's probably labiodental or (inter)dental. The F4 seems to
be falling, which is sort of unaccountable. F3 might be flat. But notice that the F2 starts, if anything, below 1000 Hz and rises. Now
it rises sharply throughout the following vowel, but there's nothing like anything but a labial transition into the following vowel. So
this might be a labial. I suppose it could be something else, but labial is probably the best guess for now.
[ɑɪ], IPA 305 + 319

So, abstracting away from the first 25-30 msec following the 500 msec mark (which I take to be mostly transitional), we've got an
F1 that reaches its steady state (if you believe in steady states, or its maximum if you don't) at about 800-900 Hz. So it starts very
low, but starts to transition towards the higher space (lower F1 frequency--try to keep 'vowel height' and 'formant frequency'
straight in your heads at all times in these discussions) in the last half of the vowel (from before 600 msec to the vowel end at
about 650 msec). When the F1 is 'steady' at its maximum, F2 is transitioning, but is still quite low, so at the beginning of this
vowel, it's pretty low and quite back. The F2 rises up to almost 1900 Hz and then suddenly transitions sharply down (to about
1750 Hz) in the last 25-30 msec of the vowel, which again I take to be transitional. So where the F2 reaches its maximum, near
650 msec, the F1 is around 500 Hz, so sort of mid. Note the F1 is getting fuzzy in the second half of this vowel. This will be
important later. So this is a diphthong. The nucleus is low and back, and the offglide is toward the high front (as opposed to the
higher-back) space. The transcription reflects the 'reality' of the nucleus, but only the 'direction' of the offglide, which is sort of
combining transcription conventions. So I'm explaining it here.
[nd], IPA 116 + 104

Again, making up symbols on the fly will require explanation. But work with me. We've got another nasal here. See the sudden
drop of amplitude? See the zero? See how there's no energy right at 1000 Hz, but some up at 1500 Hz or so? Must be alveolar.
This is consistent with the transitional information. Even though the F2 is heading down, it's pointed at 1750 Hz or so, which is
generally the 'locus' (if you believe in loci) of alveolar transitions. Somewhere in there. If this were a velar, there'd be more
evidence of 'velar pinch' in the approaching transitions, and a bilabial would have a sharper fall (one presumes) in F2, and
something like falling transitions in the upper formants. So everything points to an alveolar nasal. The fuzziness in F1 in the
previous vowel I noted before is a sign of nasalization on the vowel. In spectrograms, I rarely mark contextual nasalization of
vowels, unless a) it's really, really obvious (with creeping zeroes and whatnot) and/or b) the following nasal stop isn't obvious.
This is, so English phonology being what it is, the vowel must nasalize. Compare my decision to mark rhoticity later. The 'right
superscript d' diacritic is my ad hoc way of marking oral plosion. There's no real 'oral stop' phase to this, unless you count the last
10-15 msec or something right before the onset of the noise. If it ain't long enough to segment out, I'm not wasting a lot of time
trying to. So I've just marked an oral release rather than a separate segment. Take that for whatever it's worth, which I don't
expect is much. Anyway, this is how homorganic nasal-stop coda clusters seem to look in my voice.
[ʃ], IPA 134

Esh
Now this is obviously another fricative. But while the initial [s] in this spectrogram was 'tilted' toward the high frequencies, this one
is much flatter. Still very broad band, and very high amplitude, so we're still talking sibilant. Which pretty much just leaves [ʃ], but
let's suppose we didn't have an [s] to compare to. We could still identify this, a) because it isn't tilted toward the high frequencies,
b) it's much stronger in the F2/F3 region than a typical [s], and c) just below the F2 region, the amplitude suddenly drops off really
sharply. All of which point to [ʃ]. The noise at the bottom is pretty noisy. It's not striated into (fairly) clean vertical pulses. So this is
voiceless.
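The fricative reasoning so far (loud versus weak, tilted versus flat) boils down to a short decision rule. The sketch below takes two yes/no impressions you would form by eye; the function name and its inputs are hypothetical, and the rule covers only the cases discussed up to this point.

```python
def classify_fricative(loud, tilted_to_high_freqs):
    """Rule-of-thumb fricative label from two yes/no spectrogram impressions.

    loud: is the noise very high in amplitude (sibilant-like)?
    tilted_to_high_freqs: is the energy concentrated toward the top of the
        spectrogram rather than spread flat across the frequencies?
    """
    if loud:
        # Sibilants: [s] is tilted toward the high frequencies, while the
        # other sibilant is flatter, with more energy in the F2/F3 region.
        return "s" if tilted_to_high_freqs else "\u0283"
    # Weak, unshaped noise: produced in front of any useful resonating cavity.
    return "labiodental or (inter)dental; check the transitions"


print(classify_fricative(True, True))
print(classify_fricative(True, False))
print(classify_fricative(False, False))
```

For the weak fricatives the tilt is ignored on purpose: telling labiodental from interdental takes the transitions, not the noise.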
[ɑɪ], IPA 305 + 319

Well, it's back to formants for this stretch between 900-1000 msec. F1 is kind of fuzzy, but it seems to be centered between 750
and 800 Hz. F2 is also kind of fuzzy, but it seems to start around 1300 Hz. At about 950 msec, the fuzziness starts to leave off,
slightly, and the F1 may be dropping slightly. F2 is clearly rising. So we've got a vowel that moves from fairly low and more back
than central to something slightly higher and definitely front. I really only have two fronting diphthongs (under normal
circumstances) and only one starts anything like low.
[n], IPA 116

Lower-case N
So here's our next nasal. It's really short, just about 50 msec, but oh well. See the zero? See what would have been the F1 die?
Now, where's the pole? Well, it's not at 1000 Hz. It's not at 1500 Hz. There's definitely something at 2500 Hz or so, but that's not
what we're looking for. So our pole is weak, and we need other cues. So let's look at the transitions. The F2 transition in particular
seems to point down into the closure, but to somewhere around 1700-1800 Hz. Alveolar locus. So that's our best cue. And the
shortness is sort of consistent with that--it's verging on flapping. TMSAISTI.
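To make "the F2 transition points to somewhere around 1700-1800 Hz" concrete, here is a minimal sketch that extends a straight line through two F2 measurements to the moment of closure and compares the result with the alveolar locus of roughly 1750 Hz mentioned above. The measurements, the 100 Hz tolerance and the function names are all made up for illustration.

```python
def extrapolate_f2(t1, f2_1, t2, f2_2, t_closure):
    """Extend a straight line through two F2 measurements to the closure time.

    Times in msec, frequencies in Hz. Real transitions are usually curved,
    so a straight line is only a crude first guess.
    """
    slope = (f2_2 - f2_1) / (t2 - t1)
    return f2_2 + slope * (t_closure - t2)


def looks_alveolar(f2_at_closure, locus=1750.0, tolerance=100.0):
    """True if the extrapolated F2 lands near the alveolar locus."""
    return abs(f2_at_closure - locus) <= tolerance


# Hypothetical measurements: F2 falling from 1900 Hz to 1825 Hz over 20 msec,
# with the closure 20 msec after the second measurement.
target = extrapolate_f2(960, 1900, 980, 1825, 1000)
print(target, looks_alveolar(target))
```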
[i], IPA 301

Lower-case I
F1 is fairly low, certainly lower than we've seen anywhere else, so we're dealing with a fairly high vowel. F2 is way the heck up
above 2200 Hz, which is about as high as I've ever seen my F2. Which tells us this is massively front. So we've got a high,
outrageously front vowel.
[θ], IPA 130
Theta
Another fricative, probably voiceless, from just before 1200 msec and lasting about 100 msec. The noise in the lower (normally
visible) frequencies is very light. The noise that we can see is tilted to the high frequencies, but even up there it's not loud enough
to be sibilant. So this ain't sibilant. It isn't shaped like a vowel (i.e. with noise running through the vocal tract and filtered by
resonances) so it ain't [h]. So that leaves the labiodental and interdental again. So now it's time to compare the transitions with
the previous non-sibilant. The F2 in the preceding vowel is too high to do anything but fall but the F2 coming out of the fricative is
just flat. And it's even near the alveolar locus. Now look at that low F2 coming out (but rising) of the [f]. It definitely starts lower
than it 'needs' to. Where this one has room to start lower if it wanted to. So it doesn't. Probably interdental.
[ɪ], IPA 319

Small Capital I
Now here's a vowel. F1 is hard to read, but it's definitely not very high. And it's not as low as the previous vowel. So we're talking
high, but not highest. F2 is quite high, but under 2000 Hz, so it's very front, but not nearly as front as the previous vowel. So if that
was [i], we need to find something not quite as high and not quite as front. Could be [e], but, well, it's not.
[ŋ], IPA 119

Eng
So we've got another nasal. This one is quite long, and definitely doesn't have a pole at 1000 or 1500 Hz. It also has a pole
above 2000 Hz, like the last one, but it's looking like they all do. So all we have going for us is that pinchy transition into it. Look at
that F2 and F3 come together! Doesn't get any pinchier than that.
[z], IPA 133

Lower-case Z
But then there's a discontinuity. An alveolar looking pole kind of comes in just before 1500 msec. The voicing also kind of dies
away for a bit, but doesn't go away completely. It's atypical, but the nasal resonance changes around this moment too, which
suggests something odd is happening with the coordination of my soft palate raising and the alveolar closure. Which isn't
technically a closure since we're dealing with a fricative. I've tried slowing down my [ŋz] sequences and I do get some nasality
over the frication. As improbable as that sounds. Anyway, it's good that I increased the visible frequencies for this spectrogram,
so you can see what's really going on with the noise. The noise is quite high amplitude, but mostly at the highest frequencies.
That is, it's obviously sibilant, tilted to the high frequencies, but less loud than the initial [s], so the noise dies off in the 'normally
visible' frequencies much faster than for the voiceless [s].
[h], IPA 146

Lower-case H
The voicing never quite dies off, but since there's no evidence of striated formant stuff, I decided not to bother transcribing this as
voiced. This in spite of the apparent resonances that creep in really early. It's nice though, because you can see the transitions
into the high, front tongue position, but excited by noise rather than voicing.
[i ˞], IPA 301 + 419

I don't usually mark rhoticity on vowels, since Keating et al. (1994) suggested it was entirely redundant. But I needed something
to indicate/account for the movement in the fully voiced section of the vowel. So actually the clear high, front position is hit really
early, during the noise, and by the time full voicing and resonance kicks on (where I've marked the 'beginning' of the vowel) the
F3 is already transitioning to its final position (more below) dragging the F2 along with it. So even though it's moving, it was still
clearly an [i] target. In the center of the vowel the F2 is closer to [ɪ]. But this is a rhotacized [i], not an allophonic [ɪ] selected before
an [ɹ]. TMSAISTI.
[ɹ], IPA 151

Turned R
So following the transitions, that blob around 1750 Hz is both F2 and F3. An F3 that low can only be a [ɹ]. Nuff said?
So remember how these fricatives really look, so that when you can only see them up to 4000 Hz you can hypothesize how they
are supposed to look.
Last modified: 11/08/2009 22:57:48

Solution for December 2002
Not
"Silk flowers have no smell."
Segmental cues

We've got a fair bit of noise. It's not remarkably loud, so I guess I won't go on too long about how much louder sibilant noise is
than channel noise. But starting about 100 msec and continuing to just past 250 msec, there's frication. Notice the 'profile'.
There's some noise at every visible frequency (more as the fricative goes on), but the noise is definitely concentrated off the top
of the spectrogram. This is absolutely characteristic of /s/, i.e. alveolar (coronal) fricative, and it's due in part to the jet of air being
blown against the upper teeth. Sibilant fricatives are all about jet management. Remember that, next time you're haranguing a six-
year-old. ;-)
Small Capital I, IPA 319

SIL [I], Unicode [ɪ]
This vowel is tough, and I probably shouldn't have transcribed it so directly. There's a heckuva lot of movement here, due in part
to the following consonant. The F1 is just a little low of neutral (neutral for F1 is somewhere around 500 Hz for my voice). So this
vowel is a little high of mid. I decided this because there seem to be something like four strong harmonics, at roughly 200, 400,
600 and 800 Hz. Notice that the bottom two are stronger and smear together more. So I'm assuming the peak of the filter function
(the first resonance) is somewhere between the two. The peak of the filter function gives the most boost to the underlying
harmonics, and does the most smearing. Which is why I think the peak is low of mid rather than somewhere else. The F2 starts
just a tad high of the neutral F2 range (1500-1600 Hz) or so, but falls rapidly. Since this doesn't look like a backing diphthong (i.e.
[aw]), I'd say this was transitional. Transitional to what we'll discuss later. At that left edge of the vowel, the F1 indicates a higher-
than-mid vowel, and a slightly front vowel, with a basically neutral F3. So what vowel is slightly high and slightly front? Well,
something like small-cap I.
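The harmonic argument above can be sketched numerically: harmonics sit at whole-number multiples of the pitch, and the formant peak is guessed to lie where the strongest harmonics are. The relative amplitudes below are invented (nothing was measured), and the amplitude-weighted mean is just one crude way to put a number on "somewhere between the two".

```python
def harmonics(f0, count):
    """Frequencies (Hz) of the first `count` harmonics of a voice with pitch f0."""
    return [k * f0 for k in range(1, count + 1)]


def weighted_peak(freqs, amplitudes):
    """Amplitude-weighted mean frequency: a crude stand-in for the filter peak."""
    total = sum(amplitudes)
    return sum(f * a for f, a in zip(freqs, amplitudes)) / total


freqs = harmonics(200, 4)  # 200, 400, 600, 800 Hz, as in the text
# Made-up relative amplitudes: the bottom two harmonics are the strongest.
amps = [1.0, 1.0, 0.4, 0.1]
print(freqs, weighted_peak(freqs[:2], amps[:2]))
```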

SIL [lò], Unicode [ɫ]
Segmenting off this bit depends on noticing that the harmonics above 1500 Hz or so die off, but the voicing and lower harmonics
(supported by F1 and F2) stick around. So something is happening here which isn't happening in the 'vowel' segment earlier. That
might be just voice-quality or pitch change zapping the upper frequencies, but it's better to assume that there may be something
here and be wrong than to assume there isn't. Just cuz disproving a positive is easier than proving a negative. Okay, so if there's
something going on here, what is it? The F1 is low. The F2 is low, but that F3 is outrageously high. /l/ in my dialect is pretty much
always dark (low F2), and is almost always accompanied by a high F3 (or occasionally an F3 that looks neutral and a high F4).
Low F2, high F3, must be dark /l/.

There's almost no evidence of a velar here. I'm looking at a gap from about 400 msec or so to about 480 msec or so. But look at
that. There's a burst amid some light noise, and then right after 500 msec there's another clunk in the same range. Could just be
a clunk, but that timing, 30 msec or so between clunks, just looks like a double-burst, characteristic of velars (and occasionally
laminal coronals). Which is pretty much the only evidence that this could be velar. There's no transitional information available in
the preceding voiced bit, and the fricative noise following is really unhelpful. But that double burst. And even that's not really
great looking, since it seems to be in the F1 range, and we really like the burst of a velar to be in the F2 and F3 range. But, here's
where you have to consider context. So we'll go on to the following context, and then come back to this.
Lower-case F, IPA 128

SIL [f], Unicode [f]
Noise, so a fricative. Pretty clearly voiceless, so voiceless. Very strong. And apparently concentrated in the upper frequencies, but
much less obviously than the initial /s/ in this utterance. Note the absolute lack of any frequency-continuity with the formants.
Compare that with the /s/, where there's not a lot, but at least you could argue that the noise is sort of close to where the F2 and
F3 transitions are. But there's just no continuity here. So I'd say this fricative is unfiltered, i.e. the noise has to be coming from the
front (lip-end) of a mostly closed vocal tract. Otherwise, some cavity somewhere should be offering a resonance. And it's not.
There being no bilabial fricatives (normally) in my speech, the next best choice is something dental. Either labio- or inter-.
Okay, so if we say interdental, we might say the double burst is due to laminality. But I have mis-spoken. It's not laminality per se,
but the length of the closure along the hard palate. And the upper teeth don't provide that kind of length. So let's say labiodental.
Then we could have a velar stop which would explain the double burst, and we could have the low-frequency burst, since it has to
resonate through the approach to the labial 'closure'. Voilà, everybody's happy.

Remember what I said before about low F2, and high F3 OR low F2, neutral F3 and a high F4? Well, here's the high F4 part of
that. F3 is just dead flat through this vowel, but the F4 starts up somewhere around 3999 Hz! That's just freaky. It's also
uncontestably high. Dark /l/.
Lower-case A, IPA 304

SIL [a], Unicode [a]
Had some trouble deciding how to transcribe this one. I've got intuitions about what the structure of this word is, and none of them
are borne out by this spectrogram. So, in the end, I did the following. I abstracted away from the /l/ transitions, so I started looking
at about 625 msec or so. Then I notice there's sort of a 'hump' in F1 and F2 between 625 and about 725. After 725 there's
something else going on. There's very little evidence of diphthongization in this stretch, depending on what you think is going on
after 725 msec. Okay, so we've got something that's quite low. It's strikingly back, though not as back (or round) as either the /l/ or
the following bit. So I chose /a/. Which should technically be used for a front vowel. But this vowel just doesn't feel back. So I
cheated.
Lower-case W, IPA 170

SIL [w], Unicode [w]
The F1 is ambiguous here, but the F2 definitely has a dip about 750 or so. So something not at all low (but not particularly high)
and backish/roundish. Fully voiced and sonorant, by the way. Non-nasal (no zeroes). And back. Could be another /l/, although it
would be nice if the F3 weren't taking a nosedive and F4 were visible. Then I notice the following transition (from about 775 msec
to the F2 offset at 850 msec or so). It's straight. Transitions are usually curved, i.e. the velocity of the movement slows as it
approaches the target. The only thing that I know of that has consistently straight, slow F2 transitions is /w/ across a following
vowel. Good guess. Who knows if it's right?
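One way to put a number on "straight versus curved" is to measure how far an F2 track strays from the straight line joining its endpoints. This is a sketch only: both tracks below are invented, and the function name is hypothetical.

```python
def max_deviation_from_line(track):
    """Largest distance (Hz) between an F2 track and the line joining its ends.

    `track` is a list of (time_msec, f2_hz) points in time order.
    A value near zero means the transition is essentially straight.
    """
    (t_start, f_start), (t_end, f_end) = track[0], track[-1]
    slope = (f_end - f_start) / (t_end - t_start)
    return max(abs(f - (f_start + slope * (t - t_start))) for t, f in track)


# Hypothetical tracks, both rising from 800 Hz to 1400 Hz over 75 msec.
straight = [(775, 800), (800, 1000), (825, 1200), (850, 1400)]
curved = [(775, 800), (800, 1150), (825, 1320), (850, 1400)]
print(max_deviation_from_line(straight), max_deviation_from_line(curved))
```

The curved track moves fast at first and slows as it nears its target, which is the usual shape; the straight one keeps a constant velocity, as described above for /w/.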

SIL [¨`], Unicode [ɹ ̩]
Low F3, reaches a minimum just about 800-825 msec, around 1800 Hz or so. Nuff sed?
Lower-case Z + Under Ring, IPA 133 + 402

SIL [z8], Unicode [z̥]
This doesn't look voiced. Except for the teeny hint of energy around 300 Hz, which just might be a hint of a harmonic. But there's
no striations in the voicing bar, I don't think. So who knows. This looks like a fricative, and judging from the high frequency
energy, it looks alveolar and sibilant. But weak. I've transcribed it as a devoiced [z], which is not quite the same thing as a
voiceless [s], since voiceless implies vocal fold abduction, where devoiced just implies lack of vibration. So the relative weakness
of the energy here (which implies low airflow) suggests something that is 'underlyingly' voiced. That's my story and I'm sticking to
it.

This is brilliant. It's incredibly loud, so we might think sibilant (I'm assuming I don't have to point out that this is obviously frication).
But look at those formant-like bands. I'm still trying to work out whether something aperiodic can have formants, or should have
resonances, or whatever. But look. Very obviously F2 moving through this stretch of frication, and F3 pretty flat. Some
organization in the F4 range, and if you work at it, you can see some organization in the F1 range, which is high (there's a
harmonic in the following vowel at about 600 and another between 900 and 1000 Hz, which is consistent with the pitch track,
which indicates the pitch here is just a hair above 150 Hz). Anyway, those two strong harmonics look like F1 to me, so if you
follow those frequencies back into the fricative we're discussing, voilà, organization.
Ash, IPA 325

High F1 (low vowel), neutral F2 (not amazingly front, but not at all back). At least this one has a higher F2 than the previous one.
So whatever you called that one, call this one slightly fronter.

SIL [v], Unicode [v]
Well, it's fricative. Weak but fricative. There's more of that energy where the voicing bar should be, and notice the interruption of
the background noise, suggesting really, really weak voicing, even if it doesn't look amazingly striated. So weakly voiced, weak
fricative. So how do we identify the fricative? It's not sibilant. It doesn't have the filtered-organizational quality of /h/. Which leaves
the front fricatives (labiodental and (inter)dental). Not a lot of transitional information, but if you use your imagination--er, I mean
insight, the F3 transition into this fricative may be dropping a little. A little. And if you think that's true, then you can convince
yourself that F2 is doing it as well. And then you have labial-looking transitions. So there you go.

Ah, something obviously voiced, but weak compared to the following vowel, and with a nice obvious zero in it. The sharp
transition between this and the following vowel, along with the zero (about 1000 Hz, where all the energy gets zapped out of the
spectrum), suggest nasal more than anything else. Oral approximants can be similarly weak, but the transitions (both in terms of
resonance and just the amplitudes) are usually just a little bit smoother. So Nasal. If you know my voice, you know that 1500 Hz
for a resonance is usually a pretty good indicator of alveolar closure. (My bilabial nasal has a resonance closer to 1000 Hz.) The
other choice is velar, but the transitions in the following vowel don't look velar at all.
Lower-case O + Upsilon, IPA 307 + 321

SIL [oU], Unicode [oʊ]
Middish vowel, perhaps rising at the end (F1 starts at about 600 Hz, and possibly falls a little starting around 1400 msec). Backish
vowel (F2 starting just below 1500 Hz) and getting backer (or, more likely, rounder) over the course of the vowel. So middish and
sort of back, getting higher and backer.

'nother one.
Lower-case M, IPA 114

SIL [m], Unicode [m]
There's 75 msec or so of gap in the upper frequencies, but if you follow it down to the low frequencies, you see it's clearly voiced
(striations from about 1600 on). So low amplitude and voiced. Must be a sonorant. Following /s/ (unless you believe the preceding
word is [noUs]) so it must be a nasal (except eng) or an approximant (except /r/). Doesn't look like an /l/ or a /w/ (not enough
lowness to the onset of the F2), and the lack of energy even below 1000 Hz is suspiciously nasal-zero looking. Transitions could
be bilabial (F2 wouldn't start that low if it were alveolar or velar--at least not usually) so /m/ is the best guess.
Epsilon, IPA 303

Okay, here's where it all goes out the window. This is a lowish vowel. Not as low as low gets, but the F1 is definitely high of the
mid-range. It gets lower, which is a little weird, but oh well. The F2 looks a whole lot like the preceding vowel, but I promise there's
nothing round (at least) about this vowel. Which just goes to show you how 'tonality' works. The F3 looks a little low, but I'm willing
to bet that's just bilabial-transition low rather than rhoticity low. So we're looking for a lowish vowel. We'll come back to this one
after we do the final sound.

Okay, there's this moment, around 1850 msec where the harmonics and formants lose their energy. It's not quite where I placed
the segment break, since I was looking more at what I thought was going on in F2, where it looks like where it starts to fall apart.
But whatever. After that moment, we've got a very high F1 (don't ask me why), and a very, very low F2. So this is as back as it
gets. The F3 doesn't tell us a lot, except it's just a little bit high.
Okay, so if the vowel is /o/ and this is an off-glide (like a /w/), this sentence doesn't make any sense. If this is a dark /l/, then the
word could either be 'small' 'smole' or 'smell'. Which is the most likely to go with the 'silk flowers'?
So (working backwards from the hypothesis that this word is 'smell'), what this is is a very dark /l/, so dark it's velarizing/backing
(slightly) the (underlyingly not-particularly front, low-of-mid) vowel /E/ which precedes it. (There are people who definitely have a
highish /e/ vowel in this word. I'm not one of those people.) Monosyllabic words with final /l/ are notorious for goofing around with
the vowel quality in North American English.
The prosody and intonation
Once again, I've played with the current E_ToBI transcription conventions. Rather than a separate orthographic tier, I've aligned
the Break Indices to the segmental transcription. I've combined the Tonal Tier and the Break Index Tier as a single line. I align
word-level (*) tones with the middle of the marked vowel, but phrase accent (-) tones and boundary (%) tones to the left of their
appropriate Break Index.
"silk"
Break Index: 1
This word is receiving some kind of focus. silk flowers versus real flowers, I guess. As I understand the current conventions,
lexical words receive (at least) a 1BI and a *-tone, unless deaccented in some way. Receiving contrastive focus, this word
receives an H*. If this were a normal declarative, it might receive an L*.
"flowers"
Break Index: 2
Okay, the way I read the conventions, 2BIs are for a) something which 'feels' like a phrasal break, but doesn't receive any
particular tonal or prosodic mark (i.e. no phrase-final lengthening). This is different from a ?BI, which indicates an apparent mark,
but of uncertain or ambiguous BI level. So I gave this a 2, just because it seems a little bigger than a 1, but there doesn't seem to
be any mark. Lexical word, so I gave it an L* since it is near the apparent baseline. In a normal sentence, this would be H*, but
due to the focus of the preceding word, this is L*.
"have"
Break Index: 1
Another word, so it gets a tone. I've used H* because it feels like a high, although it is obviously lower than the first one. I don't
think there's a strong or obvious contrast between this intonational contour and one in which the HiF0 is here rather than on the
previous H*. Give it a try. If this were a normal rather than focus sentence, this would be a L*.
"no"
Break Index: 1
Since 'no' isn't lexical in the usual sense, I'd be tempted to give this a 0BI, but since it seems to have a distinct L*, I gave it a 1.
There's a certain amount of interpolation between the preceding H* and the pitch minimum on this vowel. On the other hand, I
think the pitch track indicates a drop to baseline here, rather than something neutral between the preceding H* and the Ls to
come. In a normal sentence, this could be either H* or L*, or unaccented.
"smell"
Break Index: 4
Low throughout, so I gave it a L*. The slight rise in pitch at the left edge of the [m] voicing I think is just a microperturbation due in
part to the voicelessness of the preceding fricative. (Voiceless sounds often are accompanied by a slight increase in pitch.
Supposedly this has to do with tightening of the folds that accompanies or anticipates abduction.) The utterance-level 4BI
requires (as I understand the current conventions) a phrase accent and a boundary tone. Since there's no evidence of anything
but L throughout, that's what I've selected. If there were a change from H to L here, or the reverse, I'd have big trouble deciding
between the H* L-L% and H* H-L% or whatever. But no such conflict here.
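Pulling the labels from this discussion together, the combined tone and break-index line can be sketched as plain data. The tones and indices are the ones given above; the rendering (tone, then any phrase accent and boundary tone to the left of the index, with the index written as "N]") is an assumed layout, and combined_tier is a made-up helper.

```python
# (word, break index, word-level tone) as labelled in the discussion above.
WORDS = [
    ("silk", 1, "H*"),
    ("flowers", 2, "L*"),
    ("have", 1, "H*"),
    ("no", 1, "L*"),
    ("smell", 4, "L*"),
]
FINAL_TONES = "L-L%"  # phrase accent and boundary tone at the 4BI


def combined_tier(words, final_tones):
    """Render tones and break indices on one line.

    The phrase accent and boundary tone go to the left of the final
    (utterance-level) break index.
    """
    parts = []
    for i, (word, bi, tone) in enumerate(words):
        is_last = i == len(words) - 1
        edge = f"{final_tones} " if is_last and bi == 4 else ""
        parts.append(f"{tone} {edge}{bi}]")
    return " ".join(parts)


print(combined_tier(WORDS, FINAL_TONES))
```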

"Nothing beats a fudgey brownie."
So true, so true.
[n], IPA 116

Lower-case N
So I don't know what that noisy stuff is at the beginning. Maybe me wheezing into the microphone. The real nasal
starts with the real voicing at about 75 msec. Note the nice zero at about 750 Hz and the nice clear pole at 1500 Hz.
Ignore the murmur at 1000 Hz, which makes this look like a bilabial. Here's a generalization for you--if there seems to
be both an alveolar pole and a labial pole, and the labial pole is weaker, assume it's an alveolar (unless the transitions
are against you, which they're not here). Let's see if that holds up the next time this happens.
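That generalization can be sketched as a tiny decision rule in Python. This is only an illustration: the strength values are hypothetical, and the single-pole and "ambiguous" branches are assumptions added to make the function complete, since the generalization above only covers the case where both poles show up and the labial one is weaker.

```python
def guess_nasal_place(alveolar_pole, labial_pole, transitions_say_labial=False):
    """Each pole argument is a relative strength (any consistent unit),
    or None if no pole is visible in that region."""
    if alveolar_pole is None and labial_pole is None:
        return "unknown"
    if alveolar_pole is None:
        return "bilabial"   # assumption: a lone labial pole is taken at face value
    if labial_pole is None:
        return "alveolar"   # assumption: a lone alveolar pole is taken at face value
    # Both poles visible: the generalization covers the weaker-labial case only.
    if labial_pole < alveolar_pole and not transitions_say_labial:
        return "alveolar"
    return "ambiguous"

print(guess_nasal_place(alveolar_pole=0.8, labial_pole=0.3))  # alveolar
```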
[ʌ], IPA 314

Turned V
Vowel. Low-to-mid, from the F1 at about 700. That's a little high of absolute mid, but probably close to mid enough
(considering my range is from about 350 to about 950, maybe 700 is really mid after all). F2 is just low of central, so this
is a vaguely backish middish vowel.
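To put a number on "a little high of absolute mid", here is a minimal Python sketch that places an F1 value within the 350 to 950 Hz range mentioned above. The function name and the 0-to-1 scale are assumptions for illustration, not a standard normalization.

```python
def f1_position(f1_hz, f1_min=350.0, f1_max=950.0):
    """Where F1 sits in the speaker's F1 range: 0.0 is the highest vowel
    (lowest F1), 1.0 is the lowest vowel (highest F1)."""
    return (f1_hz - f1_min) / (f1_max - f1_min)

print(round(f1_position(700.0), 2))  # 0.58
print(f1_position(650.0))            # 0.5
```

On that scale the arithmetic midpoint of the range is 650 Hz, so an F1 of 700 Hz lands just on the low-vowel side of mid.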
[θ], IPA 130

Theta
Fricative. That's the easy part. It's sort of broad band and tilted to the high frequencies. So this might look sibilant. I've
certainly produced sibilants that light before. But notice how strong the syllable (the preceding vowel) is, and even the
following vowel. This is a particularly weak sibilant, if it's a sibilant, considering it's apparently stress-adjacent and
fairly early in the utterance. So probably not sibilant. Unfiltered, so that suggests anterior (in the mouth, as in
labiodental or interdental). Voiceless, by the way. Transitions are consistent with something non-labial, I guess, so there
you go.
[ɨ], IPA 317

Barred I
So here's a reduced vowel. Really. It's a little loud, but trust me. You can tell because the F2 is all transition, with
nothing to suggest a separated target. So this is reduced. Trust me. And we transcribe it as barred-i due to the proximity
of F2 to F3, following Keating et al (1994). But check those transitions in F2 and F3 approaching the following segment.
[ŋ], IPA 119
Eng
Fully voiced and high amplitude in the voicing bar, but right above that the F1 (or whatever you want to call it) is
distinctly weaker than the previous vowel. There's also a pole or something just below 1000 Hz, and a stronger, broader
one between 2000-3000 Hz. Note how the F2 and F3 transitions in the preceding vowel point a) together and b) to this
general area. That transition pattern is pretty classic 'velar pinch' (although in the front part of the space) so this is a
(fronted) velar. The flat resonances, the absence of energy in the high frequencies and the general 'sudden'ness of the
energy loss indicate a nasal.
[b], IPA 102

Lower-case B
Well, what we have here is a 'gap', an absence of any energy in the spectrogram. Indicating a plosive. Now, when we say
'no energy', we exclude the obvious energy down in the very lowest frequencies, i.e. where the voicing bar would be if
there were voicing. Which in this case, there is. You can tell because of the nice even pulsing striations. So this is a
voiced plosive. As to its place, the F2 and F3 transitions are rising sharply into the following vowel, i.e. they 'point' down
into the closure, which is classically a bilabial cue.
[i], IPA 301

Lower-case I
So here's arguably the loudest vowel in the spectrogram. Fairly long, compared to the other vowels, this must be
stressed. The F1 is low, indicating a high vowel. The F2 is outrageously high, up near 2250 or 2300 Hz, which is
characteristic only of [i].
[t], IPA 103

Lower-case T
Another gap. The transitions into it are falling (F2) and flat (F3), and the fall isn't 'sharp' the way the bilabial transition
would be. It's 'pointing' at about 1700 Hz, rather than lower. So this looks sort of alveolar.
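One way to make 'pointing' concrete is to extend the F2 transition as a straight line into the closure and see where it lands. The Python sketch below does just that. The measurements are hypothetical (nothing here was read off the spectrogram), and treating the transition as linear is a simplifying assumption.

```python
def extrapolate_f2(t1_ms, f2_1_hz, t2_ms, f2_2_hz, t_closure_ms):
    """Extend the straight line through two F2 measurements to the closure time."""
    slope = (f2_2_hz - f2_1_hz) / (t2_ms - t1_ms)  # Hz per msec
    return f2_2_hz + slope * (t_closure_ms - t2_ms)

# Hypothetical measurements: F2 falls from 2100 Hz to 1900 Hz over 40 msec,
# and the closure begins 40 msec after the second measurement.
print(extrapolate_f2(700, 2100, 740, 1900, 780))  # 1700.0
```

A line that lands near 1700 Hz is the alveolar-looking case described above. A bilabial transition would land lower.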
[s], IPA 132

Lower-case S
There is a segment here (probably we could tell that because the noise is fairly consistent through the duration of the
sound, roughly 75 msec around the 800 msec mark). If you thought it was just aspiration/VOT and maybe some noise
associated with the release, the clues that you're wrong are that the 'release' moment isn't noticeably stronger and the
noise doesn't die off from there, if you follow. For the most part, this is a very wide, strong band, centered in the highest
frequencies, and so must be [s].
[ə], IPA 322

Schwa
Weak vowel. F1 and F3 are all transitional, which tells us this is reduced as well as unstressed. F2 is closer to F1, so
following Keating et al (1994), I transcribe it as schwa.
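The barred-i versus schwa decision used in these solutions comes down to which neighbour F2 sits closer to. Here is a minimal Python sketch of the criterion as it is described in the text. The formant values are hypothetical, and the function name is made up for illustration.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Barred-i if F2 is closer to F3 than to F1, otherwise schwa."""
    if (f3_hz - f2_hz) < (f2_hz - f1_hz):
        return "ɨ"
    return "ə"

print(reduced_vowel_symbol(450, 1900, 2500))  # ɨ
print(reduced_vowel_symbol(500, 1200, 2500))  # ə
```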
[f], IPA 128

Lower-case F
Another fricative. It looks a lot like the previous one (the [s]) but is much, much weaker. Except the transitions in the
surrounding vowels are all wrong. They all basically point downward, labial-like. So this is a (strong) [f], rather than
a weak [s].
[ʌ], IPA 314

Turned V
A lovely clean, clear vowel. F2 and F3 are pretty transitional, by which I mean 'straight', but the F1 suggests there's a
'target'--there's sort of a steady-state, followed by a curvy offset. So the F1, when it's flat, is sort of mid--just a hair above
500 Hz. F2, though moving, is a little below central, and certainly on average is vaguely back. F3 is up where it doesn't
tell us much, although it's rising. Don't ask me why. Well I have an idea, but I don't want to talk about it.
[d], IPA 104

Lower-case D
There's another, shortish gap, about 50 msec approaching the 1200 msec mark. The F3 in the previous vowel is clearly
pointing up, which is consistent with alveolar, and the F2 transition also is heading toward that 1700 Hz mark.
[ʒ], IPA 135
Yogh
Fricative. Loud, i.e. high amplitude. And high frequency. Sibilant. Note that it seems to be centered in the visible
frequencies rather than off the top, and dies off sharply below the F2 region. Post-alveolar.
[i], IPA 301

Lower-case I
Another high vowel with a very high F2.
[b], IPA 102

Lower-case B
Voiced plosive again, and look how sharply down past 1700 Hz the F2 transition into it is. Must be bilabial.
[ɹ], IPA 151

Turned R
So as the voicing begins just after 1400 msec there's a weakness in the signal up to around 1500 or 1525 msec. But the
resonances aren't flat, and there aren't any lower-frequency 'zeroes' characteristic of a nasal. So this looks like an
approximant. F1 is just low of 'mid' and rises, F2 rises sharply to about 1500 Hz at the 'onset' of the vowel, and F3 has a
(short) flat spot at about 1700 Hz. Just a coincidence, that frequency. But it's a decidedly 'low' F3, down in the 'territory'
we associate with a (not particularly high) F2. So that's just 'low'. Low F3 = American [r].
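The "low F3" test is simple enough to write down. In this Python sketch the 2000 Hz cutoff is an assumption chosen to sit between the roughly 1700 Hz seen here and a neutral F3 in the mid 2000s; it is not a measured boundary.

```python
def looks_rhotic(f3_hz, threshold_hz=2000.0):
    """True if F3 is low enough to suggest an American English [ɹ]."""
    return f3_hz < threshold_hz

print(looks_rhotic(1700.0))  # True
print(looks_rhotic(2500.0))  # False
```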
[aʊ], IPA 304 + 321

Lower-case A + Upsilon
What we have here, folks, is a diphthong. When the 'vowel' starts, i.e. that moment when the energy above F3 kicks on,
the F1 is quite high, indicating a fairly low starting vowel. At the same moment, the F2 is about 1500 Hz, or so. F3 we'll
ignore since it's basically still 'recovering' from the previous sound. As the vowel moves on, the F1 lowers, so the vowel
moves from lower to higher, and the F2 also lowers, so it moves from centralish to backish. Or roundish, as the case may
be.
[n], IPA 116

Lower-case N
Another nasal here. See how flat it is? See the zeroes? Can't be bilabial, since the F2 in the previous vowel isn't pointing
down into the closure. While the F3 may be dropping a little, there's no real degree of 'pinch' here. So probably alveolar.
If the lower poles were clearer and less ambiguous, I'd say something about them. But they aren't, so I won't.
[i], IPA 301

Lower-case I
So weak and trailing off, look at that F2. What have we learned about F2s way up above 2000 Hz?

"Citrus doesn't go with chocolate."
People are always trying to foist those break-apart orange-chocolate orange things on me. Blech. I can sort of deal with
lemon cream and chocolate, and if you must cover your candied orange peels in something other than sugar, I won't
argue with you. But really, one of these things doesn't belong: citrus, coffee, mint, raspberry. Some things go with
chocolate, and some things don't. Get over it. End of sermon.
[s], IPA 132

Lower-case S
So we have a nice, slow-rising amplitude fricative here. It's evident that it's strongest in the highest frequencies (since
they are represented earlier and just get stronger as the lower frequencies come on), and centered above 4000 Hz
somewhere. So this is a pretty classic [s].
[ɪ], IPA 319

Small Capital I
Tiny short vowel, but quite loud and high pitched. If you're tempted to ignore it due to its length, fine. But if you do,
please tell me you labeled it as a barred-i. F1 is low (high vowel), F2 is highish (front vowel) but not clearly in [i]
territory (usually 2100 or 2200 Hz or so for my voice). And not long enough to be historically long, so probably a)
reduced or b) historically short (i.e. 'lax', whatever 'lax' means). Hence, [ɪ].
[t], IPA 103

Lower-case T
Plosive indicated by the gap in the spectrogram. No voicing bar so voiceless. Long VOT, so probably aspirated (but
there's something else going on there). Transitions don't tell us much. So all in all, go with the most common member
of the class. Honestly, I got nothing.
[ɹ̥], IPA 151 + 402

On the elsewise hand, we've got good evidence of a low F3, rising into the following vowel. This is why the transitions
aren't telling you much about the plosive--any information in the transitions for the plosive is sort of being wiped out, or
overridden, or hidden, by accommodating the low F3. So taking the lowest point of the F3, which is sort of in the middle
of the aspiration/VOT, I called this thing voiceless, and the F3 being down around 1750 Hz really it can only be an [ɹ].
[ɨ], IPA 317
Barred I
Well, this is longer than the last one, but it's also moving hard and fast. The F1 is rising slightly, as is the F2, and of
course the F3 is rising sharply back to neutral. So something that moves like that is mostly transitional, i.e. reduced.
Treat it as such and move on.
[s], IPA 132

Lower-case S
A little stronger in the high frequencies and weaker in the lower, but still classically [s] shaped--broad band, centered in
the high frequencies.
[t], IPA 103

Lower-case T
Another gap. This one a little more classically alveolar, what with the falling transitions in the following vowel,
consistent with the rising transitions in the preceding vowel (although those are obscured by other factors). The falling
F3 is really only something you should ever get with an alveolar.
[ʌ], IPA 314

Turned V
Vowel. So the F1 is rising slightly, but is strongest at the beginning at about 500 Hz. So this is middish and getting
lower. F2 is just below 1500 Hz and falling, so backish and getting backer. So this is a middish to lower middish
vowel, backish and backer. So there you go. Easy.
[z], IPA 133

Lower-case Z
Well, this is another alveolar sibilant, for the same reasons, but I thought this one was voiced. Now I'm not so sure. Oh
well.
[ɨ], IPA 317

Barred I
And thank heavens this vowel is just too short to worry about. In fact, most people would ignore it. Certainly in
transcriptions this sort of thing is regarded as a syllabic...
[n], IPA 116

Lower-case N
...nasal, but in practice, at least in American English, there's almost always a short (very, very short) period of open
vowel before nasal closure in this sort of situation. So what we see here is a longish sound with a fairly strong voicing
bar and very limited energy above it. It has fairly sharp 'edges', which is another indicator of nasality (i.e. the sudden
occurrence and offset of a side cavity). Hard to tell what's going on with place, since the nasal pole I usually expect in
the F2 range (around 1000 Hz for labial [m] and around 1500 Hz for [n]) is not really supported by anything. So you
might guess [n], just playing the odds, or you might just call it a "N" for nasal, and move on.
[t], IPA 103

Lower-case T
There's not a lot of evidence of non-nasal oral closure here, so ordinarily, I'd indicate this as just a release on the nasal,
but since it appears to be voiceless, I didn't have a raised [t] symbol when I made up the solution spectrogram. So this is
just the release (a T in Keating et al (1994)'s release notation, without an accompanying TCL closure). It is very
sharp/abrupt, and its noise is tilted to the high frequencies (i.e. [s] looking) rather than in the middle (F2/F3)
frequencies, as we might expect for a velar, or the lower frequencies as for a labial.
[k], IPA 109

Lower-case K
On the other hand, this is clearly a plosive. It features F2/F3 pinch (i.e. they start close together and move apart into the
following vowel) in the following transitions, and a nice mushy double burst, strongest in the F2/F3 region, absolutely
classically a velar plosive. Voiceless, of course, but unaspirated.
[oʊ], IPA 307 + 321

So except for the transitional blip at the beginning, the F1 of this looks pretty solidly "mid". So I think we're safe in
calling this a mid vowel. The F2 seems to transition in a straight line downward from central to low (i.e. from central to
back/round). This could just be transition, since when the F2 reaches its minimum we also get a change in amplitude in
the upper frequencies, but the fact that it's straight means either a) that there's no F2 target in the vowel (unlikely) or b) that
this is 'directed' or controlled or VISCy movement. So let's take it seriously and say this is centralish and moving backer
and rounder. So pretty definitely a phonemic /o/.
[w], IPA 170

Lower Case W
On the other hand, straight F2 transitions like this, and low F2s with low energy above are pretty standard for [w].
[ə], IPA 322

Schwa
Short vowel that's mostly transition, and otherwise adjacent to a fairly longish, strongish, i.e. probably stressed vowel.
So probably reduced. Moving on.
[θ], IPA 130

Theta
On the other hand, there's about 50 msec of 'something' approaching the 1200 msec mark. It isn't exactly gap-like,
there's noise in the extremely low frequencies and some noise or clunks or something all through the spectrum. So
given its clunkiness, this turns out to be a fricative. Not sibilant, which limits the choices.
[t], IPA 103

Lower-case T
Another gap. This one with nothing really useful on either side to give you much information. But phonotactics will
limit your choices in a moment.
[ʃ], IPA 134

Esh
For instance, right now. We've got here a sibilant fricative. It's still mostly one broad band, but this time it is
experiencing a little more filtering from the front cavities (we know this because sibilants really can only resonate
through the front cavities, the slit effectively closing the back cavities off). There's energy in the F2 region, and
substantially less energy below, which is suspicious, and the peak seems to be at 3000 Hz, instead of off the top of the
spectrogram. So taken together, I say this is almost definitely an [ʃ].
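The [s] versus [ʃ] call leans mostly on where the noise peaks. As a rough sketch, the Python below picks the strongest component in one spectral slice and splits on a cutoff. The 3500 Hz cutoff is an assumption placed between the roughly 3000 Hz peak described here and the above-4000 Hz centre described for [s], and the amplitude values are invented.

```python
def sibilant_guess(spectrum, split_hz=3500.0):
    """spectrum: list of (frequency_hz, amplitude) pairs from one spectral slice.
    Returns 's' if the strongest component lies above split_hz, else 'ʃ'."""
    peak_hz, _ = max(spectrum, key=lambda pair: pair[1])
    return "s" if peak_hz > split_hz else "ʃ"

# Hypothetical slice with its peak at 3000 Hz and little energy below the F2 region.
esh_like = [(1000, 0.1), (2000, 0.6), (3000, 1.0), (4000, 0.7), (5000, 0.4)]
print(sibilant_guess(esh_like))  # ʃ
```

A real decision would also look at how sharply the energy dies off below the F2 region, which this sketch ignores.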
[ɑ], IPA 305

Script A
At last, a vowel worth talking about. F1 and F2 sort of converge into a single wide band, at least at this resolution, but
you can tell from the width that it has to be at least two formants. F1 and F2 converging (at about 1000 Hz) is a pretty
typical way of describing [ɑ], the lowest backest vowel most of us can produce without a lot of rounding.
[k], IPA 109

Lower-case K
And there's another plosive. If you take a good look at the preceding vowel, the F2 is rising and the F3 is falling, which is
how we describe velar pinch. Well, we really like them to not only move toward each other but to actually make it close
together, but one can't have everything.
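The two halves of 'velar pinch' (moving toward each other, and actually ending up close) can be checked separately. This Python sketch does so on hypothetical formant tracks. The 500 Hz "close enough" gap is an assumption picked for illustration only.

```python
def velar_pinch(f2_track_hz, f3_track_hz, close_gap_hz=500.0):
    """Tracks are formant values sampled through the vowel, ending at the closure.
    Returns (converging, close_together)."""
    f2_rising = f2_track_hz[-1] > f2_track_hz[0]
    f3_falling = f3_track_hz[-1] < f3_track_hz[0]
    end_gap = f3_track_hz[-1] - f2_track_hz[-1]
    return (f2_rising and f3_falling), end_gap <= close_gap_hz

# Hypothetical tracks: F2 rises and F3 falls, but they never get very close.
print(velar_pinch([1000, 1100, 1250], [2600, 2500, 2400]))  # (True, False)
```

The (True, False) result is the situation described above: the formants move the right way without quite making it together.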
[ɫ], IPA 209

Tilde L (Dark L)
Well, believe it or not there's a long VOT in here hiding a different segment. The low F2 and slightly raised F3 indicate
an American English dark /l/. At least partially devoiced due to the accompanying aspiration.
[ə], IPA 322

Schwa
Deceptively long vowel, due to final lengthening, but the F2 is all transitional.
[t], IPA 103

Lower-case T
The release isn't as sharp as I'd like, but the noise is vaguely [s]-shaped, so this is an alveolar plosive.
Robert Hagiwara, Ph.D.
Dept. of Linguistics
University of Manitoba
Winnipeg, Manitoba
CANADA R3T 5V5
Solution for January 2008
"A sore back is a real pain."

Har har.
[ə], IPA 322

Schwa
Cut this one kind of tight, but there's nothing really going on here. There's a hint of something going on around the
50msec mark, but the real action of this is around the 100 msec mark. If you thought it was two things, I fooled you. I
think that's just mucky onset of voicing. The real solid voicing part of this is too short to do anything with, so call it schwa
and move on.
[s], IPA 132

Lower-case S
Nice broad band, high frequency, high amplitude noise. You can see the oncoming liprounding (to get ahead of myself a
little) in the drawing down of the noise energy. So there's a cue for you. But it's clear that until that happens, the energy is
centered off the top of the spectrogram, and occupies one really broad band, encompassing, basically, all frequencies.
Typical [s]. (Post alveolar noises tend to damp sharply below the F2 range, and are usually centered a little lower down,
say the F2-3 range.)
[o], IPA 307

Lower-case O
Look at that nice little stripe across this vowel, almost 200 msec of it, right at 500 Hz. Can't ask for anything more mid
than that. And the F2 is pretty low and flat, around 1000 Hz or just below at the beginning of this. Mid and back and
round. See how easy this spectrogram stuff is?
[ɹ], IPA 151
Turned R
And look at that F3. At the beginning of the [o], it's way up there in its neutral territory, roughly in the mid 2000s. And
down it drops. Gotta be an [ɹ].
[b], IPA 102

Lower-case B
So here we have a plosive, as indicated by the total cessation of energy above the voicing bar. This looks moderately
voiced throughout, so probably voiced rather than voiceless (although the perseverative voicing at the beginning is much
stronger than the 'lingering' voicing later). Plosivity(?) is also indicated by the sharp onset of energy at about 550 msec,
with what looks like a release transient followed immediately by full on voicing. The downward pointing transitions on
both sides clearly indicate a bilabial, which is consistent with the low(ish) energy, low(ish) frequency release transient.
[æ], IPA 325

Ash
So we've got a vowel that starts sort of low and gets lower (ignoring the transition, the F1 is about 750 Hz at 600 msec and
rises to between 800 and 900 Hz by 700 msec). So we've got something fairly low. The F2 isn't at all back looking (low)
being, if anything, sort of neutral looking. So this is a frontish/centralish low vowel. The length also suggests a lower
vowel. Two possible vowels down there. Pick one, and guess.
[k], IPA 109

Lower-case K
Another plosive. Note here how the small amount of voicing just dies completely. Must be voiceless. The F2 and F3
transitions in the preceding vowel suggest velar pinch, and this is consistent with the 'double burst' (sort of) and the F2/F3
concentration of energy in the release. Yay, an easy velar for a change.
[ɨ], IPA 317

Barred I
Another short, weak vowel. Boring.
[z̥], IPA 133 + 402

Lower-case Z
'k, this is gonna be controversial. First, let's agree that this looks [s]-like: fairly high frequency, high amplitude noise,
broad band, the works. But definitely weaker than the earlier one, and much, much shorter. So even though it's not voiced,
even not knowing the words, I'd say this was a good candidate for an underlying [z], given the shorter duration and
weaker noise. So that's how I transcribed it. It helps knowing the final utterance, I admit.
[ə], IPA 322

Schwa
Now here's a longer, though sort of low-pitched vowel that is clearly all transition. The F2 is closer to the F3, which is
usually my criterion for whether it's a barred-i or not, but the F3 here is being pulled down by ... well, something else, so
I ignored it here.
[ɹ], IPA 151

Turned R
Now check out this F3. Pushes so far down it merges practically (formants can't actually merge) with the F2. That and the
whole thing gets lost in the lower energy. Notice how the voicing never really dies, but the energy in the upper
frequencies is lost. Typical of an onset [r], rather than a coda one, if anyone cares.
[i], IPA 301

Lower-case I
So following the consonant, the F2 (and F3) soars upward as fast as it can, as if it were desperately trying to reach some
absurdly high frequency before being pulled in some other direction. And it makes it almost to 2000 Hz, which is pretty
healthily front. Look at the F1. Low low low. Must be a high front vowel. The frontest, in fact.
[ɫ], IPA 209

Tilde L (Dark L)
Meanwhile on the other side of the vowel, the F2 dives back down as fast as it rose, but the F3 keeps soaring upward. Up
past 2500 Hz, until it's actually high! High F3 is usually a good indicator of an English lateral, the low F2 of backness, i.e.
velarization, i.e. darkness.
[pʰ], IPA 101 + 404

So here's another plosive. In spite of the noise at the bottom, you can still see there's no voicing to speak of. If that weren't
enough, there's 100 msec of VOT following the release. So think aspirated. The transitions are a little hard to read, given
the extremity of the formant positions to begin with (on the left) and the noise (on the right), so let's just console ourselves
with the knowledge that there's no hint of velar pinch, and no hint of that F2 trying to rise into 1700 Hz or rise out of 1800
on the left. So that and the burst all suggest bilabial. Which turns out to be right, but don't ask me how often it would.
[eɪ], IPA 302 + 319

This is typical of my /e/s. Let's start at the bottom. Due--I guess--to the oncoming nasalization, the F1 is fuzzy. I'd guess it
was at about 500 Hz to begin with, but by the time you get to 1700 msec or so the only evidence of anything is lower. So
maybe this is mid, moving to higher, or maybe it's just mid and getting progressively fuzzier. Next up to the F2, we start
quite high (i.e. quite front) and move steadily higher (fronter). So we've got something that moves from middish and front
to possibly higher and almost definitely fronter.
[n], IPA 116

Lower-case N
The oncoming nasalization does a few things in the vowel. First it's damping most of the energy across the board, such
that the zeroes start to appear between the wide-spaced F1 and F2, and between F3 and F4. At the same time, the
formants, especially F1, get fuzzier, by which I mean the bandwidths are expanding (if you don't understand that, don't
worry about it. You'll learn in acoustics class that nasals tend to have wider bandwidths than orals--if no one ever bothers
to explain why, it won't be the worst thing left out of your education). The energy in F2 seems to cut off at about 1850
msec, which is more or less where the 'quality' of the sound in the voicing bar changes, so that's where I'd locate the
closure. Following this is just nasal. Weak, being at the end of an utterance, there's no evidence of poles or anything to
give us a clue for place. The F2/F3 transition might just be pinching together, or that could just be the movement of the F2
accidentally impinging on the F3 territory ("impinge" is a word I don't use very often, but there you go). I can
convince myself that the last little trailing bits of the F2 are actually headed back down, so this could be alveolar. The F4
is headed up, so I doubt this could be bilabial (what do I know from F4?). But in the end, I'll go with the statistics and
suggest alveolar. Although if I had to convince myself this was a final -ing, I could probably do that too. That's the handy
thing about being equivocal, you can convince yourself of almost anything if you really, really want to....
Last modified: 11/08/2009 22:57:54

"Snow can dampen sound."
This is actually an observation I make every year after the first serious snowfall. It's just weird how different the world
sounds for those first hours after a snow. First of all, all the usual sounds--footfalls, roadsounds, etc.--just don't have the
same sound. Secondly, the usual reverby-echoey stuff that we take for granted disappears. The world gets correspondingly
small and close. It's just odd. Anyway, that was on my mind when I was putting this together.
[s], IPA 132

Lower-case S
What we have here is a good 200 msec of serious noise. Broadband, and centered above 4000 Hz, this has to be an [s] of
some kind.
[n], IPA 116

Lower-case N
Well, aside from that very low-frequency bursty thing at about 275 msec, we've got something that is clearly a nasal. Nice
full voicing bar, clear zero(es) above, but evidence of nice resonances. Especially that one just below 1500 Hz. Which is
right where I'd hope one would be for an alveolar [n].
[oʊ], IPA 307 + 321

So starting from 325-350 msec or so and continuing until 550 msec or so there's one very mobile vowel. My new word
this year is VISC (which stands for vowel-inherent spectral change) which refers to what I would call 'category-dependent
formant dynamism'. But VISC is easier to say and to type, so there we go. The point is this vowel doesn't seem to have a
target and transitions but is intrinsically 'moving'. So let's look at the movement. F1 starts a little high and ends a little
mid-to-low. So this starts sort of lower-mid and rises to mid or higher-mid. The F2 starts just a hair low of central and
travels lower. So this starts backish and moves very backer and rounder. So, this being my voice, we have limited choices
in the mid-back range, and most of them don't move this much.
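Reading formant movement like this amounts to translating two frequency changes into articulatory terms: falling F1 means the vowel is getting higher, falling F2 means it is getting backer (or rounder). Here is a minimal Python sketch of that translation. The start and end values and the 50 Hz "steady" tolerance are assumptions for illustration, not measurements from this vowel.

```python
def describe_movement(f1_start, f1_end, f2_start, f2_end, min_change_hz=50.0):
    """Summarize formant movement across a vowel in articulatory terms."""
    def direction(start, end, falling_label, rising_label):
        change = end - start
        if abs(change) < min_change_hz:
            return "steady"
        return falling_label if change < 0 else rising_label

    height = direction(f1_start, f1_end, "raising", "lowering")
    # A falling F2 can reflect backing, rounding, or both; "backing" covers all three here.
    backness = direction(f2_start, f2_end, "backing", "fronting")
    return height, backness

print(describe_movement(600, 450, 1300, 900))  # ('raising', 'backing')
```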
[kʰ], IPA 109 + 404

So there's no velar pinch heading into this closure, but that's presumably because of the strong prosodic boundary here.
TMSAISTI. Fortunately, the transitions into the following vowel do show it, even through the aspiration. BTW, I've mentioned
before that Someone Important once complained about my timescale. It's here that I can see the point. I try to get a
comfortable 2.5 seconds in 800 pixels, which is about half the 'traditional' timescale, which is something like 2.5 seconds
in 14 inches. Just work with me. Anyway, at this timescale, it does sort of obscure the fuzzy line between 'long' VOTs and
'short' VOTs. But this is clearly a long VOT, in the sense that you can definitely see the gap between the (mushy) release
and the first clear pulse. With shorter VOTs, that can be harder to see. But since this is clearly some VOT, we're almost
definitely working with something aspirated. So what we have here is an aspirated velar plosive.
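The timescale complaint is easier to appreciate with the arithmetic spelled out. This Python sketch converts the 2.5 seconds in 800 pixels mentioned above into msec per pixel, then asks how many pixels a given VOT occupies. The two VOT durations are hypothetical examples.

```python
def msec_per_pixel(duration_msec=2500.0, width_pixels=800):
    """Time represented by one pixel column of the spectrogram."""
    return duration_msec / width_pixels

scale = msec_per_pixel()
print(scale)        # 3.125
print(100 / scale)  # 32.0  (pixels for a hypothetical 100 msec VOT)
print(20 / scale)   # 6.4   (pixels for a hypothetical 20 msec VOT)
```

A long VOT gets a few dozen pixel columns, while a short one gets only a handful, which is why the gap between release and first pulse can be hard to see.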
[ɨ], IPA 317

Barred I
On the other hand, this vowel is just short. The F2 is closer to the F3, so following Keating et al (1994) I transcribe it as a
barred-i and move on.
[n], IPA 116

Lower-case N
So here's another nice nasal. Long. Probably means something, but I don't know what. Anyway, there's that nice full
voicing bar, the zeroes, and a nice little pole around 1500 Hz. Sounds familiar.
[d], IPA 104

Lower-case D
This is the first time in a while that this looked like a plausible plosive and not just an oral release. But that's debatable.
The voicing seems to die down for a few pulses around 850 msec, and there's a nice burst and a pulse or so of irregular
voicing before the regular pulsing sets in. So this has to be a plosive release of some kind. The F3 transition isn't doing
much. The F2 is sort of high, but isn't really reaching up to the F3, so this is an alveolar.
[æ], IPA 325

Ash
I'm getting really tired of my low front vowels. I'm going to have to do something about them. The problem is I don't
know how much of them is the way I represent them and the way I let them coarticulate with other things. Maybe it's the
following nasal. Oops, I shouldn't have spoiled that. But anyway, we've got a weird vowel here. F1 starts quite low but
rises hitting a steady(ish) state in the last third or so of the vowel. At that point, it's fairly high. So by usual stuff, taking
the steady state, we'd say this was a low vowel. But 'tensed', perhaps by the nasal. Although now that I think about it,
there's no reason to raise a low vowel before a nasal. Anyway, F2 starts quite high, indicating something front. But then
about one third of the way through, it starts to fall to a back position. Weird. So taking the steady state, we'd say this is
front. So let's just say low and front and gloss over the difficulties. Like the F1 and the F2 are steady in two different and
non-overlapping parts of the vowel. Argh.
[m], IPA 114

Lower-case M
So we've got a nice little nasal. Weak, and the voicing falls apart. I should probably have treated this and the following
release as a prenasalized stop, or an orally released nasal, or whatever we call these [mp nd] things. But the nasal portion
of this thing starts with a pole just above 1000 Hz, which is right where I want it to be for an [m]. Also the F2 transition
in the vowel is just too low to be anything but labial.
[p], IPA 101

Lower-case P
Well, in the absence of any other evidence, guess homorganic and move on. Really. There's just nothing to this thing
except the wonky release which doesn't tell us much.
[ə], IPA 322

Schwa
This is a classic schwa--(near) evenly spaced formants, short, low pitch (see the distance between the pulses), all of which
points to deaccented, reduced, etc.
[n], IPA 116

Lower-case N
And here's another nasal. It's hard to see, but there is a pole, more or less in line with the F2 in the previous vowel, and
clearly higher than that hint of a pole in the [m]. So this is another alveolar.
Okay, at this point, some of you will be trying to fit in another vowel. Which is fair. But I swear I didn't produce one, and
I don't really hear one. I think that's just a transient closure approaching the fricative, overlapped with the nasal. So there's
a moment here where there's simultaneous nasal and fricative going on, and the point is it's all transitional so I'm going to
ignore it. TMSAISTI.
[s], IPA 132

Lower-case S
So we've got a fricative. Tilted to the high frequencies, one broad band, etc. Must be [s].
[aʊ], IPA 304 + 321

So this vowel, and I'm looking from just after 1600 msec all the way to about 1950 msec, the F1 is very flat, amazingly so,
and very high. So we have something very, very low. At the beginning, the F2 is steady and fairly neutral, so for
something this low we've got something fairly central-to-front depending on how you think of the lower space. But then it
starts to move, toward a back, round vowel. So I've transcribed it as a standard diphthong, even though the F1 doesn't
seem to move at all. Which is why I don't like digraphs for diphthongs, which make it look like the glide portion is of
equal importance to the nucleus, which it ain't. Conventionally we've got a nucleus and a movement, not two targets. I
guess. I mean, I think there's a difference, at least phonetically.
[n], IPA 116

Lower-case N
Nasal. See the full voicing? No evidence of anything placewise--the F2 transition is very labial looking, but the F3 is
rising if anything, so that could be alveolar. Or both--tongue moving one direction, lips moving one direction, and both
having their effect on the output acoustics. Too bad the pole isn't very visible. Can't be velar, so must be bilabial or
alveolar, unless it's coarticulating with the following plosive. Which it probably is.
[d], IPA 104

Lower-case D
In keeping with this spectrogram treating homorganic oral releases to nasals as separate elements (I don't know where my
head was). The release noise is definitely tilted into the higher frequencies, even though it isn't classically [s] shaped.
There's enough voicing in the release that you can see the F2, and it's nowhere near the F3--no pinch. So we're dealing
with something that is either bilabial or alveolar. The sharpness of the release and its broad frequency range suggests
alveolar. And there's not a lot pointing towards labial. So all things considered alveolar is a better guess--and I'm betting
an [nd] will make a better word with [saʊ] than [mb].
Last modified: 11/08/2009 22:57:54

"Intimidating, we're not."
Such an interesting utterance. Inspired by a student's comment about the department. No, really. But we should be able to
deal with a little fronting. I did warn you that this sentence was syntactically marked.
[ɪ], IPA 319

Small Capital I
There's nothing like starting with a nice clean vowel. There's really nothing in that first pulse that looks like a release, and
obviously no VOT. No prevoicing, no VOT, no murmur or even any glottalization that might call attention to the onset of
the vowel. So this is just a vowel. A fairly high vowel, and very, very front. But tending to centralizing rather than, well,
out-gliding. So this is an initial [ɪ].
[n], IPA 116

Lower-case N
Ah, a nice clear sonorant, judging by the voicing bar, but with precious little energy above. Must be a nasal (or a really,
really odd-looking approximant). Although precisely which nasal is open to argument. The F2 transition is falling, but to a
relatively high F2 frequency, suggesting an alveolar. Which would be a good guess even if it weren't, since alveolars are
so much more common. Anyway, since there's clearly a stop following (check out the release at 375 msec or so) it's
probably at least partially assimilated anyway. So leave it ambiguous and move on.
[tʰ], IPA 103 + 404

Now, for about 50 msec starting before that release thing, something happens to the voicing. This suggests closure, if not
devoicing and other things. The burst also tells us there has to be a plosive in here, since otherwise there's no way to
develop any pressure for a release like that. So let's talk about the release. Very sharp. But one might also say broad-band.
Let's look at the frication/VOT noise. "Tilted to the high frequencies" comes to mind. Almost...[s]-like. This, friends, is an
alveolar release. And with a VOT like that, obviously aspirated.
[ɪ], IPA 319

Small Capital I
Shortish vowel, probably dismissable as reduced, but let's pretend it's not for the time being. In which case it looks like it
has a mid-to-lowish F1, so it's moderately high, and a highish F2, indicating something quite front. Frontish and mid-to-
highish. Okay. Turns out this is the highest pitch of the whole utterance, so it must be the stressed syllable.
[m], IPA 114
Lower-case M
Another obvious nasal, with more going on in it than previous ones. First of all, there's a pole or something at about 1200
Hz. That's a little low for an [n]. But then it's a little high for an [m]. So look at those F2 transitions into and out of it. F2
dives to the level of the pole, and rises again. Must be bilabial.
[ɨ], IPA 317

Barred I
Another shortish vowel. It's longer than the last one, but the last one was clearly the stressed one, so this one isn't. If it
were, it would probably be the same. But for some reason it's a little more central (as in a slightly lower F2), but whatever.
[d], IPA 104

Lower-case D
Plosive, probably, given the sudden loss of energy at all frequencies, and the absence of any energy above the voicing bar.
The weak voicing in the voicing bar is consistent with voicing during closure. As for place, note the rising F4, the
ambiguous or falling F3, which together should give one pause. But comparing the F2 transitions on either side, both
clearly point to about 1700-1800 Hz, the usual 'locus' of F2 transitions for alveolars (if you don't believe that, then at least
convince yourself that the locus here is much higher than it was for the very clearly bilabial nasal).
[eɪ], IPA 302 + 319

F1 is middish, although at about 800 msec it seems to drop a little. F2 rises from the locus of the transition until about 800
msec or so, where it is clearly front. So we go from mid-to-high to higher and front to fronter. Not that many diphthongs
do that.
[ɾ], IPA 124

Fish-Hook R
Short little plosive looking thing. Too short. This is a flap. Remember what these look like. They come up.
[ɪ], IPA 319

Small Capital I
Now, on the surface, this looks a lot like the previous vowel, except the onset frequency is a little lower, and the offset
frequency is a little higher. And that's true, but it's an illusion of the context. What we have here is a short little vowel
whose edges are getting pulled out of whack by the surrounding consonants. So abstracting away from the transitions
we've got something that is moderately high, and on average, extremely front. Could be [i] actually, but then the
transitions wouldn't be as smooth. I guess.
[ŋ], IPA 119

Eng
But notice how the F3 and F2 kind of meet each other? Easy to miss, but that's velar pinch, folks. So we're headed into a
velar. Note the strong voicing bar and the limited (but present) energy above. Must be another nasal. Woo hoo. Three
nasals in one utterance! Are you thrilled? Now take a moment and compare them. They come up too.
[w], IPA 170

Lower-case W
So by my reckoning (you know we're in trouble when I start reckoning things), the nasal lasts a good long while, but
'something' happens just before the 1100 msec mark. The voicing bar, or whatever, suddenly gains a little energy,
suggesting the sudden de-coupling of the nasal channel. At that moment, F1 is so low (or so weak) as to not matter much, F2
is as low as F2 ever gets, considering that by the time we definitely get F2 energy it's clearly rising from about 700 Hz or
so. F3 is nowhere to be seen, but isn't particularly low. So what do we know that has an incredibly low F2 (indicating
back, round, or given the extremity, both), but doesn't have a raised or particularly lowered F3? Right!
[i], IPA 301

Lower-case I
Well, that's what it seems like to me, but it's so coarticulated with the following sound it's hard to tell. Definitely heading
to a very front position. And fairly high.
[ɹ], IPA 151

Turned R
So here's the thing. F3 is plunging below 2000 Hz. That is, into F2 territory. Must be an [ɹ].
[n], IPA 116

Lower-case N
Oooh, another nasal. Knowing what we know now, what do we know about this one?
[ɒ], IPA 313

Turned Script A
I'm developing quite a repertoire of low back vowels. This one just looks like it has a lower F2 than I'm used to, which is
why I thought it might be round. It's certainly more typically round, or rounded, or whatever, than I'm used to around here.
Canadians, you know. Anyway, the F1 is just about as high as it can get, and the F2 is about as low as it can get
considering the F1, so this has to be low and back somewhere.
[t], IPA 103

Lower-case T
So there's some glottalization, or perhaps just low pitch at the end. So that's not a really good clue, but there's not a lot
happening from the last pulse, which is I guess at about 1750 msec, and the release. See the release? At about 1850 msec?
Slightly doubled, but don't let that fool you. Look at that frication in the high frequencies. Must be an alveolar. Sorry but
that's just how it is.
Last modified: 11/08/2009 22:57:53

Solution for October 2007 (from November 2001)
"Caffeine is a necessity."
[kʰ], IPA 109 + 404

Note the double burst, and the concentration of energy in the release in the F2-F3 range. Good long aspiration. I can
convince myself of velar pinch, but it's very front, i.e. the high F2 is contiguous with a flattish F3 rather than classically
'pinch' shaped.
[æ], IPA 325

Ash
Very high F1, between 800 and 900 Hz. F2 is about 1550-1600, or basically neutral, definitely not low (which would
indicate backness). So the lowest non-back vowel available is [æ]. Depending on dialectal considerations, one might think
this is a lowered and/or retracted [ɛ], but that's a topic for another time.
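The F1/F2 reasoning used for this vowel can be sketched in a few lines of Python. This is only a rough illustration: the function name `describe_vowel` and every cutoff value are assumptions picked to sit near the numbers quoted on this page, not measured category boundaries, and a low F2 can mean rounding as well as backness.

```python
def describe_vowel(f1_hz, f2_hz):
    """Rough height/backness label from F1 and F2 (both in Hz).

    All cutoffs below are illustrative assumptions for one adult voice,
    not established category boundaries.
    """
    # A higher F1 means a lower (more open) vowel.
    if f1_hz >= 700:
        height = "low"
    elif f1_hz >= 450:
        height = "mid"
    else:
        height = "high"

    # A higher F2 means a fronter vowel; a low F2 means back and/or round.
    if f2_hz >= 1700:
        backness = "front"
    elif f2_hz >= 1300:
        backness = "central"
    else:
        backness = "back"

    return height, backness


# Values read off the discussion above: F1 about 850 Hz, F2 about 1575 Hz.
print(describe_vowel(850, 1575))
```

With those values the sketch returns `('low', 'central')`, which is the corner of the space where [æ] is the lowest non-back candidate.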
[f], IPA 128

Lower-case F
This is a little odd, in that it's a little strong for an /f/. It is largely unfiltered, except for the F2 business. Transitions into it
aren't remarkably helpful, although post hoc we might notice the F3 and F4 coming down a little due to labialization.
The following transitions aren't particularly helpful either, in that the F2 and F3 are so high they can't help but rise out of
this fricative. Okay, so it's a fricative, could be [h], although I'd expected more in the F1 region. Mark it as a fricative and
continue.
[i], IPA 301

Lower-case I
I've marked this as nasal, although it's hard to tell with these wide gaps between F1 and F2. F1, if anywhere, is way down
below 500, say 400 or so. Recall that nasalization tends to 'mid-ify' F1 values. The F2 is way high, in fact about as high as
my F2s ever go (2400 Hz? Egad!). So very very front, and vaguely high. Once we decide it's nasalized we can settle on
transcribing the actual quality.
[n], IPA 116

Lower-case N
Clearly sonorant, i.e. fully voiced and fully resonant (i.e. has formants). Weaker than the available vowels. Flat and
vaguely evenly spaced formants, so unlikely to be an approximant. I'd be happier declaring this a nasal if the zero were
clearer, but oh well. Place cues are hard to say unless you know my voice fairly well, and know that that pole just below
1500 is pretty typical of my [n]. The pole for [m] is usually lower.
[ʔ], IPA 113

Glottal Stop
This is mistranscribed above. It's a glottal stop. The irregular pulses without really 'stopping' suggest glottal stop (creaky
voice) more than anything else.
[ɨ], IPA 317

Barred I
Shortish weakish vowel, probably unstressed and therefore reduced and therefore not worth spending too much time on.
Following Keating et al 1994, I transcribe reduced vowels as barred-i if the F2 is closer to the F3 than the F1.
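That rule of thumb is simple enough to write down. A minimal sketch in Python, assuming the three formant values (in Hz) have already been read off the spectrogram by hand. The function name and the fallback to schwa are illustrative assumptions, not part of the rule as cited.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Choose a symbol for a reduced vowel from its first three formants."""
    # Barred-i when F2 sits closer to F3 than to F1.
    if abs(f3_hz - f2_hz) < abs(f2_hz - f1_hz):
        return "ɨ"
    # Assumption: any other reduced vowel is written as schwa.
    return "ə"


# Hypothetical measurements for two short, weak vowels.
print(reduced_vowel_symbol(400, 1900, 2500))
print(reduced_vowel_symbol(500, 1300, 2500))
```

The first call has F2 only 600 Hz below F3 but 1500 Hz above F1, so it comes out as barred-i; the second has F2 nearer F1 and falls through to schwa.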
[z], IPA 133

Lower-case Z
This is definitely voiced (note the striations in the voicing bar at the bottom), and noisy (very little formant-y organization
of the noise). It appears to get a little stronger at the top of the spectrogram, indicating 'acute' or high-frequency center of
energy, typical of /s/. But voiced.
[ə], IPA 322

Schwa
Short little vowel. Nuff said.
[n], IPA 116

Lower-case N
This looks just about the same as the preceding nasal, but shorter.
[ə], IPA 322

Schwa
Short little vowel. Ditto.
[s], IPA 132

Lower-case S
Again, it's really hard to tell what's going on here. All I can say is that it's voiceless, and that it seems to get a little
stronger as you go up in frequency. I'm starting to wonder if it's me or my microphone. Geesh I need some new
equipment.
[ɛ], IPA 303

Epsilon
So you can see this vowel is a lot stronger than the surrounding vowels, suggesting a degree of stress, and therefore we
should pay attention to it. Its F1 is either around 500 Hz or much higher, depending on which band you believe. Let's
just split the difference and say this is a mid-to-low vowel. The F2 isn't amazingly low, suggesting this is sort of front. So
the choices are limited. And even if this is a low vowel, it ain't as low as the previous [æ].
[s], IPA 132

Lower-case S
Another one of these, whatever they were.
[ɨ], IPA 317

Barred I
You get it. Short, runty, little vowel. Moving on.
[ɾ], IPA 124

Fish-Hook R
Short gap, actually kind of longish for a flap, and don't ask me what's going on in the low frequencies. But there's clearly
some kind of contact here, it's not really long enough to be a decent stop. Flap is the only obvious choice.
[i], IPA 301

Lower-case I
Well, again it's tough to see what's going on in the F1, but the F2 is definitely in /i/ territory. The transition into it is a little
low, suggesting /e/, but 'necessitay' isn't really an option.
Last modified: 11/08/2009 22:57:52

"Sometimes they have donuts."

[s], IPA 132
Lower-case S
Now this is what I'm talking about. Ignore the noise at the bottom. I think that's just me blowing into the microphone.
But the noise above that is definitely weaker (it gets stronger over time as presumably the airflow picks up approaching
voicing) but gets stronger as you go up in frequency. Seems to occupy one very wide band rather than being shaped by
the vocal tract, so this is a classic sibilant. Probably alveolar. And voiceless.
[ʌ], IPA 314

Turned V
A while ago I swore to stop using this vowel, but this one really looks mid and back, rather than central and low
(compare, well, it's coming later). What I mean is, this one has an F1 that starts and ends around, well, 600-700 Hz, and
tops out probably below 800 Hz. So not as high as it could be, but definitely higher than absolutely mid, so we're dealing
with a mid-open or a higher-low sort of vowel. The F2 is actually quite low, around, oh, 1100-1200 Hz or thereabouts,
indicating a quite back, or quite round, or both, vowel. Well, back there, there aren't a lot of vowels to choose from.
[m], IPA 114

Lower-case M
From just before 300 msec to around 400 msec or so (ignoring for now that other moment at around 375 msec where
'something' happens) there's a longish stretch of voicing. The resonances above that indicate sonorance, but their
relative weakness suggests a side chamber. So we're dealing with a nasal or something. Concentrating on the first two
thirds or so of this stretch of voicing, the zero appears around 700-800 Hz, and the main resonance just above that. It's
moving up in frequency a little, but let's pretend it's not and just say it's around 1000 Hz. Which is the perfect location
for a bilabial [m] resonance. The thing that happens after 375 msec suggests something has changed in the side
chamber. So looking just at that bit, the zero has raised in frequency a little, and widened in bandwidth a touch, and the
resonance above is somewhat higher, let's say 1300 Hz. Which is definitely not 'around' 1000 Hz, and closer to where I'd
expect an alveolar [n] to have its resonance. This is a clue to the upcoming plosive, which I think is alveolar (oops, I gave
it away), and an alveolar closure happens here in the middle of the nasal. Not sure this would have been audible, or if
audible if it would have been perceivable, but whatever. I think that's what happened.
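The pole comparison in this paragraph amounts to asking which reference value a measured resonance sits closer to. A minimal Python sketch, assuming reference poles of about 1000 Hz for [m] (from the paragraph above) and about 1500 Hz for [n] (the figure given for this voice elsewhere on this page). Both numbers are specific to one speaker, and the names `REFERENCE_POLES_HZ` and `nearest_nasal` are hypothetical.

```python
# Assumed, speaker-specific reference poles in Hz.
REFERENCE_POLES_HZ = {"m": 1000.0, "n": 1500.0}


def nearest_nasal(pole_hz, references=REFERENCE_POLES_HZ):
    """Return the nasal whose reference pole is closest to the measured pole."""
    return min(references, key=lambda symbol: abs(references[symbol] - pole_hz))


print(nearest_nasal(1000))  # first two thirds of the nasal
print(nearest_nasal(1300))  # the stretch after about 375 msec
```

On these assumptions the early stretch comes out as [m] and the later stretch as [n]. A pole alone is weak evidence, which is why the transitions and the following release get checked as well.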
[tʰ], IPA 103 + 404
Which brings us to this burst at 450 msec or so. The nasal resonance starts to fall apart just before the 400 msec mark,
and the voicing falls apart a bit later, so there's a tiny little speck of noisy background noise (I guess that's what that is),
that clearly isn't voicing, and there's a strong burst followed by a longish VOT. Proper voicing doesn't really kick in in
F2 (which for some reason is where we usually look for it) until you get to almost 500 msec, so we have at least 50 msec
of VOT. So this is aspirated (and in English, so presumably voiceless, as aforesuggested). If we weren't sure it was
alveolar because of what happens in the nasal, the burst noise is decidedly [s] shaped, i.e. broadband, strong, centered
in the highest frequencies, which tells us this is an alveolar release.
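The timing argument here is just a subtraction. A small Python sketch, using the burst at roughly 450 msec and the onset of proper voicing at roughly 500 msec as read off above. The 30 msec cutoff for calling a plosive aspirated is an assumed working value for illustration, not a figure from this page, and both function names are hypothetical.

```python
def voice_onset_time_ms(burst_ms, voicing_onset_ms):
    """VOT is the lag from the release burst to the start of voicing."""
    return voicing_onset_ms - burst_ms


def is_aspirated(vot_ms, threshold_ms=30.0):
    """Treat a long voicing lag as aspiration (threshold is an assumption)."""
    return vot_ms >= threshold_ms


vot = voice_onset_time_ms(450, 500)
print(vot, is_aspirated(vot))
```

A lag of 50 msec clears the assumed cutoff comfortably, which matches the reading of this plosive as aspirated.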
[ɑɪ], IPA 305 + 319

So when we get past the VOT, we have a moment where the F1 is still kind of flat and the F2 reaches a minimum,
somewhere around the 550 msec mark. The F1 is way high, indicating an incredibly low (open) vowel. The F2 is very
low, indicating something very, very back (or round). So here we have a very low, very back vowel. Following that
moment, the F1 drops a little, suggesting a bit of raising, and the F2 starts to raise, indicating fronting. So this is a
'falling diphthong', meaning it starts low (open, or 'high sonority') and moves higher (close, or 'lower sonority', falling
= high-to-low sonority), and which I would call a 'low-fronting diphthong', i.e. something of the [aɪ] family, starting low
and moving frontwards.
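The way formant movement gets turned into labels like 'raising' and 'fronting' can be sketched as follows. This is a hedged Python illustration: the function name, the 50 Hz minimum change, and the example frequencies are all hypothetical, chosen only to mimic the shape of the movement described above.

```python
def glide_direction(f1_start, f1_end, f2_start, f2_end, min_change_hz=50.0):
    """Describe a diphthong's movement from start and end formants (Hz)."""
    moves = []
    f1_change = f1_end - f1_start
    f2_change = f2_end - f2_start

    # Falling F1 means the vowel is getting higher (closer).
    if f1_change <= -min_change_hz:
        moves.append("raising")
    elif f1_change >= min_change_hz:
        moves.append("lowering")

    # Rising F2 means fronting; falling F2 means backing and/or rounding.
    if f2_change >= min_change_hz:
        moves.append("fronting")
    elif f2_change <= -min_change_hz:
        moves.append("backing")

    return moves


# Hypothetical values: F1 drops a little while F2 climbs a lot.
print(glide_direction(800, 700, 1100, 1800))
```

For those made-up values the result is `['raising', 'fronting']`. A diphthong whose F1 stays put while F2 falls would come back with just `['backing']`.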
[m], IPA 114

Lower-case M
And here we have something voiced. Not a lot of evidence of resonance above the voicing bar, but that nice strong
voicing bar doesn't look like voicing during a closure, so this isn't likely to be a voiced stop. Call it a weak nasal and
move on, except those lowering transitions in the last glottal pulse or two moving into the nasal make it look more
bilabial than anything else.
[z], IPA 133

Lower-case Z
Hmm. Well. Voiced? Possibly. Fricative? Almost definitely, but it's kind of weak. What there is, between 750 and 800
msec (or thereabouts) is very high frequency, but weak. So this is probably /z/, if not actually [z].
[ð], IPA 131

Eth
On the other appendage, something different is happening right around 800 msec. The very high frequency (for some
value of 'very') noise disappears, or at least lessens a lot, and filtered (i.e. in bands) energy takes its place, in places
roughly analogous to F1, F2 and F3. Still a little noisy, and still at least plausibly voiced. So this is fairly open in
articulation. The transitions in the vowel look vaguely bilabial, but the noise doesn't match up with them. So I think,
taking the noise on its own, the bands (at around 500 Hz, 1600 Hz and, well, somewhere in the 2300-2400 Hz range) look sort of
alveolar. Conceivably. So split the difference and go for dental. Which leaves something interdental or labiodental.
Moving on.
[eɪ], IPA 302 + 319

So that transition probably isn't 'transition', it's the onset frequency of F2. So we've got something with a mid-to-higher-
mid frequency F2, suggesting something sort of front, moving up, suggesting that it moves fronter. The F1 starts sort of
mid (around 500 Hz) and may move down, suggesting decreasing sonority. So what we have here is a diphthongy thing
of some kind, starting mid and front and moving fronter.
[ɦ], IPA 147

Hooktop H
On the other hand, something definitely happens as the F2 reaches its maximum. See how the F3 fuzzes out, and the
energy below F2 kind of dies, even in F1? So something's afoot, as they say. But still articulated as a vowel, for the most
part, but noisy? How can something that noisy still be that open? If it's glottal noise, that's how. But it remains voiced,
so that's how I transcribed it.
[æ], IPA 325

Ash
When the voicing comes back on in the lower frequencies, the F1 has clearly raised to (somewhere around) 700-800 Hz.
Which is clearly higher than many vowels, though not quite as high as it could go. So we're talking about a fairly low
vowel. F2 is sort of front-to-central, which given how low the vowel is suggests something fairly front. Work with me on
this one.
[ʋ], IPA 150

Script V
Well, we have a longish gap beginning just before 1100 msec and going on to the release at 1200 msec. But the first bit is
clearly voiced. Could be perseverative voicing, but, well, could be not. It's a longish gap to suppose it's just one thing,
and the transitions in (all pointing down) don't jibe with the alveolar looking release (tilted toward [s]), so I'm going to
suppose that we've got something labial, and voiced at the beginning. [b] perhaps. Frictionless, certainly. And it will
turn out not to be a stop, but there you go.
[t], IPA 103

Lower-case T
So like I said, this clearly has an alveolar release, so this must be a [t]. Whether it's a /t/ or a /d/ remains
to be seen. But English isn't famous for its /bt/ clusters. So assuming there's a syllable break in here, this has to be an
unaspirated, voiceless plosive, i.e. [t], i.e. /d/ in an initial position.
[o], IPA 307

Lower-case O
Vowel. F1 is around 500 Hz, so probably mid. F2 starts sort of central and trends down. So central but moving backer
and/or rounder. F1 isn't moving a whole lot and F3 isn't telling us anything useful. So this is some kind of /o/. It's a bit
creepy that F1 loses its, well, F-iness and the whole thing sort of flattens out (except for F4) making it look like an
approximant. But whatever. I take the F1 thing to be an indication of increasing nasality, given...
[n], IPA 116

Lower-case N
... the following nasal. Which looks fully voiced, in spite of not having a lot in the way of resonances above the zero. And
the F3 and F4 transitions are pointing down, as if it were bilabial (it can't be velar, because F2 doesn't seem to be rising
at all to 'pinch' with F3). But the transitions on the other side are, well, ambiguous. F2 may be rising, but F3, if anything,
is falling. F3 and F4 may be pinching, and that sometimes means something, but exactly what is in dispute. So it's
probably a nasal, and it will turn out to be alveolar, but at this point I have no idea why. It is both a science and an art,
folks.
[ɐ], IPA 324

Turned A
So here's what I mean about this vowel. This doesn't seem to be a reduced vowel to me. I think this word is a compound,
or at least acts like a compound, regardless of spelling. And this is a fairly low and vaguely central vowel. It is certainly
lower and more central than the [ʌ] at the beginning of this spectrogram. So I transcribed it as lowish and centralish.
Moving on.
[t], IPA 103

Lower-case T
Gap. Not much in the way of voicing. Can't tell much from the transitions since the energy in the vowel starts to die
halfway through the vowel. So try any plosive you like until you find one that makes a word.
[s], IPA 132

Lower-case S
This would be easy to miss, but there's definitely some high(er) frequency noise in the spectrogram, suggesting a weak
[s]. Presumably phonologically voiceless to go with the preceding plosive. Presumably a plural marker.
Last modified: 11/08/2009 22:57:51

Solution for August 2007
"The ducks float downstream."

We seem to be on a wildlife thing. Except that this was supposed to be about rubber ducks in a race. But whatever.
[θ̝], IPA 130

Theta + Raising Sign
Well, there's not much of anything going on, but there's a transient or something that precedes the first 'real' voicing
pulse, which I'm going to take as evidence of something useful happening. What exactly it is, I'm not sure. We can
eliminate voiceless plosive, since that would be aspirated here, and this isn't aspirated. We can rule out anything sonorant,
which would produce, well, sonorance. Which leaves unaspirated plosives and fricatives. Which doesn't leave us a lot of
options. However, if you're familiar with my voice, you know that I tend to stop my dental fricatives word/utterance
initially. So this is probably some kind of /ð/. TMSAISTI.
[ə], IPA 322

Schwa
Tiny little short vowel, especially considering it's in an initial syllable. Must be reduced. Call it a schwa and move on.
[d], IPA 104

Lower-case D
Lovely voicing bar, and long. Clearly a voiced plosive of some kind, and the falling F2 and F3 transitions in the following
vowel suggest alveolar. The length may have to do with the stress, since this is clearly the onset of the most heavily
stressed (and highest pitched) syllable in the utterance. Then again, it may not.
[ʌ], IPA 314

Turned V
I swore I'd stop calling these things [ʌ], but this one is too far back (too low an F2) to count as central. And its F1 is sort
of in between a lower-mid and a middish-lower kind of vowel. So [ʌ] it is, I guess.
[k], IPA 109

Lower-case K
Another plosive of some kind, from just before 400 msec to that clunk just before 500 msec. Longish. That clunk may
or may not be meaningful, so let's ignore it and see how far we get. Nice long closure. F2 in the preceding vowel is
headed ever so slightly upwards, F3 is definitely trending down. So this could be a velar. And velars are often
accompanied by odd clunks of various sorts, so I'm going to move on. (I didn't say this was going to be a good solution,
now did I?)
[s], IPA 132

Lower-case S
So following the clunk and heading to about 600 msec there's some frication. Voiceless, and apparently getting louder
the higher up in frequency it goes. Pretty standard [s].
[f], IPA 128

Lower-case F
On the other hand, after 600 msec, something else is going on. Suddenly we lose a lot of the high frequency stuff, and
we get more stuff in the lower frequencies. But it's not really organized like a 'voiceless vowel', so it's not going to be an
[h]. That leaves [f] or [θ]. I might suggest that the F2 transition into the following vowel, which is rising, might indicate
a labial, but I'd be making stuff up to make myself sound smart. Then again, these two sounds are easily confused, and
I think we can sound out the right form later.
[ɫ], IPA 209

Tilde L (Dark L)
Now this would be easy to miss. There's a very slight change in the amplitude of the voicing/F1 complex round about
750 msec. But this corresponds to an apparent zero, or something, higher up that turns off at that moment too. So
there's something going on here that we might want to pay attention to. (Then again, if we can get a better
word/phrase out of it, we might want to ignore it. But I'm leading you somewhere.) If we take it seriously, we've got a
short bit of sonorant something in between the fricative and the vowel. It has a mid-to-low F1, and a 'quite' low F2,
indicating something backish. But the F3, if anything, is just a little high of where we see it in most of the rest of the
utterance. What do we think when we see a high F3? Right.
[o], IPA 307

Lower-case O
F1 is basically about 500 Hz, possibly a little lower. F2 is about 1200 Hz or so. F3 is trending down, slightly, from sort of
above 2500 Hz to sort of below 2500 Hz. So there you go. Middish, or higher-than-strictly mid. Definitely back and/or
round. F2 is pretty flat, although F1 might also be trending down. So flattish rather than diphthongy [o].
[ʔt], IPA 103 + 113(?)

Lower-case T + Left Superscript Glottal Stop (?)
By which I mean a preglottalized [t]. Which probably isn't right, but I convinced myself that the low pitch was
glottalization and I had to do something to stick in both /t/s. So anyway, the transitions aren't really helping us. F3 is
sort of heading down, unless it's not. F2 is sort of heading up, but it's so low there's nowhere else for it to go really. I'd
rather talk about the release, but that's a separate phoneme, if it isn't a separate 'gesture'.
[t], IPA 103

Lower-case T
So there's a sharp release transient just before 1000 msec. It's distinct from the glottal pulses which immediately follow
it, in both spectrum and amplitude, not to mention timing. The release burst itself has energy in the very highest
frequencies, and except for that bit between F1 and F2, looks very much like an [s]. Which is typical of [t] releases. The
transitions are consistent with an alveolar release.
[aɪ], IPA 304 + 319

I'm a little disturbed by the starting frequency of the F1, but since that's transition out of a plosive, I choose to ignore it.
F1 reaches its flat bit at around 1100 msec near 600 or 700 Hz, which isn't exactly 'high', but will do. F2 starts frontish,
but dives sharply to almost 1000 Hz, so decidedly back and probably round. Typical of a frontish [aɪ] diphthong.
[n], IPA 116

Lower-case N
There's a nice little nasal from about 1200 to about 1260 msec or so. Fully voiced and sonorant, but with an overall
lowered amplitude, a nice little zero at about 800 Hz, and, unfortunately, a clear looking pole around 1000 Hz. Which in
my voice is almost always associated with [m]. I'm going to suggest that something weird is happening with the F2
transition (don't ask me what) and the pole we should really be looking at is the one at 1500 Hz (or so) above the little
microzero thing (whatever that is) above the whatever it is we're ignoring at 1000 Hz. Around 1500 Hz, we get a pole
typical of [n]. So use your imagination and move on.
[s], IPA 132

Lower-case S
So now we move into voiceless territory, for quite a while, and at least from around 1300 msec to almost 1400 msec,
there's serious [s] frication. Almost prototypical. The little release thing at about 1300 msec is probably the release of
the [n] into the fricative. A little stronger and it would be an excrescent [t]. Or whatever they're called.
[t], IPA 103

Lower-case T
And there's this short little gappy thing at 1400 msec. Must be a plosive. Nothing in it to suggest bilabial (there should
be a 'tail' as the [s] noise gets shaped by the closing lips). So it could be [k] or [t]. And what's coming up isn't helping.
There's some release noise (and a lower frequency double clunk that we're going to ignore), which is tilted to the
higher frequencies, except it *is* being shaped by the following segment ....
[ɹ], IPA 151

Turned R
Which features an F2/F3 complex at its left edge, if that's how you want to think about it, around 1600 Hz. F3 never gets
that low (in English) unless it's an /r/.
[i], IPA 301

Lower-case I
So F1 is where now? Hmm. Well, not high at least. Must be below that harmonic at 500 Hz, so we're looking at a fairly
high vowel. F2, when it finally settles down is up around 2200 or 2300 Hz. Great Gatsby, that's high. About as front as a
vowel can get. So this is very, very front, probably very high vowel.
[m], IPA 114

Lower-case M
And we can see the F2 and F3 transitions pointing down, which is typically a bilabial cue, but they're so high where else
could they go. It would be nice if there were a nice little pole to latch onto, but this thing (starting at about 1750 msec)
is so weak that we can't see anything really above the voicing bar. So in the absence of evidence to the contrary, we'll
guess [m] here and try to make a good word out of it.
Last modified: 11/08/2009 22:57:51

"Her pet cat's hip is bad."
[h], IPA 146

Lower-case H
We begin this month's solution with a fricative. We know it's a fricative because it's noisy. It's voiceless, because there's
no indication of striated energy at the bottom (a 'voicing bar'). And it probably isn't an affricate because the stop
portion of an affricate, particularly in sentence-initial position, is probably going to be strongly released, with a clear
transient and burst noise, and the left edge of this fricative isn't like a plosive release at all. Instead, it seems to fade in
(quickly) from nothing (instead of just 'starting' all at once, and very loudly). So this is a fricative. The fact that most of
the noise is in the region of the lower three formants suggests that the noise is being bounced around the resonating
chambers of the vocal tract, so a) the tract is relatively open (let's say 'vowel like') and b) the noise is quite far from the
open end (the lips) so that it can bounce around the resonating spaces and get transmitted to the outside (via the open
end). So this is a glottal fricative.
[ɹ̩], IPA 151 + 431

So, where's F1? Right! Down there below 500 Hz. Somewhere. It's hard to tell exactly where F1 is down there, since it
runs into the voicing bar. But the top edge of the band is supported by a fairly strong harmonic around 500 Hz, so it
can't be any higher than that. So where's F2? Right! Somewhere around or above that strong harmonic around 1200 Hz.
So where's F3? BZZT. Well, I don't know, maybe you spotted it. There's another strongish looking thing rising across 1500
Hz just above the strong F2 harmonic. So while the two strongish bits are close enough together to be a single band,
they don't move like harmonics--which will always be roughly parallel. The F2 band is flat, except when it starts to drop
towards the end, and the F3 is rising throughout. Those aren't strong harmonics in a single strong band--these have to
be two separate bands. So F3 is somewhere around 1500 Hz. And what do we call it when an F3 is that low? Right!
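The argument here rests on how harmonics behave: they are whole-number multiples of the fundamental, so when pitch changes they all scale together and the ratio between any two of them stays fixed. A minimal Python sketch of that check, assuming two hand-measured frequency tracks sampled at the same instants. The function name, the 3% tolerance, and the example numbers are all hypothetical.

```python
def could_both_be_harmonics(track_a_hz, track_b_hz, tolerance=0.03):
    """Return True if two frequency tracks keep a nearly constant ratio.

    Two harmonics of the same voice are both integer multiples of f0, so
    their ratio should not drift over time. The tolerance is an assumption.
    """
    ratios = [b / a for a, b in zip(track_a_hz, track_b_hz)]
    mean_ratio = sum(ratios) / len(ratios)
    return all(abs(r - mean_ratio) / mean_ratio <= tolerance for r in ratios)


# Hypothetical tracks shaped like the bands above: one flat, one rising.
flat_band = [1200, 1200, 1200]
rising_band = [1400, 1500, 1600]
print(could_both_be_harmonics(flat_band, rising_band))

# For contrast, the 12th and 15th harmonics of a pitch rising 100 -> 120 Hz.
harmonic_12 = [1200, 1320, 1440]
harmonic_15 = [1500, 1650, 1800]
print(could_both_be_harmonics(harmonic_12, harmonic_15))
```

The flat and rising pair fails the check, while the genuine harmonic pair passes. That is the sense in which a flat band next to a rising band has to be two formants and not two harmonics inside one formant.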
[pʰ], IPA 101 + 404

So we have a nice little gap, probably indicating a plosive. The residual noise in the low frequencies is ignorable as a
pulse of perseverative voicing. The gap is mostly voiceless. Further it's followed by a sharpish release burst at about 350
msec (that's a burst, cf the initial [h] onset), and about 50 msec or so of aspiration. See the noise? Very loud noise,
actually, but it still has the resonances associated with aspiration rather than a fricative look all on its own. So we have
a voiceless, aspirated plosive from about 275 msec to all the way past 400 msec. Place of articulation? Well, the F2
pointing down into the gap is a pretty dead giveaway for a bilabial. If you look at where the burst noise is in the release
compared to where the formants end up, it looks like the F2 and F3 are rising into the aspiration as well, which is
another pretty good indicator of bilabiality. I suppose we could see that low F3 as indicative of velar pinch (if we ignore
the noise right in the burst and take the F2 to be basically flat), and then the energy in F2/F3 in the release would make
sense. But I'm not convinced. Velar bursts are often (not always) doubled, and rarely have all that strong energy in the
very, very low frequencies. Also the aspiration of velars tends to be a little longer. Which is a long way of saying that I'm
guessing this is a bilabial, but if it really turned out to be a velar, I wouldn't be offended.
[ɛ], IPA 303

Epsilon
Vowel. From about 400 msec to about 500 msec. Basically flat. F1 is a little high of 500 Hz, so we're dealing with
something mid-to-lower-mid probably. That is, certainly not in any way high, and probably not all that low. F2 is
definitely front, being up around 1800 Hz or so. Frankly that's the F2 of a very front vowel in my experience. So we've
got something monophthongal and lower-mid and front. There are a couple of vowels in that part of the space, and at
least in my voice one of them tends to be diphthongized (that is, centralizing). This one isn't. So it's probably [ɛ].
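The height/frontness reasoning used for the vowels here can be written down as a rough rule. A minimal sketch, assuming the 'neutral' values mentioned in these pages (about 500 Hz for F1, about 1500 Hz for F2); the 100 Hz margin and the function name are assumptions for illustration, and real vowel spaces are speaker-specific.

```python
NEUTRAL_F1 = 500   # Hz, 'neutral' (mid) F1 as used in the text
NEUTRAL_F2 = 1500  # Hz, 'neutral' (central) F2 as used in the text


def describe_vowel(f1, f2, margin=100):
    """Return a rough (height, backness) label from F1 and F2 in Hz.

    Low F1 means a high vowel, high F1 a low vowel; high F2 means
    front, low F2 means back. The margin is a hypothetical dead zone
    around the neutral values.
    """
    if f1 < NEUTRAL_F1 - margin:
        height = "high"
    elif f1 > NEUTRAL_F1 + margin:
        height = "low"
    else:
        height = "mid"

    if f2 > NEUTRAL_F2 + margin:
        backness = "front"
    elif f2 < NEUTRAL_F2 - margin:
        backness = "back"
    else:
        backness = "central"

    return height, backness


print(describe_vowel(550, 1800))   # F1 a little high of 500, F2 around 1800
print(describe_vowel(300, 2300))   # very low F1, very high F2
print(describe_vowel(800, 1000))   # high F1, low F2
```

This only gets you to a region of the vowel space; choosing between neighbours like [ɛ] and [æ] still takes the duration and diphthongization cues discussed above.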
[t], IPA 103

Lower-case T
Which brings us to another plosive. This one runs from about 500 msec when the vowel shuts off to at least that
transient at 600 msec, which I take to be a release. Or a closure, considering there's more plosive on the other side. That
duration (from 500 msec to way past 650 msec) is way too long to be a simple, singleton plosive, so I'm taking that
transient to be meaningful. The transitions in the preceding vowel are flat. Which doesn't tell us a great deal in terms of
direct information. But not seeing velar pinch we rule out a velar, and likewise with bilabial so we're going to guess
alveolar and move on. The transient thing could also be alveolar. It could also just be a wad of spit hitting something, so
we'll just move on.
[kʰ], IPA 109 + 404

So now we have this other plosive thing to contend with. The release burst is doubled, so we're really going to consider
velar. F2 and F3 are both high, but arguably pinched together. And there's no sense in which we can see that F2
transition as rising (out of a bilabial) or for that matter pointing toward the alveolar locus, which we hope is
somewhere around 1700 or 1800 Hz. It won't always (as we'll see) but we can hope. So this is a fronted velar. Voiceless
and aspirated, of course. See what I mean about long aspiration?
[æ], IPA 325

Ash
Okay, so now we have a vowel. This one's moving, so we're going to have to talk about its 'moments'. I don't know what
else to call them. They're not 'spectral moments' but durational moments where 'something' seems to be going on. So
F1 makes a nice little arc, with a low closure locus on either side and a maximum in the middle. That middle is
important, since it's as close to the F1 'target' as we're going to get. And it's higher than the previous [ɛ], so I'm going to
guess this is a low vowel. Relatively speaking. F2 is moving, so it has at least two moments. So taking those moments
to be roughly when the F1 starts to level off and starts to dive, more or less (so about 800 msec and about 50 msec later)
we've got something that's still fairly front, and moving towards neutral. Now this is compounded by the front velar
transition, but then since we mostly only get very front velars before very front vowels, I feel safe in suggesting this at
least starts front. And moves centrally. F3 is flat. So we've got something that's lowish throughout (more or less) and
basically front, with centralizing tendencies. So probably [æ]. Which makes the other one even more probably [ɛ].
[t], IPA 103

Lower-case T
So we've got another short plosive here. This one seems to have a transient which might be a release at about 925 msec
or so, but then there's a delay of 10 msec or so before the frication clicks on in the lower frequencies. So probably that's a
plosive release followed not by aspiration (which could come on more or less full strength with release) but by a
separate fricative. Place of articulation? I'd say bilabial, given the F2 transition, but then I'd be wrong. Like I said, you
can't always count on that locus thing, especially in codas. The F3 isn't really bilabial looking, but it's hard to tell that
yet. Certainly not a velar. So could be bilabial, or (as it turns out) alveolar.
[s], IPA 132

Lower-case S
Okay, I mis-located the 'transition' between the fricative and the aspirate. I should have located the aspirate at the
onset of F2, which leaves only the time from the release burst to about 15 or 20 msec later. Or something like that. So
concentrating on that bit, there's no energy to speak of until you get up around F3. Then there's some fairly high
frequency noise that looks like it could be a fricative. Trust me. And if it's a fricative, with that kind of spectral tilt (to
the high frequencies, in a single broad band) it has to be sibilant. Probably [s], or else there'd be more energy in the
F2/F3 range. TMSAISTI.
[h], IPA 146

Lower-case H
So more interesting is the aspirate. Which we all should recognize at this point, being noisy (even possibly weakly
voiced) and above all strongest in the formant bands. Look at that F2. You can follow it all the way from about 1700 Hz
(yay!) until it reaches its 2200 Hz max around 1100 msec, and then falls again through the voiced section of the vowel. So
this has to be an aspirate, i.e. some variety of /h/.
[ɪ], IPA 319

Small Capital I
Well, there's about 100 msec of vowel here. F1 indicates something mid to higher-mid--that strong harmonic or
whatever at 500 Hz being 'it' or the top edge of 'it'. F2 is very front, starting outrageously high and dropping. The fact
that it's moving throughout is a little suspicious--usually there's some indication of a target approximation. So if we
take the F2 extremum (the maximum just before the voicing sets in) it looks like a very front [i]. Not quite as high an F2
as we might expect, but plenty high enough. But then the movement down would have to be transitional. Which I
suppose is possible. Or if we do the other thing, which is to take the middle of the voiced section of the vowel, we have
an F2 around 2000 Hz. Which is very front--fronter than we might expect for an [ɪ], but is more compatible with the
less-than-unequivocally-high vowel indicated by F1. If I can put it that way. So something highish and front. And
relatively short, considering this is arguably the loudest vowel in the spectrogram, and probably the highest F0 (with
the other possible hiF0 spot being the second syllable [ɛ]).
[p], IPA 101

Lower-case P
So here's another plosive of some kind. Same no perseverative voicing as before, but no aspiration to speak of. Well,
there's a short VOT but not anywhere near enough to count. Barely the width of the hashmark at 1300 msec. So this is
voiceless but unaspirated. So either it's phonologically voiced (or lax) and initial in its syllable, or it's phonologically
voiceless (or tense) and final. Bear that in mind for a while. If you see the transitions on both sides, they point down
into the closure, clearly indicating a bilabial. I love it when it's that easy.
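The aspirated/unaspirated call comes down to how long the gap is between the release burst and the onset of voicing (the VOT). A minimal sketch, assuming a hypothetical 30 msec cutoff (the text only says that about 50 msec counts as aspiration and that a VOT barely the width of a hashmark does not); the function name and the example times are for illustration only.

```python
def classify_vot(burst_ms, voicing_onset_ms, aspirated_threshold_ms=30):
    """Return (vot_ms, label) for a plosive release.

    VOT is the time from the release burst to the onset of voicing.
    The 30 msec threshold is an assumption for this sketch, not a
    value taken from the text.
    """
    vot = voicing_onset_ms - burst_ms
    label = "aspirated" if vot >= aspirated_threshold_ms else "unaspirated"
    return vot, label


# Burst at about 350 msec followed by about 50 msec of aspiration.
print(classify_vot(350, 400))

# A release with only a very short lag before voicing (made-up times).
print(classify_vot(1300, 1310))
```

Whether 'voiceless unaspirated' then means phonologically voiced-and-initial or voiceless-and-final is a separate question, as discussed above.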
[ɪ], IPA 319

Small Capital I
So basically here we have the same vowel as before except a) its F1 is even lower (so either it's higher, or just as likely
the harmonic that we're looking at is a little lower) and b) the F2 starts and ends lower. So we've got something that
could be higher (but regardless it's still a highish vowel of some kind) and not quite as front. So again on the balance, I'd
say this was [ɪ].
[z], IPA 133

Lower-case Z
There's a bit of very weak voicing at the bottom of the spectrogram, but nothing much above, except there's clearly
some evidence of noise in the very high frequencies. But wait! When do we get noise that looks like that in the high
frequencies? With nothing below it? With [s]! Imagine the noise were stronger. It would be a very broad band, loudest
in the higher frequencies (higher than 4400 Hz or wherever the top of this spectrogram is) and trailing off as it gets
lower in frequency. Looks like a weak [s], all right. But voiced of course.
[p], IPA 101

Lower-case P
Exactly what's going on at the closure is not clear to me. You can see a nice clear moment where closure is reached in
the higher frequencies, but that corresponds to a pulse or two of relatively strong voicing. I can only imagine that in
closing the lips (and releasing the alveolar closure) there's just enough change in the pressure that you can get just a
little bit more air moving across the glottis until either a) pressure builds up and voicing ceases or b) the vocal folds
move far enough apart for voicing to cease. So we've got a voiceless plosive, either way. The no aspiration/short VOT at
about 1550 msec suggests something unaspirated, so once again we've got something that's either phonologically
voiced (i.e. /b/) and initial in its syllable, or voiceless (/p/) and final. Since /zp/ final syllables are unlikely, I'd guess
that way. Oh, I've already assumed bilabial because once again all the transitions point down into the closure. On both
sides, if you follow.
[æ], IPA 325

Ash
Do not be confused by the sudden change in voicing/amplitude in the last third of this vowel. That's just what happens
to my voice when I hit an utterance final low boundary tone. I don't know why. Maybe someday I'll do an EGG study of
it. But anyway, if you were thinking that's a final nasal or something, it ain't. No zeroes in the low frequencies and too
much definition in the formants anyway. But it is odd that the formants all kind of level off at that point. Hmm. Okay, so
let's see what's up. F1 starts fairly high and gets higher, reaching its maximum at about 800-900 Hz around 1700 msec,
and flattening out from there (until you hit that final transition). Ignoring the transition at the beginning, the F2 starts
at about 1750 Hz or so and transitions down to about 1500 Hz or a little higher, and again flattens out. Look at that.
Anyway, F3 is pretty stable and neutral throughout. So it starts fairly mid-low to low and gets lower, vaguely front and
moves toward central, but if you know my voice, 1500 Hz is still characteristic of something fairly front. Even my
unrounded/centralized [ɯ] has an F2 a little lower than that. So this is centralish-frontish. So we're probably looking at
another [æ], with maybe a schwa or [ɐ] (remember that I'm now using [ɐ] over [ʌ] for the STRUT vowel in my dialect,
which isn't properly back at all) off glidey thing.
[d], IPA 104

Lower-case D
And finally, we've got another plosive. This one is clearly (and quite strongly) voiced through the closure duration, and
even a little further. Obviously over-enunciated, but oh well. It was late, as I recall. F1 transitions down, which is
consistent with approaching closure, F2 is basically just sitting there around or just above 1500 Hz, although it may be
pointed just a wee bit upward from there. Note the corresponding bit of noise on the other side of the release (which
may or may not be meaningful) is at 1700 or so. Hmm. That number sounds familiar? In what context does that number
keep coming up? Hmm. The F3 transition in the preceding vowel is also not telling us much, being flat, but at least it
isn't falling, which would be telling us something like velar or bilabial. So the flatness isn't telling us much, but in not
saying something specific, it's pointing us toward something else. That is, alveolar. Which is consistent with the
high-frequency noise in the release burst, and that teeny bit of 1700 Hz resonance or whatever it is in that tail of voicing that
follows the release.
Last modified: 11/08/2009 22:57:50

"Pyramids are common shapes."
After we do the segmental stuff, we'll talk about the prosody.
[p ʰ], IPA 101 + 404

Initial stops are hard to place, in part because you only get one set of cues rather than transitions into and out of it, the
transitions you do get are often lost in release noise and aspiration, and, well, they're hard. So the sharp onset of energy
at about 75 msec is a clue that we've had a plosive release, rather than just something starting with aspiration. In the
best case, there'd be a nice release transient, but we can't always be that lucky. The absence of such a release burst
might indicate something non-alveolar, since alveolars often have such releases. The sharpness of the release suggests
something non-velar since velar releases are often slushier. Which leaves labial. So the transitions don't tell us much,
but that really low frequency blip might also suggest bilabial. So on the balance, I'd say bilabial. But that's just a
hypothesis. Aspirated, obviously. Aspiration that length is often indicative of a second-in-cluster approximant, but here
I think it's just initial fortition.
[i˞], IPA 301 + 419

Well the transitions through the aspiration suggest a very high F2 target (higher than the F2 locus of the transition in
the aspiration), quickly falling. The only thing with an F2 that high is [i]. The falling is probably forced by the following
segment's F3 target, hence the rhoticity mark. Oh, notice the F1, clearly separated from the voicing bar. Starts lower
than neutral, and moves towards about 500 Hz as the F2 falls to its minimum.
[ɹ], IPA 151

Turned R
There's two things here (about 250 msec) to notice. The first is the F2 (and F3) minimum, i.e. there's a place here where
the F3 changes direction and the F2 flattens out. The F1 fuzzes out a bit before this moment, but I'll take it to all be part
of the same approximant moment. There's also the decreasing energy right up until this moment where the energy in
all formants clicks back on and F1 and F2 flatten out and F3 definitely starts to rise again. This is one of those 'moments'
that we hang a lot of stuff on. The low F3 is the giveaway here. Gotta be an [ɹ].
[ɨ], IPA 317

Barred I
Well, one can see even in the spectrogram that this is very low pitch (though the following vowel is even lower), which
means that it's likely that this vowel is unstressed relative to the first vowel. It looks mid, or slightly high of mid (F1
near or just below 500 Hz). F2 is pretty neutral (1500 or just above), and the F3 is sort of low, but it's still 'recovering'
from being pulled down for the /r/. So this is sort of an r-colo(u)red schwa. Applying Keating et al (1994)'s F2-F3
distance metric blindly, I transcribed this as barred-i.
[m], IPA 114

Lower-case M
Ah, a nice nasal. Clearly voiced and sonorant, but of overall less intensity than the surrounding vowels, with nice sharp
edges, and flat formant structure. Knowing this is my voice, I'd have said this was an alveolar [n], since the pole is a
little above the 1000-1100 Hz range, where I expect bilabials to be. But look at those F2 and F3 transitions, clearly
pointing down into the nasal (even the overall 'rising' F3 seems to hump down a little right at the onset of the nasal).
Downy-pointing transitions have to be bilabial.
[ɨ], IPA 317

Barred I
Very low pitch again (look at those striations--countable! even at this timescale). F1 is middish again, F2 is a little high,
and F3 is neutral. Barred-i again.
[d], IPA 104

Lower-case D
Plosive. Fully voiced, but the voicing bar is of lower amplitude than usual. No resonances. The F2 transition is pointing
down, but not quite as low as for the bilabial. F3 is just hanging there. F4 is actually rising. Rising F3 or F4 is often a cue
to alveolar, especially if there's no hint of F2/F3 pinch. So we've got a voiced alveolar plosive, with a weakish release at
about 600 msec.
[z]̥ , IPA 133 + 402

Lower-case Z + Under-Ring
I decided this was voiceless, that noise at the bottom being noise, but I suppose that could be evidence of very ragged,
weak voicing. The noise segment is short, which is usually a cue for underlying voicing, and mostly in the high
frequencies, a cue for sibilance.
[ɨ˞], IPA 317 + 419

Barred I + Rhoticity Sign
I'm throwing these rhoticity signs around like mad, and I usually don't. But the F3s are interfering with the
interpretation of the F2-F3 distance thing that Keating et al (1994) recommend for transcribing schwa vs. barred-i, and
I'm hedging my bets. This vowel is too short to worry too much about, so there you go.
[ɹ], IPA 151

Turned R
The /r/ here is required to explain this falling F3. Look at that F3. Almost 'pinch'ing into the F2. Hmm.
[kʰ], IPA 109 + 404

See that double burst just ahead of 800 msec? That's a double burst, usually a very good indicator of a velar release. The
fact that it's strongest in F2 or F3 is another, although the strength in the lower (than F2) frequencies might indicate
bilabial. The F2 center, in the burst, is at about 1300 Hz and is definitely falling. The F3 in the burst is just above 2000 Hz,
and is definitely rising. So we have evidence of velar pinch on both sides of this plosive. So probably velar. Also
consistent with this is the really long aspiration.
[ɑ], IPA 305

Script A
F1 and F2 are both sort of straddling 1000 Hz, which is pretty typical of low-back [ɑ]. F1 as high as it can get
(considering the F2); F2 as low as it can get (considering the F1). Lowest and backest.
[m], IPA 114
Lower-case M
Another nice nasal. This one obligingly has a clear resonance around 1000 Hz, so it must be a bilabial.
[ə], IPA 322

Schwa
Short little, low pitched and probably stressless vowel. Call it schwa and move on.
[n], IPA 116

Lower-case N
And here's another nasal. This one obligingly with a nice high resonance and basically nothing at 1000 Hz. Must be
alveolar.
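The nasal-place reasoning used in these sections boils down to a rule of thumb that can be sketched like this. It's only an illustration, assuming the speaker-specific values quoted in the text (a resonance around 1000-1100 Hz for bilabials, something higher for alveolars), ignoring velars entirely, and letting downward-pointing transitions win over the pole, as happened with the earlier [m]. The function name and the 1100 Hz cutoff are assumptions.

```python
def nasal_place_guess(pole_hz, transitions_point_down=False):
    """Guess bilabial vs. alveolar for a nasal.

    pole_hz: frequency of the visible nasal resonance, in Hz.
    transitions_point_down: True if F2/F3 in the neighbouring vowels
        point down into the nasal (taken here as overriding the pole).
    The 1100 Hz cutoff is a hypothetical, speaker-specific value.
    """
    if transitions_point_down:
        return "bilabial"
    if pole_hz <= 1100:
        return "bilabial"
    return "alveolar"


print(nasal_place_guess(1000))          # clear resonance around 1000 Hz
print(nasal_place_guess(1500))          # high resonance, nothing at 1000 Hz
print(nasal_place_guess(1200, True))    # highish pole, but transitions point down
```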
[ʃ], IPA 134

Esh
Sibilant fricative--look at all that high-amplitude noise. Notice it's darkest in the mid frequency range, down to F2,
where it dies suddenly. The zero (or whatever it is) below F2, along with the relatively low center of gravity in the noise
(relatively low compared to a very high-frequency centered [s]), is a pretty good cue for [ʃ].
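The 'center of gravity' comparison between [s] and [ʃ] can be made concrete with a small calculation. A minimal sketch, assuming the center of gravity is taken as the amplitude-weighted mean frequency of the noise (other weightings, such as power, are also used); the spectra below are invented numbers chosen to look [s]-like and [ʃ]-like, not measurements.

```python
def center_of_gravity(freqs_hz, amplitudes):
    """Amplitude-weighted mean frequency of a spectrum slice, in Hz."""
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    weighted = sum(f * a for f, a in zip(freqs_hz, amplitudes))
    return weighted / total


freqs = [1000, 2000, 3000, 4000]

# Invented amplitudes: energy piling up toward the top of the range ([s]-like)
s_like = [0, 1, 2, 5]
# Invented amplitudes: energy strongest in the mid range ([ʃ]-like)
sh_like = [0, 4, 3, 1]

print(center_of_gravity(freqs, s_like))
print(center_of_gravity(freqs, sh_like))
```

The [s]-like slice comes out with the higher center of gravity, which is the pattern the text relies on.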
[eɪ], IPA 302 + 319

The amplitude discontinuity at about 1400 msec is probably just my voice slipping from modal voice to fry or
something, due to the low pitch, rather than going from oral to nasal, or vowel to approximant. Although in a sense I do
go from vowel to approximant. The point is, it's not a nasal, in spite of how it looks. So looking at the F1, it starts a little
high of neutral and moves to neutral/mid fairly quickly. And stays there. The F2 starts quite front and moves fronter
(higher). F3 is pretty flat and neutral. So this is a middish vowel that starts front and moves fronter.
[p], IPA 101

Lower-case P
The loss of voicing makes the transitions hard to see. All we really know is that there's a gap starting at about 1500
msec and lasting to just shy of 1600 msec. That's quite a gap, all things considered. Apparently voiceless--at most two
pulses of perseverative voicing, depending on exactly when you think the closure occurred. The release burst is a little
ambiguous--it looks like a [t] burst, in terms of having a sibilant component, but then the following fricative is sibilant
as well, so that might just be coproduction. The noisy blip at the bottom is a little worrying, since it's sort of like the
noisy blip at the bottom of the initial release in this utterance. Which, if I recall, I took then to be evidence of bilabial
release, at least on the strength of knowing what the spectrogram was. So we've got something that probably isn't velar,
could conceivably be either alveolar or bilabial, and at least it doesn't have the full-spectrum sharp release often
associated with alveolars. So probably bilabial, but maybe we should just say 'voiceless stop' until we can get some
lexical access in here.
[s], IPA 132

Lower-case S
Well, this is a nice looking [s]. It's clearly noisy, and fairly high amplitude, strongest in the highest frequencies and
apparently centered off the top of this spectrogram, so well above 4000 Hz. And it forms a single broad band, trailing off
into the low frequencies (and not sharply shutting off below F2, like the previous [ʃ]). So this is probably an [s].
Okay, so let's talk prosody.
The Tones and Break Indices (ToBI) system is a set of notation conventions that can be adapted for use in
describing and analyzing the intonational patterns in a language. A ToBI transcription contains a pitch track, an
'orthographic tier', with a transcription time-aligned with the pitch track, a 'tone tier' with tonal autosegments
indicated, and a 'break index tier' indicating juncture. In my version, I time align everything to a spectrogram. I replace
the orthography with phonetic transcription, time aligned with spectrographic landmarks rather than just word edges.
I put tones and break indexes on the same tier, mostly to save space.
English conventions, broadly, recognize four levels of break index--roughly 0 for clitic boundaries, 1 for word
boundaries, 3 for phrase boundaries and 4 for utterance boundaries (aligned with the right edge, or 'end' of the
constituent). 2s are used for 'anomalous' junctures--disfluencies, things that feel like phrase ends but don't get phrase-
appropriate tone marks, things that have phrase-appropriate tone marks but don't have phrase appropriate timing,
that sort of thing. The assumption is that these mark the right edges of strictly layered prosodic groupings (so a 4
corresponds to a 3 and a 1 simultaneously, since the end of an utterance must also be the end of a phrase and the end of
a prosodic word).
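The strict-layering assumption can be spelled out as a tiny lookup. This is just a sketch of the description above, not part of any ToBI tool: the names are made up, 0 is treated as falling below the prosodic word, and 2 is rejected because it marks an anomalous juncture rather than a level in the hierarchy.

```python
# Layered constituents and the break index that marks their right edge,
# as described in the text (0 = clitic boundary, 2 = anomalous juncture).
LAYERS = [(1, "prosodic word"), (3, "phrase"), (4, "utterance")]


def implied_edges(break_index):
    """List the constituents whose right edge a break index marks.

    Under strict layering, a higher break index also closes every
    lower-level constituent.
    """
    if break_index == 2:
        raise ValueError("2 marks an anomalous juncture, not a layer")
    if break_index not in (0, 1, 3, 4):
        raise ValueError("break index must be 0, 1, 2, 3 or 4")
    return [name for level, name in LAYERS if level <= break_index]


print(implied_edges(4))
print(implied_edges(3))
print(implied_edges(1))
```

So a 4 comes out as the end of a prosodic word, a phrase and an utterance all at once.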
In English, there are a number of * tones, notably H*, H*-L and L*, where the *ed autosegment is aligned (usually) with
the stressed syllable of a prosodic word, so you'll usually get one for every non-0 BI (unless there's some deaccentuation
or something under focus, or something like that). % tones (boundary tones) usually align with the right edge of a 3 or 4
BI, i.e. marking the boundary of the phrasal constituent. In English, the assumption is that boundary tones align to the
BI, but 'spread' leftward to the end of the * tone.
The difference between H* followed by a L%, and a H*+L complex tone is subtle, and I chose to mark H*-L as sort of a
cheat. In some ToBI systems, - tones are associated with phrasal boundaries (i.e. 3s) instead of % tones (limited to
utterance or 4 boundaries). All three lexical words in this utterance seem to have the same tone pattern--the last one
has a low final tone (from the utterance-final 4), and the first one can have a low -L associated with the 3 (which also
accounts for the relative length of the last syllable), but the L second syllable of "common" is unaccountable. So I
declared that one an H*+L. But that felt arbitrary. So I sort of compromised.
If you feel strongly about this sort of thing, feel free to discuss this further. ;-)

"A bell adorns the gate."
So, this month, or rather last month, we continue the 'return to basics' trend, with a bunch of voiced plosives. Check
out those transitions. The goal here was to get each voiced plosive with a preceding schwa, so you could see the
transitions. Well, I tried.
[ə], IPA 322

Schwa
Well, it's kind of short, and the F2 is closer to the F1, so this is a schwa. The F1 falls from about 700 to about 500 Hz, F2 is
steady about 1300 Hz and F3 is a little high, around 2600 or 2700 Hz. Starts with a 'hard pulse', but not really a glottal
stop. Vaguely midish, and not amazingly backish. Hmm. Well, it was supposed to be a schwa, perhaps opened a little by a)
overarticulation and b) utterance initialness.
[b], IPA 102

Lower-case B
Just over 100 msec of gap, with evidence of relatively strong voicing for the first, I don't know, two-thirds or so. That's
as voiced as a plosive this long is ever going to get, due to aerodynamics. Fairly sharp burst, by which I mean there's a
very narrow vertical line accompanying the onset of voicing. It's much 'sharper' than my plosive releases usually are.
Note that it doesn't have much in the way of high frequencies, and the transitions all trend down into it, and up out of
it. F4 especially. Can only be a bilabial.
[ɛ], IPA 303

Epsilon
Well, everything is moving, so we're going to have to do our best. F1 starts at about 500, and rises a little. Owing to the
lowness of the transition, I'd say that overall this vowel looks like it's basically a little more open (higher F1) than mid,
so we're talking a midish, to lower-midish kind of vowel. F2 is about as neutral as you can get without actually trying,
and trends down (but that's clearly in transition to the following minimum). F3 is rising from neutral. So I'd say from
300 msec to about, well 400 msec, we've got a vaguely mid-to-lower-mid, not very front but not at all back vowel. Which
brings us to the [æ - ɛ] part of the space. And for me [æ] is usually going to be longer. But whatever. Somewhere down
there.
[ɫ], IPA 209
Tilde L (Dark L)
So from 425-500 msec or so there's this region of full voicing, resonance, with a funny zero or something between F2
and F3. So we're dealing with something basically open and sonorant, but with less overall energy than a vowel. But
continuous formants with the surrounding vowels, so this isn't a nasal. And anyway, there's much too much F1 for it to
be a decent nasal. So this is an oral approximant. F1 is, well, a little lower than mid, so it's sort of higher-mid, but not at
all high. F2 is about as low as it gets for me, around 1000 Hz. F3 is definitely higher than anywhere else in the utterance,
although it's a little fuzzy. So the low F2 indicates something very back, think velarized, and the raised F3 is usually a
good indicator of laterality. Therefore, dark /l/. An /r/ would have a low F3, and a /w/ would have a lower F1 and an
unraised F3.
[ə], IPA 322

Schwa
Another short, inconsequential vowel. Trust me.
[d], IPA 104

Lower-case D
But the transitions. F4 is doing something weird, but it's basically straight. F3 may be trending down just a little, but
since it was so high it can hardly do anything else. And not at all 'dropping like a rock' like you might get with a labial.
F2 is not trending down at all, which leaves us alveolar or velar as possibilities, and there's just no way the F3 and F2 are
'pinching' together, even if you believe the F3 is trending down. So probably this is an alveolar. Longish closure
duration, and quite strong voicing for the first three-quarters or so, so definitely voiced.
By the way, if you're wondering, a voiceless plosive can show some voicing energy during the closure, especially in
codas. But typically the voicing is weaker, dies off quickly (this and the previous [b] the voicing seems sort of 'steady'--
cf the coming plosive), and usually doesn't last into the second half of the closure. Just a rule of thumb.
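That rule of thumb can be sketched as a check on how far into the closure the voicing lasts. A minimal illustration, assuming you've already read the closure start, closure end and the point where voicing dies off the spectrogram; the function name and the example times are invented, and the 'second half' cutoff is just the rule of thumb above, not a hard criterion.

```python
def closure_voicing_guess(closure_start_ms, closure_end_ms, voicing_end_ms):
    """Return (voiced_fraction, guess) for a plosive closure.

    voiced_fraction is the share of the closure (0.0 to 1.0) during
    which voicing continues. Voicing that lasts into the second half
    of the closure is taken as a sign of a voiced plosive.
    """
    duration = closure_end_ms - closure_start_ms
    if duration <= 0:
        raise ValueError("closure must have positive duration")
    fraction = (voicing_end_ms - closure_start_ms) / duration
    fraction = max(0.0, min(1.0, fraction))
    guess = "voiced" if fraction > 0.5 else "voiceless"
    return fraction, guess


# About 100 msec of gap with voicing for roughly the first two-thirds.
print(closure_voicing_guess(200, 300, 267))

# A gap with only a couple of pulses of perseverative voicing.
print(closure_voicing_guess(1500, 1600, 1520))
```

Strength and steadiness of the voicing matter too, and this sketch ignores both.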
[o], IPA 307

Lower-case O
So everything's moving again, but the F1 is basically straight, around 600-700 Hz. There's a 'moment of interest' where
the F2 reaches a minimum, at around 750 msec, and more or less at the same moment, F4 reaches a minimum and
changes direction too. So I'm going to suggest that that 'moment' is important, and the fact the F3 is just sliding
through it just means that it's sliding on to its big 'moment', a bit later. So at the 750msec moment, F1 is mid or a little
lower than mid (although nowhere near as open as the end of the second vowel, before the dark /l/ kicks in), so
basically pretty mid. The F2 is low, indicating something fairly back and round. The F3 we can ignore because it's on its
way somewhere else. So mid-ish and very back and round. Not that many possibilities in my voice (western American
English).
[ɹ], IPA 151

Turned R
So the next moment where anything 'happens' is at about 850 msec, when F2 and F4 are moving, but F3 hits its
minimum. F1 is still mid-ish, F2 is as high as it can be without hitting F3, and the F3 is down around 1700 Hz. When F3
gets that low, it's just gotta be an [ɹ]. Ignore the F4. It'll just confuse you.
[n], IPA 116

Lower-case N
Well, this is a nasal. Nice strong voicing, but F1 just goes away. There's a region of low energy (a 'zero') just above that,
and there's just a little energy around 1500 Hz, which is never there in a bilabial nasal. The transitions into the closure
don't suggest a bilabial either. Nor do they 'pinch' as for a velar, so again we've probably got an alveolar on our hands.
[z], IPA 133

Lower-case Z
So around 1000 msec, the voicing continues but drastically loses energy. At the same time the upper resonances switch
off, and there's a trace of noise in the very high frequencies. This is pretty typical of voiced [z]. The frequency and
bandwidth of the noise is consistent with [s], but very, very weak, which is consistent with the attempt to maintain
voicing.
[ð]̝ , IPA 131 + 429

Eth + Raising Sign
There's a short gap following the end of the sibilant noise, apparently around 1075 msec or so, which is also when the
voicing dies away. But the following bit of noise kicks on at about 1100 msec, has a different spectrum to it (it's got more
stuff in the lower frequencies, and it's 'shaped' into resonances a little more), and the noise seems not to accompany a
burst of any kind. So we've probably got another fricative, but this one 'raised', i.e. partially occluded. The 'voicing' I
get from a) knowing the answer and b) how short both the gap and the frication phase are (underlyingly voiceless
fricatives tend to be longer overall, and louder). Can't be sibilant, which leaves the labiodental, the dental, and [h]. I
guess I could make a case for the noise being continuous with the resonances of the voicing in the following vowel, but
that's really only true of the F2. So probably not an [h]. But how you tell the difference between [v] and [ð] I'm not so
sure. Let the lexicon sort it out.
[ə], IPA 322

Schwa
Now this is another short vowel. Technically, I should have transcribed it as a barred-i, following Keating et al. (1994),
but I didn't. Sorry.
[ɡ], IPA 110

Lower-case G
Another voiced gap. This one has weaker voicing that dies off, but it still has voicing for most of the closure duration, so
in the balance, I'd say this was voiced. If the voiced closure duration were shorter, relative to the voiceless duration, I
might say voiceless. But not aspirated. Ennyhoo, see how the F2 and F3 in the reduced vowel seem to be pointing
together, and how in the following vowel they still seem to be together? That's velar pinch. So this has to be a [ɡ].
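'Velar pinch' is just F2 and F3 heading toward each other as you approach the closure, so it can be sketched as a check on how much the F3-F2 gap shrinks. This is an illustration only: the function name, the 200 Hz minimum and the formant tracks are all assumptions, and real tracks are a lot messier than this.

```python
def is_pinching(f2_track, f3_track, min_closing_hz=200):
    """Return True if F2 and F3 converge toward the closure.

    Both tracks are lists of frequencies in Hz, in time order, ending
    at the edge of the closure. min_closing_hz (hypothetical) is how
    much the F3-F2 gap has to shrink to count as a pinch.
    """
    gap_start = f3_track[0] - f2_track[0]
    gap_end = f3_track[-1] - f2_track[-1]
    return gap_start - gap_end >= min_closing_hz


# Invented tracks: F2 rising and F3 falling into the closure.
print(is_pinching([1700, 1850, 2000], [2600, 2450, 2300]))

# Invented tracks: both formants flat, so no pinch.
print(is_pinching([1500, 1500, 1500], [2500, 2500, 2500]))
```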
[e], IPA 302

Lower-case E
So now we have a vowel. F1 is a little low, but still basically mid-ish. F2 is very high, around 2100 Hz, and very flat. F3 is
neutral, but if the F2 were any higher it would end up pushing the F3 out of the way. So we've got something mid, but
very, very front. And not diphthongized, in case anyone was wondering.
[t], IPA 103

Lower-case T
And one last gap, from 1450 to 1550 msec, where it gets released into some noise which is much stronger in the higher
frequencies than the lower. That is, it's suspiciously sibilant looking. Specifically like a short [s]. What plosive releases
into something which could be [s]-like? That's right, [t].
Et voilà, there you go!
So, how'd you do?


"We have two dogs."
So, it's January, and we're starting with basics. The main exercise here is to recognize my 'point' vowels, the vowels that
are the furthest apart in my vowel space. These are [i], [æ], [ɑ], [u]. And I worked pretty hard at producing an actual
back round [u], in spite of the consonant context. You'll see that I failed, but I blame that on context. More below.
[w], IPA 170

Lower Case W
Much as I, as a phonetician, prefer to avoid the term 'glide', it's worth noting the term's facility in describing the
appearance of these pre-vocalic, (semi)vowel-like things, which are most reasonably regarded not as steady-state
'segments' on their own but as beginning (or ending) points of transitions. That is, they 'glide' into (or out of) the
following vowel. So when you see this kind of movement, especially where the transitions are 'straight', i.e. not
accelerating or decelerating, think of the semivocalic approximants. This one has an absurdly low F2 starting point
indicating something very back and round. If you take the three (or so) pulses before the F2 and F3 kick on (about 20-25
msec around the 100 msec tickmark) as a 'moment' of steady state, even though you can't see the F2, it has to be really
really low. The F3 starts around 2500 Hz, which is fairly neutral, not greatly lowered as it would be for [ɹ], and not at all
raised as it would be for a lateral. So that pretty much leaves [w].
[i], IPA 301

Lower-case I
Our first vowel. Abstracting away from the transition, we've got a fairly steady F2 from 200 msec to almost 350 msec.
The F1 goes from around 250 msec and moderates a little before you get to 300 msec, but it's still well below neutral
(around 500 Hz for F1), so this must be a fairly high vowel. F2 is absurdly high, up around 2300 Hz, which is about as
high as my F2 ever gets. It's practically into my F3 range. So this is very front. About as front as you can get. High and
front. And fairly long. Must be [i].
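The height/backness reasoning used for these point vowels can be sketched in a few lines of Python. This is only an illustration: the neutral values (about 500 Hz for F1, about 1500 Hz for F2) come from the text and are specific to my voice, while the function name and the tolerance bands are hypothetical numbers picked for the example.

```python
# Neutral (schwa-like) reference values taken from the text: about 500 Hz
# for F1 and about 1500 Hz for F2 in this speaker's voice.
NEUTRAL_F1_HZ = 500
NEUTRAL_F2_HZ = 1500


def describe_vowel(f1_hz, f2_hz, f1_band=100, f2_band=250):
    """Return (height, backness) labels for a steady-state vowel.

    f1_band and f2_band are hypothetical tolerances around neutral;
    they are illustration values, not measurements.
    """
    if f1_hz < NEUTRAL_F1_HZ - f1_band:
        height = "high"   # low F1 means a high vowel
    elif f1_hz > NEUTRAL_F1_HZ + f1_band:
        height = "low"    # high F1 means a low vowel
    else:
        height = "mid"

    if f2_hz > NEUTRAL_F2_HZ + f2_band:
        backness = "front"           # high F2 means front
    elif f2_hz < NEUTRAL_F2_HZ - f2_band:
        backness = "back or round"   # low F2 means back and/or round
    else:
        backness = "central"
    return height, backness
```

With an F1 well below neutral and an F2 around 2300 Hz, `describe_vowel(250, 2300)` gives `("high", "front")`. The labels are coarse on purpose; choosing between, say, [æ] and [ɑ] still takes the kind of judgment described in the prose.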
[ɦ], IPA 147

Hooktop H
Okay, this may be the first time I've used this symbol—it's certainly the first time I've used it in a long time. Hooktop-h
is the voiced counterpart to [h], and if you're wondering how you can voice an [h], think "breathy voice". So you can see
the striations at the bottom. That's voicing. Above that, however, the energy is mostly noisy rather than periodic. That's
a fricative. So we've got a voiced fricative. What makes it glottal, or rather epiglottal (don't ask), is that the noise is
clearly organized into formant-like resonances, and continuous with the vowel resonances. So the F1 is around 900 Hz,
right in line with the following vowel. F2 is transitioning between very high to, well, falling (it kind of levels out, a little,
for a while in the middle, but you can't really tell that from the formant trace). The F3 also transitions, a little less,
toward neutral. I don't know what's going on in the F4, but then I don't care. So what we have here is something with
the oral articulation of a vowel, or semivowel, or something vowel-like, with resonances and no real closure, but lots of
friction echoing around the oral cavity. And voiced. So this is an /h/. Or rather, [ɦ].
[æ], IPA 325

Ash
Now we have another vowel. Again, F1 is fairly flat for most of it, until the last quarter or so. It's high, around 800 Hz or
so, which is about as high as my F1 ever gets. Well, not quite, but close. So this is a very, very low vowel, relatively
speaking. F2, while moving, kind of flattens out, briefly, sort of, around 1500 Hz, maybe a little lower, which is
essentially neutral. But by no means back. And it's hard for very low vowels to be very front, so you have to interpret this
carefully. So very low, which in my voice means only [æ], [ɑ] or in a pinch [ʌ]. And it's not back enough to be [ɑ]. Which
limits the possibilities. And once you clue in to the 'pointiness' of the vowels in this spectrogram, you arrive at [æ]. But
note the falling formants in the last 25-50 msec or so.
[v], IPA 129

Lower-case V
Falling transitions in all formants like that almost always mean labial, if not bilabial. There's a short (probably voiced)
gap for about 30 msec or so around 650 msec, but then it's noisy, but without a release burst or anything. So this is
probably more fricative than anything else. Possibly tightened to stoppiness, but since there's no plosion, I'm hoping
you'll believe me if I suggest that it just takes that amount of time to recover airflow and generate enough pressure to
get this noise. That and I had to fix so much in the transcription so many times that I just don't have it in me to go back
and add a raising diacritic. So we've got a voiced labial fricative and in English that can only be labiodental. Moving on.
[tʰ], IPA 103 + 404

So there's a real gap from 700 to about 775 msec, with a nice sharp release burst and about 50 msec of aspiration. By
which I mean [h]-like noise between release and the onset of voicing. So we have an aspirated plosive here. The burst
noise is oddly concentrated in the F2/F3 range, and depending on how you interpret it, it could be F2/F3 'pinchy'. But
that would be inconsistent with the noise in the aspiration, which is [s]-shaped—a single broad band, centered in the
highest frequencies. So on balance this looks alveolar.
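If you like to see the aspiration argument as arithmetic, here is a minimal sketch. VOT is just the time from the release burst to the onset of voicing. The 40 msec cutoff is a hypothetical threshold chosen for illustration, not a value from my measurements, and the function names are made up.

```python
def voice_onset_time(release_ms, voicing_onset_ms):
    """Time from the release burst to the onset of voicing, in msec."""
    return voicing_onset_ms - release_ms


def is_aspirated(release_ms, voicing_onset_ms, threshold_ms=40):
    """True if the voicing lag is long enough to call the plosive aspirated.

    threshold_ms is a hypothetical cutoff used only for this example.
    """
    return voice_onset_time(release_ms, voicing_onset_ms) >= threshold_ms
```

For the plosive above, the release is at about 775 msec and voicing starts roughly 50 msec later, so `is_aspirated(775, 825)` is true.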
[ʉ], IPA 318

Barred U
Another vowel, this time with the F1 really, really low, below 250 Hz or so. So a really high vowel. F2 is quite low, almost
as low as it ever gets in my voice, certainly in a real vowel. So we have a backish and mostly unround vowel. Or possibly
a centralish and fairly round vowel. Or something. I was trying hard to make this round, and not at all front, which my
post-alveolar /u/s tend to be (due to the western US merger of post-coronal /u/ and /ju/). I'm quite proud that there
isn't a distinct frontish on-glide to this. So not as back as it could be, but fairly round.
[d], IPA 104

Lower-case D
Okay, here's that big red herring. The transitions into and out of this are exceedingly misleading. But to details. Voiced,
weakly, but you can see it down there at the bottom. F2 transitions up to near 1500 Hz, and then down again on the
other side. So definitely not labial. Now for an alveolar we'd hope to see something even higher than that, around 1700
to 1800 Hz or so. But you can't have everything. The real scary part is the F3 and F4 transitions, which are just, well,
misleading. Usually, downward pointing transitions, particularly upper transitions, are indicative of a labial. Not here.
At least not 'officially'. I think this just indicates the extent of rounding in the preceding vowel. So given the uppy-
pointing F2 and the downy-pointing F3, this looks like velar pinch. Except there's no reason for F4 to do that for a velar.
So on the balance, alveolar is probably the best guess, but there's no shame in being misled. This spectrogram is about
the vowels anyway....
[ɑ], IPA 305

Script A
The formant trace places the F1 of this around 700 Hz or so, but I'd actually place it a little higher. I think what it's
finding is a fairly strong harmonic at the bottom of a fairly diffuse band, and I'd put F1 up around 800 Hz. Similarly I
think the F2 is placed between the 'true' F2 and the 'true' F1, so the F2 should be up around 1100 Hz or so. But even if
you trust the formant trace, we've got something on the lower end of mid, and something in the very backish/roundish
space. The only thing that's ever back there in my speech is some kind of reflex of the "LOT"/"THOUGHT" vowel, which
are merged in my dialect. The [ɑ] character is clearer if you place the formants where I place them, but given this is my
voice, even something lower-mid and that back could only be a /ɑ/ just because there's nothing else back there for it to
be.
[ɡ], IPA 110

Lower-case G
Now here's a nice example of velar pinch. Even though it's not very 'pinchy' the transitions into this gap have to be
velar. F2 rising and F3 falling, with F4 rising or hanging out. In some models F2 and F4 are coupled (as the first and
second resonances of one cavity or other) especially approaching velars. That combination of rising F2 and falling F3, in
the absence of any other evidence, indicates velarity. So this is a velar plosive. At the bottom, we've got fairly strong
striations that last most of the closure, so this must be voiced. Purely perseverative voicing into a voiceless closure
usually doesn't last more than a 1/4 of the vowel duration or so, and is usually weaker and dies off more sharply.
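The transition pattern described here (F2 rising, F3 falling into the closure) is easy to state as a check on two formant tracks. This is a rough sketch that assumes you already have F2 and F3 samples from the end of the vowel, earliest first; the function name is hypothetical, and real tracks are noisy enough that you would want more than an endpoint comparison.

```python
def looks_like_velar_pinch(f2_track_hz, f3_track_hz):
    """Check for F2 rising and F3 falling into a closure.

    Both tracks are formant samples from the end of the preceding vowel,
    earliest sample first. Only the endpoints are compared, which is a
    simplification made for this example.
    """
    if len(f2_track_hz) < 2 or len(f2_track_hz) != len(f3_track_hz):
        raise ValueError("need two tracks of equal length, at least 2 samples")
    f2_rising = f2_track_hz[-1] > f2_track_hz[0]
    f3_falling = f3_track_hz[-1] < f3_track_hz[0]
    return f2_rising and f3_falling
```

A result of `True` only says the transitions converge. As with the [d] earlier, downward F3 transitions can come from rounding in a neighbouring vowel, so this is evidence rather than proof.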
[z̥], IPA 133 + 402

And finally, we have a nice fricative. It seems to be strongest off the top of the spectrogram, so its center seems to be
above 4400 Hz, which is where my spectrograms typically cut off. The only thing that centers that high is an alveolar
fricative. While not strictly speaking 'voiced', (no striations) this is too weak to be a fully voiceless [s]. A voiceless [z] is
produced with passive devoicing (i.e. not enough pressure-drop across the glottis, with adducted vocal folds) as
opposed to a truly voiceless [s], which would have high flow across the glottis and abducted vocal folds.
So, how'd you do?


Linguistics Department - University of Manitoba
CANADA R3T 5V5
"The event is a local tradition."
Before diving in, take a moment and work out some basics. How many syllables are we dealing with? Are there any cues
that suggest where the lexical stresses and/or intonational accents and boundaries are? Any obvious segmental cues?
Inexplicably long vowels? Gaps? Apparent nasalization? Sibilance?
Eth
[ð], IPA 131
So there's this voicing that starts around 50 msec in from the left edge. The resonances of the vowel don't really kick in
until at least 125 msec, so that leaves us with almost 75 msec of voicing to do something with. Just 'cuz there's a vowel
following, I'm voting for consonant. And not a very open consonant, although since this is the beginning of an
utterance, I'd assume a fair amount of initial strengthening going on. If you look up to the 2500 range, and again at the
3500 range, what's that? That's noise. So this is probably a fricative. If it were sibilant, it would be louder. If it were /h/
it would have more F1/F2 involvement. Which leaves the labiodental and the interdental. And those transitions don't
look labial. The F2 starts too high, although frankly the F3 and F4 are not helping.
Lower-Case I
[i], IPA 301
Well, there's this big transitional thingie, but I'm looking specifically at that moment between 200 and 225 msec. The F1
is low, so it's high. F2 is way high, so very front. So we're dealing with a high front vowel. Owing to another 150 msec
more or less of vowel (even if the F2 is moving throughout), there's probably another vowel following, meaning we're at
a hiatus moment here. Meaning this syllable must be open, which means this vowel, if high, is tense. Think about your
phonotactics.
Schwa
[ə], IPA 322
Well, the F2 is moving throughout. So deciding on a backness value is perhaps a waste of time. It's interesting that the
vowel definitely is lower (though still not 'low') than the preceding vowel, so this is probably middish. But the F2 never
really comes to a stop, or even slows down, so there's no evidence of a true 'target' for the F2. Which makes me think of
vowel reduction. Which makes me happy because then I don't have to decide anything about this vowel.
Lower-Case V
[v], IPA 129
On the other hand, this F2 transition can't be anything but labial. It's way lower than the alveolar "locus" (we can argue
about the earlier transition into the [i], but it looks vaguely alveolar, more alveolar than labial, if it comes to that).
Perhaps most importantly, the F3 and F4 are clearly lowered--aside to Vineeta Chand: but not at all 'low' ;-)--suggesting
some labialization in here somewhere, but nothing like real rounding. So this thing from 350 msec to 400 msec or so is
vaguely labial, in the sense of not obviously being round, and being more labial than coronal or velar. So anyway, we've
got a consonant, voiced, and if we look really closely, fricative. That could be the noise that you get with my mushy
stops, but you don't get that kind of voicing even in my mushy stops. This being English, there aren't a lot of voiced
labial fricatives to wonder about.
Epsilon
[ɛ], IPA 303
So for almost 100 msec, there's a vowel. The F1 is, well, higher than 500, but not by much. So this is mid or vaguely
lower than mid. The F2 is, well, higher than 1500 but not by very much, so this is front, but not like front in high-and-
front. So middish to lower-middish, and vaguely front.
Lower-Case N
[n], IPA 116
So for almost another 100 msec, there's full voicing, but the amplitude drops suddenly at 500 msec. So we've got full
vowel from 400 to about 500 msec, and then something less than fully open from 500 to almost 600 msec. Around 600
msec there's a burst, so we'll have to jam in a plosive in here somewhere, but the voicing (and the upper frequency
resonance) aren't consistent with a voiced stop. So this is something else. The sudden change in amplitude suggests
nasal, although frankly I'm at a loss why it looks like it does. The transitions in the preceding vowel could go either
alveolar or labial, depending on your mood, although the F4 is just sitting there, which might tip the scales toward
coronal. But the pole, if that's what it is that stands in for F2, is moving (from about 1250 to around 1600 Hz), which is
just odd. And there's no obvious zero. So exactly what to do with this one is a mystery to me.
Lower-Case T
[t], IPA 103
So this burst at 600 msec needs explaining. I explain it thus. This is how homorganic nasal-stop clusters look. Nasal
nasal nasal, with an oral burst. It's definitely alveolar, as the burst noise is 'acute' (higher in the high frequencies--i.e.
looks like about 5 msec of an [s]). There's definitely a disruption to the regular voicing pattern, although how exactly
we're supposed to realize voicing or voicelessness aligned with so 'instantaneous' a cue like this burst is again a mystery
to me.
Barred I
[ɨ], IPA 317
So after the burst, there's a short little vowel. Again, the F2 is just zooming, so I'd say there's no particular vowel
target, or at least the vowel target gets overwhelmed by the overlap with the flanking transitions. But that's just a
theory. It's reduced. Skip it.
Lower-Case Z
[z], IPA 133
Okay, so looking at the higher frequencies, there's some frication going on. It's not organized in bands; it's one big
band, centered really, really, high. Which makes it look sibilant, but it's not really that loud. But if you look down at
the bottom, there's some voicing. It's dying away really fast, but it's there. Which explains the relative weakness of the
noise--hard to maintain sibilant airflow and voicing at the same time! So what we have is weak [s] noise, accompanied
by voicing. That is, [z].
Schwa
[ə], IPA 322
I'm not intending to make it a rule that if the F2 is just moving, you should ignore the vowel, but really? Is there *any*
evidence that the F2 is going somewhere other than between where the consonants are pulling it? Okay, well, if it is,
then this one has some kind of slightly low F2, so if you really want this to be back or round, fine with me.
Tilde L (Dark L)
[ɫ], IPA 209
But it's back because it's being coarticulatorily (?) velarized by this bit. From 825 msec or so for at least 100 msec,
there's something consonantal happening. Now, I'd say this was a nasal. Look at those sharp edges. Look at those flat
resonances. What's more, I'd say [m], because that nice little pole at 1000 Hz just screams [m]. But I'd be wrong, because
I didn't consider the upper formants. Look, there they are. In spite of the lower and higher apparent zeroes, these look
pretty strong. In a nasal, all the formants get relatively weaker than they would be in an oral vowel. Compare the
overall amplitude of the resonances to the vowel in the preceding [n]. And then there's the F3 frequency to contend
with. It's raised. Compare the F3 here with that in any of the preceding vowels. That's raising. So that gives us a clue--
this could be a lateral, and that 'pole' is just the very low resonance of a very back (velarized) lateral. Ooh. I think we're
on to something.
[oʊ], IPA 307 + 321
Well, on the far side of the lateral is this thing. F1 starts around 500, maybe a little higher, and in the last third or so it
drops, so this vowel goes from sort of mid to sort of high. And it's way back and possibly round, judging from the low
F2, and it gets backer/rounder. Easy.
Lower-Case K
[k], IPA 109
Well, this one is rough. The F2 and F3 are too far out of normal to provide much in the way of useful transitional
information, at least traditional transitional information. We've got a mushy gap, so we're probably talking about a
plosive. It's voiceless, so we're down to three possibilities. That's progress. The real giveaway is the burst. It's not [t]
looking. It's got bands, and if anything it's got a weak bit up around 4200 Hz. There's some energy in F2, but not below.
So this isn't really good for bilabial. So guess velar. Well, okay I can then convince myself that the F2 and F3 transitions
into the following vowel are vaguely pinchy (it's a stretch, but there are always some leaps of faith in this enterprise). And
then there's that clunk. See it? If the main burst is at 1100 msec, then at about 1125 msec, in the F1/F2 region? See it?
That clunk? What if that's the second burst of a double burst? Ooh, double bursts are usually characteristic of velars.
Ooh, corroboration. Gotta love it, especially if it's all you've got.
Tilde L (Dark L) + Syllabicity Mark

[ɫ̩], IPA 209 + 431
Does this look familiar? Except this time it's between consonants.

[tʰ], IPA 103 + 404
So the first thing to notice about this is the release. It looks alveolar/[s]-like. And that VOT is amazing. Too amazing.
Turned R + Under-ring
[ɹ̥], IPA 151 + 402
This explains the length I guess. Aspiration tends to go along with the duration of the following approximant, if there is
one. And there is one. There must be or there's no explanation for that F3. See how both F3 and F2, when the voicing
finally kicks on, are both below 2000 Hz. That low F3 is a dead giveaway.
Barred I
[ɨ], IPA 317
And here's another one of these stupid vowels.
Lower-Case D
[d], IPA 104
Well, here's another long gap. Quite a long gap, actually. And fully voiced. That's funny. The transitions are consistent
with alveolar, but it's so hard to rely on those. But the release looks like another alveolar release. Except this time it's
voiced, and the VOT is short, if it's greater than zero at all.
Small Capital I
[ɪ], IPA 319
Well, there's some high frequency noise, but given that we've got an alveolar plosive on one side and a clearly sibilant
fricative on the other, I think we can just call it coarticulation, or reverb or something. So what have we got? Well, it's got to be a
vowel. Probably high, or at least highish, judging from the F1. Quite front, judging from the F2, but not as front as the
first vowel in this spectrogram. So this is probably [ɪ].
Esh
[ʃ], IPA 134
Ah, sibilants. This is high amplitude, relatively broad band, high frequency energy. This one is more 'shaped', by
resonances/filters, than, say, the aforementioned [z] or the noise for the [t] releases. It's also centered a bit lower--most
of the others have their center, at least conceptually, above 4000 Hz. The center here is definitely lower, just above 3000
Hz. And the energy shuts off abruptly (relatively speaking) below the F2 of the surrounding vowels. That's pretty
typical of [ʃ].
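The "where is the noise centered" question can be made concrete with a spectral centroid, i.e. an amplitude-weighted mean frequency. The sketch below assumes you have a handful of (frequency, amplitude) samples from the frication noise. The 3500 Hz boundary is a hypothetical midpoint between the two centers mentioned above (above 4000 Hz for the [s]-like noises, just above 3000 Hz here), and the function names are made up for the example.

```python
def spectral_centroid(freqs_hz, amplitudes):
    """Amplitude-weighted mean frequency of a noise spectrum."""
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * a for f, a in zip(freqs_hz, amplitudes)) / total


def guess_sibilant_place(center_hz, boundary_hz=3500):
    """Label a sibilant by the center of its noise band.

    boundary_hz is a hypothetical cutoff between an [s]-like center
    (above 4000 Hz in these spectrograms) and an esh-like center
    (just above 3000 Hz).
    """
    return "alveolar" if center_hz >= boundary_hz else "postalveolar"
```

Keep in mind that my spectrograms cut off around 4400 Hz, so for a real [s] the true center is probably higher than anything you could compute from what's visible.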
Schwa
[ə], IPA 322
Last vowel of the spectrogram. Kind of short for a final vowel. Must not be that important. Seriously. Final syllables
lengthen. If this is the lengthened version, how short would the unlengthened version be?
Lower-Case N
[n], IPA 116
And finally, something that has reduced overall amplitude, clear resonances, and definite zeroes. At least, something
that's definitely nasal. The pole, what you can see of it, is up around 1500 Hz, which if you're looking at my voice is
pretty clearly alveolar looking, especially without any hint of velar pinch in the incoming transitions.

"Students complain a lot."

And calm down, this isn't directed at anyone in particular. It's just a paraphrase of something amusing somebody said
to me in an e-mail. The original was too long and had too many approximants in it. Believe it or not, I think about these
things.
So once again, before diving in, take a moment and work out some basics. How many syllables are we dealing with? Are
there any cues that suggest where the lexical stresses and/or intonational accents and boundaries are? Any obvious
segmental cues? Gaps? Sibilance?
Lower-Case S
[s], IPA 132
Speaking of sibilance, this is a typical sibilant. Quite high amplitude noise, here concentrated in the very high
frequencies. This really can't be anything except an [s]. The amplitude, the frequency, and the single, broad band. And
no voicing bar.
Lower-Case T
[t], IPA 103
Well, there's a gap here. It's short, but it's pretty distinct. It's voiceless, but then there's not much choice, considering
this is still the onset of an utterance-initial syllable and the preceding consonant is [s]. Okay, so the release isn't really
sharp, and there's a hint of a double burst. But the double burst, if that's what it is, isn't in the F2/F3 range, so this
probably isn't a velar. It could be bilabial, but the noise is all wrong. There's not a nice sharp, across-all-frequencies
burst like I usually like to see with coronals, but that's what we're left with. So that noise up in the high frequency
range is pretty interesting. It's not going to be another fricative, [s-t-(fricative)] not being a conspicuous onset in
English, so that noise has to be release noise. And it's [s]-shaped, suggesting that the plosive is alveolar. If you're not
sure why, think about where the noise following the release of a [t] would be generated, and where the noise of [s] is
generated.
Barred I
[ɨ], IPA 317
I'm not 100% sure about this transcription. Let's go through the details. F1 is low, well clear of 500 Hz low, so this is a
high vowel. The F2 is falling slightly. It starts at about 1600 Hz and drops to about 1400 Hz. So the F2 is around 1500 Hz,
which is pretty much neutral territory. So this is vaguely central, moving very slightly back. Or round, but whatever. F3
is up where it's supposed to be, i.e. neutral as all get out. So high and central. But this is probably a stressed syllable.
Remember that this is my voice, general western US English with particular shades of southern California. My /u/ isn't
what you would call back. Or round, but whatever. Random fact about my dialect: /u/ and /ju/ neutralize after coronal
stops. To something quite flat. Like this.
Lower-Case D
[d], IPA 104
Another gap, this time with a nice clear voicing bar. So this one is voiced. There's that same funny maybe-double-burst-
thing-perhaps. There's not a lot else to the release, which is kind of troubling, but what there is looks like the other one,
which was coronal. In the absence of strong evidence to the contrary, I'd say [d].
Lower-Case N + Syllabicity Mark

[n̩], IPA 116 + 431
Well, we've got a nice little amplitude peak here, about 100 msec long, but it doesn't quite look like a good vowel. The
voicing is fairly strong, and there's a decent resonance in the low frequencies, but not much energy above. There's a
hint of something at 1500 Hz, and there's more energy (but a little diffuse in frequency) in the F3 range, but all in all
this is pretty weak. One might almost say zero. Which of course we associate with nasals. There's an antiresonance (or
zero) sapping all the energy out of this thing. Could be some kind of weak approximant, but the resonances, such as
they are, are wrong. So this is probably nasal. And the hint of energy at 1500 Hz suggests alveolar. There being
obstruents on both sides, this is probably syllabic.
Lower-Case S
[s], IPA 132
And here we have another sibilant. Voiceless, and higher in the high frequencies, so this is another [s]. Now, it's
voiceless. So eventually you'll have to decide whether it goes with the following syllable or the preceding. And there's a
trick here. So look at the next segment and then we'll talk.
Lower-Case K + Right Superscript H

[kʰ], IPA 109 + 404
We've got a gap, followed by another sibilant looking bit of friction. But this one is more 'shaped'. If you look at the [s]s,
they have one big broad band. This has a high-frequency band, something in F3, something in *F2*, and just generally
looks filtered. So if we look closely at the F2 transition in the release noise, it starts a little high and drops. So while this
is consistent with an alveolar release, it doesn't look like the [t] release we had earlier. So what else could it be? Well, it's
probably not bilabial, since the F2 transition is falling from above-neutral rather than rising from below. So the other
choice is velar. Frankly this is ambiguous, so eventually we'll have to make a word out of it to know for sure.
Now here's the tricky part. This looks like aspiration. Depending on where you count from, the release noise is at least
50 msec long. This is an aspirated stop. So if the preceding [s] is part of this syllable, this should be an [s-(stop)] cluster,
and there shouldn't be much in the way of aspiration. But if the [s] is part of the preceding syllable, it follows a syllabic
[n] and is probably a plural marker or something (this being the second syllable of the utterance and therefore part of
the subject NP). But we'd expect a plural marker following a syllabic sonorant (a voiced sound) to be voiced, i.e. [z].
Now, I'm famous for devoicing my [z]s, but the result doesn't look quite as [s]-like as this. The noise of [z]s is typically
quite weak, compared to your average sibilant, even when devoiced. And devoiced [z]s are always (in my experience)
seriously shorter than analogous [s]. And this [s] isn't. So this is an [s].
Eventually, you'll notice there's another hypothesis, that there's an underlying voiceless consonant in between the [n]
and the [s]. Probably homorganic with the [n] (and coincidentally with the [s]). I'm not saying it deletes necessarily, but
if you've been following these things for a while, homorganic nasal-stop clusters tend to have reduced stop phases.
They look like [n]s (or whatever) with release bursts at the end. But instead of just releasing, there's an [s] to contend
with. It would be nice if there were a nice little burst, but life ain't perfect.
Moving on.
Schwa
[ə], IPA 322
Tiny short vowel, possibly a figment of my imagination. But I think there are a couple of nice clear periods of vowel just
before 700 msec.
Lower-Case M
[m], IPA 114
This one is another nasal, for the same reasons the last one was. This one is bilabial, however, because the pole is
around 1000 Hz, rather than 1500 Hz.
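The pole-frequency rule of thumb for my nasals (around 1000 Hz for bilabial, around 1500 Hz for alveolar) amounts to a nearest-reference lookup. Here is a minimal sketch. The two reference values come from the text and apply to my voice only; the names are hypothetical, and velars are left out because no pole value for them is given here.

```python
# Reference pole frequencies for this speaker, as stated in the text.
NASAL_POLES_HZ = {"m": 1000, "n": 1500}


def guess_nasal(pole_hz):
    """Return the nasal whose reference pole is closest to pole_hz."""
    return min(NASAL_POLES_HZ, key=lambda sym: abs(NASAL_POLES_HZ[sym] - pole_hz))
```

So `guess_nasal(1400)` comes out as `"n"`: a little lower than the reference, but close enough.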
Lower-Case P + Right Superscript H

[pʰ], IPA 101 + 404
So there's tiny short gap around 800 msec, with a burst and a looong period of aspiration. The release doesn't look like
the coronal bursts we saw before, and there's nothing remotely velar looking about the burst or the transitions
(although the transitions are getting screwed around with by other things). The strongest noise in the release is
impossibly low, which sometimes happens with bilabial bursts, but not usually a cue to be relied on. But here I'll take it.

[ɫ̥], IPA 209 + 402
Well, the length of the aspiration is suspicious. Also the transitions are bizarre. Notice the F3. It's almost up at 3000 Hz.
That's just freaking high. And the F2, once the voicing kicks on, is incredibly low. The F1 is fairly high too. The F1 tells us
this is fairly open. The F2 that this is about as back as anything can get, considering the F1. And the F3 is *raised*. So
this is a dark [l]. Laterals often have a raised F3 (or sometimes F4), and the velarization lowers F2. I don't know how
general this is, but in most voices I've looked at, aspiration/voicelessness persists a lot longer when there's an
approximant following than if there's just a vowel. Almost as if [-voice] is spreading wholly onto the next segment.
Hmm.

[eɪ], IPA 302 + 319
Well, that's what it is. The F2 target (if you believe in them) is wonked completely out of whack (or whacked completely
out of wonk?) by the velarized [l]. The F1 starts out slightly high of mid and falls, so the vowel goes from not-high to
high. The F2 ends up way front, so this is a fronting diphthong. And the F1 doesn't start quite high enough to be [aɪ],
but with the [l] that could be a red herring. Here it's not, but it could have been.
Lower-Case N
[n], IPA 116
Teeny short little gappy thing, but voiced. Fully voiced, with a nice strong voicing bar and some resonance. So this isn't
a standard flap, which is usually more plosive looking. If you abstract away from the length, or lack thereof, this could
be a nasal. You can even see a little bit of pole around 1400 Hz. Okay, it's a little lower than the previous [n], but close
enough. Nasal flap. Remember it.
Schwa
[ə], IPA 322
Another vowel. I notice now that I've misplaced the segmentation mark. I think the vowel here starts at the end of the
flap, let's call that 1125 msec, but it's not quite, and goes on to, well, let's call it 1175 msec. That's just short. And so
probably reduced.
Tilde L (Dark L)
[ɫ], IPA 209
The peak of the 'constriction' or whatever you call it is where I marked it, but as I said above, I think the contact starts
much earlier than I've marked. But whatever. Somewhere in here, let's say between 1200 and 1300 msec, we've got
another dark [l]. Ignore the indiscernible F1. F2 is low. F3 is still way high.
Turned Script A
[ɒ], IPA 313
I love that symbol. I don't know if this was really round but the F2 was a little lower than I expected. The F1 is high, so
this is a very, very low vowel. And it's about as back and/or round as it can get.
Lower-Case T
[t], IPA 103
I don't know if that last pulsey thing at 1600 msec is a glottal pulse or a closure transient, but it's the last evidence of
anything for almost 100 msec of serious gap. So we have another plosive (stop) here. F2 transitions in the vowel look
vaguely pinchy--but only because the F3 is returning to neutral from being raised, and the F2 is returning to neutral
from being lowered. The release at 1700 msec definitely looks like a nice alveolar burst.

Solution for (mid-)July 2005
"They don't know where to go."
Eth
[ð], IPA 131
I worked really hard at making an unstopped, fully voiced initial Eth, and I'm still not 100% successful. But look at that
voicing. More than 100 msec of it. But the visible frication is, well, as weak as it should be, but it really only creeps in in
the last 25 msec or so, and really only once the voicing in the upper formants starts to creep in, which might as well be
vowel. So I don't know what to do. We've got voicing. We've got fricative, sort of. We've got transitions into the
following vowel consistent with alveolar (less so than velar or bilabial--the F3 is just sort of sitting there, where for
either velars or bilabials it should start lower), but there's no sense in which the fricative looks sibilant. So we're talking
some kind of coronal fricative, nonsibilant. And voiced. Narrows it down quite a bit, actually.

[eɪ], IPA 302 + 319
Well, I'm not sure there's a height movement happening (that F1 looks pretty flat to me) but the F2 definitely starts one
place and ends up another. I'm not sure how much of that is transition from the consonant, and how much is
diphthong, but the fact that it's straight and not curved suggests two targets rather than one target with a longish
transition toward it. But I don't know. Mid vowel, judging from the F1, front, judging from the F2, and if it's a
diphthong, it has to be tense.
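The "straight, not curved" observation can be checked by measuring how far the F2 track strays from a straight line drawn between its first and last samples. This is only a sketch: it assumes the samples are evenly spaced in time, and the 50 Hz tolerance is a hypothetical value chosen for illustration, as are the function names.

```python
def max_deviation_from_line(track_hz):
    """Largest distance (Hz) between the track and the straight line
    joining its first and last samples. Assumes evenly spaced samples."""
    n = len(track_hz)
    if n < 2:
        raise ValueError("need at least 2 samples")
    start, end = track_hz[0], track_hz[-1]
    return max(
        abs(value - (start + (end - start) * i / (n - 1)))
        for i, value in enumerate(track_hz)
    )


def looks_straight(track_hz, tolerance_hz=50):
    """True if the track stays within tolerance_hz of a straight line."""
    return max_deviation_from_line(track_hz) <= tolerance_hz
```

A track like `[1800, 1900, 2000, 2100]` counts as straight, while `[1800, 2050, 2100, 2100]` rises quickly and then levels off, which looks more like a transition into a single target.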
Lower-Case D
[d], IPA 104
Looking at this again, I'm wondering if I mis-transcribed it. We've got a shortish gap, with pretty full voicing
throughout. There's even some evidence of resonance through the gap, so maybe I should have transcribed as a flap.
But I didn't, so there you go. It's a gap, so if we take it as a plosive, then we have to decide which one. Well, voiced. No
velar pinch, which leaves bilabial or alveolar. And the transitions don't really look bilabial.
[oʊ], IPA 307 + 321
Ah, movement. This vowel, and the upcoming couple of consonants, were the whole point of this utterance. Okay, F1 is
a little higher in frequency than the previous vowel, but it's still in the mid range. The F2 starts up around 1750 Hz but
then falls sharply to a low just about 1000 Hz. So this is mid. It seems to start front, but falls outlandishly sharply
towards definitely back and round. I don't know how much of that is allophonic and how much is just me and my
screwed up back vowels. Something to think about as we move on. But anyway, the only things that can end up that
back in my dialect of English are phonemically/historically back. And round. So this is an /o/, and it looks like a
diphthong in my speech. Remember that. For the record, I think I transcribed it as nasalized (at least the offglide) before
I decided that there was a nasal consonant in the following coda, distinct from the onset after that. More below.
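For readers who want to check this kind of movement against numbers, here is a minimal Python sketch. It only compares the start and end of a formant. The function names and the 200 Hz 'flat' threshold are assumptions made for illustration, and the F1 values are placeholders; only the F2 values (roughly 1750 Hz falling to roughly 1000 Hz) come from the description above.

```python
def formant_movement(start_hz, end_hz, flat_threshold_hz=200.0):
    """Label the overall direction of a formant between two time points.

    flat_threshold_hz is an assumed value: smaller changes count as 'flat'.
    """
    change = end_hz - start_hz
    if abs(change) < flat_threshold_hz:
        return "flat"
    return "rising" if change > 0 else "falling"


def describe_vowel_movement(f1_start, f1_end, f2_start, f2_end):
    """Translate F1/F2 movement into rough articulatory terms."""
    f1 = formant_movement(f1_start, f1_end)
    f2 = formant_movement(f2_start, f2_end)
    # Higher F1 means a lower vowel; lower F2 means backer and/or rounder.
    height = {"rising": "lowering", "falling": "raising",
              "flat": "steady height"}[f1]
    backness = {"rising": "fronting", "falling": "backing and/or rounding",
                "flat": "steady backness"}[f2]
    return height, backness


# F1: placeholder values (flat, mid range). F2: about 1750 Hz to about 1000 Hz.
print(describe_vowel_movement(550, 550, 1750, 1000))
```

A two-point comparison like this says nothing about whether the movement is a diphthong or just a long transition; that still has to be judged from the shape of the track.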
Lower-Case N + Glottal Stop

[nʔ], IPA 116 + 113
At about 450 msec, the spectrum changes abruptly. All the energy between the formants drops out. The movement
in F2 and F3 changes direction or slope or both pretty suddenly. But there's 50 msec or so of solid voicing, with
resonance (let's say 'pole') at about 1000 Hz, something going on at 2500 Hz or thereabouts, even energy at 3500 Hz.
Okay, so the absence of energy is evidence of a zero of some kind coming in, which is evidence of nasality. So far so
good. I transcribed as a nasal with a glottal stop, knowing what the utterance was, but actually, I have no good way of
knowing it's an alveolar nasal. This could be a nasalized [ʊ]. That would explain the low pole (it would be F2). The
evidence for glottality comes from the ragged striations in F3/F4 up around 2500. I guess the real clue that there's
something going on here, coda-wise, is the abrupt change at about 480msec. More below.
Lower-Case N
[n], IPA 116
Now this is a nasal. It's fully voiced, the zero(es) have expanded to kill even the lower pole. And the resonances are flat.
Frankly, it still looks like a bilabial, but I'm *really* hoping that that's just coarticulatory rounding. The point of this
part of the spectrogram for me was to see what this transition looked like. When you've got one continuous closure (I
suppose) at the alveolar ridge, the velum down continuously. I'm sure there's tongue and jaw gestures happening to
distinguish the coda nasal from the onset one, but I'm not sure how they'll show up. I'd assumed that we'd see evidence
of glottality and a change in the amplitude of voicing. Well, neither of those is amazingly robust. What I definitely see is
the change in the amplitude of the zero, which I'm going to have to think about some more.
[oʊ], IPA 307 + 321
Well, we've got an F1 in the mid range (around 500 Hz) but this time, it's moving slightly downward. The F2 is starting
slightly back of centered (i.e. below 1500 Hz), but again is moving downward. So this is starting mid and vaguely back,
and moving backer (and/or rounder) and higher. Once again, this is /o/.
Lower Case W
[w], IPA 170
Well, that's the lowest F2 you're ever likely to see from me. The F1 isn't quite as low as I'd expect for an intervocalic
(and presumably onset) /w/, but oh well this isn't an amazingly strong boundary, in spite of its syntax. Which we don't
know about yet. Oops. Okay, so we've got an F1 in the mid to higher mid range, an F2 extremely low indicating extreme
backness and/or rounding. F3 not doing much. So this is something resonant, backish and roundish. The lowered
energy in the high frequencies suggests a consonant. Given that it's obviously surrounded by things which are 'more'
vocalic, this is something vowel-like acting like a consonant, i.e. an approximant. So if it's a choice between /u/ (or /o/
or something like that) and /w/, we know which to pick.
Epsilon
[ɛ], IPA 303
Well, this vowel is mostly transition, but from what we can see of this syllable, the F1 is still hovering in the mid-range.
Abstracting away from the preceding /w/, the F2 seems to be heading toward the front space. The F3 is dropping,
eventually coming to a minimum around 850 msec, so we'll save that moment for later. But the point is the F2 is
heading as high as it can until it starts to run into the lowering F3. So this is middish, and frontish. And given the
following segment (that F3 thing) it's probably not worth asking whether it's tense/long/diphthongized or
lax/short/centralizing.
Turned R
[ɹ], IPA 151
So the F3, which up until about 850 msec is hovering between 2300 and 2700 Hz dives to around 2000. Think that's not
low enough? Get real. Especially for a coda /r/ with a front vowel. So the reason we're not worried about tense or lax in
the previous vowel is that the tense/lax distinction neutralizes before /r/. La.
[tʰ], IPA 103 + 404
So there's a couple of very widespread pulses in this gap (from about 900-950msec or so), but given the aspiration
following the release, I'd not worry about it. The release occurs at about 950msec, and there's not only at least 50 msec
of VOT, but there's higher frequency noise (look at that around 3500 Hz for 100 msec after that). And that noise is high
frequency, broad band and *very* strong. So that's sibilant release, which means this plosive has to have been a) behind
the teeth and b) close enough to produce noise upon release that resembles an [s]. Hence an aspirated /t/.
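The VOT part of that argument can be sketched in a few lines of Python. This is only an illustration: the function name `classify_vot` and the 30 msec cutoff are assumptions, not measurements from this spectrogram. The release time (about 950 msec) and the 50 msec or more of VOT are the values read off above.

```python
def classify_vot(release_ms, voicing_onset_ms, aspirated_cutoff_ms=30.0):
    """Return (VOT in msec, rough label) for a plosive.

    aspirated_cutoff_ms is an assumed, illustrative threshold.
    """
    vot = voicing_onset_ms - release_ms
    if vot < 0:
        label = "prevoiced"
    elif vot < aspirated_cutoff_ms:
        label = "short-lag (unaspirated)"
    else:
        label = "long-lag (aspirated)"
    return vot, label


# Release at about 950 msec, voicing resuming about 50 msec later.
print(classify_vot(950, 1000))
```

VOT alone says nothing about place; the sibilant-like quality of the release noise is what points to /t/.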
Barred I
[ɨ], IPA 317
So amid all that high frequency noise, there's a neeny little vowel. Given how short it is, it's not worth thinking too long
about. But it's mid or higher-mid judging from the F1, and slightly front, judging from the F2. But it's reduced, so treat
it that way.
Lower-Case G
[g], IPA 110
The gap here starts at about 1050msec and goes on to, well, after 1100 msec. And the voicing is pretty strong and fairly
consistent, considering it looks like a serious plosive and not something slushymushy like my plosives so often are. The
burst is very broadband, and it's a little tilted to the higher frequencies. But the burst isn't 'sharp' the way the [t] release
was. It could almost be 'double'. Hmm. And there's a lot of energy just about 1700 Hz or so, rather than being clearly
concentrated higher up. So while I'll admit this is ambiguous, the slightly low F3 might be evidence of velar pinch.
It's unlikely to be coarticulation with the previous [r], just because there's at least a whole syllable in between.
[oʊ], IPA 307 + 321
And finally here's another /o/. Again it seems to be mostly mid throughout (the F1 is basically flat at 500 Hz for 400
msec or so). It starts out sort of central (in the mid F2 range) and moves decidedly backer/rounder (lower in frequency).
So I guess my /o/ is a diphthong. My /e/ is still open to interpretation....

Winnipeg, Manitoba
CANADA R3T 5V5
Solution for (mid-)June 2005
"I can drive home after dark."
To repeat: There are different styles of reading--this left-to-right business is just how I do it for convenience. As time
goes on, I'll be introducing other styles. One of the things that I always forget about, at least when sitting down to do
these is the 'big' picture stuff. For instance, how many syllables (or at least vowels) are there in this? What evidence of
segmentation do you see? Where? Can you see anything suggesting pitch peaks/lows, correlates of stress like
amplitude, length, or pitch excursions? Once you've done that sort of thing, usually you go through and mark all the
things that are obvious--the sibilants, the nasals, if you can see them, things that are obviously [i] or [a], that sort of
thing. Then, once you've got the big picture, then you start in on specific cues.

[aɪ], IPA 304 + 319
I'm not sure what's going on in the voicing from the onset (around 125 msec) for the first 25 msec or so. Maybe it's
residual glottalization. There's not much other explanation for what's going on here. And trying to force a consonant
here won't get you very far--it would need to be voiceless, noisy, but have formants. Not something that is easy to do.
Except for [h], you don't have many options, and it's tough to make a word, let alone a sentence, with an initial [h] here.
So ignoring it, the F1 starts around 800-900 Hz or so, and falls. Low low low but moving up. The F2 starts around 1250 Hz
or so and rises. So back back back (and probably not so round round round, since the F3 is a little high at the beginning
at least) and moving forward. Sounds like a fairly classic /aj/ diphthong. I go back and forth about transcribing these. In
my voice, this can only have an [ɑ] as its nucleus.
Lower-Case K + Superscript Lower-case H

[kʰ], IPA 109 + 404
Well, it's mushy, as my (especially velar) stops are, but okay. The voicing perseverates just a little into the gap, but at
least in the low frequencies there's a serious loss of energy starting at about 250 msec and going on to almost 325 msec.
The transitions into and out of this gap clearly evidence velar pinch. The double burst is also a pretty good pointer to
velar. I'm not sure now whether that VOT is really long enough to count as aspirated, but it's clearly a voiceless release.
Schwa
[ə], IPA 322
My favo(u)rite. About 25 msec of vowel. Too short to do anything with, too long to ignore. Must be reduced due to
extreme stresslessness. Funny how something so short can still get a pitch accent on it, tho. Hmm.
Lower-Case N
[n], IPA 116
Well, what's really interesting is this almost 100 msec of consonant. Sonorant, with formants and everything, and
obviously very fully voiced, this clearly has less energy than your typical vowel. So it must have some kind of closure
somewhere. But it's *long*. Okay, so this is probably a good candidate for a nasal. I just wish it had an obvious zero.
Well, up near 3000, but that doesn't really count by itself. Oh well. The F1 is clearly being depleted by something. Let's
call it a weak zero... and then the pole is nice and high, as poles go, up around 1500 Hz. So that's a good indication of an
alveolar nasal.
Lower-Case D
[d], IPA 104
Well, really the only evidence of an obstruent moment here is the transient--be it release, burst, or clunk (that being
the technical term for a moment like this that we would otherwise choose to ignore). But there it is, and if it isn't a
clunk, we have to explain it. As I've observed before in these things, my (especially homorganic) nasal-plosive
sequences tend to look like this, using Steriade's Aperture Theory, a nasal closure with an oral-looking release. So this is
some kind of oral release. It looks slightly velar, concentrated in the middle frequencies rather than the upper
frequencies (as would be more typical of an alveolar). But then it would be tough to make a word out of. The transitions
are not amazingly helpful, in that the preceding sound is a nasal, and the following is an /r/, which perturbs the
formants beyond all useful visual cues. So know there's a plosive here and move on.
Turned R
[ɹ], IPA 151
Well, this is perhaps the lowest F3 I've ever produced. So there you go. The F1 is very low (very close articulation), the
F2 is quite low. And the F3 is so low it's threatening to move into the low *F2* range. Can't get any lower than that.
Must be an /r/.

[aɪ], IPA 304 + 319
This is rough. There's the transitions to deal with, and then you have to figure out where the formants are 'supposed' to
be. So abstracting away from the /r/, it looks like the F1 reaches a maximum (sometimes we call these 'turning points'
or 'target points'. I usually just refer to a local 'extremum') of about 800 Hz around 575 msec or so. So there's a low
target here somewhere. The F2 at that moment is relatively low. So at that moment, we have a backish lowish vowel.
Then the formants move. F1 drops. But it may do so just because it's approaching a closure. F2 rises and drops, so we
have to account for the rising at least (the dropping may either be a 'cue' or it may just be transition). So we've got a
lowish, backish vowel that moves (at least slightly) forward. So if this is a diphthong, it must be approaching a front
offglide. The F3 ends a little low, which combined with the low F4 and the diving F2 transition, suggests in general a
labial transition. So that's something else to abstract from. Either this is some kind of low vowel, or this is a low vowel
with a highish frontish offglide. Guess which.
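If you have formant measurements rather than just a picture, finding the extremum within the stretch you've marked off as the vowel is easy to sketch in Python. The function name and the sample track below are invented for illustration; only the peak (about 800 Hz at about 575 msec) comes from the reading above.

```python
def track_extremum(track, kind="max"):
    """Return the (time_ms, freq_hz) pair with the highest or lowest frequency.

    track: list of (time_ms, freq_hz) pairs covering the marked-off vowel.
    kind: "max" for a peak (e.g. F1 of a low vowel), "min" for a trough.
    """
    if not track:
        raise ValueError("empty formant track")
    if kind not in ("max", "min"):
        raise ValueError("kind must be 'max' or 'min'")
    chooser = max if kind == "max" else min
    return chooser(track, key=lambda point: point[1])


# Hypothetical F1 samples; only the 800 Hz peak near 575 msec is from the text.
f1_track = [(525, 600), (550, 720), (575, 800), (600, 760), (625, 650)]
print(track_extremum(f1_track, "max"))
```

The extremum is only as meaningful as the stretch you hand the function: include the /r/ transition and the answer changes.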
Lower-Case V
[v], IPA 129
So I've already given it away. There's clearly labial transitions moving into this bit of noise. And there is a very short bit
of oral fricative here. So it must be labial. This being English, it's labiodental.
Lower-Case H
[h], IPA 146
On the other hand it opens immediately into something a bit louder, voiceless, but with evidence of formant structure.
Classic [h] stuff.
Lower-Case O
[o], IPA 307
So when the voicing finally kicks on, we've got sort of a problem. I'm not entirely sure where F1 is. The F2 is that bit just
under 1000 Hz. There's no zero-ish looking thing below it, so the F1 can't be jammed up too close to the F2, but it's not
so low as to disappear into the voicing bar. So this isn't low and isn't high. Well, we have some mid vowels to play with.
Note the lack of evidence of an offglide. For once. Don't let anyone tell you that 'tense' mid vowels in English are
*always* anything. Whether they are or not is an empirical question.
Lower-Case M
[m], IPA 114
So starting around 875msec and going on for about 50 msec or so, there's this thing where the amplitude falls off. So
there's something here. There's a pole more or less where the F2 in the preceding [o] was, but the F1 sort of disappears.
There's a move to transition up for both formants, if that's what they are once the amplitude starts to kick back on, so
all we're really looking at is this short lower amplitude bit. Which is lower amplitude because it's a nasal. And the pole
is that F2 thing, down just below 1000 Hz, which is pretty typical of my bilabial nasals.
Ash
[æ], IPA 325
So again we have to abstract away from some odd transitions. The F1 doesn't reach its extremum until about 1000 msec,
but it's a high, so this must be a fairly low vowel. But the slope of the movement toward the extremum doesn't look
only transitional, so I wonder if there isn't another target floating around there. Then again, most English diphthongs
don't have *low* offglides, so I'm probably just dreaming. The F2 is fairly flat. Its maximum occurs sort of early, and
falls very slowly over the course of the vowel. There's evidence in the last 25 msec or so of downward pointing
transitions (labial again), so maybe that trend in F2 is just transitional. But maybe not. So we've got something mid-to-
low and centralish, and it's sort of long, so it might be moving to definitely low and very slightly back. Backer than
front, but not flat out back, especially compared with the nuclei of the earlier diphthongs. So this is probably a vaguely
front, but very low, vowel.
Lower-Case F
[f], IPA 128
So again we have something labial looking, this time with a burst. Which would make it a plosive of some kind. But the
voicing or noise or whatever it is at the bottom is a little loud and a little flat to go with a truly closed plosive. Then
again, you know my plosives are often sort of mushy. So as plosive as this looks, I'm going to suggest that it's worth
paying attention to the teeny weeny, almost imaginary, bit of noise up in the very high frequencies, and suggest this is
fricative. That and it makes a better word. This explains the noise, and the greater duration and frequency of the noise
at the bottom, compared to what I want to call noise or perseverative voicing or whatever it is in that thing earlier I
wanted to be a [k]. There's a lot of wishing going on in this spectrogram. But oh well. The bursty thing then isn't a
burst, it's a noisy transition between the [f] and the [t] closure. Which if you do it a few times, can get really sharp (noisy
and short?). And the fact that it's sharpest in the mid frequencies I attribute to the high-attenuating properties of the
labiality with the short front cavity formed by the closure at the alveolar ridge. TMSAISTI.
Lower-Case T
[tʰ], IPA 103 + 404
Well, here's a gap. Even ignoring the 'explanation' above, there's a gap, a release, and some fairly significant aspiration
noise. So there's a voiceless aspirated plosive here. The noise is broad band, and very loud in the very high frequencies.
Pretty classically alveolar. There's another concentration of energy in the mid frequencies, but I attribute that to the
following context....

[ɹ̩], IPA 151 + 431
Well, this is a more classic /r/. Nice mid-to-low F1, F2 sort of sitting there up against the low low F3. Flat as the prairies.
Syllabic /r/.
Lower-Case D
[d], IPA 104
At last, a gap that looks like a gap and is one. There's some serious voicing going on, but it's properly attenuated, as if it
were being produced in a closed space. Woo hoo. Okay, so about place. Nice sharp release. But that's about it. It looks a
little velar, depending on whether or not you think that's velar pinch in the offset. The onset transitions don't look
velar at all. The F4 is diving, who knows why. The F3 and F2 are basically just sitting there. So it's ambiguously velar.
Which turns out to mean ambiguously not velar. If it isn't velar, it must be alveolar, since there's no way those F3
transitions on either side look bilabial. Watch me be wrong about the next set of bilabial transitions that come up...
Script A
[ɑ], IPA 305
Well, the F1 extremum occurs sort of late in the thing I've marked off as the vowel (that mark is based on the weird
amplitude/pitch thing that happens about where I've marked it, which is as arbitrary as anything else). But this is fairly
flat F1 anyway, so oh well. It's a quite low vowel. The F2 extremum (a minimum this time) occurs more or less at the
same moment, so it's a good bet that it means something. So this is very low and very back. And r-colo(u)red, judging
by the F3, if that sort of thing matters to you.
Turned R
[ɹ], IPA 151
Another low F3 deal. Not much else to say. Except look at those F2 and F3 transitions into the following stop.
Lower-Case K
[k], IPA 109
Now there's some velar pinch for you. The noise here has less amplitude than for a typical alveolar closure, and it's
organized in bands as if it were far back and exciting forward cavities. And one of those resonances is in the region of
the F2/F3 pinch, which is a pretty good indication that this is a velar.
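Velar pinch is just F2 and F3 converging as they approach the closure, so with measured formants it can be checked roughly like this. This is a sketch under stated assumptions: the function names, the 500 Hz ceiling on the final gap, and all of the sample values are hypothetical, not read off this spectrogram.

```python
def f2_f3_gaps(f2_hz, f3_hz):
    """Return the F3-F2 gap at the first and last sample of a transition.

    f2_hz, f3_hz: equal-length lists, ordered from the vowel toward the closure.
    """
    if len(f2_hz) != len(f3_hz) or len(f2_hz) < 2:
        raise ValueError("need two equal-length tracks with at least 2 samples")
    return f3_hz[0] - f2_hz[0], f3_hz[-1] - f2_hz[-1]


def suggests_velar_pinch(f2_hz, f3_hz, max_final_gap_hz=500.0):
    """True if F2 and F3 converge and end up close together.

    max_final_gap_hz is an assumed threshold, chosen only for illustration.
    """
    first_gap, last_gap = f2_f3_gaps(f2_hz, f3_hz)
    return last_gap < first_gap and last_gap <= max_final_gap_hz


# Hypothetical transition: F2 rising and F3 falling toward the closure.
print(suggests_velar_pinch([1200, 1400, 1600], [2400, 2200, 2000]))
```

A neighbouring /r/ also drags F3 down toward F2, so a positive result here is a cue to weigh against the burst and the noise, not a verdict.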

"It's a work in progress."

This year, I'm trying to start from basics, so I'm going to try to take nothing for granted except the basic acoustics and
phonetics. If you don't know what a formant is, or what a plosive is, or what it means to be back, go read _A Course in
Phonetics_, or at the very least, see my "How To" page.
So here's the thing. As I start writing this, I'm at gate 211 at YYZ (Pearson, Toronto) waiting to board my flight home,
hopefully in about 15 minutes. It's Sunday, 30 January, and I'm at the end of what I think was a successful conference.
But it's also late-ish on my fifth straight day of being 'on' most of the day on about 3-5 hours of sleep a night. So since I
have to put this up Tuesday morning, I'm grabbing my spare moments here to do this. I'll have to continue on the
plane, and who knows when else tomorrow. So if the text of this is more disjointed than usual, you know why.
Small Capital I
[ɪ], IPA 319
Well, picking up after the unusual 150 msec of silence at the left edge (just ask anyone, 150 msec of silence from me is
an unusual event), the first visible thing here is a vowel. Regular pulsing begins at about 150 msec and continues for
about 75 msec. The energy is quite strong and goes all the way up the (visible portion of the) spectrum. So now we
check the formant structure. The F1 is the lowest formant, and it seems to occupy the bottom half of the first 1000 Hz.
So bandwidth being bandwidth, the centre of this formant is probably well below 500 Hz. Let's say 300-400 Hz or so. The
F2 starts up not quite around 2000 Hz and falls to about 1750Hz. So this ain't an upgliding diphthong. F3 doesn't tell us
much, sitting up around 2500 Hz and F4, if you care, is right where it's supposed to be around 3500. I've never looked so
Average General American Male in my life. Anyway, we have a very low F1, so we have a quite high vowel. We have a
quite high F2, so we have a distinctly front vowel. And it's short, and glides, if anything, backward rather than forward
(i.e. the F2 indicates retreat from front to not-so-front), characteristics of front lax vowels. So this is a high, front,
lax/short vowel.
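The F1-to-height and F2-to-backness reasoning can be written down as a rough Python sketch. The cutoffs here are assumptions, loosely matched to the ranges mentioned in these readings for one adult male voice (mid F1 around 500 Hz, centred F2 around 1500 Hz); they are not general values and the function names are made up for the example.

```python
def vowel_height(f1_hz):
    """Low F1 means a high vowel; high F1 means a low vowel (assumed cutoffs)."""
    if f1_hz < 400:
        return "high"
    if f1_hz < 650:
        return "mid"
    return "low"


def vowel_backness(f2_hz):
    """High F2 means front; low F2 means back and/or round (assumed cutoffs)."""
    if f2_hz > 1700:
        return "front"
    if f2_hz >= 1300:
        return "central"
    return "back"


def vowel_quality(f1_hz, f2_hz):
    return vowel_height(f1_hz), vowel_backness(f2_hz)


# F1 of 300-400 Hz and F2 falling from about 2000 to 1750 Hz, as read above.
print(vowel_quality(350, 1875))
```

Tense versus lax does not fall out of two numbers like these; that judgment still rests on duration and on which way the vowel glides.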
Lower-Case T
[t], IPA 103
So what we have here, starting about 225 msec and running to about 300 (and a bit further in the lower frequencies) is a
gap. An empty space in a spectrogram. A moment of relative silence. Which is usually associated with plosives. Now a
good, strong, domain-initial plosive would have a nice closure transient and a nice strong release burst. But this one has
neither, as far as I can tell. But that might just be because it's in a weak position, prosodically. So it's probably a plosive,
and probably in a coda. It's definitely voiceless, with no striations at the bottom in the 'voicing bar' we look for. So we
have a limited number of choices. (Quiz for beginners: What are the voiceless plosives in English?) So we need to look
for clues as to place. With plosives, those are usually in transitions, into or out of the closure, in the release
information, and in the top-down phonotactic knowledge that we all have such good command of. So if you look at the
transitions into the closure, F4 isn't doing much. F3 isn't doing much. F2 seems to be approaching about 1750 Hz. F1
doesn't seem to be doing much.
Brief excursus 1: There's "positive" evidence, i.e. this observation points to this conclusion, and then there's "negative"
evidence, i.e. the absence of anything that points to something different. Positive evidence is better--there are some
areas where negative evidence isn't even permissible. But this is spectrogram reading and we take what we can get.
So I'm going to guess /t/ here. It's voiceless, it's plosive, and it's consistent with an F2 transition target of around 1750
Hz, especially if you are fond of locus theory. If you're not, that isn't much evidence to go on, but there's no evidence of
velar pinch and no real evidence of labiality (in the form of lowering of all formants, or at least one other than the F2
which is ambiguous, as far as 'lowering' goes). It's also a good guess statistically and phonotactically, just because
coronals are so much more common than other plosives. The release characteristics, such as they are, are consistent
with this guess--more consistent with a guess of coronal than of anything else.
Brief excursus 2: At this point, it's useful to start looking at top-down information. So how many words can you think of
that start with /ɪt/? How many of those are likely to start an English sentence? Good.
Lower-Case S
[s], IPA 132
So there's a brief bit of friction here, about 50 msec long straddling the 300 msec mark. This could just be the release of
the preceding plosive, but the fact that it gets stronger and broader-band (involves more frequencies) at the right end
rather than at the release of the preceding plosive suggests that this is not 'just' the release of the preceding /t/. So if it
is something else, what is it? It has a broad band, i.e. the energy is distributed over a large and mostly continuous range
of frequencies, and it's strongest off the top of this spectrogram (so its peak must be above 4400 Hz, probably at least 6-
8 kHz). This is typical of sibilant [s].
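The 'strongest off the top' argument amounts to noticing that the loudest visible band is the highest one, so the true peak may lie above what the spectrogram shows. A minimal Python sketch, with a hypothetical function name and invented band energies (arbitrary units):

```python
def noise_peak(band_centres_hz, band_energies):
    """Return (centre of the strongest band, whether it is the top band).

    band_centres_hz must be in ascending order. If the strongest band is the
    top one, the real spectral peak may lie above the visible range.
    """
    if not band_energies or len(band_centres_hz) != len(band_energies):
        raise ValueError("need matching, non-empty band lists")
    index = max(range(len(band_energies)), key=lambda i: band_energies[i])
    return band_centres_hz[index], index == len(band_energies) - 1


# Invented energies for four bands below the 4400 Hz top of the display.
print(noise_peak([550, 1650, 2750, 3850], [1, 2, 4, 9]))
```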
Schwa
[ə], IPA 322
Well, we've got here a vowel. If you notice, the F1 is just a little higher than the previous vowel, so this must be vaguely
mid. The F2 is sort of all transition, so it doesn't seem to show evidence of a 'target' of its own. That's a pretty good
indicator of a reduced vowel, i.e. something which in English you'd just transcribe with a schwa and then move on.
Which is what I'm going to do.
Lower Case W
[w], IPA 170
Well, the F2 is the real clue here. It's diving down to about 750 Hz or so, indicating something very round and/or very
back. The reduction in energy in the frequencies above 1000 indicates a degree of stricture greater than for a vowel, but
since we've got something apparently sonorant and fully voiced, it must be an approximant. A nasal would have the
reduction in energy, but it would affect the low frequencies as well, to a greater degree than here. So we've got a
backish roundish approximant. The F3 is being drawn down by the coming transition, so it's not a good source of
information, but the F4 is also lowered, again suggesting rounding.

[ɹ̩], IPA 151 + 431
My favo(u)rite sound. Just look at that F3. What else needs to be said? You never see an F3 that low except for an
English-type approximant /r/. The syllabicity you have to derive from the fact that you want to call the two flanking
sounds 'consonants' so that this has to be the vowel.
Top-down alert: At this point, we can start hypothesizing. "itswer" is an unlikely word, but 'it' is a very likely beginning
to a sentence. If 'it' is the subject of the sentence, we need to look for a third-person verb. If 's' is that, then we're
looking for some kind of locative, predicate nominal, predicate adjective, or something like that...
Lower-Case K
[k], IPA 109
Another gap, so probably another plosive. And voiceless. It has a double burst approaching 700 msec, which is most
characteristic of velars. Velars also exhibit 'pinching' of the F2 and F3 frequencies, which we also see, although between
the lowered F3 of the preceding sound and the raised F2 of the following sound, the apparent approximation of the F2
and F3 frequencies may not be the most useful cue here. Now, note that I said voiceless, but not *aspirated*. (Quiz for
beginners: What is the significance of a voiceless plosive being unaspirated in intervocalic position?)
Barred I
[ɨ], IPA 317
Another teeny short vowel that's mostly transition. This one is fronter (higher F2) than the previous one, so following
Keating et al (1994), I transcribe it as barred-i.
Lower-Case N
[n], IPA 116
So this is what I meant above by the lowered amplitude applying to all the frequencies. Somewhere around 750 msec,
two things happen--the amplitude drops off at all frequencies (with the sudden appearance of zeroes in several places)
and the formants flatten out completely. So this is a nasal. For nasals, you want to look at the frequency of the first pole
above the F1. Which would seem to be about 1500 Hz, which for my voice is consistent with the alveolar nasal. Bilabials
have that pole closer to 1000 Hz, and velars usually evidence some degree of velar pinch, which if you look at the
barred-i is not at all in evidence.
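The pole-frequency rule of thumb for nasals can be sketched in Python as a nearest-reference lookup. The two reference values (about 1000 Hz for bilabial, about 1500 Hz for alveolar) are the ones quoted above for this one voice; the function name and the 250 Hz tolerance are assumptions, and velars are left out because the cue described for them is pinch in the neighbouring vowel, not a pole frequency.

```python
def nasal_place_from_pole(pole_hz, tolerance_hz=250.0):
    """Guess nasal place from the first pole above F1.

    Reference values are speaker-specific; tolerance_hz is an assumed margin.
    """
    references = {"bilabial": 1000.0, "alveolar": 1500.0}
    place, reference = min(references.items(),
                           key=lambda item: abs(item[1] - pole_hz))
    if abs(reference - pole_hz) > tolerance_hz:
        return "unclear"
    return place


print(nasal_place_from_pole(1500))
print(nasal_place_from_pole(950))
```

For any other speaker the reference values would have to be measured first.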
Top-down alert: If the previous syllable is 'work', then this syllable could well be 'ing' or rather "in'". But what are the
odds? If this is so, then the phrase 'it's a working...' is plausible, but the next thing is likely to be a noun, modified by
'working'.

[pʰ], IPA 101 + 404
So we've got another short gap here. The only evidence of anything I can see is in the preceding nasal, which in terms
of transitions makes no sense whatsoever.
Turned R
[ɹ], IPA 151
There's another one of those low F3 things, this time on the periphery of a vowel rather than being one. Note the slight
attenuation of the higher frequencies characteristic of approximants as opposed to vowels.
Script A
[ɑ], IPA 305
So we've got a vowel. Abstracting away from the transitions, we want to look at the stretch between about 1050 and
1100 msec, where the F1 and F2 are 'steady', and the F3 is as stable as it gets. And during that stretch, the F1 is very
high, indicating a very low vowel. The F2 is very low, which as we said before indicates backness, rounding or both. So
we're looking for a vowel which is about as far back as we can go and as low as we can go. So we're looking in the
vicinity of the Cardinal 5.
Lower-Case K
[k], IPA 109
So the question is whether the falling F3 transition leading to this is another /r/, or if it's just pinch. I'd say it was just
transition, but I'm not sure--it seems to me that the F3 wouldn't bother to rise and be steady at all if there were
a flanking /r/. So since the gap here is followed by a nice double burst, and probably another velar, I'd say that this is all
consistent with velar pinch.
Turned R
[ɹ], IPA 151
I actually didn't intend this to be an r-ful spectrogram, but then I am the /r/ guy. Mostly I wanted to throw the
Canadians in the audience by saying pr[ɑ]gress instead of pr[o]gress.
Schwa
[ə], IPA 322
This looks to me like a schwa, although that's not what I *think* the vowel is phonemically. But whatever. The formants
seem to be "about" 500, 1500, 2500 Hz (this last at least at the end as the voicing dies out).
Lower-Case S
[s], IPA 132
So here's another fricative, quite weak, given how very long it is, but I take that to be a function of its utterance-final
position. It's broadband, and it's strongest in the highest frequencies (and best organized up there too--the low
frequencies kind of come and go, but the upper frequencies are always there). So this is another sibilant [s].
So it turns out that that middle word wasn't 'working', but two words 'work in'. Always go back and reconfirm and
retest previous hypotheses.

"Could you ... take a later flight?"
Okay, this is a slightly retooled version of something I actually said, the way I said it. When I realized it was a prosodic
nightmare, I decided to use it. Ha. In addition to the segmental transcription, I've included my best guess at a ToBI-style
transcription of the pitch track. I've taken some liberties with the ToBI conventions. I've sort of conflated the Break
Index Tier and the Tone Tier into a single line. Instead of an Orthographic Tier, I've just aligned the break indices to my
segmental transcription. I've skipped the additional 'miscellaneous' tier, but it might be worth having one, if only to
mention the absurd lengthening at the end of the first 'phrase'. If that's what's going on. If anyone out there is ToBI
savvy, I wouldn't mind discussing whether my interpretation is plausible. I've appended a discussion of the prosody at
the end of this page.
But first, the segmental stuff
Lower-case K + Right Superscript H, IPA 109 + 404

SIL [kH], Unicode [kʰ]
First thing to notice is the looooooong aspiration. Starting from about 125 msec stretching out to about 225 msec, that's
a lot of aspiration. Not outrageous, but a lot. The initial stretch where it's heavier I interpret as release. I'd be happier if
this were a double-burst, and not a long slushy release, but there you go. The F3-resonance in the burst is at an
ambiguous frequency, not to mention unusually weak. The low F2-resonance is troubling (see the vowel to follow) but
the concentration of energy in F2, rather than F4 or above, and the fact that that energy doesn't get stronger in the
lower frequencies (as we might expect for a bilabial) suggest velar over anything else.
Barred I, IPA 317
SIL [ö], Unicode [ɨ]
Quite a low F1, indicating a quite high vowel. The F2 is a little low, but I guarantee, because I made sure I articulated
this vowel this way, that it's not at all round. This might be better transcribed as 'turned m', (IPA 316 high back
unround, cardinal 16, [ĩ] or [ɯ] if you can see the characters), but I'm not entirely convinced this is as back as it might
be. Hence Barred I.

Well, there's not much evidence of anything going on here. There's a nice little voiced gap here, a little long to be a flap.
But the transitions are ambiguous. The F1 and F3 don't seem to be doing anything. The F2 is clearly transitioning
between the lower position for the previous vowel and the very high position in the following segment. Interestingly it
seems to be interrupted by this stop at 1700 Hz, near the (apparent) locus of alveolar transitions, but that may just be
coincidence. In the absence of anything to the contrary, guess alveolar.
Lower-case J, IPA 153

SIL [j], Unicode [j]
Round about 425 or 450 msec, the F2 reaches a maximum well above 2000 Hz. Which means this is about as front as it
gets. The extremum here is very short and starts rapidly to decline. Now, if you've been following these things for a
while, you know a) my back round vowels are never really back or very round (and certainly never both at the same
time), b) I've merged OE /y/ (ME and ModE /ju/) and /u/ everywhere except after coronals, and c) the reflex of /u/ after
coronals has a front on-glide. So you may be tempted to ignore this. But that extremum is just too high. Also, there's
a moment just after the extremum where the F1 apparently jumps just a little (although the harmonics are dancing
around so it's a little hard to tell), so something is 'going on' here. So I transcribe a front glide and get on with things.
Barred I + Barred U, IPA 317 + 318

SIL [öŽ], Unicode [ɨʉ]
Okay, I'm not 100% sure what's going on here. This is absurdly long, so who knows what's normal and what's weird
about it due to the length. Starting at the point (about 500 msec) where the F1 changes, there's two stretches that
seem sort of flat. The first is from about 625 to about 700 msec, and the second is about 800 to 900 msec. So I transcribe
a diphthong with changing roundness, in the high vowel range, and mostly central.
Lower-case T + Right Superscript H, IPA 103 + 404

SIL [tH], Unicode [tʰ]
There's a nice little 50 msec or so gap here, around 1000 msec, followed by some pretty serious aspiration. The noise is
pretty spread out top to bottom, but stronger at the top. It looks [s]-like, which is a pretty good clue that this is alveolar.
(Why would the aspiration following [t] resemble [s]? Discuss.) The transitions are not particularly helpful, and once
again, the semi-natural course of F2 is interrupted at exactly the right moment. On the other hand, the F2 transition
into the stop is a different angle than the transition out. Maybe there's something to this locus stuff after all.
Lower-case E + Small Capital I, IPA 302 + 319

SIL [eI], Unicode [eɪ]
Moving F1 and moving F2, I'd say this is diphthongal, which isn't always typical of my speech. F1 moves from the lower-
mid range (indicating a higher-mid vowel) to even lower (higher vowel quality). The F2 is high, moving higher. Sounds
like the standard description of NAE /e/ to me.

SIL [kH], Unicode [kʰ]
A front velar. Note the F2-F3 pinch, even though it's mostly F2 action. I'm beginning to think my velar stops just never
fully stop. There being no velar fricatives in English, I call this a stop. There's a burst, just before 1300 msec, or maybe
it's just a 'clunk'. But if you trace the F2 and F3 transitions through the aspiration, or whatever it is, they seem to pinch
just about 2000 Hz. This is at best weakly aspirated, but I read the clunk as a release. The VOT then is about 33 msec,
which is a little short for a velar, but is a little long for an unaspirated stop. If it is aspiration, this stop can't be syllable
initial--the VOT would be a lot longer. On the other hand, there isn't quite enough voicing and the aspiration is a little
bit long to be a /g/ syllable initially. The point is it's velar, probably a stop.
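For anyone who wants the VOT arithmetic spelled out, here is a minimal Python sketch. It only subtracts the release time from the voicing onset and bins the result. The function names and both cutoffs (30 and 50 msec) are illustrative assumptions, not measured norms, and the timings are the approximate readings quoted above.

```python
def vot_ms(release_ms, voicing_onset_ms):
    """Voice onset time: lag from stop release to the first glottal pulse."""
    return voicing_onset_ms - release_ms


def lag_category(vot, short_lag_max=30.0, long_lag_min=50.0):
    """Bin a VOT value. Both cutoffs are illustrative assumptions."""
    if vot < 0:
        return "prevoiced"
    if vot <= short_lag_max:
        return "short lag"
    if vot >= long_lag_min:
        return "long lag"
    return "in between"


# Initial [kʰ] of 'could': release near 125 msec, voicing near 225 msec.
first_k = vot_ms(125, 225)
print(first_k, lag_category(first_k))

# [kʰ] of 'take': VOT read off the spectrogram as about 33 msec.
print(33, lag_category(33))
```

With these made-up cutoffs the first stop lands squarely in the long-lag bin, and the second falls between the bins, which matches the "a little short for a velar, a little long for an unaspirated stop" hedge.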
Schwa, IPA 322
SIL [Ŧ], Unicode [ə]
There's 25-35 msec of vowel just after 1300 msec. Note the transitions, so we're not quite at the following /l/ target yet.
Also the amplitude is greater, more vowel-like. But it's so short, I transcribed it as reduced.

SIL [lō], Unicode [ɫ]
Someone who Shall Remain Nameless suggested that this couldn't be a dark /l/ because it was syllable initial (or at
least intervocalic, and if the preceding vowel is unstressed then, given onsets first, this is probably pretty initial). To
which I replied 'look at that F2. Looks plenty dark to me.' Which should be a lesson to us. Phonetic transcription (or
acoustic transcription, whatever I'm doing here) is about recording the sounds as they are produced, not as we would
like them to be, and definitely not as we were taught they were supposed to be. The fact is that in NAE, the rule doesn't
distribute a light /l/ and a dark /l/, but rather a dark /l/ and a darker /l/. Where we understand 'dark' to mean
'velarized', and we take 'velarization' to be a lowering of F2. There's no way to treat the F2 of any /l/ I've ever produced
(in English) as anything but low. Particularly this one, which is otherwise admirably initial. It's nice and short, the F3 is
a little high, the amplitude is down.

Okay, I've used the same transcription, and this is arguable. Part of the problem is the starting and ending points are
messed up by surrounding segments. The F2 starts way low due to the F2 of the preceding /l/, and the falling F3
(anticipating the later /r/) is screwing with the F2 and F3 in this vowel. But the F1 is definitely mid-ish, and the F2 is
going from not front to definitely front. How many mid vowels have front offglides? Well, two. But this one is /e/.
Fish-hook R, IPA 124

SIL [R], Unicode [ɾ]
Flap. Tiny little gap. Barely a stop. Doesn't interrupt the 'flow' of the resonances at all. Voiced. Flap.

SIL [Ļ`], Unicode [ɹ]̩
Low low F3. Must be an /r/. No evidence of any vowel on either side, so probably syllabic.

This is a function of my noisy office, my hand-held microphone (which is really a pretty good microphone, but it gets a
lot of abuse banging around my office), and my noisy /f/s. This is quite long for a fricative, and no slouch in the
amplitude department. But notice the noise. It's not well-organized. It's distributed more or less evenly through the
spectrum. It's not much darker at the higher amplitudes. It doesn't die out below 1500 Hz or so. It just doesn't have the
profile of a sibilant. It doesn't have centers/resonances, and certainly none contiguous with the lower formants. It
doesn't have the profile of an /h/. So, as fricatives go in English, there aren't a lot of choices. The unfiltered/unshaped
nature suggests labiodental or interdental (or even bilabial), even if elimination didn't lead us to the incisors one way
or another. So, since this is voiceless, it really has to be [f] or Theta. Sound it out.
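The elimination argument above can be written down as a tiny decision sketch in Python. The function name and the three yes/no inputs are hypothetical simplifications of what you would judge by eye from the spectrogram. It covers only the English fricatives discussed here and is a sketch of the reasoning, not a classifier.

```python
def fricative_candidates(voiced, sibilant_profile, formant_like_resonances):
    """Narrow down an English fricative by elimination.

    sibilant_profile: noise is top-heavy and dies out in the low frequencies.
    formant_like_resonances: noise has bands contiguous with the neighbouring
    vowel's formants, as /h/ does.
    """
    if sibilant_profile:
        return ["z", "ʒ"] if voiced else ["s", "ʃ"]
    if formant_like_resonances:
        return ["h"]
    # Flat, unshaped noise: the source is at or in front of the incisors.
    return ["v", "ð"] if voiced else ["f", "θ"]


# This fricative: voiceless, not sibilant-shaped, no /h/-like resonances.
print(fricative_candidates(False, False, False))
```

The last step, choosing between the two survivors, is still the "sound it out" part.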

SIL [lō], Unicode [ɫ]
There's this moment where the frication ends, the voicing starts, but there's no energy in the higher frequencies. Okay,
it's two pulses, but I'm taking it. Given the F2, it has to be fairly back, and /r/ isn't indicated (no indication of a low F3).
So this is /w/ or /l/. Again, sound it out.
Lower-case A + Small Capital I, IPA 304 + 319

SIL [aI], Unicode [aɪ]
Once you abstract away the formants of the dark /l/, we've got high F1 that falls, and a lowish F2 that raises. So low-to-
high vowel, moving backish to front. /ai/.
Lower-case T + Right Superscript H, IPA 103 + 404

SIL [tH], Unicode [tʰ]
Okay there's no useful information in the transitions into this gap--the F3 is flat. The F2 looks like it's rising toward the
F3, and it is suggestive of velar pinch, but then again the vowel's F2 is moving in that direction anyway. So it's almost
definitely not bilabial, but the transitions approaching 2000 msec aren't otherwise helpful. But the release/aspiration
noise is a dead give away. It's top-heavy, i.e. darker in the higher frequencies, and, well, dark. Looks a lot like /s/, which
is a good approximation of voiceless (/h/) noise filtered through an alveolar release. So the release must be alveolar.
And this doesn't look like one of those /kt/ sequences that start out /k/ and end up /t/. So I think we can just leave it at
/t/.
"could"
Break Index: 1
I've included a H* on the stressed 'could', since the pitch track indicates (relatively) high pitch here. This might be a
case of those rare %H initial boundary tones, since this utterance could just as easily have started low. I didn't affricate
the /...d# #j.../ sequence here, so I decided it didn't merit a 0 BI.
"you"
Break Index: 3
The absurd lengthening of this word I attribute to it being, somewhat accidentally, at the end of a phrase. In the 'real'
utterance, this was me trying to decide whether or not to suggest an option to something else that was going on. Even
though this is a separate word (i.e. no affrication) it doesn't seem to exhibit any evidence of a separate pitch accent
of its own. I assume the slope is just interpolation between the H* on 'could' and the L- edge tone. I think these are still
officially called 'phrase accents'. I just call them the edge-tones (or occasionally 'minus-tones'), as opposed to the star
tones (word level pitch accents) and boundary tones (marking the ends of Intonational Phrases, the 'clause'- or
'utterance'-level constituents).
It's been pointed out to me (thanks, Kevin) that this isn't so much a real phrase ending as it is a filled pause. The whole
flat-but-declining contour here then could simply be just unmarked mid pitch (note the high isn't as high as the later
highs and the lows get lower). Since there's obviously no contrast with other kinds of relatively unmarked intonational
stuff (e.g. low throughout) this is probably not a bad interpretation. But less fun. ;-)
"take"
Break Index: 1
As the speaker, I feel like there's a high of some kind on 'take', but as you can see, the pitch peak is displaced onto the
following syllable. And if that weren't bad enough, there's a distinct low on the 'take' syllable (it's higher than the edge
L- preceding it, suggesting baseline reset and justifying my 3BI at 'you', if the lengthening weren't enough). Since the 'a'
definitely isn't tone-bearing in the usual sense, I've decided this is a 'scooped' L*+H.
"a"
Break Index: 0
The 'a' is marked with a 0 BI, since I regard it as proclitic on the next word. It doesn't get a pitch accent (star tone) of its
own, and I definitely don't think it deserves a H* of its own, which would be the most obvious choice.
"later"
Break Index: 1
I think the current ToBI conventions call for some kind of pitch accent on every stressed lexical word unless it's
deaccented in some way. So I've marked 'later' a L* on its stressed syllable. I almost didn't, except for my interpretation
of the aforesaid convention, but I think this is the right decision independently. If there weren't a separate pitch accent
on 'later', there'd be no reason for the tone to drop so abruptly here, instead of just interpolating in a more or less
straight line to the L* on 'flight', as in 'you'.
"flight"
Break Index: 4
This gets a 4BI due to being at the end of an utterance. 4s indicate the end of an Intonational Phrase, which as I said
above are (generally) clause- or utterance-level, or have the feeling thereof. Anyway, 4s are interesting, because a) being
both phrase and utterance final, the last syllable before a 4 lengthens a lot, b) being both the end of an Intermediate
Phrase and an Intonational Phrase, they get both edge-tones and boundary-tones. And this 4 marks the end of a one-
syllable lexical word, which gets a * tone of its own. So I chose a L*L- H% sequence. There's clearly a L* of some kind on
flight, which is only to be expected in a yes-no question. And it being a yes-no question, it gets some kind of H%
boundary tone. But what kind of edge tone should go here? I've marked a L-, since the L* seems a little long. But that
means displacing the L- to the left, sort of next to the L*, instead of next to the boundary tone where it usually goes.
The other choice, L* H-H%, is a possibility, but then we'd have to say the longish L* is due to final lengthening, and I'm
just not up enough on intonation to know if the targets get long under lengthening, just like segments, or not. If
anyone who knows the E_ToBI conventions better than me wants to comment, correct, or discuss, please e-mail me.
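To see the whole annotation in one place, here is a minimal Python sketch that stores the tone labels and break indices given above, one record per word. The Word class, the field names, and the phrases helper are hypothetical conveniences for illustration, not part of any ToBI tool. The labels themselves are the ones argued for in this section.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    tones: str        # tone labels on this word ("" if none)
    break_index: int  # break index following this word


utterance = [
    Word("could", "H*", 1),
    Word("you", "L-", 3),
    Word("take", "L*+H", 1),
    Word("a", "", 0),
    Word("later", "L*", 1),
    Word("flight", "L* L- H%", 4),
]


def phrases(words, min_break=3):
    """Group words into chunks that end at a break index of min_break or more."""
    chunk, out = [], []
    for w in words:
        chunk.append(w.text)
        if w.break_index >= min_break:
            out.append(chunk)
            chunk = []
    if chunk:
        out.append(chunk)
    return out


for w in utterance:
    print(f"{w.text:<7}{w.tones:<10}{w.break_index}")
print(phrases(utterance))
```

Splitting at break indices of 3 or more gives 'could you' and 'take a later flight', which is the two-phrase reading discussed above. If you prefer the filled-pause interpretation of 'you', the 3 is the record to change.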

"The galleys are due back next week."
Eth, IPA 131

[D], [ð]
Okay, there's no good way to tell this one. It's short to be a nasal, but nasals tend to get short initially. They also tend to
not have much energy in them, and this doesn't even have a proper zero in it. So rule out Nasal. Could be some kind of
weak aspiration to a stop. Could just be a really short fricative. For now, just notice something is going on between 60 and
80 msec that needs to be explained. Posit some kind of consonant, and get on to the vowel. And promise yourself to
come back to this later when you have a better idea what this utterance is likely to be.
Barred I, IPA 317

[ö], [ɨ]
Well, I misled you. There's not a lot going on in this vowel either. It's just high of mid (judging from the F1) and just
front of central (judging from the F2). It also seems to be moving frontwards, although you should probably notice that
what it's moving to is a collision with F3. That should tell you something, at least about the following stop. But anyway,
this is short. It's vaguely high and the F2 is closer to the F3 than the F1. Invoke Keating et al (1994) and call it barred-i.
Lower-case G, IPA 110

[g], [g]
Well, maybe it's me, maybe it's my microphone, maybe it's my noisy office, but I can't seem to make a good, no-
nonsense stop anymore. This span is voiced, there's a sharp burst, and so, knowing how these go, I can convince myself
that the low-amplitude noisy stuff in the middle of all this is worth ignoring. I really suppose I should go in for an
aerodynamic screening for a dyspraxia or dystonia or something, because something is leaking. But ignoring that, this
looks like a good candidate for a stop. The F2-F3 pinch in the preceding vowel suggests velar, but the height of the pinch
is boggling. Can only be a 'front velar'. Or a back palatal, or something like that. Don't get me started. Note the obvious
VOT in spite of the stop being voiced for most of its duration. If I weren't so tired as I write this, I'd come up with
something useful to say about aerodynamics and the duty cycle and pressure and such, but whatever.
Ash, IPA 325
[Q], [æ]
This is the ugliest vowel I've ever seen. It seems good and long, but you'd think something so long would have a hint of a
steady state somewhere. Unless this is a diphthong. Which it is in a lot of dialects, and is sort of here, depending on
your definition. Okay, here's what's noticeable. Ignoring the first 50 msec or so as transition, we've got something with
a high F1 (or at least something moving to a high F1). The preceding velar can only be front, and this being English that
suggests that this vowel should also be front. This is consistent with the starting position of the F2, but the F2 is diving
so fast-and-furious-ly that it's hard to tell whether its initial frontness, its late backness, or its transitional mid-ness
should be taken seriously as a feature. There being neither a steady state nor a portion where the thing even slows down
for a bit, take some point in the durational center and use that. I think the center of the formant at that moment is
slightly high of the mid-range for F2, so let's call it front. Lowish and vaguely front. That only leaves a couple of
possibilities.
Lower-case L + Mid-Tilde, IPA 155 + 428

[lò], [l ̴]
I wish I'd asked Penny to record the Unicode number for this as a composed symbol, because concatenating the plain
"l" with the diacritic is just ugly. (That's Penny Gilbert, who kindly produced the table of IPA, Unicode, and SIL
ALT+keystrokes that I use to look up my characters. Yay Penny.) The good news is that the falling F2 into this and the
rising F2 out of it point us at a nice low F2 target that coincides (not-so-coincidentally) with a nice steady F3 and F4, at a
moment when the total amplitude seems to be going down. There's a target here folks! Okay, nobody can quite tell
what's happening in the F1. The F2 is low low low suggesting back back back. The F3 (and F4) are high, suggesting ...
what? Well, /l/. Dark /l/, cuz it's English, it's me, and it's got that low low low F2 suggesting backness. Velarization.
Darkness.
Lower-case I, IPA 301

[i], [i]
Ugly ugly ugly. Okay, pick the moment where the least stuff seems to be happening. There are huge, herkin' transitions
from the offset of the preceding /l/, at about 425 msec, to a point where the F2 reaches a maximum and the F3 reaches a
minimum, somewhere between 500 and 550 msec. Then there are transitions again. So why do you suppose there's this
moment where the F2 and F3 seem to come together, only to be whisked apart again? There must be some reason, so
let's assume there's some kind of target to be reached here, in spite of the forces acting to pull the formants in other
directions. So the F1 at this moment looks a little low, suggesting a high vowel. The F2 is frickin' high, above 2000 Hz,
which is just plain high for me. And the F3... well, the F3 is a little low, but not really so low as to suggest anything like /r/.
So, low F1, high F2. Must be high and front. Very front. Fronter than front.
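The "low F1 means high, high F2 means front" rule of thumb used throughout can be sketched in a few lines of Python. Every cutoff below is an illustrative guess for one adult voice, loosely based on the remarks above that an F1 peak near 500 Hz reads as mid and an F2 above 2000 Hz is very high. The function name and the example readings are hypothetical.

```python
def describe_vowel(f1_hz, f2_hz):
    """Rough height and backness from F1 and F2. All cutoffs are illustrative guesses."""
    if f1_hz < 400:
        height = "high"
    elif f1_hz < 650:
        height = "mid"
    else:
        height = "low"

    # A low F2 can come from backing, rounding, or both; this sketch lumps them.
    if f2_hz > 1800:
        backness = "front"
    elif f2_hz > 1200:
        backness = "central"
    else:
        backness = "back"
    return height, backness


# Hypothetical readings for a vowel with a lowish F1 and an F2 above 2000 Hz.
print(describe_vowel(300, 2100))
```

Real vowels overlap, and the cutoffs shift with the speaker, so treat the labels as a first guess to be checked against transitions and context.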

[z], [z]
There's a hint of voicing here, so this must be voiced. It's noisy, what there is of it, so it's probably a fricative of some
kind. It's got some high frequency energy, certainly higher off the top of the spectrogram (we only see up to about 4200
Hz) than anywhere in the lower frequencies. Sounds /s/-like to me. But voiced.
Lower-case A + Rhoticity Sign, IPA 304 + 419

[aÕ], [a˞]
Yer gonna kill me here, but I cheated. I excised this vowel and listened to it again and again, and transcribed it the way I
heard it. But it looks nothing like it should. Okay, I heard something that, while not strictly speaking low, was sort of
low and front. Not really ash-like, not quite as front. But not strictly speaking low. It wasn't the vowel I usually
transcribe as Turned V, which I don't do strictly IPA. I use it for the reflex of short U or O or whatever it is in 'hum' and
'hut'. It wasn't that vowel. This vowel, excised, sounded a lot like that Atlantic-Canada vowel I hear on 22 Minutes all
the time--the one Rick Mercer and Mary Walsh have. So that's how I transcribed it. But the acoustics suggest a high
vowel, if anything. Certainly at least mid. Definitely not low. The F2 tells us precious little, given what the F3 is doing,
but it still looks a little front. And the lowering F3 is the rhoticity. But the vowel I seem to be describing is Small-Cap I,
or one of those central vowels I never remember the symbols for. But it just doesn't sound like that. It's a vowel. It's /r/-
colo(u)red. Moving on.
Turned R, IPA 151

[¨], [ɹ]
Well, at least the /r/-colo(u)ring in the preceding vowel is come by honestly. I'd locate the 'moment' of this /r/ at the
bottom of that F3 dip, between 725 and 750 msecs. It's worth noting that the F2 appears to be moving, slightly, at that
same moment, and its low extremum is back before 700 msec. I'd say there was a vowel moment (around the F2
extremum) and an /r/ moment (around the F3 extremum), and so there's two segments here, or at least two different
sets of targets separated in time. It just so happens that the F2 target is specified in one and the F3 is specified in the
other, leaving the other to vary. My take on segmentalism.

[d], [d]
Wonder of wonders, a stop that actually looks like a gap. If you don't pay too much attention to that stuff going on
around F2.... Anyhoo, the sharp transient/burst thing suggests some pressure is being built up, so this must be a stop. If
you went nasal, good spotting, but then you'd have to deal with the burst. If you went prenasalized stop, well, okay, but
I'd expect some nasalization on the vowel preceding, and, well, just more resonance. Based on experience with my
voice. F3 transitions point up (that is, up into it, and down out of it), F2 transitions are high of mid, so this is probably
alveolar. Nice strong voicing bar for a good stretch of the gap in the higher frequencies, so voiced.
Barred I, IPA 317

[ö], [ɨ]
What would a spectrogram read be without a little controversy? This does, at least at the beginning, look a little like the
previous barred I. But it's longer, and it's higher in pitch (note the closeness of the striations) so it's probably stressed.
English doesn't have barred-i in stressed positions. Hmm. But look at that F2. Starts off drifting slightly downward, and
picks up speed as it goes. Hmm. Okay, let's be systematic. This is definitely a high vowel. It's not quite as high as the /i/
from before, but it's still high. The F2 starts mostly front and then takes a nosedive. What makes F2 do that? Well,
rounding and/or backing, of course. So we've got a high vowel, moving from central and slightly front to either backer
or rounder. Hmm. Well, high and central is barred-i. But how do we account for the F2 change? We could be backing the
vowel, but front-to-back high diphthongs are hard to come by in English. Maybe it's just rounding. But rounding
doesn't do that to an F2. What's going on? Okay, the answer is that things are conspiring here to make this harder. It
comes down to this. First, this is my high vowel that isn't /i/. It's usually regarded as /u/, but being a West Coast kind of
individual, it's really more central, and it's usually completely unrounded. But, like most of my American colleagues, I
don't maintain a difference between /u/ and /ju/ (i.e. 'do' vs. 'due'). Second, the preceding consonant is coronal, which
for me selects something closer to [ju] as an allophone. Third, the following consonant's place is also having its effect
on the F2 transition. Take a quick look at the transitions in the vowel on the other side of the following consonant and
identify the consonant's place. Right, bilabial. So there's just a little labialization at the end of the vowel, which is doing
this weird thing to F2. Just for my benefit, please notice that the closest thing to a steady state in this vowel is the
beginning. And please notice for my benefit that the durational center of this vowel still has an F2 consistent with a
frontish vowel. Hence my argument that this thing is just frickin' front. Get over it.
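The remark above about pitch and the closeness of the striations is just the period-to-frequency relation: each vertical striation is one glottal pulse, so tighter spacing means a higher fundamental. A minimal Python sketch, with a hypothetical function name and made-up pulse times rather than measurements from this spectrogram:

```python
def f0_from_pulses(pulse_times_ms):
    """Mean fundamental frequency (Hz) from successive glottal pulse times (msec)."""
    gaps = [b - a for a, b in zip(pulse_times_ms, pulse_times_ms[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return 1000.0 / mean_gap


wide = [0, 10, 20, 30]    # hypothetical striations 10 msec apart
narrow = [0, 8, 16, 24]   # hypothetical striations 8 msec apart

print(f0_from_pulses(wide))    # 100.0
print(f0_from_pulses(narrow))  # 125.0
```

So counting striations over a known stretch of time is a cheap way to compare the pitch of two vowels without a pitch track.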
Lower-case B, IPA 102

[b], [b]
Well, we've already talked about this one. The transitions, even the incoming F3, all the outgoing ones, and well maybe
not the incoming F1, but there you go, point to bilabial. Particularly those transitions into the following vowel. You
can't get much more classic than that. Wow. Nice little gap, some decent voicing (maybe not quite enough to warrant
the voicing label, but this is definitely unaspirated, even if there is a clear lag between the release transient and the first
clear glottal pulse of the vowel).
Ash, IPA 325

[Q], [æ]
This looks like a diphthong, which isn't how I want it to look. But there it is. There's a sharp pitch change about halfway
through this vowel, and there's a quality change there, sort of, as well. The first part, from 1025 to about 1100 msec, has
pretty steady formants, although the F1 looks definitely like it's transitioning to the second part. But taking the middle
of this bit, there's the F1 of a moderately low vowel, getting lower; an F2 just a hair front of central, and an F3 that isn't
doing much. The second part has the F1 of something that's definitely low, the F2 of something central inching
backward, and the F3 is falling. Weird. Okay so low or mid-to-low, mid or front-of-mid, followed by definitely low,
possibly central. Sounds sort of like lower-case a or Ash, followed by schwa. Which is one of the available diphthong
versions of /ae/, so we'd be right. What this does to my targets-in-time theory is open to discussion. Discuss.

[k], [k]
Well, note the transitions moving into this gap. Lowering F3, and rising F2. Classic velar pinch. There's a little
glottalization or breathiness moving into this stop, I'm sure because there's a major prosodic boundary following this
word (I mean, this word is at the right edge of a fairly major prosodic constituent). The burst and aspiration, if that's
what you want to call it, don't provide much useful information. The lack of energy in the low frequencies in the
aspiration might be misleading since it makes the upper frequencies look stronger. But it still doesn't look /s/ like,
which is how it would look if this were a /t/. The burst itself is kind of long looking, but it doesn't look double, which
would have been nice as a clue.
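The 'velar pinch' cue (F2 rising while F3 falls, so the two converge going into the closure) can be sketched as a simple check in Python. The function name, the 200 Hz threshold, and the formant readings are all hypothetical illustrations, not measurements from this spectrogram.

```python
def looks_like_velar_pinch(f2_hz, f3_hz, min_closing_hz=200):
    """True if F2 rises, F3 falls, and the F3-F2 gap narrows by min_closing_hz or more.

    f2_hz and f3_hz are formant readings in time order, ending at the closure.
    The 200 Hz default is an arbitrary illustrative threshold.
    """
    f2_rises = f2_hz[-1] > f2_hz[0]
    f3_falls = f3_hz[-1] < f3_hz[0]
    gap_start = f3_hz[0] - f2_hz[0]
    gap_end = f3_hz[-1] - f2_hz[-1]
    return f2_rises and f3_falls and (gap_start - gap_end) >= min_closing_hz


# Hypothetical readings: F2 climbing and F3 dropping toward each other.
print(looks_like_velar_pinch([1600, 1750, 1900], [2600, 2450, 2300]))  # True

# Hypothetical readings: both formants rising in parallel, so no pinch.
print(looks_like_velar_pinch([1600, 1650, 1700], [2600, 2650, 2700]))  # False
```

As noted for the earlier 'take' stop, a vowel whose own F2 is already heading upward can mimic half of this pattern, so the check is a hint and not proof of velar place.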

[n], [n]
This is a pretty good nasal, considering it's initial in a phrase. Starting at 1350 msec or so, we see good voicing, but very
little upper resonance. There's a definite zeroing of some of the energy, and evidence of resonance at about 1500. The
higher resonance probably means something, but I don't know what. But this is definitely a nasal, what with the sharp
difference between the zeroes in the nasal and the energy in the non-formant areas in the following vowel. The
transitions in the following vowel are pretty alveolar looking--falling F3 and F2 rising from just above 1500.
Epsilon, IPA 303

[E], [ɛ]
It's not easy to tell what's happening with the F1 here, since the bandwidth seems so wide, but if the top edge of the
band is around 1000 (I could go a little higher or a little lower depending on how I'm feeling) and the bottom band is
definitely down near the bottom of the spectrogram, and there's something like the peak of the formant or at least a
strong harmonic or something at about 500 Hz, I'd say there's a good chance this is mid. At least this isn't obviously
very high or very low. The F2 is high, indicating frontness, which leaves us a choice between /e/ and /E/. You might
think with the rising F2 to go with [e(I)], but then you would have been misled by the transitions to the following
consonant. This is short, and it looks like it's in a closed syllable. Epsilon is probably the safer bet, but partial credit
either way.

[k], [k]
Well, here's another one, and once again it's in a coda. The useful transitions are the ones moving into this stop. This
one doesn't even have any useful release information. Sorry.

[s], [s]
Well, this is short, but it has the right spectrum, noisy energy, strongest in the highest frequencies. No evidence of
voicing, although there is some energy down there. This may have a little too much energy to be a devoiced [z], but it's
hard to tell. It's awfully short and weak, but with that spectrum it can't be anything but alveolar. If the onset of the
noise were sharper, I might suspect this is just alveolar aspiration. But I'm rambling.

[t], [t]
Well, it's not much of a gap. In fact it's not a gap. But it looks like it might have been trying to be a gap. But I don't think
it's my imagination that something is happening in here. There's [s]. Then something gap-like happens before the
following [w] (which is what that is, even if it is partially devoiced). But I'm getting ahead of myself. Following the gap-
like thing, there's something that looks a heckuva lot like the short, weakish [s] we were just talking about. Which is
what prompted the comment about it looking like the aspiration (or just the release frication) of an alveolar. Thus we
have the thematic payoff I strive for in my writing. (Did I mention how tired I am?)
Lower-case W, IPA 170

[w], [w]
Well, you can definitely see the F2 sweeping down in the 'tail' of the aspiration/release noise, and sweeping up again for
most of the vowel. The more I look at [w]s, it looks like the F2 is w-colo(u)ring the vowel. But we don't hear it as
colo(u)red. But anyway, low F2, back and/or round. No appreciable effect on F3, though some on F4, which is a little weird.
But without the raised F3 or F4, this is unlikely to be a 'dark l'.

[i], [i]
Well, if you're not spotting these incredibly high F2s by now, I don't know what more to say.
[k], [k]
Gap, followed by some kind of release and some frication. I hesitated to transcribe this as an aspirated stop, since I'm
not sure this is aspiration and not just frication. Richard Wright suggested it should be transcribed as [x] or something,
but I was ambivalent about that idea. Anyway, there are signs of velar pinch going into it, and there is noise in the
burst/release/frication/aspiration part that's also centered in the 'pinch' range. Oddly enough the two pinches don't
happen at the same frequency. Probably due to the front vowel, it's a very front velar at its onset, but either the contact
slides back to a more neutral position or the tongue body rolls backward during the closure. This happens. Pat Keating
told me she saw the x-rays, but I don't remember where. The front is assimilation or coproduction or something like
that with the front vowel. But since this is utterance final, there's no coarticulatory pressure coming from the other side.
We might expect the closure to not change, but obviously we'd be mistaken.
I'm really tired.

August 2002
"Every year it's the same thing."
Epsilon + Rhoticity Sign, IPA 303 + 419

[EÕ], [ɛ˞]
My first guess at this was [I], and I knew what this spectrogram was. Whatever it is, it ain't lowered (for those of you
who believe Canadian Shift is part of the American "Third" dialect). Note the F1, which is in the mid range, as opposed
to the vowels in the following syllables, which are higher (i.e. have a lower F1). So we're not dealing with the world's
highest vowel, even if we are dealing with something that is higher than mid. The F2 indicates an extremely front
vowel, at least for the first 75 msec or so. So very front and not /i/. So /e/ or /E/, and get on with things. The radical
lowering of the F2 and F3 during this vowel is coarticulatory, or assimilatory, depending on whether you think these
things are phonetic or phonological. There's something coming up dragging the F3 down out of all proportion, and as a
resonance it either has to cross the F2 resonance (and cause everything to get renumbered) or it has to push the F2 out
of the way. So for those of you who believe the F3 of /r/ is the F2 of the surrounding vowels, explain to me what's
happening to the F2 here. My /e/s tend to be quite high (see the one coming up), and the F2 tends to move up (i.e. the
vowel tends to diphthongize frontwards), so /I/ and /E/ are the best guesses here. Plus the /r/ colo(u)ring. I do not
want to get into the debate about whether there's a glottal stop at the beginning of this.

[v], [v]
This may be the best [v] I've ever produced. It's noisy. It's at least partially voiced, and it doesn't look like anything else.
It's tough to tell from the F1 transitions, since there really aren't any, and the F2 and F3 transitions aren't helping, since
they're falling anyway. So there's no transitional information to tell us about place. The fricative is vaguely reminiscent
of Esh noise, in that it's fairly broad band and cuts out in the low frequencies. So if you guessed Esh here, give yourself a
point, but then try to make a word out of the rest of it. The noise is unfiltered, i.e. isn't shaped by vocal tract resonances
(at least to the extent that the noise is pretty even across frequencies--as long as you ignore the low frequencies).
Another great argument for anteriority is the fact that there's an /r/ in the vicinity, and there's not a speck of a sign of
the /r/ resonances filtering the noise. So it's probably fairly front in the mouth, at or in front of the incisors. Since this
is English, that only leaves a (inter)dental or labiodental. Weakly voiced. If you think it's Theta, again try to make a word
out of it.
Turned R, IPA 151

[¨], [ɹ]
Okay, here's where we'd get to talk about the 'beads on a string' model, and how it fails. I'm not at all sure there's
actually a distinct /r/ moment here, but there's really no other explanation as to the F3 being so low through the
preceding fricative. I've segmented off the bit I have with the following ideas: 1) This is Spectrogram Reading, and it
Just Wouldn't Be Fair to include a segment for which there is no useful cue. 2) Even if I didn't know what the utterance
is, there's no reason for the F3 here to be so low, so there must be an /r/ in here somewhere. 3) This being phonetic
transcription, I have to linearize somehow. 4) It seems to me there's a hint of something resonant in the tail end of the
fricative (note the apparent resonance rising from about 500 Hz just around the 300 msec mark). 5) I can convince
myself that the F2 is stable from when it comes in, just before 300 msec to about 330 msec. 6) Something is going on in
the resonances above 3000. See the F4? See how it's weak and sort of gets louder at about 400 msec. So, I have decided
that the /r/ is in there somewhere, and that's where I put it.

[i], [i]
This vowel is a hair higher (has a lower F1) than the initial vowel, so whatever you called the first one, make sure this
one is higher. The F1 is remarkably flat from about 300 msec all the way to about 750. This is a long span to be so flat,
especially with the pitch changes and all the fronty-backy-F3-swoopy stuff going on above. Since the F2 eventually ends
up above 2000 Hz, which is way high, this is way front. /i/ is the frontest vowel I can think of, and it's higher than
whatever the first one is to boot.

[j], [j]
Regarded as the approximant (i.e. consonantal) equivalent of /i/, this doesn't really look like anything but a vowel,
except for a tiny bit of noise and fuzzy stuff in F4 and above. Initial in a word or phrase, I can make this a real fricative.
But here it's fully voiced, not really /r/ colo(u)red (who can tell with the F2 that high), and it's flanked by things I
definitely want to be vowels. I call this object 'jod', which I believe is the name of a Hebrew letter for something in this
range. I think I picked this up from Sharon Hargus. 'Jod' is that object for which "j" is the IPA and "y" is the American
symbol, the non-syllabic counterpart of /i/. I don't usually call the offglide of [ai] [ei] [oi] type diphthongs "jod", but I
think they're the same phonological object. But I digress. This looks like an /i/, but it's short and flanked by /r/-
colo(u)red vowels. The F1 dips just a little, suggesting a moment of greater stricture/lesser aperture, again consistent
with /j/.
Lower-case I + Rhoticity Sign, IPA 301 + 419

[iÕ], [i˞]
Well, who knows. But it's definitely /r/ colo(u)red.
Turned R, IPA 151

[¨], [ɹ]
Well, if there was any question, this would clear it up. The F3 barely clears the 2000 Hz mark, but there you go. I also
think there's a short F3 steady state here, but that's arguable. The main thing is the low F3.

[I], [ɪ]
The thing to notice here is that the F3 rises again to normal, but the F2 doesn't follow it. It rises all right, but not in
parallel with the F3 as in the preceding series of segments. But this thing is still quite high (check the F1), and not as
front as /i/. Two guesses. Well, maybe three.

[t], [t]
There's a nice clean gap (for a change) here, indicating a stop. This looks pretty voiceless, but there are some pops down
there that make me wonder. There's no hint here of velar pinch, nor of bilabial transitions (the F2 seems to sink a little
but not so far as to suggest a definite bilabial), and the F3 doesn't indicate anything but coronal.
[s], [s]
Frankly, this looks like a voiceless /z/. So guess what. What I mean is that this seems real short and weak for an /s/.
But there you go.
Eth + Raising Sign, IPA 131 + 429

[D3], [ð̝]
Foul, you cry. Well, I'm trying to be consistent with IPA recommendations regarding diacritics. True, there's a stop-moment
followed by an incredibly short fricative moment, followed by some aspiration. But just try to make a word out
of that. Phonologically, this is an Eth, the beginning of the word 'the' (in case you haven't figured out that far ahead).
Produced as a phonetic affricate, it might be the beginning of a prosodic phrase (actually, I'm not so sure, but work with
me), and might be fortis, i.e. produced with more forceful articulation. So if you start with something fricative, and
raise it, you get something that's a stop. Or an affricate. Or something like that. Okay, and it's voiceless, but I just
couldn't deal with all that.
Schwa, IPA 322

[«], [ə]
Nice, shortish vowel. Evenly spaced formants. Schwa.

[s], [s]
This is a more canonical /s/. Very broad band, almost white, at least for the frequencies we can see. Some hints that the
highest amplitudes are off the top of the spectrogram (use your imagination if you can't just convince yourself that the
higher frequencies are just a little higher in amplitude).

[eI], [eɪ]
There's precious little evidence of F1 movement here, except perhaps at the very end of the vowel. I tend to ignore that
since it's obviously coincident with the sharp F2 transition, suggesting that this section of this vowel is transitional and
should be ignored. On the other hand, you can't miss that F2. This is the movingest (?) F2 in /e/ I've ever produced. So,
if you take the F1 to be middish, and the F2 to be front and getting fronter, what would you have?

"It's an odd place to find it."

"The story is about a mechanic."

[D3], [ð̝]
We can debate the extent of this segment until the cows come home, but here's my thinking. The left edge (about 125
msec) suggests a burst. The segment is voiced, and it looks a little like a nasal, in that it seems to have a little bit of a
resonance at about 1400 Hz and zeroes. But, if you can make them out, there's a little bit of noise in the upper 'poles',
particularly as they move into the following vowel and the resonance of the upper formants (which is discontinuous with
the apparent upper poles). So if you think this is a nasal, hold on to that. It's really a dental affricate, which is typical for
initial Eth.
Schwa, IPA 322

[«], [ə]
Teeny short vowel, especially for an utterance-initial syllable. This suggests a stressless vowel, which suggests reduction.
Following the usual conventions, this is transcribed as Schwa.

[s], [s]
Fairly typical sibilant, nice dark noisy energy, getting strongest in the highest frequencies. I'm tempted to suggest that you
can tell this is syllable initial (or rather that it forms a syllable-initial cluster with the following stop) because the noise
increases in amplitude until the stop closure. Somebody who works on syllable affiliation and fricatives should tell me
whether that's really true or not, or if it's just that pressure always rises as you approach a following stop. Stay tuned.

[t], [t]
Stop (ignore the reverberant noise in the upper frequencies), and voiceless. The transitions following the release look a
little velar, in that F3 seems to be rising and F2 seems to be falling (i.e. like they start in a velar-pinch configuration). Note
though that the burst/release thing is a) really loud, b) not at all double-looking, considering its apparent loudness, and c)
not centered in the F2-F3 area (as would be typical of a velar release) but in F3 and F4 (and if you work at it, higher).
Also, the closure itself is a little short (if we're still thinking velar), and the highest visible frequencies of the burst are too
strong. So limited positive evidence either in the velar or bilabial direction, so pick alveolar. Note that this gap is clearly
voiceless, but the stop is not aspirated.
Open O + Rhoticity Sign, IPA 306 + 419

[Õ], [ɔ˞]
This is the backest vowel I've ever produced. Note that the F1 is fairly plainly mid, around 500 Hz, but the F2 is,
depending on where you measure it, just below 1000 Hz. That's just outrageously back and round. The /r/ colo(u)ring (the
lowering of F3, in anticipation of, well, what's coming up next) doesn't start until almost 100 msec in. The result is
spectacular--there's actually a steady state where the F2 and the F3 are flat at the same time. Okay, it occurs very early
(from about 490 to 525 msec), but there it is. Mid, back and round. That's only /o/ for me, but given that it's /r/-coloured, I
suppose open-o is better.
Turned R, IPA 151

[¨], [ɹ]
No steady state, and I like to point out that the F3 minimum, which is the point at which I tend to measure these things, is
not accompanied by any kind of indication that F2 is doing anything in particular, beyond moving from where it was to
where it's going. F3 is nice and low, 1650 or something like that. That ought to be enough.
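The measure-at-the-F3-minimum habit described above can be sketched in a few lines of Python. This is only an illustration: the function names, the 2000 Hz ceiling, and the sample track are assumptions made up for the example, not measurements from this spectrogram or anything the original analysis used.

```python
def f3_minimum(track):
    """Return (time_ms, f3_hz) at the lowest F3 value in a formant track.

    track: iterable of (time_ms, f3_hz) pairs, e.g. from a formant tracker.
    """
    points = list(track)
    if not points:
        raise ValueError("empty F3 track")
    return min(points, key=lambda point: point[1])


def looks_rhotic(f3_hz, ceiling_hz=2000.0):
    """True if F3 falls below an assumed, speaker-dependent ceiling."""
    return f3_hz < ceiling_hz


# Hypothetical values for illustration only.
example_track = [(520, 2300.0), (540, 2000.0), (560, 1750.0),
                 (580, 1650.0), (600, 1900.0)]
time_ms, f3_hz = f3_minimum(example_track)
print(time_ms, f3_hz, looks_rhotic(f3_hz))
```

The ceiling is a rule of thumb and would need adjusting per speaker; the point is only that the decision rests on the F3 minimum and not on anything F2 is doing.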

[i], [i]
Okay, there are some vowels that don't reduce quite as drastically as others. /o/ and /i/, mostly, although /e/ particularly in
apparent compounds (yesterday, monday, etc.) sometimes pops up this way. Then there's the whole if-it-doesn't-reduce-it-
must-bear-some-stress argument which I don't believe. Anyway, this is an /i/, but reduced. Note the F2 goes way high
(although not as high as it might be). The F1 looks like it stays mid, but that's partly an illusion having to do with the
bandwidth and the underlying harmonics. It doesn't lower much tho. Hence, reduced. The proximity of F2 and F3, while it
might be due to F3 being relatively low at the point the F2 reaches its max, suggests /i/ over /I/. Depending on who you
believe, this could be /I/ underlyingly or even allophonically, but it doesn't sound like an /I/ to me.
Barred I, IPA 317

[ö], [ɨ]
The F3 is moving throughout, suggesting that, while a vowel, this vowel doesn't have a particular F2 target (following
Keating, among others). If reduction involves the relaxing or deletion of targets, this is what you would get. Also, given
that this is a sequence of vowels, one of them has to be reduced, if not outright non-syllabic. This is too long to really be
non-syllabic, and it doesn't look much like one of the usual off-glides. So this is probably reduced. Following convention,
since the F2 is closer to the F3 than the F1, I've transcribed it barred-i.

[z], [z]
What noise there is is in the high frequencies. It's not particularly loud, so you might miss it. But there it is. Nice striations
at the bottom, indicating voicing.
Schwa, IPA 322

[«], [ə]
Short, obviously reduced vowel. Pick one.

[b], [b]
Nice little stop, fully voiced. The F2 and F3 transitions, such as they are, suggest bilabial, over anything else, although
they're subtle. The release burst (at about 990 msec--the first 'pulse' of the following vowel) is not strong in the high
frequencies or middle frequencies as it would be if it were alveolar or velar. As one would hope. Also, bilabial stops tend
to support voicing in a way that alveolar and velar stops can't (extra points if you can explain why).
Lower-case A + Upsilon, IPA 304 + 321

[aU], [aʊ]
This is a diphthong. Trust me. I used to always use Tie Lines with diphthongs, but it's just too much work in Unicode.
Anyway, the F2 goes from something vaguely neutral to something clearly more back. The F1 isn't helpful, in that it's
sort of just inverted-U shaped. But it clearly hits an extremum (maximum, in this case) at about 1075 msec, indicating that
something in here is low. So low vowel, backing diphthong. Review the history of English, and there you go.

[R], [ɾ]
Nice little flap. A gap in the spectrogram, but incredibly short. Too short to be a 'real' stop. Nice little flap. Which means
that this is probably a /t/ or /d/.
Schwa, IPA 322

[«], [ə]
Another short little vowel, and given that the previous segment is a flap, this vowel is unlikely to be stressed. Get on with
your life.

[m], [m]
Now this is a nasal. Fully voiced. Nice little zero. Actually more than one. Without any other nasals in this spectrogram to
compare it to, you'd have to know my voice pretty well to see that the pole (F2) is just a little low (for my nasals). Which
is typical of my /m/s.
Barred I, IPA 317

[ö], [ɨ]
Lots of unstressed vowels in this spectrogram, but that's why it can be long and still readable. See above.

[kH], [kʰ]
Following strict IPA conventions, aspiration is a diacritic mark on a (stop) segment. UCLABET, following ARPABET and
a few others, suggests marking closure phases independently from release (plus aspiration) phases, which would make
segmenting easier, but whatever. There's frontish vowels on both sides, so this is fronted (i.e. front velar), so the pinch,
such as it is, is high rather than mid-frequency. Note that the burst is double, and centered around the F2/F3 (high) pinch
area. Dead giveaway for velar. Voiceless during the closure, and strongly aspirated, with loooong aspiration duration.
Velars like long aspirations. Extra points for explaining why.
Ash, IPA 325

[Q], [æ]
An incredibly high F1, indicating frankly the lowest vowel I think I've ever produced. The F2 looks like it's moving
throughout, and it looks pretty neutral. But for most of it, it's just a hair high. So vaguely front, and very low. This is
English, and this is me. Must be Ash.
Fish-hook R + Tilde, IPA 124 + 424

[R)], [ɾ̃]
Nasalized flap. I love this. How the heck are you supposed to know? Well, there's this very short thing. If it weren't so
short, it would look like a nasal. Hence nasal flap.
Barred I, IPA 317

[ö], [ɨ]
Okay, this really looks like an /I/. If you want it to be an /I/, fine. But this vowel turns out to be radically unstressed, so
I've transcribed it with a barred-i. Argue with me if you want to.

[kH], [kʰ]
This is a nice example of a stop with velar pinch on either side. Released, definitely. Aspirated, let's argue about it.
"It comes in a huge bottle."

In the description that follows, the first line gives the name of the symbol (from Pullum & Ladusaw, 1996)
followed by the IPA reference number for the symbol named. The second line provides the symbol as a typed symbol in
SILDoulos IPA93, followed by the same symbol as a Lucida Sans Unicode symbol. One or the other of these may not
appear as the appropriate symbol, depending on which font(s) you have installed.

[I], [ɪ]
This may start with a glottal stop, but I just couldn't cope with trying to explain why or why not, so we're starting with the
vowel. The F1 is low, the F2 is quite high (1900? 1950? Hz). If you look a little further down in the spectrogram
(between 900 and 1100 msec) you'll see the F2 getting higher still, suggesting that this isn't the most front vowel
available. What's incredibly front but not as front as /i/?

[t], [t]
There's not much in the way of useful transition information. But in the absence of any indication of velar pinch, and any
indication of labial transitions, alveolar is the best guess. Note the apparent double burst, possibly triple (quadruple?),
depending on what you count as what. I think the first might be the actual alveolar closure, the second the alveolar release,
and the third (when the aspiration starts), the release of the following stop. See below.
Lower-case K + Right Superscript H*, IPA 109 + 404

[kH], [kʰ]
*Technically, P&L don't name the right-superscript letter diacritics for aspiration, palatalization and so forth. This should probably be called 'Right
Superscript Lower-case H', and I'd really prefer 'aspiration mark', but it's not always up to me.
There's not a lot of information here either. The lag between the big burst at about 250 msec and the beginning of the
aspiration at 300 msec or so is a little long for your typical double-burst situation. And I think there might be a true second
burst (of a velar release) about 20 msec after the aspiration begins (there's certainly something odd about the noise
between 500 and 1700 Hz just surrounding the 300 msec mark). But there is some evidence of velar pinch in the
transitions into and through the following vowel. The F3 is weak and difficult to distinguish, but there is some indication
that both F3 and F2 (for instance in the following vowel) are pointing to about 1500 Hz in the release burst/early aspiration
moments, where the noise is definitely concentrated around 1500 Hz. This is the middle of the F2 range, and noise
centered here is typical of dorso-velar type approximations (hence velar pinch). Definitely voiceless and aspirated at least.
Turned V, IPA 314

[Ã], [ʌ]
If you transcribed this as Script A (IPA 305, SIL [A], LSU [ɑ]), you'd be justified. This has the high F1/Low F2/straddling
1000 Hz configuration of that vowel. But comparing it to the similar vowel later (1400 - 1600 msec), it's not quite as high
an F1, not quite as low an F2, and it's quite short. Could be prosodic, although the amplitude is a little high and the pitch
periods a little close together. Could be random, no two vowels are ever the same. But it might be the vowel it turns out to
be. The official description of Turned V in the IPA is as a back, unround vowel, which is true of this vowel. But so often
this vowel is central, it should probably be transcribed with another symbol. Especially in my California data, where this
vowel is clearly more fronter-than-central. But in this token, it's quite back.

[m], [m]
The abrupt change in amplitude indicates something is going on. This isn't a prototypical nasal--there's no clear zero,
there's energy in the higher frequencies. But this doesn't have the structure of any of the usual approximants either. No
swooping transitions into it or out of it. So this is probably nasal. If you know my voice, you could probably guess
bilabial, especially given the transitions into it, but knowing it's nasal is probably enough for now.

[z], [z]
Just vaguely voiced (I can convince myself there are some striations at the bottom, although this might not be obvious on
the web). Definitely fricative, spanning the spectrum. But the noise gets slightly stronger as you go up in frequency,
suggesting /s/ noise. So [z].
Barred I, IPA 317

[ö], [ɨ]
Very short vowel, so probably reduced. I usually transcribe reduced vowels as schwa, since they don't carry enough
information to make it worth doing more. Following Keating et al. (1994), if the F2 is closer to the F3 than the F1, I use
barred I rather than schwa.
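The barred-i-versus-schwa convention just described is simple enough to write down as a rule. The following is a minimal Python sketch of that comparison; the function name and the sample formant values are made up for illustration, and a tie is (arbitrarily) sent to schwa.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick a symbol for a reduced vowel from its formant spacing.

    Returns barred i if F2 is closer to F3 than to F1, otherwise schwa.
    """
    if not (f1_hz < f2_hz < f3_hz):
        raise ValueError("expected F1 < F2 < F3")
    f2_to_f3 = f3_hz - f2_hz
    f2_to_f1 = f2_hz - f1_hz
    return "ɨ" if f2_to_f3 < f2_to_f1 else "ə"


# Hypothetical formant values, not read off this spectrogram.
print(reduced_vowel_symbol(400, 1800, 2500))  # F2 nearer F3
print(reduced_vowel_symbol(500, 1200, 2500))  # F2 nearer F1
```

Distances here are in plain Hz; whether an auditory scale would be a better basis for "closer" is a separate question this sketch doesn't try to settle.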

[n], [n]
Another nasal, again without swooping transitions or a clear zero. On the other hand, this one is definitely different from
the preceding one, and the pole, such as it is, is higher, closer to 1500. This is typical of my nasal pole in /n/. Notice here
the other argument that this isn't an approximant--the transitions are not really continuous with the spectral pattern in the
closure.
Barred I, IPA 317

[ö], [ɨ]
See above. But notice this one is transitioning in F2.
C Cedilla, IPA 138

[C], [ç]
Ah, controversy. All those who transcribed this [s], raise your hand. Esh? [h]? Okay good. This is the famous example of
a true palatal (as opposed to post-alveolar/palatoalveolar) fricative. But, you cry, English doesn't have a palatal fricative.
Well, yes it does. This is an /h/. /h/ is voiceless (usually) with glottal or epiglottal noise. This is source and it resonates
through the vocal tract the way voiced source does. The result is formants, excited by noise. Here, it's combined with a /j/
supralaryngeal articulation, owing to this word being what it is. So imagine a palatal approximant [j], and now make it
voiceless. But, you cry, the IPA doesn't have a voiceless palatal approximant symbol, it has a diacritic. True. But this isn't
approximant. It's a fricative. Look at it. Now, this /h/ isn't absolutely voiceless. There's (all right, there are) a lot of
striations here. So I probably should have used the voiced palatal fricative symbol here, but no one would have recognized
it. So I didn't. But it is. And for those of you who care, it looks like this: SIL [ï], [ʝ].

[j], [j]
There's just enough here that is clearly voiced and not fricative to transcribe, so that's what I've done. Also, leaving it out
would just be confusing given my choice of vowel following.
Turned M, IPA 316

[µ], [ɯ]
The F2 indicates this vowel is back, but it never quite gets as far back and round (i.e. low F2) as a good back, round [u].
In my dialect this vowel (/u/) is almost never round, and never fully round, or if it is round, it's definitely never full back.
So I've transcribed it as unround and back, because that's what it seems to be. Central and round is another choice, but
given that it's me, and I know my rounding/backing situation pretty well, I pick the other one. BTW, this is one of those sounds that
people transcribe conventionally/phonologically rather than phonetically. This is the famous /ju/, which has its origins as a
reflex of OE or ME /y/ (front and round), which 'decomposes' to /ju/ in those "new/few/due" words that Wells (1984) lumps
into the GOOSE set. Compare this with a /du/ sequence sometime. You may learn something.

[d], [d]
This one is going to get confusing, because this is the stop portion of an affricate. But I couldn't talk about them as a unit
without making it more confusing. So here goes. This is a gap. The F2 rises into it, and the F3 doesn't really fall. So this is
probably alveolar. It's got a fair amount of (mostly perseverative) voicing, so I'd transcribe it as voiced. If it were
underlying voiceless, it might still have some perseverative voicing, but not that much. So it's probably underlyingly
voiced. Keep that in mind and move on to the fricative.
Esh, IPA 134

[S], [ʃ]
Fricative, quite strong for its duration, broad band, but this time not strongest in the higher frequencies. Concentrated in
the range of F2 and falling off sharply in the low frequencies. Pretty typical of Esh. And it's completely voiceless. But,
you cry, how can you have a d-esh cluster, especially followed by another stop? Well, you can. But underlyingly, this is a
d-yogh affricate, i.e. a <j> in English. Actually a 'soft' <g>. Moving on quickly.
Lower-case P, IPA 101

[p], [p]
Long gap here, indicating a stop. The noise is drifting down in the preceding fricative, and the transitions, sort of, are
rising in the following vowel, indicating labiality. Totally voiceless throughout, but with very (very, very) short VOT. So
this is voiceless, unaspirated [p], which I have transcribed accordingly. So, given that it's English, it's /b/.
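The step from phonetic [p] to phonological /b/ rests on voice onset time: in initial position English contrasts short-lag with long-lag stops, not voiced with voiceless ones. A rough Python sketch of that decision follows; the function name and the 30 msec boundary are assumptions picked for illustration, not values taken from this spectrogram.

```python
def english_bilabial_phoneme(vot_ms, boundary_ms=30.0):
    """Map the VOT of an initial bilabial stop to an English phoneme.

    Short-lag (or prevoiced, i.e. negative) VOT counts as /b/;
    long-lag, aspirated VOT counts as /p/. The boundary is an
    assumed round number, and real boundaries vary by speaker,
    place of articulation, and context.
    """
    return "/p/" if vot_ms >= boundary_ms else "/b/"


# Hypothetical VOT values in msec.
print(english_bilabial_phoneme(10))   # very short lag, like the stop here
print(english_bilabial_phoneme(70))   # long lag, aspirated
```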

[R], [ɾ]
Very short, and actually this one is laterally released, but I'm not sure how you'd tell that from the spectrogram. Teeny
short things like this are best transcribed as a flap and forgotten before your brain explodes.
Lower-case L + Mid Tilde + Syllabicity Mark, IPA 155 + 428 + 431

[lò`], [l̴̩]
The F3 rises sharply before dying off completely, suggesting a high F3. F3 is definitely up out of normal range. F2 is
definitely very low. It would be easy to just say this was more Script A, but the upper formants do not permit this
interpretation. The high F3/F4 indicates lateral. I'll leave it to someone else to describe why. If it were /r/ the F3 would be
low. If it were /w/ the F2 would be lower. If it were any other vowel, the F3 and F4 wouldn't be the way they are.
Note: Okay, this month, we're kissing my GIFs goodbye. Nobody likes them, and as long as a) browsers standardly support client-side
fonts and Unicode, and b) the relevant client-side fonts are free, I'm not going to miss them. SIL Doulos also is having trouble in some
browsers (or vice versa) with character spacing and overstrikes, so Unicode seems to be pulling into the lead. So this month it's Pullum
& Ladusaw labels, IPA identifiers, SIL Doulos, Lucida Sans Unicode. You can download Lucida Sans Unicode from John Wells, and
you can download the SIL font(s) from SIL.
"Curling season is over."
Lower-case K + Superscript Lower-case H, IPA 109 + 404

[kH], [kʰ]
The double burst at about 200 msec is a dead giveaway. The burst tells you that there was definitely a stop closure, as opposed to a
weak fricative or more open articulation. The burst is doubled, characteristic of /k/ in English. The aspiration is very very long, again
characteristic of velars. The usual velar cue (velar pinch) is not in obvious evidence here, but then the following vowel pretty much
makes that impossible. But note that the double burst is centered in the F2/F3 range of the following vowel, i.e. there's a
velar-pinch-shaped filter here. Voiceless, velar, aspirated.

[¨`], [ɹ̩]
F1 appears to be quite low, F2 is at about 1250 Hz and F3 is about 1600 Hz. Low F3, therefore /r/ (in [{North}American] English).
Long and high-amplitude (and fully sonorant/resonant), and therefore a local vowel/syllable head.
Lower-case L + Superimposed Tilde, IPA 155 + 428

[lò], [l̴]
F2 dips sharply here (around 435-450 msec), as does the amplitude. F3 rises out of the R, hitting a peak toward the end of this stretch
or in the beginning of the following vowel. Very low F2 means very back or very round. The F2 transition in the following vowel looks
suspiciously /w/ like, but that's the nature of velarization in dark /l/. You'll know it's a dark l and not a /w/ or something by context more
than anything else. I suppose the F1 is weird for a /w/, but you couldn't prove that by me. Mark something going on here and then go
on. (Unless you're Canadian, in which case /l/ might be the only logical choice for this word....)

[i], [i]
Well, this turns out to be a stressless vowel, so we might expect it to look like a schwa. Frankly, I didn't think I did this, but there it is.
In California, we gave a test where one of the questions was to identify the minimal pair, and two of the choices were 'king, keen' and
'keen, kin'. And the native Californian students *all* chose 'king, keen', because in Californian English high front vowels go all tense
before the velar nasal. "We went hikeeeng and canoeeeeeng." So just look at that F2 and tell me this isn't the frontest vowel you've ever
seen. (The transition heads up higher than some of the non-/r/ F3s in this spectrogram). So how you can transcribe this vowel as
anything but [i] is beyond me. So I did.
Eng, IPA 119

[N], [ŋ]
The F2 and F3 have practically merged, suggesting pinch and velar-ness. The strong nasal formant (high as it is) obviously suggests
nasal.

[s], [s]
There's some stray formant organization which might mislead some, but ignoring the F2-looking shaping, this is pretty standard
sibilant. High amplitude (for a fricative), cent(e)red in the very high frequencies. Voiceless, sibilant, and alveolar (due to the high
frequency cent(e)r(e)). (I really have to just switch to Canadian spelling or not.)

[i], [i]
Low F1, super high F2. Must be /i/.

[z], [z]
Same spectral profile as the preceding fricative, but this one is shorter, slightly weaker, and voiced. Typical of /z/ with respect to /s/.
Schwa, IPA 322

[«], [ə]
This schwa is pretty classic. Evenly spaced formants, at approximately 500, 1500, 2500 Hz. Well sort of approximately. Close enough.
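The "evenly spaced at 500, 1500, 2500" pattern is what the textbook quarter-wavelength model predicts for a uniform tube closed at one end. A minimal sketch, assuming a 17.5 cm vocal tract and a speed of sound of 35000 cm/s (both round textbook figures, not measurements of this speaker):

```python
def neutral_tube_formants(length_cm=17.5, speed_of_sound_cm_s=35000.0, count=3):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses F_n = (2n - 1) * c / (4 * L). The defaults are assumed round numbers.
    """
    return [(2 * n - 1) * speed_of_sound_cm_s / (4.0 * length_cm)
            for n in range(1, count + 1)]


print(neutral_tube_formants())  # [500.0, 1500.0, 2500.0]
```

A shorter tube pushes all three resonances up, which is one reason real schwas only "sort of approximately" hit these numbers.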

[n], [n]
You know something is going on here because there's clearly an amplitude drop from about 930 msec to 990 msec. The amplitude drop
suggests a) a consonant and b) a nasal zero. So this is probably a nasal, although sometimes /l/s look like this, except in English they'd
be dark/velarized and the F2 would definitely be lower. This one looks like a schwa, only a consonant. Okay, so this is nasal. If you are
familiar with my nasals, this looks alveolar, but whatever.
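Spotting an amplitude drop like this one can be sketched as a scan over an amplitude envelope. The envelope values and the 6 dB criterion below are invented for illustration; only the 930 to 990 msec stretch comes from the text.

```python
def find_dips(envelope, drop_db=6.0):
    """Return (start_msec, end_msec) stretches at least drop_db below the peak level.

    `envelope` is a list of (time_msec, level_db) samples in time order.
    """
    peak = max(level for _, level in envelope)
    dips = []
    start = None
    previous_time = None
    for time_msec, level in envelope:
        low = level <= peak - drop_db
        if low and start is None:
            start = time_msec          # a dip begins here
        elif not low and start is not None:
            dips.append((start, previous_time))  # the dip ended at the last low sample
            start = None
        previous_time = time_msec
    if start is not None:
        dips.append((start, previous_time))
    return dips


# Hypothetical envelope around the stretch discussed above.
envelope = [(900, 60), (915, 61), (930, 52), (945, 50),
            (960, 51), (975, 52), (990, 53), (1005, 60)]
print(find_dips(envelope))  # [(930, 990)]
```

A dip only tells you that something consonantal is there. Deciding that it is nasal still takes the zero and the flat formant structure.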
Barred I, IPA 317

[ö], [ɨ]
Well, this one looks just like the thing I called schwa before, except the F2 is a little higher. Following Keating et al. (1994), if it's a
reduced or stressless vowel, and the F2 is closer to the F3 than the F1, call it a barred i and go on.
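The rule of thumb is simple enough to write down. This is only a sketch of the rule as stated here, with a made-up function name and made-up formant values in the examples:

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick barred i or schwa for a reduced vowel by where F2 sits.

    If F2 is closer to F3 than to F1, return barred i; otherwise schwa.
    """
    if abs(f3_hz - f2_hz) < abs(f2_hz - f1_hz):
        return "ɨ"
    return "ə"


print(reduced_vowel_symbol(500, 1500, 2500))  # evenly spaced formants
print(reduced_vowel_symbol(450, 1800, 2500))  # F2 pulled up toward F3
```

The first call gives schwa and the second gives barred i. The rule says nothing about stressed vowels, so check that the vowel is reduced before applying it.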

[z], [z]
This one is even shorter and weaker than the previous one, but the features are the same.

[?], [ʔ]
Actually, this is creaky voice, but you can't mark a stretch of creaky voice as creaky voiced without marking it as some kind of vowel or
something too. And I didn't want to do that. So this is 'creaky voice as glottal stop'. Irregular, widely spaced glottal pulses. Could be
taken as a series of transient bursts or something, but you can see the resonance/echoey sound in the formants in between each pulse.
Lower-case O, IPA 307

[o], [o]
This looks amazingly monophthongal for me. The F1 is mid-to-low. The F2 is as low as my F2s really ever get. So mid-to-high, very
back and round.

[v], [v]
This is definitely fricative. The fricative noise here is quite definitely there, which is unusual, especially since it looks like this one is
at least partially voiced. The F3 transitions down into it, but that could just be the F3 starting down for the following /r/. The F2
actually has vaguely labial-looking transitions, sort of. Definitely not alveolar or velar looking. So not coronal, not dorsal, and fricative
and voiced. /v/ is really the only choice.

[¨`], [ɹ̩]
Well, this is pretty clearly an /r/ for the usual reasons. Given the preceding consonant (not to mention the incredible length due in part
to phrase-finality), this must be syllabic.
Robert Hagiwara, Ph.D. Support Free Speech
Winnipeg, Manitoba
CANADA R3T 5V5
"There was no mystery at all."

[ ], [D], [ð̝ ]
There's some voicing here, so something is up. It looks a lot like a stop, clearly voiced and with a very short VOT. But underlying voiced
stops are usually voiceless initially, and you don't get much more initial than initial in utterance. (Well, you can, but there's very little
work on the segmental stuff of higher than utterance prosodic constituents.) The F2 and F3 transitions following start high, suggesting
alveolar, or at least coronal. Could be a /d/, but will turn out not to be once you get the whole utterance together. What's the thing that's
most likely to forticize to [d]? And then mark it raised. Moving on.
Epsilon + Rhoticity Sign, IPA 303 + 327

[ ] (no diac.), [EÕ], [ɛ˞]
I really go back and forth on those rhoticity signs. But here we go. Middish vowel, actually might be a trifle high for no apparent reason
(check the F1). Starts quite front, which narrows down the possibilities. The diving F3 (hence the rhoticity sign on the vowel) is clearly
heading toward an /r/. Following general convention, I picked the 'lax' version of a mid-front vowel, with r-colo(u)ring.
Turned R, IPA 151

[ ], [¨], [ɹ]
F3 very low, in this case prototypically below 2000 Hz, although not nearly as low as it might be. Please note, all of you lip-rounding
fanatics, that the following segment has an incredibly low F2, suggesting [w] more than anything else, and the F3 actually rises across it.
I never want to hear another complaint about lip-rounding actually lowering F3. It doesn't. At least not enough to matter.
Lower Case W, IPA 170

[w], [w], [w]
There isn't much information here, since all the energy above about 900 Hz dies. But note the transitions into and out of this stretch
(centered around 300 msec) tell us that the F1 is very very low, as is the F2. Must be very very high, and very very back/round. Note also
that if you interpolate or extrapolate or whatever the course of the F3, the extreme rounding/backing going on here has *no* useful effect
on F3. None. Zip. Zero. See above.
Turned V, IPA 314

[ ], [Ã], [ʌ]
Ignoring the transitions from the preceding /w/, we've got a mid-to-low, moderately back or round vowel. There's only so many vowels
back there, and I, being from the Western US have fewer than most. I must say, I would have transcribed this with a Script A, IPA 305,
except I know what the word was. Actually, when I'm speaking in my fake professional voice, this vowel is very Script-A-ish. Somebody
remind me to listen to this again.
Lower Case Z, IPA 133

[z], [z], [z]
Nice high frequency noise, but clearly voiced (note the striations at the bottom). Clearly sibilant, and alveolar to boot (due to the noise),
and voiced. Onward.
Lower Case N, IPA 116

[n], [n], [n]
Well voiced, certainly. Sonorant probably. Apparent zeroes in between the resonances, suggesting not only weakness but actual zeroes.
Hence probably nasal. Hard to tell what place, since the following transitions are, um, unhelpful, to say the least. Mark it as "N" and go
on.
Lower Case O, Top Ligature, Upsilon, IPA 307 + 433 + 321

[ou] (no diac.), [oƒU], [oʊ] (no diac.)
Middish vowel, going from not particularly back to quite round. Get out your IPA vowel chart, and work it out for yourself. (I'm in a
hurry this month, okay?)
Lower Case M, IPA 114

[m], [m], [m]
Another sonorant. Less obviously zeroed than the previous one, which might mislead you, except the edges are *really* sharp and clean,
which you just don't get with your average oral sonorant. The resonances are different than the preceding one, so it has to be at a
different place, and if you're really imaginative you can see that the second resonance in the previous one is at about 1500 and the
resonance in this one is definitely closer to 1000 Hz. Hence, of the two, this one must have the longer side cavity, i.e. the oral closure is
further forward.
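The comparison between the two nasals amounts to asking which reference pole a measured resonance is nearer to. A rough sketch, assuming the 1000 Hz and 1500 Hz figures quoted above as references for this one speaker. No velar reference is included because the text gives no number for it.

```python
# Rough pole frequencies quoted in the text for this speaker (assumed references).
NASAL_POLES_HZ = {"m": 1000.0, "n": 1500.0}


def guess_nasal_place(pole_hz, references=NASAL_POLES_HZ):
    """Return the nasal whose reference pole is nearest to the measured pole."""
    return min(references, key=lambda symbol: abs(references[symbol] - pole_hz))


print(guess_nasal_place(1050))  # m
print(guess_nasal_place(1450))  # n
```

A pole in the no-man's-land between the two references gets whichever label is nearer, so treat borderline cases as a hint and check the transitions in the neighbouring vowels.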

[I], [I], [ɪ]
Low F1, therefore high. Rather high F2 (though not as high as for /i/), so mostly front. Only so many choices in English, folks.
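The F1-means-height, F2-means-frontness reasoning can be sketched as a lookup. The band edges below are arbitrary assumptions for a generic adult male voice, not values taken from these spectrograms.

```python
def describe_vowel(f1_hz, f2_hz, f1_mid=(400.0, 650.0), f2_mid=(1200.0, 1800.0)):
    """Turn F1 and F2 into rough height and backness labels.

    Low F1 means a high vowel. High F2 means a front vowel.
    The band edges are illustrative assumptions only.
    """
    height = "high" if f1_hz < f1_mid[0] else "low" if f1_hz > f1_mid[1] else "mid"
    backness = ("back/round" if f2_hz < f2_mid[0]
                else "front" if f2_hz > f2_mid[1] else "central")
    return height, backness


print(describe_vowel(300, 2300))  # ('high', 'front')
print(describe_vowel(750, 1000))  # ('low', 'back/round')
```

Once you have the labels, the "only so many choices in English" step is a matter of checking which vowels of the dialect fit them.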
Lower Case S, IPA 132

[s], [s], [s]
This one is obviously voiceless (unless you are misled by the sound of my refrigerator in the background of this whole thing). And the
noise is centered very high and mostly unfiltered at lower frequencies. If you don't recognize this as /s/, you need to go back and practice
some more.
Lower Case T + Right Superscript H, IPA 103 + 404

[t ], [tH], [tʰ]
In keeping with official IPA position, I'm trying to make sure I transcribe this as a unit, even if I really only segment off the closure
portion. There's a very very very short closure portion, unless you think it isn't closed, in which case there isn't. Following a fricative,
there's probably enough airflow that it doesn't take much time to build up any pressure behind it, hence it can be short and yet have a
sharp burst and high-volume aspiration. Little useful transitional information, so default to alveolar, especially given the short duration.
Please also notice the shape of the aspiration noise. There's this pattern where /tr/ gets rendered with something like a retroflex affricated
release. That's not what's going on here. Here, the noise is just being pulled down and post-alveolarized by the bunching of the tongue,
but you can see how it might be confused with a T-ESH affricate.
Turned R, IPA 151

[ ], [¨], [ɹ]
The very high F2 (required by the following vowel) is pushing the F3 up out of the usual range. But this is typical. Just cuz the F3 isn't
below 2000 Hz doesn't mean it isn't an /r/, any more than a random F3 just below 2000 Hz must be an /r/. Just ain't so. But you can still
tell this is an /r/ because the F3 is low (locally), and in fact about as low as it can get without having to push the F2 out of the way (which
it would do, if there weren't another competing demand on the F2). Hence my belief that the F2 and F3 of /r/ are probably perturbations to
the second and third resonance of the main tube, rather than side-cavity resonances. But proper acousticians disagree.
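The "low locally" idea can be sketched as a comparison against the neighbouring vowels instead of against a fixed 2000 Hz line. The 300 Hz margin and the example values are assumptions for illustration.

```python
def f3_is_locally_low(f3_here_hz, neighbouring_f3_hz, margin_hz=300.0):
    """True when F3 in this stretch sits well below F3 in the neighbouring vowels.

    `neighbouring_f3_hz` is a non-empty list of F3 values from adjacent segments.
    The margin is an illustrative assumption.
    """
    reference = sum(neighbouring_f3_hz) / len(neighbouring_f3_hz)
    return f3_here_hz <= reference - margin_hz


# F3 above 2000 Hz, but well below its neighbours: still /r/-like.
print(f3_is_locally_low(2300, [2900, 3000]))  # True
# F3 just under 2000 Hz, but so is everything around it: not convincing.
print(f3_is_locally_low(1950, [2000, 2100]))  # False
```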
Lower Case I, IPA 301

[i], [i], [i]
Very low F1, very, very high F2. Must be [i]. Explain.

[ ], [?], [ʔ]
This is the stoppiest glottal stop I've ever produced. Note the weird voicing quality in the /i/, and the last couple of pulses which are just a
little off. If you thought it was an oral stop, look at that 'release'. No transient, the high-amplitude spots of the first pulse/release thing are
pretty evenly distributed across the frequency range below 4000 Hz, and exactly contiguous with the formants following. No transitions
in the first few 'pulses'. If there were anything oral going on, you'd expect *something*. So this is glottal.
ASH, IPA 325
[ ], [Q], [æ]
Nice high F1, meaning a low vowel. Not particularly front, but not particularly back either. Lowest not-obviously-back vowel available is
/ae/.
Fish-Hook R, IPA 124

[ ], [R], [ɾ]
Classic flap. Briefest of interruptions to the sonorance and pulsing of the surrounding vowels. Slight bursty transition thing, but that
shouldn't trouble you too much.
Script A, IPA 305

[ ], [A], [ɑ]
Very high F1, so very low vowel. Very very low F2, so as back as it gets, given the height of the F1. Low back vowel. Not many choices
for me.
Lower Case L + Superimposed Tilde, IPA 209 (155 + 428)

[l] (no diac.), [lò], [ɫ (l̴)]
On the other hand, what the heck is going on here? This looks like a weird, phrase-ending-voice-quality variant of the preceding vowel,
but how many words end with [ ] (or if you're SIL Doulosing or Unicoding, [A], [ɑ])? Well, lots actually, but notice that there's a
moment, at about 1775 msec where there's an abrupt change in amplitude, or otherwise some kind of transient moment. So something
changed at that moment. So how many consonants could it be? Must be quite back, may not be high, but definitely involves some kind
of oral closure. Well, this one is tough, but once you consider /l/, you're home.
Note: I hate coding my GIFs for IPA characters, and I've been looking for alternatives. This month, as an experiment, I'm
coding my IPA GIFs into the text as always, but also coding in FONT FACE calls to the SIL Doulos 93 fonts, which are
downloadable for free from http://www.sil.org/computing/fonts/encore-ipa.html. For this to work, you must have the SIL
fonts installed on your machine. For good measure, I'm also putting in symbol names from Pullum and Ladusaw (1996)
(the 2nd edition of Phonetic Symbol Guide). Please let me know what works and what doesn't, especially if you have old
versions of Internet Explorer, any version of Netscape or any other browser, or if you're working on any kind of Mac or
Linux machine. Merci.
"Racoons don't hibernate."
[ ], [¨], TURNED R
Here's the thing about approximants--they're articulated similarly to vowels, but they behave like consonants in a string.
So this is obviously sonorant and voiced, having formants (resonances) and striations. Note the transitions in the
following vowel, which show you that the F3 in the vowel is continuous with the very very low F3 in the consonant. The
low F1 doesn't tell you much, the extremely low F2 might tell you back and round, but that F3 in the middle-F2 range is a
dead give-away for approximant /r/ in American English. (I respectfully remind you all that even though I'm in Canada
now, my dialect is definitely US, and western US at that.)
[ ], [Q], ASH
(I prefer 'AE LIGATURE' to 'ASH', but who am I to argue with Pullum and Ladusaw?) (Does anybody remember what a
runic AESC actually looks like?) Okay, the F1 in the latter part of this vowel (the steadier part, relatively unaffected by
the transition from the preceding sound) is highish, suggesting a lowish vowel. The F2 is sort of ambivalent, and the F3 is
a little low. So we're looking at a mid-to-low vowel, of a non-back variety. For me, that can only be [ ]- [Q]-ASH, or [
], [E], EPSILON. It'll turn out to be one or the other.
[k], [k], LOWER-CASE K

This is mushy, being vaguely fricative throughout. I'm wondering if I'm developing a neuromotor problem. Anyway, if one
assumes this isn't a fricative (it would have to be /h/, although it's probably closer to a [x], but that's not a useful guess in
English), one would have to note the sudden dip in F3 in the last bit of the preceding vowel (F3 hits its high point around
200 msec in and before you get to 250 it drops again), and the relative stability of F2 (at least it doesn't drop obviously),
suggesting a velar pinch. If this were alveolar, there would be no explanation for the dropping F3, and if it were labial
we'd expect F2 and F1 to drop a little more. Once the idea of [k] comes to mind, we can find other evidence that might
support that--the 'burst' if that's what you want to call it might be double. It's definitely centered around the mid-F2 to F3
range, as is the following aspiration. This all suggests [k], if weakly.
[ ], [H], RIGHT SUPERSCRIPT H

(Pullum and Ladusaw don't name this symbol explicitly. Under [h] LOWER-CASE H, they mention "Used as a right
superscript, it is the official IPA diacritic for aspirated sounds." (p. 72). They are of course correct. However, since the
acoustic cue for aspiration is distinct both from the closed phase of a stop, and for that matter the transient release of a
stop, I always segment aspiration separately. This is something I should resolve, one way or another.) Aspiration:
relatively low amplitude (usually) noise, following (usually) the release of a stop consonant before the onset of voicing.
That's what we've got here.
[u], [u], LOWER-CASE U

Low F1, high vowel. Low F2, round vowel. This is frankly as low an F2 as I've ever produced for this vowel. I wonder
what I was doing that day.
[n], [n], LOWER-CASE N

The only evidence that there's something going on here is the sudden change in the 'quality' of F1. At about 525 msec,
something happens to the F1. Also the harmonics seem to flatten out (that's that slanty stuff going on in the preceding
vowel), and the evidence of a nasal zero/antiformant creeps in. Not a lot to go on, but hypothesizing a nasal moment or
something in here helps explain the change from the first half to the second half of this stretch of voiced stuff. Maybe it's
just nasality, and there isn't a lot in the way of nasal stop here. But whatever.
[z], [z], LOWER-CASE Z

(Where I come from, this letter is called "ZEE", but in Canada and elsewhere, it's called "ZED". "ZED" is the older form,
but it's the only letter name that has both a consonantal onset and a closing consonant. Think about it. "EFF" "EM" "BEE"
"KAY" "AITCH". "ZED"? Pfft.) Weak frication, would have been easy to miss. But it's there, and it's very high. Alveolar
sibilants have very high frequency energy--the loudest noise is centered around 6-8 kHz or even higher. When I was
transcribing the image, I thought this was voiced, but it sure doesn't look that way. Technically, probably a devoiced [z] or
a weakly fricated [s].
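"Where the noise is centered" can be made a little more concrete as a power-weighted mean frequency (a spectral centre of gravity). A minimal sketch: the two toy spectra and the 5000 Hz boundary are invented for illustration, and real fricative spectra are much messier.

```python
def spectral_centroid_hz(spectrum):
    """Power-weighted mean frequency of a list of (frequency_hz, power) pairs."""
    total_power = sum(power for _, power in spectrum)
    if total_power == 0:
        raise ValueError("spectrum has no energy")
    return sum(freq * power for freq, power in spectrum) / total_power


def guess_sibilant_place(spectrum, boundary_hz=5000.0):
    """Call high-centred noise alveolar and lower-centred noise postalveolar.

    The boundary is an assumption, not a value from the text.
    """
    if spectral_centroid_hz(spectrum) >= boundary_hz:
        return "alveolar"
    return "postalveolar"


# Hypothetical noise spectra: energy piled up high versus in the F2/F3 region.
s_like = [(3000, 1.0), (5000, 3.0), (7000, 6.0), (9000, 4.0)]
esh_like = [(2000, 3.0), (3000, 6.0), (4000, 4.0), (7000, 1.0)]
print(guess_sibilant_place(s_like))    # alveolar
print(guess_sibilant_place(esh_like))  # postalveolar
```

Voicing is a separate question. The centroid only speaks to place, so a devoiced [z] and a weak [s] come out the same here.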
[d], [d], LOWER-CASE D

This is very short for a full-fledged stop. I might have transcribed it as a flap ([ ], [R], FISH-HOOK R), but it has a fairly
strong release. Again technically voiceless. I don't know what I was thinking. Well, I do, but I should have been thinking
about actual cues, not phonemics of English. Note the highish transitions into the following vowel, suggesting high 'loci'
for F2 and F3, suggesting alveolar. Also the short VOT is filled with high-frequency (alveolar-looking) noise.
[ou], [oƒU], LOWER-CASE O, TOP LIGATURE (TIE LINE), SMALL CAPITAL U

(One of the advantages of using the SIL fonts is the full suite of diacritics, which are messy to do by hand. On the other
hand, naming the resulting symbol(s) in words gets to be a mouthful.) The vowel here is clearly dynamic. There are some
clues to work with, though. The F1 has an extremum (a moment of maximum displacement in one direction or another)
just before 800 msec, and the F2 has an extremum just after 800 msec. This suggests to me that there are two targets, or
something like that, in this stretch of vowel. The earlier one is middish and sort of back, the second one is higher than
middish and very back/round. Search through your lexicon of English vowels, and you'll come up with something that is a
reflex of /o/.
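Finding those extrema by eye is easy enough, but the idea can be sketched as a scan over a formant track. The track values below are invented to mimic the description (an F1 peak just before 800 msec, an F2 trough just after). They are not measurements.

```python
def local_extrema(track):
    """Return (time_msec, value_hz, kind) for interior local maxima and minima.

    `track` is a list of (time_msec, frequency_hz) samples in time order.
    """
    found = []
    for i in range(1, len(track) - 1):
        time_msec, value = track[i]
        prev_value = track[i - 1][1]
        next_value = track[i + 1][1]
        if value > prev_value and value > next_value:
            found.append((time_msec, value, "max"))
        elif value < prev_value and value < next_value:
            found.append((time_msec, value, "min"))
    return found


# Hypothetical tracks for the diphthong discussed above.
f1_track = [(740, 480), (760, 520), (780, 560), (800, 530), (820, 470)]
f2_track = [(760, 1150), (780, 1050), (800, 960), (820, 900), (840, 980)]
print(local_extrema(f1_track))  # [(780, 560, 'max')]
print(local_extrema(f2_track))  # [(820, 900, 'min')]
```

Two extrema at different times in different formants is the hint that the vowel has two targets, which is what makes it a diphthong.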

Okay I admit it. I'm definitely cheating here. I believe that there is another 'moment' in this stretch of vowel, where F3
gets fuzzy and changes frequency. That's the only thing to motivate a nasal here. Mea culpa.
[t], [t], LOWER-CASE T

Again, this could be a flap, except for the fairly clear burst. Or it might have been glottalized/unreleased, except for that
damn burst. Don't believe anything they tell you in baby phonetics, cuz it won't be true when you look at it on a
spectrogram.
[h], [h], LOWER-CASE H
This is very strong for an /h/ noise. It looks a little like a weak [ ], [S], ESH, but then I'd expect both the high
frequencies to be less loud (dark) and the frequencies below 1200 Hz or so to be absent. I can convince myself that the
energy here has some formant structure, which would be more characteristic of /h/. On the other hand, having decided it's
an /h/, it may be particularly loud, long, and distinct from the following vowel due to its initialness in a) a stressed
syllable, b) a word, and c) a verb phrase. Fortition, don't you know. This is something I'm going to look at over the next
few years, using a small corpus I hope to collect over the summer.
[ai], [aƒI], LOWER-CASE A, TOP LIGATURE, SMALL CAPITAL I

This is quite short for this diphthong, but the F1 clearly goes from high to low (moves from a low vowel to a high one),
and the F2 clearly rises (moving from backer to fronter, although it's moderately front even at the beginning). Again,
searching for the possible diphthongs for my English, /ai/ is the best bet.
[b], [b], LOWER-CASE B

Please notice that at the end of the preceding vowel, in the 10 msec or so approaching 1100, the F2 takes a sharp dive.
The F3 also lowers quite a bit, though not as sharply or dramatically as F2. Anyway, the low 'loci' of these transitions
suggest (bi)labial. This one really is voiced, thank heaven.
[ ], [¨`], TURNED R, SYLLABICITY MARK

Even though this is moving, it definitely is /r/ like throughout, owing to the F3 being low. At the beginning of this vowel,
the F3 is more or less at the same frequency as the /r/ earlier in the utterance. Note the difference in F2 frequency--some
people require F2 and F3 to be close together. I don't know how they explain the other one. I think the F3 is low, and then
as the F2 moves from its onset (bilabially-locused) transition to its higher offset (alveolarly-locused) transition, it
pushes the F3 out of the way. How people for whom the F3 is a side-cavity pole and not a perturbation of the main-cavity
F3 explain this, I don't know. It must have something to do with the coupling, but why it should work this way I'm not
sure. Anyway, this is a syllabic /r/, and the fact that it isn't 'flat' is explainable by the competing needs of the surrounding
consonants.

There's definitely a nasal here. It's voiced, and sonorant/resonant, but it definitely has lower energy than a vowel, and it
even has a zero. Since there's not a lot to compare it to, it's hard to identify its place features, so just mark it as a Nasal and
go on.
[ei], [eƒI], LOWER-CASE E, TOP LIGATURE, SMALL CAPITAL I

I convinced myself that the F1 moves from mid to low (i.e. vowel moves mid to high), but looking at the spectrogram
now, I'm less sure. The bandwidth certainly changes, but so does the voice quality, which might be what's going on. The
F2 starts highish and moves higher, suggesting front-to-fronter in vowels. How many vowels can you think of that
routinely move front-to-fronter? There you go.
[t ], [tH], SMALL CAPITAL T, RIGHT SUPERSCRIPT H

Not a lot in the way of clues, except the aspiration looks like it follows an alveolar, for the reasons suggested earlier.
"Most require careful management."
Woo, a rough one. Very. Things will get easier in January. Probably.
Lower-case M
[m], IPA 114
From about 50-150 msec there's strong voicing, a weak, flat formant at 1200 Hz, another around 2600 Hz and one higher
than that too, but nothing in between. So something with that kind of weak resonance/zero structure, and flat, has to
be some kind of a nasal. The pole at 1200 is lower than I'd usually get for an alveolar, though a bit higher than I'd
normally get for a bilabial. But the transition in the following vowel is in no way alveolar-looking, so there you go.
Probably in initial position my tongue wasn't as low as it might have been, effectively shortening the side cavity. Think
about it.
Lower-case O
[o], IPA 307
F1 at about 500 Hz, F2 just above 1000 Hz. Don't ask me why the F3 is high. But something that's got a mid-vowel F1, a
back/round vowel F2. And basically flat, rather than obviously diphthongized.
Lower-case S
[s], IPA 132
Not the strongest I've seen, but whatever. Probably the effect of the syllabic position (coda, but not final). A single
broad band of noise, centered fairly high, and without a sharp drop off in the low frequencies. Don't be distracted by
the weak perseverative voicing.
Lower-case T
[t], IPA 103
Short gap, followed by [s]-shaped release noise (pulled down a bit in frequency by coarticulation with the following
sound). So alveolar [t] has [s]-shaped release noise. Discuss.
Turned R
[ɹ], IPA 151
Hard to tell, but that's F3 just above the F2. That thing up around 3000 Hz is just too high to be F3. So the F2 starts about
1100 Hz and rises, and F3 starts around 1400 Hz and rises. F3's that low can only be rhotic.
Barred I
[ɨ], IPA 317
On the other hand, this short vowel is too short to worry about. To the degree that it's not just 'more' /r/, I had to
transcribe it as something, and following the F2-closer-to-F3 rule from Keating et al. (1994), I chose barred-i, although
with an F3 that low it can't help but be close to the F2. I didn't use a rhoticity hook on it, since that mostly implies a
following /r/ (although there's no reason why it should). Moving on.

[kʰ], IPA 109 + 404
So we have a short, but fairly solid, gap between 500 and 600 msec, with a release that's heavy in the low frequencies.
Very suspicious. Usually indicative of a labial release. But the F2 transition in the following vowel may be throwing off
that judg(e)ment. As is the estimation of the transitions. If you look just at the transitions into the following voicing,
they look like they're rising, which again suggests bilabial. But that's almost 100 msec after the release, so anything
could be happening in the VOT. So the only clue that this is anything except a bilabial is the relative mushiness of the
release--not nice and sharp like a typical alveolar. Labials are usually fairly sharp. I'm choosing to believe that the little
blip just below 4000 Hz just after the main bit of release noise around 3750 Hz is evidence of a double burst. Of course, if
I knew this was bilabial, I'd choose to ignore it. This is not an exercise in the scientific method so much as hindsight
being 20/20. Don't confuse the two. Please. I couldn't live with myself as a scientist.
Lower Case W
[w], IPA 170
The thing about a [w] is that it's all transition. The noise that is visible in the low frequencies during the VOT is
supported by F1 and F2, which you can see rise sharply once the voicing kicks in. Note the 'straight' F2 transition,
typical of English (onset) [w]. Also note the F3 starts a little low, but by no means as low as an [r].

[aɪ], IPA 304 + 319
So abstracting away from the [w], there's no reason for the F1 to rise to 750 or 800 Hz unless it's got some kind of target
independent of the relatively flat F1 in the midrange later. So there has to be something lowish here, something
compatible with the low F2 (or something odd would be happening to the F2 as well). So we're looking for something
lowish and back. Then the F1 drops, so something is moving slightly higher, but critically the F2 is zooming to a
maximum of about 1800 Hz. Again, there's no reason for that to happen unless it's heading somewhere specific. It may
or may not make it, since the F3 starts coming down and knocks it out of the way, basically. So what we have here is a
sequence of a lowish, backish vowel followed by a very front vowel followed by something with a low F3. But I'm getting
ahead of myself.

[ɹ̩], IPA 151 + 431
So I've mentioned the F3. In my head, this is syllabic, but I don't know why. I used to think I always had syllabic
approximants following 'falling' diphthongs, as in 'file' and 'fire'. But 'hire', 'higher' and critically 'choir' "feel"
monosyllabic to me. "Require" "re-choir". Dunno now. Hmm.

[kʰ], IPA 109 + 404
Now this is a [k]. The pinchiness is a little off, since it may be that the F2 and F3 seem to be rising out of the /r/. But
there's a nice double burst, the burst noise is centered in the fronted F2/F3 region. Dunno what's going on in the low
frequencies. I think I'm just too close to the microphone, since something weird is happening to my input these days.
The noise here looks a little sibilant, but the burst noise is just wrong for an alveolar.
Epsilon
[ɛ], IPA 303
Well, the F3 is definitely low, but in my head there's a separate vowel here. Maybe these should be transcribed as
diphthongs. But anyway, the F1 is middish, the F2 is very slightly front. And there's an /r/ coming up, so there's a
neutralization here anyway.
Turned R
[ɹ], IPA 151
Well, there it is.
Lower-case F
[f], IPA 128
Very slight frication, fairly broad band, unshaped by any resonances, and strongest, at least at the beginning, in the
very low frequencies. No way this can be sibilant. The F2/F3 transitions are consistent with a labial, but that could just
be the [r]. So if this turned out to be an interdental, I wouldn't be particularly surprised, although phonotactically it
would be odd.

[ɫ̩], IPA 209 + 431
Don't ask me. But this is pretty classic. The F1 is nondescriptly mid. F2 is about as low as it could be, F3 is distinctly
raised, relative to where it usually is. Lateral, and dark (velarized--the low F2). And since it's got less sonorous items on
each side, must be syllabic.
Lower-case M
[m], IPA 114
Nice voicing bar, but F1 is either 'gone' or so low it's in the voicing bar. Zero below 1000 Hz, nice little pole about 1000
Hz, more zero, then another pole in the neutral F3 range. Look familiar? It should. Flat, resonant, with zeroes, must be a
nasal. The 1000 Hz pole is pretty classically bilabial, and the F2 transition in the following vowel can't really be anything
but bilabial also. So this one is pretty clear.
Ash
[æ], IPA 325
So here we have another mid-to-low vowel (highish F1), a frontish but not completely convincingly front F2 moving, if
anything, toward neutral. Really can only be [ɛ] or ash.
Lower-case N
[n], IPA 116
Short enough to be a nasal flap, I guess. The nasal here explains the fuzziness of the F1 in the preceding vowel. Note the
zero, and the pole. Note also that even though the pole 'looks' like it's in the bilabial region, there's just a trace of
something right at 1500 Hz! Woo hoo, because that's the only thing about this that makes it look alveolar. That and the
flappiness, but I've been known to produce very flappy bilabials (no comments from the peanut gallery, please).
Schwa
[ə], IPA 322
Short vowel. Don't want to belabor it. Note the offset frequency of F2, near the 'locus' for alveolar transitions.
Lower-case T
[t], IPA 103
Which suggests that this gap is alveolar. Or close.
Yogh
[ʒ], IPA 135
Broadish band of voiceless noise, sharp energy drop off below F2 and concentrated in the F3 region. Too low to be [s]
noise, so must be postalveolar.
Lower-case M
[m], IPA 114
Well, turns out there must be something here, or there's no real reason for the F2 to 'dip' into the silence the way it
does (pointing down in the fricative and up in the vowel). I mean, if there were just the fricative, or just aspiration,
there'd be no reason for F2 to do anything except transition. So there's something here, perhaps weakly voiced. Could
be an approximant, but then it would have to be /r/, since it looks like F3 is low. But knowing what I do, I'll choose to
ignore that... Hindsight. So the weakness might be nasality, in which case, the F2 transitions look decidedly labial. But
this is hindsight too. Sorry.
Schwa
[ə], IPA 322
Vowel. Lengthened in a final syllable, but weak, very low pitched, and not really 'long'. So final lengthening
notwithstanding, unstressed and reduced.
Lower-case N
[n], IPA 116
But this last syllable, if reduced, is just too long to be just a vowel and a stop. So there must be something here.
Something weak. And potentially devoiced. But I have no idea how I'd tell what it is since there's not a lot of
information available.
Lower-case T
[t], IPA 103
So there's something that looks like a double burst in the low frequencies (just at 2100 msec), but there's nothing else
until you get up to sibilant frequencies. So on the balance this is probably a weak alveolar burst.
So putting it all together, you get "most require careful man()ch()t", where the ()s indicate some kind of vowel/syllable
affair. Shouldn't be too rough to come up with something plausible, and fill in the features later.

CANADA R3T 5V5
"Mushrooms are an edible fungus."
Or maybe that's "mushrooms are inedible fungus", now that I look at it again. Discuss.
Lower-case M
[m], IPA 114
Wow, more than 150 msec of nice flat nasal. Okay, so from about 50 msec to about 200 msec there's a nice sonorant (i.e.
nice, striated voicing bar, and resonances all the way up). Almost definitely a nasal, because of a) the zeroes at about
800, 1800, and 3000, b) the flat formant structure, and c) the sharp discontinuities with the following vowel (clearly a
vowel, since it's also obviously sonorant, and of higher overall amplitude, no zeroes, and transitiony-looking
transitions). So anyway, if it's a nasal, it must be [m], since the pole (formant) is at 1000-1100 Hz, which is typical of my
bilabial nasals. The transitions in the following vowel are also consistent with that, but are so short it's hard to tell.
Turned A
[ɐ], IPA 324
You may be wondering what happened to everybody's favo(u)rite vowel, [ʌ]. Well, I've been thinking about my
commitment to the IPA, and this has been bugging me. The IPA defines [ʌ] as a lower-mid, unround, back vowel,
Cardinal 14, the unround counterpart to [ɔ]. This is the vowel in 'hut', 'tuck', and especially STRUT, if you're into Wells's
lexical sets. It and [ʊ] are reflexes of ME short /u/ (and short(ened long) /o/). But enough of the history lesson. It's not
round. In general North American English, it's not amazingly back, and in Western American and Canadian English, it's
downright frontish. Not as front as front [æ], so let's split the difference and call it central. Which is not controversial.
But the symbol for a lowish/lower-mid central vowel is turned-a, [ɐ], not turned-v, [ʌ]. So are we transcribing
phonetically, using the IPA, or aren't we? I've decided we are. So here it is. Now, that said, this vowel looks like an [ɑ] or
even an [ɒ], but it's outrageously short. Which I suppose makes it look back again. So maybe I'm still a hypocrite.
Esh
[ʃ], IPA 134
That falling peak between 300-400 msec (falling from 2500 to 1500 Hz) is a bit worrisome, but it'll turn out all right. So
ignore its movement. We've got something that looks like a voiceless fricative, quite strong and sibilant, but with a
peak in the lower frequencies (in the F2-F3 range) rather than a lone peak way the heck off the top of the spectrogram.
So this is probably post-alveolar. Following that, there's a sharp drop-off in amplitude below 1000 Hz, which is more
typical of postalveolar than alveolar sibilants. The sloping peak is probably indicating some kind of transition....
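The alveolar/postalveolar decision used for sibilants here can be written as a rough rule of thumb. This is only a sketch: the function name and the 3500 Hz ceiling are hypothetical choices for illustration, not measured boundaries, and the peak list is assumed to be read off the spectrogram by eye.

```python
def guess_sibilant_place(peak_hz_list, low_peak_ceiling_hz=3500):
    """Guess sibilant place from the frequencies (Hz) of strong noise peaks.

    A strong peak down in the F2-F3 range points to postalveolar; a lone
    peak up at (or off) the top of the display points to alveolar.
    The ceiling value is an assumption made for this sketch.
    """
    if not peak_hz_list:
        return "unclear"
    if any(peak < low_peak_ceiling_hz for peak in peak_hz_list):
        return "postalveolar"
    return "alveolar"


# Hypothetical readings: a peak near 2500 Hz plus high-frequency noise.
print(guess_sibilant_place([2500, 6000]))
# A lone high peak, as for a typical [s].
print(guess_sibilant_place([6000]))
```

A real decision would also weigh the low-frequency drop-off and the transitions, which this sketch ignores.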
Turned R + Under-ring
[ɹ̥], IPA 151 + 402
Well, the loss of high-frequency and high-amplitude energy (i.e. sibilance) suggests that this is something else. The F1 is
invisible, since it's still coarticulating with the zero (or whatever it is) that zaps the low frequencies of the fricative. F2
and F3 are clearly visible in the noise (and contiguous with the F2/F3 of the vowel), and lo and behold, look at what that
peak in the fricative seems to be--something that follows the resonances from about 2500 Hz (what you might call the
neutral frequency of F3, or near enough), down to about 1600 Hz, which looks like the frequency of the F3. Which is
plenty low enough for an [ɹ]. But voiceless.
Barred U
[ʉ], IPA 318
I know this is round because I remember it being round at the time I recorded it and I spent time working out why--
what with the tendency of [ʃ] and [ɹ] to labialize, and the following bilabial (I don't want to get ahead of myself, but there I
go), I think this vowel just tends to get rounded. A little. But while this vowel is round, the F2 isn't really that low,
compared to the F1. So again, let's split the difference and call it roundish, but not back, or backish, but not round, or
just throw up our hands and say central. So that's what I did. F1 is a little high for something I think of as a high vowel,
making this look more mid, but hey, I pick my moments of IPA precision. I guess.
Lower-case M
[m], IPA 114
Shorter, and with considerably less energy than the earlier nasal, this still looks like a nasal. But, well, shorter, and with
considerably less energy. But the transitions are consistent with bilabial, and to the degree that we can see any energy
at all in the resonances, there might be a pole at about 1000-1100 Hz.
Lower-case Z
[z], IPA 133
Well, so this is a sibilant, with that high-frequency, high-amplitude peak. And it's the only peak, so we're looking at an
alveolar rather than a postalveolar. And I thought it was voiced when I did the figure but I'm not sure that the couple of
pulses at the beginning should count. But maybe they do. And it's shorter than a voiceless sibilant probably would be,
and weaker, sort of, both of which correlate with an 'underlying' voiced fricative. Okay, I'm just a complete hypocrite.
But I really think we should be using turned-a for the STRUT vowel.
Script A
[ɑ], IPA 305
The harmonics are getting in the way of this vowel, but I take the F1 to be the sort of peakish thing at about 900 Hz (as
opposed to the one at 500 Hz), and the F2 would be the one at about 1200 Hz. Ignore the diving F3 for the
moment. So we've got a very, very low vowel with a mostly back tongue position. Unless we have a mid vowel, but I
don't think we do.
Turned R
[ɹ], IPA 151
And here's what we do with the diving F3. Moving on.
Barred I
[ɨ], IPA 317
On the other hand, there's a transition after the F3 minimum that is a little long to be just a transition. So I've shoved in
an unstressed vowel. Moving on.
Lower-case N
[n], IPA 116
Well, there's something going on here. Fully voiced, but too weak to support any higher resonances. But too strong in
the voicing bar to be an obstruent. So some kind of unknown sonorant. Probably nasal, judging by the sudden loss of
energy from the preceding vowel. It just looks like an edge, of the kind nasals have but oral sonorants don't. The F3
transition (into it) is hard to read, since it starts so low it has no place to go but up. But assuming it's going up, the F2
isn't really pinching into it, nor is it obviously dropping bilabial-wise. So maybe this is an alveolar nasal. It would be
nice if we could see some resonance around 1500 (or anywhere 1300-1500) Hz, but you can't have everything....
Glottal Stop
[ʔ], IPA 113
Well, not so much a stop, as a creakiness at the end of the nasal and into the following vowel, but that's as close as we
usually see in my voice.
Epsilon
[ɛ], IPA 303
So the vowel looks like it's short and transitional, mostly in F2, but there's a shorter one coming, and it's unlikely they're
both completely stressless. So if we have to choose, let's look. The F1 is basically mid, although it's moving from slightly
higher to slightly lower, so the vowel is moving from lower to higher in the mid-range. The F2 is also in the central range, but
moving frontish (slightly) to backish (slightly). F3 is just neutral. So this is a middish, possibly lower-mid-ish vowel,
moving from frontish to centralish. Which is about all you can say.
Lower-case D
[d], IPA 104
Well, clearly voiced. Not really resonant, except for some mush in the upper formants. Could be a flappy type thing, but
is a little long, or a shortish stop. I went back and forth and decided on the stop. No pinch, no serious labial transitions,
so probably alveolar.
Schwa
[ə], IPA 322
Short little vowel, the F2 clearly all transition. Moving on.
Lower-case B
[b], IPA 102
Again, a voiced stop, this one even plosive-y-er than the other one, and sufficiently long to not really be a question. As
to place, there's not a lot of information. The F2 transition could be labial (it couldn't be much else), but it's also
consistent with just a vowel-to-vowel transition (check the formants of the following vowel). So who knows. Not velar.
Probably not alveolar. But that's a guess.

[ɫ̩], IPA 209 + 431
Well, I looked, and couldn't convince myself there was a separate vowel in here. The F1 is mid-looking, the F2 is
absurdly low, and the F3 is raised. The apparent zero between the F2 and F3 is probably just relative weakness in the
harmonics with nothing resonant to support them, rather than a real zero. If this were a nasal, I'd expect a) a weaker
voicing bar, b) no F1, and c) no higher resonances, given the range of the apparent zero. So this must be an oral
approximant, and the raised F3 suggests a lateral. Its length and the absence of anything you'd want to call a vowel on
either side suggests syllabic.
Lower-case F
[f], IPA 128
A non-sibilant fricative. Voiceless, and with no formant-like shaping. So this has to be labiodental or dental. Hard to
tell, but the transitions in the following vowel are more labial-looking than anything else. I guess.
Turned A
[ɐ], IPA 324
Ick. Okay, for the record, we're looking at the sort of fuzzy-formanted thing that's mostly transition, from about 1450
msec to where the F2 (or whatever it is) leaves off at about 1525 msec. F1 (not to be confused with the strong harmonic
over the voicing bar) is that thing that starts at about 600 and rises, sort of, to about 1000 Hz, maybe. The F2 starts really
low as well, let's say 900 Hz, and rises to about 1500 or so. F3 is lower than it was, and more or less flat, but it gets
fuzzier as it progresses. Okay, so the weakness in F1 and the increasing fuzziness in F1 (and the increasing weakness of
the inter-formant energy, and ultimately the formants as well) suggests increasing nasality. Just something to file away
for another segment. F1 is mid-to-high, so we're dealing with a lower-mid-ish kind of vowel. This one sort of back as
well, but I'm sticking to my guns on this one, at least for this spectrogram. Lowish central-to-backish vowel.
Eng
[ŋ], IPA 119
Backing perhaps helped along by coarticulation with a following velar, which is what this is. It's a nasal, and the only
real resonance is sort of in F3. But more important than that, there's a bit of a gap, with definitely velar transitions
following, and English nasal-place-assimilation being what it is, I'd say this was a velar nasal.
Lower-case G
[ɡ], IPA 110
That's assuming I can convince myself that there really is a gap here. Homorganic stops following nasals tend to be very
short, in terms of their apparent oral-plosive component, so I'd take this little bit of low-energy voicing around 1600
msec to be sufficient evidence of a plosive. And as I said, the transitions in the vowel can only be velar.
Schwa
[ə], IPA 322
Which leads us to the vowel, which if it ain't a 'real' vowel, it should be transcribed as a barred-i, following Keating et al
(1994) as I do. But if it isn't a reduced vowel, what is it? Well, it's definitely mid. And definitely central or front of
central. And the F3 is a little low, but that again may just be coarticulation with the velar (transitional). So something
schwa-like or epsilon-like, or somewhere in there....
Lower-case S
[s], IPA 132
Oooh, this is weak for a sibilant, but it definitely has that centered-off-the-top, broad band 'shape' of a sibilant
spectrum. Final weakening lives, I guess. Even though it seems to have that postalveolar low zero, it doesn't have the
lower (F2-F3) peak. So this has to be [s].

"The store opens daily at 10am."
Eth
[ð], IPA 131
Unfortunately, there's a sharpish release looking thing in this, followed by some voicing and frication. There's a little bit
of noise, suggesting prevoicing, down at the bottom before the first 'pulse' thing, but not so much that it really tells us
there must be something going on before this. But what we can see of the voiced area is weaker than the following
vowel (as short as it is) and noisy, so it's a fricative. Voiced. And not sonorant, what with the formant structure showing
through. So that only leaves a couple of possibilities, and only one is likely to look like a stop in initial position.
Schwa
[ə], IPA 322
So the first full period of this vowel comes on at about 125 msec, and the last one is about three or four pulses later. So
this vowel is absurdly short. What do we always say about absurdly short vowels? They're reduced. Mark them as some
kind of reduced vowel, and move on.
Lower-case S
[s], IPA 132
Nice long fricative from at least 175 msec to 250 msec. Broad band (no formant-like banding, just one big band)
apparently centered off the top of the spectrogram. Typical for [s].
Lower-case T
[t], IPA 103
The gap and release burst here are nicely indicative of a plosive. The release noise is sibilant looking (high amplitude
and broadband) typical of an alveolar release. The formant transitions in the following vowel are consistent with that.
But the short VOT means this is unaspirated.
Lower-case O
[o], IPA 307
Ignoring the F2 transition, let's pick up this vowel around 400 msec. It seems to go on to about 550 msec, which is when
the F2 starts to change and the F3 hits its minimum. So that's where I marked the end of the segment. F1 is about 500,
so mid-ish, F2 is about 1000 Hz, so backish or roundish.
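The F1/F2 reasoning used for vowels throughout these solutions (lower F1 means a higher vowel, lower F2 means backer and/or rounder) can be sketched as a small lookup. The function name and every cut-off below are hypothetical round numbers picked for illustration; they are not boundaries claimed anywhere in the text, and real values depend on the speaker.

```python
def describe_vowel(f1_hz, f2_hz):
    """Return rough (height, backness) labels from F1 and F2 in Hz.

    All thresholds are assumptions made for this sketch only.
    """
    # F1 is inversely related to vowel height.
    if f1_hz < 400:
        height = "high"
    elif f1_hz <= 650:
        height = "mid"
    else:
        height = "low"

    # F2 falls with backness and with rounding, so a low F2 is ambiguous.
    if f2_hz > 1700:
        backness = "front"
    elif f2_hz >= 1300:
        backness = "central"
    else:
        backness = "back or round"
    return height, backness


# The values quoted for this vowel: F1 about 500 Hz, F2 about 1000 Hz.
print(describe_vowel(500, 1000))
```

This deliberately leaves out F3 and the transitions, which do a lot of the work in the actual readings.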
Turned R
[ɹ], IPA 151
The F3 here is at about 1700 Hz. Such a low F3 can only be an [ɹ].
[oʊ], IPA 307 + 321
So the F1 hasn't really moved from the preceding two segments, and the F2 is roughly back to where it was, but heading
down. Whether this is really diphthongization or just the transition into the following gap, I have no idea. But since
people seem to like their diphthongs....
Lower-case P
[p], IPA 101
Well, as I suggested before, the drop in the preceding F2 could be interpreted as a labial transition. The F3 transition
could also be interpreted that way, but that might just be wishful thinking on my part. Similarly, the vowel on the other
side is too short to provide much in the way of transitional information. So let's see, what else could we use. Well, the
release burst is sort of mushy, so it's probably not coronal. And the concentration of energy seems to be in F1 and F2,
rather than F2 and F3, so again that might tell us labial. I think that noise at the bottom is just noise, but if you
interpreted it as voicing I guess I couldn't fault you. But it would lead you down a garden path....
Schwa
[ə], IPA 322
Barely three pulses of vowel. Look at the F1 die out. Reduced. Moving on.
Lower-case N
[n], IPA 116
Notice how the apparent F1 (around 500 Hz) is suppressed, but the voicing bar below it still looks nice and strong. That's
typical of nasals. Full voicing bar with suppressed upper frequencies. There's a resonance around 1300 Hz or so, and a
fairly strong one around 2500 Hz. No evidence of velar pinch on either side, and the pole around 1300 is far enough
away from the 1000 Hz I usually expect for bilabial nasals, so I'd say this was alveolar.
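The pole-frequency reasoning for nasal place can be put as a rough rule of thumb. A minimal sketch, assuming the ranges mentioned in these write-ups for this one voice (about 1000-1100 Hz for bilabial, about 1300-1500 Hz for alveolar); the function name is hypothetical and the ranges are not general category boundaries.

```python
def guess_nasal_place(pole_hz):
    """Guess nasal place from the first pole above the voicing bar (Hz).

    The ranges are assumptions taken loosely from the text for a single
    speaker; anything outside them is reported as unclear.
    """
    if 1000 <= pole_hz <= 1100:
        return "bilabial"
    if 1300 <= pole_hz <= 1500:
        return "alveolar"
    return "unclear"


# The pole described for this segment sits around 1300 Hz.
print(guess_nasal_place(1300))
# A pole in the 1000-1100 Hz range, as described for the [m] segments.
print(guess_nasal_place(1050))
```

In practice the transitions in the neighbouring vowels (pinch, or the lack of it) carry at least as much weight as the pole.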
Lower-case Z
[z], IPA 133
Now around 850 msec or so, the voicing bar loses energy, but keeps its striated quality. So whatever this is, it's voiced.
And the loss of energy suggests an obstruent, i.e. something that doesn't resonate easily. The noise at the top of the
spectrogram looks sibilant, at least band-wise and frequency-wise, so this is probably a [z].
Lower-case T
[t], IPA 103
Short little gap, from about 900 msec to about 950 msec or so. The burst is a little mushy again, but it's obviously
centered up high (in the 3500-4000 Hz range) which is high enough to be an alveolar release. The F2 and F3 transitions
are consistent with that, no diving into the gap and no pinchiness.

[eɪ], IPA 302 + 319
Well, okay, this looks like another short vowel, but it turns out to be a really short nucleus and a long offglide. So. If
you look at the obviously vowel part, just around 1000 msec, we've got a middish F1, and an F2 that looks front
(relatively high) moving fronter. So mid and front. Taking the next part, the F1 lowers just a little, so goes a little higher,
and the F2 just zooms up to a peak around 2200Hz, which is really the range only of [i]. So this is a diphthongy /e/.
Whether it 'counts' as a diphthong, I don't know, but there it is.
Tilde L (Dark L)
[ɫ], IPA 209
So it's worth noticing that around 1100 msec the F2 reaches a minimum and sort of loses cohesion. The F1/voicing bar
also sort of dies off, but slowly, but clearly clicks back on just before 1200 msec, which is the same moment that the F2
comes back. So taking F2 off/on as the edges of 'something', we can see that the energy above is also suppressed (and the
formants more diffuse) for that same duration. So this is a thing. The lessened energy suggests some closure
somewhere, but the presence of low frequency energy (between F1 and F2) suggests an oral sonorant rather than a
nasal (which should have a zero somewhere in there). So the F1 appears to be in the middish-highish range (that is, is
500 Hz or below). The F2 is a little back (below 1500 Hz). The F3 is a little raised, which is usually indicative of a lateral. So
we've got a darkish /l/. Yay.
Lower-case I
[i], IPA 301
Remember what I said about an F2 around 2200 Hz? That would be useful to remember here. The apparent zero betwixt
F1 and F2 is probably just due to the very widely spaced formants, and the overall lower amplitude of this vowel
compared to others. The F2 transition which makes it look like an [eɪ] is just a transition from the low F2 of the dark /l/.
Exactly how you would tell the difference, I don't know. Maybe the F1. If we could be sure where it was...
Ash
[æ], IPA 325
But then there's the transition after the F2 peak, and this is just too long to be just a transition. So it's gotta be another
vowel. But what kind of vowel? The F1 is still hovering around mid, and the F2 is still basically front, if not amazingly
front. So this could be another /e/, or even lower. I think the height is coarticulatory, i.e. with a high vowel in hiatus it
doesn't actually get low. Sure felt low when I did it. But definitely front. Hmm. Lucky for us this is a function word....
Glottal Stop
[ʔ], IPA 113
See how towards the end here the F1 sort of dies, the pulses in the upper frequencies seem to come every other
striation in the voicing bar. That's shimmer, folks, a reflex of glottalization, which suggests a) a syllable-final plosive, b)
probably [t]. But this is creak, so we'll call it a glottal stop.

[tʰ], IPA 103 + 404
OTOH, there's a serious gap following the creak, so there's still a plosive here. Now see that sharp, high-amplitude
burst? See how its energy 'tilts' toward the high frequencies? This has to be coronal. Unless the transitions point
elsewhere, which, thank heavens, they don't. And a long VOT, so aspirated. Whee!
Epsilon
[ɛ], IPA 303
Teeny short voiced vowel, and according to accepted rules we should just regard this as reduced. But if it weren't
reduced, what would it be? Well the F1 is sort of in the midrange, or maybe higher, so this is mid-to-lowish kind of
vowel, but the F2 is definitely higher than neutral, i.e. telling us this vowel is more front than anything else. So a mid-
to-low front vowel of some kind. Hmm.
Fish-Hook R + Tilde
[ɾ̃], IPA 124 + 424
Haven't had one of these for a while. This, folks, is a nasalized flap, such as you almost only get in North American
English. The usual flap is a super-short plosive thing, so it should look like a short gap, or at best a little noise where
you're expecting a gap. This looks like a sonorant. It's fully voiced, if slightly reduced amplitude. See how it has 'edges'
like a classic nasal stop, but it's so short? See how it has a zero-ey thing around 1000 Hz and a pole-like thing at about
1400 Hz or so? See how the resonances are flat? See how the upper frequencies are vastly lowered amplitude? Looks like
a nasal. An [n] in fact. But it's so bleeping short! Flap, folks. Or tap. Whichever. But nasal.
Ya gotta love spectrograms.

[eɪ], IPA 302 + 319
Well, does this one look at all like the previous one? Not really. It's longer and more stretched out, but its spectrum is
similar. Its F2 has a similar frequency range, but moving over a longer time. And we are moving toward the end of the
utterance. Notice how the F1 has fuzzed out to practically nothing from here to the end? Hmm. Broadened band F1.
Must mean something....
Epsilon
[ɛ], IPA 303
Well once again, we're faced with something that's too long to 'just' be a transition. So what is it? Who knows where F1
is? Could be around 700-750 Hz. Or I guess it could be somewhere else, but for lack of a better idea, let's suppose this is
the F1 of a mid-to-lowish vowel of some kind. The F2 is strongest as it approaches the midrange (1500 Hz or so) from
above, and then it starts to lose some integrity. Also around 1850 msec, the F1 does something odd. So those last 50-75 msec
or so (approaching 1900 msec) are probably more 'transitional' than the rest of it at least. So if we take everything
before that as non-transitional, we've got something that seems to be vaguely frontish. Not wildly frontish, but vaguely.
So frontish and not higher than mid. Hmm. And there's that fuzzy F1 again.
Lower-case M
[m], IPA 114
A-ha! I hear you cry! A final nasal! Flat resonances, full voicing, overall lessened amplitude, and a nice clear zero
between the voicing bar and the first pole. The first pole is just above 1000 Hz, which puts it closer to my [m] range
than any previous nasal in the spectrogram. And the final nasal explains the fuzziness of the F1. There's a zero creeping
in to the resonances, which is broadening the bandwidth of F1. Whence the fuzzies! Don't you love it when things come
together like that?

"Silly people walk too close."
I like this spectrogram because you have to separate your attention to the formant and the voicing bar. Just a hint.
Lower-Case S
[s], IPA 132
Well, this is a decent sibilant. It's got more resonance structure than I prefer, I don't know what was going on in my
mouth that day. But the strongest bit of energy is wa-a-ay up off the top, which is a good cue for [s]. The amplitude
(darkness) is consistent with sibilance, and the broad band (ignoring the resonant structure) is typical as well.
Small Capital I
[ɪ], IPA 319
This is so short, it probably should be treated as reduced, but you can sort of tell that it's the local pitch peak, which
suggests that it's stressed. So whatever. The F1 is low of 'mid', the F2 is, well, moving. Part of the problem is that the
following sound is throwing off the expected acoustics of this vowel. So whatever. Fill in the features later, I guess.
Tilde L (Dark L)
[ɫ], IPA 209
I've been doing a lot of these lately. The overall amplitude here is slightly less than the surrounding vowels, although
not by much. The F1 fuzzes out a little. The F2 hits a minimum of about 1100 Hz at about 325 msec. F3 is raised to about
2800 Hz. Oooh. Raised F3 almost always means lateral. The lowish F2 is consistent with a) the dark /l/, and b) the
surrounding front vowels.
Lower-Case I
[i], IPA 301
There are still transcription guides that insist that unstressed -y as in 'city' is always [ɪ]. It ain't. Transcribe what you
hear, not what you have been told to transcribe. Okay, so this again is a relatively high vowel (F1 is lower than 500 Hz.
Much lower, actually.), F2 is way high up above 2000 Hz. Can really only be [i].

[pʰ], IPA 101 + 404
So from 400 to 525 msec there's a pretty clear gap. Voiceless (no energy in the very low frequencies), and if you notice
all the formants in the preceding vowel fall as you move toward the closure. Also, they all seem to rise out of it during
the aspiration. So this is probably bilabial. The noise in the aspiration (from 525 to 600 msec or so) is tilted a little high,
but the rising transitions really only say bilabial. That and the low-frequency noise (absent any noise above it until you
get up to the F2 transition) is pretty bilabial looking as well.
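The timing quoted for this plosive (gap from 400 to 525 msec, aspiration from 525 to about 600 msec) works out as below. A minimal sketch: the function name is hypothetical, and the 30 msec cut-off between short and long VOT is an assumption for illustration, not a figure from the text.

```python
def plosive_timing(closure_start_ms, release_ms, voicing_onset_ms):
    """Return (closure duration, voice onset time) in milliseconds."""
    closure_ms = release_ms - closure_start_ms
    vot_ms = voicing_onset_ms - release_ms
    return closure_ms, vot_ms


# Hypothetical threshold separating short-lag from long-lag (aspirated) VOT.
ASPIRATION_THRESHOLD_MS = 30

# Times read off the description above: gap 400-525, aspiration 525-600.
closure_ms, vot_ms = plosive_timing(400, 525, 600)
label = "aspirated" if vot_ms >= ASPIRATION_THRESHOLD_MS else "unaspirated"
print(closure_ms, vot_ms, label)
```

With these readings the closure comes out at 125 msec and the VOT at 75 msec, which is comfortably long-lag under any reasonable threshold.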
Lower-Case I
[i], IPA 301
Well, we still have our low F1, in fact possibly the lowest of the entire spectrogram, F2 is close to 2400-2500 Hz, which is
about as high as I've ever seen it. F3 is sort of pushed out of the way. So this has to be [i].
Lower-Case P
[p], IPA 101
Another gap. With falling transitions into it and for the most part rising transitions out of it. At least F3 and F4. Also you
can sort of see a burst that is stronger in the low frequencies than the high frequencies. That's also a cue for bilabial,
sometimes. No aspiration this time, tho.

[ɫ̩], IPA 209 + 431
Well, if this were a vowel, which is what it looks like, it would have a mid, or just low of mid, F1, a very low F2 (for my
voice) around 1000 Hz, and an F3 way the heck above anything it should ever need to be. So if this were a vowel, it
would have to be something middish and very back and/or round. Maybe [o]. But that wouldn't explain the high F3....
On the other hand, this is clearly the sonority peak of this syllable, so what are we going to do?
Lower Case W
[w], IPA 170
So from 900 msec for about 50 msec, we've got a serious reduction in aperture, resulting in suppression of acoustic
energy. Actually, it starts earlier than that, but you can see it really kills the first and second formant in this section.
With a low F1/F2 like this, it really can only be either dark /l/ or /w/. Given that the 'peak' of the F3 movement looks
like it's here rather than before, you might have found this string to be [pol] and not [plw], but then you would have
been mistaken. One way or another. There's not a lot about this that looks particularly [w] like (relative to a dark /l/)
except for its extremity of F1/F2 lowering.
Script A
[ɑ], IPA 305
It's always hard to decide what to do with moving formants, but here goes. I usually ignore the first and last fifth or so,
so you're really only looking at the middle 2/3s or so of the vowel (do the math yourself, if you care). This allows you to
ignore the obvious effect of very local transitions. With something like this, that doesn't quite do it, so we'll have to
move on. F1 starts (absent the worst of the transition) in roughly mid position, and rises to very high, up around 900 Hz.
So this vowel mostly occupies the lower part of the vowel space. The low starting frequency is attributable to
coarticulation with the preceding [w], so ignoring the last bit of the transition, the 'target' here seems to be around 800
Hz or so. The F2 again starts absurdly low due to coarticulation, but kind of levels out around 1000 Hz. So we've got
something with a lowish quality, and very back and/or round. This being my voice there's only one vowel back there,
really.
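The habit of ignoring the first and last fifth or so of a moving formant can be sketched as a simple trim. This assumes a formant track stored as a list of frequency samples, one per analysis frame; the function name and the sample values are hypothetical.

```python
def trim_transitions(track, edge_fraction=0.2):
    """Drop the first and last `edge_fraction` of a formant track.

    `track` is a list of formant frequencies (Hz), one per frame.
    With the default of one fifth per edge, the middle three fifths
    of the samples are kept.
    """
    if not 0 <= edge_fraction < 0.5:
        raise ValueError("edge_fraction must be in [0, 0.5)")
    n = len(track)
    cut = int(n * edge_fraction)
    return track[cut:n - cut]


# Hypothetical F1 samples (Hz) for a vowel rising out of a [w].
f1_track = [450, 520, 600, 700, 780, 800, 810, 800, 850, 900]
middle = trim_transitions(f1_track)
print(middle)
```

As noted above, with a vowel that moves this much the trim alone does not settle the target; it only removes the most local transition effects.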
Lower-case K
[k], IPA 109
Well, we've got a gap, from just before 1100 to about 1150 msec, with some bursty releasey stuff following up to about
1200 msec. The transitions have a falling F3 but a flattish F2 (and F4, if it comes to that). So the falling F3 might say
bilabial, but then we'd expect to see more falling formants, especially in F1 and F2. So this is probably velar. There's no
reason for a coronal plosive to have a falling F3 like that, and while not strictly 'pinch'y, it's as close as we're going to
get. The strong noise in the F2 range is also consistent with velar release, although the higher frequency noise (in F4) is
distracting, I admit.

[tʰ], IPA 103 + 404
Now we immediately find another gap, with a long, aspirated release. The noise is vaguely [s] shaped, which is sort of
the point. Again, there's a little more formant-shaping than I'd like but you get that in aspiration noise rather than
clear sibilance. He says. At least the release noise is consistent with coronal transitions. Can't see the F2, but the F2
seems to start right about 1800 Hz, and the F3 is pretty flat and neutral.
Barred U
[ʉ], IPA 318
Ah, my favo(u)rite vowel. Sort of. Somebody asked on PHONET recently about the difference between barred-i and
crossed-u. And I don't know what it is. This is actually my version of post-coronal /u/ (which has merged with post-
coronal /ju/) into this thing with a frontish onglide, and a backish/roundish, but not amazingly back or round offglide.
Nice, straight F2 transition. Anyway, I remember being careful to round this, so that's why I chose [ʉ] as opposed to
anything else.
Lower-case K + Right Superscript H+ Tilde L (Dark L) + Under-ring

[kʰɫ̥], IPA 109 + 404 + 209 + 402
There was no way to segment this, so I just jammed the aspirated plosive and the voiceless approximant/fricative thing
together. Sorry, but it's been a rough month. So we seem to sort of have some kind of gap. Somewhere. Followed by a
long period of aspiration and voicelessness. The only clues to place here are the strong bit of noise in F2 in the release,
which is typical of velars. Although again the [s] shape to the noise is distracting. The only clue to the lateral is the
absurdly raised F3 (also the F4). Sorry.
Lower-case O
[o], IPA 307
So here again we have a middish F1, a quite low F2 and a fairly high F3. The fundamental is lower here, so the formants
are all a little broader, but this looks a lot like the previous dark /l/. But of course it isn't. Not sure why the F3 is so
consistently high and flat here. Note how flat this is. Not really diphthongy at all, at least until you get to the last few
pulses. There's no reason for the F2 to drop like that unless something was going on, but for the most part this vowel is
pretty flat.
Lower-case S
[s], IPA 132
Now this is a decent looking [s]. Its length is attributable to final lengthening, and its relative lack of amplitude is also
consistent with being at the end of utterance. It's still pretty strong as fricatives go, tho. This is what I mean by one
really wide band. There's very little shaping to this at all, and the center frequency of this is somewhere off the top.

"They liked the warm sunshine."
Eth + Raising Sign

[ð̝], IPA 131 + 429
Well, there's some voicing that starts at about 75 msec. It looks a little like closure voicing, but if it's a closure, it doesn't
really do a good job of staying closed. Which is not to say my closures are usually good at staying closed, because
they're not. But if you're a fan of these things, you already know that. But there's a long (relatively speaking) noisy
phase at/near the 'release', which just doesn't look 'release'y. So that's our evidence of frication. I'm not sure what else
to say, except how many non-sibilant fricatives are there, not obviously labial, and subject to this kind of fortition.
Hmm.

[eɪ], IPA 302 + 319
Beginning at about 125 msec and going on to almost 250 there's a nice clear first formant right around 500 Hz. And flat.
Ooh, ya gotta love flat. The F2 is a little odd. Starting at about 1750 Hz and rising to around 2100 or so, and then sharply
falls off to below 1500 Hz when something else starts happening. So let's take each of those bits in turn. The 1750 is
consistent with something front, and given the F1 (which tells us this vowel is mid-ish) this is pretty significantly front.
(For a higher vowel, with a lower F1, F2s can get much higher than this before they get 'really' front, but with this F1,
the F2 range is more limited). So we've got something middish and frontish, and it stays mid but the F2 moves up,
indicating forward movement. So on the balance the first part of this is [eɪ] or something like that. I'll assume therefore
that the rest of it, with the falling F2 is transitional, since there's clearly a low F2 target coming up later. I mean, you
have to get there somehow.
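The height and frontness reasoning above can be written down as a rough sketch. The function name and the 100 Hz "flat" tolerance are assumptions made for illustration; the formant values are the approximate ones quoted in the text.

```python
def describe_glide(f1_start, f1_end, f2_start, f2_end, flat_hz=100):
    """Describe vowel movement from formant start/end values in Hz.

    flat_hz is an assumed tolerance: changes smaller than this count as flat.
    """
    def direction(delta, up_label, down_label):
        if abs(delta) < flat_hz:
            return "flat"
        return up_label if delta > 0 else down_label

    # Falling F1 means the vowel is getting higher (closer); rising F1, lower.
    height = direction(f1_end - f1_start, "lowering", "raising")
    # Rising F2 means movement forward in the vowel space; falling F2, backward.
    backness = direction(f2_end - f2_start, "fronting", "backing")
    return height, backness


# First part of the vowel above: F1 flat near 500 Hz, F2 from about 1750 to 2100 Hz.
print(describe_glide(500, 500, 1750, 2100))   # ('flat', 'fronting')
# The later [aɪ]: F1 from roughly 750 down to 500 Hz, F2 from 1200 up to 2100 Hz.
print(describe_glide(750, 500, 1200, 2100))   # ('raising', 'fronting')
```

Only the direction of movement is reported. Deciding which vowel symbol fits still takes the kind of judgment described in the prose.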
Tilde L (Dark L)
[ɫ], IPA 209
So there's this sonorant consonant between 250 and 325 msec or so. It's fully voiced and resonant. The lowered energy
leads to what looks like a zero around 2000 Hz, but there's too much energy below to really be a good nasal. So I think
that apparent zero is just the low energy dropping off the end of the visible scale. So anyway, this is probably oral. With
consonants we don't worry about the F1, usually, because there's not much variation across types. Close-to-closure is
close-to-closure, after all. The F2 is swooping to a low of something like 1000 Hz at or near 325 msec, that is towards the
end. That probably means something in terms of prosody but I'm not sure what. The F3 is really interesting though. It
rises from the beginning to the end of this consonant. What do raised F3s mean? Right. Lateral. Probably. Consistent
with the low F2 (I only have fairly dark /l/s after all), and the overall intensity. Good spotting.

[aɪ], IPA 304 + 319
Okay, so the amplitude becomes appropriately vowel-like again around 325 msec and stays that way to about 500 msec.
The F1 starts (except for the first 25-50 msec of transition) around 750-800 Hz. While moving, the corresponding part of
F2 starts at about 1200 and moves up to 1500 Hz. It continues to rise to a peak at 2100 Hz or so, while the F1 dives,
perhaps in transition, to a low just about 500 Hz at 500 msec. So we've got something that starts moderately low (higher
F1) and rises slightly, and quite far back and/or round (low F2) and shoots forward in the vowel space. Now part of the
lowness of the F2 starting frequency is coarticulation with the backness of the preceding dark /l/, but there's no
getting around the backish bit. So we've got something that moves from low and back to high(er) and front. It's worth
noticing the falling F3 ...
Lower-Case K
[k], IPA 109
... because combined with the rising F2 we've got something that looks like velar pinch. If this transition were bilabial,
then the F2 would have to come down, at some point. If it were alveolar, there's no reason for the F3 to come down. So
that transition must be velar. So from 500 msec to at least that release noise thing around 550 or 575 msec, this must be
a velar plosive. Looks pretty voiceless.
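The process of elimination above (F2 up with F3 down means pinch, F2 down means labial, otherwise alveolar) might be sketched like this. The function and its trend labels are hypothetical, and the rules are a deliberate simplification of what real transitions do.

```python
def guess_place_from_transitions(f2_trend, f3_trend):
    """Guess the place of an upcoming stop from formant trends in the preceding vowel.

    Each trend is one of "rising", "falling", or "level". The rules are a
    simplification for illustration, not a complete theory of transitions.
    """
    if f2_trend == "rising" and f3_trend == "falling":
        return "velar"      # F2 and F3 converging: velar pinch
    if f2_trend == "falling":
        return "bilabial"   # a labial closure pulls F2 down
    if f2_trend in ("rising", "level") and f3_trend in ("rising", "level"):
        return "alveolar"   # nothing drags F3 down before an alveolar
    return "unclear"


print(guess_place_from_transitions("rising", "falling"))  # velar
```

Coarticulation can blur every one of these cues, so a result of "unclear" is a perfectly honest answer.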
Lower-Case T
[t], IPA 103
On the other hand, from that release thing there's more gap up to about 625 msec. There's some clunks which might be
release noise, in the high frequencies, but they're not very loud. They are consistent with the real noise from 625-700
msec. This noise is [s]-shaped. It's very loud, and loudest at the highest frequencies. It's a little disturbing that the noise
dies off below 1500 Hz, which makes it look like a post-alveolar rather than an alveolar. But the concentration of energy,
such as it is, in F4 and above rather than below is probably the best cue for alveolar-ness. So if this is just the release of
the stop, then it must be alveolar.
Eth + Raising Sign

[ð]̝ , IPA 131 + 429
Well, here's something odd. Voicing starts at about 700 msec, and something 'happens' about 25 msec later. Then the
voicing settles down, but with some very high frequency, but very low amplitude, noise at the top. Then there's three
pulses or so of real noise around 800 msec. Hmm. Well, I'd be stumped. Voiced, probably obstruent. Now what. Well, the
transitions into the following vowel are all basically alveolar. In fact there's something odd about the F3/F4 being so
high. Higher than the lateral. But whatever. Lateral fricatives aren't really an option in English. Dental ones are. And
subject to a lot of fortition at the beginnings of some constituents.
I confess I chose this phrase because I was interested in this sequence of consonants. Now I wish I hadn't. But
challenges help us grow, right?
Schwa
[ə], IPA 322
The vowel, such as it is, is weak in amplitude, and the formants are all in transition. So this is a classic reduced vowel.
Call it schwa and move on.
Lower Case W
[w], IPA 170
Well, we have a problem. From about 900 msec to about 1000 msec (or 1050, depending on where you want to draw the
line) there's something that's clearly sonorant. The voicing is full and resonant. On the other hand, there's no energy
above 1000 Hz, and precious little between 600-1000. So we don't have a lot to go on. So then we need to look at the
transitions. The F1 starts to fade out, but it seems to be headed down from about 500 Hz and back up again on the other
side. So let's suppose it's heading to someplace like a close vowel. The F2 in the schwa is falling, and when the F2 finally
fades out, it's about 800 Hz or so. It seems to kick on again even lower on the other side. F3 falls from high to neutral,
and is still headed down by the time it kicks on again. So we've got something very close (consistent with a high vowel
or an approximant), with a very low (back/round) F2. Lower even than it gets with the dark /l/. So backer/rounder
than that. And nothing really going on in F3 specific to anything else. So this is probably a [w].
Lower-Case O + Rhoticity Sign
[o˞], IPA 307 + 419
Well, let's suppose this starts around 1000 msec and goes on to about 1125. (I guess I forgot to stick in a segment mark
and recenter the vowel symbol. Oops.) So we've got mid or higher-mid F1 (i.e. near or low of 'neutral'), and a very low
F2 (indicating something round and back). That's easy. The F3 is way low for a normal vowel, hence the rhoticity sign.
Turned R
[ɹ], IPA 151
And my favo(u)rite approximant. F1 heaven only knows where, let's say around 500 Hz. F2 rising to about 1300-1400 Hz.
F3 falling to a low of 1600 Hz or so. You just don't see F3s that low with anything except North American-style
approximant [ɹ]s.
Lower-Case M
[m], IPA 114
So just shy of 1200 msec, the amplitude drops suddenly. Things stay sort of constant to about 1250 msec when
'something' happens. So let's talk about that stretch and ignore the rest for the moment. The sudden amplitude drop is
characteristic of nasals, so my guess is that's what we're dealing with. There's a zero around 750 Hz, although there's
not much to it. There's a pole around 1000 Hz. I'd be happier if there weren't apparently another pole around 1500 Hz,
which makes this more ambiguous. But the lower one is stronger, so I'll pretend that's the one we're supposed to pay
attention to. (This is cheating. If I thought it was supposed to go the other way, then I'd ignore this one. As Peter
Ladefoged used to say, sometimes "you have to know what you're looking at before you can look at it," or something
like that.) Anyway, in my voice, a pole around 1000 Hz is indicative of a bilabial nasal. (The higher pole around 1400 Hz
would be indicative of an alveolar.) The transitions in the previous segment are consistent with a bilabial (notice how
the F3 just keeps falling and the F2 seems to drop just a hair in the last few msec before the nasal kicks on). The
lowering effect of the /r/ confounds that, but if the following sound were really alveolar, I'd expect both those last
transitions to be just a little bit upward. Or at least to level out.
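As a rough sketch of the pole rule of thumb, here is a nearest-reference lookup. The two reference frequencies are the speaker-specific values mentioned above (about 1000 Hz for bilabial, about 1400 Hz for alveolar); the function name is made up, and none of this should be read as applying to other voices.

```python
# Reference pole frequencies in Hz for this one speaker, taken from the remarks
# above. They are not general-purpose values.
NASAL_POLES_HZ = {"bilabial [m]": 1000, "alveolar [n]": 1400}


def guess_nasal_place(pole_hz):
    """Return the place whose reference pole is nearest to the measured pole."""
    return min(NASAL_POLES_HZ, key=lambda place: abs(NASAL_POLES_HZ[place] - pole_hz))


print(guess_nasal_place(1000))  # bilabial [m]
print(guess_nasal_place(1500))  # alveolar [n]
```

When two poles show up at once, as they do here, the lookup cannot say which one to trust. That decision still falls to the transitions and, failing that, the lexicon.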
Lower-Case S
[s], IPA 132
This is a better [s], spectrally speaking, even if it is a little lacking in amplitude compared with the noise in the previous
[t] release. But this is a pretty classic [s]. A single, very broad band of noise, extending from bottom to top, with very
little resonant-like shaping. The single, broad band is centered off the top of the spectrogram, so above 4500 Hz. If we
could image the higher frequencies, the center could be anywhere between 6-8 kHz, maybe up to 12 kHz. But whatever,
higher than we can see.
Turned V
[ʌ], IPA 314
Vowel. That's all. Vowel. Fully voiced, right amplitude, resonances all the way up. Formants? Well, I don't know. My best
guess is that F1 is around 750, or at least somewhere between 500 and 1000 Hz. F2 is around 1250 Hz, or at least between
1000 and 1500 Hz. F3 is raised a little, but since this is a vowel that doesn't tell us a lot. Okay so we've got something
mid-to-low and central-to-back. Or somewhere in that area. Turned V is the traditional symbol used in North America
for this vowel, but I'm not sure it's the right one.
Lower-Case N
[n], IPA 116
Another nasal. This one has a nice clear zero from 600-1300 Hz, and then there's a pole, very faint, but it's there. Around
1300 Hz. Close enough.
Esh
[ʃ], IPA 134
Now here's another sibilant. High energy, and mostly high frequency. This one being a fricative we'll pay attention to
the loss of energy below F2. And again this is ambiguous, but there's a little extra energy in the F3 area, and maybe
again in F4. So this fricative has lower-frequency center(s) than the previous [s], and has more resonance-y
organization. So the lower center, especially in F3, and the loss of energy below the F2, are classic [ʃ] markers.
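One way to put a number on "lower-frequency center" is an energy-weighted mean frequency. This is a minimal sketch with invented band energies, not measurements from the spectrogram, and it only covers the visible range, so it says nothing about energy above the top of the display.

```python
def spectral_centroid(bands):
    """Energy-weighted mean frequency of (frequency_hz, energy) pairs."""
    total_energy = sum(energy for _, energy in bands)
    if total_energy == 0:
        raise ValueError("no energy in any band")
    return sum(freq * energy for freq, energy in bands) / total_energy


# Made-up energies up to 4000 Hz (not measured from any spectrogram here).
s_like = [(1000, 1.0), (2000, 2.0), (3000, 3.0), (4000, 4.0)]    # keeps getting louder toward the top
esh_like = [(1000, 0.5), (2000, 3.0), (3000, 3.0), (4000, 1.5)]  # peak in the F2/F3 region, weak below

print(spectral_centroid(s_like))    # 3000.0
print(spectral_centroid(esh_like))  # 2687.5
```

The [s]-like shape comes out with the higher center even within this limited window, which matches the visual impression of noise that keeps getting stronger toward the top.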

[aɪ], IPA 304 + 319
Now this is what a standard [aɪ] looks like. Very high F1 with a short transition at the end. Nice low F2 rising sharply to
the front space. Ah.
Lower-Case N
[n], IPA 116
Well, there's an abrupt change in amplitude just before 2000 msec. And basically all the energy got sucked away as a
result. So this has to be a nasal. Nice little voicing bar, nice little zero, and then no (visible) pole to tell us anything. So
we'll have to look at the transitions. And the really obvious thing is that the F2, after climbing, drops sharply into the
nasal. It seems to point to that 1700-1800 Hz 'locus' for alveolars, rather than lower down for a bilabial. The F3
transition is a little ambiguous, in that it also seems to drop a little. But in the end [n] is a better guess.
Robert Hagiwara, PhD

Winnipeg, Manitoba
CANADA R3T 5V5
"Mice in cartoons eat cheese."
Lower-Case M
[m], IPA 114
Starting at 75 msec and going on until about 150 msec, we've got a nice little sonorant happening. It's got a nice, clear
voicing bar at the bottom, and resonances at the higher frequencies. The sharpness of the edge (of the following vowel),
the overall lowered energy (relative to the following vowel), the presence of a nice clear zero (around 750 Hz) and
mostly flat (unchanging) resonating structures, are all good pointers to a nasal stop. The pole around 1000 Hz is usually
a pretty good clue (in my voice) that it's bilabial. The F2 transition in the following vowel is consistent with that--that
F2 onset frequency is too low to be alveolar, and the distance between F2 and F3 is atypical of velars. But it's that F3
transition that bothers me. The F3 seems to fall into the following vowel, which is consistent really only with alveolars.
So we've got conflicting cues. Which are we going to believe? Well, we're going to wait for a deciding vote. Once we
have a clearer idea of what the first few syllables of this utterance are, knowing it's English and a declarative sentence,
we'll use lexical access to decide whether we're looking at an [m] or an [n]. Or something else...

[aɪ], IPA 304 + 319
So the F1 onset frequency in the first full pulse is just below the 750 Hz zero in the nasal, but it rises very quickly and
reaches a peak well before 200 msec. So ignoring the first few pulses as transition, we've got something that starts fairly
low in the vowel space. The F2 at that moment is still fairly low as well, but some of that might be transitional. So we've
got something that starts lowish and sort of backish (or roundish?), but the F1 lowers over almost 100 msec toward the
following consonant, indicating a slow rising of the vowel, and the F2 never stops moving up (forward in the vowel
space). So what we have here is a diphthong starting lowish and backish and moving up and forward. Again, there may
be two choices, but one is probably better than the other. (Quick, what's the other choice, and how would you expect it
to look, assuming this isn't it?)
Always be an active learner.
Lower-Case S
[s], IPA 132
Well, this is interesting. From 300 msec (a little earlier in the higher frequencies) to almost 400 msec, there's a nice
voiceless fricative. There's no hint of voicing or anything at the low end. There's some noise into the very low
frequencies, and for some reason the amplitude hikes up a bit at about 1500 Hz. Then it stays pretty much flat (i.e. at
the same amplitude) all the way up. So this is fairly strong and broad band, typical of sibilants. And the sudden drop off
below 1500 Hz is usually a clue that it's post-alveolar. But I'm going to suggest it's not. Partly, it's because I know what
it's supposed to be, and I'm floundering for reasons to be right. Okay, usually a post-alveolar (rather than alveolar)
sibilant has that strongest energy in the F2-F4 range, and I think that low energy 'border' isn't quite continuous with
the F2 band in the following vowel, such as it is. So I don't know. This is supposed to be an [s]. And I think if we followed
it up to the 6-12 kHz range, we'd see it really gets really, really loud up there. So this is an alveolar. Accept it. Move on.
Barred I
[ɨ], IPA 317
Well, for a scant 25 msec or so, there's a vowel. There is. Look at it. But it's so short, it's hardly worth spending any time
worrying about. So I won't.
Quick, why isn't it worth spending any time worrying about?
Lower-Case N
[n], IPA 116
Another nasal. Now look at this one carefully. There's a nice strong voicing bar, and there's a band of weaker energy just
above that. Now compared with the initial one, this one is a little higher in frequency or broader in band. So they're not
quite the same. There's a zero. It's narrow, but it's a little higher in frequency than the zero in the previous one. There's
a little energy at 1000 Hz, but it's weak w/r/t the previous one. And there's that blip, or whatever you want to call it,
several pulses of resonance, or something, up just below 1500 Hz. I point that out because it turns out that it's
important. I think that's the real pole. But I could be wrong. But what are the odds.

[kʰ], IPA 109 + 404
There's what appears to be a closure transient, or maybe it's just a clunk, where I've marked the boundary. There's
some perseverative voicing, I guess, but look at that aspiration. Even excluding the material before the second burst,
that's at least 75 msec of aspiration. Which is quite a lot for me. So this has to be aspirated, and therefore voiceless. Now
look at that double burst. Double bursts like that, especially centered in F2/F3 like that, are typical of velar releases. So
there you go. There's not a lot of unambiguous transition information but the long VOT and the double burst are pretty
good cues.
Script A + Rhoticity Sign

[ɑ˞], IPA 305 + 419
Well, the F3 is low, so you might be tempted to call this a syllabic /r/. But that wouldn't explain the F2 movement. Or
for that matter, the F1 movement. Which together look like diphthongal movement, which I suppose is what this
sequence is.
Lower-Case T + Superscript H
[tʰ], IPA 103 + 404
From 775 msec to about 850 msec, there's serious gap. The few periods of voicing leading up to 800 msec I'd say are just
perseverative. Since the release at 850 is followed by going on to 75 msec of aspiration (voicelessness, VOT), there's
little doubt that this plosive is aspirated. The transitions into it are decidedly alveolar looking, in the sense that both F2
and F3 are pointed up, but given their frequency in the preceding segment, they have precious little choice. The
aspiration noise is the big clue. In all respects (except for the formant shaping in F2 and F3), this looks like a sibilant,
particularly [s]. (I suppose you might say it looks like an [ʃ], but really it doesn't. There's not enough energy in the F2/F3
pole relative to the higher ones.) Ennyhoo, it's not an [s], it's just really heavy aspiration following an alveolar release.
So it's not 'grooved' like an [s], but the airflow is basically high pressure being directed at the incisors, just like [s]. So
this has to be alveolar. The transitions out look vaguely velar-pinch-y, but since there's no way a velar would have
aspiration that looks like this, we can rule that out.
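The VOT arithmetic here is simple enough to sketch. The release time and the roughly 75 msec of aspiration come from the description above; the 30 msec cutoff for calling something aspirated is a placeholder assumption, not a value from this text.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT: time from the stop release to the start of voicing, in msec."""
    return voicing_onset_ms - release_ms


def looks_aspirated(vot_ms, threshold_ms=30):
    """True when VOT exceeds an assumed threshold (30 msec is a placeholder)."""
    return vot_ms > threshold_ms


# Release near 850 msec; about 75 msec of aspiration puts voicing near 925 msec.
vot = voice_onset_time_ms(850, 925)
print(vot, looks_aspirated(vot))  # 75 True
```

A VOT of 75 msec clears any plausible cutoff by a wide margin, which is why the prose treats aspiration as settled and spends its effort on place instead.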
Turned M
[ɯ], IPA 316
Well, this is not good. The highest-pitched voice in the whole spectrogram. Which probably makes this syllable the
nuclear accent, or at least the focus accent of the utterance. But in practical terms it means a) the striations are so close
together you can't tell one pulse from the next, and b) the harmonics are widely separated (Quick--why?) and so
bandwidths just increase. So it's hard to tell exactly where F1 is. It could be that band around 500 Hz (or just below, but
above the very strong voicing bar), or it could be that band up around 800 Hz. Which makes this either a relative mid to
higher-mid kind of vowel or a very, very low one. The F2 is a little easier. Before it fuzzes out, you can see the F2
transition in the aspiration noise, so you know where it's headed at least. So the F2 has to be around 1200 Hz or so,
depending on exactly where you measure. So knowing the answer, I might suppose that the strength of the 'voicing bar'
was actually a very low first formant, and the two things I'd considered before are just strong harmonics. But I don't
know. It probably ain't the incredibly low vowel that it would be. So figure not high and relatively back, but not
outrageously round (or very round but not outrageously back). And we'll try to make a word out of it later.
For the record, this is a fairly typical /u/ for me. Not at all round, fairly high, and with front on-glide following the
coronal.
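The link between pitch, striation spacing and harmonic spacing can be checked with a little arithmetic. This is a sketch with illustrative pitches of 100 and 200 Hz rather than measured values, and the function names are made up.

```python
def pulse_period_ms(f0_hz):
    """Time between glottal pulses (striations) in msec."""
    return 1000.0 / f0_hz


def harmonics_below(f0_hz, max_hz):
    """Harmonics are whole-number multiples of f0; list those up to max_hz."""
    return [f0_hz * k for k in range(1, int(max_hz // f0_hz) + 1)]


for f0 in (100, 200):  # illustrative pitches, not measured values
    print(f0, pulse_period_ms(f0), harmonics_below(f0, 1000))
```

Doubling the pitch halves the time between pulses, so the striations crowd together. It also halves the number of harmonics available below 1000 Hz (five instead of ten), so the formant peaks are sampled more coarsely and are harder to pin down.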
Lower-Case N
[n], IPA 116
So I think the oral closure happens at about 1075 msec--when the zero kicks in. Which is another contributor to the
fuzziness of the preceding vowel--nasalized vowels tend to have broader bandwidth (and more centralized formant
frequencies) than their oral counterparts. So the zeroes are a good thing, really--they tell us this has to be a nasal.
Frankly, the pole looks like it's about 1000 Hz, and so I'd say this was bilabial. And I'd be wrong. Good guess, but if it's
not bilabial, then it has to be alveolar. No hint of velar pinch, and, well, there is that narrow thing at 1500, which is
where I'd expect the pole for an [n] to be, in my voice. There's no hint of that in the initial nasal of this utterance, so
there's some difference. But I wish I knew what was going on at 100 Hz.
Lower-Case Z
[z], IPA 133
Well, there's a hint of voicing at the bottom, so this is probably voiced. The noise is [s]-shaped, if you follow, and weaker
(and shorter) than we'd expect for [s], which is consistent with the idea that it's voiced.
Lower-Case I
[i], IPA 301
Well, if the previous thing is an alveolar, then we can say that the onset frequency of F2 is in line with the alveolar
locus, which means all that movement is just transitional. Or we could suppose that it's meaningful. In the first case,
coupled with the relatively low F1, I'd be looking at that spot, just after 1300 msec where the F2 levels off or just a bit,
and say that was our target F2 frequency, which would make this an [i], just because nothing else ever has an F2 above
2200 Hz. But in the other case, we'd say this was a relatively high, front vowel moving higher (I guess) and much much
fronter, something much more like classical [eɪ]. One or the other. One is right, the other's a good guess.
Lower-Case T
[t], IPA 103
So with the exception of that one pulsey thing before 1450, the gap here seems to start at about 1350 msec and go on
for almost 100 msec. The transitions into it look sort of pinchy (but very front velar, if you follow) and the burst is slightly
doubled. All of which just screams [k]. But then we wouldn't get this spectrogram to say anything. So on the high-tilt to
the burst, and the phonotactics of the following thing, I'd say this was [t].
Esh
[ʃ], IPA 134
So here you see how much stronger the F2 pole is. And the energy below is weaker. So this looks like an [ʃ]. This is also
more consistent with the F2/F3(/F4?) poles, which are more typical of postalveolars than alveolars. There's just more
room to couple and a longer front cavity to play in. That is, for acoustic coupling to take place and to resonate in,
respectively. Shame on you for thinking what you were thinking!
Lower-Case I
[i], IPA 301
Well, there's a couple of odd amplitude discontinuities, but they're not really radical, considering the length and overall
energy in this vowel. So I'm thinking it all has to do with pitch change, and therefore striation spacing and harmonic
structure. So from 1575 to 1925 msec, I'm thinking this is really all one vowel. And since the F2 reaches 2200 Hz (i.e.
'absurdly high for anything except [i], and then still very, very high'), I'd say this was [i]. If you were determined to put
vowels on either side, what would you do with the middle?
Lower-Case Z + Under-Ring
[z]̥ , IPA 133 + 402
Well, this is a lesson, so here goes. This looks like an [s] again, but it's very weak. There's no hint of voicing, but it's
weak, and it's shorter than even the fricative in the affricate, even though it's final in utterance. So there's something
odd about it. It's not post-alveolar, because even though it loses energy below F2, you'd still expect the F2 pole in the
fricative to be a little stronger than above it, and this is flat. The noise gets a little better organized off the top of the
spectrogram. All this points to [s]. So how do we account for the weakness? Well, voiced fricatives are almost always
weaker and shorter than their voiceless counterparts, just because the act of voicing impedes airflow and therefore
pressure build up. But this isn't voiced. So I'll suggest it's passively devoiced. That is, rather than devoicing by abducting
the vocal folds (as with underlyingly voiceless sounds), the vocal folds remain adducted here. But because we're at the
end of an utterance, we (I) don't have a lot of subglottal pressure to work with, and the result is the vocal folds don't
vibrate. And there you have it, devoiced [z]. As distinct from [s].

"Harmony is achievable."
Lower-Case H
[h], IPA 146
So starting not quite 100 msec in, and going on until 225 msec or so, there's something voiceless (no striations in the very
low frequencies, in the range of the fundamental or first harmonic, which in my voice could be anywhere between 90
Hz to 130 or 140). So it's voiceless. There's lots of energy up above, but it's aperiodic, or noisy. If you notice the formants
of the following vowel, there's a little more noise in those same frequencies. Which is typical of [h]. The noise, being
produced in the laryngopharynx, bounces around the vocal tract the same way periodic energy does, and thus gains
energy in the frequencies of the vocal tract resonances and loses it in between. What's interesting is the high F3--there's
no hint of rhoticity in the noise, until about 200 msec, when it starts to come down in frequency. We can see that
transition continue once the voicing kicks in, but then we're well into the next segment.

[ɑ˞], IPA 305 + 419
So ignoring the voicing bar, which you can see is a very narrow band down there around 150 Hz or so, the first
resonance is quite high. Depending on how you measure these things, I'm thinking it's that upper band around 850 Hz
or so. If you look though, there's another, slightly fainter band just below, closer to 600 Hz. I'm thinking that's just an
idiosyncratically strong harmonic (there's something about there all through the spectrogram, regardless of where the
F1 is). (Well, use your imagination.) In a perfect world, we might locate the 'center' of that formant in that slightly
lower-energy space in between what I'm calling F1 and what I'm calling that weird harmonic, since the combined width
of those two things is only a little wider than the formants above it. So I don't know. But F1 is definitely high here, so
this is a low vowel. F2 starts about 1200 Hz or a bit below, but it rises a little into the following segment. Now look at
that F3. This is the best argument for segments (or at least sub-syllabic constituents) I've seen in a while. The F3 in the
[h] is up around 2500 Hz, dead neutral. It comes down in the last part of the fricative and through the 'clear' part of the
vowel until it approaches its low steady state in the following segment. But if you believe a) [h] does not have oral
features/targets of its own, and b) "rhoticity" (lowering of F3) is a feature realized on vowels before approximant /r/,
why doesn't the F3 start low in the fricative? Or at least lower, if you believe that the F3 of the vowel is categorically
affected by rhoticity. Which it obviously is not. But here you can see the rhoticity is a) not phonological, and b)
constrained in the phonetic grammar to the coda /r/, and is allowed to creep into (but not take over) the F3 of the
vowel. But not really at all into the fricative. But there's nothing in the fricative to prevent it from doing so. Except
obviously there is. So there must be something 'there'.
Turned R
[ɹ], IPA 151
So I guess I've given it away, this is an approximant /r/ (properly, IPA [ɹ]) of the North American variety. The F1 is still
where it was for the vowel, the F2 (oddly enough) is raised to approximate the low F3, and the F3 is very low, almost 800
or 900 Hz lower than it was in the beginning of the [h] (where you can see it returns eventually). Typical of [ɹ].
Lower-Case M
[m], IPA 114
Then the amplitude falls off around 350 msec. The F2 transition in the /r/ is diving at that moment, which suggests
labial transitions. The overall energy from 350 to 425 msec (or so) is lower than either of the surrounding vowels, so this
is relatively consonantal. And its edges are sharp, if you see what I mean, suggesting some acoustic change that sucks
energy out of the source suddenly turns on, and then off. So this is a typical nasal--the aforesaid sucking occurring as
the nasal cavity is opened and the oral cavity is closed, and then stopping when the velopharyngeal port is closed and
the oral closure released. There's a nice pole around 400 Hz, which is just to be expected, but the first 'real'
pole/formant in the nasal is around 1000 Hz. You can see the pole above that (continuous with the F3 of the /r/) is rising.
The frequency of that middle pole, the one around 1000 Hz, is a good cue to this being bilabial--if the oral closure were
further back, this would be higher in frequency. (Go back to acoustic phonetics and read about 'side cavities' if you're
not sure why.) So that's two solid cues to this being [m], and none particularly pointing anywhere else.
Barred I
[ɨ], IPA 317
So from 425 to about 475 msec, there's a vowel. The F1 is sort of low, unless you believe it's still high, but it's not
particularly distinct either way. The F2 is in constant motion, almost as if it had nowhere in particular to go. The F3 is
still transitioning, so it's not helping either. Also the F4 if it comes to that, but since we almost never look at F4, we
won't belabo(u)r the point. So we've got a short vowel of indistinct structure that never really develops a strong
identity of its own. So call it reduced, transcribe accordingly, and move on.
Lower-Case N
[n], IPA 116
So here we have another one of these. Note its similarity, in terms of its amplitude and edges, to the previous nasal.
There's a pole I don't think I've ever seen before at about 850 Hz, so I'm going to ignore it.... The main pole is up around
1400 or not quite 1500 Hz. Note how much higher it is than the 1000 Hz or so pole in the [m]. So there we go. This one
isn't bilabial, so we're stuck with alveolar or velar. There's no hint of velar pinch in the transitions into or out of this
nasal, and the transition-end frequencies (around 1700 Hz) are consistent with the locus of alveolar transitions.
Lower-Case I
[i], IPA 301
So the F1 is still rather low. Note the voicing bar in the first syllable. There's a strongish harmonic just below 500 Hz but
the main body of the resonance is clearly between the voicing bar and that harmonic. So this is an exceptionally low F1.
So this is an exceptionally high vowel. The F2, once it straightens out, is exceptionally high, up around 2100 or 2200 Hz.
So this vowel is exceptionally front. And the highest, frontest vowel you can think of? Right!
Barred I
[ɨ], IPA 317
Well, another section of vowel that's mostly F2 transition. If you missed it as just transition, you have to explain why
this vowel is so long when its pitch is clearly quite low (see how far apart the striations are compared to most of the
preceding vowels--each of those striations is a glottal pulse). So I think this is actually two different vowels/syllables. In
fact, two different words. I worked hard at not putting a glottal stop in this one, so I hope you appreciate the duplicity
involved.
Lower-Case Z
[z], IPA 133
So the striations continue, albeit in weaker form, all through the following amplitude dip (from about 700 msec to 750
msec or so?). So whatever it is, it's a consonant and it's voiced. But up above the voicing bar, there's no evidence of
periodicity, so no resonance to speak of. So there must be a very tight constriction somewhere. And it's noisy, so it's a
close constriction, but not a closure. So we're talking about a fricative. Voiced, but very noisy. The noise is not
particularly organized into bands. In fact, it's one broad band. It's a trifle weaker in the lower frequencies than the
higher frequencies (note the relative lightness of the noise just around and below 1000 Hz compared to anywhere
above), so this looks like it's tilted to the high frequencies. Very high frequencies, without any tilt toward the F2 or F3
region. So there you go. [s]-shaped noise, but voiced.
Barred I
[ɨ], IPA 317
And another short little vowel, overlapped in the high frequencies with a bit of the noise from the fricative. Or maybe
the noise is coming from the upcoming closure. Or both. Hmm. So this is amazingly reduced.
Lower-Case T
[t], IPA 103
Nice sharp gap so obviously we're dealing with some kind of plosive. There's not a lot going on in terms of transitions
suggesting anything in particular. On the other hand, if you look at the release noise burst, it's very sharp, broad band,
and evidently [s]-shaped. Although this may be in part a product of the following frication. But whatever. Believe it's
alveolar, or at least coronal, or remain agnostic. When it comes to parsing the upcoming fricative your choices will be
limited.
Esh
[ʃ], IPA 134
So here we go. We've got some very loud friction here. No voicing bar, but with that much noise, you wouldn't really
expect any voicing. The frication is very loud, but you'll notice it isn't one very broad band, but has some formant-like
shaping to it. It's loudest not off the top of the spectrogram (i.e. between 4-6-8-12 kHz), but seems loudest in the F2-F3-
F4 bands. And the F2 band is pretty noisy, while below it the energy drops off sharply. That's pretty typical of post-
alveolar [ʃ].
Lower-Case I
[i], IPA 301
So it's tough to tell where F2 is. You have to surmise from that falling transition afterwards that it's really, really high,
around 2200 Hz or so. It's almost merged with the F3, but that's not supposed to happen, so the combined band is still
wider than you'd expect a single band to be, but at this bandwidth there's no telling where the separation is. So the
edges of the filter overlap slightly. Get over it. So that's the F2, where's the F1? Low low low, I say. We could argue about
that, but TMSAISTI.
Lower-Case V
[v], IPA 129
Another voiced fricative here, from 1075 to 1125 msec or thereabout. Nice striations at the bottom, but no periodicity to
speak of above. This is a very loud fricative--it has about the same energy as the previous [z]. But spectrally, this looks
different. It doesn't have any tilt to it at all. It just looks white, in the sense of having equal energy at all frequencies.
Sort of unfiltered. Well, probably this is louder than it should be--I may have been spitting into the microphone or
something. The unfiltered-ness is a huge clue though. In order to be unfiltered, your source has to be uncoupled from
the resonators of the vocal tract. Which means it has to have a tight closure, and no vocal-tract-tubey-volumes in front
for the energy to bounce around. So this has to be at the teeth or lips. Given that this is English, the lips (bilabial) is
unlikely. It would be really helpful if the transitions on either side looked more labial, but they don't. Which might
make us think coronal, just by default. But then we'd be wrong. So let's just keep both [v] and [ð] in mind until we can
make a word out of it.
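The 'tilt' being eyeballed here can be put as a number. A minimal sketch, assuming band levels in dB have been read off the spectrogram somehow: fit a least-squares line of level against frequency and look at the slope. The 1 dB/kHz limit for 'flat' and the example levels are assumptions for illustration only.

```python
def spectral_tilt_db_per_khz(freqs_hz, levels_db):
    """Least-squares slope of level (dB) against frequency (kHz)."""
    n = len(freqs_hz)
    if n < 2 or n != len(levels_db):
        raise ValueError("need two or more matching points")
    xs = [f / 1000.0 for f in freqs_hz]
    mean_x = sum(xs) / n
    mean_y = sum(levels_db) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("frequencies must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, levels_db))
    return sxy / sxx

FLAT_LIMIT = 1.0  # assumed: |slope| under 1 dB/kHz counts as 'white'

def noise_shape(freqs_hz, levels_db):
    slope = spectral_tilt_db_per_khz(freqs_hz, levels_db)
    if abs(slope) < FLAT_LIMIT:
        return "flat"
    return "high-tilted" if slope > 0 else "low-tilted"

# Hypothetical band levels (dB) at 1 kHz steps.
freqs = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000]
white_noise = [40, 41, 39, 40, 41, 40, 39, 40]
sibilant_noise = [20, 24, 28, 32, 36, 40, 44, 48]
```

On this reading, 'flat' points to the teeth or lips and 'high-tilted' points to a sibilant, which is the same split made in the prose.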
Schwa
[ə], IPA 322
Very short, indeterminate vowel. Moving on.
Lower-Case B
[b], IPA 102
Another gap, this one rather long, although since we're approaching the end of the utterance that might be
lengthening of the final syllable. There's a nice, clean gap in most frequencies, but if you look at the bottom, there's an
awful lot of perseverative voicing. More than you'd get if there were a nice abduction gesture associated with an
underlying voiceless stop. So this is probably voiced. It's a little annoying that the transitions are so ambiguous. The F2
in the preceding vowel seems to be coming down, well below the 1700-1800 Hz alveolar locus we usually look for with
alveolars. So that looks labial. F3? Seems to be high, if anything. Ya gotta love coproduction messing up all your cues. So
on the balance, I'm going to say bilabial. The F2 isn't even close to alveolar or velar looking. The F3 is ambiguous, but I'll
attribute it to coproduction with ...

[ɫ̩], IPA 209 + 431
... the raised F3 of this segment, which is lateral. You can tell because of the raised F3. /r/s have greatly lowered F3s,
/l/s tend to have slightly raised F3s, and/or sometimes F4s. With an F2 below 1000 Hz, this can only be described as
back (or round), so it's dark as well. If you believe those first few pulses with energy in F3 and F4 and above are
evidence of a separate vowel before lateral-contact, you're welcome to insert a schwa or something. But I tried to be
careful and release the /b/ into the lateral. There are advantages to doing these things with your own voice....

Winnipeg, Manitoba
CANADA R3T 5V5
Solution for (late) February 2006
Due to various delays, I decided to take a shortcut on this month's spectrogram. This one is composed of four words,
one word each from the following pairs.
1. HE/SHE
2. CHAINS/TRAINS
3. MEEK/WEAK
4. LEADERS/READERS
So the real trick here is to work out whether the first consonant is [h] or [ʃ], whether the second onset is [tʃ] or
[tʰɹ̥], etc.
Here's the labelled spectrogram from February
"She trains weak leaders."

For the sake of comparison, I've included the 'opposite' spectrogram at the bottom of the page.
I'll only be discussing differential cues this time, again, just because of time constraints. (I will leave it as an exercise for
the reader to segment the 'known' segments and work out what cues there are as to their identities.)
Esh
[ʃ], IPA 134
From about 75 - 225 msec. This looks more like [ʃ] than [h] for a couple of reasons. The first is that it's too loud. This has
absolute amplitude like a vowel rather than a consonant. So this very loud frication is tilted to the higher frequencies,
typical of sibilants in general. This looks like [ʃ] rather than [s] since it has very little energy below F2, below which it
drops off fairly sharply ([s] has broad band noise that may diminish at the lower frequencies, but it'll do so more
gradually). The fact that it drops off right below F2 is suspicious, if you were wondering. Also an [s] would not have that
strength specifically in F2/F3/F4, but presumably would have a single broad band centered much higher. An [h]
would have less energy over all, and wouldn't have any kind of discontinuity with the following vowel (except in terms
of voicing). If you notice, the "F2" in the fricative doesn't match that in the vowel.
Lower-Case T + Right Superscript H, Turned R + Under-Ring
[tʰɹ̥], IPA 103 + 404, 151 + 402
350 msec to 450 msec or thereabouts. The choice here is really between an [ʃ] release to the affricate or a voiceless [ɹ̥].
I'll duck the whole question of segments and affricates and so on. Okay, so the gap for the plosive goes from about 325
msec to the release somewhere between 375 and 400 msec. The release frication probably runs from about the release
for between 25 and 50 msec. The 'center' of the /r/ moment, if you follow me, is around 425 msec. Notice that by the
time voicing kicks on at about 450, the formants are already moving fast. So our choices for this bit, from 400 to 450
msec or so are the /r/ (devoiced due to the aspiration) or Esh. Notice the intensity of the noise--on release, it's nice
and sibilant. It's centered pretty low, sibilant-wise, and looks a lot like the previous Esh. But you'll notice the intensity
drops off fairly quickly, instead of being nice and sustained through the voicelessness, and also that the noise is in the
shape of the following formants. The F2 starts up wherever it starts on release (around 1900 Hz or so), falling rapidly to
just below 1500 Hz. The F3 falls out, but notice in the release how the corresponding band is definitely falling.
Extrapolating or interpolating, or whatever, from the angles of the transitions on either side, it looks like the F3 drops
to just below 2000 Hz, but there's not a lot of evidence that it really gets there. But those transitions in F3 can only be
due to rhoticity. And the lowness of the F3 and the closeness of F2 and F3 together explain the esh-shaped-ness to the
release noise--the center of the energy is being pulled down by the low formants. But this explains why there seem to
be people who have a /tr/ goes to [tʃ] rule and/or /dr/ goes to "jr". For comparison, notice how, while diminishing, the
esh-noise in the comparison spectrogram is more or less stable right through until the voicing kicks in.
Lower Case W
[w], IPA 170
Approximant or nasal? We're looking at the fully-voiced segment from about 725 msec to just past 800 msec. It's got
less energy in the voicing bar than in the following vowel, but that's typical of both nasals and close approximants. The
transitions are mostly bilabial, although F3 isn't helping much. So nasal or not? Well, not. Nasals don't have to have
'sharp' edges, but prevocalically they usually do. See that moment near 800 msec in the comparison spectrogram. The
edge here is the velum closing--at that moment, the acoustics change suddenly. The energy that was being lost in the
nasalization is suddenly regained, the main resonances change--notice how the formants 'pop on' without
transitioning. Here, the formants are all transition, suggesting something oral throughout, with continuously
changing articulators transitioning from the /w/ moment to the following vowel.
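The 'sharp edge versus all transition' test can be sketched as a check on how far a formant track moves from one frame to the next. This is only an illustration: the 200 Hz per-frame limit, the frame spacing implied by the lists, and the two tracks are assumed values, not readings from either spectrogram.

```python
JUMP_LIMIT_HZ = 200  # assumed per-frame limit for a smooth transition

def largest_jump(track_hz):
    """Largest absolute change between neighbouring frames of a track."""
    if len(track_hz) < 2:
        raise ValueError("need at least two frames")
    return max(abs(b - a) for a, b in zip(track_hz, track_hz[1:]))

def edge_type(track_hz):
    """'abrupt' if the track jumps past the limit anywhere, else 'gradual'."""
    return "abrupt" if largest_jump(track_hz) > JUMP_LIMIT_HZ else "gradual"

# Hypothetical F2 tracks (Hz), one value per analysis frame.
glide_f2 = [700, 780, 880, 1000, 1130, 1270, 1400]     # all transition
nasal_f2 = [1100, 1100, 1110, 1100, 1700, 1720, 1730]  # formant 'pops on'

print(edge_type(glide_f2), edge_type(nasal_f2))
```

A fuller version would also look for the sudden amplitude change at the velum closing, which this sketch ignores.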
Tilde L (Dark L)
[ɫ], IPA 209
Finally another lesson in approximants, this time /r/ and /l/. We're looking at the moment that begins when the
voicing kicks on around 1000 msec, and going until the upper frequency periodicity really becomes clear, at about 1075
msec. Again, this doesn't look nasal due to the continuity of the whole thing. That and the F1/voicing bar complex is
too continuous (in both amplitude and frequency) to indicate a sudden addition or loss of a cavity. So what's the
difference between a North American /r/ and an /l/? The F3. Lowered F3 for /r/, raised F3 (or sometimes F4--ideally
both) for /l/. So where's the F3? F1 is down just below 500 Hz. F2 is just above 1000 Hz, and F3 is way up there around
2750 Hz. It's falling a little, so by the time the upper frequency periodicity kicks on it's already almost back down to
2500, but you can still see how high it was in the noisy, semiperiodic energy during the approximant. So that's it. Raised
F3.
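The /r/-versus-/l/ rule boils down to comparing F3 against a neutral value. A minimal sketch, assuming a neutral F3 of about 2500 Hz for this voice and an arbitrary 200 Hz margin around it (both numbers and the function name are assumptions made for the example):

```python
NEUTRAL_F3_HZ = 2500  # assumed neutral (schwa-like) F3 for this speaker
MARGIN_HZ = 200       # assumed tolerance either side of neutral

def liquid_guess(f3_hz):
    """Lowered F3 suggests /r/, raised F3 suggests /l/."""
    if f3_hz < NEUTRAL_F3_HZ - MARGIN_HZ:
        return "/r/"
    if f3_hz > NEUTRAL_F3_HZ + MARGIN_HZ:
        return "/l/"
    return "ambiguous"

print(liquid_guess(2750))  # the raised F3 described above
print(liquid_guess(1650))  # a lowered F3 of the sort described for /r/
```

The 'ambiguous' band is where you fall back on F4, or on making a word out of it.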
For comparison, the 'opposite' of the original spectrogram
"He chains meek readers."
This page last modified: 11/08/2009 22:57:37

"Most are perfectly capable."
Lower-case M
[m], IPA 114
We start with a clear sonorant consonant of some kind, fully voiced, nicely striated, and with nice clear resonances all
the way up. The formants are flat, and there's a nice clear zero at about 750 Hz, both of which suggest a nasal. The pole
(formant) around 1100 Hz suggests a bilabial (at least for my voice--an alveolar nasal usually has a pole somewhat
higher, closer to 1400 or 1500 Hz, and a velar nasal a) wouldn't be initial in an English utterance and b) would have more
evidence of velar transitions in the following vowel).
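The pole comparison for nasals amounts to a nearest-reference lookup. A small sketch using the speaker-specific figures quoted above; 1450 Hz is an assumed midpoint of the '1400 or 1500 Hz' alveolar range, and the velar nasal is left out because it is ruled out on other grounds here.

```python
# Reference pole frequencies for this one voice, taken from the text;
# 1450 Hz is an assumed midpoint of the quoted alveolar range.
NASAL_POLES_HZ = {"bilabial": 1100, "alveolar": 1450}

def nasal_place_guess(pole_hz):
    """Return the place whose reference pole is closest to the measured one."""
    return min(NASAL_POLES_HZ, key=lambda place: abs(NASAL_POLES_HZ[place] - pole_hz))

print(nasal_place_guess(1100))
print(nasal_place_guess(1500))
```

These reference values belong to one speaker; another voice would need its own table.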
[oʊ], IPA 307 + 321
I love the overlap between this and the following consonant, but whatever. F1 is at about 500 Hz or just higher. F2 is up
around 1100 Hz or thereabouts. F3 is high for some reason. But since this is a vowel we're not going to worry about F3.
Just put it out of your mind. Don't let it consume you for twenty minutes like I just did. So we've got the F1 of a mid-ish
vowel, and the F2 of something fairly back and/or round. The F1 and F2 seem to move downward slightly (hence the
transcription as a diphthong) but you'll notice that the upper frequencies are taken over by the incipient sibilant noise
coming up. Gestural overlap? Spreading? Whatever. The illusion of segments. Moving on.
Lower-case S
[s], IPA 132
Well, since it's September, we'll review. From about 350 to almost 500 msec. This is a fricative (random, snowy 'noise').
It's voiceless (no striations or energy in the low-frequency 'voicing bar'). And the noise is in a single, very broad band
(unfiltered by a lot of vocal tract resonances) which suggests that it's relatively forward in the vocal tract. It's very loud
(and broad band) which suggests sibilance, and centered in the very high frequencies, which suggests alveolar (the
postalveolar sibilant is usually centered in the F2/F3 range rather than above the F4 range). So this must be an [s]. [s] is
your friend, spectrographically speaking.
Lower-case T
[t], IPA 103
Our first real plosive. From about 475 msec to the release burst at about 525 msec there's a gap in the spectrogram,
indicating no airflow, no resonance, no voicing, squat. It's got a short VOT (not even 25 msec) so it's unaspirated.
Voiceless goes without saying, right? (Study question: Why?) The F2 transition starts at about 1600 Hz and falls, the F3
transition starts around 2400 Hz and again falls. So we have 'uppy' pointing transitions (pointing into the gap, that is)
and so this is probably an alveolar. The noise in the VOT is a little low (we'd like to see more [s]-looking release noise
following a [t]), but there's a coarticulatory thing going on....
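The VOT reasoning is just a subtraction and a threshold. A minimal sketch: the 25 msec figure comes from the paragraph above, but treating everything at or past 25 msec as aspirated is a simplifying assumption, and the 545 msec voicing onset in the example is a made-up reading.

```python
UNASPIRATED_MAX_MS = 25  # from the text: "not even 25 msec"

def voice_onset_time(release_ms, voicing_onset_ms):
    """VOT is the lag from the release burst to the start of voicing."""
    return voicing_onset_ms - release_ms

def aspiration_guess(release_ms, voicing_onset_ms):
    vot = voice_onset_time(release_ms, voicing_onset_ms)
    if vot < 0:
        return "prevoiced"
    # Assumed: any lag of 25 msec or more is called aspirated.
    return "unaspirated" if vot < UNASPIRATED_MAX_MS else "aspirated"

# Release at about 525 msec (from the text); voicing onset is hypothetical.
print(aspiration_guess(525, 545))
```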
Schwa + Turned R
[əɹ], IPA 322 + 151
Okay, I've transcribed a diphthong here because there was just no place to segment. Sorry. I've also used a deceptive
sequence of symbols--for a lot of people, the sequence schwa-r is a shorthand for the symbol schwa-r (i.e. [ɚ] IPA 327,
for which I always use turned-r with the syllabicity diacritic, i.e. [ɹ̩].) But here we have something that looks and sounds
like a diphthong. So I've transcribed it as such. F1 is in the mid-region. F2 is neutral (and falling). F3 is sort of neutral
but also falling. The end of the F3 fall is way below 1800 Hz, which accounts for the F2 fall as well--that is, that's
rhoticity, i.e. approximant /r/ in North American English. But there's a non-rhotic vowel in front of it. So it's a
diphthong.

[pʰ], IPA 101 + 404
Another gap. This one is long, and that might mean there's a sequence here. Or it might mean it's initial in some
domain. The transitions into it are a bit difficult to interpret, since the F2 and F3 are pulled down so far by the /r/. But
if it weren't for the /r/, these would look bilabial. I mean, they point down. The release at about 700 msec has a lot of
noise in the very low frequencies. The transient seems to go all the way up, but it's neither 'sharp' (typical of especially
alveolar stops--especially if accompanied by sibilant noise, which this isn't) nor 'doubled' (more typical of velars,
especially if accompanied by 'pinch'). Well, F2 and F3 are close together, but that could just be because F3 is so low in the
following vowel. That's also a reason why the previous /r/ doesn't transition anywhere else. On the other hand, both of
those suggest that there's no coronal action in this consonant, which leads us back to considering bilabials and velars.
So if we look at the transitions in the aspiration noise, it looks to me like F2 and F3 both point down into the gap (that is,
rise as they move into the vowel), and so the transitions look bilabial. And voiceless, of course, and the loooong VOT
can't be anything but aspirated.

[ɹ̩], IPA 151 + 431
See, this is a syllabic /r/. It's not a vowel like schwa 'combined' with some diphthongy rhoticity. It's just a vowel. F1 is
mid-ish, F2 is as neutral as it can get, considering the F3 is around 1600 or 1700 Hz. An F3 that low can only be an /r/.
Lower-case F
[f], IPA 128
What we have here is another voiceless fricative. Now take a moment and compare it to the previous [s]. Broad band
noise but not of sibilant amplitude. So probably fairly far forward in the vocal tract. Given that this is English this
means labiodental or (inter)dental. Any other clues? The F2 in the preceding /r/ seems to transition downward, just a
tad, while the F3 is rising, slightly. The only reason for the F2 to not be transitioning in the same direction as the F3 is if
it's a labial transition. The transitions on the other side of the fricative all point down (that is, rise into the vowel), which
also makes this look bilabial. Now, we might discount the F3 transition as just 'rising' from the low position for the /r/.
But the F2 transition(s) still look(s) labial. Vaguely. So probably [f]. The double clunky thing at the onset of the vowel is
probably just a clunky thing.
Barred I
[ɨ], IPA 317
Absurdly short vowel. Reduced. Ignore it. Well, don't ignore it, but don't waste any time trying to identify it. Move on.
Lower-case K
[k], IPA 109
Another long gap, but now there's another double clunky thing at about 1100 msec, in the F3/F4 range. Not much of a
cue, but double burstiness is sometimes indicative of a velar release. Then again, there's another double-bursty-looking
thing at 1150 msec (or so) which I'm going to claim is a red herring (or rather, that the usual explanation for velar
double-bursting doesn't account for the other double clunk (either of them), but my explanation will). Anyway, that's
really the only clue that there's something else going on here, or that it's a velar release into another plosive. So if you
caught it great, if you didn't, you'll have to insert it in through lexical identification later.
Lower-case T
[t], IPA 103
The release here is good and sharp. It looks doubled, but I think most of the energy is in the F3, rather than in a pinched
F2/F3 combination. F2 seems to start at about 1500 Hz. So what's going on? I'm going to suggest the release at 1150
msec is coronal, and that the double clunkiness we see is the result of ...
Tilde L (Dark L)
[ɫ], IPA 209
... lateral release. I was being persnickety with the half-under-ring diacritic but the point is that there's dark /l/ here,
partially (or fully) devoiced by the aspiration following the /t/. The lowness of the F2 transition (relative to the
expected frequency of 1700-1800 Hz) is compatible with rounding, but I'm going to suggest that it's the result of
velarization of the lateral. The second clunk is the other side of the lateral releasing. So my story on double clunks is
that they involve two sides opening at different moments. So velars and laterally released [t] are most likely to have a
double burst. The standard story is that the long closure associated with velars (and dentals) causes a high-velocity
airflow on release, and a Bernoulli 'clunk' immediately after release. I think I'm right, but I'm apparently the only one.
And sometimes a clunk is just a clunk. The vocal tract is a juicy place, after all.
Lower-case I
[i], IPA 301
So after all that, we end up with an F2 up above 2000 Hz (way up above) and an F1 which is quite a bit lower than the
mid-ish F1s we've been seeing. So this is a highish vowel, amazingly front. /i/ or /e/. In this case [i]. Trust me.

[kʰ], IPA 109 + 404
Ah, another gap. I hope you noticed the subtle velar pinch in the preceding transitions. Also the double burst. And the
long aspiration noise, concentrated in the F1/F3 region. All classic velar signs.
Lower-case E
[e], IPA 302
Now this is a flat /e/. Not diphthongy. F1 is mid or just low, F2 is around 2000 Hz and relatively stable. Not obviously a
diphthong. So there.
Lower-case P
[p], IPA 101
A gap of about 100 msec. With some perseverative voicing, but not enough to worry about. The burst is not amazingly
sharp, and it seems to be loudest in the low frequencies. If I work hard enough I can convince myself that the F2 and F3
transitions into this gap are bilabial, but they're not obviously so on the other side. At least they're not obviously
anything else....
Schwa
[ə], IPA 322
Another absurdly short vowel, mostly transition, so call it reduced and move on.
Lower-case B
[b], IPA 102
Well, here's a gap. Notice that the perseverative voicing here is more 'voiced'. Probably meaningful, although there's no
guarantee. The transitions into this look bilabial, at least the F2 does. The F3 and F4 transitions out of this gap and into
the following vowel are also suggesting bilabial more than anything else. So potentially voiced and bilabial.
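The transition-reading used for these plosives can be caricatured in a few lines: compare F2 and F3 at the vowel midpoint with their values at the edge next to the gap. This is a deliberately crude sketch. The 1700-1800 Hz alveolar locus is the figure quoted in these solutions; the 500 Hz 'pinch' gap, the 100 Hz slack around the locus, and all the example readings are assumptions.

```python
ALVEOLAR_LOCUS_HZ = (1700, 1800)  # F2 locus range quoted in the text
PINCH_GAP_HZ = 500                # assumed: F2 and F3 this close count as pinched
LOCUS_SLACK_HZ = 100              # assumed tolerance around the locus

def stop_place_guess(f2_mid, f2_edge, f3_mid, f3_edge):
    """Guess place from F2/F3 at the vowel midpoint vs. the edge by the gap."""
    converging = (f3_edge - f2_edge) < (f3_mid - f2_mid)
    if converging and (f3_edge - f2_edge) < PINCH_GAP_HZ:
        return "velar"      # F2 and F3 pinch together at the gap
    if f2_edge < f2_mid and f3_edge <= f3_mid and f2_edge < ALVEOLAR_LOCUS_HZ[0]:
        return "bilabial"   # both formants point down into the gap
    low = ALVEOLAR_LOCUS_HZ[0] - LOCUS_SLACK_HZ
    high = ALVEOLAR_LOCUS_HZ[1] + LOCUS_SLACK_HZ
    if low <= f2_edge <= high:
        return "alveolar"   # F2 heads for the alveolar locus
    return "unclear"

# Hypothetical readings in Hz: (f2_mid, f2_edge, f3_mid, f3_edge).
print(stop_place_guess(1500, 1200, 2500, 2300))
print(stop_place_guess(1300, 1700, 2500, 2600))
print(stop_place_guess(1900, 2200, 2600, 2500))
```

Coproduction with a neighbouring /r/ or lateral will defeat rules this simple, which is why 'unclear' is a legitimate answer.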
Schwa
[ə], IPA 322
I hear a vowel here, so I guess it's a schwa. But it's really so totally coarticulated with all but the contact-part of the
following lateral, that I can't really blame you for wondering what the heck I'm talking about.
Tilde L (Dark L)
[ɫ], IPA 209
Okay, so here's the trick. The formants, such as they are, indicate a mid-ish vowel (neutral F1), and a very back or round
tongue body F2 well below 1000 Hz. Ideally we'd like to see F3 (or at least F4) rise to above neutral for a lateral, but no
such luck. But it can't be an /r/ with an F3 like that, and it certainly can't be a /j/ with an F2 like that. So that leaves the
lateral and the labial-velar. So which is more likely to a) follow schwa and b) form a word with the preceding? But you
can see how dark /l/s and /w/ or /u/ shaped vowels can resemble one another....

"The good guys wear white hats."

Eth
[ð], IPA 131
Well, there's a nice little fricative here. It's really short, such that it looks like a release, but it ain't. It's a fricative. Kind
of loud, but you can tell it's no sibilant--it has too much formant-like shaping to the spectrum. Like aspiration. Which is
what I'd probably have guessed what it is if I didn't know any better. The transitions into the vowel (and the spectral
change in the frication itself) looks like it's moving from something alveolar. Or at least coronal. So we're looking for a
non-sibilant coronal. Vaguely voiced, I thought at the time. Now I'm less certain.
Barred I
[ɨ], IPA 317
Short little vowel, with mostly transitional looking movement during. Reduced. The F1 looks decidedly low to me,
hence barred-i, although that hardly matters.
Lower-case K
[k], IPA 109
The transitions into this look vaguely labial. The F2 seems to be coming down. The F3 is definitely coming down, but
the F4 is definitely not. And there's no particular reason for all the formants except F4 to be responding to labialization.
So if you thought this was labial, I hear you. Something weird is definitely happening in F2. Turns out it's probably
backing or rounding rather than moving to a labial closure. But how are you supposed to know? Search me. On the other
hand, the messy release looks sort of like a double burst. While not strictly pinchy, the F2 is clearly lowering into the
beginning of the voicing in the vowel, so maybe that's really velar pinch.
Upsilon
[ʊ], IPA 321
Highish vowel (low F1), very back and round (as my back and round vowels go), but transitions towards schwa rather
than being steady or moving back. So this is sort of [uə], which is a pretty stereotypical realization of /ʊ/. Well, I'm
trying.
Lower-case D
[d], IPA 104
The F3 is sort of sitting there, the F2 is sharply rising to about 1700 Hz. Gotta be an alveolar. Voiced. The clunk I think is
the alveolar closure releasing--it doesn't look [s] shaped because there's no air moving from behind it, due to the ...
Lower-case K
[k], IPA 109
... closure back here. Voiceless. Again with a mushy, sort of double-looking release. Fairly clear pinch in the following
transitions. So we have to posit a second closure, and it pretty much has to be velar.
[ɑɪ], IPA 305 + 319

Ah, diphthongs. Or triphthongs. Or VISC. Or whatever. Look at that movement. F2 hits its minimum just ahead of 600
msec, around 1200 Hz. So quite back. Not at all central. At that same moment, the F1 is rising, hitting its peak just after
600 msec, at which the F1 is between 800 and 900 Hz. So about as low as my vowels ever get. Ordinarily, I'd locate a
'moment' at the F2 minimum, and another at the F1 maximum (since they don't coincide, they must be different
moments) and consider this some kind of weird low-low diphthong. Since that's not really a likely option, I'll decide to
ignore the two moments, and decide this is a low vowel. But then there's the the long, slow offglide, relatively speaking.
Which is a fronting diphthong. Apparently.
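Locating the 'moments' is a matter of finding where each formant track hits its extreme. A small sketch of that bookkeeping; the sampled tracks below are invented to resemble the description (F2 minimum just ahead of 600 msec, F1 peak just after) and are not measured values.

```python
def extremum_time(times_ms, values, kind):
    """Time at which a sampled track reaches its minimum or maximum."""
    pick = min if kind == "min" else max
    index = pick(range(len(values)), key=values.__getitem__)
    return times_ms[index]

# Hypothetical tracks sampled every 20 msec.
times = [560, 580, 600, 620, 640, 660]
f1 = [700, 780, 840, 870, 820, 750]        # Hz, peaks just after 600 msec
f2 = [1300, 1200, 1210, 1300, 1450, 1600]  # Hz, bottoms out just before 600 msec

f2_min_at = extremum_time(times, f2, "min")
f1_max_at = extremum_time(times, f1, "max")
moments_coincide = f2_min_at == f1_max_at

print(f2_min_at, f1_max_at, moments_coincide)
```

When the two times differ, as here, you either posit two moments or decide, as above, to ignore the difference.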
Lower-case Z + Under-ring
[z̥], IPA 133 + 402
So here we have a weak fricative. It is [s] shaped, being broad band and strongest in the higher (above 4000
Hz) frequencies. But it's not phonologically voiceless. Let this be a lesson--/z/ can be realized as voiceless--probably passively, due to
airflow issues rather than vocal abduction. But anyway, the correlates of /z/ involve a shorter, weaker fricative than a
corresponding [s].
Lower Case W
[w], IPA 170
Nice little moment of approximant. Looks like a nasal (sorry, no comparisons in the spectrogram) with a strong voicing
bar and nothing much in the way of resonances above. But the transitions indicate something other than 'just' an oral
stop articulation. Just look at that swooping F2. Wow. What could cause an F2 like that? Only something seriously back
and seriously round. Which aren't really abundant in my dialect. Really, only [w] is really that back and round. And
close enough that it can seriously damp the high frequencies. Seriously. Remember that, in case it comes up again....
Epsilon
[ɛ], IPA 303
Okay, this is not the clearest vowel. For one, the F3 is coming down, probably pushing the F2 down with it as it does. But the
F1 clearly separates from the voicing bar, edging up past 500 Hz, if only just barely. Gotta be 'mid' at least. The F2 starts
low because of the [w] but zooms up as far as it can until the F3 starts to push it down again. So let's call this one 'clearly
heading to someplace front'. Mid and front. Only a couple of possibilities, and one (in my dialect) typically has an
unexpectedly low F1 (i.e. is really a high vowel).
Turned R
[ɹ], IPA 151
On the other hand, whatever else might be going on, the F3 is dropping like a rock. So here we have a coda /r/. But
notice what happens, the F2 flattens out and the F3 starts to rise again. And then ...
Lower Case W
[w], IPA 170
... something else happens. The F2 stays low low low, the F3 appears to at least start to head back to neutral, and the F1
drops again. So we've got something very close, and outrageously round and back. With damped higher frequencies.
Look familiar? Good.
[ɑɪ], IPA 305 + 319

So here we go again. This one has a clearer offglide, and the F2 doesn't have a moment of its own separate from the F1
moment. Look at that. Who'd'a' guessed these two things were related.
Lower-case T
[t], IPA 103
The plosive here is weak, almost flap-like. Chew on that one for a while as you look around. But in the mean time,
there's clearly an [s]-shaped/acute release burst just before 1400 msec, so there has to be some kind of alveolar release
here. Unless we're really unlucky and it's just a clunk. It ain't. But I suppose it could have been.
Lower-case H
[h], IPA 146
So given that there's a plosive release, is this aspiration, or a separate thing (bonus points for anyone who can clearly
explain why this isn't really an interesting question)? Well, notice that the release noise, if that's what it is, isn't
'continued' in the following frication. That is, it doesn't just sort of keep going, as aspiration/release noise might. The
noise is 'clustered' in the formants of the following vowel, so we're clearly dealing with something glottal, and probably
not just 'aspiration' of the preceding plosive.
Ash
[æ], IPA 325
Very high F1. Very low vowel. But the F2 is sort of, well, starts high and doesn't really fall to 'low'. So this is not a back
vowel. Which leaves a few possibilities, I admit, but if you're putting together words at this point, only one makes much
sense.
Lower-case T
[t], IPA 103
Long gap. Gotta be a plosive. The last few pulses of the vowel look like the F2 is transitioning to a labial, but the F3 isn't
moving. At all. So possibly, but not probably labial. Probably not velar (but see earlier). So split the difference. It's a
guess. At least this sentence is semantically predictable at this point. It wasn't going to be.
Lower-case S
[s], IPA 132
Or possibly a devoiced [z]. So think about that for a second. How will you choose? (Bonus points available.)

"Snow blocks the street."

Lower-Case S
[s], IPA 132
Well, this is sort of weak, but you can see the fricative, starting at about 100 msec and going on to almost 250 msec.
It's broad band (rather than organized into narrower formant-like organization), and concentrated in the very high
frequencies. Toward the beginning, where the overall amplitude is much less, the frequencies we can see are very high.
So this is a pretty typical sibilant, almost definitely [s].
Epsilon
[ɛ], IPA 303
So the vowel starts just before 250 msec and goes on for about 100 msec. The F1 is a little low of mid, suggesting a
slightly higher mid vowel, which is weird for me if this is [ɛ]. But anyway, this vowel looks quite central, and so this
looks very schwa-like. But judging from the amplitude it must be stressed, and if it's stressed, this can't be my [ʌ],
which is typically low. So this is probably mid or high, and otherwise non-descript. Oh well. The falling formants are
clearly transitional, since they mostly all do it, so they don't help. Not long and not tense.
Lower-Case V
[v], IPA 129
Well, it's very weak, but there's frication throughout this gap up to 400 msec. There's also voicing, so this is either a
very weak voiced stop or a weak voiced fricative. The transitions in and out all suggest bilabial (although the F3 doesn't
help--more later), so, since bilabial fricatives are not an option, as this is my English, labiodental is not a stretch.
Turned R
[ɹ], IPA 151
So there's that F3, clearly transitioning way down in the previous vowel, and there it is here, down at about 1600 Hz or
so. So this must be an /r/ of some kind. Nuff said, I guess.
Schwa
[ə], IPA 322
Transcribing this as a vowel is merely a convenience. There seems to be 'something' between the /r/ and the following
segment, but exactly what is open to interpretation.
Tilde L (Dark L)
[ɫ], IPA 209
So from about 475 to 550 msec or so, there's a dip in amplitude, accompanied by an apparent zero in F2, and a relatively
high F3. Very high considering the previous /r/. I'm not quite sure what's going on in F2, but the raised F3 is usually a
good indicator of the lateral. And the F2, if it's anywhere, is down there below 1000 Hz, so it must be dark.
Barred I
[ɨ], IPA 317
Again, this is a bit of vowel. I made a mistake transcribing it as barred-i--I think I must have misread the F3 as an F2, but
that's idiotic, since even the highest F2 can't get up that high. But it's a transitional vowel more than anything else. So
there.
Lower Case W
[w], IPA 170
So here's another attenuated, presumably consonantal articulation, but fully and clearly voiced. So this is almost
undoubtedly a sonorant, but very, very close. The low F2 is consistent only with something very round and very back,
and the following F2 transition is typical of [w], so there you go.
Open O + Rhoticity Sign

[ɔ˞], IPA 306 + 419
Again, a transcriptional convenience more than anything else. I needed something mid and fairly round. This might be
better as an [o], but whatever. The /r/ colo(u)ring is fairly clear, with that low F3 again, but the vowel is again mostly
transitional. To the degree that the F1 indicates something mid and the F2 indicates something mostly back, take your
pick.
Turned R
[ɹ], IPA 151
Well, here's another /r/. Low F3, though not as low as previously. I've been noticing the bandwidths of initial /r/s
being very narrow, but that may just be me doing stuff to do that. Anyway, this looks like a typical /r/ in coda position,
with the higher (closer to F3) F2 than in other positions.
Lower-Case D
[d], IPA 104
Gap. Probably a plosive of some kind. Very voiced, which is interesting. There seems to be a falling F4 (or something),
but the F3, if anything, has a rising transition into this gap. But then it would be, since it starts so low. The F2 is
ambiguous to say the least. So on the balance, I think the alveolar guess is just a default thing. The release is even weak,
so it's not clear if that fricative coming up is just release (which would tell us a lot about the place of this plosive) or if
it's a fricative.
Lower-Case Z
[z], IPA 133
Well, if there's a fricative, it must be a sibilant. Look at that frequency. And it must be alveolar. Same reason. And
voiced.
Lower Case W + Rhoticity Sign

[w˞], IPA 170 + 419
I went nuts with the rhoticity this time, I guess. There's some labial shaping to the tail of the fricative, which is the only
decent indication of anything other than just the /r/. If you missed it, I don't blame you. There's some weird overlap of
the fricative and the following /r/, but the only thing I latched onto was the attenuation of the voicing before the F3
minimum coming up. The rhoticity sign is just because the F3 is so low.

[ɹ̩], IPA 151 + 431
Okay, well, there's clearly something we want to call a vowel here, and it's got the typical low F3, hence the syllabic /r/
transcription.
Lower-Case G
[g], IPA 110
Well, there's another gap here, with voicing. Nice loud, but noisy release. Transitions in and out have F2 and F3 close
together, so velar is probably the best guess. Transitioning from back to front velar is apparent from the frequency of
the 'pinch' on either side.
Small Capital I
[ɪ], IPA 319
So if it ends up as a front velar, this vowel must be a front vowel. And it is. Quite front, at least at the beginning. And quite
high, judging from the low F1. The F2 transitions down, where I'd expect an [i] to have more of a steady state or
trend upward, at least until it starts to transition into a following consonant. So this is probably lax/short/whatever you
want to call it.
Script V
[ʋ], IPA 150
Well, this looks short, and vaguely flap-like, being a short, fully voiced 'gap' looking thing. But while the amplitude
attenuation is appropriate for a flap, the sonorousness is not. The formants may dip away, but they don't 'stop' the way
they would/might in a proper flap. Then there are the transitions. All falling. So this looks (bi)labial again. Not really a
good fricative like the previous one, but an approximant-y looking fricative. And again probably labiodental over
bilabial, just because this is English.
Schwa
[ə], IPA 322
Okay, now this looks like a schwa. Formants at 500, 1500, 2500 and--well, short of 3500, but what the heck.
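Those round numbers are what the textbook idealization predicts for a neutral vocal tract: a uniform tube, closed at the glottis and open at the lips, resonates at odd multiples of c/4L. Here is a minimal sketch, assuming a 17.5 cm tube and a sound speed of 35,000 cm/s. Both are illustrative values, not measurements taken from this spectrogram, and the function name is hypothetical.

```python
def neutral_tube_formants(length_cm=17.5, speed_cm_s=35000.0, count=4):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L)."""
    return [(2 * n - 1) * speed_cm_s / (4 * length_cm) for n in range(1, count + 1)]


# Assumed tube length and sound speed; see the note above.
print(neutral_tube_formants())  # [500.0, 1500.0, 2500.0, 3500.0]
```

A shorter tube pushes every resonance up and a longer one pulls them down, which is one reason the exact figures differ from voice to voice.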
Lower-Case N
[n], IPA 116
Another segment that has roughly the duration of a flap, although it might be just a tad long. And considering the
length, it's fully sonorant (with resonances), so probably not a tap. The attenuation therefore is probably close
articulation, and the discontinuity in the frequency/bandwidth of the formants, not to mention the apparent zero
below "F2" suggest a nasal. No evidence of velar pinch or a single F2/F3 range pole, and the pole is up around 1500 Hz,
too high to be my bilabial. Only one choice left.
[aʊ], IPA 304 + 321
Well, abstracting away from the first 50 msec or so, the F1 here is fairly high, indicating a fairly low vowel. The F2 starts
in the central range, and goes down, indicating increasing rounding and/or backing. So this is probably a diphthong
[aʊ]. I wish I could see the F1 dropping a little, to suggest going from low-to-high, vowel height-wise, but whatever. I
don't regard /aO/ a likely diphthong in this case.

[tʰ], IPA 103 + 404
Well, I probably should have marked some preglottalization on the previous vowel, but I guess I just read it as low pitch
when I was doing the transcribing. But looking at it now, it looks creaky, sort of. Anyway, without the release, it would
be hard to tell anything was going on here. But the release at about 1800 msec, is clearly alveolar-looking. Sharp and
abrupt, broad band, followed by something that looks very sibilant. Typical of alveolar plosion noise. Especially if you're
used to seeing my voice in these things.

"We paid too much for those."
Lower Case W
[w], IPA 170
Well, the voicing starts at about 75 msec, but the upper formants don't really kick on for another 50 msec or so. So
there's something here, something less 'open' than a vowel. But it doesn't look gappy or fricativey, so that leaves nasal
or approximant. The fact that the F1 is 'full' and not damped in any obvious way suggests approximant. The F2 starts
very (very, very) low, so this can't be a [j]. The F3 doesn't seem to be doing anything. Certainly not low enough to be an
[ɹ], it doesn't look raised either. So it looks like back and round is the only good choice.
Lower-case I
[i], IPA 301
So even though the F2 is apparently being pulled down on both sides, on the left by the preceding [w] and on the
right by whatever that transition is doing, it reaches an extremum (in this case a maximum) well above 2000 Hz, maybe
even 2200 Hz. Whenever you see a male voice with an F2 above 2200 Hz, it can only really be an [i].
So the astute spectrogram reader will be saying to itself, "[wi]. Hmm. And this is a declarative English sentence, so I'm
probably looking for some kind of NP at the beginning. Hmm."

[pʰ], IPA 101 + 404
Well, we've got a gap--a sudden cessation of resonance at all (or almost all) frequencies. There's some residual voicing
in the low frequencies, but we can ignore that, and there's a little noise from somewhere near F3, but not really enough
to make us pay attention too much. This is obviously some kind of plosive. So check the transitions. F1 doesn't tell us
much, F2 is ambiguous--it's dropping at the left and rising at the right, but it's not pointing at a frequency low enough
to be clearly clear of the alveolar locus, which for me is about 1700 or 1800 Hz. The F3 on the left isn't doing much, and
the F4 is rising. On the right, the F3 and F4 seem to be rising. So on the right we've got things entirely consistent with a
bilabial closure, and on the left we have, well, ambiguity. So I'll take the bilabial and run with it at least until I can't
make a word out of it, or can't make a word with it that makes sense with anything else. Note the VOT, so this is
aspirated.
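The transition reasoning used for plosives throughout these solutions (everything pointing down toward the closure suggests bilabial, F2 and F3 pinching together suggests velar, and alveolar is the default) can be sketched as a small decision rule. This is only a rough illustration: the function name, the argument names, and the 300 Hz pinch threshold are all assumptions, and real spectrograms are rarely this tidy.

```python
def guess_plosive_place(f2_vowel, f2_edge, f3_vowel, f3_edge, pinch_hz=300.0):
    """Guess place from formant values (Hz) in the vowel and at the closure edge.

    Hypothetical helper; pinch_hz is an assumed threshold, not a measured one.
    """
    gap_vowel = f3_vowel - f2_vowel
    gap_edge = f3_edge - f2_edge
    # Velar pinch: F2 and F3 converge as they approach the closure.
    if gap_edge < pinch_hz and gap_edge < gap_vowel:
        return "velar"
    # Bilabial: F2 and F3 both point down toward the closure.
    if f2_edge < f2_vowel and f3_edge < f3_vowel:
        return "bilabial"
    # Otherwise fall back on alveolar as the default guess.
    return "alveolar"


# Illustrative values only: F2 and F3 both lower at the closure edge.
print(guess_plosive_place(1500, 1200, 2500, 2300))
```

The rule deliberately returns a single guess. In practice the left-side and right-side transitions can disagree, as they do here, and the release burst and the lexicon have to break the tie.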

[eɪ], IPA 302 + 319
Well, F1 is sort of mid, I guess, and moves slightly lower, suggesting a mid vowel that moves toward high. The F2 starts
very high, indicating a front vowel, and moves up, indicating fronter. So [eɪ] is the best bet.
Lower-case D
[d], IPA 104
Ah, another gap. But this one should strike you as very long. So maybe something is going on here. Look at that voicing.
It's strong, as if it was really voicing and not just perseveration of the vowel's voicing. So this might be a voiced stop. Or
part of this might be a voiced stop. I arbitrarily segmented the gap along with the voicing, just cuz, but that means we
have to look at only the left-side transitions for a cue to place of this stop (since any right-side transitions will be
covered up by the proposed following stop). So the F2 seems to be rising, but that last pulse looks like it's dropped a
little. F3 doesn't seem to be doing much. F4 seems to be rising, sort of, but I'm not sure what that means. Well, at least
we know it's voiced. Probably not velar. Not amazingly labial looking, and statistically [d] is more likely than [b] post-
vocalically anyway.
So now the astute spectrogram reader (hereafter to be known as the ASR) will be thinking, "[wi] might be a pronoun,
which might be a good subject, and now we have [pʰeɪd], which might just make a decent verb. Hmm."

[tʰ], IPA 103 + 404
So on to the voiceless side of this gap thing. Wow, talk about voiceless. Big huge release followed by very strong
aspiration. How much more voiceless can you get? The noise is [s]-shaped, i.e. broad band, strongest in the very high
frequencies, so this is probably an aspirated [t].
Small Capital I + Upsilon

[ɪʊ], IPA 319 + 321
We have something short of 100 msec of vowel here, starting from the onset of voicing and ending at that, well, let's
just call it a 'discontinuity' for the moment, just shy of 800 msec. At the beginning, the F1 is mid or low-of-mid,
suggesting something moderately high. The F2 is, well low-of-mid and dropping, although the drop may just be
transition. F3 is just hanging out, but F4 is definitely heading downward. So all in all this looks like rounding from
something vaguely high and not at all front to something rounder or backer. Or maybe it's just transition.
The ASR at this point will be recalling that in my west-coast USA voice, /u/ is not particularly round or back, and
following coronals I will have that merged /u-ju/ thing. And this is almost definitely post-coronal.
Lower-case M
[m], IPA 114
Well, there's an abrupt discontinuity as I mentioned before, one that involves reduced amplitude and steadying of
resonant frequencies for not quite 100 msec, until at about 875 msec or so there's a 'symmetrical' moment, where the
amplitude and formant movement suddenly start up again. So we've got something resonant, but of reduced amplitude
(compared to the surrounding vowels). And unlike your average approximant, the edges are quite sharp, and there's no
movement or anything happening during the 'closure'. Which is pretty good indication of a nasal. The transitions all
suggest labial, as does the relatively low pole, or whatever that is, at about 800 or 900 Hz. (My coronal pole is closer to
1000 Hz.)
Turned V
[ʌ], IPA 314
Well, without being distracted by the voicing bar, the F1 here is moving up from a middish kind of vowel to something
that is pretty definitely low. The movement may again just be transition from the preceding labial, but whatever. The
F2 is definitely low to start with and moves, well, to the mid-range. F3 and F4 just don't tell us much. So this is a mid-to-
lowish vowel of indeterminate back-to-centralness. How's that for a description?
Lower-case T
[t], IPA 103
Gap. Plosive. Rising F2, but not much indication of a lowering F3, I guess, so whatever this is it probably ain't labial and
it probably ain't velar. That and the release looks sibilant again.
Esh
[ʃ], IPA 134
Well, I was convinced earlier that this was definitely an esh, but now I just don't see it. I'm tempted to point at that F2
or F3 shaping of the noise, but there's some of that in the previous aspiration that I just called aspiration. The noise is a
little clunky, suggesting spittle more than anything else. And I just don't see the usual sign of esh-ness, which is an
absence of noise in the lower frequencies. So I just don't know. If you think that's just more aspiration, make a word out
of it and get back to me.
Lower-case F
[f], IPA 128
But there's definitely something going on beyond just aspiration, cuz otherwise it (or the fricative release of the
affricate, or whatever) would go on for almost 200 msec, which is just too long. So I think there's a qualitative change
here, in the form of the formants which sort of go away in favor of very diffuse, unshaped, unfiltered noise in this
segment. The formants that we can see all point down on the left, and on the right most of them are rising out of the
fricative, so this is all consistent with labial. Or, since this is an English labial fricative, labiodental. And voiceless, of
course. No one is thinking voiced, right?
The ASR will be going crazy trying to make a quantity (that can be 'paid') out of the sequence [t] vowel [m] vowel [t]
fricative [f], until it starts to sound it out.

[ɹ̩], IPA 151 + 431
Well, look at that F3. Down there below 1800. Way low. Must be an /r/. Since it doesn't seem to have a steady state, it
doesn't really look syllabic, but if I transcribed a vowel in here it would have to go on the wrong side, so I'm doing some
finessing here. There's an /r/. There may be another vowel heading into that flappy thing, but, well, try making a word
out of it.
Eth
[ð], IPA 131
So there's a shortish, vaguely gappy thing, but with full voicing. Could be a nasal flap, but the upper frequencies a) are
there and b) are noisy. Pretty good indicator of some kind of fricative. Or at least something oral. Transitions aren't
telling us anything. I mean nothing, in the sense of no information, and not just ambiguous. Which is often more
consistent with coronal (not to say alveolar) than anything else. So think of all the voiced, coronal, flappy fricatives you
can, and plug one in.
Turned V + Upsilon
[ʌʊ], IPA 314 + 321
Do not ask me why I have this vowel. It might be allophonic (following eth) or it might be isolated to this word, or it
might just be me watching too many Britcoms on TV. But there it is. Mid throughout. Starting just back (or round) of
neutral and moving oh-so-slightly backer or rounder. The extreme length is presumably phrase-final lengthening
(combined with some phonological lengthening, but we'll come back to that) so that's not all that odd. The loss of
amplitude doesn't really look like nasality so much as just overall loss of amplitude, again consistent with just being
phrase-final. But get as far as mid and obviously not front, and you're doing pretty well.
[z̥], IPA 133 + 402
Ah, an [s]. It's broad band, it's concentrated in the very high frequencies. But it's kinda short for something that's being
phrase-finally lengthened. And it's kind of weak for a phrase-final boost. So maybe this is a [z], but devoiced. Which is
what it is. This would explain a) the obvious lack of voicing, b) the weird length (short because it's voiced and
lengthened from short because it's phrase final) and c) the incredible lengthening of the preceding vowel. Calling this a
devoiced (or voiceless) [z] is what we call an 'elegant solution'. Yeehaw.
So, ASR, what did you come up with?


"A highboy is a tall chest."
Glottal Stop
[ʔ], IPA 113
Well, glottal stops are nonphonemic in English, but this is phonetics. There's some noise, or something, in formant-
looking frequencies, before 100 msec, something that looks like a glottal pulse or two (depending on where you look)
just after 100 msec, and then regular voicing kicks in. So unless you believe this is an [h], which I suppose it could be,
you have to account for this. It doesn't look like aspiration (unless you believe this is an [h], which it isn't), so this can't
be the release of a plosive. So if it's not an [h], and it better not be, this is just the glottal 'attack' of a vowel-initial
utterance.
Schwa
[ə], IPA 322
So the vowel here starts just after 100 msec, and goes on, sort of, until almost 200 msec. It's actually very low and back,
but I swear I hear it as a schwa and not at all like an [ɑ], which is really what this looks like to me. So if I were working
from just the spectrogram, this is an incredibly short [ɑ], and the only real reason for it to be so short is that it's
reduced. So I still might call it a schwa. But check out these formants, because they'll come back to haunt us in a
moment.
Hooktop H
[ɦ], IPA 147
Well, there's a dive in amplitude here, and an increase in the noise above 500 Hz, but if you notice, this is pretty much
voiced throughout (though perhaps only passively). The noise is sort of formant-shaped, if you know what I mean,
which is pretty characteristic of [h]. But voiced.

[aɪ], IPA 304 + 319
Well, it starts low (high F1) and gets high (low F1). It starts back(ish) and goes front(er). It's a diphthong. Part of the
reason I chose this word is that I wanted to compare back-to-front diphthongs. So this one starts as [ɑ], which it's not
supposed to, if you follow the usual descriptions of American English. I don't know what I was thinking when I
transcribed this. I do remember I was in a hurry. I must have been cheating.
Lower-case B
[b], IPA 102
Well, the transitions out of the preceding vowel are definitely falling into this. The F2 transition is clearly falling below
1500 Hz, which is already lower than you might expect for an alveolar. So this is pretty clearly a [b]. It's even fully
voiced. Ignore the transitions out. They'll just confuse you.
Lower-case O + Small Capital I

[oɪ], IPA 307 + 319
Well, the F1 is mid (for the most part--I attribute the (relative) lowness at the beginning to the transition) pretty much
throughout. The F2 starts severely low (below 1000 Hz) so this definitely starts out either seriously back or round or
both. But after bottoming out around 525 msecs, it rises in an unbelievably straight line. So this starts out mid and back
and round, i.e. [o] (and not particularly [ɔ], so for once I really was paying attention to the spectrogram), and moves to
something mid(ish) and seriously front. So of the available diphthongs, some variant of [oɪ] is the likely candidate.
Please note the differences between this diphthong and the previous one. If you don't, this whole spectrogram will have
been a waste of time. This one looks like it has two targets and a quick as-the-crow-flies shift in between them. The
other one looks like it has two targets and a smooth acceleration-deceleration interpolation between them. Hmm. And
what the heck is going on with the F2 at the beginning of the [oi]?
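The two shapes contrasted above can be mocked up numerically to make the difference concrete. The sketch below generates a straight-line F2 track and an acceleration-deceleration track between the same two targets. The smoothstep curve (3t^2 - 2t^3) is just one convenient choice for the second shape, and the 900 Hz and 2000 Hz targets are made-up values, not measurements from either diphthong.

```python
def linear_track(start_hz, end_hz, steps):
    """Straight-line ('as-the-crow-flies') movement between two targets."""
    return [start_hz + (end_hz - start_hz) * i / (steps - 1) for i in range(steps)]


def smooth_track(start_hz, end_hz, steps):
    """Acceleration-deceleration movement using the smoothstep curve 3t^2 - 2t^3."""
    track = []
    for i in range(steps):
        t = i / (steps - 1)
        track.append(start_hz + (end_hz - start_hz) * (3 * t**2 - 2 * t**3))
    return track


# Hypothetical F2 targets in Hz, sampled at five evenly spaced points.
print(linear_track(900.0, 2000.0, 5))
print(smooth_track(900.0, 2000.0, 5))
```

Both tracks share their endpoints and their midpoint. The smooth one lags the straight line early on and leads it late, which is the lingering-at-the-targets look described for the first diphthong.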
Barred I
[ɨ], IPA 317
Well, there's something here beyond just transition between the offglide of the previous diphthong and the amplitude
dive between 750 and 775 msec. So there's a short little vowel there. Probably reduced from something.
Lower-case Z
[z], IPA 133
If you look at the very top of the visible part of the spectrogram, there's noise. It's strongest way there and trails off as
you go down in frequency. The noise doesn't seem to be well supported by the resonances, i.e. there's no formant-like
organization to the noise that is continuous with the vowel formants on both sides. So there's got to be a fairly close
articulation here, probably fricative or there wouldn't be that much noise, I suppose. And pretty much voiced
throughout. There's not a lot of really good transitional information in the surrounding vowels, but luckily the noise is
clearly [s]-shaped. But voiced.
Schwa
[ə], IPA 322
Another short, weak little vowel, this time from just before 800 msec to 850 or so. Probably reduced.

[tʰ], IPA 103 + 404
There's a gap between about 850 to about 925 msec. Well, except for that clunk just before 900 msec. Up there around
2800-3600 Hz. That transient thing. A clunk. It might be a closure transient. Or it might just be a clunk. It's probably the
closure transient, but it could just be a wad of spit. So ignore it (unless you think it's a release, in which case you need to
stick in another consonant in there), and concentrate on the release burst and VOT phase. OMG, that's some aspiration.
Note the [s]-shape of the noise immediately following the release transient, and the formant structure (especially in F3
and F4) suggesting high-amplitude aspiration/airflow, rather than a separate fricative phase. The VOT looks like it's a
good 120 msec, which is pretty outrageous. This can only be the result of some kind of stress probably in combination
with being initial in some kind of prosodic phrase.
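VOT is just the interval from the release burst to the onset of voicing, so the arithmetic is simple. A minimal sketch follows, assuming the release falls at about 925 msec (the end of the gap) and voicing starts about 120 msec later. The 1045 msec onset is inferred from those two figures rather than read off the spectrogram, and the 30 msec aspiration threshold is a common rule of thumb used here as an assumption.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT: time from the release burst to the first voicing pulse."""
    return voicing_onset_ms - release_ms


def is_aspirated(vot_ms, threshold_ms=30.0):
    """Treat a long-lag VOT as aspirated; the threshold is an assumed value."""
    return vot_ms >= threshold_ms


# Release time from the text; voicing onset inferred (assumption).
vot = voice_onset_time_ms(release_ms=925.0, voicing_onset_ms=1045.0)
print(vot, is_aspirated(vot))
```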
Script A
[ɑ], IPA 305
On the subject of stress, it's worth noting that the voicing striations in the following vowel are far apart, indicating
relatively low pitch. So disconnect 'high pitch' from your notion of stress, and replace it, if you must, with the notion of
'pitch accent' or 'pitch excursion'. Ennyhoo, we've got a great long vowel here. Quite high F1, so very low vowel. Very
low F2, indicating backness and/or rounding. The backness is enhanced here for contextual reasons (the velarized [l] to
follow, but that's for later). I don't have a phonemic [ɔ] or the Canadian [ɒ], so that limits the choices.
Tilde L (Lower-case L + Mid Tilde)

[ɫ], IPA 209
Some /l/s are darker than others, but for me they're all pretty dark. The only real cue that I can see that there's
something going on here, tho, is the attenuation and narrowing of the formants, especially F3. I can convince myself that
F4 is raised, but since it's pretty depressed in the vowel, it's really just returning to neutral. So there's something here.
And the backness of the vowel suggests one of those modifications before /l/. Sort of. So confluence of possible cues
leads us, possibly, to a good idea, but without a lot of confirmatory 'positive' cues.
Lower-case T
[t], IPA 103
Well, it's mushy, but there you go. The main thing here is that there's a sudden cessation of voicing, and so approaching
the following sibilant you've got a sharp gap. Moving on.
Esh
[ʃ], IPA 134
Well, this is clearly a fricative, and probably sibilant. It's concentrated in the higher frequencies, but not really in the
highest frequencies. I don't know offhand if this is characteristic of /t/ shaping of esh noise in the affricate
(broadening the band of the noise, and pulling up the pole from the F2-F3-range cut-off) or if this is just how esh-noise
is really shaped. [s]-noise is usually concentrated well above 4000 Hz, and this noise is clearly centered between 3000-
4000 Hz. So it's an esh, and this is an affricate. You can tell because of the sharp onset. You get a sharp onset because of
the preceding 'gap'. TMSAISTI.
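One way to put a number on where the noise is 'centered' is the spectral centroid, the amplitude-weighted mean frequency of the noise spectrum. The sketch below swaps in that measure for the eyeballing done above, and uses the 4000 Hz figure from the text as the dividing line. The function names and the sample spectra are hypothetical, and a centroid is a much cruder summary than looking at the whole shape of the noise.

```python
def spectral_centroid_hz(freqs_hz, magnitudes):
    """Amplitude-weighted mean frequency of a magnitude spectrum."""
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("no energy in the spectrum")
    return sum(f * m for f, m in zip(freqs_hz, magnitudes)) / total


def guess_sibilant(centroid_hz, boundary_hz=4000.0):
    """Above the boundary suggests [s]; at or below suggests esh (assumed cutoff)."""
    return "s" if centroid_hz > boundary_hz else "ʃ"


# Made-up spectra: one centered around 3500 Hz, one well above 4000 Hz.
esh_like = spectral_centroid_hz([2000, 3000, 4000, 5000], [1, 4, 4, 1])
s_like = spectral_centroid_hz([3000, 5000, 6000], [1, 3, 4])
print(guess_sibilant(esh_like), guess_sibilant(s_like))
```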
Epsilon
[ɛ], IPA 303
Lowish vowel, but not as low as it could be (as indicated by the high F1), basically neutral F2. Not much going on in F3.
So frankly, this looks like an [a]. But it ain't. I've decided the attenuation that I always get in the middles of final
(stressed) vowels is just the low boundary tone. If this were further back (or rather 'earlier') in the utterance, I'd swear
there had to be a lateral or something here. But since we can attribute the amplitude change to the pitch excursion, we
should.
Lower-case S
[s], IPA 132
Nice little bit of noise. Centered way up above where it was in the esh. So there.
Lower-case T
[t], IPA 103
Well, there's a real gap. What makes this look like a [t] is the release/aspiration phase. It's [s]-shaped, indicating
alveolar airflow.

"That could be one approach."
Egad, this turns out to be a hard spectrogram, because there's a dearth of positive clues, and a lot of ambiguity.
Welcome to the real world, gang.
Eth + Raising Sign

[ð̝], IPA 131 + 429
Well, there's not much here, except the one pitch-period-or-so of frication. But since the vowel here (from 200 to 325
msec or so) doesn't seem to start with creakiness, there must be *something* here. (Or this word would be vowel initial
and get a glottal onset.) The frication (rather than a sharp release) suggests a raised fricative. Utterance initial eths
often 'strengthen' to (inter)dental 'stops', but with fricative rather than transient releases. So safe bet.
Ash
[æ], IPA 325
Lowish vowel (higher than 500 Hz F1), not outrageously back (or the F2 would be lower), so this could be something
lower than mid and fronter than, well, back. Good candidates are [æ] or [ɛ].
Lower-case T
[t], IPA 103
Nice little gap (okay mushy in the lower frequencies, but whatever), apparently voiceless with a nice sharp release. So
this is almost undoubtedly a voiceless plosive of some variety. The release is a little ambiguous, being strongest in the
F3 region. A little low for alveolar, a little high for bilabial or velar. The transitions aren't really helping. No obvious
downtrends like bilabials, no obvious pinch like velars, no obvious lift to F3 like alveolars. But the F2 doesn't seem to
move at all, and it's hovering sort of above 1500 Hz. This is near the locus for alveolar transitions, but it would be nice to
have some serious positive evidence. How about this. How many words like 'thep' or 'thek' can you think of?

[kʰ], IPA 109 + 404
Well, part of the problem with the previous plosive was that there's some coarticulation with this plosive going on. So
this one is definitely a plosive too, but its cues are also sort of ambiguous. The F3 in the following vowel seems to point
up (into the plosive, i.e. it may be falling), and F2 seems to point down, if anything, but not a lot of useful movement is
going on. Okay, well, we know it's a plosive, and that 60 msec or so of aspiration clearly suggests voiceless (and aspirated,
to be precise). The release burst is sort of long, possibly doubled, and centered, sort of, in the F2 region, more
characteristic of velars than alveolars, but bilabials can't be ruled out either, except by the transitions. But speaking of
the transitions, I'm really unhappy at the absolute *absence* of velar pinch in the aspiration. What's up with that?
Schwa
[ə], IPA 322
Short vowel, mostly transition in F2. Call it schwa and move it on.
Lower-case D
[d], IPA 104
Well, if I didn't know better, I'd suggest this was a nasal. But it's not. It's not really resonant enough, considering how
voiced it clearly is. But good guess. So if it isn't a nasal, it must be a plosive. I guess. Again, transitions aren't telling us a
great deal, except that they're 'consistent' with alveolar. At least it can only be voiced.
Lower-case B
[b], IPA 102
Ditto this, as voicing is concerned. This is a long, long stretch for a single voiced consonant. There also seems to be
some kind of amplitude discontinuity just before 700 msec, if that means anything. The release (at about 750 msec) is a
little clean to be alveolar, even though it seems to be broad band and concentrated, if anything, in the high
frequencies. But the transitions into the following vowel are totally inconsistent with that. The F2 clearly starts quite
low, and the F3 and F4 all definitely point down (toward the plosive, i.e. they rise into the vowel), which is most
consistent with bilabials.
Lower-case I
[i], IPA 301
Well, the F1 doesn't move a lot, it seems to stay sort of low. So this is a fairly high vowel throughout. The F2 extremum
(800 to 825 msec) is way high, at least 2200 Hz. The only thing that ever gets that front is [i]. And I don't usually produce
offglides that front. So this is probably [i] and not [ei] or something like that, and the F2 movement is entirely
coarticulation. That's my story.
Lower-case W
[w], IPA 170
Well, the swooping F2 can only mean extreme backness and rounding. The loss of amplitude in the higher frequencies
suggests initial /w/, although there's really nothing here to rule out a full-on [u], since a very tightly-rounded high [u]
could damp the higher frequencies like this. The fact that there's another vowel on the other side might sway the
decision.
Turned V
[ʌ], IPA 314
This is short, and could be another schwa, but the F1 definitely approaches the lower vowel range (compare it with the
first vowel in this spectrogram), and the F2 doesn't rise the way it might. So this is fairly back, even at the end. Lower-
mid, and back.
Lower-case N
[n], IPA 116
This is a nice little nasal. Sharpish edges, but fully voiced and clearly resonating. Nice little zero at about 750 Hz (and a
few more as you go up). The pole is up around 1200 Hz or so. This is a little low for my [n], but is a little high for my [m].
No apparent velar pinch, nor bilabial transitions, so in the end this is consistent with [n]. If it were a little shorter, it
would look like a nasal flap, so it would definitely be alveolar.
Schwa
[ə], IPA 322
Short vowel. Moving on.

[pʰ], IPA 101 + 404
Another plosive, followed by a sharpish release, and some pretty obvious aspiration. So voiceless and probably word
initial. The transitions from the preceding schwa suggest bilabial, which is consistent with the release information. So
that's enough of that.
Turned R
[ɹ], IPA 151
Well, even with the aspiration, we can see a seriously low F3 at the left edge of this vowel. So there must be an /r/ here.
Finally, something straightforward. Please notice that moment just before 1400 msec, where the F2 reaches a local peak.
At about the same moment, there's an amplitude change in the space between F3 and F4, and I think a bandwidth change in F3.
So that's the moment I segmented off the /r/. Something 'changes' here. Exactly what, I'm not sure, but it's as good a
landmark as any.
[oʊ], IPA 307 + 321
So starting at the moment before 1400 msec where everything changes, F1 seems to be mid, and drifts downward a little. So
this vowel goes from mid to higher-mid. The F2 starts sort of low and gets lower. So there's movement from backish to
backer, and/or roundish to rounder. Maybe both. Ignore the F3 transition, which is just transition. Mid and backish to
higher and rounder. It's worth noting, I guess, that there really seem to be two targets here, and two different pitches of
voice too.
Lower-case T
[t], IPA 103
Ah, gaps. Well, this one isn't bad, from the point of view of transitional information. There's a nice little closure
transient, for once, between 1525 and 1550 msecs, and the good news is that the energy in F3 is definitely higher than it
was when the harmonics in F3 seemed to turn off. So what we have here is a rising transition. F2 rises as well, and
that's pretty typical of alveolars. That and the following fricative pretty much make this a dead cert.
Esh
[ʃ], IPA 134
Sibilant fricative, which by the way means it's high amplitude and high frequency. Actually, I think 'sibilant' means that
it's produced by directing a jet of air at the teeth, but this is acoustic phonetics, not articulatory. This fricative, unlike
the prototypical /s/, is not obviously highest in amplitude in the very high frequencies, but definitely has lower
frequency centers, in the F2 and F3 region. Also the energy drops off sharply below the F2 region, leaving basically no
energy below 1500 Hz or so. Typical of Esh. Note also the sharp onset upon release of the preceding plosive. So this and
the preceding plosive together are an affricate [tʃ].

"Roads can be icy in winter."
Turned R
[ɹ], IPA 151
This is what I call a type DA /r/. Well, first things first. The F1 is quite low, the F2 is pretty low, but that F3 is just
freakin' *low*. So it's an /r/. My type D is one that has no steady state, i.e. the F3 is absolutely always moving and seems
to start at its minimum (instead of having an extremum in the middle of something). It's type A in the sense that it has
three serious formants, and a clear duration prior to the kicking in of the upper formants (which I take to be the
'beginning' of the vowel, or the 'release' of some constriction or other associated with the /r/).
Lower-case O
[o], IPA 307
Okay, so once the F2 decides where it's supposed to be going we've got a moderately flat F2, overall. The F3 is just too busy
trying to get back up to where it thinks it was supposed to be all along to tell us anything useful. The F1 is sort of lower
than for a basic mid vowel, but it's definitely higher than it was where it started in the /r/. So mid-to-high vowel.
Mostly back and/or round.
Lower-case D
[d], IPA 104
Voiced plosive. Clear voicing bar lasting quite a while, and no resonance. Transitions (from the [o]) pointing up.
Probably coronal.
Lower-case S
[s], IPA 132
Well, this is a little short fricative. The very high frequency noise is suggestive of an /s/, even though its duration and
overall intensity aren't all that compelling. Still, there's a plosive following, so maybe that's hiding. Also, it's an
incredibly weak position. But whatever. The duration and weakness may be indications of the 'underlying' /z/ (by
which I mean the phonologically predicted [z] allophone of the plural marker), and this may be better transcribed as a
de-voiced [z]. But I didn't.

[kʰ], IPA 109 + 404
Tiny short gap, but there it is, significant enough to get some pressure build-up and a good strong release. The release
is centered in the F2-F3 range, consistent with a velar release. Longish VOT, so this is aspirated.
Schwa
[ə], IPA 322
Tiny short vowel, transcribed as schwa and otherwise not worried about. I'm glad I noticed this. I probably should have
marked it as a barred-i, but it was all I could do to notice it was there.
Lower-case N
[n], IPA 116
From about 375 to about 550 msec or so, there's some serious voicing going on. Most of it has formants and is therefore
resonant. So this is some kind of sonorant. The formants actually look pretty good, but they are separated by areas of
no energy, indicating the presence of zeroes, as in a nasal. So probably that's what this is. The F2 is at about 1500 Hz,
which is about where my F2 usually is for [n].
Lower-case B
[b], IPA 102
This is a shortish gap, but enough to have a clear release to it. The nasal isn't doing much in terms of transitions, but
that's the nature of nasals. For place information we'll look at the burst and the following transitions. And it looks to
me like those transitions are pointing down (that is, rising, out of the gap), which indicates bilabial. Of course that's just
a guess, since if you look at where the F2 and F3 end up, it's not like they could be heading anywhere but up out of the
gap. But they look sort of smooth, so I'll say bilabial. If they were coronal, the F2 would start a trifle lower, and if they
were velar, I'd think the F3 might start a little lower.
Lower-case I
[i], IPA 301
Well, we've got a low, low F1, and a high, high F2. So this is [i].
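The vowel reasoning used throughout (low F1 means a high vowel, high F2 means a front vowel) can be sketched as a pair of threshold checks. The cutoffs below are assumptions, loosely pitched at an adult male voice and chosen only for illustration. They are not values measured from these spectrograms, and the function name is hypothetical.

```python
def describe_vowel(f1_hz, f2_hz):
    """Map F1/F2 (Hz) to rough height and backness labels using assumed cutoffs."""
    # F1 is inversely related to vowel height.
    if f1_hz < 400:
        height = "high"
    elif f1_hz < 650:
        height = "mid"
    else:
        height = "low"
    # F2 tracks frontness (and is lowered by rounding, which this ignores).
    if f2_hz > 1800:
        backness = "front"
    elif f2_hz > 1200:
        backness = "central"
    else:
        backness = "back"
    return height, backness


# Illustrative values: a low F1 with an F2 above 2000 Hz.
print(describe_vowel(300, 2200))
```

Rounding and backness both lower F2, so a 'back' label from this rule could equally mean 'round', which is why so many of the descriptions above say 'back and/or round'.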
Glottal Stop
[ʔ], IPA 113
Well, not so much a stop, in the sense of a gap, and certainly not plosive in the usual sense, but the absence of a voicing
bar (sort of) and the irregular pulse pattern in the upper frequencies is indicative of creak or glottality or whatever you
want to call it. So we've probably got a vowel-initial word coming up, probably phrase initial too.

[aɪ], IPA 304 + 319
F1 starts (by 800 msec) up around 800 Hz, while the F2 is just low of 1500 Hz. Then by the time you get to 900 msec, the
F1 has dropped and the F2 is crossing 2000 Hz. So this goes from sort of low and neutral to high and front. Classic /aj/
diphthong.
Lower-case S
[s], IPA 132
Now that is a fricative. Look at that. Probably longer than 100 msec of voiceless fricative, with more noise on either side.
The noise is extremely broad-band, and centered in the higher frequencies. This suggests [s], which is what I'll say it is,
but frankly with something this long I'd expect the amplitude, especially in the higher frequencies, to be a lot stronger.
Then again, maybe it is, off the top of the spectrogram. Or would have been if we hadn't low-passed before sampling.
Nyquist, you know.
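To put a number on the Nyquist remark: sampling at a given rate can only represent frequencies below half that rate, so anything higher has to be filtered out first. A minimal sketch in Python; the sampling rates and the 6000 Hz test frequency are made up for illustration, since the rate actually used for these spectrograms isn't stated here.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at a given sampling rate."""
    return sample_rate_hz / 2.0


def is_representable(freq_hz, sample_rate_hz):
    """True if energy at freq_hz lies below the Nyquist frequency."""
    return freq_hz < nyquist_hz(sample_rate_hz)


# Hypothetical sampling rates, for illustration only.
for rate in (10000, 22050):
    print(rate, nyquist_hz(rate), is_representable(6000, rate))
```

So frication energy centered above half the sampling rate simply cannot show up in the picture, however strong it was at the microphone.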
Lower-case I
[i], IPA 301
Another incredibly low F1 accompanied by an incredibly high F2. (By incredibly high, I mean well up above 2000 Hz, at
least in my voice). But then after reaching a max around 1150 msec, it starts to drop. F1 seems to moderate about that
moment as well, so I'm calling that a separate segment. The pattern of the movement is just not characteristic of a
transition, so this must be a separate thing. He says.
Barred I
[ɨ], IPA 317
So what is it? Heck if I know. The F1 is still sort of low, but the bandwidth has changed. The F2 is moving without any
indication of trying to get anywhere in particular either in terms of having an 'inflection point' or even a place where it
starts to slow down as it approaches its target. Almost as if it doesn't have a target. Which is one of the descriptions of
vowel-reduction (or rather the acoustic manifestation of vowel reduction) in English. This one I have marked as a
barred-i, because it seems to me the F2 stays above 1500 Hz, and the F1 is always below 500, so the F2 is always closer to
F3 than F1, and following Keating et al. (1994), I transcribe it as barred-i.
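The barred-i versus schwa decision described above boils down to one comparison of distances. A minimal sketch, assuming formant values in Hz; the function name is mine, and the example F3 of 2500 Hz is an assumed neutral value rather than something measured here.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick a symbol for a reduced vowel.

    Barred-i if F2 is closer to F3 than to F1, schwa otherwise.
    """
    closer_to_f3 = abs(f3_hz - f2_hz) < abs(f2_hz - f1_hz)
    return "ɨ" if closer_to_f3 else "ə"


# F1 below 500 and F2 above 1500, with an assumed F3 of 2500 Hz.
print(reduced_vowel_symbol(450, 1600, 2500))
# A dead-neutral vowel is a tie, which this sketch resolves as schwa.
print(reduced_vowel_symbol(500, 1500, 2500))
```

With F3 near 2500 Hz, any vowel whose F1 stays below 500 and whose F2 stays above 1500 comes out as barred-i, which matches the reasoning in the paragraph.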
Lower-case N
[n], IPA 116
This one is less obviously zero-ey than the preceding one, but you can still see the total reduction in amplitude
characteristic of nasals. I'd probably mistake this one for an [m], since what we can see of the F2 is low (just above 1000
Hz). But the F3 transition into the following vowel doesn't look bilabial at all. I think this is actually a nasalized
flap kind of thing, and at the tail end (as the tongue is retreating) the oral resonance can be shaped by the lip rounding (in
preparation for the following sound). But I'm not sure what's actually going on here. Definitely a nasal, and, well, you'll
get further if you guess [n] than [m], if you are trying to make a sentence out of all this.
Lower-case W
[w], IPA 170
Well, let's start with the F3. Looks like it's rising. Don't ask me why. The following vowel seems to have a neutral F3 and
an F4, where the F4 is continuous with this F3. So I choose to ignore the evidence of the F3 and look at the rest of it. The
F2 is about as low as it can possibly get, well down below 1000 Hz. So this must be as back and as round as anything I can
produce. The transition out from this minimum is pretty straight, which is more characteristic of a /w/
retreat-from-rounding transition than anything else. The raised F3 might indicate [l] (in this case, a dark [l], of course), but since I
can't tell if that's a raised F3 or a really low F4 (conceivably consistent with rounding) I'll again ignore the F3 and just
assume this is a [w].
Barred I
[ɨ], IPA 317
Short vowel. This one is even more schwa-like than the one I marked as a schwa. What was I thinking?
Lower-case N
[n], IPA 116
It might be easy to miss, but there's resonances at F2 and F3 above the voicing bar/F1 thing. If those were absent, I
might be inclined to regard the F1 thingy as perseverative voicing into an oral plosive. But the upper resonances are
there, so there must be a nasal in here. Again, the F2 is up around 1500 Hz, indicative of [n].

[tʰ], IPA 103 + 404
But there's a really sharp burst following the nasal thing, so there must be an oral plosive in here as well. The [s] shaped
release indicates a [t], and the VOT is just long enough to strike the ear as aspirated.

[ɹ]̩ , IPA 151 + 431
Well, call it [əɹ] if you want to, since the F3 moves to its minimum rather than being dead flat. But since this is
unstressed, maybe that's just characteristic of unstressed syllabic /r/. Or maybe I'm just making this all up. Low F3
indicates the /r/ at least at the end, and the only available vowel is really schwa (F1 about 500 Hz, F2 about 1500 Hz,
what else could it be?). So this is either a syllabic /r/ or a schwa-r diphthong of some kind. Not what I call a robust
contrast in American English.

Winnipeg, Manitoba
CANADA R3T 5V5
"This show's been selling out."
Eth + Raising Sign

[ð]̝ , IPA 131 + 429
I'm going to have to study my fricatives, because they don't seem to be coming out the way I expect anymore. But this
is understandable: being initial in the utterance, you don't necessarily expect it to look like a weak little fricative. But I'm
getting ahead of myself. There's some serious voicing going on for close to 100 msec, but it isn't fully sonorant--no
resonances above the voicing bar, and the amplitude is off compared to what is clearly a vowel coming up. So probably
not a sonorant. If it were it could only be a nasal, but with that much voicing there ought to be something above it in
the way of a pole. So this probably isn't a true sonorant. So it could be a voiced plosive, but we all know (don't we) that
English doesn't typically have voiced plosives in initial position. Which leaves voiced fricatives. The transienty releasey
thing at about 175 msec and the next few pulses look noisy and disorganized, not like nice clean resonance, which is
compatible with the fricative theory. So of the available voiced fricatives, the raised (plosivized?) Eth is the most likely
just given that this is English.
Small Capital I
[ɪ], IPA 319
So looking at the resonances, there's an F1 somewhere below 500 Hz. Not outrageously low, but definitely lower than
mid, which makes this a higher-than-mid vowel. But not outrageously high. F2 is nice and high, up around 1750 or 1800
Hz, indicating something quite front, but not outrageously front.
Lower-Case S
[s], IPA 132
Well, this at least we know is a fricative. It's fairly broad band and concentrated in the very high frequencies. It's a little
suspicious in that you can see the F2 travelling through it, and there isn't much in the way of energy below the F2
resonance. Usually that kind of drop off in the amplitude profile is characteristic of an Esh, but the center of the energy
here is just too high for that. So it must be an [s].
Esh
[ʃ], IPA 134
On the other hand, this is more plausibly an Esh. The center of the energy is still a little high, but at least it's in the
visible frequencies. (By the way, in case anyone is wondering, I regard the 0-4500 Hz range as the 'visible frequencies' just
because it's what I'm used to looking at. Most linguistic information is below 3000 Hz for men and about 4000 for women,
except for the odd high center and noise information like here.) I'd like somebody to check these "assimilations",
maybe in Lisa Zsiga's work, about whether this is how these sequences look--the alveolar loses its low frequencies and
the postalveolar's center is pulled up. But that's how it looks to me.
Lower-Case O
[oʊ], IPA 307 + 321
Well, there's a mid-looking F1 for you. And pretty freaking flat too. F2 starts, well, sort of neutral--I think this is the
frontness of the preceding fricative at war with the backness/rounding of the vowel, and gets slightly backer and
rounder. I've really only got one vowel that does this, and it's /o/. So there you go.
Lower-Case Z
[z]̥ , IPA 133 + 402
Now this is what an [s] looks like. Well, actually, [s]s are longer and louder. This is a devoiced [z]. Note the sibilant
pattern, but short.
Lower-Case P
[p], IPA 101
Well, there's a nice little gap. You can see a closing transient at about 725 msec, which is interesting. Note that it's only
really obvious as a blip at the bottom. That's going to be important. Just before 800 msec, there's a release--it's very
weak, and noiseless, so it's probably not coronal. It's way too clean to be a velar release. So that leaves (bi)labial, which
gets further support from the low frequency clunk at the closure. There's not a lot that clunks down there at the low
frequencies except labial stuff.
Barred I
[ɨ], IPA 317
Vowels that are this short and relatively low amplitude are almost always a) lax, b) reduced and/or c) just plain not
worth bothering with. Take it as some version of schwa and move on.
Lower-Case N
[n], IPA 116
Well, following that moment of obvious vowel, whatever it was, there's some more voicing, but much weaker resonances,
and there's something 'discontinuous' about the resonances. This should be ringing some kind of bell. Sonorant,
weaker-than-vowel resonances, nice little zero down there around 800-900 Hz..... So this must be a nasal. If you know
my voice, this can't be [m], because what resonance there is is up about 1400-1500 Hz, and my [m] resonance is closer to
1000-1100 Hz. Eng is unlikely, given that F2 transition in the barred I or whatever it is. Not only is there no evidence of
velar pinch, there's just no way for it to be compatible with velar pinch.
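The place argument for nasals in this voice can be sketched as a simple decision on the pole frequency. This is a speaker-specific illustration only: the 1250 Hz boundary is my assumption (the midpoint between the [m] range of 1000-1100 Hz and the [n] range of 1400-1500 Hz quoted above), and the function name is hypothetical.

```python
def nasal_place_guess(pole_hz, velar_pinch=False, boundary_hz=1250.0):
    """Guess nasal place from the main pole above F1.

    velar_pinch: True if the neighbouring vowel transitions show
    F2/F3 converging, which is taken as the cue for eng.
    boundary_hz: assumed dividing line between [m] and [n] poles.
    """
    if velar_pinch:
        return "ŋ"
    return "m" if pole_hz < boundary_hz else "n"


print(nasal_place_guess(1450))        # pole in the quoted [n] range
print(nasal_place_guess(1050))        # pole in the quoted [m] range
print(nasal_place_guess(1450, True))  # pinch overrides the pole value
```

For any other speaker the ranges, and so the boundary, would need to be re-estimated.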
Lower-Case S
[s], IPA 132
See how much longer this fricative is? I wish it were higher in amplitude, but oh well. But it has the typical profile of an
[s].
Epsilon
[ɛ], IPA 303
Well, it's tough to see the F1, but I'm thinking it's that thing that moves sort of upward from about 550 Hz at about 1100
msec to about, oh, 750 Hz about 50 msec later. Which makes this a lower-than-mid vowel, but in no way 'low'. F2 starts
sort of neutral but is being drawn down by the low target in the following segment. So wishing really, really hard, you
come to believe this is a lowish, but not amazingly backish vowel. So that would make it Ash or Epsilon. And this isn't
really long enough to be Ash.
Tilde L (Dark L)
[ɫ], IPA 209
So there's no zero, at least in the low frequencies, but a distinct loss of amplitude. So this isn't a nasal, but there's
something here that's relatively 'close' and damping the amplitude. Sonorant consonants spring to mind. That low F2
suggests something back, which suggests [w] or dark [l]. I'd hope the F2 of [w] would get lower than this, but you can't
always count on that. But [w] wouldn't have such a weird (and asymmetrical) effect on the onset transition (compared
to the offset transition). In a perfect world the F3 or F4 would be raised to tell you this really was dark [l], but in life
there is ambiguity.
Barred I
[ɨ], IPA 317
And here's another short vowel, and again it's mostly transition. Skip it. After you notice the velar pinch.
Eng
[ŋ], IPA 119
And front velar pinch at that! With a nice little zero, fuzzy F1 especially on the left. So probably nasal, likely a coda
nasal, and almost definitely velar.
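Velar pinch just means F2 and F3 converging as you approach the consonant. A rough sketch of how one might check for it in a pair of formant tracks; the track values and the 500 Hz gap threshold are hypothetical, not read off this spectrogram.

```python
def shows_pinch(f2_track_hz, f3_track_hz, max_gap_hz=500.0):
    """True if F2 and F3 converge toward the end of the tracks.

    Both tracks are equal-length lists of formant values sampled
    through the vowel, ending at the consonant closure.
    """
    if len(f2_track_hz) != len(f3_track_hz) or len(f2_track_hz) < 2:
        raise ValueError("need two equal-length tracks of at least 2 points")
    gaps = [f3 - f2 for f2, f3 in zip(f2_track_hz, f3_track_hz)]
    # The gap must shrink over the vowel and end up small.
    return gaps[-1] < gaps[0] and gaps[-1] <= max_gap_hz


# Hypothetical tracks: F2 rising and F3 falling into the closure.
pinched = shows_pinch([1800, 2000, 2200], [2700, 2600, 2500])
# Hypothetical tracks: both formants flat, no convergence.
flat = shows_pinch([1500, 1500, 1500], [2500, 2500, 2500])
print(pinched, flat)
```

For transitions out of a consonant, the same check applies with the tracks reversed.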
Lower-Case A
[aʊ], IPA 304 + 321
Well, the F1, what we can see of it, is pretty high, indicating a very low vowel. Starts frontish and ends quite back and
round. And this is actually quite long. I wish my voicing didn't die out like it always does on these final falling
intonation thingies, or you'd see the F1 and F2 targets toward the end more clearly. But of the three 'true' diphthongs,
only one goes from front to back.
Lower-Case T
[tʰ], IPA 103 + 404
Someone pointed out to me recently that I've been marking these as 'aspirated', when in fact they're just strongly
released--'aspiration' by definition is VOT, and you can't have VOT without some V coming on at some point. And since
this is utterance final, this is just a release. But the release is [s] shaped, which is typical of alveolar releases (for extra
credit, explain why). And voiceless, of course.

"You played that song again."
Lower-case J
[j], IPA 153
Well, I don't know if this is a separate moment or not, but the first few glottal pulses in this thing are greater in
amplitude (or the bandwidths of the formants are wider, or something) than in the transition/vowel thing. So I
segmented it. The F1 is quite low, as F1s go, and the F2 is up above 2000 Hz, up in [i] territory. Coming before a vowel or
something, I follow IPA common practice and transcribe it as an approximant.
Barred U
[ʉ], IPA 318
I'm not sure if I've ever used this symbol before in my own voice. I'd ordinarily transcribe this vowel as a barred-i or
something, but I'm pretty sure it's rounded or rounding, partly in deference to the underlying rounding I usually lose
for /u/, but also in anticipation of the following bilabial. F1 is still low (high vowel), round(ing) but not at all back.

[pʰ], IPA 101 + 404
Well, there's a gap of some kind, although it looks like there's some low-frequency noise coming from somewhere. But
the release is too sharp not to call this a plosive, so I choose to ignore that low frequency stuff. The release (at 325 msec
or so) is sharp and, well, plosive, followed by some high-energy aspiration and a 75 msec VOT. Aspirated. As for place,
the transitions suggest labial, or round, especially the F4 during the aspiration.
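VOT is just the interval from the release to the onset of voicing, so the call above can be written out as arithmetic. A minimal sketch: the 400 msec voicing onset is simply 325 plus the 75 msec VOT quoted above, and the 30 msec long-lag threshold is an assumed rule of thumb, not a value from this text.

```python
def vot_ms(release_ms, voicing_onset_ms):
    """Voice onset time: voicing onset minus plosive release."""
    return voicing_onset_ms - release_ms


def voicing_category(vot, long_lag_threshold_ms=30.0):
    """Coarse label for a plosive given its VOT in msec."""
    if vot < 0:
        return "prevoiced"    # voicing starts before the release
    if vot >= long_lag_threshold_ms:
        return "aspirated"    # long lag
    return "unaspirated"      # short lag


release = 325.0        # release time read off the spectrogram
voicing_onset = 400.0  # release plus the 75 msec VOT
vot = vot_ms(release, voicing_onset)
print(vot, voicing_category(vot))
```

Note that this only makes sense when some voicing follows the release.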
Belted L
[ɬ], IPA 148
Okay, I transcribed this in the spectrogram as a dark l, which it is, but it's also vastly voiceless (due mostly to the
aspiration of the preceding plosive), so as long as I was playing with my new Unicode markup database, I figured I'd go
for yet another symbol I'm not sure I've ever used before. Belted-l represents a voiceless lateral approximant and/or a
voiceless lateral fricative. More than one person has argued that there is not and cannot be a contrast between those
two things, so this symbol seems to have avoided the IPA's attempt to disambiguate its approximant and fricative
symbols. Now, how do I know there's anything here at all? First, there's the matter of the aspiration, which seems both
long and loud for just plain aspiration. Second, the F3 in the release is at about 2750 Hz, which is distinctly higher than
the 2550 Hz it is in the vowel. Which for me is as much evidence of an [l] as I'm likely to get. Positing a dark /l/ (or a [w],
I guess) here will also allow me to explain the otherwise weird displacement of the transition into the following vowel
rather than having it happen all at once on the release of the plosive. TMSAISTI.

[eɪ], IPA 302 + 319
Well, abstracting away from the transition for a second, we've still got something that's pretty obviously a diphthong. It
ends in a high front semi-vowel definitely, and the first part of the vowel looks like the F1 is definitely in mid-range rather
than high or low, so we're looking at either [eI] or [oI]. I'm hoping that in an [oI] the F2 would stay lower longer, just
because the target would be lower, but I'm not positive.
Lower-case D
[d], IPA 104
Well, another one of these mushy gappy things, this one looking more voiced than the other one. The F3 transition is kind
of ambiguous, as is the F2, but the F2 transition clearly 'stops' before it gets too far below 1800 Hz, suggesting an
alveolar locus. But....
Eth + Raising Sign

[ð]̝ , IPA 131 + 429
...the quality of the voicing changes abruptly just after 600 msec. There's a few pulses of (weak but regular) voicing, and
where a good sharp alveolar release should be, there's a couple pulses worth of mushy stuff. Which suggests a fricative
release. And the only reason for a /d/ to have a fricative release is if there's a fricative in there somewhere.
Ash
[æ], IPA 325
I've always been bothered by the spelling of 'ash', since it's the 'English' spelling/calque of aesc. Which brings up
another point. I use the symbol names from Pullum & Ladusaw (1996). The IPA does not have official names for most of
its symbols. Unicode very carefully names each of its symbols, with long descriptive names that are meant to avoid
as much ambiguity as possible. I think they call this 'Latin small letter ae' or 'Latin small ligature ae'. I'll keep using
'ash', after P&L, much as I hate it. Anyway, we've got something that approaches very low for a vowel (high F1) round
about 750 msec or so, but the F2 indicates something not at all back. I'm wondering how often I see that falling F3 thing
during my /ae/s. I think I see it a lot, but I'm not sure.
Lower-case T
[t], IPA 103
Ah, gaps. This one seems to be slightly pre-glottalized, or at least comes at the end of something with very low pitch.
Now that I look at them again, the transitions are a little ambiguous, which is a good indicator that 'alveolar' is as good
a guess as anything else. That and the glottalization, which is more prominent/common with alveolars than other
places. There's a hint of a release at about 900 msec up at the top of the spectrogram, which I took to be indicative of
plosion.
Lower-case S
[s], IPA 132
Meanwhile, after all that plosion, there's some herkin' fricative going on. Very long, quite high in amplitude, very broad
band and concentrated in the highest frequencies. Very typical of [s].
Script A + Tilde
[ɑ]̃ , IPA 305 + 424
Well, the nasalization is not represented by the usual zero, but just by the general fuzziness of the formant structure.
Which is not helped by the high frequency F0, but there you go. The F1 you can see is about as high as it can get, and
the F2 is about as low as it can get and still be F2, if you follow me. So this is a very low, very back vowel. But, unlike my
Canadian colleagues, not round. How you can tell that I have no idea, since I don't have [ɒ] for you to compare it with.
(Hmm, on my browser, in my preferred font (Gentium) this symbol isn't popping up. It's turned-script-a, or supposed to
be.).
Eng
[ŋ], IPA 119
Well, there's definitely a change in amplitude, along with the wiping out of F1, both pretty typical of nasals. It doesn't
look at all velar (compare the velar transitions for the following consonant), but this is where top down knowledge of
English will come in handy--bilabial [m] is unlikely here, and [n] would likely flap in this environment.
Barred I
[ɨ], IPA 317
Short vowel, F2 closer to F3 than F1. If I had to call it a real vowel, I'd have called it small-cap I, but given the H pitch
accent on the preceding vowel and the length of the following, I'm betting we should see this as stressless, if not
reduced, regardless of what we think it's supposed to be.
Lower-case K
[k], IPA 109
Well, you can't get any more velar, and front velar at that, than this. Mostly voiceless gap, with a short VOT.
Epsilon
[ɛ], IPA 303
Well, this looks like another [ae]. It's hard to tell exactly what the F1 is doing since it looks like it's headed straight up in
the picture, but you can also see the bandwidth fuzzing out on you (an indicator of nasality which obviously I missed
the first time around when I did the transcription), so it could be doing just about anything. Well, anything mid-to-low
and not at all back.
Lower-case N
[n], IPA 116
Transitions aren't helpful again, and there isn't enough information in terms of poles to tell what's going on. There's
definitely something long and voiced here, probably sonorant judging by the regularity of the voicing, but beyond that
I have no idea. In the absence of a better guess, pick the alveolar.

"Laughter can soothe and heal."

Tilde L (Dark L)
[ɫ], IPA 209
Voicing begins, without an obvious release or anything, at about 50 msec. At about 125 msec there's a sudden change in
both the overall amount of energy and the frequencies of the formants. So I'd say we have a segment from 50 to 125
msec or so. Could be a nasal, but there's a little too much energy in the upper formants (and no clear zero in the space
between F1 and F2, and for that matter, too clear and strong an F1). So we're probably looking at an approximant. And
probably not a glide-like approximant, owing to the discontinuity with the following vowel. So this is probably /r/ or
/l/. The F1 is below 500 Hz. The F2 is at about 1250 Hz. And F3 is up around 3750 Hz at least. That's a raised F3, typical of
laterals. The F3 of an initial /r/ would presumably be down around 1500 or 1600 Hz. Note the F2, well below neutral,
indicating velarization. This might be 'lighter' than a coda /l/, but there's no way you can interpret this as anything
except velarized.
Ash
[æ], IPA 325
Nice clear formants. F1 is very high, let's say 800 Hz or so. The F2 is a hair lower than neutral, let's say 1300 or so. Not
quite low enough to be really, really back, but not really what you'd call amazingly front. I'll have to listen to this vowel
again, but I'd say this was pretty central(ized), judging from the F2 frequency. Remember how centralized, relative to
other dialects, this vowel is in the western US.
Lower-Case F
[f], IPA 128
This is an interesting lesson in acoustics. The periodicity in F3 seems to leave off at about 225 msec. But the voicing
doesn't end for almost 50 more msecs. So what seems to be happening is that as the constriction increases, the upper-
frequency harmonics are getting suppressed. I'm not sure what lip-teeth compression does to radiation, but I'm
wondering if there's either an acoustic or an aerodynamic change in spectral slope here. Anyway, there's friction
starting about 225 msec or so, and clearly voiceless friction starting 275 msec or so. It's not particularly loud friction, so
this isn't sibilant. There's some organization in the resonant frequencies, but not the kind of support you'd get with [h].
So this is probably a voiceless labiodental or interdental. I'm not sure how to tell the difference. The formant
transitions aren't really giving us much information. Odds are against the interdental, just because it's a coda of a
stressed syllable. I think.
Lower-Case T
[t], IPA 103
Nice little voiceless gap from about 300 msec to the release just after 350 msec. Interestingly enough, there seems to be
an alveolar-shaped (that is, broad band, tilted to the very high frequencies) *closure* transient. There's a nice sharp
release at about 350-375 msec or so. The release is a little odd, centered in the F3 region, or possibly showing signs of
F3/F4 pinch. F3/F4 pinch is sometimes associated with dentality (velar pinch is F2/F3), but I haven't seen it enough to
be sure about its value as a cue. But the center frequency is a little low for an alveolar burst, and might be into the
velar-burst range. But there's no involvement with F2, which you'd expect with a velar, there's no pinch, and the burst is
sharp and fairly clean--not at all mushy or doubley-looking. So this is an alveolar burst. The lowness of the center
might have to do with the upcoming low F3 (i.e. a long front cavity?) or liprounding (i.e. a long front cavity?). But I
don't know.

[ɹ]̩ , IPA 151 + 431
There's a vowel here, sandwiched between two consonants. The F1 is fairly low. The F2 is a little high of neutral. The F3
is way freaking low. Barely below 2000 Hz, which is why I never say 'below 2000 Hz', but it is. So this is an approximant
/r/ in syllable nucleus/local sonority peak position.

[kʰ], IPA 109 + 404
If you're a fan of these things, you know this is my voice. And you know my velar stops (and my stops in general) are
kind of mushy. So the noise here is distracting. There's some low frequency, but not much. The main centers to the
noise are in the pinched F2/F3 range, and up in F4, if that's what that is. So while it does have some formant shaping, it
doesn't really look like an [h]. The velar pinch on both sides is a pretty strong cue, and the noise in the F2/F3 pinched
range is typical of a velar (fricative?). Note also the double-yness (though quick) of the release, or whatever you want to
call it.
Barred I
[ɨ], IPA 317
Tiny short vowel, barely four or five pulses long. We don't waste a lot of time on these. Transcribed as a reduced vowel,
following Keating et al (1994), barred-i iff the F2 is closer to the F3 than the F1, schwa elsewise.
Lower-Case N
[n], IPA 116
On the other hand, the voicing continues even though at about 600 msec the amplitude takes a sharp dive. This is a nice
nasal-y looking thing. Reduced overall amplitude, reduced formant amplitudes, and a nice clean zero between the lower
resonances. The F1 is mostly neutral or low of neutral, typical of nasals, and the F2 is nice and high (relative to nasal
poles) at about 1500, which in my voice is a very nice, clean alveolar [n]. (Velars show more F2/F3 pinchiness than we
see here, and labials always have their pole much lower, around 1000 Hz or even just below.) That "clunk" at about 625
msec (in the F2, and from the F3 all the way up) is a phenomenon known technically in the biz as a "clunk". Clunks can
happen any time, but for some reason they often happen in nasals. They're due to something viscous (saliva or some
other fluid somewhere in the vocal tract) flying around somewhere at the wrong moment. Distracting in a spectrogram,
but so obviously an anomaly (unless it happens where you might be wondering if it's a release transient or something)
that they can safely be ignored.
Lower-Case S
[s], IPA 132
So, even if you're a beginner, you should at this point be able to tell that there's something going on from about 750
msec all the way to 900 msec. It's voiceless (no striations at the bottom). It's noisy--the energy is snowy and random,
not organized in nice striations. It's very broad band--there's no formanty-organization. It's centered (darkest) in the
very high frequencies. So this is very loud (dark), very high pitched (as noise goes), and very long. Sounds like a
classic sibilant to me. In fact clearly an [s]. Even though the energy cuts off (sort of) below F2, if this were an esh the
noise would be centered lower down, in the F3 F2 range, down to the cut-off frequency (around F2).
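The "where is the noise centered" question can be made a little more concrete with an amplitude-weighted mean frequency. A rough sketch only: the spectra below are invented, and the 3500 Hz boundary between [s] and esh is an assumption for illustration, not a measured value for this voice.

```python
def spectral_centroid_hz(freqs_hz, amplitudes):
    """Amplitude-weighted mean frequency of a noise spectrum."""
    total = sum(amplitudes)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(f * a for f, a in zip(freqs_hz, amplitudes)) / total


def sibilant_guess(centroid_hz, boundary_hz=3500.0):
    """[s] if the noise is centered high, esh if centered lower."""
    return "s" if centroid_hz >= boundary_hz else "ʃ"


freqs = [2000, 3000, 4000, 5000]
high_tilted = spectral_centroid_hz(freqs, [0, 1, 2, 3])  # energy piles up at the top
low_tilted = spectral_centroid_hz(freqs, [3, 2, 1, 0])   # energy piles up lower down
print(sibilant_guess(high_tilted), sibilant_guess(low_tilted))
```

A low-frequency cut-off alone doesn't decide the matter; it's the location of the bulk of the energy that does the work in the argument above.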
Turned M
[ɯ], IPA 316
If you are a beginner, or if you're not familiar with the west coast US vowel system (or Japanese...) this vowel will
mystify you. But I'll try to explain. Starting about 900 msec all the way to about 1150 or 1175 msec, there's some very
high pitched voicing going on. The F1 (lowest formant) is sort of low, at least lower than neutral (around 500 Hz), so this
vowel is higher-than-mid. The F2 starts basically neutral (near 1500 Hz), maybe a little lower (backer) and moves a little
lower (backer). F3 is nice and flat in more or less its neutral range (about 2500 Hz) as is the F4. So we've got a highish,
central-to-back and moving backer-or-rounder vowel. So this is my /u/. There's nothing particularly round about it, or
alternatively it might be round but then there's nothing particularly back about it. So take your pick. I've transcribed it
as back and unround, but that's my intuition, not anything measurable. In southern California, the primary effect is
definitely unrounding, although the 'centralizing' of the F2 is achieved in other dialects of US English by centralizing
the tongue but maintaining rounding. Go fig.
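The 'neutral' values used above (around 500, 1500 and 2500 Hz) are what the textbook uniform-tube idealization gives for a vocal tract closed at the glottis and open at the lips: resonances at odd multiples of c/4L. A minimal sketch; the 17.5 cm tract length and 35000 cm/s speed of sound are conventional round numbers I am assuming, not measurements of this speaker.

```python
def neutral_formants(tract_length_cm=17.5,
                     speed_of_sound_cm_s=35000.0,
                     count=3):
    """Resonances of a uniform tube closed at one end, open at the other.

    F_n = (2n - 1) * c / (4 * L), for n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * speed_of_sound_cm_s / (4 * tract_length_cm)
            for n in range(1, count + 1)]


print(neutral_formants())
```

Formants above or below these values are what get described as 'high of neutral' or 'low of neutral' in the readings here.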
Eth
[ð], IPA 131
Well, the energy in the very low frequencies is 'voicing bar', even though the frequency is such you can't really see the
individual striations. So it's voiced, whatever it is. It could be a mushy stop, but the noise isn't really organized the way
I'd expect. So it's probably a fricative. Voiced. Definitely not sibilant. So again we've got something that is most likely
labiodental or interdental. Here, the transitions are being a little more helpful. There's definitely a 'lift' in F4, and no
evidence of anything remotely labial about any of the transitions. So on the balance, the (inter)dental is more likely
here based on the cues, although it's pretty unlikely statistically. That should make this word really easy to identify--no
near neighbors... ;-)
Schwa
[ə], IPA 322
I probably should have transcribed a glottal stop in here, as there's definitely some creaky voice going on here. But oh
well. The vowel here is sort of short and the creakiness doesn't make it any easier. So in the end, given the great length
of the preceding and following vowels, I'd say this was reduced and move on.
Lower-Case N
[n], IPA 116
Well, it's not as long as the last one, but this is another nasal. From about 1350 to 1400 msec. Or thereabouts. Following
those three or four clear periods of voicing. The F2 again is up around 1500 Hz, at least if you can see it. So this really
can only be [n].
Lower-Case D
[d], IPA 104
Keating et al (1994) distinguished closures from releases (in similar fashion to Steriade's Aperture Theory model of
stops), which would be a handy thing to be able to do here. This is an oral release (see the nice sharp burst) of
something that doesn't seem to have much in the way of an oral stop component. Nasal stop with oral release. But
that's not an option the IPA gives us (I'm not suggesting it should, it just underscores the theoretical constraints
imposed by a strictly segmental model like IPA transcription), so there you go. The release characteristics are consistent
with alveolar. In case homorganicity wasn't an option. Given the following segment, it probably was.
Lower-Case H
[h], IPA 146
This will be controversial. Because there's some very clear voicing starting from the release of the previous stop, at 1400
msec, that goes on for almost 100 msec. There's a dip in the voicing amplitude from 1475 to 1550 msec or so. And then
the voicing comes back up. But if you look at the upper frequencies, there's no periodicity to speak of in the formants.
So what we have here is a mostly voiced [h]. Which I should have transcribed as such, but I was paying more attention
to the noise than I was the voicing bar. It's not unusual for intervocalic /h/s to be fully voiced, but this is just bizarre.
But it's an [h], voiced or not. Note the formanty organization of F2 and F3.
Lower-Case I
[i], IPA 301
Well, this will be controversial too. I'm guessing the 'real' vowel is really just the beginning of this, i.e. when the voicing
kicks back on at 1550 or whenever, up to when the F2 starts to dive, around 1675 msec or so, and the rest of the vowel is
just transition. But whatever. The F1 is low, the F2 is unbelievably high (especially in the preceding /h/), which is typical
of [i]. The diving F2 is transition to the following consonant.
Tilde L (Dark L)
[ɫ], IPA 209
Speaking of which, this is weird again. There's a sharp discontinuity in the F3 and F4, which makes this look like a
sudden aperture change, but the F2 and F1 keep their energy and maintain it longer than that. So I don't know where
the 'boundary' is. There probably isn't one (again one of the limitations of the segmental model). But by the time you
get to the end, the F1 has moderated to neutral, the F2 has lowered to about 1000 Hz which clearly indicates backing or
velarization, the F3 has risen again to well above the neutral frequency it has for most of the vowels. So this is another
velarized /l/.

Gentium, an IPA-enabled Unicode-compliant font by Victor Gaultney
"There's a big gap in it."
Eth, IPA 131,

[D], [ð]
Well, it looks like a plosive. There's a couple pulses of something that isn't quite vowel-like right around 100 msec, and
it ain't glottalization, so this is the clue that there's an actual consonant at the beginning here. It can't be a nasal or
an approximant, unless the voicing just happens to click on for the 10-15 msecs that there's something there, which I
guess is possible, but unlikely. The edge of the vowel doesn't look like a proper plosive burst, so that leaves some kind of
fricative. At least underlyingly. So it's probably a non-sibilant (and non-/h/) fricative. The good news is that the
transitions look alveolar, so if you think this is a /d/, you're close.

[EÕ], [ɛ˞]
The F3 drops linearly from the onset to the offset of this vowel, which is typical of /r/-colo(u)ring before /r/. And you
only really get r-colo(u)ring before /r/, so there must be an /r/ in here somewhere. But I get ahead of myself. The F1
here is a little low, which would suggest this is a mid-to-high vowel, but it probably isn't. It's very front, so given the /r/
colo(u)ring the vowel must be /I/ or /E/. Okay, it looks like /I/. But oh well.
Turned R, IPA 151,

[Ļ], [ɹ]
So where's the F3 below 2000 Hz? In your dreams.
[z], [z]
Well, it looks voiced. And it looks fricative. And it's strongest in the highest frequencies.
Schwa, IPA 322,

[Ŧ], [ə]
It's short. It's mid, vaguely central and pretty neutral F3-wise.
Lower-case B, IPA 102,

[b], [b] This gap is preceded by some pretty significant downward-trending transitions, suggesting bilabiality. The gap
seems to last almost 150 msec, and the first half is voiced. That's pretty long, and you might think that this is two
phones. Well, think what you want when you try to make a sentence out of this mess. So voiced and probably bilabial.
The release is definitely not aspirated, without a huge burst, and without a lot in the way of specific transitions. So
conservatively this is one big [b].

[I], [ɪ]
Well, if you compare the F1 of this vowel with the F1 of the previous vowels, and wish really really hard, it's just a little
bit lower, so whatever those vowels were, this one has to be higher. And way front.

[g], [g]
Another very long gap, the first half is voiced. The transitions in (and out) suggest velar on both sides, so once again,
you might think this is two phones. This time you'd be right, but I'm not sure how you'd tell the difference.

[k], [k]
The F2/F3 pinch in the transitions out suggests that this plosive (or these plosives) is still velar. The multiple burst is
also compatible with that hypothesis.
Ash, IPA 325,

[Q], [æ]
Okay, this looks like a diphthong, [ia] or something. Which it might be, but that's not how I think I talk. The preceding
velar is very front, in part due to coarticulation or coproduction or whatever with the surrounding vowels. So the vowel
here starts front, and even though the F2 dives, it doesn't dive past neutral. So from about 900 msec to the end of the
vowel, F1 is very high and F2 is fairly neutral, and both are pretty steady. So taking that as our cue, we've got a very low
vowel, that is vaguely central, which is pretty much where /ae/ is, so I take the rest of it to be transitional.
Lower-case P, IPA 101,

[p], [p]
Another gap, but this one doesn't start terribly sharply. I have to wonder if that isn't more typical of coda consonants
than onset ones, but maybe I'm the only one. For its duration it doesn't build up a lot of pressure to get released into
the following vowel, so this is either a really weak onset or a really long coda. But that's just me musing about prosody.
There's not a lot of transitional information I can take away and be sure of. Compared to the previous longish gaps, the
voicing in this one dies pretty quickly, so this is probably voiceless, whatever it is.

[i], [ɨ]
Well, this is another ambiguous F1. The F2 starts quite front, and drops into the neutral range. The F2 tells us nothing.
So I'd say schwa or /I/ or something, hence barred-i.
Fish-hook R + Tilde, IPA 124 + 424,

[R)], [ɾ]̃
Shortish gap, like a flap, but there's resonances in the formant regions. Which makes this look like a nasal. Which would
explain why the bandwidths of the F1 on either side are so goofy.
[I], [ɪ]
So, no F1 information. F2 vaguely front, but not very. I probably should have transcribed this as another barred-i, but I
didn't. But how many VnVC sequences can you think of?
Lower-case T + Right Superscript H, IPA 103 + 404,

[tH], [tʰ]
Shortish gap, with a sharp, [s]-shaped release. So even though the transitions are not telling us much, this really can
only be a /t/.

"They're made of reclaimed wood."
Trivium: It was going to be 'they're made from reclaimed barnboard', but I couldn't get a "fr" I was happy with, and I
decided 'barn board' would just be too much rhoticity for anyone but me to be interested in.
Eth, IPA 131,

[D], [ð]
Starting just a bit after 100 msec, there's some fairly significant voicing going on, but without a lot in the way of
resonance. So there's something voiced going on here. I probably should have transcribed it with a Raising Sign, since it
looks pretty much like a stopped one of these, whatever it is, but I didn't. Maybe I saw more frication on the original. Oh
well. Clearly voiced, probably obstruent, not at all fricative, so either this is a plosive or a very weak (in the sense of
having no noise--in other words, "strong" in an absolute consonant/fortition kind of way) fricative. The F3 is a little
ambiguous, depending on how you interpret the transition at 200 msec, but isn't overwhelmingly bilabial or velar
looking. The F2 transition around 200 msec definitely points away from velar and doesn't really point 'low' enough to
indicate bilabial. So this is probably coronal. So if it isn't Eth (or [d]), I don't know what else it could be. I've seen high
F4s associated with dentals, but not consistently enough to be sure if that's "always" true.

[EÕ], [ɛ˞]
Following convention, I've transcribed a lax vowel before an /r/ (not to give anything away, but there you go), as well as
with rhoticity (ditto). And if you want to compare it with "true" [eI] diphthongs coming up later (ditto ditto), you'll see
that the F1 is in fact just a little bit higher (indicating a slightly lower vowel, although still not what I'd really want to
call 'mid'), and the F2 is definitely a little lower (though that may have to do with the rhoticity, i.e. the F3 pushing down
into the F2 space), which may indicate something less front. So you pick. Still a very front vowel, not at all low, with a
mid-ish rather than a high-ish F1. And r-colo(u)red.
Turned R, IPA 151,

[Ļ], [ɹ]
I was convinced that the F3 (and F1) leveled off a little here, while the F2 kept on sinking, so warranting a distinct
segment, but now I'm not so sure. But this F3 approaching 2000 Hz or below is usually a pretty good indicator of a North
American English /r/ floating around somewhere.
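
The F3 cue just described is simple enough to sketch in code. This is only a minimal Python sketch, assuming you already have F3 measurements in Hz from somewhere. The function name and the list-of-numbers input are hypothetical, and the 2000 Hz figure is the rule of thumb from the text for this speaker, not a general constant.

```python
# Rule-of-thumb ceiling taken from the text; it is speaker-dependent.
R_F3_CEILING_HZ = 2000.0

def looks_like_r(f3_track_hz, ceiling_hz=R_F3_CEILING_HZ):
    """Return True if any F3 sample sits at or below the ceiling."""
    return min(f3_track_hz) <= ceiling_hz

# Hypothetical F3 tracks: one sinking toward /r/, one staying neutral.
print(looks_like_r([2600.0, 2400.0, 2150.0, 1950.0]))
print(looks_like_r([2550.0, 2500.0, 2600.0]))
```

Treating "approaching 2000 Hz" as "at or below 2000 Hz" is a simplification; a real reading would also weigh how F1 and F2 behave.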

[m], [m]
Well, here's a pretty good nasal, for those of you who were clamo(u)ring for one. It's clearly of lesser amplitude than the
vowels on either side, has good sharp amplitude boundaries with very sharp transitions on either side. Good healthy
zeroes between most of the formants, and in particular a band of distinctly low energy (i.e. a zero) below 1000 Hz. All
those zeroes indicate side resonance which is usually a good indicator of nasality, especially combined with the
transitional information. The pole (weak as it is) above the low zero is about 1000 Hz, which is typical of my bilabial
[m]s.

[eI], [eɪ]
Well, the F1 here is just low of neutral, so let's call this a vaguely high or higher-mid vowel in traditional terms. Even
abstracting away from the absurd F2 transition, there's indications that there's a less-front onset before the very front
position the F2 assumes around 450 msec. Which is about as much of a distinct [e] you're ever going to see. Acoustically,
these always struck me as more like mid-onglided [i]s than [e] with an offglide. Clearly if there's anything you want to
call 'steady', it's the 'glide' portion, and not the nucleus. But that's a fight I need to have with Hillenbrand inter alia once
I have my new data properly measured.
Fish-hook R, IPA 124,

[R], [ɾ]
Ah, good ol' flap. This is a very (very) short interruption to the resonance on either side, like a tiny plosive--the closest
thing you'll ever see to a concrete demonstration of one-mouth/two-mouth theory in spectrograms. This gap is even
short for a flap, looking like it's around 20 msec long or so. Apparently fully voiced throughout, there's nonetheless
some disturbance to the airflow or something following the 'release', such that it looks vaguely aspirated (though not
long enough to count as aspirated in the "chinchilla/Japanese quail" sense, even if it weren't clearly voiced). The F2 and
F3 transitions into and out of it are helpful, pointing fairly clearly to alveolar (no pinch, with an F2 locus around 1750
Hz or so), even though we don't usually see amazing alveolar transitions with flaps (they just happen too quickly and
ballistically for everything to transition correctly). Maybe this is really a short /d/. But more on the notion of short [d]s
later.
Schwa, IPA 322,

[Ŧ], [ə]
Short(ish) vowel, nothing but transition, and probably r-colo(u)red to boot. Oops. F2 slightly front, so this is probably
better transcribed as an r-colo(u)red barred-i. But the point is that this vowel is all transition, with nothing to indicate a
robust 'target'. So call it reduced (or schwa) and move on.

[v], [v]
Well, for once in our lives (actually, I'm usually careful about these things in spectrograms), a truly fricative fricative.
This fricative is even voiced (judging by the striations in the low frequencies). In the upper frequencies (above 1000 Hz)
there's some very broad-band energy, and except for it apparently being strongest in the F2-F3 range (the F3 range is
low for, um, other reasons, so this is basically just the F2 range), there's not a lot of spectral 'shaping' to the noise (note
that F4 in the surrounding vowels is kind of dead in the fricatives). The energy in the noise is not what I'd call
outrageously strong, either. So this probably isn't a sibilant. (If it were a sibilant, it might well be Esh, or rather Yogh,
but the zero you see below the F2 range is a little low for those.) So if this isn't a sibilant, it must be a voiced [h], an Eth,
or a [v]. It's probably not an /h/, which would have more obvious-looking resonances. So we're left with one of the
others. Unfortunately, due to the F3 frequency, there's not a lot to tell us that it's [v] and not Eth. The transitions are
equivocal, and the F4 isn't telling us much.
Turned R, IPA 151,

[Ļ], [ɹ]
Well, there's a canonical /r/ for you. F3 well below 2000 Hz, well-defined F1 and F2 below it. There's even some energy
in F4, sort of. But the frequency of the F3 is the giveaway. I've placed the boundary between the /r/ and the following
vowel a little arbitrarily, but it seems like the F3 starts to move for real at about 725 msec. The F2 (which is also moving
throughout, but more obviously than the F3) I think changes slope about that same moment, so I figure that's where
the boundary might be, if we have to put it somewhere.

[i], [i]
Well, abstracting away from the necessary transition from the preceding /r/, we've got something that's heading to
higher than mid (I think the F1 goes from lower-than-where-it's-been to even lower than that--but not by much, I
admit), and extremely front (high F2). When the F2 gets above 2000 Hz, I always assume it's an /i/. Even very strong /j/
offglides don't often get up that high. On the other hand, it would be easy to be distracted by the onset F2 frequency,
which is a little depressed due to the preceding /r/, into thinking this was an /ei/ or something. But it's not.

[kH], [kʰ]
Well, in addition to the F2 screaming up to 2200 Hz or wherever, the F3 doesn't seem to be racing out of its way, such
that the two look like they'd collide. Or 'pinch', depending on your point of view. Hence this is probably a (very front)
velar. It's apparently voiceless, and there's a fair amount of noise following the release. (I take the release to be that
transient at about 850 msec. If that's the case, then the actual VOT here is close to 100 msec, which is fairly long, even
for a velar. But nonetheless, it falls well to the aspirated side of things.)
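
The VOT arithmetic can be written out as a tiny Python sketch. The release time is the 850 msec from the text; the voicing onset of roughly 950 msec is only what a VOT "close to 100 msec" implies, and the 30 msec aspiration boundary is an assumption of this sketch, not a figure from the text. Function names are hypothetical.

```python
# Assumed boundary between unaspirated and aspirated; not from the text.
ASPIRATION_THRESHOLD_MS = 30.0

def voice_onset_time(release_ms, voicing_onset_ms):
    """VOT is the lag from the plosive release to the start of voicing."""
    return voicing_onset_ms - release_ms

def is_aspirated(vot_ms, threshold_ms=ASPIRATION_THRESHOLD_MS):
    """Call a plosive aspirated if its VOT reaches the assumed threshold."""
    return vot_ms >= threshold_ms

vot = voice_onset_time(850.0, 950.0)  # onset of ~950 msec is approximate
print(vot, is_aspirated(vot))
```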
Lower-case L + Mid Tilde + Under-Ring, IPA 155 + 428 + 402,

[lō8], [ɫ]̥
Well, this isn't the darkest of /l/s and I probably shouldn't have used the mid-tilde for velarization here. The
voicelessness is due to the prolonged aspiration of the preceding /k/, but it is typical for the second member of this
kind of cluster to be decidedly voiceless, extending the aspiration well through the 'duration' of the second segment. If
you believe in segments. There's a change of some kind at about 890 msec or somewhere, probably not a closure
moment, but I don't know what it is. But there's also a change in the trajectories of all the formants at or near the
moment when voicing finally kicks in, suggesting a target independent of the aspiration. The F2 is rising out of it, but
the F3 and F4 are both raised during that last bit of aspiration, so /l/ is probably the best guess.

[eI], [eɪ]
This one is a little more canonical. The F1 seems to be pretty mid throughout, the F2 rising from vaguely neutral to
quite front (before dropping sharply into the next segment). So mid and front-moving. Not a lot of choices.

[m], [m]
The sudden change in the bandwidth of F1, which corresponds, more or less, to the bottom of the F2 drop (which begins
at a moment of sudden loss of amplitude a few msecs earlier), tells us something is going on here. And the best guess is
some kind of nasal, given the zero-ey quality of the upper frequencies. The F2 transitions, and the 1000 Hz or so pole
suggest the bilabial nasal, although there's more energy at 1500 Hz (the /n/ range for me) than I'd normally like. But
there may be a reason for that....
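
The pole-to-place reasoning for nasals can be sketched as a nearest-match lookup. This is a minimal Python sketch under stated assumptions: the 1000 Hz and 1500 Hz targets are the values this speaker reports for his own [m] and [n], the 200 Hz tolerance is made up for illustration, and the function name is hypothetical.

```python
# Pole frequencies the author reports for his own nasals (speaker-specific).
NASAL_POLES_HZ = {"m": 1000.0, "n": 1500.0}

def nasal_from_pole(pole_hz, tolerance_hz=200.0):
    """Return the nasal whose typical pole is nearest, or None if none is close.

    tolerance_hz is an assumed value, not something given in the text.
    """
    symbol, target = min(NASAL_POLES_HZ.items(),
                         key=lambda item: abs(item[1] - pole_hz))
    return symbol if abs(target - pole_hz) <= tolerance_hz else None

print(nasal_from_pole(1000.0))
print(nasal_from_pole(1450.0))
print(nasal_from_pole(1250.0))  # halfway between the two targets: too far from both
```

A nasal with energy at both poles, like the one described here, is exactly the case where a single-pole lookup falls short.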
Lower-case D + Subscript Arch, IPA 104 + 432,

[d9], [d̯]
Okay, controversy. It's typical for combinations of nasal and voiced plosive to look like this. A stretch of nice normal
nasal, followed (if you're lucky) by a minuscule gap, and an obviously oral release. Which is what we have here. So I've
decided that since this isn't a flap in the usual sense, but a plosive whose closure is hidden by (or coarticulated with, or
something) the preceding nasal, it's just 'short'. So I used a breve. Okay, bad solution. I'll keep thinking. (I think the
UCLAbet recommendations distinguish these with a nasal closure and an oral release, but the IPA doesn't have different
symbols for closure and release phases.) Ennyhoo, since the preceding nasal is bilabial, this plosive is obviously bilabial,
right? Oh, I'm sorry. This is actually a coronal release. Its coronality may be indicated by the 1500 Hz pole that doesn't
seem to belong in the otherwise bilabial nasal, or the apparent F2 transition (has anyone done an acoustic study of
nasal-to-nasal coarticulation?). That transition (as we approach the gap at 1100 msec, from that strongish harmonic at
850 Hz or so up to the 1500 Hz mark) may be the tongue blade rising behind the bilabial closure. Or maybe not. I don't
know. The release noise is centered in the F2/F3 range, which makes it look velar rather than either bilabial or alveolar.
I hope that has something to do with tongue body movement anticipating the upcoming [w], but hope may not get me
too far. The release spectrum could conceivably be bilabial, and it definitely doesn't look classically alveolar. But the
transitions coming out of it are definitely not bilabial. They look velar, actually. But I think they *could* be alveolar,
shaped (that is, lowered) by the combination of the bilabial release beforehand and the rounding coming up. But I'm
really treading water here, since the lip rounding clearly gets rounder until you get to about 1200 msec. So those of you
who think this is a [g], go to the top of the class, but then try to make sense of the whole spectrogram. Sometimes, life
just doesn't work out the way we want it to.
Lower-case W, IPA 170,

[w], [w]
Well, ignoring the release of the preceding plosive, and sort of ignoring the amazing transition going on, there's a
funny drop in amplitude about 25 msec after the release of the plosive, which continues to almost 1250 msec. Lower
amplitude like this probably means a very close articulation, and therefore this is probably an approximant. (Nasal is
another possibility, but the formants are too well defined, I think.) F1 is just sitting there about 400 Hz, but the F2 is as
low as my F2 ever gets, down there around 800 Hz. Then there's this huge expanse of nothing until you get up to the F3,
which is, well, slightly raised. What I can see of it, at least. So this is either another /l/, or I could take my cue and claim
that the F4 seems to be a little low, and wonder if this isn't a [w]. Which would be right, but don't ask me how you
should be able to tell for sure. I need to work on my approximants.
Upsilon, IPA 321,

[U], [ʊ]
Well, it's a little higher than mid, judging from the F1, although it might be lowering ever so slightly between 1300 and
1400 msec. The F2 goes from incredibly low (back and round) to sort of nowhere. So we've got something that starts
vaguely in the higher-backer-rounder part of the vowel space, and moves, if anywhere, towards schwa. Sounds like an
Upsilon to me.

[d], [d]
Well, there's definitely about 100 msec of voicing starting just shy of 1400 msec. So there's something there. It probably
isn't an approximant, due to the sudden loss of upper frequencies. It probably isn't a nasal, which probably wouldn't
kick on that abruptly. There'd be *some* anticipatory nasalization, wouldn't there? And there's absolutely no evidence
of any higher resonance going on. Looks like a pretty good candidate for a plosive, although the voicing goes on an
awfully long time. But if it's a plosive, it doesn't look velar (no pinch, to speak of) or bilabial (F2 is definitely heading
up), and, lo and behold, doesn't it seem just to reach 1700 Hz before the closure hits? How's that for an example of locus
equation theory? So this is probably alveolar. And voiced.

"It could have been the ghost."

[?], [ʔ]
The real evidence for the glottal stop here is the irregularity of the glottal pulses (voicing) at the beginning of the
vowel. After tight closure, it takes a while for normal, regular vibration to get going, and the pulses start out like a
sputtering engine.
Small Capital I + Subscript Tilde, IPA 319 + 406,

[I0], [ɪ]̰
Vowel, nice clear formants, good amplitude. So with vowels, we look first at the formants. The F1 (first, lowest, formant)
is below the mid-range (500-600 Hz or so), so this vowel is moderately high. The second formant moves from the
beginning of this vowel, between 125 and 150 msec, to the end, about 100 msec later. It starts quite high (mid-range for
F2 is 1500-1600 Hz), up around 2000, and falls, slightly, to 1800 Hz or so by the end. So the high F2 tells us that this vowel
is relatively front, quite front actually. The movement might be either 'real' movement in the vowel (i.e. evidence of a
diphthong, or moving vowel of some kind), or it might be transitional, indicating a movement from (near) a vowel
target to (near) a consonant target. Since there's a gap following, we can take at least some of this movement as
transitional. So ignoring the transitional information (which will tell us about the place of the following consonant), we
have a definitely front, fairly high vowel. This is English, so haul out your vowel charts and find a vowel in the front,
mid-to-high range. I marked this vowel as creaky-voiced in part because of the creakiness from the preceding glottal
stop, but also because some creakiness creeps in in the last few pulses as well. Hmm.
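
The F1/F2 reasoning above can be sketched as a small lookup. This is a minimal Python sketch, assuming the neutral ranges quoted in the text (500-600 Hz for F1, 1500-1600 Hz for F2), which describe this speaker only. The function name and labels are hypothetical, and the F1 of 400 Hz in the example is an illustrative guess; the text only says F1 is below mid-range.

```python
F1_NEUTRAL_HZ = (500.0, 600.0)    # mid-range for F1, per the text
F2_NEUTRAL_HZ = (1500.0, 1600.0)  # mid-range for F2, per the text

def describe_vowel(f1_hz, f2_hz):
    """Map F1/F2 to rough height and backness labels.

    Low F1 means a higher vowel; high F2 means a fronter vowel.
    """
    if f1_hz < F1_NEUTRAL_HZ[0]:
        height = "higher than mid"
    elif f1_hz > F1_NEUTRAL_HZ[1]:
        height = "lower than mid"
    else:
        height = "mid"

    if f2_hz > F2_NEUTRAL_HZ[1]:
        backness = "front"
    elif f2_hz < F2_NEUTRAL_HZ[0]:
        backness = "back"
    else:
        backness = "central"
    return height, backness

# F2 of 2000 Hz is from the text; F1 of 400 Hz is an assumed example value.
print(describe_vowel(400.0, 2000.0))
```

A low F2 can reflect rounding as well as backness, so the "back" label really means "back and/or round".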
[t], [t]
A plosive is formed by a complete obstruction of the airway. The result is a gap in the spectrogram corresponding to the
duration of the closure when there can be no resonance. So here we have one. Actually two, as there's a pretty abrupt
transient suggesting a release at about 300 msec, but let's just concentrate on the first one. Since the next thing isn't
resonant, but plosive, the transitions out of this first gap won't tell us anything about its place. Since it's a gap, we don't
have a lot of information during it to tell us anything, so we have to look to the transitions into the closure. As we
mentioned before, it looks like the F2 is lowering, but not a whole lot. The lowest visible frequency that F2 ever gets to
in the preceding vowel is about 1800 Hz. The F3 isn't doing a whole lot (actually, it seems to be lowering just a bit, but
since that would lead us to the wrong answer I'm going to ignore it. X-) If the F2 is heading to a frequency just above
the mid-F2 mark (1500-1600 or so), this is pretty typical of alveolars. However, since the F2 is lowering we might wonder
if this isn't bilabial. It's probably not because even though the F3 might be lowering, the F1 and F4 are definitely not.
Bilabial stricture can only lower formants, and that movement in F3 just isn't convincing, and F4 isn't cooperating at
all. Velar is ruled out because, whatever the F3 is doing, it isn't "pinching" up with the F2, which would be more typical
of velar transitions. The transient at about 300 msec is consistent with the alveolar conclusion, in that it is not strongest
in the low frequencies or equally strong at all frequencies (more consistent with bilabial bursts), nor is it strongest in
the F2-F3 range (more consistent with velars). So even if all signs don't point to alveolar, at least they don't obviously
point anywhere else.
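
The process of elimination above can be sketched as a rough decision function. This is a minimal Python sketch and not a real classifier: the function name is hypothetical, every threshold is an assumption chosen for illustration, and the alveolar locus of 1700 Hz is a round figure in the "just above the mid-F2 mark" region. It compares F2 and F3 at the vowel midpoint with their values at the edge of the closure.

```python
ALVEOLAR_LOCUS_HZ = 1700.0  # assumed; "just above the mid-F2 mark"

def guess_place(f2_mid, f2_edge, f3_mid, f3_edge,
                pinch_gap_hz=400.0, min_fall_hz=100.0, locus_tol_hz=200.0):
    """Guess plosive place from F2/F3 at vowel midpoint vs. closure edge."""
    # Velar pinch: F2 and F3 end up close together at the closure.
    if f3_edge - f2_edge <= pinch_gap_hz:
        return "velar"
    # Bilabial stricture can only lower formants: want clear falls in both.
    if f2_mid - f2_edge >= min_fall_hz and f3_mid - f3_edge >= min_fall_hz:
        return "bilabial"
    # Alveolar: F2 heads for a locus a bit above neutral.
    if abs(f2_edge - ALVEOLAR_LOCUS_HZ) <= locus_tol_hz:
        return "alveolar"
    return "unclear"

# F2 falling from about 2000 to 1800 Hz as in the text; F3 values are made up
# to represent "lowering just a bit".
print(guess_place(2000.0, 1800.0, 2500.0, 2450.0))
```

The order of the checks mirrors the reasoning in the text: rule velar and bilabial in or out first, and settle for alveolar when nothing points anywhere else.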

[kH], [kʰ]
The trick to spectrogram reading is both positive and negative reasoning. You hope that all the cues you can see will
point to something in particular, or *not* point anywhere else. Everything should be 'consistent with' whatever
hypothesis you're entertaining, whether it is 'evidence for', not evidence against, or ambivalent or equivocal. So the
evidence here is that from 300 to about 375 msec there's another gap, so we're dealing with a plosive. There's no
voicing bar, so it's voiceless. The transitions into the following vowel are slightly obscured due to the long period of
voicelessness (i.e. aspiration) from the release of the stop to somewhere between 425-450 msec. That's not outrageously
long for aspiration, but it's definitely well into the obviously aspirated category. So we know it's phonemically voiceless
and probably initial in its syllable (as opposed to suffixed to the end of the preceding syllable). The transitions aren't
amazingly helpful. The F1 doesn't seem to be doing much. The F2 seems to be dropping from just below 1500 Hz or so.
F3, what you can see of it, seems to be stuck at about 2400 Hz or so, but rises a little once voicing starts. So the falling F2
doesn't suggest bilabial, and its starting frequency doesn't suggest alveolar. So we might entertain velar, but then we'd
hope to see positive evidence, in the form of velar pinch. I'd be much happier if the F3 definitely started lower and rose
into the vowel. But it doesn't. So while the transitions don't point toward velar, they definitely point away from bilabial or
alveolar. So there's one more thing to consider, which is that burst. It's double. There's a sharp transient at about 375
msec, and another 'twin' transient (actually a bit stronger, but the same 'shape' in terms of frequency) about halfway
between the first one and 400 msec. (I'd say about 387 msec, but you aren't supposed to make those kinds of scale
judg(e)ments when you only have a scale in 100 msec intervals.) Double bursts are absolutely characteristic of
velars (you see them occasionally with bilabials and laminals/dentals but they just don't look like this). And the bursts
are definitely centered (strongest) in the F2 range, rather than lower (more consistent with bilabials) or higher
(alveolar). So we have a couple of positive cues (double burst and burst frequency) and a couple of negative cues
(formants don't suggest anything clearly other than velar). So this is probably velar. And aspirated, as mentioned
earlier.
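
The burst cues can be collected in the same positive-and-negative spirit. This is a minimal Python sketch, assuming you can estimate where the burst energy peaks relative to F2 and F3. The function name, the cue labels and the 1500 Hz burst peak in the example are hypothetical; the F2 and F3 values are roughly those mentioned in the text.

```python
def burst_place_cues(burst_peak_hz, f2_hz, f3_hz, double_burst=False):
    """Collect place cues from a plosive burst as (cue, place) pairs."""
    cues = []
    if burst_peak_hz < f2_hz:
        cues.append(("energy below F2", "bilabial"))
    elif burst_peak_hz <= f3_hz:
        cues.append(("energy in F2-F3 range", "velar"))
    else:
        cues.append(("energy above F3", "alveolar"))
    if double_burst:
        cues.append(("double burst", "velar"))
    return cues

# F2 just below 1500 Hz and F3 about 2400 Hz, as in the text;
# the burst peak of 1500 Hz is an assumed value.
for cue, place in burst_place_cues(1500.0, 1450.0, 2400.0, double_burst=True):
    print(cue, "->", place)
```

Returning a list of cues rather than a single answer keeps the evidence visible, which suits the "consistent with" style of argument used here.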
Upsilon, IPA 321,

[U], [ʊ]
This is very reminiscent of schwa, but then think about where upsilon is on the typical English vowel chart. Okay, this
F1 is about the same frequency as in the preceding vowel which we decided was just a little high. The F2 is below
neutral, but nowhere near as low as a seriously round vowel might be, considering where the F1 is. The F3 is pretty
neutral. So we've got a backish, highish vowel, that might be moving toward middish and centralish as it goes on. So
starting in the higher-backer space and moving towards schwa. The movement toward schwa is pretty clear evidence of
one of our English short/lax/non-peripheral vowels, and the only one that's high and back is transcribed with upsilon.
Voilà.
Fish-hook R, IPA 124,

[R], [ɾ]
So at about 525 or so, we've got a tiny short little moment of radically decreased amplitude. It's almost a gap, but
there's some resonance during, and it's just too short to be a decent plosive. So this is a brief 'interruption' to the
sonority or resonance going on around it. Sounds like a flap to me. So this is phonemically a /t/ or /d/, but since this is
phonetics, it's a flap/tap thing.
Lower-case H, IPA 146,

[h], [h]
So, this has formant structure, but no striations. It's all fuzzy and noisy. So this is a fricative. But with formants. What
does that tell you?
Okay, I'll tell you. This is /h/. Voiceless source, either glottal or epiglottal friction, exciting all the open cavities of the
vocal tract, just as voicing would. Review source-filter theory.
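
The visual cues used so far (gap, noise, formants, striations) can be combined into a rough manner guess. This is a minimal Python sketch with a hypothetical function name and hand-set yes/no inputs; it only restates the rules of thumb from these pages and ignores all the borderline cases.

```python
def guess_manner(has_gap, has_noise, has_formants, has_striations):
    """Guess a segment's manner from four yes/no visual cues."""
    if has_gap:
        return "plosive"
    # Noise shaped by formants, with no voicing striations: /h/.
    if has_noise and has_formants and not has_striations:
        return "/h/"
    if has_noise:
        return "voiced fricative" if has_striations else "voiceless fricative"
    if has_formants and has_striations:
        return "sonorant (vowel, approximant or nasal)"
    return "unclear"

# Formant structure, fuzzy and noisy, no striations.
print(guess_manner(has_gap=False, has_noise=True,
                   has_formants=True, has_striations=False))
```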
Schwa, IPA 322,

[侷, [ə]
Okay, so this is mid-to-high again, but just slightly more mid than the previous vowels. The F2 is dead smack in the
neutral range. The F3 is a little low, but what do you want.

[v], [v]
Approximant V, [ʋ], IPA 150, is a possibility here, except that it is usually thought of as sonorant and frictionless. This looks
more like a plosive, except that there's some evidence (depending on your screen resolution) of friction here, between
700 and 750 msec, at least, which is what I take to be the duration of the segment, more or less. Hence fricative. And it
looks weak, but striated at the bottom, so there you go. The transitions aren't particularly useful, but if anything they're
trending downward, all of them, so this looks vaguely labial.
Lower-case B, IPA 102,

[b], [b]
So there's evidence of a plosive, in particular the burst. Now, having clued you in about double bursts, you will no doubt
really, really want this to be a double burst. And perhaps it is, but it's not the same kind. The two bursty things don't
look the same, and neither looks velar. The first one (and sort of the second one) is strongest in the low frequencies,
rather than the F2/F3 range, although that's arguable, I suppose. But they're both exceptionally weak.

[i], [i]
Well, we're back to a vowel. Mid-to-high again (someday I'll remember to have a variety of heights in a spectrogram),
and mostly front. You do the math.

[n], [n]
Well, the first thing to notice is that it's definitely voiced, and resonant, and then that it's of greatly reduced amplitude
from the surrounding vowel(s). So this is probably a sonorant consonant, probably a nasal, judging from the zeroes. If
you know my voice, you recognize the frequency (near 1500 Hz) of the pole, which is indicative of my [n]. If you don't, you
have to find a way to convince yourself that the transitions are alveolar, but since there's another coronal coming up,
your only real clue is the F3, which isn't amazingly helpful.
Eth, IPA 131,

[D], [ð]
So your first clue that there's something else besides the nasal going on here is the change in amplitude, discontinuity,
whatever it is, at about 975 msec. It's not much, but it tells us that something changed there. What exactly that is is not
really readable, except that the F3 and F4 transitions in the following vowel look decidedly alveolar. The F2 doesn't help.
If you don't catch that, there's no real good clue that anything is going on (except the otherwise exceptional length of a
nasal stop in a relatively weak position), and you just have to pick this up from context.
Schwa, IPA 322,

[侷, [ə]
Schwa, ah, schwa. Pretty classic, short vowel, quite low pitch of voice so probably stressless and reduced. Evenly spaced
formants.
[g], [g]
I transcribed this as voiced, although now I'm not sure. There's a nice velar-looking double burst, followed by evidence
of velar transitions, so velar's the best guess here. Unaspirated, definitely, so phonemically /g/, at least.
Lower-case O + Upsilon, IPA 307 + 321,

[oU], [oʊ]
Okay, looking at the F1, it's higher than most of the previous vowels, which have mostly been mid to high. So this is
clearly mid, through most of its duration, although there's something going on about 1400 msec and after we may want
to pay attention to. The F2 starts around the neutral frequency, but drops rapidly, indicating increasing backness or
rounding through this vowel. Hence this is a mid, backish vowel, getting backer, hence diphthong-y /o/. I don't know
what's going on at 1400 msec. The pitch is rising, but I may just get a little creaky here. I can never tell when listening
to my own voice. I bet there's something interesting going on here with the creakiness and the rising pitch. But
whatever.

[s], [s]
Well, you can't really ask for a better /s/, unless it didn't suddenly cut off in the low frequencies, which makes it look a
little [S]/[ʃ] (Esh)-like. So there you go.
Lower-case T + Right Superscript H, IPA 103 + 404,

[tH], [tʰ]
Well, except for the gap, this is more /s/. So the gap, given the position at the end of the syllable/word/utterance, is
most likely /t/, since there's no evidence of labial or velar shaping. The real question is whether it's /sts/ or just /st/
with heavy release. Well, it turns out to be heavy release, but I'm not sure how you'd know that with nothing to
compare it to.

"The bus leaves on the half-hour."
Segmental cues
Eth, IPA 131

SIL [D], Unicode [ð]
Okay, so whatever this is, it's voiced. There's not a lot in the way of resonance here, although that's not that
informative, given the initial position. The transitions in the following vowel suggest coronal (F2 falls from just above
1500, F3 falls from somewhere), and there's a bit of noise in the very high frequencies. The noise seems like it goes on
sort of long just to be 'release', so this is probably frication. Coronal frication. And voiced. But this isn't sibilant at all
(too much voicing and not enough noise) so that pretty much leaves Eth.
Schwa, IPA 322

Once again, the rules for teeny short vowels, and this is probably the shortest vowel in this utterance: 1) don't waste a
lot of time trying to work out the category--if it's that short, it's probably stressless, and therefore 'reduced'; 2)
therefore transcribe it as schwa, or barred i. Following Keating et al 1994, I use barred i if F2 is closer to F3 than F1, and
schwa otherwise. Done.
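
The schwa-or-barred-i rule is mechanical enough to write down directly. This is a minimal Python sketch of the rule as stated here (barred i if F2 is closer to F3 than to F1, schwa otherwise). The function name and the formant values in the example are hypothetical.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick barred i if F2 is closer to F3 than to F1, schwa otherwise."""
    closer_to_f3 = abs(f3_hz - f2_hz) < abs(f2_hz - f1_hz)
    return "ɨ" if closer_to_f3 else "ə"

# Hypothetical formant values for two short, stressless vowels.
print(reduced_vowel_symbol(400.0, 1900.0, 2500.0))  # F2 up near F3
print(reduced_vowel_symbol(500.0, 1300.0, 2500.0))  # F2 nearer F1
```

With evenly spaced formants the distances tie, and the "otherwise" branch gives schwa, which matches the usual description of schwa.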

SIL [b], Unicode [b]
Okay, there's a nice little gap here, indicating some fairly serious closure. It's fully voiced, and except for some
nonsense above 1000 Hz, there's nothing going on higher up. The transitions, particularly F2 and F3, are falling as they
approach this closure, and rising out of it after release. This is typical of bilabials.
Turned V, IPA 314

SIL [鼕, Unicode [ʌ]
Well, this is a nice little vowel. Once the F1 gets to where it seems to be going about a third of the way through, it seems
to be hovering around 750 Hz or so, which is a little high, so this vowel is low of mid. The F2 seems to get where it's
going fairly quickly and starts to move somewhere else in the last third of the vowel. For the first two thirds, though,
it's hovering around 1200 Hz. So this is pretty far back. And lower than mid. So go look at the IPA chart and find a mid-
to-low, back vowel. The F3 is just a little high of neutral, sort of, which suggests nothing except this vowel cannot be at
all round. Which pretty much narrows it down, IPA-wise.

This is a fairly classic [s]. There's no trace of voicing here anywhere. There's very broad band frication energy, and it
gets stronger as we go up in frequency. This is classically [s]. So transcribe it in the classical manner, and get on with it.
Lower-case L + Mid Tilde, IPA 155 + 428 (composed, IPA 209)

SIL [l瀧, Unicode [ɫ]
Well, there's very little evidence that there's anything here at all, but the sentence doesn't make sense if there isn't
something here. So if there must be something here, what can it be? There's some friction at the upper frequencies, but
that's actually sort of a red herring. The F3 is obliterated by the noise or whatever that is between 2250 and 3250 Hz. So
there's nothing except the F2 to go on. The F2 is quite low (though not quite as low as I'd expect for [w]), which leaves
[l]. Which you'll all recall is always velarized in (North American) English, though not always to the same degree.

SIL [v], Unicode [v]
This is even fricative, so don't give me grief. I want to point out that there's frication more or less continuous starting
from about 825 msec and continuing straight through to about 925 msec. But if you look at the voicing bar (the low-
frequency striations), there's a change that happens just shy of 900 msec. The bandwidth of that energy reduces,
suggesting a change of stricture (toward greater stricture) at that point. So there's your evidence, if you need some,
that there's a segment here. Okay, frankly, the [v] might actually start earlier, i.e. around 800 msec, which is when the
transitions are clearly pointing down, as for (bi)labials. The higher-frequency frication doesn't begin until later, which
is probably coarticulation with the following [z]. But I marked the beginning of this at the moment where the F1 seems
to lose energy, i.e. go from fully resonant vowel to less resonant approximant/fricative thingy.

SIL [z], Unicode [z]
Okay, excluding any clues that might have led us to the preceding [v], this is voiced, it's a fricative, and it looks like it's
high-frequency. The transitions into the following vowel indicate a coronal, (notice the F2 and F3 fall slightly into the
following vowel).
Script A, IPA 305

SIL [A], Unicode [ɑ]
Well, if you look very closely (and wish very hard), the F1 here is just a little higher than in the second vowel back. So
it's lower. How many lower vowels than that can you think of? This one must be back, considering how low the F2 is. La.

Well, finally, this one is sonorant. You can tell because it has resonances. It's of lower amplitude than the vowel on
either side, so it must be a consonant. It has a pretty good zero between 1250 and 2100 Hz or so, which suggests nasal.
Once you know it's a nasal, you know (because it's my voice) that my coronal nasal has an F2 (or whatever) around 1100
Hz, where my bilabial one has one higher (about 1500 Hz or so). So this one must be the coronal.
Eth, IPA 131

Well, as little evidence as there was for the Eth at the beginning of this utterance, there's just none for this one. This is
probably because in general nasal-voiced plosive sequences wipe out all evidence of the oral closure (if there is one)
with the nasal, and this is similar. I could make up something about there being something in the high frequencies, but
there just isn't. Get this one by top-down processing.
Schwa, IPA 322

What I said before about little short vowels.

Well, I just realized that this is obviously fully voiced, and so I should have transcribed it as hooktop H, but I didn't. This
is broad band noise, but it's organized in formants (at least F2 and F3), which is what happens when you have breathy-
voiced source moving through an otherwise open vocal tract.
Ash, IPA 325

Well, frankly this is as low a vowel as I have ever produced. I don't quite know how I did it, but...there's the F1, way up
there. The F2, while it looks sort of low, if you notice is really right around 1500 Hz, i.e. neutral. So this isn't really back.
Low, not back, you have a couple of choices. I chose ash.

Okay, I would have called this one an [h] again, but I knew better. And looking at it again, although there is some
formant organization, the real noise in this thing, which is really just the middle of this segment, isn't connected to the
formants. It's broadish band, but unshaped by much in the way of filtering, which suggests an extremely forward
fricative, i.e. /f/ or theta. There's not much in the way of transition or other clues here, so either is a good guess. But
one of them makes a decent word and one doesn't.
Lower-case A + Upsilon, IPA 304 + 321

SIL [aU], Unicode [aʊ]
Okay, this starts, really, in exactly the same place as the previous one, so they may be coarticulating in some way. That
and I just used the conventional notation for the diphthong. Starts low and stays there, which is a little odd. Starts back
and gets backer, i.e. rounder, which as far as English diphthongs go, only gives you one choice. I went back and forth
over whether I should put a [w] in, at about the point where I put the segment mark, i.e. around 1650 msec or so, since
there's definitely a moment there where the F2 reaches its minimum and steady state. But since that moment does not
correspond with anything like a steady state or gap in the higher frequencies, I decided not to. Discuss.

SIL [灼], Unicode [ɹ]̩
F3, while noisy, is exceedingly low, around, I suppose 1750 Hz or so. Below 2000 Hz. Can only be /r/. For me, I feel this as
a syllable. Discuss.
I've tried to follow the current E_ToBI transcription conventions, with a few adjustments. Rather than a separate
orthographic tier, I've aligned the Break Indices to the segmental transcription. I've combined the Tonal Tier and the
Break Index Tier as a single line. I align word-level (*) tones with the middle of the marked vowel, but phrase accent (-)
tones and boundary (%) tones to the left of their appropriate Break Index.
"the"
Break Index: 0
My range really seems to bottom out just above 100 Hz at the beginning of an utterance, and just below at the end. This
is nice and flat and low, hence L*.
"bus"
Break Index: 1
Lexical word, and it seems to get a high pitch (although not as high as most other HiF0s in my voice), hence H*.
"leaves"
Break Index: 3
I had to give this its own H*, first because it's a lexical word at the end of a phrase, so it needed something of its own,
but it's oddly displaced to the beginning of the vowel. I don't think this is just interpolation between the preceding H*
and the following L-. When I said it, and listened to it, it sure felt like an H+L, of whichever * variety, but those aren't
allowed any more, and we can get the L from the phrase accent (i.e. the - tone) associated with the 3BI. So that's what I
did.
"on"
Break Index: 0
'Nother one.
"the"
Break Index: 0
Ditto. Function words, short, stressless; I figure, what else is a 0BI for?
"half"
Break Index: 0
Okay, once again, I appeal to the ToBI people to rule on this one. I've marked this with a 0BI, because this is the first
'word' in a compound, i.e. there's no lexical word boundary here. On the other hand, I think this word gets its own H*,
perhaps because it's potentially contrastive or focussed, so there you go. So would that be a 1BI? or even a 2BI? Or
would I not bother, and just mark both parts of the compound as one word without a BI in between, in which case what
the heck would the H on this word be?
"hour"
Break Index: 4
Similar to the preceding, a lexical word gets an H* (or some other *) of its own, followed by phrase accent (-) and
boundary tone (%), in this case both L, resulting in the quick but extended fall on this syllable.
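
As a rough illustration of the combined tonal/break-index line described above, here is a minimal Python sketch. The `Word` class, its field names, and the plain-text layout of the output are assumptions made for this example only; they are not part of E_ToBI. The labels themselves are the ones given in the commentary for this utterance.

```python
from dataclasses import dataclass

@dataclass
class Word:
    text: str
    break_index: int        # 0-4, marked at the right edge of the word
    pitch_accent: str = ""  # word-level (*) tone, if any
    edge_tones: str = ""    # phrase accent (-) and boundary (%) tones, if any

# Labels as given in the commentary for "The bus leaves on the half hour".
utterance = [
    Word("the", 0, "L*"),
    Word("bus", 1, "H*"),
    Word("leaves", 3, "H*", "L-"),   # the L comes from the phrase accent at the 3BI
    Word("on", 0),
    Word("the", 0),
    Word("half", 0, "H*"),           # the 0BI here is the author's open question
    Word("hour", 4, "H*", "L-L%"),
]

def combined_tier(words):
    """Render tones and break indices as one line (hypothetical layout)."""
    parts = []
    for w in words:
        if w.pitch_accent:
            parts.append(w.pitch_accent)
        # Edge tones sit immediately to the left of their break index.
        parts.append(f"{w.edge_tones}{w.break_index}")
    return " ".join(parts)

print(combined_tier(utterance))
# L* 0 H* 1 H* L-3 0 0 H* 0 H* L-L%4
```

Keeping the edge tones glued to the break index mirrors the alignment choice described above; a time-aligned version would need the segment boundaries as well, which this sketch leaves out.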

"An example of a spectrogram"

The backstory: Towards the end of March, I got an e-mail from Dr. Stephen Lindsay at UVic asking for assistance with a
figure for an intro psych textbook. He wanted an example of a spectrogram to embed in the discussion of the ear having
to decode information encoded as frequency and amplitude. After tossing out a few subversive ideas ("Psychology
sucks. Study linguistics instead" would have been too long anyway), I offered a version of this phrase, which now seems
to have met the standards of the publisher. Woo hoo.
Ash, IPA 325,

[Q], [æ]
I guess I should have transcribed this as a schwa, but since I was sort of over-articulating, I didn't do nearly the amount
of reduction as I usually do in this spectrogram. So this has moderately high F1, so the vowel is lowish. And the F2 is just
a little high of neutral, so this is frontish. Lowish and frontish is probably /ae/.

[n], [n]
Looking back, I should have transcribed this as a nasal flap, but it didn't seem so short when I was doing the figure--it's
not that much shorter than the /z/ coming up, or the /kt/ sequence coming up later than that. But this definitely has
the quality of a flap--a short interruption in amplitude. The nasal quality is suggested by the presence of the resonances
up above (regular flaps are basically super-short plosives, so they'd have gaps above the voicing bar), separated by
zeroes. Must be alveolar if it's a flap. And nasal to boot.
Barred I, IPA 317,

[ö], [ɨ]
You'd think I'd hit the first syllable of a content word with a more distinct articulation, but I think I got trapped by my
own rhythm. Having not reduced the preceding syllable, and having to stress the following syllable, I must have
'deaccented' or something, this syllable. Cuz this don't look like no [E]. It's short and unstressed (compare the
amplitude of F2 in the preceding and following vowels--this vowel just ain't as loud), so I transcribed it accordingly.
[g], [g]
Looks voiced to me. Looks sort of like a gap, though it's mushy (and we know by now my velar closures tend to be
mushy--I think it's my outsized uvula). The transitions into it are pretty typically (front) velar, with F3 falling just a
touch and F2 rising (i.e. velar pinch). That F2 transition can't really be anything else. So it's velar and voiced (in case
you weren't sure, it looks like it is). Doesn't look nasal, so that pretty much limits the options.

[z], [z]
This is vaguely voiced, but it does get lost in the noise. We've got seriously broad-band (covering a lot of frequencies)
noise here, getting weaker in the lower frequencies. It doesn't look classically /s/ like, in that it isn't clear the noise gets
stronger in the very high frequencies (by which I mean frequencies off the top of the spectrogram). So beyond marking
this as a voiced fricative, I'm not sure there's a lot of other information. The F2 transition into the next vowel suggests
coronal (note the apparent rise from 1600 or 1700 Hz to a peak at just about 400 msec), but that's not much to go on.
Still, there's nothing obviously non-/s/-like about it either. So it's a [z].
Ash, IPA 325,

[Q], [æ]
Well, there's a nice long vowel for you. This vowel starts just high of mid and goes very, very low (F1 rises). The F2 starts
quite front--well, once it's done transitioning from the preceding--and then falls a bit, which is centralization. So
technically, this starts at about [e] and falls to [a]. Well, that's not really a likely combination in English. So if you think
this is two things, keep them. But if you don't, split the difference, and you get something around IPA [ae]. Lo and
behold.
Lower-case M, Lower-case P, IPA 114, 101

[mp], [mp]
I hope somebody is going to study these nasal-plosive sequences really soon, cuz I Just Don't Get It. I'm not 100% sure
about this segmentation, so I'm going to discuss these two at once. From about 500 msec to about, well 530 msec,
there's some evidence of voicing. Then somewhere around then there's a change. This (sort of) corresponds to the
moment when there's some kind of transient up above. This could well be the labial release, or it could be some kind of
wild clunk, i.e. a transient associated with the sudden closure or opening of the velopharyngeal port, or some wad of
something randomly flying around and hitting something. Euw, but work with me. Following that, the voicing loses
energy and dies rapidly, as it would during a regular closure. So that's our evidence that there's two things going on
here--a bit of 'open' voicing, followed by a bit of 'dying voicing', with a clunk of some kind in between. Unless the clunk
is the release, in which case I don't know what's going on. So anyway, the apparent transitions into this bit suggest labial
(F2 and F3 falling), and notice they both rise slightly into the following vowel. So we have something vaguely voiced and
possibly sonorant followed by something mostly voiceless and pretty definitely obstruent, both labial. Moving on.
Lower-case L + Mid Tilde + Syllabicity Mark, IPA 209 (155+428) + 431,

[lò`], [ɫ]̩
The only real evidence that there's something going on here is the funny discontinuity in the F3 range here--that and
the apparent zero in the F4 (while the F4 is low) that suddenly kicks off after about the 630 msec mark. Note the little
hump in F3 at around then that indicates that this is an /l/. Not much to go on, but there you go.
Open O, IPA 306,

[], [ɔ]
I don't know where this vowel came from, but there it is. It's quite definitely mid (look at that F1). It's quite definitely
back (look at that F2) and probably round. Ick. But there it is.

[v], [v]
There's something here. It's definitely less sonorous than either vowel around it, and there's a bit of noise between 1500
and 2500 Hz or so. So it's a fricative and voiced. There's not a lot of frication, and the voicing bar suggests something
rather open (or approximant), so that lets out any of the 'strong' fricatives. Probably /v/ or Eth. Figure out which later.
Schwa, IPA 322,

[«], [ə]
Finally, a reduced vowel that looks like a reduced vowel.

[s], [s]
Finally, an [s] that looks like an [s]. Extremely broad-band, very high amplitude, and highest amplitude in the high (as in
above 6000 Hz, which you can't see but you can infer) frequencies.
Lower-case P, IPA 101,

[p], [p]
Well, there's a gap from 875 msec or so to 950 msec or so. That's about all I can say. There's no real information into or
out of it, but there it is. I'm very concerned that the first little bit of noise, or whatever that is upon release looks like
it's even with the F1 in the following vowel, but just above the F2 and very possibly just below the F3. Looks a little like
velar pinch. But that would be wrong. So it's voiceless, it's plosive, it looks slightly velar, but it's not.
Epsilon IPA 303,

[E], [ɛ]
Well, this vowel is a little low of mid, i.e. its F1 is a little above 500 Hz. It's a little bit front, judging from the F2, but it
doesn't move at all the way both previous Ash-es move, so this is definitely a monophthong. So lower-mid, or open-mid,
depending on how you were trained, and front. It's worth noticing the F3 is falling slightly through the duration of the
vowel.
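
The height and backness reasoning used for these vowels can be sketched roughly in Python. This is only an illustration under stated assumptions: the reference values (F1 near 500 Hz for a mid vowel, F2 near 1500 Hz as neutral) are taken from remarks in these commentaries and are specific to this voice, while the function name, the labels, and the 50 Hz tolerance are invented for the example.

```python
MID_F1_HZ = 500       # F1 of a mid vowel, as quoted in the commentary
NEUTRAL_F2_HZ = 1500  # "neutral" F2, as quoted in the commentary (speaker-specific)

def describe_vowel(f1_hz, f2_hz, tolerance_hz=50):
    """Return (height, backness) labels relative to the reference values.

    tolerance_hz is an arbitrary margin chosen for this sketch.
    """
    # A higher F1 means a lower vowel.
    if f1_hz > MID_F1_HZ + tolerance_hz:
        height = "lower than mid"
    elif f1_hz < MID_F1_HZ - tolerance_hz:
        height = "higher than mid"
    else:
        height = "mid"
    # A higher F2 means a fronter vowel.
    if f2_hz > NEUTRAL_F2_HZ + tolerance_hz:
        backness = "front"
    elif f2_hz < NEUTRAL_F2_HZ - tolerance_hz:
        backness = "back"
    else:
        backness = "central"
    return height, backness

# Hypothetical measurements: F1 a little above 500 Hz, F2 somewhat above neutral.
print(describe_vowel(580, 1700))  # ('lower than mid', 'front')
```

Labels like these only narrow the choice on the IPA chart; rounding, F3, and formant movement still have to be read off the spectrogram by eye.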

[k], [k]
See, cuz there's *actually* velar pinch at the end of this vowel, which could conceivably be a clue that whatever is going
on at the beginning of this vowel, it's not as velar looking as it might have been. So anyway, the onset of this gap is
pretty definitely velar.

[t], [t]
On the other hand, there's no way that the offset of this gap looks velar. The F2 falls, the F3 falls, which looks coronal.
Also, the release noise is sibilant and /s/-like, suggesting alveolar as well. So it turns out this gap represents the closure
of two different plosives, and this one (the second one) is alveolar. And voiceless.
Turned R, IPA 151,

[¨], [ɹ]
The F3 is already quite low, and getting lower, which might lead you to just /r/. I hear something other than a syllabic
/r/ here tho, although I can't really explain why. But the low F3 is the giveaway here.
Barred O + Rhoticity Sign, IPA 323 + 419,

[PÕ], [ɵ˞]
Okay, I don't know what's going on here. Barred-O is sort of the round, higher-mid version of schwa, and this is
definitely r-colo(u)red (low F3). I marked it as round a) because it seems to be round and b) I wanted to capture the F2
and F3 *lowering* through here, although that may be coarticulatory with the /r/ in the next syllable.

[g], [g]
Okay, my velar plosives just aren't very stoppy most of the time. But there's some indication: at least the preceding
falling F2/F3 is pinching, and the burst/noise in the middle of this gap (just ahead of 1300 msec) is centered in the F2-F3
range, which is typical of velars (for the same reason that velar pinch happens--discuss). So this is velar. It's probably
not as fully voiced as I thought it was when I prepared the figure, but there you go.
Turned R, IPA 151,

[¨], [ɹ]
The /r/ is definitely there in the onset, in the sense that the F3 is about as low as it gets toward the [g]. Low F3.
Ash, IPA 325,

[Q], [æ]
Well, the F1 is fuzzy, but there's no reason for the voicing bar to suddenly go from 0 up to 1000 Hz if there weren't a
high first resonance to support it, so this is a high F1, and therefore a low vowel. If you look at the F2 maximum, around
1750 Hz starting around 1400 msec, that's sort of the same range as the F2 in the two previous Ash-es, and just a hair
higher than the preceding Epsilon. So this is *really* front, and very low.

[m], [m]
Well, there's something sonorant that starts abruptly at about 1550 msec. So it's voiced and sonorant, and it's probably
nasal a) because the F1 of the preceding vowel is so fuzzy, b) because the nasal zeroes seem to kick in during the
preceding vowel, and c) because if it were an approximant it would probably be more 'continuous' with the preceding
vowel. So it's probably nasal. And the sudden fall in F2 at the end of the preceding vowel (and sort of the
F3) suggests bilabial more than anything else. Also, there's just a hint of something at the left edge of this segment at
around 1500 Hz, which is sort of where my bilabial nasal pole ends up.

"I hope it's in my desk."
Segmental cues

I've got to stop doing this. I transcribed this the way I was trained to, as a diphthong. But there's nothing central about
this vowel. From 125 to 175 msec or so, this is a classic [A] or [ɑ] depending on your font setup. F1 high, F2 low, close
together and straddling the 1000 Hz mark. But it moves, quite steadily from 175 to 275 msec or so, to something quite
front (F2 rises) and not nearly so low (the F1 makes it down as far as mid, but not much further). But this is traditionally
how we transcribe this diphthong. Probably shouldn't.

This is clearly a fricative. I'm not sure what happened to the F1, but the other formants, even F3 and F4 continue
through the fricative. That's classic /h/, where the noise occurs at the glottis and excites all the cavities of an open
vocal tract. Whenever you see more than two formants running through noise, think /h/.
Lower-case O + Upsilon, IPA 307 + 321

SIL [oU], Unicode [oʊ]
The vowel goes from middish to high (F1 from 500 Hz or so to lower), F2 from backish and roundish to backer and rounder.
Lower-case P, IPA 101
SIL [p], Unicode [p]
Okay, we know there's some kind of gap here (450 to 525 msec or so). It can't be velar, because the transitions are wrong
(no pinch). It doesn't look alveolar, since the F2 seems to be dropping from below 1500 (alveolar transitions seem to
have loci around 1700 or 1800 Hz or so, so if the F2 is below that, it rises, and if it's above, it falls). Which leaves bilabial.
And voiceless--there's no evidence of voicing during the closure.
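
A minimal Python sketch of that locus reasoning follows. The 1700-1800 Hz range is the alveolar locus quoted above for this voice; the function names, the "rise"/"fall"/"level" labels, and the 1450 Hz example value are assumptions made for illustration.

```python
# Alveolar F2 locus range quoted in the text (speaker-specific).
ALVEOLAR_LOCUS_HZ = (1700.0, 1800.0)

def expected_f2_move(vowel_f2_hz, locus=ALVEOLAR_LOCUS_HZ):
    """Direction F2 should move as the vowel heads into an alveolar closure."""
    low, high = locus
    if vowel_f2_hz < low:
        return "rise"   # F2 below the locus climbs toward it
    if vowel_f2_hz > high:
        return "fall"   # F2 above the locus drops toward it
    return "level"

def consistent_with_alveolar(vowel_f2_hz, observed_move):
    """True if the observed F2 movement matches the locus prediction."""
    return observed_move == expected_f2_move(vowel_f2_hz)

# The case described above: F2 starts below 1500 Hz and keeps dropping.
print(consistent_with_alveolar(1450, "fall"))  # False
```

A mismatch like this does not identify the place by itself; it only removes alveolar from the list, which is how the argument above arrives at bilabial once velar is already excluded.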
Barred I, IPA 317

This is one of those vowels you take note of, but you don't spend too much time worrying about--it's too short to have
much information in it, so it's probably unstressed or destressed, and therefore reduced. Following Keating et al 1994,
reduced vowels are transcribed as schwa if the F2 is closer to the F1 than the F3 and barred I if the F2 is closer to the F3.
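
That rule is simple enough to write down as a short Python sketch. The function name, the plain Hz distance measure, the tie-breaking choice, and the example formant values are assumptions for illustration, not something taken from Keating et al.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick a symbol for a reduced vowel from its formants (in Hz)."""
    # Distance from F2 to each neighbouring formant, measured in plain Hz
    # (an assumption; the rule as stated just says "closer").
    to_f1 = abs(f2_hz - f1_hz)
    to_f3 = abs(f3_hz - f2_hz)
    # F2 nearer F3 -> barred i; otherwise schwa. Ties go to schwa,
    # which is an arbitrary choice made for this sketch.
    return "ɨ" if to_f3 < to_f1 else "ə"

# Hypothetical formant values:
print(reduced_vowel_symbol(450, 1900, 2500))  # F2 is 600 Hz from F3, 1450 Hz from F1
print(reduced_vowel_symbol(500, 1300, 2500))  # F2 is 800 Hz from F1, 1200 Hz from F3
```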

Another gap. Notice here, the F2 transitions seem to center/re around 1700 Hz. Probably alveolar.

There's a teeny bit of fricative here. It would be easy to miss it as aspiration, but it's important not to have missed it
altogether. There's stuff going on in F2 and above, starting about 675 msec, quite a while before the vowel comes in
between 600 and 625 msec. So you have to account for that moment somehow, and I'm calling it an /s/. The only other
thing it could be would be aspiration, but take my word for it. I presume if we could see some higher frequencies we'd
get more...
Barred I, IPA 317

Again.

This is a long stretch of voiced sonorant consonant. Probably nasal, given the apparent zero around 1000 Hz. Now, the
thing to notice is that the first 2/3 of this duration (from 775 msec to about 875 msec) is one thing and then the
resonances change, a little, almost transitionally--but they're not transitions. Okay so this is two things. The first one is
alveolar. The pole is about 1500 Hz, which may only be useful for my voice. The main thing is that neither is likely to be
velar (no pinch on either side) and ...

... whatever else is going on, the resonance here (above F1) is definitely lower than in the preceding bit. So of the two,
this is more likely to be labial, because the pole is at a lower frequency. By process of elimination then, the previous bit
is likely to be alveolar.

This is similar enough to the first diphthong in this utterance that I figure we won't go too long about this one. It's
worth noting that as high as the F2 gets, it does *seem* to drop just a little towards the end, as if it were transitioning
toward that 1700-1800 Hz locus point....

Another gap, the transitions indicate that it's alveolar (well, at least they don't obviously indicate anything else),
There's a fair amount of perseverative voicing in the closure, and an extremely short VOT following release. Call it
voiced and move on.
Epsilon, IPA 303

This is weird. Looking just at the clearly voiced portion, and abstracting away from the transitions (i.e. starting at about
1200 msec), this is lowish, but not as low as it could be, and frontish. So Epsilon or Ash. It definitely looks Ash-y towards
the end. Now the weird part is the fact that the amplitude, and the voicing drop out rather abruptly. Why exactly I'll
talk about below, because I think it's an accident of the prosody and segmental stuff and just what was going on when I
recorded it. But if you thought the second half of this was a segment, let me just ask you this: what the h*ll could it be?
[h]? It doesn't look like a nasal. It's partially voiced, but whatever you say about this, this is a coda, and [h] is presumably
disallowed. What? Sometimes, you really do have to go completely top down.

Weak, if otherwise admirable, [s]. Fricative, 'acute' in the Lieberman and Blumstein sense, and voiceless. Its spectrum, if
nothing else, is that of an /s/.

Well, there's a gap. So it must be a plosive. It's following an [s], so it's most likely voiceless, given that this is English. It's
got a nice little release, and what there is in the way of aspiration or whatever following the release, is centered in the
F2-F3 range, i.e. looks like velar pinch. Which is pretty much the only clue that this is velar, if you can't convince
yourself that there's anything going on in the /s/ noise, which I don't think there really is.
I've tried to follow the current E_ToBI transcription conventions, with a few adjustments. Rather than a separate
orthographic tier, I've aligned the Break Indices to the segmental transcription. I've combined the Tonal Tier and the
Break Index Tier as a single line. I align word-level (*) tones with the middle of the marked vowel, but phrase accent (-)
tones and boundary (%) tones to the left of their appropriate Break Index.
"I"
Break Index: 1
My range really seems to bottom out just above 100 Hz at the beginning of an utterance, and just below at the end. This
is nice and flat and low, hence L*.
"hope"
Break Index: 1
H*, accounting for the peak.
"it's"
Break Index: 0
0BIs are essentially orthographic word boundaries apparently unmarked by any prosodic effects. There's nothing going
on here except interpolation of pitch between the previous H* and some following L.
"in"
Break Index: 0
'Nother one.
"my"
Break Index: 1
Well, there has to be a L somewhere, and I pick here. It's the bottom of the pitch range (except for a very weird
boundary tone to follow), it's clearly separated from the following stuff, and, well, there needed to be one.
"desk"
Break Index: 4
These utterance-final monosyllables get really complicated. This is a lexical content word, and in this utterance it's
fairly important. So it gets a lexical prominence, H*. It's also the end of a phrase and the end of an utterance, so it gets
boundary tones (okay, technically a phrase accent and a boundary tone) associated with each. One or the other has to
be L, since the pitch obviously falls during this vowel. So (since H*+L lexical tones are disallowed), I've chosen L- and L%.
The weird part is that the pitch bottoms out so sharply during the first half of the vowel, basically I lose voicing
altogether. Probably the result of combining final low pitch ranges with the aerodynamic requirements of the following
voiceless fricative, my voicing just dies out in the middle of the vowel. Go fig.

"They need to make a new list."
Segmental cues
Eth, IPA 131

Okay, I cheated. I know what this is, so I transcribed it. Okay, this looks very sonorant. It's not, but if you thought it was I
don't blame you. This looks a little nasal, in that there's very little energy above, oh 1600 Hz or so. But what makes this
look not so nasal is the fact that the 'boundary' with the following vowel is kind of mushy, rather than sharp. Compare
it with the edges of the segment from about 375-425 msec. Now that's sharp. So maybe we're looking at a sonorant. But
it's not /w/ or /l/ (F2s all wrong), and it's definitely not /r/ (F3's all wrong) and oh my it's not /j/ either. Which doesn't
leave a lot of choices except maybe it's not sonorant. Maybe it's just a very approximant-y fricative. Could be labial,
could be coronal, couldn't be velar or glottal. Well, that narrows things down. But it's probably about as far as you can
go. Voiced, weak fricative. Very weak. Anterior. Moving on.
Lower-case E, Small Capital I, IPA 302 + 319

This is a nice long vowel, suggesting stress. Also tenseness, but as a phonetician I'm not sure I believe in tenseness. The
vowel height starts just high of mid (F1 just below 500 Hz) and doesn't do much from there, except raise (F1 lowers)
slightly and flatten out. The F2 starts front of neutral but moves sharply forward (starts above 1600 Hz and moves
whoosh way high). Good gravy, it ends up near 2400 Hz. I'm not sure I ever produce an F2 that high. About as front as it
gets. So middish vowel, moving from front to fronter.
Now *this* looks like a nasal. It's lower in amplitude than either flanking vowel. It's got sharp edges. It has resonances
which are discontinuous with the surrounding formant transitions. It's got zeroes, where the energy completely drops
out of the spectrum, but good resonances, suggesting something 'open'. Classic nasal. There's a pole around 1500 Hz,
which is about where my pole is in alveolars.

SIL [i], Unicode [i]
Well, the F1 here is lower than the F1 started in the previous vowel, although it's not the lowest I've ever seen. F2 is way
high, so this is a high exceedingly front vowel.
Lower-case D and Lower-case T + Right Superscript H, IPA 104 and IPA 103 + 404
SIL [dtH], Unicode [dtʰ]
So I hope we all recognize the 'gap' in the spectrogram from about 550 to about 625 msec. This stop is a little weird. It's
fairly long. It's clearly voiced for the first 75 msec or so and then something else happens. If you look at the right edge,
there's a very sharp burst, more characteristic of voiceless stops, followed by some moderate aspiration. So it's going to
turn out that this is two stops, the first voiced, the second voiceless and weakly aspirated. The place of the second stop
you can read off its aspiration and I see I've fallen into old bad habits by putting the aspiration in its own segment. But
notice the strongest noise in the aspiration is at the highest frequencies--it looks like an /s/, but it's too short to be an
/s/. So it must be aspiration and it must be alveolar. The first stop is more ambiguous. The F2 in the vowel is so high it
can't go anywhere but down. The F3 also looks like it's heading down a little, but it's not 'pinching' with the F2, so velar
can be ruled out. So here's where you use your top-down information. What's more likely: [nib] or [nid]?
Barred I, IPA 317

So there's a very short span of voicing, i.e. vowel, following the aspiration, and before the sonorant consonant to follow.
So there's a vowel here. It's obviously too short to be too important to anyone, so call it a reduced vowel, and following
Keating, et al. (1994) I transcribe it as a barred i if the F2 is closer to the F3 than the F1.

Another nasal, for the same reasons--very strong voicing, but overall less amplitude than a vowel, sharp edges, zeroes.
Notice the pole here is about 1200 Hz, clearly lower than the pole in the previous nasal. There's also nothing in the
transitions that strongly suggest anything but bilabial, lucky for us.
By the way, I may have forgotten to mention that the sharpness of the left edges of these nasals probably indicates that
these are syllable-initial rather than syllable-final. Syllable-final (coda) nasals typically cause nasalization of a
preceding (tautosyllabic) vowel, and hence the left 'edge' of the nasal may be less distinct.
Lower-case E + Small Capital I, IPA 302 + 319

All things considered, this is remarkably similar to the first vowel. Similar F2 path (though of course this doesn't
achieve quite the same heights toward the end, and it starts just a little lower, partly because of the bilabial transition).
As the voicing dies off, notice the F2 and F3 paths get fuzzy, but clearly start to come together in classic 'pinch'
configuration. Which leads us to...

Okay, it's noisy, but that's just how my velars are these days. There's pretty clear pinch going on as the voicing dies
around 850. The burst noise is a little ambiguous, but its, um, duration (work with me) suggests velar--alveolars have
nice sharp bursts, as do bilabials, usually, but velars sometimes have double bursts. It's also strongest, sort of, in the F2-
F3 pinch range, also characteristic of velars. The aspiration is a little short, but that may be due to its being at the end of
a word.
Barred I, IPA 317

Teeny short vowel, probably unstressed and therefore reduced. F2 closer to F3 than F1, therefore transcribed as barred-i.
Moving on.

Fully voiced, and it has resonances, so this is clearly a sonorant. It has nice sharp edges, and zeroes (ranges of no
energy). This should sound familiar. The pole at 1500 Hz should look familiar.
Small Capital I + Lower-case U, IPA 319 + 308

SIL [Iu], Unicode [ɪu]
Okay, this one is going to be controversial. So here we go. This is a reflex of OE [y], I'm told, realized in a lot of dialects of
modern English as /ju/ as in 'few'. In my dialect, /ju/ and /u/ are neutralized to something like this after coronals. So
rather than saying 'T/ju/sday' and 'n/ju/s', I say 'toosday' and 'noos', only my 'oo', doesn't look like my usual /u/, but
like this, with a very obvious front on-glide and a very obvious back (and round, in this utterance) off-glide. Which I
have chosen to transcribe like this. Let the controversy ensue.
Lower-case L + Mid Tilde, IPA 209 (precomposed, but I'm not sure why)
All /l/s in English are dark. This one is. See the F2. Low. That indicates backness. That indicates velarization. Or
'darkness'. This one is fully voiced, and resonant, it doesn't have a good zero or sharp edges of a nasal, and it's got the
raised F3 I associate with /l/s.

SIL [I], Unicode [ɪ]
This is the wimpiest [I] I've ever seen. It looks more like a schwa. If this were /l/-final, I'd say it was just backed to schwa
or barred-i. But this is just a backed /I/. I don't know why. The F1 is mid. The F2 is just barely higher than neutral. Okay,
call it a schwa if you want, but then you end up with a sentence that doesn't make any sense.

This is pretty wimpy for an /s/ too, but it definitely has the appropriate spectrum. Broad band noise, essentially at all
frequencies, strongest in the frequencies above the usual speech-range frequencies (say 100-4500 Hz or so). Could be a
devoiced /z/, but then that would still be a phonetic [s] on a spectrogram. But that could explain the relative weakness
of this sibilant. On the other hand, devoiced /z/s tend to be really short.

The only real cue to the place of this, aside from the lack of shaping of the preceding [s] spectrum, is the
noise/aspiration in the release, which is [s] shaped. If you think about the release of a [t], you've got the makings of a
fairly good [s]. My [s]s tend to be laminal while my [t]s are upper apical (to use Sarah Dart's terminology), so there is a
difference. Don't really know how that pans out acoustically though.
Once again, I've played with the current E_ToBI transcription conventions. Rather than a separate orthographic tier,
I've aligned the Break Indices to the segmental transcription. I've combined the Tonal Tier and the Break Index Tier as a
single line. I align word-level (*) tones with the middle of the marked vowel, but phrase accent (-) tones and boundary
(%) tones to the left of their appropriate Break Index.
"they"
Break Index: 1
Pronoun, but under some kind of focus. This isn't traditional focus. This is sort of idiomatic. But it's vaguely contrastive.
Anyway, there's clearly a high associated with this syllable, and true to E-ToBI form, it's realized relatively late. It starts
very low, making me wonder if this is a scooped accent, L+H*, which is probably appropriate for this rhetorical position.
"need"
Break Index: 1
I assume since this is the main verb of the upstairs clause, it counts as a lexical word for purposes of break indexing. So
I've given it a 1. And I've given it an L* to capture the pitch which drops sharply after the preceding H*.
"to"
Break Index: 0
Cliticized, if that's still the word for it. This word is reduced to almost nothing, doesn't get any degree of stress, isn't
under any kind of focus, and certainly doesn't get its own pitch accent.
"make"
Break Index: 1
Again, this is a main verb, in the downstairs clause, and I think counts as a word. No obvious pitch *changes* go on
here, so I assume this is just an L*.
"a"
Break Index: 0
Another (pro?)clitic, undeserving and reduced.
"new"
Break Index: 1
Okay, I think this is a word, in the sense of getting a 1BI. But it really doesn't look like it gets a pitch accent. From the L*
on the preceding word, to the H* following, this looks like simple interpolation. So I haven't given it a pitch accent. So
there.
"list"
Break Index: 4
Nice high pitch on this, so it gets an H*. It being at the end of an utterance, it gets a 4BI, and both a phrase accent (L-)
and a boundary tone (L%). I'm a little concerned about the placement of the L-, which is supposed to be attracted to the
edge of the phrase, and therefore (I think) is supposed to be indistinguishable from just the L%. But since the high pitch
on this word is actually toward the end of the first *half* of the vowel, I think (in this model) we need an L autosegment
of some kind to get this contour. And last time I checked H*+L isn't used any more. I think HLs are never supposed to
surface as falls, but just trigger downstep on a following H--thanks to the grad students in the Intonation and Prosody
seminar last term for helping me finally get this use of HL. I still don't really believe in downstep. But since what I'm
doing here is really 'transcribing' rather than doing a strong phonological analysis, I'll make use of the existing Ls in
the string rather than introducing a new one. Hence I slide the L- over into the vowel and away from the boundary, and
pretend like I know what I'm doing.
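The combined tier described above can be sketched as a small data structure. This is only an illustration: the tuple layout and the function name `combined_tier` are made up here, the tone labels are one reading of the prose (L+H* is the label the text wonders about for "they"), and the final break index follows the prose, which gives the last word a 4.

```python
# Each entry: (word, break index, pitch accent or None, edge tones).
# Labels are a reading of the prose above, not an official E_ToBI file.
words = [
    ("they", 1, "L+H*", []),
    ("need", 1, "L*", []),
    ("to", 0, None, []),
    ("make", 1, "L*", []),
    ("a", 0, None, []),
    ("new", 1, None, []),          # no accent: plain interpolation
    ("list", 4, "H*", ["L-", "L%"]),
]


def combined_tier(entries):
    """Render tones and break indices on one line, one cell per word.

    Phrase accents (-) and boundary tones (%) are written to the left
    of the break index they belong to.
    """
    cells = []
    for _word, break_index, accent, edge_tones in entries:
        parts = [accent] if accent else []
        parts.extend(edge_tones)
        parts.append(str(break_index))
        cells.append(" ".join(parts))
    return " | ".join(cells)


print(combined_tier(words))
```

Time alignment (word-level tones at the middle of the vowel) is left out; the sketch only shows the left-of-the-index ordering on a single line.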

Winnipeg, Manitoba
CANADA R3T 5V5
This page must be viewed with an IPA-enabled Unicode compliant font listed in my style sheet installed on your
hardware. Click here for an up-to-date list of supported fonts.
"Marmalade doesn't go with cheese!"
Lower-case M
[m], IPA 114

Script A + Rhoticity Sign
[ɑ˞], IPA 305 + 419
Turned R
[ɹ], IPA 151
Lower-case M
[m], IPA 114
Schwa
[ə], IPA 322
Lower-case L + Mid Tilde

[ɫ], IPA 209 (155 + 428)
Schwa
[ə], IPA 322

Lower-case E + Small Capital I
[eɪ], IPA 302 + 319
Lower-case D + Length Mark

[dː], IPA 104 + 503
Schwa
[ə], IPA 322
Lower-case Z
[z], IPA 133
Schwa
[ə], IPA 322
Lower-case N
[n], IPA 116
Lower-case T
[t], IPA 103
Lower-case K
[k], IPA 109
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
Lower-case W
[w], IPA 170
Small Capital I
[ɪ], IPA 319
Theta
[θ], IPA 130
Lower-case T
[t], IPA 103
Esh
[ʃ], IPA 134
Lower-case I
[i], IPA 301
Lower-case S
[s], IPA 132

"The plane took off on time."
Eth + Under-Ring
[ð̥], IPA 131 + 402
I give up. When in doubt, if it doesn't start with a glottal stop, and this doesn't, guess Eth. It won't make any difference.
Just guess. But here, I worked for I don't know how many utterances trying to get a) noise and b) voicing. Well, at least
this time there is noise. Sort of. It's way too weak and short to be an initial sibilant, but from just its frequency, that's
what it looks like.
Schwa
[ə], IPA 322
Vowel. When in doubt, guess schwa.

Lower-case P + Right Superscript H
[pʰ], IPA 101 + 404
Well, gap. Nice and voiceless, with a long VOT, so clearly aspirated. The transitions in the schwa all point downward,
suggesting labial. The burst is a little misleading, in that there's a concentration of high amplitude energy in the very
high frequencies, which makes the release look a little sibilant, i.e. coronal. But the release burst, such as it is, doesn't
have the straight up-and-down loud, sharp transient I'd expect of so plosive a [t] release.
Tilde L (Dark L)
[ɫ], IPA 209
Under-Ring
[◌̥], IPA 402A
Lower-case E + Small Capital I
[eɪ], IPA 302 + 319
Lower-Case N
[n], IPA 116
Lower-Case T + Right Superscript H
[tʰ], IPA 103 + 404
Upsilon
[ʊ], IPA 321
Lower-Case K
[k], IPA 109
Lowering Sign
[◌̞], IPA 430

Turned Script A
[ɒ], IPA 313
Lower-Case F
[f], IPA 128

Turned Script A
[ɒ], IPA 313
Lower-Case N
[n], IPA 116
Lower-Case T + Right Superscript H
[tʰ], IPA 103 + 404
Script A
[ɑ], IPA 305
Small Capital I
[ɪ], IPA 319
Lower-Case M
[m], IPA 114

"You don't want a cheap substitute."
Technically, it's January. So even though this is the solution to December 2006, I'm embarking on my January 2007
policy of no longer marking up IPA characters as being in a special font. From here on out, you must view my page(s) in
an IPA-enabled Unicode-compliant font. I recommend Gentium from among the freeware available fonts. My style sheet
automagically prefers Gentium if you have it loaded on your system. See my list of supported fonts for more
information.
Lower-case J
[j], IPA 153
Starting at about 100 msec is a period of voicing. It's sort of weak, in that it gets a lot stronger in the following vowel,
and there's no evidence of energy between the voicing bar/F1 or whatever it is and the F2, which is up above 2000
Hz. So while voiced and sonorant (i.e. with resonances, indicating an open vocal tract), it's not a vowel. So it has to be a
nasal or an approximant of some kind. Could be a nasal (sonorant, overall weak amplitude, and an apparent zero above
F1), but that doesn't jibe with the F2. More than anything else, this looks like a consonant version of a high front vowel [i].
Can anyone say palatal approximant?
Barred I
[ɨ], IPA 317
Well, the F1/voicing bar whatever thingy is stronger here (from about 150 msec to about 225 msec), and the striations
between F1 and F2 come in, but it's still weak, compared with vowels coming later on. Weakness, in vowel amplitude, is
a correlate of lack-of-stress, so this is probably a reduced vowel. F2 is closer to F3, so following Keating et al (1994), it's
transcribed as barred-i.
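The F2 rule of thumb used here can be written down in a couple of lines. This is a minimal sketch: the function name and the example formant values are invented for illustration, and the comparison is just a literal reading of "F2 closer to F3 than to F1", not a procedure taken from Keating et al.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick a symbol for a reduced vowel from its formants (in Hz).

    If F2 sits closer to F3 than to F1, call it barred-i; otherwise schwa.
    """
    if (f3_hz - f2_hz) < (f2_hz - f1_hz):
        return "ɨ"
    return "ə"


# Made-up measurements, not read off the spectrogram.
print(reduced_vowel_symbol(400, 1900, 2600))
print(reduced_vowel_symbol(500, 1400, 2500))
```

The first call has F2 only 700 Hz below F3 but 1500 Hz above F1, so it comes out as barred-i; the second has F2 nearer F1 and comes out as schwa.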
Lower-case D
[d], IPA 104
Well, there's some weak voicing down at the bottom, but absolutely nothing above that, and just before 300 msec
there's a nice sharp burst. The burst indicates that this has to be a plosive, since only an obstruent has pressure that
releases in a burst like that. The F3 is not telling us a great deal. The F2 transitions are headed toward that 'around 1700
Hz' area often associated with alveolar transitions. So on the balance, this is probably an alveolar, although I could
entertain an argument for a front velar. Although the burst is a little 'sharp' for that...
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
So, abstracting away from the transitions from the plosive, the F1 seems to hit a 'moment' at about 350 msec around
600 Hz, and then starts to head back down. Maxima/minima 'turning point' 'moments' like that usually indicate a
'target' of some kind has been reached (or undershot) between other targets, so this vowel starts either lower-mid or
low, and then moves toward someplace higher in the space. The F2 doesn't hit its 'moment' at the same time, so
chances are this is one coordinated movement rather than two distinct targets. The F2 'moment' is a low just around
900 Hz (very back and/or round) at a 'moment' when the F2 seems to straighten out. So this moment is
mid-to-higher-mid and backish/roundish. So going from somewhere sort of back and lower-mid (at the first F1 'moment') to higher
and backer/round (at the F2 'moment') is something like a backish mid, tense vowel, diphthongized to something
higher and rounder (or backer). Or [oʊ].
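Finding these 'moments' amounts to looking for local maxima and minima in a formant track. A minimal sketch, assuming the track has already been measured as a list of Hz values at evenly spaced frames; the function name and the numbers are hypothetical, chosen only to echo the F1 peak near 600 Hz and the F2 low near 900 Hz described above.

```python
def turning_points(track):
    """Return (index, kind) for each local maximum or minimum in a track."""
    points = []
    for i in range(1, len(track) - 1):
        prev, cur, nxt = track[i - 1], track[i], track[i + 1]
        if cur > prev and cur > nxt:
            points.append((i, "max"))
        elif cur < prev and cur < nxt:
            points.append((i, "min"))
    return points


# Invented tracks, one value per frame.
f1_track = [450, 520, 580, 600, 570, 520, 470]
f2_track = [1300, 1200, 1100, 1000, 950, 900, 920]

print(turning_points(f1_track))
print(turning_points(f2_track))
```

In this toy data the F1 maximum falls at frame 3 and the F2 minimum at frame 5, so the two 'moments' do not coincide, which is the situation the paragraph describes.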
Lower-case N
[n], IPA 116
Well, there's something going on between the time the resonances cut off (at about 450 msec) and the bursty thing
(maybe it's just a pulse) at 500 msec. There's a sharp reduction in voicing amplitude and resonance, but there's some
evidence of open-ness (in the vocal-tract resonating sense) in the F3(?) range above 2000 Hz. It's weak but it's there.
And not much else until you get well above 4000 Hz, and there's something noisy happening in the low frequencies. So
what is this? Well, it's open so we're talking vowel, approximant, or nasal. The fact that the F3 seems to drop into it
while the F2 seems to rise would suggest velar pinch. Which we'd be thrilled about, because the confirming evidence is
the apparent double-burst at 500 msec, right there in the F2/F3 pinch range, where we'd expect a velar release to be. So
we'd conclude that we're looking at a velar nasal followed by a velar plosive release. Homorganicity and all that. But
this would be a red herring, because there's no evidence of velar transition after the releasey thing. So we'll have to come
up with another hypothesis. Which would either be [mb] or [nd]. And there's not a lot to tell us that's unequivocal.
Lower-case T
[t], IPA 103
So on the end of that nasal thingy is a burst, which means a stop, homorganic (same place of articulation) with (to?) the nasal.
Transcribed here as voiceless, because I convinced myself there was a short VOT, but now I'm not sure. It's ambiguous,
as I said before, because the burst is double, and in the right range, but the transitions don't match up with anything
velar. The F3 isn't doing much, and the F2 is so co-produced with the following, um, thing, that it's not telling us much.
So in the end, we'll rely on lexical access to fill this one in later.
Lower-case W
[w], IPA 170
So going from that moment of burst to about 550 msec, there's increasing amplitude and sonority. So I call that a thing.
Weaker than a vowel, probably an approximant, since in all other respects it's continuous with the vowel. The F1 is low.
The F2 is low. The F3 is just sitting there. So we've got something very back/round and close.
Script A
[ɑ], IPA 305
Well the F1 and F2 are pretty much all transition here, but there are some things to be gleaned. Note that the F1 rises
from its pushed-down-by-the-F2 position at the start to about 700 Hz around 675 msec. And then it levels off, or even
drops a little. So there's something lower-mid or low that it's heading toward. The F2 is still low there, so it's possibly
still being pushed down a little, but the point is that there's a moment in there we need to pay attention to. And at that
moment, the F1 indicates a fairly low vowel, and about as back as it can get.
Lower-case N
[n], IPA 116
And now there's another of these things. This one is less ambiguous, though not by much. F2 is at least rising so it can't
be labial and the F3 is just sitting there, so this is unlikely to be velar. So this is probably alveolar. There's stuff going on
that looks a little like voicing at the bottom, but otherwise this looks like a plosive, down to the sharp, alveolar-looking
release. So probably a nasal, with a following plosive...
Lower-case T
[t], IPA 103
... homorganic of course. The release is nice and alveolar-looking, with its sharp onset and high-frequency-tilted noise.
Finally.
Barred I
[ɨ], IPA 317
And here's another shortish, weakish vowel (hey, at least it looks like a vowel).
Lower-case T
[t], IPA 103
So around 825-850 msec or so, everything kind of stops. No voicing, no energy, no noise, anywhere. There's no closure
transient, but then how often are we lucky enough to get one of those. The release happens at about 900 msec, and
while weak is very sharp and in that high-frequency alveolar-looking range again.
Esh
[ʃ], IPA 134
This noise is interesting, since it's very [s]-shaped. It's a single, broad band, centered around 3000-3500 Hz, depending
on where exactly you look. That's a little low for an [s], but whatever. The real give-away is the fact that the noise stops
dead around 1800 Hz, that is just below F2. Which is almost always a classic indicator of a postalveolar fricative, i.e. [ʃ].
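The cut-off cue can be sketched as a simple check. This is a rough illustration under stated assumptions: the function name, the idea of passing `None` when the noise just fades out instead of stopping abruptly, and the 300 Hz tolerance are all hypothetical choices, not measured thresholds.

```python
def sibilant_guess(cutoff_hz, f2_hz, tolerance_hz=300):
    """Guess [ʃ] when frication noise stops abruptly just below F2, else [s].

    cutoff_hz: frequency where the noise stops dead, or None if the lower
               frequencies are merely attenuated with no abrupt edge.
    f2_hz:     F2 of the neighbouring vowel.
    """
    if cutoff_hz is not None and 0 <= f2_hz - cutoff_hz <= tolerance_hz:
        return "ʃ"
    return "s"


# Values loosely based on the description: noise stops near 1800 Hz,
# just under an F2 of about 2100 Hz.
print(sibilant_guess(1800, 2100))
# No abrupt lower edge at all.
print(sibilant_guess(None, 1500))
```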
Lower-case I
[i], IPA 301
So for almost 100 msec, we've got a fairly flat, stable vowel. Yay. F1 is very low, F2 is very high (2100 Hz or over!), which
can really only be an [i]. As high (low F1) and as front (high F2) as can be.
Lower-case P
[p], IPA 101
Another gap, indicating another plosive. Transition-wise, there's not a lot going on. F3 is coming down just a little, and I
can convince myself that F2 is as well, although that may just be me and my imagination running wild (armed with the
knowledge of what's really going on here as well...). The release burst is again sort of sharp and tilted to the high
frequencies, but that doesn't jibe with the apparently labial-looking transitions. Hmm. I'm trying hard to force this to
look bilabial, but except for making a big deal of the low-frequency components of the release transient (which aren't
really missing in the previous alveolar releases, so it would be a lot of handwaving) I'm not having much luck.
Lower-case S
[s], IPA 132
On the other hand, the [s]-shaped tilt to the burst noise might be influenced by the high-frequency tilt to this noise.
Note that it isn't completely contiguous with the burst, which may suggest that this isn't just a [ts] kind of transition.
Anyway, note the off-the-top center of this noise. That's more typical of an [s] than the [ʃ] we saw earlier.
Turned A
[ɐ], IPA 324
I'm sick of using turned V [ʌ] (which the IPA defines as Cardinal 14, the unrounded version of open-o) for this vowel.
The vowels traditionally transcribed as turned-v in English are historically related to short o (and short u), but in my
dialect and in Canadian English there's nothing back about it. The turned (print) a symbol [ɐ] represents (in strict IPA
style) a central vowel of indeterminate height between lower-mid and low. So it is with this vowel. The F1 is a little
higher than 500 Hz, so vaguely lower-mid. The F2 is a little low of central, so vaguely back, but not at all round. So take
your pick. I think the F2 is being pulled down a little here by the following consonant, but that's just me. If you don't
like it, keep using turned-v, but you're unlikely to see it again here.
Lower-case B
[b], IPA 102
F2 and F4 are pointed down. F3 may be or may not be. But F2 and F4 both look labial. The gap is clearly voiced (look at
those nice clear striations), so we're talking [b].
Lower-case S
[s], IPA 132
Stronger than the last one, but clearly high-frequency and broad band, and even though the lower frequencies are
attenuated, there's not the abrupt cut-off at F2 we associated with postalveolars.
Lower-case T
[t], IPA 103
Shortish gap here, with a sharp release (gosh, it looks like the labial release from before, huh?) but the noise in the
short VOT is [s]-shaped, which we really only ever see with alveolar releases.
Barred I
[ɨ], IPA 317
Shortest vowel of the spectrogram. Don't sweat the small stuff.

Lower-case T + Right Superscript H
[tʰ], IPA 103 + 404
Release less obviously bursty here, but the gap is unmistakeable. Transcribed as aspirated because of the longer release
noise/VOT.
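The aspirated/unaspirated call made throughout comes down to how long the release noise lasts. A minimal sketch, with the caveat that the 30 msec threshold and the function name are assumptions made for illustration; the text itself only says 'short' versus 'longer' VOT.

```python
def stop_symbol(base, vot_ms, threshold_ms=30):
    """Add the aspiration diacritic when VOT reaches the threshold.

    threshold_ms is a hypothetical cut-off, not a value from the text.
    """
    return base + "ʰ" if vot_ms >= threshold_ms else base


print(stop_symbol("t", 60))
print(stop_symbol("t", 10))
```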
Barred U
[ʉ], IPA 318
Again, getting strict about my IPA. This is not a back vowel. It is however quite round. So even though this has the same
formants as the barred-i we've seen (and similar to the small-cap i's we may be used to seeing), the down-trend in the F2
indicates increasing rounding (or backness) during the articulation of this vowel. Which is typical of post-alveolar /u/
(reflecting its merger with /ju/ in my dialect). But as round as it gets, it don't get anywhere near 'back'. So transcribed
as round(ing) and central. And high more than anything else.
Lower-case T
[t], IPA 103
One final plosive, preceded with creakiness (utterance-finally, this could just be low pitch, but more likely it's the
combination of low pitch and glottalizing a coda plosive). The release is sharp, and tilted to the high(er) frequencies
(possibly brought down a little by lip rounding from the preceding vowel?). Noise like that is atypical of final velars or
labials.

To properly view the phonetic symbols in the text below, you must have installed either SILDoulos IPA93

Solution for October 2002

"Virgin sharks lay eggs that hatch."

Lower-case V, IPA 129

Turned R + Syllabicity Mark, IPA 151 + 431

Lower-case D, IPA 104

Yogh, IPA 135

Barred I, IPA 317

Lower-case N, IPA 116

Esh, IPA 134

Script A + Rhoticity Sign, IPA 305 + 419

Turned R, IPA 151

Lower-case K, IPA 109

Lower-case L + Mid Tilde, IPA 155 + 428

Lower-case E + Raising Sign, IPA 302 + 429

Glottal Stop, IPA 113

Epsilon, IPA 303

Lower-case G, IPA 110

Lower-case Z, IPA 133

Eth, IPA 131

Schwa, IPA 322

Fish-hook R + Under-Ring, IPA 124 + 402

Lower-case H, IPA 146

Ash, IPA 325

Lower-case T, IPA 103

Esh, IPA 134

Solution for June 2003

"Prairie folk are hardy folk."

Lower-case P + Right Superscript H, IPA 101 + 404,

Turned R, IPA 151,

Epsilon + Rhoticity Sign, IPA 303 + 419,

Turned R, IPA 151,

Lower-case I, IPA 301,

Lower-case F, IPA 128,

Lower-case O, IPA 307,

Lower-case K, IPA 109,

Glottal Stop, IPA 113,

Script A + Rhoticity Sign, IPA 305 + 419,

Turned R, IPA 151,

Lower-case H, IPA 146,

Script A + Rhoticity Sign, IPA 305 + 419,

Turned R, IPA 151,

Fish-Hook R, IPA 124,

Lower-case I, IPA 301,

Lower-case F, IPA 128,

Lower-case O, IPA 307,

Lower-case K, IPA 109,

Solution for September 2003

"They like iced tea with lemon."

Eth, IPA 131,

Lower-case E + Small Capital I, IPA 302 + 319,

Lower-case L + Mid Tilde, IPA 155 + 428,

Lower-case A + Small Capital I, IPA 304 + 319,

Lower-case K + Right Superscript H, IPA 109 + 404,

Glottal Stop, IPA 113,

Lower-case A + Small Capital I, IPA 304 + 319,

Lower-case S, IPA 132,

Lower-case T + Right Superscript H, IPA 103 + 404,

Lower-case W, IPA 170,

Schwa, IPA 322,

Theta, IPA 130,

Lower-case L + Mid Tilde, IPA 155 + 428,

Epsilon, IPA 303,