or Lucida Sans
Unicode. If you are desperate to see phonetic symbols in SIL Sophia or SIL Manuscript, or some other kind of Unicode,
drop me a line.
In case you're wondering, yes, this happened. Parthenogenesis has been reported in a white spotted bamboo shark at
Belle Isle Aquarium in Detroit, MI. Check out this report from National Geographic. Belle Isle Aquarium appears to be
affiliated with the Detroit Zoological Institute but they don't seem to have their own website.
Linguistics Department
University of Manitoba
Winnipeg, Manitoba
CANADA R3T 5V5
To properly view the phonetic symbols on this page, you must have one of the following fonts installed on your system.
One of SIL's IPA93 fonts (SILDoulos IPA93, SILSophia IPA93, or SILManuscript IPA93)
Gentium, an IPA-enabled, Unicode-compliant font by Victor Gaultney
or another IPA-enabled, Unicode-compliant font, such as SILDoulosUnicodeIPA
All are available freeware. Depending on which font(s) you have installed, one or the other symbol in paragraph
headings may not display correctly.
I'm committed to keeping my recommendations to a) freeware fonts with b) decent-looking IPA-compliant symbols. If
anybody has any recommendations for freeware Unicode fonts with good-looking IPA support, I'll test them out and
add them to the style sheet.
Lower-case E, IPA 302, and Small Capital I + Tilde, IPA 319 + 424
[eɪ̃]
I love final lengthening--you can see stuff that you don't ordinarily see. It helps that this is partially nasalized, but
that's getting ahead of things. Note the F1 here is just a hair higher than it was in the preceding two or three vowels,
but nowhere near as high as for the lowish front thing in the second syllable. So this is definitely our mid vowel. For me
/e/ is rarely mid, but whatever. The F2 is quite high, indicating frontness, so we're going for something that starts in the
classic American-style /e/ range. So if this is definitely mid, the others must be high. But then this vowel moves. At
about 1350 msec, the F1 drops sharply--part of the sharpness is probably just the wide-band FFT interacting with the
changing harmonics (which you can kind of see between 1250 and 1350 msec just above F1), and probably there's a
change of state going on here. Actually two things are going on here--the change in tongue-body position, and the
introduction of nasality. You can see the nasality in the 'fuzzing' of the formant edges (i.e. the peaks in the filter are
flattening out a little), and the apparent zero creeping in at about 1500 Hz. That might just be the distance between the F1
and F2 (I can convince myself this same zero is creeping into the previous two vowels), but this one seems like it's higher
in frequency, in spite of the slightly lower F2. So I think something else is going on here. So anyway, the F2 here is high,
but not as high as the things earlier we have now decided are /i/. This thing has a low F1 and a high F2, and is
connected to an /e/. So I transcribed it as a nasalized small-cap I.
Lower-case N
[n], IPA 116
Well, that sharply falling F2 transition might suggest labial, but a) it starts out so high anyway, b) F3 doesn't seem to
be doing anything, and c) it doesn't really fall that far. Taken individually, a) where is it supposed to go from that high,
b) if it's labiality that is affecting the F2, it ought to be affecting F3 as well, and c) the transitions stay well above 1500 Hz
or so, and the 'locus' of alveolar transitions, if you believe in such a thing, is about 1700-1800 Hz more or less depending
on the voice and who you read. So the transitions here are really suggesting an alveolar. The resonances (and zeroes)
suggest a nasal, and the alveolarness (for my voice) is further confirmed by that F2 thing just a hair below 1500 Hz.
Lower-case I
[i], IPA 301
Transitions aside (notice that the F2 looks alveolar on either side, although the F3 transition out of this vowel is a little
ambiguous), this has the same low F1 and extremely high F2 as (in fact higher than) the previous offglide. So there you
go. Must be [i].
Lower-case D
[d], IPA 104
Well, this gap, starting about 475 msec and probably continuing to about 575 msec is kind of long, so it might be two
things. So looking just at the left half (or so), it has alveolar transitions in (although the F3 looks like it's coming down,
which might lead you to suppose this is bilabial--but the lowering in F3 doesn't look 'transitional', so much as it looks
like it's 'modifying' from a higher general position). The first 50 msec or so of the gap is clearly voiced, so this is
probably [d].
Lower-case T
[t], IPA 103
On the other hand, the second half of this gap is probably voiceless. It's got a very strong, sibilant ([s]-shaped) release,
which again suggests alveolar (which is consistent with the transitions out). In spite of the apparent quick onset of
voicing, it is incredibly aspirated. There seems to be some noise going on for almost 200 msec, which is just impossible. So I
think this is actually pretty heavily aspirated. But since we usually define aspiration as VOT, rather than noise, I didn't
transcribe it that way. I might change my mind about this next time.
Schwa
[ə], IPA 322
Well, there's almost 100 msec of voicing in here, although the F2 is still fairly front and moves to neutral (or just a little
lower), so following Keating et al. (1994), as I usually do, I probably should have transcribed this as a barred-i. But it
doesn't make a lot of difference. Note the undifferentiated noise above 2000 Hz that just goes on and on.
Lower-case F
[f], IPA 128
More noise, but this time it's voiceless. My first guess at this would be /h/, since there seems to be some resonance--F3
goes straight through, and you can see F2 moving from its low at about 700 msec up to where it is when the voicing
kicks on at about 775 msec. But there's no F1. Which is possible, but atypical of /h/. Hmm. The noise is
undifferentiated, once you get past the frequencies below 1000 Hz, so there's very little in the way of high-frequency
filtering going on, in spite of the apparent resonance. Hmm. Then there's that F2. Why is the F2 falling to that point
around 700 msec, and then suddenly rising again after? There must be *something* there that is a target causing that. It
can't be alveolar, because an alveolar fricative of any kind would be more [s] shaped. I suppose it could be postalveolar,
given the absence of low-frequency energy, but it doesn't look [ʃ]-like, really. It just isn't loud enough, for one thing. So,
getting back to the F2, we're looking for something with a low F2 target. And lower than an alveolar target. Frankly,
that's as far as I can get. I'd ask you to *consider* [f], and move on.
Small Capital I
[ɪ], IPA 319
Well, at least a typical vowel. Okay. F1 is low, but not incredibly low, so we're looking at a higher than mid vowel, but
probably not just plain high. The F2 is moving, but it starts high and travels slightly higher. It never gets anywhere near
the range of the [i] vowel or the front offglide we've already seen. But it's definitely front. So this could be [e] or [ɪ], and
it just ain't long enough to be [e]. Now look at those F2 and F3 transitions and move on to the next sound.
Lower-case K
[k], IPA 109
Ya gotta love velar pinch. And even though it doesn't really look like a stop (I think the reason my velars never look
stoppy may be my overlarge uvula--don't get me started), it's got to be velar, which doesn't leave many choices. Pretty
clearly voiceless. Burst is a little low in frequency, if that's what that is just shy of 900 msec, but whatever.
Lower-case S
[s], IPA 132
Okay, what was that I was saying about 'not being loud enough'? Obviously I was mistaken. Here we've got broad band,
relatively (I guess) high amplitude noise, so this is probably a fricative. And voiceless, of course. I'd guess [ʃ] due to the
apparent low-frequency zero, but the energy doesn't seem concentrated in the visible/present low frequencies the way
I expect [ʃ] to look. So if it's sibilant, it must be [s]. Which is consistent with the distribution of energy, but I'd be
happier if the higher frequencies (above 4000 Hz) were clearly higher in amplitude than the lower frequencies. I mean, I
think they are, but it's arguable.
Eth
[ð], IPA 131
Well, there's something here. It looks slightly gappy, but there's some noise in the higher frequencies, and there's
some...something at 1500 Hz and something weirdly burst-transient like just after 1000 msec. I don't know what to
make of this, except if it's fricative, it ain't sibilant, and it doesn't really look like anything else. I mean, a post-fricative
plosive ought to have more 'plosion' to it, just because of the airflow. I guess. So bear in mind this is English, and play
the odds.
Ash
[æ], IPA 325
Well, a nice high F1, indicating a rather low vowel (though I think the vowel coming up is lower), but with a mostly
neutral-looking F2. So it's not amazingly front, but then lowish front vowels aren't.
Script A
[ɑ], IPA 305
Ah, high F1, and an F2 as low as it can get. My kind of [ɑ].
Fish-hook R
[ɾ], IPA 124
Well, there's something there. It's very short, and kind of noisy, but there's something that has a vaguely [s]-shaped
release phase as it approaches 1500 msec. Anything that short can only be some kind of flap, so there you go.
Lower-case H
[h], IPA 144
Well, it's noisy, but it's organized into bands like a vowel. Like a voiceless vowel. Like an [h].
Lower-case O
[o], IPA 307
Well, the F1 is not high. It's a little high of neutral, but it's pretty mid-looking. The F2 is nice and low. So this is middish
and roundish, and I have a western US voice so I really only have one vowel back there this could be. Not a heckuva lot of
diphthongization either, although that might be the backing environment of the following dark [l].
Lower-Case S
[s], IPA 132
Well, this is sort of weak, but you can see the fricative, starting at about 100 msec and going on to almost 250 msec.
It's broad band (rather than organized into narrower formant-like organization), and concentrated in the very high
frequencies. Toward the beginning, where the overall amplitude is much less, the frequencies we can see are very high.
So this is a pretty typical sibilant, almost definitely [s].
Epsilon
[ɛ], IPA 303
So the vowel starts just before 250 msec and goes on for about 100 msec. The F1 is a little low of mid, suggesting a
slightly higher mid vowel, which is weird for me if this is [ɛ]. But anyway, this vowel looks quite central, and so this
looks very schwa-like. But judging from the amplitude it must be stressed, and if it's stressed, this can't be my [ʌ],
which is typically low. So this is probably mid or high, and otherwise non-descript. Oh well. The falling formants are
clearly transitional, since they mostly all do it, so they don't help. Not long and not tense.
Lower-Case V
[v], IPA 129
Well, it's very weak, but there's frication throughout this gap up to 400 msec. There's also voicing, so this is either a
very weak voiced stop or a weak voiced fricative. The transitions in and out all suggest bilabial (although the F3 doesn't
help--more later), so, since bilabial fricatives are not an option, as this is my English, labiodental is not a stretch.
Turned R
[ɹ], IPA 151
So there's that F3, clearly transitioning way down in the previous vowel, and there it is here, down at about 1600 Hz or
so. So this must be an /r/ of some kind. Nuff said, I guess.
Schwa
[ə], IPA 322
Transcribing this as a vowel is merely a convenience. There seems to be 'something' between the /r/ and the following
segment, but exactly what is open to interpretation.
Tilde L (Dark L)
[ɫ], IPA 209
So from about 475 to 550 msec or so, there's a dip in amplitude, accompanied by an apparent zero in F2, and a relatively
high F3. Very high considering the previous /r/. I'm not quite sure what's going on in F2, but the raised F3 is usually a
good indicator of the lateral. And the F2, if it's anywhere, is down there below 1000 Hz, so it must be dark.
Barred I
[ɨ], IPA 317
Again, this is a bit of vowel. I made a mistake transcribing it as barred-i--I think I must have misread the F3 as an F2, but
that's idiotic, since even the highest F2 can't get up that high. But it's a transitional vowel more than anything else. So
there.
Lower Case W
[w], IPA 170
So here's another attenuated, presumably consonantal articulation, but fully and clearly voiced. So this is almost
undoubtedly a sonorant, but very, very close. The low F2 is consistent only with something very round and very back,
and the following F2 transition is typical of [w], so there you go.
Turned R
[ɹ], IPA 151
Well, here's another /r/. Low F3, though not as low as previously. I've been noticing the bandwidths of initial /r/s
being very narrow, but that may just be me doing stuff to do that. Anyway, this looks like a typical /r/ in coda position,
with the higher (closer to F3) F2 than in other positions.
Lower-Case D
[d], IPA 104
Gap. Probably a plosive of some kind. Very voiced, which is interesting. There seems to be a falling F4 (or something),
but the F3, if anything, has a rising transition into this gap. But then it would be, since it starts so low. The F2 is
ambiguous to say the least. So on balance, I think the alveolar guess is just a default thing. The release is even weak,
so it's not clear if that fricative coming up is just release (which would tell us a lot about the place of this plosive) or if
it's a fricative.
Lower-Case Z
[z], IPA 133
Well, if there's a fricative, it must be a sibilant. Look at that frequency. And it must be alveolar. Same reason. And
voiced.
Small Capital I
[ɪ], IPA 319
So if it ends up as a front velar, this vowel must be a front vowel. And it is. Quite front, at least at the beginning. And quite
high, judging from the low F1. The F2 transitions down in a way that I'd expect an [i] to have more of a steady state or
trend upward, at least until it starts to transition into a following consonant. So this is probably lax/short/whatever you
want to call it.
Script V
[ʋ], IPA 150
Well, this looks short, and vaguely flap-like, being a short, fully voiced 'gap' looking thing. But while the amplitude
attenuation is appropriate for a flap, the sonorousness is not. The formants may dip away, but they don't 'stop' the way
they would/might in a proper flap. Then there are the transitions. All falling. So this looks (bi)labial again. Not really a
good fricative like the previous one, but an approximant-y looking fricative. And again probably labiodental over
bilabial, just because this is English.
Schwa
[ə], IPA 322
Okay, now this looks like a schwa. Formants at 500, 1500, 2500 and--well, short of 3500, but what the heck.
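For what it's worth, those round numbers are what the textbook quarter-wavelength model predicts for a neutral vocal tract, i.e. a uniform tube closed at the glottis and open at the lips, where formant n sits at (2n - 1) * c / (4 * L). Here is a minimal sketch of that arithmetic. The speed of sound (35000 cm/s) and the tube length (17.5 cm) are the usual round textbook assumptions, not measurements of my vocal tract, and the function name is just for illustration.

```python
def neutral_tube_formants(count, speed_cm_per_s=35000.0, length_cm=17.5):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Uses the quarter-wavelength formula F_n = (2n - 1) * c / (4 * L).
    The default speed of sound and tube length are assumed round values.
    """
    return [(2 * n - 1) * speed_cm_per_s / (4 * length_cm)
            for n in range(1, count + 1)]

print(neutral_tube_formants(4))  # [500.0, 1500.0, 2500.0, 3500.0]
```

A shorter or longer tube scales all four values together, which is one reason the fourth formant in a real voice need not land exactly on 3500.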
Lower-Case N
[n], IPA 116
Another segment that has roughly the duration of a flap, although it might be just a tad long. And considering the
length, it's fully sonorant (with resonances), so probably not a tap. The attenuation therefore is probably close
articulation, and the discontinuity in the frequency/bandwidth of the formants, not to mention the apparent zero
below "F2" suggest a nasal. No evidence of velar pinch or a single F2/F3 range pole, and the pole is up around 1500 Hz,
too high to be my bilabial. Only one choice left.
Lower-Case A + Upsilon
[aʊ], IPA 304 + 321
Well, abstracting away from the first 50 msec or so, the F1 here is fairly high, indicating a fairly low vowel. The F2 starts
in the central range, and goes down, indicating increasing rounding and/or backing. So this is probably a diphthong
[aʊ]. I wish I could see the F1 dropping a little, to suggest going from low-to-high, vowel height-wise, but whatever. I
don't regard /aɔ/ as a likely diphthong in this case.
On the other hand, these things look GORGEOUS on screen, and print fine for me off of the browser when coded in
HTML. So, since this is coded in HTML, I'm switching to them. This month I'm leaving up the regular SILDoulos IPA93 as
before, but have switched the defaults for the Unicode to Gentium (first) and SILDoulosUnicodeIPA. At some point I'll
probably stop using the regular Doulos altogether. So please please please download either Gentium or
SILDoulosUnicodeIPA from the above links, so you can see the symbols as they were meant to be seen.
If you are desperate to see phonetic symbols in SIL Sophia or SIL Manuscript, or some other kind of Unicode, drop me a
line.
Okay, so anyway, the F1 is pretty low (again, as low as it ever gets), the F2 clearly has a very high target, but no steady
state. So this is probably a glide of some kind, and it must be front.
This is the How To page of the mystery spectrogram webzone. Contents for this page:
May 2006: Commentary about stuff that I plan to change, or would like some input on, is interspersed throughout this version of
the page, in this goofy text. Depending on your browser, I think this is rendered in colo(u)r. General stuff changing throughout:
First, read the chapter on acoustic analysis in Ladefoged's A Course in Phonetics, or better yet take a course based on
Ladefoged's Elements of Acoustic Phonetics or Johnson's Acoustic and Auditory Phonetics. Or you can just read this summary,
but bear in mind there's going to be a lot left out, especially in the 'why' realm. Then (as usual) learn by doing!
The goal of this page is to provide just enough basic information for the novice to begin, perhaps with some guidance, the
process of decoding the monthly mystery spectrogram. This page is not intended to be the last word in spectrographic analysis in
general, nor even the last word on spectrogram reading. However, reasoning your way through a mystery spectrogram is very
instructive, especially in relating acoustic events with (presumed) articulatory ones. That is, in relating physical sounds with
speech production.
If you're reading this, I assume you are familiar with basic articulatory phonetics, phonetic transcription, the International Phonetic
Alphabet, and the surface phonology of 'general' North American English (i.e. phonemes and basic contrasts, and major
allophonic variation such as vowel nasalization, nasal place assimilation, and so forth). I try to keep in mind that I have an
international audience, but there are some details I just take to be 'given' for English. Someday if we do spectrograms of other
languages, we'll have to adjust.
I really recommend that beginners find someone to discuss spectrographic issues with. If you're doing spectrograms as part of a
class, form a study group. If you're a 'civilian', form a club. Or something. I'm toying with the idea of starting a Yahoo group or
something for us to do some discussions as 'community'. Strong opinions anyone? Unfortunately, I don't have time to answer in
detail every e-mail I receive about specific spectrograms or sounds or features, but if you have a general question or suggestions,
please feel free to contact me.
Please note: My style sheet calls for this page to be rendered in either Victor Gaultney's Gentium font, or in SIL's
SILDoulosUnicodeIPA. These fonts are (in my opinion) the best available freeware fonts for IPA-ing in Unicode for the web.
Please see my list of currently supported fonts for justification and links to download these fonts.
A sound spectrogram (or sonogram) is a visual representation of an acoustic signal. To oversimplify things a fair amount, a Fast
Fourier transform is applied to an electronically recorded sound. This analysis essentially separates the frequencies and
amplitudes of its component simplex waves. The result can then be displayed visually, with degrees of amplitude (represented
light-to-dark, as in white=no energy, black=lots of energy), at various frequencies (usually on the vertical axis) by time
(horizontal).
Depending on the size of the Fourier analysis window, different levels of frequency/time resolution are achieved. A long window
resolves frequency at the expense of time—the result is a narrow band spectrogram, which reveals individual harmonics
(component frequencies), but smears together adjacent 'moments'. If a short analysis window is used, adjacent harmonics are
smeared together, but with better time resolution. The result is a wide band spectrogram in which individual pitch periods appear
as vertical lines (or striations), with formant structure. Generally, wide band spectrograms are used in spectrogram reading
because they give us more information about what's going on in the vocal tract, for reasons which should become clear as we go.
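The window trade-off is easy to put numbers on. As a rough sketch (the exact figure depends on the window shape, which I'm ignoring), a window that lasts T seconds can only separate frequencies that are more than about 1/T Hz apart. The sampling rate, the window lengths, and the 100 Hz pitch below are all assumed values chosen for illustration, and the function name is hypothetical.

```python
def window_tradeoff(sample_rate_hz, window_samples):
    """Return (window duration in ms, approximate frequency resolution in Hz)."""
    duration_ms = 1000.0 * window_samples / sample_rate_hz
    # Components closer together than roughly 1/duration get smeared together.
    resolution_hz = sample_rate_hz / window_samples
    return duration_ms, resolution_hz

SAMPLE_RATE_HZ = 16000  # assumed sampling rate
F0_HZ = 100             # assumed pitch, so harmonics are 100 Hz apart

for label, samples in (("short window", 64), ("long window", 640)):
    duration_ms, resolution_hz = window_tradeoff(SAMPLE_RATE_HZ, samples)
    separates = resolution_hz < F0_HZ
    print(f"{label}: {duration_ms:.0f} ms, ~{resolution_hz:.0f} Hz, "
          f"separates harmonics: {separates}")
```

With these numbers the 4 ms window is shorter than one 10 ms pitch period, so it shows individual glottal pulses but blurs the harmonics (wide band), while the 40 ms window spans several pulses and resolves the harmonics instead (narrow band).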
The point is that the vocal source isn't just one frequency, but many frequencies ranging from the fundamental all the way up to
infinity, in principle, in integral multiples. Just as white light is many frequencies of light all mixed up together, so is the vocal
source a spectrum of acoustic energy, going from low frequencies (the fundamental) to high frequencies. In principle, there's
some energy at all frequencies (although unless you're talking about an integral multiple of the fundamental, the amount will be
zero).
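To make 'integral multiples' concrete, here is a small sketch that lists the harmonics of a given fundamental up to some ceiling. The function name and the example pitches are assumptions for illustration only.

```python
def harmonics(f0_hz, max_hz):
    """List the harmonic frequencies (integer multiples of f0_hz) up to max_hz."""
    return [k * f0_hz for k in range(1, int(max_hz // f0_hz) + 1)]

print(harmonics(100, 500))  # [100, 200, 300, 400, 500]
print(harmonics(200, 500))  # [200, 400]
```

Notice that doubling the fundamental doubles the spacing between harmonics, so a higher-pitched voice has fewer harmonics in any given frequency range.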
The energy provided by the source is then filtered or shaped by the body of the instrument. In essence, the filter sifts the energy
of some harmonics out (or at least down) while boosting others. The analogy to light again is apt. If you pass a white light through
a red filter, you end up removing (or lessening) the energy at the blue end of the spectrum, while leaving the red end of the
spectrum untouched. Depending on the filter, you might pass a band of energy in the red end and a band of energy in the green
band, and something else. The 'color' of light that results will be different depending on which frequencies exactly get passed, and
which ones get filtered.
In speech, these different tonal qualities change depending on vocal tract configuration. What makes an [i] sound like an [i] is not
something to do with the source, but the shape of the filter, boosting some frequencies and damping others, depending on the
shape of the vocal tract. So the 'quality' of the vowel depends on the frequencies being passed through the acoustic filter (the
vocal tract), just as the 'color' of light depends on the frequencies being passed through the light filter.
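Here is a toy version of the source-filter idea, very much a sketch under stated assumptions: the source is a set of equal-amplitude harmonics (a real glottal source gets weaker at higher frequencies), the filter is a couple of simple resonance peaks, and the two formant values are hypothetical [i]-like round numbers. All function names are made up for the example.

```python
import math

def resonance_gain(freq_hz, center_hz, bandwidth_hz):
    """Gain of one simple resonance peak: 1.0 at the center, less on either side."""
    offset = (freq_hz - center_hz) / (bandwidth_hz / 2.0)
    return 1.0 / math.sqrt(1.0 + offset ** 2)

def filtered_spectrum(f0_hz, formants_hz, max_hz=3000, bandwidth_hz=100):
    """Map each harmonic of f0_hz to its amplitude after filtering.

    Every harmonic starts at amplitude 1.0 (assumed flat source) and is
    scaled by the gain of whichever resonance boosts it most.
    """
    spectrum = {}
    for k in range(1, int(max_hz // f0_hz) + 1):
        freq = k * f0_hz
        spectrum[freq] = max(resonance_gain(freq, f, bandwidth_hz)
                             for f in formants_hz)
    return spectrum

def strongest_harmonics(f0_hz, formants_hz, count=2):
    """Return the `count` strongest harmonics, in ascending frequency order."""
    spectrum = filtered_spectrum(f0_hz, formants_hz)
    return sorted(sorted(spectrum, key=spectrum.get, reverse=True)[:count])

FORMANTS_HZ = [300, 2300]  # hypothetical [i]-like resonances
print(strongest_harmonics(100, FORMANTS_HZ))  # [300, 2300]
print(strongest_harmonics(150, FORMANTS_HZ))  # [300, 2250]
```

Changing the pitch changes which harmonics exist, but the strongest ones still cluster around the same two resonances. That is the sense in which the filter, not the source, carries the vowel quality.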
So, we can manipulate source characteristics (the relative frequency and amplitude of the fundamental, and some properties of
some of the harmonics) at the larynx independently of filter characteristics (vocal tract shape). Figure 1 is a
spectrogram of me saying [ i ɑ i ɑ ] (i.e. "ee ah ee ah") continuously on a steady pitch. On the left, a wide band spectrogram
shows the formants (darker bands running horizontally across the spectrogram) changing rapidly as my vocal tract moves
between vowel configurations. (Take a moment to notice that the wide band spectrogram is striated, and the horizontal formants
are 'overlaid' over the basic pattern of vertical striations.) On the right, a narrow band spectrogram reveals that the harmonics
(the complex frequencies provided by the source) are steady, i.e. the pitch throughout is flat. Because some harmonics are
stronger than others at any given moment, you can make out the formant structure even in the narrow band spectrogram. The
filter function (the formant structure) is superimposed over the source structure.
If you're still not sure what I mean by 'band' or 'formant', pass your mouse cursor over the figure. I've marked the center
frequency, more or less, of each visible formant in the figure. Look in the captions of spectrograms for extra information like
this. Depending on your hardware/software configuration, you should also be able to play the audio clip, by pressing the 'play'
button in the figure caption.
Figure 1. Wide band (left) and narrow band (right) spectrograms, illustrating changing vowel quality with level pitch.
The other side of the source-filter coin is that you can vary the pitch (source) while keeping the same filter. Figure 2 shows
wide and narrow band spectrograms of me going [aː], but wildly moving my voice up and down. The formants stay steady in the
wide band spectrogram, but the spacing between the harmonics changes as the pitch does. (Harmonics are always evenly
spaced, so the higher the fundamental frequency —the pitch of my voice—the further apart the harmonics will be.)
Figure 2. Wide band (left) and narrow band (right) spectrograms of me saying [aː], but with wild pitch changes.
A word on sources
I like to divide the kinds of sources in speech into three categories: periodic voicing (or vibration of the vocal folds), non-voicing
(which most people don't consider, but I like to distinguish it from my third category), and aperiodic noise (which results from
turbulent airflow).
Voicing is represented on a wide band spectrogram by vertical striations, especially in the lowest frequencies. Each vertical 'line'
represents a single pulse of the vocal folds, a single puff of air moving through the glottis. We sometimes refer to a 'voicing bar',
i.e. a row of striated energy in the very low frequencies, corresponding to the energy in the first and second harmonics (typically
the strongest harmonics in speech). For men, this is about 100-150 Hz, for women it can be anywhere between 150-250 Hz, and
of course there's lots of variation both within and between individuals. In a narrow band spectrogram, voicing results in harmonics,
with again the lowest one or two being the strongest.
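Since each striation is one glottal pulse, the spacing between striations gives you the pitch period, and its inverse gives you the fundamental. A minimal sketch, assuming you have read off the times of a few successive striations by eye (the pulse times below are invented for the example, and the function name is hypothetical):

```python
def estimate_f0_hz(pulse_times_ms):
    """Estimate the fundamental from the times (ms) of successive striations.

    Uses the average interval between the first and last pulse.
    """
    intervals = len(pulse_times_ms) - 1
    mean_period_ms = (pulse_times_ms[-1] - pulse_times_ms[0]) / intervals
    return 1000.0 / mean_period_ms

print(estimate_f0_hz([0, 8, 16, 24, 32]))  # 125.0
print(estimate_f0_hz([0, 5, 10]))          # 200.0
```

So striations about 8 msec apart put a voice in the range I gave for men, and striations about 5 msec apart put it in the range I gave for women.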
Non-voicing is basically silence, and doesn't show up as anything in a spectrogram. So while there isn't a lot going on during
silence that we can see in a spectrogram, we can still tell the difference between voiced sounds (with a striated voicing bar) and
voiceless sounds (without). And usually there's still air moving through the vocal tract, which can provide an alternative source of
acoustic energy, via turbulence or 'noise'.
On the other hand, it's worth distinguishing several glottal states that lead to non-voicing. Typically, active devoicing results from
vocal fold abduction. The vocal folds are held wide apart and thus movement of air through the glottis doesn't cause the folds to
vibrate. If the vocal folds are tightly adducted (brought together in the midline) and stiffened, the result is no air movement through
the glottis, due to glottal closure. Ideally, this is how a 'glottal stop' is produced. Finally, the vocal folds may be in 'voicing
position', loosely adducted and relatively slack. But if there is insufficient pressure below the glottis (or too much above the glottis)
the air movement through the glottis won't be enough to drive vibration, and passive devoicing occurs.
Noise is random (rather than striated or harmonically organized) energy, and usually results from friction. In speech this friction is
of two types. There's the turbulence generated by the air as it moves past the walls of the vocal tract, usually called 'channel
frication'. This is just 'drag', resistance to the free flow of air. If the air is blown against (instead of across) an object, you get even
more turbulence, which we sometimes call 'obstacle frication'. For instance, when we make an [s], a jet of air is blown against the
front teeth—the sudden displacement results in a lot of turbulence, and therefore noise. In spectrograms, noise is 'snowy'. The
energy is placed in frequency and amplitude more randomly rather than being organized neatly into striations or clear bands. (Not
to say there aren't or can't be bands. They just usually don't have 'edges' to the degree that formants do. Or may.)
We'll return to voicing and voicelessness below, after we deal with vowels.
Now look at the next formant, F2. Notice that the back, round vowels have a very low F2. Notice that the vowel with the highest
F2 is [i], which is the frontmost of the front vowels. F2 corresponds to backness and/or rounding, with fronter/unround vowels
having higher F2s than backer/rounder vowels. It's actually much more complicated than that, but that will do for the beginner. If
you're picky about facts or the math, take a class in acoustic phonetics.
Figure 3. Wide band spectrograms of the vowels of American English in a /b__d/ context.
Top row, left to right: [i, ɪ, eɪ, ɛ, æ]. Bottom row, left to right: [ɑ, ɔ, o, ʊ, u].
There are a variety of studies showing various acoustic correlates of vowel quality, among them formant frequency, formant
movement, and vowel duration. Formant frequency (and movement) are probably the most important. So we can plot vowels in
an F1xF2 vowel space, where F1 corresponds (inversely) to height, and F2 corresponds (inversely) to backness and we'll end up
with something like the standard 'articulatory' vowel space.
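The inverse relationships are easy to lose track of, so here is a small sketch that orders a few vowels by height and by frontness from their formants alone. The formant values are hypothetical round numbers picked only to illustrate the layout, not measurements from the figures, and the function names are made up.

```python
# Hypothetical (F1, F2) values in Hz, for illustration only.
VOWELS = {
    "i": (300, 2300),
    "æ": (700, 1700),
    "ɑ": (750, 1100),
    "u": (320, 900),
}

def by_height(vowels):
    """Highest vowel first: a lower F1 means a higher vowel."""
    return sorted(vowels, key=lambda v: vowels[v][0])

def by_frontness(vowels):
    """Frontest vowel first: a higher F2 means a fronter (or less round) vowel."""
    return sorted(vowels, key=lambda v: vowels[v][1], reverse=True)

def chart_position(f1_hz, f2_hz):
    """Flip both axes so front is left and high is up, as on the usual vowel chart."""
    return (-f2_hz, -f1_hz)

print(by_height(VOWELS))     # ['i', 'u', 'æ', 'ɑ']
print(by_frontness(VOWELS))  # ['i', 'æ', 'ɑ', 'u']
```

Plotting `chart_position` for each vowel puts [i] at the top left and [ɑ] at the bottom right, which is the familiar quadrilateral.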
Note that some of the vowels in Figure 3 ([eɪ] and [ʊ] especially) show more movement during the vowel (beyond just the
transitions). Whether that makes them diphthongs (or should be represented like diphthongs) I'll leave for somebody else to
argue. But before we get too far, what would you imagine an [aɪ] or [aʊ] diphthong would look like?
It's worth pointing out now that all the formants show consonant transitions at the edges. Remember that the frequency of any
given formant has to do with the size and shape of the vocal tract—as the vocal tract changes shape, so do the formants change
frequency. So the way the formants move into and out of consonant closures and vowel 'targets', is an important source of
information about how the articulators are moving.
Generally we can think about the English plosives as occurring at three places of articulation—at the lips, behind the incisors, and
at the velum (with some room to play around each). The bilabial plosives, [p] and [b] are articulated with the lower lip pressed
against the upper lip. The coronal plosives [t,d] are made with the tongue blade pressing against the alveolar ridge (or
thereabouts). [k] and [g] are described as 'dorsal' (meaning 'articulated with the tongue body') and 'velar' (meaning 'articulated
against or toward the velum'), depending on your point of view. (I tend to use 'dorsal' and 'velar' interchangeably, which is
very bad. I use 'coronal' because it's more accurate than 'alveolar', in the sense that everybody uses their tongue blade (if not the
apex) for [t,d], but not everybody uses only their alveolar ridge.)
That controversy aside, the thing to remember is that during a closure, there's no useful sound coming at you—there's basically
silence. So while the gap tells you it's a plosive, the transitions into and out of the closure (i.e. in the surrounding vowels) are
going to be the best source of information about place of articulation. Figure 4 contains spectrograms of me saying 'bab' 'dad' and
'gag'.
Take a look at those formant transitions out of and into each plosive. Notice how the transitions in the F2 of 'bab' point down (i.e.
the formant rises out of the plosive and falls into it again), where the F2 of 'gag' points up? Notice how in 'gag' the F2 and F3 start
out and end close together? Notice how the F3 of 'dad' points slightly up at the plosives? Notice how the F1 always starts low,
rises into the vowel, and then falls again?
Okay, these aren't necessarily the best examples, but basically, labials have downward pointing transitions (usually all visible
formants, but especially F2 and F3), dorsals tend to have F2 and F3 transitions that 'pinch' together (hence 'velar pinch'), and the
F3 of coronals tends to point upward. The direction any transition points obviously is going to depend on the position of the
formant for the vowel, so F2 of [t,d] might go up or down. A lot of people say coronal transitions point to about 1700 or 1800 Hz,
but that's going to depend a lot on speaker-individual factors. Generally, I think of coronal F2 transitions as pointing upward
unless the F2 of the vowel is particularly high.
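The rules of thumb above can be written out as a toy decision procedure. This is only a sketch: the function name, the idea of comparing each formant at the vowel target with its value at the edge of the closure, and the 300 Hz 'pinch' threshold are all assumptions made for illustration, not measured values.

```python
def guess_place(f2_vowel, f3_vowel, f2_edge, f3_edge, pinch_hz=300.0):
    """Guess plosive place from F2/F3 (Hz) at the vowel target and at the
    edge of the closure. pinch_hz is an assumed, illustrative threshold."""
    gap_vowel = f3_vowel - f2_vowel
    gap_edge = f3_edge - f2_edge
    # Velar pinch: F2 and F3 end up close together next to the closure.
    if gap_edge < pinch_hz and gap_edge < gap_vowel:
        return "dorsal"
    # Labial: F2 points down toward the closure and F3 does not rise.
    if f2_edge < f2_vowel and f3_edge <= f3_vowel:
        return "labial"
    # Coronal: F3 points upward toward the closure (F2 may go either way).
    if f3_edge > f3_vowel:
        return "coronal"
    return "unclear"


# Hypothetical formant values in Hz, for illustration only.
print(guess_place(1700, 2500, 1400, 2300))  # F2 and F3 both fall
print(guess_place(1700, 2500, 2100, 2300))  # F2 and F3 pinch together
print(guess_place(1700, 2500, 1750, 2700))  # F3 rises
```

Real transitions are rarely this tidy, so an "unclear" answer is a perfectly reasonable outcome; bursts and top-down guessing have to do the rest.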
Another thing to notice is the burst energy. Notice that the bursts for "dad" are darker (stronger) than the others. Notice also that
they get darker in the higher frequencies than the lower. The energy of the bursts in "gag" is concentrated in the F2/F3 region,
and less in the higher frequencies. The burst of [b] is sort of broad—across all frequencies, but concentrated in the lower
frequencies, if anywhere. So bursts and transitions also give you information about place.
Figure 4 also illustrates that in initial position, phonemic /b, d, g/ tend to surface with no voicing during the closure, but a short
voice onset time, i.e. as unaspirated [p, t, k]. In final position, they tend to surface as voiced, although there's room for variation
here too.
Fricatives
Frankly, fricatives are not my favorite. They're acoustically and aerodynamically complex, not to mention phonologically and
phonetically volatile. There's not a lot you can say about them without getting way too complicated, but I'll try.
Fricatives, by definition, involve an occlusion or obstruction in the vocal tract great enough to produce noise (frication). Frication
noise is generated in two ways, either by blowing air against an object (obstacle frication) or moving air through a narrow channel
into a relatively more open space (channel frication). In both cases, turbulence is created, but in the second case, it's turbulence
caused by sudden 'freedom' to move sideways (Keith Johnson uses the terrific analogy of a road suddenly widening from two to
four lanes, with a lot of sideways movement into the extra space), as opposed to air crashing around itself having bounced off an
obstacle (Keith's freeway analogy of a road narrowing from four lanes to two works here, but I don't really want to think about
serious sibilance in this respect....)
Sibilant fricatives involve a jet of air directed against the teeth. While there is some (channel) turbulence, the greater proportion of
actual noise is created by bouncing the jet of air against the upper teeth. The result is very high amplitude noise. Non-sibilant
fricatives are more likely 'pure' channel fricatives, particularly bilabial and labiodental fricatives, where there's not a lot of stuff in
front to bounce the air off of.
In Figure 5, there are spectrograms of the fricatives, extracted from a nonce word ("uffah", "ussah", etc.).
Figure 5: Top row, left to right: f, theta, s, esh. Bottom row, left to right: v, eth, z, yogh.
Let's start with the sibilants "s" and "sh", in the upper right of Figure 5. They are by far the loudest fricatives. The darkest part of
[s] noise is off the top of the spectrograms, even though these spectrograms have a greater frequency range than the others on
this page. [s] is centered (darkest) above 8000 Hz. The postalveolar "sh", on the other hand, while almost as dark, has most of its
energy concentrated in the F3-F4 range. Often, [s]s will have noise at all frequencies, where, as here, the noise for [ʃ] seems to
drop off drastically below the peak (i.e. there's sometimes no noise below 1500 or 2000 Hz.) [z] and [ʒ] are distinguished from
their voiceless counterparts by a) lesser amplitude of frication, b) shorter duration of frication and c) a voicing bar across the
bottom. (Remember, however, that a lot of underlyingly voiced fricatives in English have voiceless allophones. What other cues
are there to underlying voicing? Discuss.) Take a good look at the voicing bar through the fricatives in the bottom row. You may
never see a fully voiced fricative from me again.
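One common way to put a number on "where the noise is concentrated" is the spectral centre of gravity, the power-weighted mean frequency of the noise. That measure is swapped in here for illustration; it is not something the discussion above relies on. The sketch below is minimal, and the two spectra are made-up values shaped like the descriptions of [s] and [ʃ], not measurements.

```python
def centre_of_gravity(freqs_hz, powers):
    """Power-weighted mean frequency (Hz) of a spectrum slice."""
    total = sum(powers)
    if total <= 0:
        raise ValueError("spectrum has no energy")
    return sum(f * p for f, p in zip(freqs_hz, powers)) / total


# Made-up power values, for illustration only (not measurements).
FREQS_HZ = [1000, 2000, 3000, 4000, 6000, 8000, 10000]
S_LIKE = [1, 1, 1, 2, 4, 8, 6]    # noise everywhere, strongest up high
ESH_LIKE = [0, 2, 8, 6, 2, 1, 0]  # little below the peak, peak near 3000 Hz

cog_s = centre_of_gravity(FREQS_HZ, S_LIKE)
cog_esh = centre_of_gravity(FREQS_HZ, ESH_LIKE)
print(round(cog_s), round(cog_esh))
```

With these toy numbers the [s]-like spectrum comes out with a much higher centre of gravity than the [ʃ]-like one, which is the same contrast the spectrograms show by eye.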
It's worth noting that F2 transitions are greater and higher with [ʃ] than with [s], and I seem to depress F4 slightly in [ʃ], but I don't
know how consistent these markers are.
Labiodental and (inter)dental (nonsibilant) fricatives are notoriously difficult to distinguish, since they're made at about the same
place in the vocal tract (i.e. the upper teeth), but with different active articulators. Having established (in a mystery spectrogram)
that a fricative isn't loud enough to be a sibilant, you can sometimes tell from transitions whether it is labiodental or interdental—
labiodental will have labial-looking transitions, interdentals might have slightly more coronal looking transitions. But that's poor
consolation—often underlying labiodental and interdental fricatives don't have a lot of noise in the spectrogram at all, looking
more like approximants. Sometimes they lenite into approximants, or fortisize to stoppy-looking things. I hate fricatives.
Before moving on, we need to talk about [h]. [h] is always described as a glottal fricative, but since we know about channels and
such, it's not clear where the noise actually comes from. Aspiration noise, which is also [h]-like, is produced by moving a whole lot
of air through a very open glottis. I heard a paper once where they described the spectrum of [h]-noise as 'epiglottal', implying that
the air is being directed at the epiglottis as an obstacle. Generally speaking, we don't think of the vocal cords moving together to
form a 'channel' in [h], although breathy-voicing and voiced [h]s in English (as many intervocalic [h]s are produced) may be
produced this way. So I don't know. What I do know about [h]s is that the noise is produced far enough back in the vocal tract that
it excites all the forward cavities, so it's a lot like voicing in that respect. It's common to see 'formants' excited by noise rather than
harmonics in spectrograms of [h]. Certainly, the noise will be concentrated in the formant regions. Compare the spectrograms in
Figure 6.
Notice how different the frication looks in each spectrogram. In "hee", the noise is concentrated in F2, F3 and higher, with very
little in the 1000 Hz range. In "ha", in which F1 and F2 straddle 1000 Hz, the [h] noise is right down there. In "who", there is a lot
less amplitude to the noise between 2000 and 3000 Hz, but around F2 (around 1000 Hz) and lower, there's a great deal.
You can even see F2 really clearly in the [h] of "who". So that's [h]. Don't ask me. It's not very common in my spectrograms....
Nasal stops
Nasals have some formant structure, but are better identified by the relative 'zeroes' or areas of little or no spectral energy. In
Figure 7, the final nasals have identifiable formants that are lesser in amplitude than in the vowel, and the regions between them
are blank. Nasality on vowels can result in broadening of the formant bandwidths (fuzzying the edges), and the introduction of
zeroes in the vowel filter function. Nasals can be tough, and I hope to get someone who knows more about them than I do to say
something else useful about them. You can sometimes tell from the frequency of the nasal formant and zero what place of
articulation was, but it's usually easier to watch the formant transitions. (This is particularly true of initial nasals; final nasals I
usually don't worry about--if you can figure out the rest of the word, there's only three possible nasals it could end with.) (Actually,
being loose with the amount of information you actually have before you start trying to fit words to the spectrogram is one of the
tricks to the whole operation.)
Figure 7. Spectrograms of "dinner", "dimmer", "dinger".
The real trick to recognizing nasal stops is a) formant structure, but b) relatively lower-than-vowel amplitude. Place of articulation
can be determined by looking at the formant transitions (they are stops, after all), and sometimes, if you know the voice well, the
formant/zero structure itself. Comparing the spectrograms above, we can see that 'dinger' (far right) has an F2/F3 'pinch'—the
high F2 of [ɪ] moves up and seems to merge with the F3. In the nasal itself, the pole (nasal formant) is up in the neutral F3 region.
'Dinner' (middle) has a pole about 1500 Hz and a zero (a region of low amplitude) below it until you get down to about 500 Hz
again. The pole for [m] in 'dimmer' is lower, closer to 1000 Hz, but there's still a zero between it and what we might call F1. Note
also that the transitions moving into the [m] of 'dimmer' are all sharply down-pointing, even in the higher formants, a very strong clue
to labiality, if you're lucky enough to see it.
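As a rough illustration of using the pole frequency on its own, here is a nearest-reference lookup. It is a sketch under loud assumptions: the 1000 Hz and 1500 Hz values are the approximate poles described above for this one voice, the 2450 Hz value is just a placeholder for "up in the neutral F3 region", and the function name is made up.

```python
# Approximate nasal pole frequencies (Hz) for one speaker. The velar value is
# an assumed stand-in for "the neutral F3 region", not a measurement.
REFERENCE_POLES_HZ = {"m": 1000.0, "n": 1500.0, "ŋ": 2450.0}


def nearest_nasal(pole_hz):
    """Return the nasal whose reference pole is closest to pole_hz."""
    return min(REFERENCE_POLES_HZ, key=lambda s: abs(REFERENCE_POLES_HZ[s] - pole_hz))


print(nearest_nasal(1000))
print(nearest_nasal(1500))
print(nearest_nasal(1100))  # an in-between pole still gets forced to one answer
```

A lookup like this will always give an answer, even for an in-between pole, which is exactly why the formant transitions (and knowing the voice) matter more than the pole alone.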
Approximants
In case you're not familiar with the term (generally attributed to Ladefoged's Phonetic Study of West African Languages or as
modified in Catford's Fundamental Problems in Phonetics), the approximants are non-vowel oral sonorants. In English, this
amounts to /l, r, w, j/. They are characterized by formant structure (like vowels), but constrictions of about the degree of high
vowels or slightly closer. Generally there's no friction associated with them, but the underlying approximants can have fricative
allophones, just as fricative phonemes can occasionally have frictionless (i.e. approximant) allophones.
Canonically, the English approximants are those consonants which have obvious vowel allophones. The classic examples are the
[j-i] pair and the [w-u] pair. I have argued that [ɹ] is basically vowel-like in structure, i.e. that syllabic /r/ is the most basic
allophone, but there are those who disagree. Syllabic [l]s are all at least plausibly derived from underlying consonants, but I'm
guessing that'll change in the next hundred years.
In Figure 8, the approximants are presented in coda/final position, where the formant transitions are easiest to discern. Note that
in all four words, the F1 is mid-to-high, indicating a more open constriction than with a typical high vowel. For /l/, the F2 is quite
low, indicating a back tongue position—velarization of 'dark l' in English. The F3, on the other hand, is very high, higher than one
ever sees unless the F2 is pushing it up out of the way. In "bar", the F3 comes way down, which is characteristic of [ɹ] in English.
Compare the position of the F3 in "bar" with that in "bough" and "buy", where the F3 is relatively unaffected by the constriction.
In "bough", the F2 is very low, as the tongue position is relatively back and the lips are relatively rounded. Note that this has
no effect on F3, so let it be known that lip rounding has minimal effect on F3. Really. The next reviewer who brings up lip
rounding without having some data to back it up is going to get it between the eyes. It's worth noting that the nuclear part of the
diphthong is fronter (as indicated by the F2 frequency in the first half of the diphthong) in [aʊ] than in [aɪ]. In 'buy',
the offglide has a clearly fronting (rising) F2.
Note that for both proper plosives, there's a longish period of relative silence (with a voicing bar in the case of /d/), on the order of
100 ms. The actual length varies a lot, but notice how short the 'closure' of the flapped case is in comparison. It's just a slight
'interruption' of the normal flow, a momentary thing, not something that looks very forceful or controlled. It doesn't even really
have any transitions of its own. The interruption is something on the order of three pulses long, between 10 and 30 ms. That's
basically the biggest thing. Sometimes they're longer, sometimes they're voiceless (occasionally even aspirated), but basically a
flap will always be significantly shorter than a corresponding plosive.
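The "three pulses, 10 to 30 ms" figure is just arithmetic on the fundamental frequency, and the duration contrast can be turned into a crude labeller. This is a sketch only: the function names and the 40 ms cutoff are assumptions picked to sit between the flap and plosive durations mentioned above, not established values.

```python
def pulses_to_ms(n_pulses, f0_hz):
    """Duration in ms of n glottal pulses at fundamental frequency f0_hz."""
    return 1000.0 * n_pulses / f0_hz


def flap_or_plosive(closure_ms, flap_max_ms=40.0):
    """Label a coronal closure by duration. flap_max_ms is an assumed cutoff."""
    return "flap" if closure_ms <= flap_max_ms else "plosive"


# Three pulses last 30 ms at an F0 of 100 Hz and 10 ms at an F0 of 300 Hz.
print(pulses_to_ms(3, 100), pulses_to_ms(3, 300))
print(flap_or_plosive(20), flap_or_plosive(100))
```

Since the actual durations vary a lot, the cutoff is best treated as a starting guess to be adjusted for the voice and the speaking rate.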
Okay, so let's turn back to the proper plosives. Notice the aspiration following the /t/, and the short VOT following the /d/. Note the
dying-off voicing during the /d/ closure, presumably due to a build up of supralaryngeal pressure. (Frankly, we're lucky to get any
real voicing during the closure at all.)
(Other big allophonic categories I want to cover are nasalized vowels and rhoticized vowels, but I'm wondering how important
they are at this level. Remember that this is a primer, not the be-all and end-all work on spectrogram reading. Also worth doing is
some prosodic stuff, pitch and duration, amplitude and that kind of thing, as it relates to finding word and phrase boundaries in
spectrogram reading. Comments?)
Is that it?
Well, obviously not. But it should be enough to get you started reading the monthly mystery spectrogram. We could go on and on
about various things, but that's not the point right now. Remember, identify the features you can, try to guess some words,
hypothesize, and then see if you can use your hypotheses to fill in some of the features you're unsure about. Do some lexical
access, try some phrases, and see how well you do. Reading spectrograms, like transcription and so many other things, can be
taught in a short time, but takes a long time and experience to learn. But then that's why we're here, right?
This month's high-pedagogy mode spectrogram is heavy with fricatives (hence the extended frequency scale) and nasals. So pay
attention. Things aren't going to stay this "easy" for long...
So remember how these fricatives really look, so that when you can only see them up to 4000 Hz you can hypothesize how they
are supposed to look.
Lower-Case F
[f], IPA 128
So here's a fricative. It has very little in the way of resonance-y-looking organization. It *may* be sibilant--it has vaguely the same
profile as the following fricatives (cent(e)red around 700 and 900 msec). But you may notice that it isn't quite as strong in the high
frequencies as the ones that follow. So if one of them isn't a sibilant, it would be this one. And if it isn't, this one looks more labial
than the others. After all, all three formants rise (to different degrees) starting with the voicing onset around 350 or 375 msecs. So
this seems to have labial transitions out. It's strong, I have to say, for a labiodental fricative. But there you go.
Lower-Case A + Upsilon
[aʊ], IPA 304 + 321
So, abstracting away from the transitions (so starting around 400 msec or so), we've got a quite high F1, a middling F2 and an F3
we don't expect to tell us much. There is a funny discontinuity in the amplitude of the striations above about 2000 Hz, which might
tell us something, but since the overall amplitude seems fairly constant until 525 msec or so, it looks like this is a classic,
vocalic-throughout, diphthong. So after that amplitude change, the F1 seems to be falling. The F2 is decidedly low. So this vowel starts
rather low and vaguely central, and moves (sort of) up and definitely backer and/or rounder.
Lower-Case N
[n], IPA 116
Starting from 525 msec or so there's something voiced, but of reduced overall intensity compared to the preceding vowel. It's
clearly got periodic energy in the resonances, mostly separated by zeroes. So this is almost definitely a nasal. Now here's where
things fall apart. The first pole above the voicing bar seems to be about 1100 Hz. This is not where I expect any of my nasal
resonances to be. If it were at 1000 Hz, it would look labial. If it were about 1500 Hz, it would look coronal. There's no hint of velar
pinch so don't even go there. So this looks closest to being labial. But as it turns out, it's not. On the other hand, the immediately
preceding vowel is round (or at least the offglide of the diphthong is probably round) which may be pulling down the locus of the
transition. Why it should have any effect on the nasal pole I have no idea, and I doubt that it does. So exactly why this looks like
this is beyond me. If you think it's an [m], try to find a word that fits. See?
Lower-Case D
[d], IPA 104
I'd like to talk to someone about these nasal-plosive sequences. They always look like this, and it seems to me we have to start
transcribing such sequences as orally-released nasals or something. (I think Keating et al, 1994, used the distinction between
plosive closure-durations and plosive releases to do this sort of thing.) Anyway, there's this distinct oral release at just about 600
msec. It's quite sharp, with limited transitional information in the following vowel. So it's probably the same place as the preceding
nasal, just on general principle, and the absence of useful transitions is weakly suggestive of something coronal.
Schwa
[ə], IPA 322
Basically, we've got a mid-looking F1, possibly just low of mid, so this is a mid or higher-mid vowel. The F2 is pretty much central.
The F3 is as neutral as it gets. So this is pretty classically schwa-looking.
Lower-Case S
[s], IPA 132
Ignoring whatever we thought of the earlier fricative consonant (the one that turns out to be [f]), this one is fairly clearly an [s]. It's
very broad band, having energy at all visible frequencies. The greatest energy is in the highest frequencies, at least up around
4000 Hz if not higher. Classic [s]-shaped sibilant noise.
Schwa
[ə], IPA 322
This vowel looks just like the previous vowel, except it's a little shorter and possibly a little weaker. So this one is definitely a
schwa.
Lower-Case S
[s], IPA 132
And another [s]. This one is longer though. So it's either the end of a word or phrase and undergoing final lengthening (but then
that schwa might be a little longer too--in English, final lengthening applies to the entire final syllable or at least rime) or this is the
beginning of a phrase and lengthened due to strengthening. Or it could be a geminate. Hmm.
Lower-Case D
[d], IPA 104
Longish gap. The F1 transitions down, but that's consistent both with the raising of the offglide of the diphthong and the
approaching closure. The F2 however turns sharply in the last couple of glottal pulses, and starts to head downward. The F3
doesn't do much. Bilabial or velar transitions, in the best case, would pull F3 down a little, so the transitions are most consistent
with a coronal closure. The gap is at least 150 msec long, which is fairly long, and the voicing lasts close to 100 msec into the
closure. That's too long (and too strong) to just be perseverative voicing, so this stop, at least at the beginning, has to be
underlyingly voiced.
Lower-Case E
[e], IPA 302
So the vowel has a nice flat F1, in the mid range. The F2 is slightly rising, but very high throughout, so this is front and getting
fronter. The F3 is trending down throughout this vowel, but not sharply enough to really mean anything. So we're looking at a
mid-ish or slightly higher vowel, with a very front (and possibly getting fronter) tongue position. So again this looks like an [e].
Whether this one is 'as diphthongal' as the first vowel in this sentence is something I'm looking at finding an answer to. But it looks
less diphthongal than the other one to me, so I transcribed it as a monophthong.
Lower-Case B
[b], IPA 102
Gap. Voiced throughout. The F1 transitions don't tell us much. The F2 transition in the preceding vowel is definitely falling to below
1500 Hz, which is characteristic of labial transitions. The falling trend in F3 might be due to the labialization as well. Someone will
have to look into that.
Turned R + Under-Ring
[ɹ̥], IPA 151 + 402
Well, the F3 is sort of obscured, as are, frankly, the F1 and F2, by the aspiration noise. But judging from the transitions,
the F2 starts high, the F2 starts low, and the F3 looks like it starts very low and rapidly transitions upward to almost
neutral. Which for me is about 2400-2500 Hz. I'm looking at that curved bit of energy just surrounding the first few
glottal pulses around 400 msec, which start about 1750 Hz and rise sharply from there. Such a low F3 can only be an /r/
in English. Marked as voiceless, due to the aspiration.
Lower-Case D
[d], IPA 104
Well, here's another gap, this one more obviously voiced. But it's very long to be a voiced stop, so, it may be two stops.
Considering the first (voiced) bit, the voicing clearly persists into the closure, further suggesting underlying voicing (as
opposed to simply perseverative voicing). As for place, well, the transitions from the preceding vowel don't really look
velar, nor do they look bilabial. So coronal is probably a good bet. That and 'tried' is a better word after "I" than either
'trige' or 'tribe'. Sometimes your top-down look-ahead really is your best advantage.
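The top-down step can be pictured as filtering a lexicon by whatever you are still unsure about. Here is a minimal sketch of that idea: the tiny lexicon, its phone spellings, and the function name are all hypothetical, chosen only to mirror the 'tried' example.

```python
# A tiny, hypothetical lexicon: spelling -> phone sequence.
TOY_LEXICON = {
    "try": ["t", "ɹ", "aɪ"],
    "tried": ["t", "ɹ", "aɪ", "d"],
    "tribe": ["t", "ɹ", "aɪ", "b"],
    "tripe": ["t", "ɹ", "aɪ", "p"],
}


def candidates(slots, lexicon=TOY_LEXICON):
    """Return words whose phones fit the slots, one set of options per slot."""
    matches = []
    for word, phones in lexicon.items():
        if len(phones) == len(slots) and all(p in s for p, s in zip(phones, slots)):
            matches.append(word)
    return sorted(matches)


# Sure about [t ɹ aɪ], sure the last stop is voiced, unsure about its place.
unsure_place = [{"t"}, {"ɹ"}, {"aɪ"}, {"b", "d", "g"}]
print(candidates(unsure_place))

# Transitions argue against bilabial and velar, so only coronal is left.
coronal_only = [{"t"}, {"ɹ"}, {"aɪ"}, {"d"}]
print(candidates(coronal_only))
```

Keeping each slot as a set of options is one way of staying loose about how much you actually know: the lexicon does part of the narrowing, and the acoustics do the rest.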
Barred I
[ɨ], IPA 317
Somewhere around 650 msec there's some voicing starting that lasts, well, about 75 msec. Which is quite short for a
vowel. And it doesn't really get very resonant. There's some F2 harmonic energy, but above that it's mostly noise. Noise
that's mostly constant from the release of the /t/. So this is a really short vowel, mostly being hidden by the
voicelessness and/or frication around it. So pick a reduced vowel symbol (as always I follow Keating et al (1994) in
choosing barred-i if the F2 is closer to the F3 than the F1) and get on with things.
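That barred-i rule is simple enough to write down directly. A minimal sketch, assuming formant values in Hz and assuming that schwa is the fallback whenever F2 is not closer to F3; the function name is made up.

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick a reduced-vowel symbol: barred-i if F2 is closer to F3 than to F1,
    otherwise schwa (the schwa fallback is an assumption of this sketch)."""
    if abs(f3_hz - f2_hz) < abs(f2_hz - f1_hz):
        return "ɨ"
    return "ə"


# Hypothetical formant values in Hz, for illustration only.
print(reduced_vowel_symbol(400, 1900, 2500))  # F2 is 600 Hz from F3, 1500 Hz from F1
print(reduced_vowel_symbol(500, 1400, 2500))  # F2 is 1100 Hz from F3, 900 Hz from F1
```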
Lower-Case F
[f], IPA 128
Well, if I didn't know better, I'd swear this is some kind of weak Esh. It has that obvious zero below 1500 Hz, and the
energy above that is contiguous with F2/F3 supported energy. With broad band. But if it were an Esh, and this long, I'd
*really* want to see more amplitude. I mean come on. So probably not Esh. Obviously not [s], so we're running out of
voiceless fricatives. /h/ doesn't usually go that voiceless or that long between vowels. Which leaves [f] and Theta. And
there's not a lot to tell us which. (I think this was in response to someone's question about the 'dental' fricatives in
English, and how to tell them apart. The answer was, well, if there isn't transitional information, you pretty much can't.
At least not consistently. At least in a way generalizable within and across speakers. As far as I know. Hmm. If I had my
dissertation to do over again, maybe I'd concentrate on fricative noise. Hmm.)
Small Capital I
[ɪ], IPA 319
Well, this is another short vowel, but since it at least looks like a real vowel, I decided to treat it as such. Lowish F1 (the
harmonics around 500 Hz seem stronger than immediately below, but that little thing can't be a formant--too narrow
band, so I take it to be the top edge of the F1), very high (but not outlandishly high) F2. So high and front. And short.
Pretty much leaves [ɪ].
Lower-Case K
[k], IPA 109
And the preceding vowel very nicely provides us with some very velar-pinch-y looking transitions. This coupled with
that nice low double-burst on the other side can only lead to 'velar' as a conclusion. And voiceless, of course.
Lower-Case S
[s], IPA 132
Well, if the other one couldn't be an Esh, I *really* don't see how this could be an [s], but there it is. The noise *does*
have that weird zero (although it's not really obvious that it has anything to do with the F2 being above it), and the
noise, such as it is, is *really* broad band. The only giveaway that this *might* be an [s] is that the noise gets a *little*
stronger and a *little* more organized in the very high frequencies. So maybe it's a devoiced /z/, except that it's way
too long. So frankly I'd guess whatever I guessed for the other one. And I'd be wrong. Hey, I never said I knew any more
about this stuff than you do.
Eth + Raising Sign
[ð̝], IPA 131 + 429
Well, okay. One day I'll produce a real, honest-to-goodness *fricative* dental. This one clearly isn't. Even though it's
fully voiced, there's no way to get that, well, *release* thing at 1100 msec without building up some fairly serious
pressure behind the closure. Which fricatives do, but not like that. So the answer to how you tell the difference between
Eth and [v], which no one asked, but here you go, is that the Eth is more likely to present like a stop, and the [v] is more
likely to present as an approximant. So there. Anyway, the very strong and even voicing (and no VOT) tell us this is
voiced, and there's not really any transitional information to tell us anything, and the burst is sort of misleading but
looks more coronal than anything else. So I might guess /d/ again. And having made it that far, I could top-down to Eth
once I'd worked out what the rest of the sentence looked like.
Schwa
[ə], IPA 322
Well, here's a shortish vowel that's mostly transition. Judging from its starting frequencies, it looks highish and
central-to-ever-so-slightly-back (but that might just be the influence of the transition). F3 is slightly raised, which is alarming. F4
is not helping, since it's very obviously transitioning into the following deal. So I don't know. I called it a schwa, and
even if it isn't, it's not going to be overwhelmingly informative.
Lower Case W
[w], IPA 170
Well, look at that F1. Indicates a fairly high articulation. The dip in amplitude just after 1200 msec tells us that there's
very minute change toward closure (though of course it never reaches closure), so even without the transitions, there's a
consonant-like moment here between what would otherwise be just a sequence of vowels. The F2, at that moment, is low
low low. So, oddly enough is the F4. So at least for this utterance, the Maeda model that coupled the F4 to the F2 looks
like it was right after all. The F3, on the other hand, is damn flat throughout. So, this has got to be [w].
Small Capital I
[ɪ], IPA 319
Ah, and here's another shortish vowel that I'd be inclined to ignore, except that I know what the answer is. Abstracting
away from the [w] transition, it looks fronter than the previous schwa thing, for what that's worth. But high. But short,
in spite of being high. So on the balance, I called it [ɪ], but it doesn't really ever get nearly front enough to warrant it.
Lower-Case N
[n], IPA 116
Voicing, longish, but not fully resonant like a vowel. But clearly sonorant. With a nice little zero around 1000 Hz. Must
be a nasal, and must be an [n], in that with those transitions into and out of it it can't possibly be an Eng, and my [m]s
have a *resonance* around 1000 Hz, with the zero lower (or higher, depending on how you think of what the zero is
doing).
Lower-Case D
[d], IPA 104
But then there's that momentary (like one glottal pulse) worth of dipped energy, and the following thing that looks like
an (oral) release transient. Which is how voiced stops following homorganic nasals tend to look in my voice.
Lower-Case O + Upsilon
[oʊ], IPA 307 + 321
Well, this is a vowel. With transitions, so it's probably some kind of diphthong. In another life I'd have analyzed this as
two separate segments, but now I'm not so sure. Ennyhoo, starting at the beginning, we've got a mid-looking F1 which
is transitioning upward, meaning this vowel is going from mid to low. Or something like that. The F2 starts, well,
central, or front of central, and transitions rapidly downward, indicating increased backness or rounding. F3 is just
sitting there, F4 is basically mirroring F2. Good strong voicing for a looong time, when it sort of peters out of the higher
frequencies until, by the time you get about 300 msecs in, you've just got a voicing bar. But the amplitude change is
very even. The clunk in the F3/F4 range at about 1575 msecs doesn't line up with anything else. The moment the F1/F2
finally give up to the noise doesn't really line up well with anything else. So this is very smooth. No specific consonant
moment here.
So we've got something mid going to low, and something central going to back. Which just can't happen in my dialect.
[əɑ] is just *not* a vowel where I come from. Especially not in an open syllable. But I don't know what else to say. Maybe
it's an artifact of the analysis. Some weird interaction between the bandwidth windowing and the fundamental
frequency. I don't know. So once again, go for the top down.
I tried tuhshuhksduhwunduhn
Shucks the window? Fix the window. I tried to f*x the window.
I was thinking about intonation when I did this one, and in my memory I kept hearing Mary Beckman's voice intoning ToBI
sentences. Since I can't use proper names in these things (by rule set down long ago by Peter Ladefoged), I couldn't use my
favo(u)rite "Marianna" sentences, and anyway, since I wasn't doing intonation with this one it wasn't that important. But there are
a lot of 'jam' sentences as I recall. But we will be doing some intonation stuff later on this year.
Lower Case W
[w], IPA 170
Well, starting at about 75 msec or so, there's some voicing going on. You'll notice that there's not much in the way of energy
above 1000 Hz. And the F1, such as it is, is weaker than the F1 of the following vowel (or whatever that is). So we're looking at
something sonorant (voiced and open enough to have some serious periodicity to it), but not open enough to really be even a high
vowel. So this must be some kind of approximant, by traditional definition. The F1 is hard to see, but it's lowish, whatever it is,
which is consistent with a tighter-than-open constriction. The F2 is tough to make out, if in fact you can see it at all, but it's clear
the F2 transition into the following vowel starts around 900-1000 Hz or so. So it must be quite back and/or round. Probably and.
So how many back, round approximants can you think of?
Lower-Case I
[i], IPA 301
So abstracting away from the transition, this vowel thing has a low F1, lower than mid-range anyway, and an absurdly high F2 up
around 2300-2400 Hz. That's just freaking high. So we're dealing with something high and exceptionally front. Again, how many
such vowels can you think of? Good.
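The reasoning from F1 to height and from F2 to frontness can be sketched as a coarse lookup. Everything numeric here is an assumption: the boundary values are illustrative guesses for a single adult voice, not figures from this page, and the function name is made up.

```python
def describe_vowel(f1_hz, f2_hz, f1_bounds=(400.0, 600.0), f2_bounds=(1200.0, 1800.0)):
    """Coarse height/backness labels from F1 and F2 (Hz).
    The default boundaries are assumed, illustrative values only."""
    f1_low, f1_high = f1_bounds
    if f1_hz < f1_low:
        height = "high"      # low F1 means a high vowel
    elif f1_hz <= f1_high:
        height = "mid"
    else:
        height = "low"       # high F1 means a low vowel
    f2_low, f2_high = f2_bounds
    if f2_hz > f2_high:
        backness = "front"   # high F2 means front
    elif f2_hz >= f2_low:
        backness = "central"
    else:
        backness = "back and/or round"
    return height, backness


print(describe_vowel(300, 2350))  # low F1, very high F2
print(describe_vowel(500, 1500))  # mid F1, central F2
```

The labels only narrow the field; picking the actual symbol still depends on which vowels the language has in that corner of the space.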
So at this point, knowing we've got an English sentence, we could probably make a good guess at the first word. Or at least the
first syllable. And further, if we feel like we have a word/syllable that could plausibly be the subject of a sentence (someday I'm
going to put a weird adverbial or something at the front of a sentence and derail this whole line of reasoning), we might guess that
the next bit has to be some kind of verb. Or we could be wrong about [wi] being a subject, or even about it being [wi]. But it's a
working hypothesis.
Lower-Case E
[e], IPA 302
I'm trying to be consistent about marking movement in vowels, but I'm not sure what I was thinking here. But we have something
here that is separate from the preceding vowel--there's a sharp change in frequency, as well as a sudden change in F1 frequency.
The F1 frequency is higher than the previous vowel, approaching the mid-range, but not quite. So this vowel is still quite high, but
not as high as [i]. The F2 is similarly not-quite-as-high as the preceding vowel, so this is not quite as front but still quite
radically front. So not as high or as front as [i], but still high or high-of-mid, and very front. Possibly moving back towards [i], at
least as we approach 400 msec or so. So possibly diphthongal. Sound familiar? If you're wondering about the height, check out
the height of /e/ in my 1997 JASA paper.
Glottal Stop
[ʔ], IPA 113
Well, as we approach 400 msec and a bit beyond, the periodicity, or the regularity of the voicing striations starts to fall apart. So
either there's a very abrupt and very extreme drop in F0, or there's some creak going on here. Creak in the sense of
glottalization. Glottalization as might result from a glottal stop. Hint hint.
Lower-Case O + Upsilon
[oʊ], IPA 307 + 321
Now this is a diphthong. The F1 starts a little high of the mid range and moves downward. So this starts a little lower than mid
and moves toward a high vowel. The F2 starts well, the F2, when the voicing kicks in at about 500 msec, is low of the mid range,
so this is sort of back and/or round, but the F2 again drops in frequency reaching a min at about 700 msec. So it's getting backer
and/or rounder. So middish to highish, and backish/roundish to backer/rounder.
Lower-Case S
[s], IPA 132
So this next bit is definitely voiceless (no periodicity, no striations, no low-frequency "voicing bar" energy). It's got very little in the
way of formant structure (except possibly some in F2, almost definitely the front cavity). And fairly high amplitude noise in the very
high frequencies (very high at least in the sense of being at the top of the frequency range in this spectrogram, which goes up to
about 4400 Hz). So this is probably an [s].
Lower-Case T
[t], IPA 103
Well, here's another gap. This one is shorter than the previous one. It's voiceless, but it's hard to tell if it's aspirated. There's some
periodic looking things in the low frequencies that could be voicing. But during the closure it's voiceless. The release is sort of
sharp, but doesn't have a strong transient to it, suggesting that the closure was sort of weak without a lot of pressure building up
behind it. The release noise is concentrated in the F3 range and higher. The F2 is a little lower. So there's no obvious velar pinch
in the release. The noise is consistent with an alveolar, but not great. But as it turns out there's a reason for the noise to be a little
lower than in the previous [s] or [t]...
Schwa
[ə], IPA 322
Well, there's a teeny tiny bit of real voicing in here, with formants and everything. Certainly a local sonority peak, worthy of being
called a vowel, but otherwise not worth worrying about. Schwa. Done.
Lower-Case D + Under-Ring
[d̥], IPA 104 + 402
Well, this is a voiceless gap, and probably coronal for the same reasons as the preceding. And I mean that literally. The release
transition isn't followed sharply by the high amplitude noise, like I'd expect with a simple /t/ release, but what do I know?
Yogh + Over-Ring
[ʒ̊], IPA 135 + 402
So if you notice the earlier [s] and [t] bursts, this doesn't really look quite the same. This is a period of high amplitude noise. It's
broad-band, but centered a little lower than the [s]s earlier, and it's pretty dead below 1500 Hz. Typical of [ʃ]. But if this were a
syllable-initial [tʃ], I'd expect it to look, well, more aspirated. So I transcribed it as a devoiced [dʒ], but whatever.
Script A
[ɑ], IPA 305
This vowel is short enough that it might be worth ignoring. On the other hand, it's initial. So we'd better learn from it
what we can. F1 is hard to see. It might be really really low, lost in the voicing bar, or it might be that little blip at about
750 Hz. F2, probably, is that thing that starts around 1000 Hz and rises slightly. I'm not sure about the F3 or F4, unless
it's that mess around 2700-3500 Hz. But we'll ignore that for the moment. So we've got something lowish (or extremely
high) and very back and/or round.
Lower-Case N
[n], IPA 116
So we're looking at this thing between 150 and 250 msec (or so). There's a nice little zero around 700-800 Hz, and
another above approaching 2000 Hz. So we've got something very clearly fully voiced and sonorant, with overall weaker
amplitude compared with the vowels around it, clear zeroes, and flat(tish) formant structure. The pole (think F2) is
ambiguous for my voice, being above 1000 Hz (clearly [m] territory) and below 1400-1500 Hz (clearly [n] territory). So if you
had to choose, I'm not sure which you'd choose. Luckily there's another interpretation. Something in between labial
and alveolar. Hmm. Dental? And how might we get a dental nasal? Maybe by place assimilation with a dental
consonant? Hmm?
Okay, I don't expect you to have noticed that. Frankly I've only noticed it now that I'm trying to figure out why I
thought there was a fricative in here, besides knowing the text. But if you spotted it, you could have a future in this
business.
Eth
[ð], IPA 131
So there is this funny discontinuity in the upper formants, and this 'noisy' release, if that's what you want to call it. So
there's something at the end of that nasal we have to account for. And since the following vowel is mostly transition, I'd
say this is a function word. And whether we want to read this as a separate thing, it's reasonable to take an eth ([ð]) as a
source for the dentality of the nasal.
Schwa
[ə], IPA 322
Well, the F1 and F2 are both moving--the F1 from middish to highish and the F2 from backish/roundish to
backer/rounder. The F3 is really cruising, as is the F4. So there's no 'moment' here where the F1 and F2 are just doing
what they want to do, suggesting a 'targetless' vowel, i.e. one whose targets have been removed or expanded beyond
'targetness', i.e. one that is reduced, i.e. schwa.
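The "no moment where the formants sit still" idea can be sketched as a check on a formant track: is there any run of frames where the formant barely moves? This is only an illustration. The function name, the 30 Hz step limit and the three-frame minimum are my assumptions, not values taken from the spectrogram.

```python
def has_steady_state(track_hz, max_step_hz=30.0, min_frames=3):
    """True if at least min_frames consecutive frames (min_frames >= 2)
    each differ from the previous frame by no more than max_step_hz."""
    run = 1
    for prev, cur in zip(track_hz, track_hz[1:]):
        if abs(cur - prev) <= max_step_hz:
            run += 1
            if run >= min_frames:
                return True
        else:
            run = 1  # movement too large, start counting again
    return False


# Hypothetical F2 tracks (Hz per frame): one that is all transition, one with a plateau.
all_transition = [1200, 1100, 1000, 900, 800]
with_plateau = [700, 710, 705, 900]

print(has_steady_state(all_transition))
print(has_steady_state(with_plateau))
```

A track that never settles is what you would expect from a reduced vowel; a track with a plateau at least has a candidate target.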
Turned R
[ɹ], IPA 151
So while we're ignoring the preceding vowel, we should be noticing that diving F3. For about 50 msecs from just after
300 msec to not quite 375 msec, there's another dip in amplitude. And while there's attenuation in the higher
frequencies, there's no obvious zero in the lower frequencies. So we're looking at an approximant, with an absurdly low
F3.
Lower-Case T
[t], IPA 103
Well, there's a short gap approaching the 600 msec mark. The transitions in the preceding vowel look very front-velar,
but that would be wrong. There's an advantage to knowing the text before you start this sort of thing. Okay, there's some
falling apart of the periodicity into this gap, which might indicate glottalization. Unfortunately, this is probably a coda
plosive of some kind, and I can glottalize any final plosive. But most likely I'd glottalize a coronal. But this is sort of
ambiguous.
Lower-Case N
[n], IPA 116
Well, there's evidence of voicing, lots of it, but nothing much above it. So we're looking for something weak, but
sonorant. It could just be a weak approximant, but the total absence of energy above the voicing bar suggests a zero, as
well as general attenuation of upper frequencies. So I'd say this was a nasal, on the balance. The pole sort of sneaks in
at about 1300-1400 Hz as the upper frequency energy starts to creep in, which is the best guess at place information.
Lower-Case T
[t], IPA 103
Another one of these gaps with no information in it. Leave it and get on with your life.
Lower-Case Z
[z], IPA 133
Well, okay, we've got some serious attenuation of amplitude, but full voicing from 975 msec to about 1050 msec. In the
higher frequencies, we've got a whole buncha noise. So we've got something voiced and noisy. The high-frequency of
the noise suggests sibilance more than anything else, so we've got some kind of voiced sibilant. There are only two in
English. 50-50.
Schwa
[ə], IPA 322
Here again a vowel that's mostly transition. If you must know, it's mid-to-high, and very back or round. But the
rounding is probably coarticulatory with....
Lower-Case W
[w], IPA 170
Again, sharply attenuated in the upper frequencies (by which I mean above 1000 Hz in this case), but the F1, while not
'sharp', isn't particularly weak (compared for instance with the voicing in the [z]). The F1 is apparently so low as
to disappear into the voicing bar, the F2 is as low as it ever gets. The F3 is a little low of neutral, what you can see of it,
so this is something quite high and close, quite back and round, and almost definitely not lateral.
Lower-Case T
[t], IPA 103
And here's another one of these. This one has more obvious glottalization to it, although I swear I still hear it. Oh well.
Even if you didn't know, by now it should be clear that all of these words are supposed to rhyme.
Tilde L (Dark L)
[ɫ], IPA 209
So this is fully voiced, and apparently sonorant, with no obvious low-frequency zero. So the attenuation of the higher
frequencies is probably just the tightness of the constriction. F1 is quite low. F2 is fairly low. F3 is downright high. So the
difference between [w] and dark l?
Lower-Case T
[t], IPA 103
Finally, a [t] worth talking about. Nice long gap (except for whatever that stuff is at 1750 msec). Nice sharp release with
s-shaped release noise. Classic stuff.
This page last modified: 11/08/2009 22:57:30
Minimally revised from the original. Wow, my style has changed. And not just in 'official' ways--like treating
aspiration (properly) as a diacritic and not a separate thing.
There are different styles of reading--this left-to-right business is just how I do it for convenience. As time goes on, I'll be
introducing other styles. One of the things that I always forget about, at least when sitting down to do these is the 'big' picture
stuff. For instance, how many syllables (or at least vowels) are there in this? What evidence of segmentation do you see? Where?
Can you see anything suggesting pitch peaks/lows, correlates of stress like amplitude, length, or pitch excursions? Once you've
done that sort of thing, usually you go through and mark all the things that are obvious--the sibilants, the nasals, if you can see
them, things that are obviously [i] or [a], that sort of thing. Then, once you've got the big picture, then you start in on specific cues.
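For the "how many syllables (or at least vowels)" question, a crude first pass is to count peaks in an amplitude envelope. The sketch below assumes you already have such an envelope as a list of numbers; the function name, the sample envelope and the floor value are all hypothetical, and a real envelope would want smoothing first.

```python
def count_syllable_peaks(envelope, floor):
    """Count local maxima in an amplitude envelope that rise above floor.

    A flat-topped peak is counted once, at its first frame.
    """
    peaks = 0
    for i in range(1, len(envelope) - 1):
        rises = envelope[i - 1] < envelope[i]
        does_not_rise_further = envelope[i] >= envelope[i + 1]
        if envelope[i] > floor and rises and does_not_rise_further:
            peaks += 1
    return peaks


# Hypothetical envelope: two clear peaks and one blip below the floor.
envelope = [0, 2, 6, 3, 1, 4, 7, 7, 2, 0, 1, 0]

print(count_syllable_peaks(envelope, floor=2))  # -> 2
```

This only gives you a starting guess at the number of sonority peaks. Deciding which peaks are really vowels still means looking at the spectrogram.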
Lower-Case M
[m], IPA 114
From the end of the previous diphthong to somewhere past 300 msec, there's another segment. Fully voiced throughout,
suggesting [+voice]. Fully sonorant, i.e. with formants all the way up, suggesting something [+sonorant]. Sharp discontinuities on
both sides, which is classically nasal (as the side (oral) cavity closes, leaving only the nasal cavity as the open channel, you
suddenly get a sharp change in the acoustics. If you're lucky.) So this is probably nasal. It's got what is probably a zero around
1750 Hz or so (I'm not sure about the apparent narrow zero around 700 Hz, just because it continues for quite a ways--it looks
like an artifact). But it seems to separate a pole around 1000 Hz, which if you know my voice is about right for a bilabial pole.
There is an inkling of something higher up, about 1300 Hz or something like that, which might be an indicator of a partial
occlusion at the alveolar ridge or so, but I'll defer that to someone who actually knows something about the acoustics of these
sorts of things.
Tilde L (Dark L)
[ɫ], IPA 209
I'll say it again, all North American /l/s are dark (velarized). True, some of them are darker than others, especially domain finally
(syllable, word, phrase, etc.), but there ain't nothing light about this lateral. Which we know is a lateral because of the F3, but I'm
getting ahead of myself. 300-something msec to about 400 msecs, where the F2 sharply changes, the amplitudes all weird out,
and the F3 really kicks in. That's the extent of this thing we're looking at. Okay, F1, if that's what you want to call it, fairly low,
indicating a fairly close constriction, so while not nasal, we're talking about something that is arguably a high vowel or closer. The
F2 is moderately low, lower than for the beginning of the diphthong, so this is very back. As in velar. So we've got two
possibilities, something in the area of [w], and something in the area of [ɫ]. The difference, according to some sources, is the F3
or sometimes the F4. Typically, rounding will lower formants, so if we believe this is round, we'd want to see some lowering or at
least not raising, of the upper formants. But the F3 of this is just high. At the edge of the nasal it's quite high, just about 3800-3900
Hz, and although it falls, it's still well above 3600 Hz before the amplitude starts to kick in for the next vowel. So this ain't round.
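The F3 reasoning can be sketched as a tiny decision rule for a segment that is already known to be a voiced approximant with low F1 and low F2. The thresholds are loose assumptions based on numbers mentioned in these readings for my voice (F3 under about 2000 Hz for [ɹ], something like 2500 Hz as neutral); the function name is made up, and other voices would need other numbers.

```python
def guess_approximant(f3_hz, neutral_f3_hz=2500.0, rhotic_ceiling_hz=2000.0):
    """Guess among [ɹ], [w] and [ɫ] from F3 alone.

    Assumes the segment is already known to be a back-ish approximant.
    Both thresholds are speaker-specific assumptions.
    """
    if f3_hz < rhotic_ceiling_hz:
        return "ɹ"  # absurdly low F3
    if f3_hz > neutral_f3_hz:
        return "ɫ"  # raised F3 suggests a lateral
    return "w"      # F3 at or a little below neutral, consistent with rounding


print(guess_approximant(3700))  # F3 well above neutral
print(guess_approximant(2300))  # F3 a little low of neutral
print(guess_approximant(1900))  # F3 below 2000 Hz
```

Treat the output as a working hypothesis. F4, amplitude and the transitions can all overrule it.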
Lower-Case I
[i], IPA 301
Vowel. Starting from the release of the lateral, if that's what you want to call that moment up to 500 msec. But the last 50 msec or
so are clearly transitional, so let's just worry about the apparent F2 extremum area. The F1 is pretty flat and fairly low throughout.
The F2, as I've said has a maximum around 425 msec, at around 2250 Hz. That's freaking high for an F2, so this is about as front
as this can get. So relatively high and outrageously front.
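The F1-means-height, F2-means-frontness logic used throughout can be written down as a small lookup. This is a sketch under assumptions: the neutral F1 of about 500 Hz matches what I use elsewhere in these readings, but the neutral F2 of 1500 Hz and the widths of the "neutral" bands are illustrative guesses, and the function name is made up.

```python
def describe_vowel(f1_hz, f2_hz,
                   neutral_f1_hz=500.0, neutral_f2_hz=1500.0,
                   f1_band_hz=100.0, f2_band_hz=250.0):
    """Return (height, frontness) labels from F1 and F2.

    Low F1 means a high vowel; high F2 means a front vowel.
    All reference values are assumptions for one speaker.
    """
    if f1_hz < neutral_f1_hz - f1_band_hz:
        height = "high"
    elif f1_hz > neutral_f1_hz + f1_band_hz:
        height = "low"
    else:
        height = "mid"

    if f2_hz > neutral_f2_hz + f2_band_hz:
        frontness = "front"
    elif f2_hz < neutral_f2_hz - f2_band_hz:
        frontness = "back or round"
    else:
        frontness = "central"
    return height, frontness


print(describe_vowel(300, 2250))  # low F1, very high F2
print(describe_vowel(500, 1000))  # neutral F1, low F2
```

With a low F1 and an F2 up around 2250 Hz this comes out high and front, which is the same call as the reading above.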
Lower-Case V
[v], IPA 129
Those diving transitions in F2, F3 and F4 (mirrored by rising transitions on the other side)! This is clearly labial. But not bilabial.
English only has three bilabials, one is voiceless, and one is a nasal, so this could only be [b]. First of all, even though the overall
amplitude is reduced here, the voicing is very strong and even throughout. And although a lot of my plosive closures are noisy,
they're not fricated the way this thing is. This is a fricative. But labial. Which for my variety of English only really leaves [v].
Eng
[ŋ], IPA 119
Well, some zero-ness starts to creep in really early, and then at about 700 msec the upper formants, or at least F4, just shut off
completely. And the other formants flatten out. So the zero is evidence enough of nasality, I suppose. The zero is right where
you'd hope to see a pole, so the pole is either that shadow thing at about 800 Hz, which I'm suspicious of, just because there
seems to be a lot of stuff just about that frequency, especially something that might be a harmonic in the [i]. The other candidate,
which is much stronger, is up around 2400-2500 Hz. Which is pretty high. But notice that the F2 and F3 transitions in the
preceding vowel point right at it. So that's probably our puppy. The joint F2/F3 thing looks like velar pinch, albeit very front velar
pinch, so this is probably velar.
Lower-Case T
[tʰ], IPA 103 + 404
Well, there's 25-50 msec of gap, which is plenty to count, after a nasal. Now if you're expecting this to be velar based on a blind
faith in nasal place assimilation you're going to be disappointed. Because the release of this thing is not in the least velar-looking.
It's broad band, quite sharp, and strongest in the highest frequencies (which in this context means between 3000 and something
higher). The spectrum is strongest in the higher frequencies, well above the 'formant zone'. So this is basically an [s]-shaped
release. Which makes this an alveolar. And voiceless, and probably aspirated, due to a VOT of about 50 msec. At least. Which is
not outrageously long for a VOT, but long enough to count as aspirated. TMSAISTI.
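The VOT argument boils down to one subtraction and one comparison. In the sketch below the function names and the time values are hypothetical, and using 50 msec as the cutoff for "aspirated" is an assumption based on the rough figure in the paragraph above, not a standard.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT: time from the plosive release to the start of voicing."""
    return voicing_onset_ms - release_ms


def looks_aspirated(release_ms, voicing_onset_ms, threshold_ms=50.0):
    """True if the VOT is at least threshold_ms (an assumed cutoff)."""
    return voice_onset_time_ms(release_ms, voicing_onset_ms) >= threshold_ms


# Hypothetical times read off a spectrogram, in msec.
print(voice_onset_time_ms(790, 845))  # -> 55
print(looks_aspirated(790, 845))      # -> True
print(looks_aspirated(790, 805))      # -> False
```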
Schwa
[ə], IPA 322
Then there's a pulse or two of vowel. <snooze>
Lower-Case M
[m], IPA 114
Hey, another one of these. This one starts at about 850 msec, and lasts to about 925 msec. But otherwise it's pretty much the
same as the previous [m], though a little weaker.
You'll notice there's full voicing and sonorance for 300 msecs starting around 925 msec. Which is just too long to be one
segment. And the formants are too mobile to be reflecting a single target. So before going further, think about how many
segments there are here, and if you can't find the edges of each, can you find moments you want to call the 'center/re' of each?
Go ahead. I'll wait.
Okay, so there's the F1 peak near the beginning; the moment where the F3 is lowest and the F2 is highest, around 1050 msec;
and there's the funny dip in F2 at about 1125 msec. Those are the 'moments' I'm going to consider as evidence of at least three
things in this stretch. So, on to the first.
Turned Script A
[ɒ], IPA 313
I never get to use this vowel. But there it is. It's just possible this is round, and not just transitioning from-and-to roundness. So I
transcribed it as round. Sue me. The F1 is high, the F2 is very, very low.
Turned R
[ɹ], IPA 151
Okay, so the F3 is dipping below 2000 Hz. Barely, but there you go, if that's good enough for you.
Lower-Case O
[o], IPA 307
Okay, so the F1 has been falling fairly steadily since that moment in the [ɒ]. So this ain't low. But it doesn't seem to be heading
down to well below 500 Hz as the high vowels earlier on did. So this ain't particularly high. So probably mid. At least round here. F2
is low, as indicated at that dip. F3 doesn't tell us much. So mid and back. Only a couple of possibilities, and even on a good day
one is probably better.
Lower-Case N
[n], IPA 116
So we've got something fully voiced and sonorant, but with zeroes, as before. So this is probably another nasal. But notice the
pole. The bilabial pole was around 1000 Hz or just lower. This is definitely higher. And that harmonic/shadow thing just above it
maybe means it's even a little higher--maybe there's just a harmonic space there that makes it look like the edge of the pole is
lower. But I don't know. Spectrally, this just ain't the same animal as any of the preceding nasals, which were bilabial, bilabial, and
velar, respectively. So this must be something else. Few options, at least for English.
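Comparing a nasal's pole against known reference poles can be sketched as a nearest-match lookup. The reference values below are rough figures for my voice as quoted in these readings (around 1000 Hz for bilabial, around 1400-1500 Hz for alveolar); the tolerance and the function name are assumptions. Velar is left out on purpose, since in these readings the velar call comes from pinch in the transitions rather than from the pole.

```python
# Rough, speaker-specific reference poles in Hz (assumed values).
REFERENCE_POLES_HZ = {"m": 1000.0, "n": 1450.0}


def guess_nasal_place(pole_hz, tolerance_hz=150.0):
    """Return the nasal whose reference pole is closest to pole_hz,
    or "ambiguous" if nothing is within tolerance_hz."""
    closest = min(REFERENCE_POLES_HZ,
                  key=lambda sym: abs(REFERENCE_POLES_HZ[sym] - pole_hz))
    if abs(REFERENCE_POLES_HZ[closest] - pole_hz) <= tolerance_hz:
        return closest
    return "ambiguous"


print(guess_nasal_place(1000))  # near the bilabial reference
print(guess_nasal_place(1400))  # near the alveolar reference
print(guess_nasal_place(1225))  # halfway between the two
```

An "ambiguous" answer is a useful result in its own right: it tells you to go looking at transitions, or to consider something in between, like a dental.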
Lower-Case T
[t], IPA 103
The cruddy voicing towards the end of the preceding vowel is probably a combination of low pitch and glottalization. The
glottalization probably tells us that this sound is probably a voiceless plosive underlyingly, and the release (and maybe the
transitions) suggest a nice alveolar again.
Lower-case H
[h], IPA 146
For about 100 msec, starting about 50 msec, there's some noise up in the region of F3 and F4 of the following vowel. No voicing
to accompany it, this looks like a voiceless fricative. It's not quite strong enough to be sibilant, and it doesn't have the "unfiltered"
quality of an [f] or [θ]. Also the transitions into the following vowel are all wrong for that. In fact, there don't seem to be any.
Coupled with the F3/F4 range the noise appears in, I'd say [h] is the most likely candidate.
Ash
[æ], IPA 325
This vowel has a very, very high F1, indicating a very low vowel. But the F2 is basically neutral, or a trifle higher than neutral. This
can't be a back vowel. Which leaves precious few options, at least in my dialect.
Lower-case P
[p], IPA 101
Beginning around 225 msec and going on until the release at about 300 msec, there's a nice little gap--no resonances, no noise
to speak of. Also no voicing (there's no striated organization to that noise, whatever it is, at the bottom). Check the transitions in
the surrounding vowels. All point down toward the closure, classically bilabial.
Lower-case I
[i], IPA 301
This vowel is a trifle short, but it has a very distinctive spectrum. F1 is quite low, F2 is about as high as one is ever likely to see,
and for those of you who believe in formant distances, F2 is 'tight' to F3. So we've got the F1 of a high vowel, and the F2 of a very
front vowel. The F2 being well above 2000 Hz, and the total trajectory being toward the front, this looks like an [i] more than
anything else.
Lower-case B
[b], IPA 102
There's another gap here but this one is clearly voiced through most of the closure duration. It's clearer on the left, transitioning
from the preceding vowel, that the F2 and F3 transitions point downward, i.e. in the bilabial direction again. It's harder to tell on
the right side, what with the noise and the short segment of voicelessness, but if you wish really, really hard, you might convince
yourself. There's not a lot indicating anything else, at least.
Theta
[θ], IPA 130
Well, this is a little confusing, but bear with me. The transitions into this fricative are consistent with a coronal. The F3 is returning
to its neutral value, the F2 seems to head towards 1750 Hz or thereabouts. And the noise, such as it is, is present mostly in the
high frequencies. So this is basically [s]-shaped. But there's no way there's enough energy for it to be an [s]. There's even less
amplitude to this noise than there is in the [h] at the beginning. It's even less than the release noise of the stop that follows. So if
this isn't sibilant, what is it? Frankly, this is the best-looking [θ] I've produced in a long time. Except that it's [s]-shaped rather than
more clearly unfiltered.
Lower-case T
[t], IPA 103
Okay, first things first. Ignore the 'clunk' (that's a technical term) at 750 msec. I have an explanation for it, but it's a long shot. So
ignoring the clunk, there's a gap here. There's no useful transitional information in the preceding fricative, so we're stuck with
those into the following vowel. These don't tell us much, except that they don't look obviously bilabial or velar, and they are
consistent with alveolar. Or coronal. So that's a working hypothesis, helped along by the clearly [s]-shaped (skewed to the high
frequencies) noise of the release burst. (That's what I meant about the preceding fricative just not being [s]-like). So this is a
voiceless plosive, almost definitely alveolar. Now that clunk. I think it has something to do with the transition between the theta
(which for me is truly interdental) and the release of the plosive (which for me is apical). There's probably a point
where just a little bit of air escapes in between the laminal closure (since my tongue tip is busy between my teeth) and the apical
release. TMSAISTI.
Lower-case I
[i], IPA 301
The F1 has gone low again, and the F2 is way up above 2000 Hz (at its peak) again. This pattern should be familiar.
Lower-case T
[t], IPA 103
This probably isn't phonetically a plosive so much as a voiceless and ever so slightly aspirated flap. Or tap, or whatever it is in
English. The voiceless portion is close to 50 msec long, but the entire duration is a bit noisy in the higher frequencies. I'm
suspicious of the change in the noise about half-way through the voiceless duration. I think we've got a very short, incomplete
contact (hence the noise) followed by a portion of 'regular' voicelessness, i.e. VOT. But given that this is North American English,
a voiceless flappy thing can't really be anything except a phonemic /t/.
Tilde L (Dark L)
[ɫ], IPA 209
This is interesting. If I didn't know better, I'd have thought this was a nasal. It's fully voiced (compare the striated, low-frequency
energy here with the noisy low-frequency band in the flappy thing and the first [p]). So this is voiced, and while there isn't much
energy above, what there is is organized in formant-like bands. Overall amplitude is low, and there are largish bands with no
energy in them. And there don't seem to be any other nasals in this spectrogram to compare them with. But there are clues that
this isn't a nasal. First, there's no hint of nasality on the preceding vowel, which by itself is not strong evidence. But the decrease
in energy is so great, it's odd that the 'edge' of the supposed nasal isn't 'sharper'. Usually nasals either look very 'abrupt' or look
more coarticulated with at least the preceding vowel. The 'pole' is just above 1000 Hz, but the 'zero' below it isn't very well
defined. Which again is not strong evidence, but something to be explained. So entertaining a hypothesis that this is something
other than a nasal, what would it be? F1, if that's what you want to call it, is low. The F2 is that thing just above 1000 Hz (note the
continuity with the F2 in the surrounding vowels). And the F3. The F3, in spite of being radically low for the preceding [r], rises
sharply in the transition into this segment. On the other side, the F3 seems to fall from something higher than the 2500 Hz or so it
ends up at in the following vowel. And this raised-from-neutral F3 is consistent with the noisy energy we see in this segment. So
we have an oddly and otherwise inexplicably high F3 to contend with. And high F3s (and to a lesser extent F4s) are typically
associated with laterals in English. So this is a dark /l/. By 'dark', I mean velarized, as all my English laterals seem to be, without
intending anything about syllabic position.
Ash
[æ], IPA 325
So here we have a relatively high F1, so we have a relatively low vowel. The F2 isn't doing much of anything. If it were lower, this
vowel would look back. But it's not, so the other choice is front. Or frontish. As front as a quite low vowel can ever really get?
Lower-case D
[d], IPA 104
Again, this probably is more of a flap than a proper plosive [d], but I tried. At least this one is a more standard looking flap, and
actually voiced. It looks basically like a very, very short plosive, consistent with alveolar.
Barred I
[ɨ], IPA 317
I've transcribed this as a reduced vowel, given its relative duration, amplitude, and pitch to the previous vowel, which is definitely
longer (though not a lot, given how low it is), stronger (darker) and higher in pitch (closer striations). If I had to give it a standard
vowel symbol, I'd say [ɪ], that is something in between cardinal 1 and cardinal 2. The F1 indicates something which is higher than
mid, but not exactly as high as it could get. The F2 something rather front, but nowhere near the frontness of [i]. But as I said, this
vowel is probably reduced, so spending too much energy trying to do something with it is probably unwarranted.
Lower-case F
[f], IPA 128
Well, starting around 1700 msec and going on until almost 1800 msec, we've got a fricative, surely, and mostly voiceless. It's got
some formant structure to it, suggesting that it's resonating through the vocal tract. It's definitely not sibilant. So the most likely
candidate is [h]. But I'd expect an [h] to have at least some energy in F1, and that's the one place where this fricative has no
energy. So while [h] is probably the most obvious guess, it does need some explaining. So entertaining the possibility that it is
something else, we're only really left with a couple of choices--labiodental and (inter)dental. And I can just convince myself that
the transitions are more labial looking (pointing down toward the closure) than coronal looking. Although interdentals don't
always look particularly alveolar. Given the more downward pointing transitions, I suppose [f] is the runner up. Keep this in mind
when you try to make some kind of word--or name--out of this part of the spectrogram.
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
Well, the F1 is pretty close to neutral (around 500 Hz) or just low of that, which tells us that we've got a fairly mid or just high of
mid vowel. It's pretty flat, height-wise, over its almost 150 msec duration. The F2 is less flat. It starts at about 1250-1300 Hz,
where it sits for about half the vowel or so, and then starts to lower just a little. Now that doesn't seem to coincide with the
transition in the F1, so I think that's a separate thing, and not just transition. So we've got something that is of constant height, mid
or just above, that starts vaguely back and moves slightly further back. Or round. There's only a couple of choices, and the other
one is more likely to move forward (i.e. towards central).
Lower-case G
[g], IPA 110
Another (weakly) voiced gap. At first glance it looks like it has a fairly clean burst, but if we look closer the noise seems to
precede the burst ever so slightly. This is different from the preceding bursts, which are pretty sharp on their left sides. Also notice
the low frequencies, which are weaker. The preceding bilabial bursts all had some energy down there. And the transitions into the
next vowel are all wrong for bilabial. In fact it looks like F2 and F3 are awfully close together (I hesitated to say 'pinched' together)
and move apart into the following vowel. So the absence of low-frequency release burst energy and the 'velar pinch' in the
transitions suggest a velar.
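The transition reasoning used for plosives (everything pointing down means bilabial, F2 and F3 coming together means velar, otherwise probably alveolar) can be sketched as a rule of thumb. The inputs are F2 and F3 at the middle of the neighbouring vowel and at the edge of the closure. The function name and the 500 Hz pinch threshold are assumptions, and the alveolar answer is only a call by elimination.

```python
def guess_place(f2_mid, f3_mid, f2_edge, f3_edge, pinch_hz=500.0):
    """Guess plosive place from how F2 and F3 move from the vowel
    midpoint toward the closure edge. All values are in Hz."""
    gap_mid = f3_mid - f2_mid
    gap_edge = f3_edge - f2_edge
    # Velar pinch: F2 and F3 converge and end up close together.
    if gap_edge < gap_mid and gap_edge <= pinch_hz:
        return "velar"
    # Both formants point down toward the closure: classically bilabial.
    if f2_edge < f2_mid and f3_edge < f3_mid:
        return "bilabial"
    # Nothing else fits, so alveolar by elimination.
    return "alveolar"


# Hypothetical formant values.
print(guess_place(1800, 2600, 2200, 2500))  # F2 and F3 pinch together
print(guess_place(1500, 2500, 1200, 2300))  # both fall toward the closure
print(guess_place(1500, 2500, 1700, 2600))  # F2 rises, no pinch
```

Burst shape and low-frequency burst energy, as discussed above, are separate cues and are not modeled here.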
Barred I
[ɨ], IPA 317
Well, this is another weak vowel. Given that it's the last vowel, and presumably lengthened by final lengthening, it's not amazingly
long. So whatever you thought of the vowel before the last one, you might think here.
Lower-case D
[d], IPA 104
So we come to the end at last. A good gap as far as resonances go. Some decent voicing, considering the low frequency and
amplitude of the striations (considering it's utterance-final, quite good amplitude and duration of voicing during the closure). And a
noisy release burst with energy biased to the very high frequencies. Definitely looks like an alveolar burst.
Hook-top H
[ɦ], IPA 147
Well, if I'm not trying to be literal about transcribing my diphthongs, I am trying to be literal about voicing, and the continuous
striations here indicate that this is voiced. On the other hand, it's very definitely fricative, and its spectrum is very clearly shaped by
the surrounding resonances. Which is pretty classic for [h], except this is voiced. In spite of the usual description, it's not unusual
to see a voiced [h], especially between vowels. For trivia value, I seem to be one of those speakers who allows [h] word-medially only
if it is initial to the stressed-syllable (as in pro[h]ibit, but pro[*h]ibition). Mostly. So this either has to be word initial, pre-stress, or
both. As it turns out ...
Ash
[æ], IPA 325
Well, once the frication subsides, we've once again got something quite low (the F1, though weak, is just a little lower than it was
in the preceding vowel nucleus), but very definitely front. Frankly, fronter (higher F2) than I think I usually produce this vowel.
Maybe I'm reacting against the general trend centralizing this vowel, which I found (as others have reported) in California, and has
been demonstrated for some Canadian speakers.
Eng
[ŋ], IPA 119
So here we have a segment that's fully voiced, and apparently sonorant, but of greatly reduced amplitude compared to a real
vowel. It's got a nice little resonance just below 1500 Hz, which would make you think of [n], if you were used to my voice, except
that doesn't jibe with the serious velar pinch going on in the preceding transition. So this most likely is velar.
Eth
[ð], IPA 131
There's a sudden change at about 625 msec, suggesting the end of the preceding eng and the onset of something else. It's still
fully voiced, but is again of reduced amplitude compared to the nasal. So this might be some kind of voiced obstruent. And given
the slushiness of my stop closures, you might be tempted to suggest this is a plosive. But there's that funny F2 thing going on,
and the noise at the top of the spectrogram. So maybe this isn't plosive, but fricative. So it's a voiced fricative. There's not a lot of
useful transition information, I don't think, to tell us which. Let's rule out the sibilants due to weakness. [h] is possible, since there
seems to be some formant-like organization to the noise, but this looks nothing like the preceding [h]. Which leaves us with [v]
and Eth. Probably that's the best we can do at this point.
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
Well, it looks like there's a serious discontinuity around 775 msec, but since there's vowel on either side, I don't know what it could
possibly be, unless it has something to do with the sudden pitch change in this vowel. But the good thing is that the discontinuity,
whatever it is, clearly tells us that the vowel goes from basically mid, 500 Hz, to just a little higher (lower F1). The F2 starts sort of
neutral-to-lower than neutral, and gets lower. So this long vowel goes from mid and sort of back to higher than mid and definitely
back.
Lower-case S
[s], IPA 132
Now this is what a sibilant should look like, except it's a little short. Spectrally, this is a classic [s], with very broad band energy,
quite high in amplitude, and concentrated well above the 4000 Hz range, with very little formant-like organization.
Lower-case I
[i], IPA 301
Well, F1 as low as it ever seems to get for me, maybe 300 Hz or so, with an F2 well above 2000 Hz. That's about as high as my
F2 can go. So this is the highest, frontest vowel I can produce.
Lower-case Z
[z], IPA 133
Another sibilant, spectrally very similar to the previous [s], except if anything the noise is slightly broader band, extending all the
way down to the low frequencies. But this one is voiced. Barely. But there you go.
Script A
[ɑ], IPA 305
Well, look at that F1 and F2, as I said before, straddling about 1000 Hz.
Lower-case N
[n], IPA 116
This is the duration of a flap, and I'm not 100% sure why I decided it wasn't a nasal flap rather than the straight nasal I
transcribed. But it's definitely not a straight flap--it's got very clear resonant energy (formants) all the way up. But it's weak
compared to the flanking vowels, so it must be a nasal. Its shortness may tell us it's flappy, which makes it most likely
underlyingly alveolar.
Schwa
[ə], IPA 322
Short vowel. Actually, it looks just like the previous [ɑ] but it's short, and its F1 is more ambiguous. TMSAISTI. I'm a little
concerned that the F3 seems to split into the F3 and F4 of the following vowel (the intervening noise notwithstanding), but, as I'm
so fond of saying, oh well.
Hook-top H
[ɦ], IPA 147
Well, here's another one of these. Definitely noisy, though apparently voiced, and with energy clearly organized in the formant
pattern of the surrounding vowels.
Lower-case K
[k], IPA 109
Well, there's not a lot of velar pinch, although the F3 is definitely falling, and the F2 is definitely rising. But the rising F2 rules out
bilabial, and the falling F3 probably rules out alveolar. Which only leaves one option. And that would explain the apparently double
burst at about 1825 msec as well.
I'm not sure how grammatical this is in my idio/dialect. I did check and there are plenty of examples of "envy [abstract nominal]"
but the more usual thing is to "envy [sentient]" where [sentient] can do or experience something enviable. Discuss. (I'd prefer
'covet' over 'envy', but I'm not sure I can "covet [abstract nominal]", although I can clearly "covet [concrete nominal]". Hmm.) Or
more obviously "envy hibernating species/animals/individuals" in the sense of "envy [things (that hibernate)]".
Egad this is a long spectrogram. Pushing on: There's a funny discontinuity at about 1425 msec, which I take to be the 'moment' if
there is one, when the tongue approaches the minimum coming up. So this stretch from 1375 to 1575 msec (or so) I'm dividing
into three bits again. The first bit, before the discontinuity, the bit around the F3 minimum, and the bit after.
So let's talk prosody. This utterance didn't work out quite the way I expected.
First the break indices. Zeroes mark the ends of reduced words, think 'clitic' although that word is loaded. Ones mark the ends of
prosodic words. Anything 1 or above should have a lexical (*) accent. Twos are for something in between word and phrase, either
something that has a boundary tone but doesn't otherwise exhibit the features of a boundary, or as in this case something that seems to
have the timing of a boundary but doesn't seem to have a tonal mark. Threes, if there were any, would mark phrase boundaries,
and the four the utterance boundary. Or at least that's what they're doing here.
Tonewise, there are two clear highs. Whether they're H*s or H*+L I'm not 100% sure, so I just marked them as H*. Probably the
first one should be an H*+L, since otherwise there's no real reason for the low pitch on the following syllable. Unless you think
that's really a 3], which is possible. The last H* is placed on the stressed syllable, which happens also to be final in the utterance,
so it's getting kind of squished in with the low boundary tone.
That 2] was a compromise in my head. It felt (and sounded) like there was some lengthening there, but there didn't seem to be a
phrasal tone. It looks like the H* there is deaccented, and I don't know how to deal with this in ToBI. Suggestions appreciated. But
following (my) ToBI rules, the lexical word there is supposed to get some kind of mark. Again, I could be wrong, as far as real
ToBI goes.
I obviously need to brush up on my ToBI. So this'll be it for the intonation for a while.
This month's high-pedagogy mode spectrogram is heavy with fricatives (hence the extended frequency scale) and nasals. So pay
attention. Things aren't going to stay this "easy" for long...
So remember how these fricatives really look, so that when you can only see them up to 4000 Hz you can hypothesize how they
are supposed to look.
Not
"Silk flowers have no smell."
Segmental cues
Okay, so if we say interdental, we might say the double burst is due to laminality. But I have mis-spoken. It's not laminality per se,
but the length of the closure along the hard palate. And the upper teeth don't provide that kind of length. So let's say labiodental.
Then we could have a velar stop which would explain the double burst, and we could have the low-frequency burst, since it has to
resonate through the approach to the labial 'closure'. Voilà, everybody's happy.
Okay, so if the vowel is /o/ and this is an off-glide (like a /w/), this sentence doesn't make any sense. If this is a dark /l/, then the
word could either be 'small' 'smole' or 'smell'. Which is the most likely to go with the 'silk flowers'?
So (working backwards from the hypothesis that this word is 'smell'), what this is is a very dark /l/, so dark it's velarizing/backing
(slightly) the (underlyingly not-particularly front, low-of-mid) vowel /E/ which precedes it. (There are people who definitely have a
highish /e/ vowel in this word. I'm not one of those people.) Monosyllabic words with final /l/ are notorious for goofing around with
the vowel quality in North American English.
Once again, I've played with the current E_ToBI transcription conventions. Rather than a separate orthographic tier, I've aligned
the Break Indices to the segmental transcription. I've combined the Tonal Tier and the Break Index Tier as a single line. I align
word-level (*) tones with the middle of the marked vowel, but phrase accent (-) tones and boundary (%) tones to the left of their
appropriate Break Index.
"silk"
Break Index: 1
This word is receiving some kind of focus. silk flowers versus real flowers, I guess. As I understand the current conventions,
lexical words receive (at least) a 1BI and a *-tone, unless deaccented in some way. Receiving contrastive focus, this word
receives an H*. If this were a normal declarative, it might receive an L*.
"flowers"
Break Index: 2
Okay, the way I read the conventions, 2BIs are for a) something which 'feels' like a phrasal break, but doesn't receive any
particular tonal or prosodic mark (i.e. no phrase-final lengthening). This is different from a ?BI, which indicates an apparent mark,
but of uncertain or ambiguous BI level. So I gave this a 2, just because it seems a little bigger than a 1, but there doesn't seem to
be any mark. Lexical word, so I gave it an L* since it is near the apparent baseline. In a normal sentence, this would be H*, but
due to the focus of the preceding word, this is L*.
"have"
Break Index: 1
Another word, so it gets a tone. I've used H* because it feels like a high, although it is obviously lower than the first one. I don't
think there's a strong or obvious contrast between this intonational contour and one in which the HiF0 is here rather than on the
previous H*. Give it a try. If this were a normal rather than focus sentence, this would be a L*.
"no"
Break Index: 1
Since 'no' isn't lexical in the usual sense, I'd be tempted to give this a 0BI, but since it seems to have a distinct L*, I gave it a 1.
There's a certain amount of interpolation between the preceding H* and the pitch minimum on this vowel. On the other hand, I
think the pitch track indicates a drop to baseline here, rather than something neutral between the preceding H* and the Ls to
come. In a normal sentence, this could be either H* or L*, or unaccented.
"smell"
Break Index: 4
Low throughout, so I gave it a L*. The slight rise in pitch at the left edge of the [m] voicing I think is just a microperturbation due in
part to the voicelessness of the preceding fricative. (Voiceless sounds often are accompanied by a slight increase in pitch.
Supposedly this has to do with tightening of the folds that accompanies or anticipates abduction.) The utterance-level 4BI
requires (as I understand the current conventions) a phrase accent and a boundary tone. Since there's no evidence of anything
but L throughout, that's what I've selected. If there were a change from H to L here, or the reverse, I'd have big trouble deciding
between the H* L-L% and H* H-L% or whatever. But no such conflict here.
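The word-by-word labels above can be collected into one combined tone/break-index line. Here is a minimal sketch, not the notation of any official ToBI tool: the `ProsodicWord` class and `combined_tier` function are hypothetical names made up for illustration, and the `N]` spelling for break indices just follows the "2]" and "3]" shorthand used elsewhere on these pages. Edge tones (phrase accent and boundary tone) are written to the left of their break index, as described above.

```python
from dataclasses import dataclass


@dataclass
class ProsodicWord:
    """One word of the utterance with its labels (hypothetical structure)."""
    text: str
    accent: str           # word-level (*) tone
    break_index: int      # break index at the right edge of the word
    edge_tones: str = ""  # phrase accent (-) and boundary (%) tones, if any


# Labels copied from the discussion of "Silk flowers have no smell."
UTTERANCE = [
    ProsodicWord("silk", "H*", 1),
    ProsodicWord("flowers", "L*", 2),
    ProsodicWord("have", "H*", 1),
    ProsodicWord("no", "L*", 1),
    ProsodicWord("smell", "L*", 4, edge_tones="L-L%"),
]


def combined_tier(words):
    """Render tones and break indices on a single line, in word order."""
    parts = []
    for word in words:
        labels = [word.accent]
        if word.edge_tones:
            # Edge tones sit to the left of the break index they belong to.
            labels.append(word.edge_tones)
        labels.append(f"{word.break_index}]")
        parts.append(" ".join(labels))
    return " ".join(parts)


print(combined_tier(UTTERANCE))  # H* 1] L* 2] H* 1] L* 1] L* L-L% 4]
```

Keeping tones and break indices in one record per word makes it easy to check that every word with a break index of 1 or more carries some kind of * tone.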
This page last modified: 11/08/2009 22:57:12
So true, so true.
People are always trying to foist those break-apart orange-chocolate orange things at me. Blech. I can sort of deal with
lemon cream and chocolate, and if you must cover your candied orange peels in something other than sugar, I won't
argue with you. But really, one of these things doesn't belong: citrus, coffee, mint, raspberry. Some things go with
chocolate, and some things don't. Get over it. End of sermon.
This is actually an observation I make every year after the first serious snowfall. It's just weird how different the world
sounds for those first hours after a snow. First of all, all the usual sounds--footfalls, roadsounds, etc.--just don't have the
same sound. Secondly, the usual reverby-echoey stuff that we take for granted disappears. The world gets correspondingly
small and close. It's just odd. Anyway, that was on my mind when I was putting this together.
Okay, at this point, some of you will be trying to fit in another vowel. Which is fair. But I swear I didn't produce one, and
I don't really hear one. I think that's just a transient closure approaching the fricative, overlapped with the nasal. So there's
a moment here where there's simultaneous nasal and fricative going on, and the point is it's all transitional so I'm going to
ignore it. TMSAISTI.
Such an interesting utterance. Inspired by a student's comment about the department. No, really. But we should be able to
deal with a little fronting. I did warn you that this sentence was syntactically marked.
"Caffeine is a necessity."
Gap. Not much in the way of voicing. Can't tell much from the transitions since the energy in the vowel starts to die
halfway through the vowel. So try any plosive you like until you find one that makes a word.
The Tones and Break Indices (ToBI) system is a set of notation conventions that can be adapted for use in
describing and analyzing the intonational patterns in a language. A ToBI transcription contains a pitch track, an
'orthographic tier', with a transcription time-aligned with the pitch track, a 'tone tier' with tonal autosegments
indicated, and a 'break index tier' indicating juncture. In my version, I time align everything to a spectrogram. I replace
the orthography with phonetic transcription, time aligned with spectrographic landmarks rather than just word edges.
I put tones and break indexes on the same tier, mostly to save space.
English conventions, broadly, recognize four levels of break index--roughly 0 for clitic boundaries, 1 for word
boundaries, 3 for phrase boundaries and 4 for utterance boundaries (aligned with the right edge, or 'end' of the
constituent). 2s are used for 'anomalous' junctures--disfluencies, things that feel like phrase ends but don't get phrase-
appropriate tone marks, things that have phrase-appropriate tone marks but don't have phrase appropriate timing,
that sort of thing. The assumption is that these mark the right edges of strictly layered prosodic groupings (so a 4
corresponds to a 3 and a 1 simultaneously, since the end of an utterance must also be the end of a phrase and the end of
a prosodic word).
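The strict-layering assumption can be written out as a small lookup. This is a sketch only: the names `BREAK_INDEX_LABELS`, `LAYERED_LEVELS` and `implied_boundaries` are hypothetical, and treating 0 and 2 as standing outside the layering is an assumption of the sketch (the text above only says that a 4 is also a 3 and a 1).

```python
# Break index levels as described above (labels are paraphrases).
BREAK_INDEX_LABELS = {
    0: "clitic boundary",
    1: "prosodic word boundary",
    2: "anomalous juncture",
    3: "phrase boundary",
    4: "utterance boundary",
}

# Levels assumed to take part in strict layering (assumption: 0 and 2 do not).
LAYERED_LEVELS = (1, 3, 4)


def implied_boundaries(break_index):
    """Return every boundary level that a given break index also marks."""
    if break_index not in BREAK_INDEX_LABELS:
        raise ValueError(f"unknown break index: {break_index}")
    if break_index not in LAYERED_LEVELS:
        # Clitic boundaries and anomalous junctures imply nothing else here.
        return [break_index]
    return [level for level in LAYERED_LEVELS if level <= break_index]


print(implied_boundaries(4))  # [1, 3, 4]
print(implied_boundaries(3))  # [1, 3]
```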
In English, there are a number of * tones, notably H*, H*-L and L*, where the *ed autosegment is aligned (usually) with
the stressed syllable of a prosodic word, so you'll usually get one for every non-0 BI (unless there's some deaccentuation
or something under focus, or something like that). % tones (boundary tones) usually align with the right edge of a 3 or 4
BI, i.e. marking the boundary of the phrasal constituent. In English, the assumption is that boundary tones align to the
BI, but 'spread' leftward to the end of the * tone.
The difference between H* followed by a L%, and a H*+L complex tone is subtle, and I chose to mark H*-L as sort of a
cheat. In some ToBI systems, - tones are associated with phrasal boundaries (i.e. 3s) instead of % tones (limited to
utterance or 4 boundaries). All three lexical words in this utterance seem to have the same tone pattern--the last one
has a low final tone (from the utterance-final 4), and the first one can have a low -L associated with the 3 (which also
accounts for the relative length of the last syllable), but the L second syllable of "common" is unaccountable. So I
declared that one an H*+L. But that felt arbitrary. So I sort of compromised.
If you feel strongly about this sort of thing, feel free to discuss this further. ;-)
Last modified: 11/08/2009 22:57:49
So, this month, or rather last month, we continue the 'return to basics' trend, with a bunch of voiced plosives. Check
out those transitions. The goal here was to get each voiced plosive with a preceding schwa, so you could see the
transitions. Well, I tried.
By the way, if you're wondering, a voiceless plosive can show some voicing energy during the closure, especially in
codas. But typically the voicing is weaker, dies off quickly (in this and the previous [b] the voicing seems sort of 'steady'--
cf the coming plosive), and usually doesn't last into the second half of the closure. Just a rule of thumb.
So, it's January, and we're starting with basics. The main exercise here is to recognize my 'point' vowels, the vowels that
are the furthest apart in my vowel space. These are [i], [æ], [ɑ], [u]. And I worked pretty hard at producing an actual
back round [u], in spite of the consonant context. You'll see that I failed, but I blame that on context. More below.
Before diving in, take a moment and work out some basics. How many syllables are we dealing with? Are there any cues
that suggest where the lexical stresses and/or intonational accents and boundaries are? Any obvious segmental cues?
Inexplicably long vowels? Gaps? Apparent nasalization? Sibilance?
Eth
[ð], IPA 131
So there's this voicing that starts around 50 msec in from the left edge. The resonances of the vowel don't really kick in
until at least 125 msec, so that leaves us with almost 75 msec of voicing to do something with. Just 'cuz there's a vowel
following, I'm voting for consonant. And not a very open consonant, although since this is the beginning of an
utterance, I'd assume a fair amount of initial strengthening going on. If you look up to the 2500 range, and again at the
3500 range, what's that? That's noise. So this is probably a fricative. If it were sibilant, it would be louder. If it were /h/
it would have more F1/F2 involvement. Which leaves the labiodental and the interdental. And those transitions don't
look labial. The F2 starts too high, although frankly the F3 and F4 are not helping.
Lower-Case I
[i], IPA 301
Well, there's this big transitional thingie, but I'm looking specifically at that moment between 200 and 225 msec. The F1
is low, so it's high. F2 is way high, so very front. So we're dealing with a high front vowel. Owing to another 150 msec
more or less of vowel (even if the F2 is moving throughout), there's probably another vowel following, meaning we're at
a hiatus moment here. Meaning this syllable must be open, which means this vowel, if high, is tense. Think about your
phonotactics.
Schwa
[ə], IPA 322
Well, the F2 is moving throughout. So deciding on a backness value is perhaps a waste of time. It's interesting that the
vowel definitely is lower (though still not 'low') than the preceding vowel, so this is probably middish. But the F2 never
really comes to a stop, or even slows down, so there's no evidence of a true 'target' for the F2. Which makes me think of
vowel reduction. Which makes me happy because then I don't have to decide anything about this vowel.
Lower-Case V
[v], IPA 129
On the other hand, this F2 transition can't be anything but labial. It's way lower than the alveolar "locus" (we can argue
about the earlier transition into the [i], but it looks vaguely alveolar, more alveolar than labial, if it comes to that).
Perhaps most importantly, the F3 and F4 are clearly lowered--aside to Vineeta Chand: but not at all 'low' ;-)--suggesting
some labialization in here somewhere, but nothing like real rounding. So this thing from 350 msec to 400 msec or so is
vaguely labial, in the sense of not obviously being round, and being more labial than coronal or velar. So anyway, we've
got a consonant, voiced, and if we look really closely, fricative. That could be the noise that you get with my mushy
stops, but you don't get that kind of voicing even in my mushy stops. This being English, there aren't a lot of voiced
labial fricatives to wonder about.
Epsilon
[ɛ], IPA 303
So for almost 100 msec, there's a vowel. The F1 is, well, higher than 500, but not by much. So this is mid or vaguely
lower than mid. The F2 is, well, higher than 1500 but not by very much, so this is front, but not like front in high-and-
front. So middish to lower-middish, and vaguely front.
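The rule of thumb used for the vowels on this page (F1 around 500 Hz is mid, lower F1 is higher; F2 around 1500 Hz is neutral, higher F2 is fronter) can be sketched as a tiny function. The `describe_vowel` name, the 100 Hz margin, and the example formant values are all assumptions made for illustration. The reference points are for my voice, and this is a heuristic for reading spectrograms, not a classifier.

```python
def describe_vowel(f1_hz, f2_hz, margin_hz=100):
    """Return rough (height, backness) labels from F1 and F2 in Hz.

    The 500 Hz and 1500 Hz reference points come from the discussion
    above; the margin around them is an assumed tolerance.
    """
    # Height is inversely related to F1: a low F1 means a high vowel.
    if f1_hz < 500 - margin_hz:
        height = "high"
    elif f1_hz > 500 + margin_hz:
        height = "low"
    else:
        height = "mid"

    # Backness (or rounding) lowers F2; fronting raises it.
    if f2_hz > 1500 + margin_hz:
        backness = "front"
    elif f2_hz < 1500 - margin_hz:
        backness = "back"
    else:
        backness = "central"

    return height, backness


print(describe_vowel(300, 2200))  # ('high', 'front')
print(describe_vowel(550, 1650))  # ('mid', 'front')
```

Moving formants are the usual complication: where F2 never settles, as in the reduced vowels here, there is no single value worth feeding in.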
Lower-Case N
[n], IPA 116
So for almost another 100 msec, there's full voicing, but the amplitude drops suddenly at 500 msec. So we've got full
vowel from 400 to about 500 msec, and then something less than fully open from 500 to almost 600 msec. Around 600
msec there's a burst, so we'll have to jam a plosive in here somewhere, but the voicing (and the upper frequency
resonance) aren't consistent with a voiced stop. So this is something else. The sudden change in amplitude suggests
nasal, although frankly I'm at a loss why it looks like it does. The transitions in the preceding vowel could go either
alveolar or labial, depending on your mood, although the F4 is just sitting there, which might tip the scales toward
coronal. But the pole, if that's what it is that stands in for F2, is moving (from about 1250 to around 1600 Hz), which is
just odd. And there's no obvious zero. So exactly what to do with this one is a mystery to me.
Lower-Case T
[t], IPA 103
So this burst at 600 msec needs explaining. I explain it thus. This is how homorganic nasal-stop clusters look. Nasal
nasal nasal, with an oral burst. It's definitely alveolar, as the burst noise is 'acute' (higher in the high frequencies--i.e.
looks like about 5 msec of an [s]). There's definitely a disruption to the regular voicing pattern, although how exactly
we're supposed to realize voicing or voicelessness aligned with so 'instantaneous' a cue like this burst is again a mystery
to me.
Barred I
[ɨ], IPA 317
So after the burst, there's a short little vowel. Again, the F2 is just zooming, so I'd say there's no particular vowel
target, or at least the vowel target gets overwhelmed by the overlap with the flanking transitions. But that's just a
theory. It's reduced. Skip it.
Lower-Case Z
[z], IPA 133
Okay, so looking at the higher frequencies, there's some frication going on. It's not organized in bands; it's one big
band, centered really, really, high. Which makes it look sibilant, but it's not really that loud. But if you look down at
the bottom, there's some voicing. It's dying away really fast, but it's there. Which explains the relative weakness of the
noise--hard to maintain sibilant airflow and voicing at the same time! So what we have is weak [s] noise, accompanied
by voicing. That is, [z].
Schwa
[ə], IPA 322
I'm not intending to make it a rule that if the F2 is just moving, you should ignore the vowel, but really? Is there *any*
evidence that the F2 is going somewhere other than between where the consonants are pulling it? Okay, well, if it is,
then this one has some kind of slightly low F2, so if you really want this to be back or round, fine with me.
Tilde L (Dark L)
[ɫ], IPA 209
But it's back because it's being coarticulatorily (?) velarized by this bit. From 825 msec or so for at least 100 msec,
there's something consonantal happening. Now, I'd say this was a nasal. Look at those sharp edges. Look at those flat
resonances. What's more, I'd say [m], because that nice little pole at 1000 Hz just screams [m]. But I'd be wrong, because
I didn't consider the upper formants. Look, there they are. In spite of the lower and higher apparent zeroes, these look
pretty strong. In a nasal, all the formants get relatively weaker than they would be in an oral vowel. Compare the
overall amplitude of the resonances to the vowel in the preceding [n]. And then there's the F3 frequency to contend
with. It's raised. Compare the F3 here with that in any of the preceding vowels. That's raising. So that gives us a clue--
this could be a lateral, and that 'pole' is just the very low resonance of a very back (velarized) lateral. Ooh. I think we're
on to something.
Lower-Case O + Upsilon
[oʊ], IPA 307 + 321
Well, on the far side of the lateral is this thing. F1 starts around 500, maybe a little higher, and in the last third or so it
drops, so this vowel goes from sort of mid to sort of high. And it's way back and possibly round, judging from the low
F2, and it gets backer/rounder. Easy.
Lower-Case K
[k], IPA 109
Well, this one is rough. The F2 and F3 are too far out of normal to provide much in the way of useful transitional
information, at least traditional transitional information. We've got a mushy gap, so we're probably talking about a
plosive. It's voiceless, so we're down to three possibilities. That's progress. The real giveaway is the burst. It's not [t]
looking. It's got bands, and if anything it's got a weak bit up around 4200 Hz. There's some energy in F2, but not below.
So this isn't really good for bilabial. So guess velar. Well, okay I can then convince myself that the F2 and F3 transitions
into the following vowel are vaguely pinchy (it's a stretch, but there are always some leaps of faith in this enterprise). And
then there's that clunk. See it? If the main burst is at 1100 msec, then at about 1125 msec, in the F1/F2 region? See it?
That clunk? What if that's the second burst of a double burst? Ooh, double bursts are usually characteristic of velars.
Ooh, corroboration. Gotta love it, especially if it's all you've got.
Turned R + Under-ring
[ɹ̥], IPA 151 + 402
This explains the length I guess. Aspiration tends to go along with the duration of the following approximant, if there is
one. And there is one. There must be or there's no explanation for that F3. See how both F3 and F2, when the voicing
finally kicks on, are both below 2000 Hz. That low F3 is a dead giveaway.
Barred I
[ɨ], IPA 317
And here's another one of these stupid vowels.
Lower-Case D
[d], IPA 104
Well, here's another long gap. Quite a long gap, actually. And fully voiced. That's funny. The transitions are consistent
with alveolar, but it's so hard to rely on those. But the release looks like another alveolar release. Except this time it's
voiced, and the VOT is short, if it's greater than zero at all.
Small Capital I
[ɪ], IPA 319
Well, there's some high frequency noise, but given that we've got an alveolar plosive on one side and a clearly sibilant
fricative on the other, I think we can just call it coarticulation, or reverb or something. So what have we got? Well, it's got to be a
vowel. Probably high, or at least highish, judging from the F1. Quite front, judging from the F2, but not as front as the
first vowel in this spectrogram. So this is probably [ɪ].
Esh
[ʃ], IPA 134
Ah, sibilants. This is high amplitude, relatively broad band, high frequency energy. This one is more 'shaped', by
resonances/filters, than, say, the aforementioned [z] or the noise for the [t] releases. It's also centered a bit lower--most
of the others have their center, at least conceptually, above 4000 Hz. The center here is definitely lower, just above 3000
Hz. And the energy shuts off abruptly (relatively speaking) below the F2 of the surrounding vowels. That's pretty
typical of [ʃ].
Schwa
[ə], IPA 322
Last vowel of the spectrogram. Kind of short for a final vowel. Must not be that important. Seriously. Final syllables
lengthen. If this is the lengthened version, how short would the unlengthened version be?
Lower-Case N
[n], IPA 116
And finally, something that has reduced overall amplitude, clear resonances, and definite zeroes. At least, something
that's definitely nasal. The pole, what you can see of it, is up around 1500 Hz, which if you're looking at my voice is
pretty clearly alveolar looking, especially without any hint of velar pinch in the incoming transitions.
This page last modified: 11/08/2009 22:57:34
So once again, before diving in, take a moment and work out some basics. How many syllables are we dealing with? Are
there any cues that suggest where the lexical stresses and/or intonational accents and boundaries are? Any obvious
segmental cues? Gaps? Sibilance?
Lower-Case S
[s], IPA 132
Speaking of sibilance, this is a typical sibilant. Quite high amplitude noise, here concentrated in the very high
frequencies. This really can't be anything except an [s]. The amplitude, the frequency, and the single, broad band. And
no voicing bar.
Lower-Case T
[t], IPA 103
Well, there's a gap here. It's short, but it's pretty distinct. It's voiceless, but then there's not much choice, considering
this is still the onset of an utterance-initial syllable and the preceding consonant is [s]. Okay, so the release isn't really
sharp, and there's a hint of a double burst. But the double burst, if that's what it is, isn't in the F2/F3 range, so this
probably isn't a velar. It could be bilabial, but the noise is all wrong. There's not a nice sharp, across-all-frequencies
burst like I usually like to see with coronals, but that's what we're left with. So that noise up in the high frequency
range is pretty interesting. It's not going to be another fricative, [s-t-(fricative)] not being a conspicuous onset in
English, so that noise has to be release noise. And it's [s]-shaped, suggesting that the plosive is alveolar. If you're not
sure why, think about where the noise following the release of a [t] would be generated, and where the noise of [s] is
generated.
Barred I
[ɨ], IPA 317
I'm not 100% sure about this transcription. Let's go through the details. F1 is low, well clear of 500 Hz low, so this is a
high vowel. The F2 is falling slightly. It starts at about 1600 Hz and drops to about 1400 Hz. So the F2 is around 1500 Hz,
which is pretty much neutral territory. So this is vaguely central, moving very slightly back. Or round, but whatever. F3
is up where it's supposed to be, i.e. neutral as all get out. So high and central. But this is probably a stressed syllable.
Remember that this is my voice, general western US English with particular shades of southern California. My /u/ isn't
what you would call back. Or round, but whatever. Random fact about my dialect: /u/ and /ju/ neutralize after coronal
stops. To something quite flat. Like this.
Lower-Case D
[d], IPA 104
Another gap, this time with a nice clear voicing bar. So this one is voiced. There's that same funny maybe-double-burst-
thing-perhaps. There's not a lot else to the release, which is kind of troubling, but what there is looks like the other one,
which was coronal. In the absence of strong evidence to the contrary, I'd say [d].
Lower-Case S
[s], IPA 132
And here we have another sibilant. Voiceless, and higher in the high frequencies, so this is another [s]. Now, it's
voiceless. So eventually you'll have to decide whether it goes with the following syllable or the preceding. And there's a
trick here. So look at the next segment and then we'll talk.
Now here's the tricky part. This looks like aspiration. Depending on where you count from, the release noise is at least
50 msec long. This is an aspirated stop. So if the preceding [s] is part of this syllable, this should be an [s-(stop)] cluster,
and there shouldn't be much in the way of aspiration. But if the [s] is part of the preceding syllable, it follows a syllabic
[n] and is probably a plural marker or something (this being the second syllable of the utterance and therefore part of
the subject NP). But we'd expect a plural marker following a syllabic sonorant (a voiced sound) to be voiced, i.e. [z].
Now, I'm famous for devoicing my [z]s, but the result doesn't look quite as [s]-like as this. The noise of [z]s is typically
quite weak, compared to your average sibilant, even when devoiced. And devoiced [z]s are always (in my experience)
seriously shorter than analogous [s]. And this [s] isn't. So this is an [s].
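The syllabification argument above boils down to one measurement and one phonotactic fact: voiceless stops in English [s]-clusters are unaspirated, so a long, noisy release means the [s] belongs somewhere else. A minimal sketch follows. The function names are hypothetical, and using 50 msec as a hard cutoff is an assumption; the text only says that 50 msec of release noise looks aspirated.

```python
# Assumed cutoff, taken from the "at least 50 msec" observation above.
ASPIRATION_THRESHOLD_MS = 50


def is_aspirated(release_noise_ms):
    """Treat a stop as aspirated if its release noise is long enough."""
    return release_noise_ms >= ASPIRATION_THRESHOLD_MS


def affiliation_of_preceding_s(release_noise_ms):
    """Guess which syllable a preceding [s] belongs to.

    An aspirated stop cannot be the second member of an [s]-stop onset
    cluster, so the [s] must close the preceding syllable.
    """
    if is_aspirated(release_noise_ms):
        return "coda of the preceding syllable"
    return "onset cluster with the stop"


print(affiliation_of_preceding_s(50))  # coda of the preceding syllable
print(affiliation_of_preceding_s(15))  # onset cluster with the stop
```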
Eventually, you'll notice there's another hypothesis, that there's an underlying voiceless consonant in between the [n]
and the [s]. Probably homorganic with the [n] (and coincidentally with the [s]). I'm not saying it deletes necessarily, but
if you've been following these things for a while, homorganic nasal-stop clusters tend to have reduced stop phases.
They look like [n]s (or whatever) with release bursts at the end. But instead of just releasing, there's an [s] to contend
with. It would be nice if there were a nice little burst, but life ain't perfect.
Moving on.
Schwa
[ə], IPA 322
Tiny short vowel, possibly a figment of my imagination. But I think there are a couple of nice clear periods of vowel just
before 700 msec.
Lower-Case M
[m], IPA 114
This one is another nasal, for the same reasons the last one was. This one is bilabial, however, because the pole is
around 1000 Hz, rather than 1500 Hz.
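The pole-frequency comparison for nasals can be sketched as a nearest-reference lookup. The two reference values (about 1000 Hz for [m], about 1500 Hz for [n]) are the ones given for my voice on these pages and won't transfer to other speakers. The names `NASAL_POLE_HZ` and `guess_nasal` are hypothetical, and no velar reference value is given in the text, so none is included.

```python
# Approximate pole frequencies for one speaker, from the discussion above.
NASAL_POLE_HZ = {"m": 1000, "n": 1500}


def guess_nasal(pole_hz):
    """Return the nasal whose reference pole is closest to the observed one."""
    return min(NASAL_POLE_HZ, key=lambda symbol: abs(NASAL_POLE_HZ[symbol] - pole_hz))


print(guess_nasal(1000))  # m
print(guess_nasal(1400))  # n
```

A pole a little below the reference value, such as 1400 Hz, still comes out as [n], which is the "close enough" reasoning applied to the nasal flap.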
Lower-Case N
[n], IPA 116
Teeny short little gappy thing, but voiced. Fully voiced, with a nice strong voicing bar and some resonance. So this isn't
a standard flap, which is usually more plosive looking. If you abstract away from the length, or lack thereof, this could
be a nasal. You can even see a little bit of pole around 1400 Hz. Okay, it's a little lower than the previous [n], but close
enough. Nasal flap. Remember it.
Schwa
[ə], IPA 322
Another vowel. I notice now that I've misplaced the segmentation mark. I think the vowel here starts at the end of the
flap, let's call that 1125 msec, but it's not quite, and goes on to, well, let's call it 1175 msec. That's just short. And so
probably reduced.
Tilde L (Dark L)
[ɫ], IPA 209
The peak of the 'constriction' or whatever you call it is where I marked it, but as I said above, I think the contact starts
much earlier than I've marked. But whatever. Somewhere in here, let's say between 1200 and 1300 msec, we've got
another dark [l]. Ignore the indiscernible F1. F2 is low. F3 is still way high.
Turned Script A
[ɒ], IPA 313
I love that symbol. I don't know if this was really round but the F2 was a little lower than I expected. The F1 is high, so
this is a very, very low vowel. And it's about as back and/or round as it can get.
Lower-Case T
[t], IPA 103
I don't know if that last pulsey thing at 1600 msec is a glottal pulse or a closure transient, but it's the last evidence of
anything for almost 100 msec of serious gap. So we have another plosive (stop) here. F2 transitions in the vowel look
vaguely pinchy--but only because the F3 is returning to neutral from being raised, and the F2 is returning to neutral
from being lowered. The release at 1700 msec definitely looks like a nice alveolar burst.
This page last modified: 11/08/2009 22:57:33
Eth
[ð], IPA 131
I worked really hard at making an unstopped, fully voiced initial Eth, and I'm still not 100% successful. But look at that
voicing. More than 100 msec of it. But the visible frication is, well, as weak as it should be, but it really only creeps in in
the last 25 msec or so, and really only once the voicing in the upper formants starts to creep in, which might as well be
vowel. So I don't know what to do. We've got voicing. We've got fricative, sort of. We've got transitions into the
following vowel consistent with alveolar (less so than velar or bilabial--the F3 is just sort of sitting there, where for
either velars or bilabials it should start lower), but there's no sense in which the fricative looks sibilant. So we're talking
some kind of coronal fricative, nonsibilant. And voiced. Narrows it down quite a bit, actually.
Lower-Case D
[d], IPA 104
Looking at this again, I'm wondering if I mis-transcribed it. We've got a shortish gap, with pretty full voicing
throughout. There's even some evidence of resonance through the gap, so maybe I should have transcribed as a flap.
But I didn't, so there you go. It's a gap, so if we take it as a plosive, then we have to decide which one. Well, voiced. No
velar pinch, which leaves bilabial or alveolar. And the transitions don't really look bilabial.
Lower-Case O + Upsilon
[oʊ], IPA 307 + 321
Ah, movement. This vowel, and the upcoming couple of consonants, were the whole point of this utterance. Okay, F1 is
a little higher in frequency than the previous vowel, but it's still in the mid range. The F2 starts up around 1750 Hz but
then falls sharply to a low just about 1000 Hz. So this is mid. It seems to start front, but falls outlandishly sharply
towards definitely back and round. I don't know how much of that is allophonic and how much is just me and my
screwed up back vowels. Something to think about as we move on. But anyway, the only things that can end up that
back in my dialect of English are phonemically/historically back. And round. So this is an /o/, and it looks like a
diphthong in my speech. Remember that. For the record, I think I transcribed it as nasalized (at least the offglide) before
I decided that there was a nasal consonant in the following coda, distinct from the onset after that. More below.
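For anyone who prefers the eyeballing written down as numbers, here is a minimal Python sketch of the F2 reasoning above. The function name and the 300 Hz "that's real movement" threshold are assumptions made up for illustration; only the rough 1750 Hz and 1000 Hz readings come from the text.

```python
def describe_f2_glide(f2_start_hz, f2_end_hz, min_change_hz=300.0):
    """Describe F2 movement across a vowel.

    min_change_hz is an assumed, illustrative threshold for calling the
    movement a glide rather than a steady vowel.
    """
    change = f2_end_hz - f2_start_hz
    if abs(change) < min_change_hz:
        return "steady"
    # A falling F2 means the vowel is getting backer and/or rounder.
    return "backing/rounding" if change < 0 else "fronting"


# Rough readings from the text: F2 starts near 1750 Hz, ends near 1000 Hz.
print(describe_f2_glide(1750, 1000))  # backing/rounding
```

A real analysis would use a measured formant track rather than two hand-read points, so treat this as a rule of thumb only.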
Lower-Case N
[nʔ], IPA 116 + 113
Now this is a nasal. It's fully voiced, the zero(es) have expanded to kill even the lower pole. And the resonances are flat.
Frankly, it still looks like a bilabial, but I'm *really* hoping that that's just coarticulatory rounding. The point of this
part of the spectrogram for me was to see what this transition looked like. When you've got one continuous closure (I
suppose) at the alveolar ridge, the velum down continuously. I'm sure there's tongue and jaw gestures happening to
distinguish the coda nasal from the onset one, but I'm not sure how they'll show up. I'd assumed that we'd see evidence
of glottality and a change in the amplitude of voicing. Well, neither of those is amazingly robust. What I definitely see is
the change in the amplitude of the zero, which I'm going to have to think about some more.
Lower-Case O + Upsilon
[oʊ], IPA 307 + 321
Well, we've got an F1 in the mid range (around 500 Hz) but this time, it's moving slightly downward. The F2 is starting
slightly back of centered (i.e. below 1500 Hz), but again is moving downward. So this is starting mid and vaguely back,
and moving backer (and/or rounder) and higher. Once again, this is /o/.
Lower Case W
[w], IPA 170
Well, that's the lowest F2 you're ever likely to see from me. The F1 isn't quite as low as I'd expect for an intervocalic
(and presumably onset) /w/, but oh well this isn't an amazingly strong boundary, in spite of its syntax. Which we don't
know about yet. Oops. Okay, so we've got an F1 in the mid to higher mid range, an F2 extremely low indicating extreme
backness and/or rounding. F3 not doing much. So this is something resonant, backish and roundish. The lowered
energy in the high frequencies suggests a consonant. Given that it's obviously surrounded by things which are 'more'
vocalic, this is something vowel-like acting like a consonant, i.e. an approximant. So if it's a choice between /u/ (or /o/
or something like that) and /w/, we know which to pick.
Epsilon
[ɛ], IPA 303
Well, this vowel is mostly transition, but from what we can see of this syllable, the F3 is still hovering in the mid-range.
Abstracting away from the preceding /w/, the F2 seems to be heading toward the front space. The F3 is dropping,
eventually coming to a minimum around 850 msec, so we'll save that moment for later. But the point is the F2 is
heading as high as it can until it starts to run into the lowering F3. So this is middish, and frontish. And given the
following segment (that F3 thing) it's probably not worth asking whether it's tense/long/diphthongized or
lax/short/centralizing.
Turned R
[ɹ], IPA 151
So the F3, which up until about 850 msec is hovering between 2300 and 2700 Hz, dives to around 2000. Think that's not
low enough? Get real. Especially for a coda /r/ with a front vowel. So the reason we're not worried about tense or lax in
the previous vowel is that the tense/lax distinction neutralizes before /r/. La.
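The "F3 dives" observation can be sketched as a simple check on a formant track. This is only an illustration: the function name, the 300 Hz minimum drop, and the sample track values are assumptions, chosen to resemble the 2300-2700 Hz baseline and the dip to about 2000 Hz described above.

```python
def f3_dip(f3_track_hz, baseline_hz, min_drop_hz=300.0):
    """Return (lowest F3, whether it dropped far enough below the baseline).

    min_drop_hz is an assumed threshold, not a published criterion.
    """
    lowest = min(f3_track_hz)
    return lowest, (baseline_hz - lowest) >= min_drop_hz


# Made-up track: hovering around 2500 Hz, then diving toward 2000 Hz.
track = [2500, 2450, 2300, 2100, 2000]
print(f3_dip(track, baseline_hz=2500))  # (2000, True)
```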
Lower-Case T + Right Superscript H
[tʰ], IPA 103 + 404
So there's a couple of very widespread pulses in this gap (from about 900-950msec or so), but given the aspiration
following the release, I'd not worry about it. The release occurs at about 950msec, and there's not only at least 50 msec
of VOT, but there's higher frequency noise (look at that around 3500 Hz for 100 msec after that). And that noise is high
frequency, broad band and *very* strong. So that's sibilant release, which means this plosive has to have been a) behind
the teeth and b) close enough to produce noise upon release that resembles an [s]. Hence an aspirated /t/.
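The VOT arithmetic is simple enough to write out. In this minimal sketch the release time (950 msec) is the reading from the text, while the voicing onset of 1000 msec is just the earliest value consistent with "at least 50 msec of VOT". The 30 msec cutoff for calling a plosive aspirated is an assumed rule of thumb, not something measured here.

```python
def voice_onset_time_ms(release_ms, voicing_onset_ms):
    """VOT is the lag from the plosive release to the start of voicing."""
    return voicing_onset_ms - release_ms


def looks_aspirated(vot_ms, threshold_ms=30.0):
    """threshold_ms is an assumed, illustrative cutoff for a long-lag plosive."""
    return vot_ms >= threshold_ms


vot = voice_onset_time_ms(release_ms=950, voicing_onset_ms=1000)
print(vot, looks_aspirated(vot))  # 50 True
```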
Barred I
[ɨ], IPA 317
So amid all that high frequency noise, there's a neeny little vowel. Given how short it is, it's not worth thinking too long
about. But it's mid or higher-mid judging from the F1, and slightly front, judging from the F2. But it's reduced, so treat
it that way.
Lower-Case G
[g], IPA 110
The gap here starts at about 1050 msec and goes on to, well, after 1100 msec. And the voicing is pretty strong and fairly
consistent, considering it looks like a serious plosive and not something slushymushy like my plosives so often are. The
burst is very broadband, and it's a little tilted to the higher frequencies. But the burst isn't 'sharp' the way the [t] release
was. It could almost be 'double'. Hmm. And there's a lot of energy just about 1700 Hz or so, rather than being clearly
concentrated higher up. So while I'll admit this is ambiguous, the slightly low F3 might be evidence of velar pinch.
It's unlikely to be coarticulation with the previous [r], just because there's at least a whole syllable in between.
Lower-Case O + Upsilon
[oʊ], IPA 307 + 321
And finally here's another /o/. Again it seems to be mostly mid throughout (the F1 is basically flat at 500 Hz for 400
msec or so). It starts out sort of central (in the mid F2 range) and moves decidedly backer/rounder (lower in frequency).
So I guess my /o/ is a diphthong. My /e/ is still open to interpretation....
This page last modified: 11/08/2009 22:57:32
To repeat: There are different styles of reading--this left-to-right business is just how I do it for convenience. As time
goes on, I'll be introducing other styles. One of the things that I always forget about, at least when sitting down to do
these is the 'big' picture stuff. For instance, how many syllables (or at least vowels) are there in this? What evidence of
segmentation do you see? Where? Can you see anything suggesting pitch peaks/lows, correlates of stress like
amplitude, length, or pitch excursions? Once you've done that sort of thing, usually you go through and mark all the
things that are obvious--the sibilants, the nasals, if you can see them, things that are obviously [i] or [a], that sort of
thing. Then, once you've got the big picture, then you start in on specific cues.
Schwa
[ə], IPA 322
My favo(u)rite. About 25 msec of vowel. Too short to do anything with, too long to ignore. Must be reduced due to
extreme stresslessness. Funny how something so short can still get a pitch accent on it, tho. Hmm.
Lower-Case N
[n], IPA 116
Well, what's really interesting is this almost 100 msec of consonant. Sonorant, with formants and everything, and
obviously very fully voiced, this clearly has less energy than your typical vowel. So it must have some kind of closure
somewhere. But it's *long*. Okay, so this is probably a good candidate for a nasal. I just wish it had an obvious zero.
Well, up near 3000, but that doesn't really count by itself. Oh well. The F1 is clearly being depleted by something. Let's
call it a weak zero... and then the pole is nice and high, as poles go, up around 1500 Hz. So that's a good indication of an
alveolar nasal.
Lower-Case D
[d], IPA 104
Well, really the only evidence of an obstruent moment here is the transient--be it release, burst, or clunk (that being
the technical term for a moment like this that we would otherwise choose to ignore). But there it is, and if it isn't a
clunk, we have to explain it. As I've observed before in these things, my (especially homorganic) nasal-plosive
sequences tend to look like this, using Steriade's Aperture Theory, a nasal closure with an oral-looking release. So this is
some kind of oral release. It looks slightly velar, concentrated in the middle frequencies rather than the upper
frequencies (as would be more typical of an alveolar). But then it would be tough to make a word out of. The transitions
are not amazingly helpful, in that the preceding sound is a nasal, and the following is an /r/, which perturbs the
formants beyond all useful visual cues. So know there's a plosive here and move on.
Turned R
[ɹ], IPA 151
Well, this is perhaps the lowest F3 I've ever produced. So there you go. The F1 is very low (very close articulation), the
F2 is quite low. And F3 is so low it's threatening to move into the low *F2* range. Can't get any lower than that.
Must be an /r/.
Lower-Case V
[v], IPA 129
So I've already given it away. There's clearly labial transitions moving into this bit of noise. And there is a very short bit
of oral fricative here. So it must be labial. This being English, it's labiodental.
Lower-Case H
[h], IPA 146
On the other hand it opens immediately into something a bit louder, voiceless, but with evidence of formant structure.
Classic [h] stuff.
Lower-Case O
[o], IPA 307
So when the voicing finally kicks on, we've got sort of a problem. I'm not entirely sure where F1 is. The F2 is that bit just
under 1000 Hz. There's no zero-ish looking thing below it, so the F1 can't be jammed up too close to the F2, but it's not
so low as to disappear into the voicing bar. So this isn't low and isn't high. Well, we have some mid vowels to play with.
Note the lack of evidence of an offglide. For once. Don't let anyone tell you that 'tense' mid vowels in English are
*always* anything. Whether they are or not is an empirical question.
Lower-Case M
[m], IPA 114
So starting around 875msec and going on for about 50 msec or so, there's this thing where the amplitude falls off. So
there's something here. There's a pole more or less where the F2 in the preceding [o] was, but the F1 sort of disappears.
There's a move to transition up for both formants, if that's what they are once the amplitude starts to kick back on, so
all we're really looking at is this short lower amplitude bit. Which is lower amplitude because it's a nasal. And the pole
is that F2 thing, down just below 1000 Hz, which is pretty typical of my bilabial nasals.
Ash
[æ], IPA 325
So again we have to abstract away from some odd transitions. The F1 doesn't reach its extremum until about 1000 msec,
but it's a high, so this must be a fairly low vowel. But the slope of the movement toward the extremum doesn't look
only transitional, so I wonder if there isn't another target floating around there. Then again, most English diphthongs
don't have *low* offglides, so I'm probably just dreaming. The F2 is fairly flat. Its maximum occurs sort of early, and
falls very slowly over the course of the vowel. There's evidence in the last 25 msec or so of downward pointing
transitions (labial again), so maybe that trend in F2 is just transitional. But maybe not. So we've got something mid-to-
low and centralish, and it's sort of long, so it might be moving to definitely low and very slightly back. Backer than
front, but not flat out back, especially compared with the nuclei of the earlier diphthongs. So this is probably a vaguely
front, but very low, vowel.
Lower-Case F
[f], IPA 128
So again we have something labial looking, this time with a burst. Which would make it a plosive of some kind. But the
voicing or noise or whatever it is at the bottom is a little loud and a little flat to go with a truly closed plosive. Then
again, you know my plosives are often sort of mushy. So as plosive as this looks, I'm going to suggest that it's worth
paying attention to the teeny weeny, almost imaginary, bit of noise up in the very high frequencies, and suggest this is
fricative. That and it makes a better word. This explains the noise, and the greater duration and frequency of the noise
at the bottom, compared to what I want to call noise or perseverative voicing or whatever it is in that thing earlier I
wanted to be a [k]. There's a lot of wishing going on in this spectrogram. But oh well. The bursty thing then isn't a
burst, it's a noisy transition between the [f] and the [t] closure. Which if you do it a few times, can get really sharp (noisy
and short?). And the fact that it's sharpest in the mid frequencies I attribute to the high-attenuating properties of the
labiality with the short front cavity formed by the closure at the alveolar ridge. TMSAISTI.
Lower-Case T
[tʰ], IPA 103 + 404
Well, here's a gap. Even ignoring the 'explanation' above, there's a gap, a release, and some fairly significant aspiration
noise. So there's a voiceless aspirated plosive here. The noise is broad band, and very loud in the very high frequencies.
Pretty classically alveolar. There's another concentration of energy in the mid frequencies, but I attribute that to the
following context....
Lower-Case D
[d], IPA 104
At last, a gap that looks like a gap and is one. There's some serious voicing going on, but it's properly attenuated, as if it
were being produced in a closed space. Woo hoo. Okay, so about place. Nice sharp release. But that's about it. It looks a
little velar, depending on whether or not you think that's velar pinch in the offset. The onset transitions don't look
velar at all. The F4 is diving, who knows why. The F3 and F2 are basically just sitting there. So it's ambiguously velar.
Which turns out to mean ambiguously not velar. If it isn't velar, it must be alveolar, since there's no way those F3
transitions on either side look bilabial. Watch me be wrong about the next set of bilabial transitions that come up...
Script A
[ɑ], IPA 305
Well, the F1 extremum occurs sort of late in the thing I've marked off as the vowel (that mark is based on the weird
amplitude/pitch thing that happens about where I've marked it, which is as arbitrary as anything else). But this is fairly
flat F1 anyway, so oh well. It's a quite low vowel. The F2 extremum (a minimum this time) occurs more or less at the
same moment, so it's a good bet that it means something. So this is very low and very back. And r-colo(u)red, judging
by the F3, if that sort of thing matters to you.
Turned R
[ɹ], IPA 151
Another low F3 deal. Not much else to say. Except look at those F2 and F3 transitions into the following stop.
Lower-Case K
[k], IPA 109
Now there's some velar pinch for you. The noise here has less amplitude than for a typical alveolar closure, and it's
organized in bands as if it were far back and exciting forward cavities. And one of those resonances is in the region of
the F2/F3 pinch, which is a pretty good indication that this is a velar.
This page last modified: 11/08/2009 22:57:31
So here's the thing. As I start writing this, I'm at gate 211 at YYZ (Pearson, Toronto) waiting to board my flight home,
hopefully in about 15 minutes. It's Sunday, 30 January, and I'm at the end of what I think was a successful conference.
But it's also late-ish on my fifth straight day of being 'on' most of the day on about 3-5 hours of sleep a night. So since I
have to put this up Tuesday morning, I'm grabbing my spare moments here to do this. I'll have to continue on the
plane, and who knows when else tomorrow. So if the text of this is more disjointed than usual, you know why.
Small Capital I
[ɪ], IPA 319
Well, picking up after the unusual 150 msec of silence at the left edge (just ask anyone, 150 msec of silence from me is
an unusual event), the first visible thing here is a vowel. Regular pulsing begins at about 150 msec and continues for
about 75 msec. The energy is quite strong and goes all the way up the (visible portion of the) spectrum. So now we
check the formant structure. The F1 is the lowest formant, and it seems to occupy the bottom half of the first 1000 Hz.
So bandwidth being bandwidth, the centre of this formant is probably well below 500 Hz. Let's say 300-400 Hz or so. The
F2 starts up not quite around 2000 Hz and falls to about 1750Hz. So this ain't an upgliding diphthong. F3 doesn't tell us
much, sitting up around 2500 Hz and F4, if you care, is right where it's supposed to be around 3500. I've never looked so
Average General American Male in my life. Anyway, we have a very low F1, so we have a quite high vowel. We have a
quite high F2, so we have a distinctly front vowel. And it's short, and glides, if anything, backward rather than forward
(i.e. the F2 indicates retreat from front to not-so-front), characteristics of front lax vowels. So this is a high, front,
lax/short vowel.
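The two rules of thumb used here (low F1 means a high vowel, high F2 means a front vowel) can be sketched as a tiny lookup. Every cutoff below is an assumption picked to agree with the rough figures quoted on this page (F1 around 300-400 Hz read as high, around 500 Hz as mid; F2 near 1750-2000 Hz as front, near 1000 Hz as back). They are not measured category boundaries and would differ from speaker to speaker.

```python
def describe_vowel(f1_hz, f2_hz):
    """Very rough height/backness labels from F1 and F2.

    All cutoffs are assumed, illustrative values for one adult male voice.
    """
    # F1 varies inversely with vowel height.
    if f1_hz < 400:
        height = "high"
    elif f1_hz <= 600:
        height = "mid"
    else:
        height = "low"

    # F2 falls with backness and/or rounding.
    if f2_hz > 1700:
        backness = "front"
    elif f2_hz >= 1300:
        backness = "central"
    else:
        backness = "back (and/or round)"
    return height, backness


print(describe_vowel(350, 1900))  # ('high', 'front')
```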
Lower-Case T
[t], IPA 103
So what we have here, starting about 225 msec and running to about 300 (and a bit further in the lower frequencies) is a
gap. An empty space in a spectrogram. A moment of relative silence. Which is usually associated with plosives. Now a
good, strong, domain-initial plosive would have a nice closure transient and a nice strong release burst. But this one has
neither, as far as I can tell. But that might just be because it's in a weak position, prosodically. So it's probably a plosive,
and probably in a coda. It's definitely voiceless, with no striations at the bottom in the 'voicing bar' we look for. So we
have a limited number of choices. (Quiz for beginners: What are the voiceless plosives in English?) So we need to look
for clues as to place. With plosives, those are usually in transitions, into or out of the closure, in the release
information, and in the top-down phonotactic knowledge that we all have such good command of. So if you look at the
transitions into the closure, F4 isn't doing much. F3 isn't doing much. F2 seems to be approaching about 1750 Hz. F1
doesn't seem to be doing much.
Brief excursus 1: There's "positive" evidence, i.e. this observation points to this conclusion, and then there's "negative"
evidence, i.e. the absence of anything that points to something different. Positive evidence is better--there are some
areas where negative evidence isn't even permissible. But this is spectrogram reading and we take what we can get.
So I'm going to guess /t/ here. It's voiceless, it's plosive, and it's consistent with an F2 transition target of around 1750
Hz, especially if you are fond of locus theory. If you're not, that isn't much evidence to go on, but there's no evidence of
velar pinch and no real evidence of labiality (in the form of lowering of all formants, or at least one other than the F2
which is ambiguous, as far as 'lowering' goes). It's also a good guess statistically and phonotactically, just because
coronals are so much more common than other plosives. The release characteristics, such as they are, are consistent
with this guess--more consistent with a guess of coronal than of anything else.
Brief excursus 2: At this point, it's useful to start looking at top-down information. So how many words can you think of
that start with /ɪt/? How many of those are likely to start an English sentence? Good.
Lower-Case S
[s], IPA 132
So there's a brief bit of friction here, about 50 msec long straddling the 300 msec mark. This could just be the release of
the preceding plosive, but the fact that it gets stronger and broader-band (involves more frequencies) at the right end
rather than at the release of the preceding plosive suggests that this is not 'just' the release of the preceding /t/. So if it
is something else, what is it? It has a broad band, i.e. the energy is distributed over a large and mostly continuous range
of frequencies, and it's strongest off the top of this spectrogram (so its peak must be above 4400 Hz, probably at least
6-8 kHz). This is typical of sibilant [s].
Schwa
[ə], IPA 322
Well, we've got here a vowel. If you notice, the F1 is just a little higher than the previous vowel, so this must be vaguely
mid. The F2 is sort of all transition, so it doesn't seem to show evidence of a 'target' of its own. That's a pretty good
indicator of a reduced vowel, i.e. something which in English you'd just transcribe with a schwa and then move on.
Which is what I'm going to do.
Lower Case W
[w], IPA 170
Well, the F2 is the real clue here. It's diving down to about 750 Hz or so, indicating something very round and/or very
back. The reduction in energy in the frequencies above 1000 Hz indicates a degree of stricture greater than for a vowel, but
since we've got something apparently sonorant and fully voiced, it must be an approximant. A nasal would have the
reduction in energy, but it would affect the low frequencies as well, to a greater degree than here. So we've got a
backish roundish approximant. The F3 is being drawn down by the coming transition, so it's not a good source of
information, but the F4 is also lowered, again suggesting rounding.
Top-down alert: At this point, we can start hypothesizing. "itswer" is an unlikely word, but 'it' is a very likely beginning
to a sentence. If 'it' is the subject of the sentence, we need to look for a third-person verb. If 's' is that, then we're
looking for some kind of locative, predicate nominal, predicate adjective, or something like that...
Lower-Case K
[k], IPA 109
Another gap, so probably another plosive. And voiceless. It has a double burst approaching 700 msec, which is most
characteristic of velars. Velars also exhibit 'pinching' of the F2 and F3 frequencies, which we also see, although between
the lowered F3 of the preceding sound and the raised F2 of the following sound, the apparent approximation of the F2
and F3 frequencies may not be the most useful cue here. Now, note that I said voiceless, but not *aspirated*. (Quiz for
beginners: What is the significance of a voiceless plosive being unaspirated in intervocalic position?)
Barred I
[ɨ], IPA 317
Another teeny short vowel that's mostly transition. This one is fronter (higher F2) than the previous one, so following
Keating et al (1994), I transcribe it as barred-i.
Lower-Case N
[n], IPA 116
So this is what I meant above by the lowered amplitude applying to all the frequencies. Somewhere around 750 msec,
two things happen--the amplitude drops off at all frequencies (with the sudden appearance of zeroes in several places)
and the formants flatten out completely. So this is a nasal. For nasals, you want to look at the frequency of the first pole
above the F1. Which would seem to be about 1500 Hz, which for my voice is consistent with the alveolar nasal. Bilabials
have that pole closer to 1000 Hz, and velars usually evidence some degree of velar pinch, which if you look at the
barred-i is not at all in evidence.
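The pole-frequency reasoning for nasals amounts to a nearest-reference lookup. The sketch below uses the two rough values quoted for this one voice (about 1000 Hz for bilabials, about 1500 Hz for alveolars); the dictionary and function names are assumptions for illustration. Velars are left out on purpose, since the cue described above for them is velar pinch rather than a pole frequency.

```python
# Approximate first-pole-above-F1 values quoted for this speaker only.
NASAL_POLE_HZ = {"bilabial": 1000.0, "alveolar": 1500.0}


def nearest_nasal_place(pole_hz):
    """Pick the place whose reference pole is closest to the observed pole."""
    return min(NASAL_POLE_HZ, key=lambda place: abs(NASAL_POLE_HZ[place] - pole_hz))


print(nearest_nasal_place(1450))  # alveolar
print(nearest_nasal_place(1050))  # bilabial
```

A pole halfway between the two references is genuinely ambiguous, and this lookup would not tell you so; transitions and top-down information have to settle those cases.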
Top-down alert: If the previous syllable is 'work', then this syllable could well be 'ing' or rather "in'". But what are the
odds? If this is so, then the phrase 'it's a working...' is plausible, but the next thing is likely to be a noun, modified by
'working'.
Turned R
[ɹ], IPA 151
There's another one of those low F3 things, this time on the periphery of a vowel rather than being one. Note the slight
attenuation of the higher frequencies characteristic of approximants as opposed to vowels.
Script A
[ɑ], IPA 305
So we've got a vowel. Abstracting away from the transitions, we want to look at the stretch between about 1050 and
1100 msec, where the F1 and F2 are 'steady', and the F3 is as stable as it gets. And during that stretch, the F1 is very
high, indicating a very low vowel. The F2 is very low, which as we said before indicates backness, rounding or both. So
we're looking for a vowel which is about as far back as we can go and as low as we can go. So we're looking in the
vicinity of the Cardinal 5.
Lower-Case K
[k], IPA 109
So the question is whether the falling F3 transition leading to this is another /r/, or if it's just pinch. I'd say it was just
transition, but I'm not sure--it seems to me that the F3 wouldn't bother to rise and be steady at all if there were a
flanking /r/. So since the gap here is followed by a nice double burst, and probably another velar, I'd say that this is all
consistent with velar pinch.
Turned R
[ɹ], IPA 151
I actually didn't intend this to be an r-ful spectrogram, but then I am the /r/ guy. Mostly I wanted to throw the
Canadians in the audience by saying pr[ɑ]gress instead of pr[o]gress.
Schwa
[ə], IPA 322
This looks to me like a schwa, although that's not what I *think* the vowel is phonemically. But whatever. The formants
seem to be "about" 500, 1500, 2500 Hz (this last at least at the end as the voicing dies out).
Lower-Case S
[s], IPA 132
So here's another fricative, quite weak, given how very long it is, but I take that to be a function of its utterance-final
position. It's broadband, and it's strongest in the highest frequencies (and best organized up there too--the low
frequencies kind of come and go, but the upper frequencies are always there). So this is another sibilant [s].
So it turns out that that middle word wasn't 'working', but two words 'work in'. Always go back and reconfirm and
retest previous hypotheses.
This page last modified: 11/08/2009 22:57:28
Okay, this is a slightly retooled version of something I actually said, the way I said it. When I realized it was a prosodic
nightmare, I decided to use it. Ha. In addition to the segmental transcription, I've included my best guess at a ToBI-style
transcription of the pitch track. I've taken some liberties with the ToBI conventions. I've sort of conflated the Break
Index Tier and the Tone Tier into a single line. Instead of an Orthographic Tier, I've just aligned the break indices to my
segmental transcription. I've skipped the additional 'miscellaneous' tier, but it might be worth having one, if only to
mention the absurd lengthening at the end of the first 'phrase'. If that's what's going on. If anyone out there is ToBI
savvy, I wouldn't mind discussing whether my interpretation is plausible. I've appended a discussion of the prosody at
the end of this page.
"could"
Break Index: 1
I've included a H* on the stressed 'could', since the pitch track indicates (relatively) high pitch here. This might be a
case of those rare %H left boundary tones, since this utterance could just as easily have started low. I didn't affricate
the /...d# #j.../ sequence here, so I decided it didn't merit a 0 BI.
"you"
Break Index: 3
The absurd lengthening of this word I attribute to it being, somewhat accidentally, at the end of a phrase. In the 'real'
utterance, this was me trying to decide whether or not to suggest an option to something else that was going on. Even
though this is a separate word (i.e. the no affrication) it doesn't seem to exhibit any evidence of a separate pitch accent
of its own. I assume the slope is just interpolation between the H* on 'could' and the L- edge tone. I think these are still
officially called 'phrase accents'. I just call them the edge-tones (or occasionally 'minus-tones'), as opposed to the star
tones (word level pitch accents) and boundary tones (marking the ends of Intonational Phrases, i.e. 'clause'- or
'utterance'-level constituents).
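The three kinds of tone named above can be told apart from the label's diacritic alone. Here is a minimal sketch; the function name and the wording of the returned strings are assumptions for illustration, and it only covers the label shapes used on this page (star tones such as H* or L*+H, minus-tones such as L-, and % boundary tones).

```python
def tone_kind(label):
    """Classify a ToBI-style tone label by its diacritic."""
    if "%" in label:
        return "boundary tone"
    if "*" in label:
        return "pitch accent (star tone)"
    if label.endswith("-"):
        return "phrase accent (edge tone)"
    raise ValueError(f"unrecognized tone label: {label!r}")


for label in ["H*", "L*+H", "L-", "H%"]:
    print(label, "->", tone_kind(label))
```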
It's been pointed out to me (thanks, Kevin) that this isn't so much a real phrase ending as it is a filled pause. The whole
flat-but-declining contour here then could simply be just unmarked mid pitch (note the high isn't as high as the later
highs and the lows get lower). Since there's obviously no contrast with other kinds of relatively unmarked intonational
stuff (e.g. low throughout) this is probably not a bad interpretation. But less fun. ;-)
"take"
Break Index: 1
As the speaker, I feel like there's a high of some kind on 'take', but as you can see, the pitch peak is displaced onto the
following syllable. And if that weren't bad enough, there's a distinct low on the 'take' syllable (it's higher than the edge
L- preceding it, suggesting baseline reset and justifying my 3BI at 'you', if the lengthening weren't enough). Since the 'a'
definitely isn't tone-bearing in the usual sense, I've decided this is a 'scooped' L*+H.
"a"
Break Index: 0
The 'a' is marked with a 0 BI, since I regard it as proclitic on the next word. It doesn't get a pitch accent (star tone) of its
own, and I definitely don't think it deserves a H* of its own, which would be the most obvious choice.
"later"
Break Index: 1
I think the current ToBI conventions call for some kind of pitch accent on every stressed lexical word unless it's
deaccented in some way. So I've marked 'later' a L* on its stressed syllable. I almost didn't, except for my interpretation
of the aforesaid convention, but I think this is the right decision independently. If there weren't a separate pitch accent
on 'later', there'd be no reason for the tone to drop so abruptly here, instead of just interpolating in a more or less
straight line to the L* on 'flight', as in 'you'.
"flight"
Break Index: 4
This gets a 4BI due to being at the end of an utterance. 4s indicate the end of an Intonational Phrase, which as I said
above are (generally) clause- or utterance-level, or have the feeling thereof. Anyway, 4s are interesting, because a) being
both phrase and utterance final, the last syllable before a 4 lengthens a lot, b) being both the end of an Intermediate
Phrase and an Intonational Phrase, they get both edge-tones and boundary-tones. And this 4 marks the end of a one-
syllable lexical word, which gets a * tone of its own. So I chose a L*L- H% sequence. There's clearly a L* of some kind on
flight, which is only to be expected in a yes-no question. And it being a yes-no question, it gets some kind of H%
boundary tone. But what kind of edge tone should go here? I've marked a L-, since the L* seems a little long. But that
means displacing the L- to the left, sort of next to the L*, instead of next to the boundary tone where it usually goes.
The other choice, L* H-H%, is a possibility, but then we'd have to say the longish L* is due to final lengthening, and I'm
just not up enough on intonation to know if the targets get long under lengthening, just like segments, or not. If
anyone who knows the E_ToBI conventions better than me wants to comment, correct, or discuss, please e-mail me.
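The requirement discussed above, that a 4 break index carries both an edge tone and a boundary tone, can be sketched as a small check on a tone string. The regular expression and the function names are assumptions written for this illustration; they handle only simple labels such as L*, L*+H, L-, H% and %H, not the full label inventory.

```python
import re

# Assumed, simplified pattern: initial boundary (%H), bitonal accents
# (L*+H, L+H*), then single tones with *, % or - as the diacritic.
TONE = re.compile(r"%[HL]|[HL]\*\+[HL]|[HL]\+[HL]\*|[HL][*%-]")


def split_tones(text):
    """Split a run of tone labels, with or without spaces, into a list."""
    return TONE.findall(text.replace(" ", ""))


def complete_for_bi4(text):
    """True if the string has both a phrase accent (-) and a final boundary tone (%)."""
    tones = split_tones(text)
    has_edge = any(tone.endswith("-") for tone in tones)
    has_boundary = any(tone.endswith("%") for tone in tones)
    return has_edge and has_boundary


print(split_tones("L*L- H%"))        # ['L*', 'L-', 'H%']
print(complete_for_bi4("L* H-H%"))   # True
print(complete_for_bi4("L* L-"))     # False
```

Both candidate analyses of 'flight' pass this check, which is the point: the check says nothing about which edge tone is the right one.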
This page last modified: 11/08/2009 22:57:11
Linguistics Department
University of Manitoba
Winnipeg, Manitoba
CANADA R3T 5V5
To properly view the phonetic symbols in the text below, you must have installed either SILDoulos IPA93 or Lucida Sans
Unicode. If you are desperate to see phonetic symbols in SIL Sophia or SIL Manuscript, or some other kind of Unicode,
you may volunteer to teach me how to do 'family' font calls, in the (now deprecated) FONT tag. Or teach me how to do it
with a cascading style sheet.
August 2002
Solution for May 2002
Note: Okay, this month, we're kissing my GIFs goodbye. Nobody likes them, and as long as a) browsers standardly support client-side
fonts and Unicode, and b) the relevant client-side fonts are free, I'm not going to miss them. SIL Doulos also is having trouble in some
browsers (or vice versa) with character spacing and overstrikes, so Unicode seems to be pulling into the lead. So this month it's Pullum
& Ladusaw labels, IPA identifiers, SIL Doulos, Lucida Sans Unicode. You can download Lucida Sans Unicode from John Wells, and
you can download the SIL font(s) from SIL.
Solution for March 2002
Nice high F1, meaning a low vowel. Not particularly front, but not particularly back either. The lowest not-obviously-back vowel available is
/æ/.
[ɹ], TURNED R
Here's the thing about approximants--they're articulated similarly to vowels, but they behave like consonants in a string.
So this is obviously sonorant and voiced, having formants (resonances) and striations. Note the transitions in the
following vowel, which show you that the F3 in the vowel is continuous with the very very low F3 in the consonant. The
low F1 doesn't tell you much, the extremely low F2 might tell you back and round, but that F3 in the middle-F2 range is a
dead give-away for approximant /r/ in American English. (I respectfully remind you all that even though I'm in Canada
now, my dialect is definitely US, and western US at that.)
[æ], ASH
(I prefer 'AE LIGATURE' to 'ASH', but who am I to argue with Pullum and Ladusaw?) (Does anybody remember what a
runic AESC actually looks like?) Okay, the F1 in the latter part of this vowel (the steadier part, relatively unaffected by
the transition from the preceding sound) is highish, suggesting a lowish vowel. The F2 is sort of ambivalent, and the F3 is
a little low. So we're looking at a mid-to-low vowel, of a non-back variety. For me, that can only be [æ], ASH, or
[ɛ], EPSILON. It'll turn out to be one or the other.
Woo, a rough one. Very. Things will get easier in January. Probably.
Lower-case M
[m], IPA 114
From about 50-150 msec there's strong voicing, a weak, flat formant at 1200 Hz, another around 2600 Hz and one higher
than that too, but nothing in between. So something with that kind of weak resonance/zero structure, and flat, has to
be some kind of a nasal. The pole at 1200 is lower than I'd usually get for an alveolar, though a bit higher than I'd
normally get for a bilabial. But the transition in the following vowel is in no way alveolar-looking, so there you go.
Probably in initial position my tongue wasn't as low as it might have been, effectively shortening the side cavity. Think
about it.
Lower-case O
[o], IPA 307
F1 at about 500 Hz, F2 just above 1000 Hz. Don't ask me why the F3 is high. But this is something that's got a mid-vowel F1 and a
back/round vowel F2. And it's basically flat, rather than obviously diphthongized.
Lower-case S
[s], IPA 132
Not the strongest I've seen, but whatever. Probably the effect of the syllabic position (coda, but not final). A single
broad band of noise, centered fairly high, and without a sharp drop off in the low frequencies. Don't be distracted by
the weak perseverative voicing.
Lower-case T
[t], IPA 103
Short gap, followed by [s]-shaped release noise (pulled down a bit in frequency by coarticulation with the following
sound). So alveolar [t] has [s]-shaped release noise. Discuss.
Turned R
[ɹ], IPA 151
Hard to tell, but that's F3 just above the F2. That thing up around 3000 Hz is just too high to be F3. So the F2 starts about
1100 Hz and rises, and F3 starts around 1400 Hz and rises. F3's that low can only be rhotic.
Barred I
[ɨ], IPA 317
On the other hand, this short vowel is too short to worry about. To the degree that it's not just 'more' /r/, I had to
transcribe it as something, and following the F2-closer-to-F3 rule from Keating et al (1994), I chose barred-i, although
with an F3 that low it can't help but be close to the F2. I didn't use a rhoticity hook on it, since that mostly implies a
following /r/ (although there's no reason why it should). Moving on.
Lower Case W
[w], IPA 170
The thing about a [w] is that it's all transition. The noise that is visible in the low frequencies during the VOT is
supported by F1 and F2, which you can see rise sharply once the voicing kicks in. Note the 'straight' F2 transition,
typical of English (onset) [w]. Also note the F3 starts a little low, but by no means as low as an [r].
Epsilon
[ɛ], IPA 303
Well, the F3 is definitely low, but in my head there's a separate vowel here. Maybe these should be transcribed as
diphthongs. But anyway, the F1 is middish, the F2 is very slightly front. And there's an /r/ coming up, so there's a
neutralization here anyway.
Turned R
[ɹ], IPA 151
Well, there it is.
Lower-case F
[f], IPA 128
Very slight frication, fairly broad band, unshaped by any resonances, and strongest, at least at the beginning, in the
very low frequencies. No way this can be sibilant. The F2/F3 transitions are consistent with a labial, but that could just
be the [r]. So if this turned out to be an interdental, I wouldn't be particularly surprised, although phonotactically it
would be odd.
Lower-case M
[m], IPA 114
Nice voicing bar, but F1 is either 'gone' or so low it's in the voicing bar. Zero below 1000 Hz, nice little pole about 1000
Hz, more zero, then another pole in the neutral F3 range. Look familiar? It should. Flat, resonant, with zeroes, must be a
nasal. The 1000 Hz pole is pretty classically bilabial, and the F2 transition in the following vowel can't really be anything
but bilabial also. So this one is pretty clear.
Ash
[æ], IPA 325
So here we have another mid-to-low vowel (highish F1), a frontish but not completely convincingly front F2 moving, if
anything, toward neutral. Really can only be [ɛ] or ash.
Lower-case N
[n], IPA 116
Short enough to be a nasal flap, I guess. The nasal here explains the fuzziness of the F1 in the preceding vowel. Note the
zero, and the pole. Note also that even though the pole 'looks' like it's in the bilabial region, there's just a trace of
something right at 1500 Hz! Woo hoo, because that's the only thing about this that makes it look alveolar. That and the
flappiness, but I've been known to produce very flappy bilabials (no comments from the peanut gallery, please).
Schwa
[ə], IPA 322
Short vowel. Don't want to belabor it. Note the offset frequency of F2, near the 'locus' for alveolar transitions.
Lower-case T
[t], IPA 103
Which suggests that this gap is alveolar. Or close.
Yogh
[ʒ], IPA 135
Broadish band of voiceless noise, sharp energy drop off below F2 and concentrated in the F3 region. Too low to be [s]
noise, so must be postalveolar.
Lower-case M
[m], IPA 114
Well, turns out there must be something here, or there's no real reason for the F2 to 'dip' into the silence the way it
does (pointing down in the fricative and up in the vowel). I mean, if there were just the fricative, or just aspiration,
there'd be no reason for F2 to do anything except transition. So there's something here, perhaps weakly voiced. Could
be an approximant, but then it would have to be /r/, since it looks like F3 is low. But knowing what I do, I'll choose to
ignore that... Hindsight. So the weakness might be nasality, in which case, the F2 transitions look decidedly labial. But
this is hindsight too. Sorry.
Schwa
[ə], IPA 322
Vowel. Lengthened in a final syllable, but weak, very low pitched, and not really 'long'. So final lengthening
notwithstanding, unstressed and reduced.
Lower-case N
[n], IPA 116
But this last syllable, if reduced, is just too long to be just a vowel and a stop. So there must be something here.
Something weak. And potentially devoiced. But I have no idea how I'd tell what it is since there's not a lot of
information available.
Lower-case T
[t], IPA 103
So there's something that looks like a double burst in the low frequencies (just at 2100 msec), but there's nothing else
until you get up to sibilant frequencies. So on the balance this is probably a weak alveolar burst.
So putting it all together, you get "most require careful man()ch()t", where the ()s indicate some kind of vowel/syllable
affair. Shouldn't be too rough to come up with something plausible, and fill in the features later.
Last modified: 11/08/2009 22:57:46
Or maybe that's "mushrooms are inedible fungus", now that I look at it again. Discuss.
Lower-case M
[m], IPA 114
Wow, more than 150 msec of nice flat nasal. Okay, so from about 50 msec to about 200 msec there's a nice sonorant (i.e.
nice, striated voicing bar, and resonances all the way up). Almost definitely a nasal, because of a) the zeroes at about
800, 1800, and 3000, b) the flat formant structure, and c) the sharp discontinuities with the following vowel (clearly a
vowel, since it's also obviously sonorant, and of higher overall amplitude, no zeroes, and transitiony-looking
transitions). So anyway, if it's a nasal, it must be [m], since the pole (formant) is at 1000-1100 Hz, which is typical of my
bilabial nasals. The transitions in the following vowel are also consistent with that, but are so short it's hard to tell.
Turned A
[ɐ], IPA 324
You may be wondering what happened to everybody's favo(u)rite vowel, [ʌ]. Well, I've been thinking about my
commitment to the IPA, and this has been bugging me. The IPA defines [ʌ] as a lower-mid, unround, back vowel,
Cardinal 14, the unround counterpart to [ɔ]. This is the vowel in 'hut', 'tuck', and especially STRUT, if you're into Wells's
lexical sets. It and [ʊ] are reflexes of ME short /u/ (and short(ened long) /o/). But enough of the history lesson. It's not
round. In general North American English, it's not amazingly back, and in Western American and Canadian English, it's
downright frontish. Not as front as front [æ], so let's split the difference and call it central. Which is not controversial.
But the symbol for a lowish/lower-mid central vowel is turned-a, [ɐ], not turned-v, [ʌ]. So are we transcribing
phonetically, using the IPA, or aren't we. I've decided we are. So here it is. Now, that said, this vowel looks like an [ɑ] or
even an [ɒ], but it's outrageously short. Which I suppose makes it look back again. So maybe I'm still a hypocrite.
Esh
[ʃ], IPA 134
That falling peak between 300-400 msec (falling from 2500 to 1500 Hz) is a bit worrisome, but it'll turn out all right. So
ignore its movement. We've got something that looks like a voiceless fricative, quite strong and sibilant, but with a
peak in the lower frequencies (in the F2-F3 range) rather than a lone peak way the heck off the top of the spectrogram.
So this is probably post-alveolar. Following that, there's a sharp drop-off in amplitude below 1000 Hz, which is more
typical of postalveolar than alveolar sibilants. The sloping peak is probably indicating some kind of transition....
Turned R + Under-ring
[ɹ̥], IPA 151 + 402
Well, the loss of high-frequency and high-amplitude energy (i.e. sibilance) suggests that this is something else. The F1 is
invisible, since it's still coarticulating with the zero (or whatever it is) that zaps the low frequencies of the fricative. F2
and F3 are clearly visible in the noise (and contiguous with the F2/F3 of the vowel), and lo and behold, look at what that
peak in the fricative seems to be--something that follows the resonances from about 2500 Hz (what you might call the
neutral frequency of F3, or near enough), down to about 1600 Hz, which looks like the frequency of the F3. Which is
plenty low enough for an [ɹ]. But voiceless.
Barred U
[ʉ], IPA 318
I know this is round because I remember it being round at the time I recorded it and I spent time working out why.
Between the tendency of [ʃ] and [ɹ] to labialize and the following bilabial (I don't want to get ahead of myself, but there I
go), I think this vowel just tends to get rounded. A little. But while this vowel is round, the F2 isn't really that low,
compared to the F1. So again, let's split the difference and call it roundish, but not back, or backish, but not round, or
just throw up our hands and say central. So that's what I did. F1 is a little high for something I think of as a high vowel,
making this look more mid, but hey, I pick my moments of IPA precision. I guess.
Lower-case M
[m], IPA 114
Shorter, and with considerably less energy than the earlier nasal, this still looks like a nasal. But, well, shorter, and with
considerably less energy. But the transitions are consistent with bilabial, and to the degree that we can see any energy
at all in the resonances, there might be a pole at about 1000-1100 Hz.
Lower-case Z
[z], IPA 133
Well, so this is a sibilant, with that high-frequency, high-amplitude peak. And it's the only peak, so we're looking at an
alveolar rather than a postalveolar. And I thought it was voiced when I did the figure but I'm now not sure that couple of
pulses at the beginning should count. But maybe it does. And it's shorter than a voiceless sibilant probably would be,
and weaker, sort of, both of which correlate with an 'underlying' voiced fricative. Okay, I'm just a complete hypocrite.
But I really think we should be using turned-a for the STRUT vowel.
Script A
[ɑ], IPA 305
The harmonics are getting in the way of this vowel, but I take the F1 to be the sort of peakish thing at about 900 Hz (as
opposed to the one at 500 Hz), and the F2 would be the one at about 1200 Hz. Ignore the diving F3 for the
moment. So we've got a very, very low vowel with a mostly back tongue position. Unless we have a mid vowel, but I
don't think we do.
Turned R
[ɹ], IPA 151
And here's what we do with the diving F3. Moving on.
Barred I
[ɨ], IPA 317
On the other hand, there's a transition after the F3 minimum that is a little long to be just a transition. So I've shoved in
an unstressed vowel. Moving on.
Lower-case N
[n], IPA 116
Well, there's something going on here. Fully voiced, but too weak to support any higher resonances. But too strong in
the voicing bar to be an obstruent. So some kind of unknown sonorant. Probably nasal, judging by the sudden loss of
energy from the preceding vowel. It just looks like an edge, of the kind nasals have but oral sonorants don't. The F3
transition (into it) is hard to read, since it starts so low it has no place to go but up. But assuming it's going up, the F2
isn't really pinching into it, nor is it obviously dropping bilabial-wise. So maybe this is an alveolar nasal. It would be
nice if we could see some resonance around 1500 (or anywhere 1300-1500) Hz, but you can't have everything....
Glottal Stop
[ʔ], IPA 113
Well, not so much a stop, as a creakiness at the end of the nasal and into the following vowel, but that's as close as we
usually see in my voice.
Epsilon
[ɛ], IPA 303
So the vowel looks like it's short and transitional, mostly in F2, but there's shorter coming, and it's unlikely they're
both completely stressless. So if we have to choose, let's look. The F1 is basically mid, although it's moving from slightly
higher to slightly lower, so it's moving from lower to higher in the mid-range. The F2 is also in the central range, but
moving frontish (slightly) to backish (slightly). F3 is just neutral. So this is a middish, possibly lower-mid-ish vowel,
moving from frontish to centralish. Which is about all you can say.
Lower-case D
[d], IPA 104
Well, clearly voiced. Not really resonant, except for some mush in the upper formants. Could be a flappy type thing, but
is a little long, or a shortish stop. I went back and forth and decided on the stop. No pinch, no serious labial transitions,
so probably alveolar.
Schwa
[ə], IPA 322
Short little vowel, the F2 clearly all transition. Moving on.
Lower-case B
[b], IPA 102
Again, a voiced stop, this one even plosive-y-er than the other one, and sufficiently long to not really be a question. As
to place, there's not a lot of information. The F2 transition could be labial (it couldn't be much else), but it's also
consistent with just a vowel-to-vowel transition (check the formants of the following vowel). So who knows. Not velar.
Probably not alveolar. But that's a guess.
Lower-case F
[f], IPA 128
A non-sibilant fricative. Voiceless, and with no formant-like shaping. So this has to be labiodental or dental. Hard to
tell, but the transitions in the following vowel are more labial-looking than anything else. I guess.
Turned A
[ɐ], IPA 324
Ick. Okay, for the record, we're looking at the sort of fuzzy-formanted thing that's mostly transition, from about 1450
msec to where the F2 (or whatever it is) leaves off at about 1525 msec. F1 (not to be confused with the strong harmonic
over the voicing bar) is that thing that starts at about 600 and rises, sort of to about 1000 Hz, maybe. The F2 starts really
low as well, let's say 900 Hz, and rises to about 1500 or so. F3 is lower than it was, and more or less flat, but it gets
fuzzier as it progresses. Okay, so the weakness in F1 and the increasing fuzziness in F1 (and the increasing weakness of
the inter-formant energy, and ultimately the formants as well) suggests increasing nasality. Just something to file away
for another segment. F1 is mid-to-high, so we're dealing with a lower-mid-ish kind of vowel. This one is sort of back as
well, but I'm sticking to my guns on this one, at least for this spectrogram. Lowish central-to-backish vowel.
Eng
[ŋ], IPA 119
Backing perhaps helped along by coarticulation with a following velar, which is what this is. It's a nasal, and the only
real resonance is sort of in F3. But more important than that, there's a bit of a gap, with definitely velar transitions
following, and English nasal-place-assimilation being what it is, I'd say this was a velar nasal.
Lower-case G
[ɡ], IPA 110
That's assuming I can convince myself that there really is a gap here. Homorganic stops following nasals tend to be very
short, in terms of their apparent oral-plosive component, so I'd take this little bit of low-energy voicing around 1600
msec to be sufficient evidence of a plosive. And as I said, the transitions in the vowel can only be velar.
Schwa
[ə], IPA 322
Which leads us to the vowel, which if it ain't a 'real' vowel, should be transcribed as a barred-i, following Keating et al
(1994) as I do. But if it isn't a reduced vowel, what is it? Well, it's definitely mid. And definitely central or front of
central. And the F3 is a little low, but that again may just be coarticulation with the velar (transitional). So something
schwa-like or epsilon-like, or somewhere in there....
Lower-case S
[s], IPA 132
Oooh, this is weak for a sibilant, but it definitely has that centered-off-the-top, broad band 'shape' of a sibilant
spectrum. Final weakening lives, I guess. Even though it seems to have that postalveolar low zero, it doesn't have the
lower (F2-F3) peak. So this has to be [s].
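The [s]-versus-[ʃ] reasoning used in these solutions (a lone peak off the top of the spectrogram versus an extra peak down in the F2-F3 range) can be sketched as a rough rule of thumb. This is only an illustration: the function name and the band limits are assumptions chosen to bracket the 1500-2500 Hz peak described for the esh, not measured values.

```python
def sibilant_place(peaks_hz, f2_f3_band=(1500.0, 3500.0)):
    """Guess sibilant place from the frequencies of its noise peaks.

    peaks_hz: frequencies (Hz) of the visible energy peaks in the frication.
    f2_f3_band: hypothetical limits of the 'F2-F3 range'; an assumption.
    """
    low, high = f2_f3_band
    # A peak down in the F2-F3 range points to a postalveolar like [ʃ];
    # a single peak centered up high points to alveolar [s].
    has_low_peak = any(low <= p <= high for p in peaks_hz)
    return "postalveolar" if has_low_peak else "alveolar"


print(sibilant_place([2500.0, 5000.0]))  # peak in the F2-F3 range
print(sibilant_place([6000.0]))          # one peak, way up high
```

A real decision would also weigh amplitude and the low-frequency drop-off, which this sketch ignores.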
Last modified: 11/08/2009 22:57:45
Eth
[ð], IPA 131
Unfortunately, there's a sharpish release looking thing in this, followed by some voicing and frication. There's a little bit
of noise, suggesting prevoicing, down at the bottom before the first 'pulse' thing, but not so much that it really tells us
there must be something going on before this. But what we can see of the voiced area is weaker than the following
vowel (as short as it is) and noisy, so it's a fricative. Voiced. And not sonorant, what with the formant structure showing
through. So that only leaves a couple of possibilities, and only one is likely to look like a stop in initial position.
Schwa
[ə], IPA 322
So the first full period of this vowel comes on at about 125 msec, and the last one is about three or four pulses later. So
this vowel is absurdly short. What do we always say about absurdly short vowels? They're reduced. Mark them as some
kind of reduced vowel, and move on.
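Counting pulses is a quick way to put a number on "absurdly short". A minimal sketch, assuming a steady fundamental frequency: the 100 Hz pitch and the 50 msec cutoff below are both hypothetical values for illustration, not figures from the spectrogram.

```python
def duration_from_pulses_ms(n_pulses, f0_hz):
    """Approximate duration (msec) of n glottal pulses at a steady f0."""
    return 1000.0 * n_pulses / f0_hz


def looks_reduced(duration_ms, threshold_ms=50.0):
    """True if a vowel is short enough to treat as reduced (assumed cutoff)."""
    return duration_ms < threshold_ms


# Four pulses at an assumed 100 Hz pitch last about 40 msec.
d = duration_from_pulses_ms(4, 100.0)
print(d, looks_reduced(d))
```

At a higher pitch the same pulse count covers even less time, so the call only gets easier.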
Lower-case S
[s], IPA 132
Nice long fricative from at least 175 msec to 250 msec. Broad band (no formant-like banding, just one big band)
apparently centered off the top of the spectrogram. Typical for [s].
Lower-case T
[t], IPA 103
The gap and release burst here are nicely indicative of a plosive. The release noise is sibilant looking (high amplitude
and broadband) typical of an alveolar release. The formant transitions in the following vowel are consistent with that.
But the short VOT means this is unaspirated.
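The VOT argument can be made explicit: measure from the release burst to the first voicing pulse and compare against a long-lag cutoff. In this sketch the 30 msec cutoff and the example times are assumptions for illustration only.

```python
def voice_onset_time_ms(burst_ms, voicing_onset_ms):
    """VOT = time of voicing onset minus time of the release burst."""
    return voicing_onset_ms - burst_ms


def aspiration(vot_ms, long_lag_ms=30.0):
    """Label a plosive by its VOT; the long-lag cutoff is an assumed value."""
    if vot_ms < 0:
        return "prevoiced"
    return "aspirated" if vot_ms >= long_lag_ms else "unaspirated"


# Hypothetical measurements: burst at 380 msec, voicing at 390 msec.
print(aspiration(voice_onset_time_ms(380.0, 390.0)))
```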
Lower-case O
[o], IPA 307
Ignoring the F2 transition, let's pick up this vowel around 400 msec. It seems to go on to about 550 msec, which is when
the F2 starts to change and the F3 hits its minimum. So that's where I marked the end of the segment. F1 is about 500,
so mid-ish, F2 is about 1000 Hz, so backish or roundish.
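The F1-means-height, F2-means-backness reading used for the vowels can be sketched as a lookup. The cutoffs below are assumptions loosely based on the values quoted in these solutions (F1 around 500 Hz as mid, 800-900 Hz as low; F2 around 1000 Hz as back/round, 1500 Hz as neutral, above 2000 Hz as front). They would need retuning for any other voice.

```python
def describe_vowel(f1_hz, f2_hz):
    """Map rough F1/F2 values (Hz) to (height, backness) labels.

    All thresholds are hypothetical, chosen for one speaker's voice.
    """
    if f1_hz < 400:
        height = "high"
    elif f1_hz <= 650:
        height = "mid"
    else:
        height = "low"

    if f2_hz < 1300:
        backness = "back/round"
    elif f2_hz <= 1800:
        backness = "central"
    else:
        backness = "front"
    return height, backness


print(describe_vowel(500, 1000))  # the [o]-like vowel described above
print(describe_vowel(300, 2400))  # an [i]-like vowel
```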
Turned R
[ɹ], IPA 151
The F3 here is at about 1700 Hz. Such a low F3 can only be an [ɹ].
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
So the F1 hasn't really moved from the preceding two segments, and the F2 is roughly back to where it was, but heading
down. Whether this is really diphthongization or just the transition into the following gap, I have no idea. But since
people seem to like their diphthongs....
Lower-case P
[p], IPA 101
Well, as I suggested before, the drop in the preceding F2 could be interpreted as a labial transition. The F3 transition
could also be interpreted that way, but that might just be wishful thinking on my part. Similarly, the vowel on the other
side is too short to provide much in the way of transitional information. So let's see, what else could we use. Well, the
release burst is sort of mushy, so it's probably not coronal. And the concentration of energy seems to be in F1 and F2,
rather than F2 and F3, so again that might tell us labial. I think that noise at the bottom is just noise, but if you
interpreted it as voicing I guess I couldn't fault you. But it would lead you down a garden path....
Schwa
[ə], IPA 322
Barely three pulses of vowel. Look at the F1 die out. Reduced. Moving on.
Lower-case N
[n], IPA 116
Notice how the apparent F1 (around 500 Hz) is suppressed, but the voicing bar below it still looks nice and strong. That's
typical of nasals. Full voicing bar with suppressed upper frequencies. There's a resonance around 1300 Hz or so, and a
fairly strong one around 2500 Hz. No evidence of velar pinch on either side, and the pole around 1300 is far enough
away from the 1000 Hz I usually expect for bilabial nasals, so I'd say this was alveolar.
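The pole-frequency argument for nasal place can be sketched as below. The ranges come from the figures quoted in these solutions for this one voice (about 1000-1100 Hz for bilabials, about 1300-1500 Hz for alveolars). Treating everything in between as ambiguous is an assumption, and velars are left out entirely.

```python
def nasal_place(pole_hz):
    """Guess nasal place from the first pole above the voicing bar (Hz)."""
    if pole_hz <= 1100:
        return "bilabial"
    if pole_hz >= 1300:
        return "alveolar"
    # In between, the pole alone doesn't decide; check the transitions.
    return "ambiguous"


for pole in (1000, 1200, 1300):
    print(pole, nasal_place(pole))
```

The ambiguous band is where the transitions in the neighboring vowels have to do the work, as with the 1200 Hz pole discussed earlier.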
Lower-case Z
[z], IPA 133
Now around 850 msec or so, the voicing bar loses energy, but keeps its striated quality. So whatever this is, it's voiced.
And the loss of energy suggests an obstruent, i.e. something that doesn't resonate easily. The noise at the top of the
spectrogram looks sibilant, at least band-wise and frequency-wise, so this is probably a [z].
Lower-case T
[t], IPA 103
Short little gap, from about 900 msec to about 950 msec or so. The burst is a little mushy again, but it's obviously
centered up high (in the 3500-4000 Hz range) which is high enough to be an alveolar release. The F2 and F3 transitions
are consistent with that, no diving into the gap and no pinchiness.
Tilde L (Dark L)
[ɫ], IPA 209
So it's worth noticing that around 1100 msec the F2 reaches a minimum and sort of loses cohesion. The F1/voicing bar
also sort of dies off, but slowly, but clearly clicks back on just before 1200 msec, which is the same moment that the F2
comes back. So taking F2 off/on as the edges of 'something', we can see that the energy above is also suppressed (and the
formants more diffuse) for that same duration. So this is a thing. The lessened energy suggests some closure
somewhere, but the presence of low frequency energy (between F1 and F2) suggests an oral sonorant rather than a
nasal (which should have a zero somewhere in there). So the F1 appears to be in the middish-highish range (that is, it's
500 Hz or below). The F2 is a little back (below 1500 Hz). The F3 is a little raised, which is usually indicative of a lateral. So
we've got a darkish /l/. Yay.
Lower-case I
[i], IPA 301
Remember what I said about an F2 around 2200 Hz? That would be useful to remember here. The apparent zero betwixt
F1 and F2 is probably just due to the very widely spaced formants, and the overall lower amplitude of this vowel
compared to others. The F2 transition which makes it look like an [eI] is just a transition from the low F2 of the dark /l/.
Exactly how you would tell the difference, I don't know. Maybe the F1. If we could be sure where it was...
Ash
[æ], IPA 325
But then there's the transition after the F2 peak, and this is just too long to be just a transition. So it's gotta be another
vowel. But what kind of vowel? The F1 is still hovering around mid, and the F2 is still basically front, if not amazingly
front. So this could be another /e/, or even lower. I think the height is coarticulatory, i.e. with a high vowel in hiatus it
doesn't actually get low. Sure felt low when I did it. But definitely front. Hmm. Lucky for us this is a function word....
Glottal Stop
[ʔ], IPA 113
See how towards the end here the F1 sort of dies, the pulses in the upper frequencies seem to come every other
striation in the voicing bar. That's shimmer, folks, a reflex of glottalization, which suggests a) a syllable-final plosive, b)
probably [t]. But this is creak, so we'll call it a glottal stop.
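The every-other-pulse pattern can be quantified as cycle-to-cycle amplitude variation. The sketch below uses one common definition of local shimmer (mean absolute difference between consecutive pulse amplitudes, divided by the mean amplitude). Using this measure in place of eyeballing the striations is an assumption, and the amplitude values are made up for illustration.

```python
def local_shimmer(amplitudes):
    """Mean absolute difference between consecutive pulse amplitudes,
    relative to the mean amplitude."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two pulses")
    diffs = [abs(a - b) for a, b in zip(amplitudes, amplitudes[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_amp = sum(amplitudes) / len(amplitudes)
    return mean_diff / mean_amp


steady = [1.0, 1.0, 1.0, 1.0]       # modal voicing: every pulse alike
alternating = [1.0, 0.5, 1.0, 0.5]  # strong/weak alternation, as in creak
print(local_shimmer(steady), local_shimmer(alternating))
```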
Epsilon
[ɛ], IPA 303
Teeny short voiced vowel, and according to accepted rules we should just regard this as reduced. But if it weren't
reduced, what would it be? Well the F1 is sort of in the midrange, or maybe higher, so this is mid-to-lowish kind of
vowel, but the F2 is definitely higher than neutral, i.e. telling us this vowel is more front than anything else. So a mid-
to-low front vowel of some kind. Hmm.
Fish-Hook R + Tilde
[ɾ̃], IPA 124 + 424
Haven't had one of these for a while. This, folks, is a nasalized flap, such as you almost only get in North American
English. The usual flap is a super-short plosive thing, so it should look like a short gap, or at best a little noise where
you're expecting a gap. This looks like a sonorant. It's fully voiced, if slightly reduced amplitude. See how it has 'edges'
like a classic nasal stop, but it's so short? See how it has a zero-ey thing around 1000 Hz and a pole-like thing at about
1400 Hz or so? See how the resonances are flat? See how the upper frequencies are vastly lowered amplitude? Looks like
a nasal. An [n] in fact. But it's so bleeping short! Flap, folks. Or tap. Whichever. But nasal.
Epsilon
[ɛ], IPA 303
Well once again, we're faced with something that's too long to 'just' be a transition. So what is it? Who knows where F1
is? Could be around 700-750 Hz. Or I guess it could be somewhere else, but for lack of a better idea, let's suppose this is
the F1 of a mid-to-lowish vowel of some kind. The F2 is strongest as it approaches the midrange (1500 Hz or so) from
above, and then it starts to lose some integrity. Also around 1850 msec, the F1 does something odd. So those last 50-75 msec
or so (approaching 1900 msec) are probably more 'transitional' than the rest of it at least. So if we take everything
before that as non-transitional, we've got something that seems to be vaguely frontish. Not wildly frontish, but vaguely.
So frontish and not higher than mid. Hmm. And there's that fuzzy F1 again.
Lower-case M
[m], IPA 114
A-ha! I hear you cry! A final nasal! Flat resonances, full voicing, overall lessened amplitude, and a nice clear zero
between the voicing bar and the first pole. The first pole is just above 1000 Hz, which puts it closer to my [m] range
than any previous nasal in the spectrogram. And the final nasal explains the fuzziness of the F1. There's a zero creeping
in to the resonances, which is broadening the bandwidth of F1. Whence the fuzzies! Don't you love it when things come
together like that?
Last modified: 11/08/2009 22:57:44
I like this spectrogram because you have to separate your attention to the formant and the voicing bar. Just a hint.
Lower-Case S
[s], IPA 132
Well, this is a decent sibilant. It's got more resonance structure than I prefer, I don't know what was going on in my
mouth that day. But the strongest bit of energy is wa-a-ay up off the top, which is a good cue for [s]. The amplitude
(darkness) is consistent with sibilance, and the broad band (ignoring the resonant structure) is typical as well.
Small Capital I
[ɪ], IPA 319
This is so short, it probably should be treated as reduced, but you can sort of tell that it's the local pitch peak, which
suggests that it's stressed. So whatever. The F1 is low of 'mid', the F2 is, well, moving. Part of the problem is that the
following sound is throwing off the expected acoustics of this vowel. So whatever. Fill in the features later, I guess.
Tilde L (Dark L)
[ɫ], IPA 209
I've been doing a lot of these lately. The overall amplitude here is slightly less than the surrounding vowels, although
not by much. The F1 fuzzes out a little. The F2 hits a minimum of about 1100 Hz at about 325 msec. F3 is raised to about
2800 Hz. Oooh. Raised F3 almost always means lateral. The lowish F2 is consistent with a) the dark /l/, and b) the
surrounding front vowels.
Lower-Case I
[i], IPA 301
There are still transcription guides that insist that unstressed -y as in 'city' is always [ɪ]. It ain't. Transcribe what you
hear, not what you have been told to transcribe. Okay, so this again is a relatively high vowel (F1 is lower than 500 Hz.
Much lower, actually.), F2 is way high up above 2000 Hz. Can really only be [i].
Lower-Case I
[i], IPA 301
Well, we still have our low F1, in fact possibly the lowest of the entire spectrogram, F2 is close to 2400-2500 Hz, which is
about as high as I've ever seen it. F3 is sort of pushed out of the way. So this has to be [i].
Lower-Case P
[p], IPA 101
Another gap. With falling transitions into it and for the most part rising transitions out of it. At least F3 and F4. Also you
can sort of see a burst that is stronger in the low frequencies than the high frequencies. That's also a cue for bilabial,
sometimes. No aspiration this time, tho.
Lower Case W
[w], IPA 170
So from 900 msec for about 50 msec, we've got a serious reduction in aperture, resulting in suppression of acoustic
energy. Actually, it starts earlier than that, but you can see it really kills the first and second formant in this section.
With a low F1/F2 like this, it really can only be either dark /l/ or /w/. Given that the 'peak' of the F3 movement looks
like it's here rather than before, you might have found this string to be [pol] and not [plw], but then you would have
been mistaken. One way or another. There's not a lot about this that looks particularly [w] like (relative to a dark /l/)
except for its extremity of F1/F2 lowering.
Script A
[ɑ], IPA 305
It's always hard to decide what to do with moving formants, but here goes. I usually ignore the first and last fifth or so,
so you're really only looking at the middle 2/3s or so of the vowel (do the math yourself, if you care). This allows you to
ignore the obvious effect of very local transitions. With something like this, that doesn't quite do it, so we'll have to
move on. F1 starts (absent the worst of the transition) in roughly mid position, and rises to very high, up around 900 Hz.
So this vowel mostly occupies the lower part of the vowel space. The low starting frequency is attributable to
coarticulation with the preceding [w], so ignoring the last bit of the transition, the 'target' here seems to be around 800
Hz or so. The F2 again starts absurdly low due to coarticulation, but kind of levels out around 1000 Hz. So we've got
something with a lowish quality, and very back and/or round. This being my voice there's only one vowel back there,
really.
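The ignore-the-edges habit described above is easy to sketch: drop a fifth of the formant track from each end and summarize what's left. Trimming a fifth per side actually leaves the middle three fifths, which is close to the "middle 2/3s or so". Using the median as the summary, and the sample track itself, are assumptions for illustration.

```python
from statistics import median


def steady_state(track_hz, trim_fraction=0.2):
    """Median of a formant track after trimming each end.

    track_hz: formant measurements (Hz) at evenly spaced times in the vowel.
    trim_fraction: share dropped from EACH end (a fifth, per the text).
    """
    n = len(track_hz)
    k = int(n * trim_fraction)
    middle = track_hz[k:n - k]
    if not middle:
        raise ValueError("track too short to trim")
    return median(middle)


# Hypothetical F1 track: starts low after a [w], settles near 800 Hz.
f1_track = [400, 500, 700, 780, 800, 800, 820, 800, 850, 900]
print(steady_state(f1_track))
```

As the text says, trimming alone doesn't always remove a long transition; here the 700 Hz sample survives the trim, which is one reason to prefer the median over the mean.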
Lower-case K
[k], IPA 109
Well, we've got a gap, from just before 1100 to about 1150 msec, with some bursty releasey stuff following up to about
1200 msec. The transitions have a falling F3 but a flattish F2 (and F4, if it comes to that). So the falling F3 might say
bilabial, but then we'd expect to see more falling formants, especially in F1 and F2. So this is probably velar. There's no
reason for a coronal plosive to have a falling F3 like that, and while not strictly 'pinch'y, it's as close as we're going to
get. The strong noise in the F2 range is also consistent with velar release, although the higher frequency noise (in F4) is
distracting, I admit.
Lower-case O
[o], IPA 307
So here again we have a middish F1, a quite low F2 and a fairly high F3. The fundamental is lower here, so the formants
are all a little broader, but this looks a lot like the previous dark /l/. But of course it isn't. Not sure why the F3 is so
consistently high and flat here. Note how flat this is. Not really diphthongy at all, at least until you get to the last few
pulses. There's no reason for the F2 to drop like that unless something was going on, but for the most part this vowel is
pretty flat.
Lower-case S
[s], IPA 132
Now this is a decent looking [s]. Its length is attributable to final lengthening, and its relative lack of amplitude is also
consistent with being at the end of utterance. It's still pretty strong as fricatives go, tho. This is what I mean by one
really wide band. There's very little shaping to this at all, and the center frequency of this is somewhere off the top.
Last modified: 11/08/2009 22:57:42 Support Free Speech
Tilde L (Dark L)
[ɫ], IPA 209
So there's this sonorant consonant between 250 and 325 msec or so. It's fully voiced and resonant. The lowered energy
leads to what looks like a zero around 2000 Hz, but there's too much energy below to really be a good nasal. So I think
that apparent zero is just the low energy dropping off the end of the visible scale. So anyway, this is probably oral. With
consonants we don't worry about the F1, usually, because there's not much variation across types. Close-to-closure is
close-to-closure, after all. The F2 is swooping to a low of something like 1000 Hz at or near 325 msec, that is towards the
end. That probably means something in terms of prosody but I'm not sure what. The F3 is really interesting though. It
rises from the beginning to the end of this consonant. What do raised F3s mean? Right. Lateral. Probably. Consistent
with the low F2 (I only have fairly dark /l/s after all), and the overall intensity. Good spotting.
Lower-Case K
[k], IPA 109
... because combined with the rising F2 we've got something that looks like velar pinch. If this transition were bilabial,
then the F2 would have to come down, at some point. If it were alveolar, there's no reason for the F3 to come down. So
that transition must be velar. So from 500 msec to at least that release noise thing around 550 or 575 msec, this must be
a velar plosive. Looks pretty voiceless.
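The place-of-articulation argument above is basically a small decision table over the direction of F2 and F3 heading into the closure. Here is a minimal sketch of that table in Python. The function name `guess_place` and the three-way "rising"/"falling"/"flat" coding are assumptions for illustration, and the rules are a rough paraphrase of the reasoning on this page, not a real classifier.

```python
DIRECTIONS = {"rising", "falling", "flat"}


def guess_place(f2_direction, f3_direction):
    """Guess plosive place from the F2 and F3 transitions into the closure."""
    if f2_direction not in DIRECTIONS or f3_direction not in DIRECTIONS:
        raise ValueError("directions must be 'rising', 'falling' or 'flat'")
    if f3_direction == "falling":
        if f2_direction == "falling":
            return "bilabial"  # everything heads down toward a labial closure
        return "velar"         # F3 down while F2 holds or rises: pinch-like
    if f2_direction == "falling":
        return "unclear"       # F2 says labial, F3 does not: cues conflict
    return "alveolar"          # nothing pulling F3 down


print(guess_place("rising", "falling"))   # → velar
print(guess_place("falling", "falling"))  # → bilabial
```

Real transitions are rarely this tidy (coarticulation with neighbouring segments can push any of these around), so treat the output as one vote among several.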
Lower-Case T
[t], IPA 103
On the other hand, from that release thing there's more gap up to about 625 msec. There's some clunks which might be
release noise, in the high frequencies, but they're not very loud. They are consistent with the real noise from 625-700
msec. This noise is [s]-shaped. It's very loud, and loudest at the highest frequencies. It's a little disturbing that the noise
dies off below 1500 Hz, which makes it look like a post-alveolar rather than an alveolar. But the concentration of energy,
such as it is, in F4 and above rather than below is probably the best cue for alveolar-ness. So if this is just the release of
the stop, then it must be alveolar.
I confess I chose this phrase because I was interested in this sequence of consonants. Now I wish I hadn't. But
challenges help us grow, right?
Schwa
[ə], IPA 322
The vowel, such as it is, is weak in amplitude, and the formants are all in transition. So this is a classic reduced vowel.
Call it schwa and move on.
Lower Case W
[w], IPA 170
Well, we have a problem. From about 900 msec to about 1000 msec (or 1050, depending on where you want to draw the
line) there's something that's clearly sonorant. The voicing is full and resonant. On the other hand, there's no energy
above 1000 Hz, and precious little between 600-1000. So we don't have a lot to go on. So then we need to look at the
transitions. The F1 starts to fade out, but it seems to be headed down from about 500 Hz and back up again on the other
side. So let's suppose it's heading to someplace like a close vowel. The F2 in the schwa is falling, and when the F2 finally
fades out, it's about 800 Hz or so. It seems to kick on again even lower on the other side. F3 falls from high to neutral,
and is still headed down by the time it kicks on again. So we've got something very close (consistent with a high vowel
or an approximant), with a very low (back/round) F2. Lower even than it gets with the dark /l/. So backer/rounder
than that. And nothing really going on in F3 specific to anything else. So this is probably a [w].
Lower-Case O + Rhoticity Sign
[o˞], IPA 307 + 419
Well, let's suppose this starts around 1000 msec and goes on to about 1125. (I guess I forgot to stick in a segment mark
and recenter the vowel symbol. Oops.) So we've got mid or higher-mid F1 (i.e. near or low of 'neutral'), and a very low
F2 (indicating something round and back). That's easy. The F3 is way low for a normal vowel, hence the rhoticity sign.
Turned R
[ɹ], IPA 151
And my favo(u)rite approximant. F1 heaven only knows where, let's say around 500 Hz. F2 rising to about 1300-1400 Hz.
F3 falling to a low of 1600 Hz or so. You just don't see F3s that low with anything except North American-style
approximant [ɹ]s.
Lower-Case M
[m], IPA 114
So just shy of 1200 msec, the amplitude drops suddenly. Things stay sort of constant to about 1250 msec when
'something' happens. So let's talk about that stretch and ignore the rest for the moment. The sudden amplitude drop is
characteristic of nasals, so my guess is that's what we're dealing with. There's a zero around 750 Hz, although there's
not much to it. There's a pole around 1000 Hz. I'd be happier if there weren't apparently another pole around 1500 Hz,
which makes this more ambiguous. But the lower one is stronger, so I'll pretend that's the one we're supposed to pay
attention to (this is cheating. If I thought it was supposed to go the other way, then I'd ignore this one. As Peter
Ladefoged used to say, sometimes "you have to know what you're looking at before you can look at it," or something
like that.) Anyway, in my voice, a pole around 1000 Hz is indicative of a bilabial nasal. (The higher pole around 1400 Hz
would be indicative of an alveolar.) The transitions in the previous segment are consistent with a bilabial (notice how
the F3 just keeps falling and the F2 seems to drop just a hair in the last few msec before the nasal kicks on). The
lowering effect of the /r/ confounds that, but if the following sound were really alveolar, I'd expect both those last
transitions to be just a little bit upward. Or at least to level out.
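The pole-frequency rule of thumb used here can be written down as a nearest-reference lookup. This is a sketch only: the function name `guess_nasal_place` is made up, and the reference values (about 1000 Hz for bilabial, about 1450 Hz for alveolar) are taken loosely from what this page says about one particular voice, so they are assumptions that would need re-tuning for any other speaker.

```python
def guess_nasal_place(pole_hz, bilabial_hz=1000.0, alveolar_hz=1450.0):
    """Pick whichever reference pole is closer to the measured one.

    The two reference frequencies are speaker-specific assumptions.
    """
    if pole_hz <= 0:
        raise ValueError("pole_hz must be positive")
    if abs(pole_hz - bilabial_hz) <= abs(pole_hz - alveolar_hz):
        return "bilabial [m]"
    return "alveolar [n]"


print(guess_nasal_place(1000))  # → bilabial [m]
print(guess_nasal_place(1400))  # → alveolar [n]
```

When a nasal shows two candidate poles, as this one does, the function cannot tell you which one to feed it. That judgment call, plus the transitions, still has to come from the reader.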
Lower-Case S
[s], IPA 132
This is a better [s], spectrally speaking, even if it is a little lacking in amplitude compared with the noise in the previous
[t] release. But this is a pretty classic [s]. A single, very broad band of noise, extending from bottom to top, with very
little resonant-like shaping. The single, broad band is centered off the top of the spectrogram, so above 4500 Hz. If we
could image the higher frequencies, the center could be anywhere between 6-8 kHz, maybe up to 12 kHz. But whatever,
higher than we can see.
Turned V
[ʌ], IPA 314
Vowel. That's all. Vowel. Fully voiced, right amplitude, resonances all the way up. Formants? Well, I don't know. My best
guess is that F1 is around 750, or at least somewhere between 500 and 1000 Hz. F2 is around 1250 Hz, or at least between
1000 and 1500 Hz. F3 is raised a little, but since this is a vowel that doesn't tell us a lot. Okay so we've got something
mid-to-low and central-to-back. Or somewhere in that area. Turned V is the traditional symbol used in North America
for this vowel, but I'm not sure it's the right one.
Lower-Case N
[n], IPA 116
Another nasal. This one has a nice clear zero from 600-1300 Hz, and then there's a pole, very faint, but it's there. Around
1300 Hz. Close enough.
Esh
[ʃ], IPA 134
Now here's another sibilant. High energy, and mostly high frequency. This one being a fricative we'll pay attention to
the loss of energy below F2. And again this is ambiguous, but there's a little extra energy in the F3 area, and maybe
again in F4. So this fricative has lower-frequency center(s) than the previous [s], and has more resonance-y
organization. So the lower center, especially in F3, and the loss of energy below the F2, are classic [ʃ] markers.
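The [s] versus [ʃ] contrast described above comes down to where the loudest part of the noise sits: off the top of the display for [s], inside the F2-F4 region for [ʃ]. A minimal Python sketch of that one cue follows. The function name `sibilant_guess`, the 4500 Hz display top, the 0.9 edge fraction, and both example spectra are assumptions invented for illustration.

```python
def sibilant_guess(spectrum, top_hz=4500.0, edge_fraction=0.9):
    """Guess [s] vs [ʃ] from where the loudest part of the noise sits.

    spectrum: list of (frequency_hz, amplitude) pairs within the visible range.
    top_hz: top of the spectrogram display (assumed value).
    edge_fraction: how close to the top counts as "off the top" (assumed value).
    """
    if not spectrum:
        raise ValueError("spectrum is empty")
    peak_hz, _ = max(spectrum, key=lambda pair: pair[1])
    if peak_hz >= edge_fraction * top_hz:
        return "[s]-like"  # still getting louder at the top edge
    return "[ʃ]-like"      # loudest somewhere inside the visible range


# Hypothetical amplitude profiles, one value per 1000 Hz band.
s_noise = [(500, 1.0), (1500, 3.0), (2500, 4.0), (3500, 5.0), (4500, 6.0)]
esh_noise = [(500, 0.2), (1500, 1.0), (2500, 6.0), (3500, 5.0), (4500, 3.0)]
print(sibilant_guess(s_noise))
print(sibilant_guess(esh_noise))
```

This ignores the second cue mentioned above, the sharp loss of energy below F2, so it is deliberately cruder than what the eye does with the picture.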
Lower-Case M
[m], IPA 114
Starting at 75 msec and going on until about 150 msec, we've got a nice little sonorant happening. It's got a nice, clear
voicing bar at the bottom, and resonances at the higher frequencies. The sharpness of the edge (of the following vowel),
the overall lowered energy (relative to the following vowel), the presence of a nice clear zero (around 750 Hz) and
mostly flat (unchanging) resonating structures, are all good pointers to a nasal stop. The pole around 1000 Hz is usually
a pretty good clue (in my voice) that it's bilabial. The F2 transition in the following vowel is consistent with that--that
F2 onset frequency is too low to be alveolar, and the distance between F2 and F3 is atypical of velars. But it's that F3
transition that bothers me. The F3 seems to fall into the following vowel, which is consistent really only with alveolars.
So we've got conflicting cues. Which are we going to believe? Well, we're going to wait for a deciding vote. Once we
have a clearer idea of what the first few syllables of this utterance are, knowing it's English and a declarative sentence,
we'll use lexical access to decide whether we're looking at an [m] or an [n]. Or something else...
Lower-Case S
[s], IPA 132
Well, this is interesting. From 300 msec (a little earlier in the higher frequencies) to almost 400 msec, there's a nice
voiceless fricative. There's no hint of voicing or anything at the low end. There's some noise into the very low
frequencies, and for some reason the amplitude hikes up a bit at about 1500 Hz. Then it stays pretty much flat (i.e. at
the same amplitude) all the way up. So this is fairly strong and broad band, typical of sibilants. And the sudden drop off
below 1500 Hz is usually a clue that it's post-alveolar. But I'm going to suggest it's not. Partly, it's because I know what
it's supposed to be, and I'm floundering for reasons to be right. Okay, usually a post-alveolar (rather than alveolar)
sibilant has that strongest energy in the F2-F4 range, and I think that low energy 'border' isn't quite continuous with
the F2 band in the following vowel, such as it is. So I don't know. This is supposed to be an [s]. And I think if we followed
it up to the 6-12 kHz range, we'd see it really gets really, really loud up there. So this is an alveolar. Accept it. Move on.
Barred I
[ɨ], IPA 317
Well, for a scant 25 msec or so, there's a vowel. There is. Look at it. But it's so short, it's hardly worth spending any time
worrying about. So I won't.
Lower-Case N
[n], IPA 116
Another nasal. Now look at this one carefully. There's a nice strong voicing bar, and there's a band of weaker energy just
above that. Now compared with the initial one, this one is a little higher in frequency or broader in band. So they're not
quite the same. There's a zero. It's narrow, but it's a little higher in frequency than the zero in the previous one. There's
a little energy at 1000 Hz, but it's weak w/r/t the previous one. And there's that blip, or whatever you want to call it,
several pulses of resonance, or something, up just below 1500 Hz. I point that out because it turns out that it's
important. I think that's the real pole. But I could be wrong. But what are the odds.
Lower-Case T + Superscript H
[tʰ], IPA 103 + 404
From 775 msec to about 850 msec, there's serious gap. The few periods of voicing leading up to 800 msec I'd say are just
perseverative. Since the release at 850 is followed by going on to 75 msec of aspiration (voicelessness, VOT), there's
little doubt that this plosive is aspirated. The transitions into it are decidedly alveolar looking, in the sense that both F2
and F3 are pointed up, but given their frequency in the preceding segment, they have precious little choice. The
aspiration noise is the big clue. In all respects (except for the formant shaping in F2 and F3), this looks like a sibilant,
particularly [s]. (I suppose you might say it looks like an [ʃ], but really it doesn't. There's not enough energy in the F2/F3
pole relative to the higher ones.) Ennyhoo, it's not an [s], it's just really heavy aspiration following an alveolar release.
So it's not 'grooved' like an [s], but the airflow is basically high pressure being directed at the incisors, just like [s]. So
this has to be alveolar. The transitions out look vaguely velar-pinch-y, but since there's no way a velar would have
aspiration that looks like this, we can rule that out.
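The aspiration judgment above is a voice onset time (VOT) measurement: the interval from the release to the start of voicing. A minimal sketch in Python, with the caveat that the function names and the 30 msec cut-off are assumptions for illustration, and the 925 msec voicing onset is simply the 850 msec release plus the roughly 75 msec of aspiration mentioned above.

```python
def voice_onset_time(release_ms, voicing_onset_ms):
    """VOT in msec: positive when voicing starts after the release."""
    return voicing_onset_ms - release_ms


def is_aspirated(vot_ms, threshold_ms=30.0):
    """Call a plosive aspirated when its VOT reaches an assumed cut-off."""
    return vot_ms >= threshold_ms


vot = voice_onset_time(release_ms=850, voicing_onset_ms=925)
print(vot, is_aspirated(vot))  # → 75 True
```

At 75 msec the call is not close, whatever reasonable cut-off you pick. The threshold only starts to matter for much shorter lags.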
Turned M
[ɯ], IPA 316
Well, this is not good. The highest-pitched voice in the whole spectrogram. Which probably makes this syllable the
nuclear accent, or at least the focus accent of the utterance. But in practical terms it means a) the striations are so close
together you can't tell one pulse from the next, and b) the harmonics are widely separated (Quick--why?) and so
bandwidths just increase. So it's hard to tell exactly where F1 is. It could be that band around 500 Hz (or just below, but
above the very strong voicing bar), or it could be that band up around 800 Hz. Which makes this either a relative mid to
higher-mid kind of vowel or a very, very low one. The F2 is a little easier. Before it fuzzes out, you can see the F2
transition in the aspiration noise, so you know where it's headed at least. So the F2 has to be around 1200 Hz or so,
depending on exactly where you measure. So knowing the answer, I might suppose that the strength of the 'voicing bar'
was actually a very low first formant, and the two things I'd considered before are just strong harmonics. But I don't
know. It probably ain't the incredibly low vowel that it would be. So figure not high and relatively back, but not
outrageously round (or very round but not outrageously back). And we'll try to make a word out of it later.
For the record, this is a fairly typical /u/ for me. Not at all round, fairly high, and with front on-glide following the
coronal.
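The pitch problem described above is simple arithmetic: harmonics sit at whole-number multiples of the fundamental, so a higher fundamental means fewer harmonics available to outline any one formant, and glottal pulses that are closer together in time. A small Python sketch makes the point. The function names, the two F0 values (100 and 200 Hz) and the 400-900 Hz band are hypothetical choices for illustration, not measurements from this spectrogram.

```python
def harmonics_in_band(f0_hz, low_hz, high_hz):
    """List the harmonics of f0 (integer multiples) between low_hz and high_hz."""
    if f0_hz <= 0:
        raise ValueError("f0_hz must be positive")
    found = []
    k = 1
    while k * f0_hz <= high_hz:
        if k * f0_hz >= low_hz:
            found.append(k * f0_hz)
        k += 1
    return found


def glottal_period_ms(f0_hz):
    """Time between successive glottal pulses (striations), in msec."""
    return 1000.0 / f0_hz


for f0 in (100, 200):
    print(f0, glottal_period_ms(f0), harmonics_in_band(f0, 400, 900))
# → 100 10.0 [400, 500, 600, 700, 800, 900]
# → 200 5.0 [400, 600, 800]
```

With only three harmonics sampling the region where F1 might be, a strong harmonic and a formant peak become hard to tell apart, which is exactly the ambiguity between the 500 Hz and 800 Hz bands above.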
Lower-Case N
[n], IPA 116
So I think the oral closure happens at about 1075 msec--when the zero kicks in. Which is another contributor to the
fuzziness of the preceding vowel--nasalized vowels tend to have broader bandwidth (and more centralized formant
frequencies) than their oral counterparts. So the zeroes are a good thing, really--they tell us this has to be a nasal.
Frankly, the pole looks like it's about 1000 Hz, and so I'd say this was bilabial. And I'd be wrong. Good guess, but if it's
not bilabial, then it has to be alveolar. No hint of velar pinch, and, well, there is that narrow thing at 1500, which is
where I'd expect the pole for an [n] to be, in my voice. There's no hint of that in the initial nasal of this utterance, so
there's some difference. But I wish I knew what was going on at 100 Hz.
Lower-Case Z
[z], IPA 133
Well, there's a hint of voicing at the bottom, so this is probably voiced. The noise is [s]-shaped, if you follow, and weaker
(and shorter) than we'd expect for [s], which is consistent with the idea that it's voiced.
Lower-Case I
[i], IPA 301
Well, if the previous thing is an alveolar, then we can say that the onset frequency of F2 is in line with the alveolar
locus, which means all that movement is just transitional. Or we could suppose that it's meaningful. In the first case,
coupled with the relatively low F1, I'd be looking at that spot, just after 1300 msec where the F2 levels off for just a bit,
and say that was our target F2 frequency, which would make this an [i], just because nothing else ever has an F2 above
2200 Hz. But in the other case, we'd say this was a relatively high, front vowel moving higher (I guess) and much much
fronter, something much more like classical [eɪ]. One or the other. One is right, the other's a good guess.
Lower-Case T
[t], IPA 103
So with the exception of that one pulsey thing before 1450, the gap here seems to start at about 1350 msec and go on
for almost 100 msec. The transitions into it look sort of pinchy (but very front velar, if you follow) and the burst is slightly
doubled. All of which just screams [k]. But then we wouldn't get this spectrogram to say anything. So on the high-tilt to
the burst, and the phonotactics of the following thing, I'd say this was [t].
Esh
[ʃ], IPA 134
So here you see how much stronger the F2 pole is. And the energy below is weaker. So this looks like an [ʃ]. This is also
more consistent with the F2/F3(/F4?) poles, which are more typical of postalveolars than alveolars. There's just more
room to couple and a longer front cavity to play in. That is, for acoustic coupling to take place and to resonate in,
respectively. Shame on you for thinking what you were thinking!
Lower-Case I
[i], IPA 301
Well, there's a couple of odd amplitude discontinuities, but they're not really radical, considering the length and overall
energy in this vowel. So I'm thinking it all has to do with pitch change, and therefore striation spacing and harmonic
structure. So from 1575 to 1925 msec, I'm thinking this is really all one vowel. And since the F2 reaches 2200 Hz (i.e.
'absurdly high for anything except [i], and then still very, very high'), I'd say this was [i]. If you were determined to put
vowels on either side, what would you do with the middle?
Lower-Case Z + Under-Ring
[z̥], IPA 133 + 402
Well, this is a lesson, so here goes. This looks like an [s] again, but it's very weak. There's no hint of voicing, but it's
weak, and it's shorter than even the fricative in the affricate, even though it's final in utterance. So there's something
odd about it. It's not post-alveolar, because even though it loses energy below F2, you'd still expect the F2 pole in the
fricative to be a little stronger than above it, and this is flat. The noise gets a little better organized off the top of the
spectrogram. All this points to [s]. So how do we account for the weakness? Well, voiced fricatives are almost always
weaker and shorter than their voiceless counterparts, just because the act of voicing impedes airflow and therefore
pressure build up. But this isn't voiced. So I'll suggest it's passively devoiced. That is, rather than devoicing by abducting
the vocal folds (as with underlyingly voiceless sounds), the vocal folds remain adducted here. But because we're at the
end of an utterance, we (I) don't have a lot of subglottal pressure to work with, and the result is the vocal folds don't
vibrate. And there you have it, devoiced [z]. As distinct from [s].
This page last modified: 11/08/2009 22:57:38
"Harmony is achievable."
Lower-Case H
[h], IPA 146
So starting not quite 100 msec in, and going on until 225 msec or so, there's some noise with no striations in the very
low frequencies, in the range of the fundamental or first harmonic, which in my voice could be anywhere from 90
Hz to 130 or 140. So it's voiceless. There's lots of energy up above, but it's aperiodic, or noisy. If you notice the formants
of the following vowel, there's a little more noise in those same frequencies. Which is typical of [h]. The noise, being
produced in the laryngopharynx, bounces around the vocal tract the same way periodic energy does, and thus gains
energy in the frequencies of the vocal tract resonances and loses it in between. What's interesting is the high F3--there's
no hint of rhoticity in the noise, until about 200 msec, when it starts to come down in frequency. We can see that
transition continue once the voicing kicks in, but then we're well into the next segment.
Lower-Case M
[m], IPA 114
Then the amplitude falls off around 350 msec. The F2 transition in the /r/ is diving at that moment, which suggests
labial transitions. The overall energy from 350 to 425 msec (or so) is lower than either of the surrounding vowels, so this
is relatively consonantal. And its edges are sharp, if you see what I mean, suggesting some acoustic change that sucks
energy out of the source suddenly turns on, and then off. So this is a typical nasal--the aforesaid sucking occurring as
the nasal cavity is opened and the oral cavity is closed, and then stopping when the velopharyngeal port is closed and
the oral closure released. There's a nice pole around 400 Hz, which is just to be expected, but the first 'real'
pole/formant in the nasal is around 1000 Hz. You can see the pole above that (continuous with the F3 of the /r/) is rising.
The frequency of that middle pole, the one around 1000 Hz is a good cue to this being bilabial--if the oral closure were
further back, this would be higher in frequency. (Go back to acoustic phonetics and read about 'side cavities' if you're
not sure why.) So that's two solid cues to this being [m], and none particularly pointing anywhere else.
Barred I
[ɨ], IPA 317
So from 425 to about 475 msec, there's a vowel. The F1 is sort of low, unless you believe it's still high, but it's not
particularly distinct either way. The F2 is in constant motion, almost as if it had nowhere in particular to go. The F3 is
still transitioning, so it's not helping either. Also the F4 if it comes to that, but since we almost never look at F4, we
won't belabo(u)r the point. So we've got a short vowel of indistinct structure that never really develops a strong
identity of its own. So call it reduced, transcribe accordingly, and move on.
Lower-Case N
[n], IPA 116
So here we have another one of these. Note its similarity, in terms of its amplitude and edges, to the previous nasal.
There's a pole I don't think I've ever seen before at about 850 Hz, so I'm going to ignore it.... The main pole is up around
1400 or not quite 1500 Hz. Note how much higher it is than the 1000 Hz or so pole in the [m]. So there we go. This one
isn't bilabial, so we're stuck with alveolar or velar. There's no hint of velar pinch in the transitions into or out of this
nasal, and the transition-end frequencies (around 1700 Hz) are consistent with the locus of alveolar transitions.
Lower-Case I
[i], IPA 301
So the F1 is still rather low. Note the voicing bar in the first syllable. There's a strongish harmonic just below 500 Hz but
the main body of the resonance is clearly between the voicing bar and that harmonic. So this is an exceptionally low F1.
So this is an exceptionally high vowel. The F2, once it straightens out, is exceptionally high, up around 2100 or 2200 Hz.
So this vowel is exceptionally front. And the highest, frontest vowel you can think of? Right!
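The reasoning here runs from formant frequencies to vowel quality: low F1 means a high vowel, high F2 means a front one. A rough Python sketch of that mapping follows. The function name `describe_vowel` and every cut-off in it are assumptions picked to sit between the values quoted on this page for one voice; they are not standard boundaries.

```python
def describe_vowel(f1_hz, f2_hz):
    """Turn F1/F2 into a coarse height and backness label.

    All cut-off frequencies below are illustrative assumptions.
    """
    if f1_hz < 400:
        height = "high"   # low F1, close vowel
    elif f1_hz < 650:
        height = "mid"
    else:
        height = "low"    # high F1, open vowel
    if f2_hz > 1800:
        backness = "front"
    elif f2_hz > 1200:
        backness = "central"
    else:
        backness = "back"  # low F2: back and/or round
    return f"{height} {backness}"


print(describe_vowel(300, 2200))  # → high front
print(describe_vowel(800, 1000))  # → low back
```

An F2 up around 2200 Hz leaves the function no room to say anything but front, which matches the 'highest, frontest vowel' conclusion above.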
Barred I
[ɨ], IPA 317
Well, another section of vowel that's mostly F2 transition. If you missed it as just transition, you have to explain why
this vowel is so long when its pitch is clearly quite low (see how far apart the striations are compared to most of the
preceding vowels--each of those striations is a glottal pulse). So I think this is actually two different vowels/syllables. In
fact, two different words. I worked hard at not putting a glottal stop in this one, so I hope you appreciate the duplicity
involved.
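Since each striation is one glottal pulse, pitch can be read off the spacing between striations. A minimal Python sketch, assuming you have marked the pulse times by hand; the function name `f0_from_pulses` and the example pulse times are invented for illustration.

```python
def f0_from_pulses(pulse_times_ms):
    """Estimate F0 (Hz) from the times of successive glottal pulses (msec)."""
    if len(pulse_times_ms) < 2:
        raise ValueError("need at least two pulses")
    span_ms = pulse_times_ms[-1] - pulse_times_ms[0]
    if span_ms <= 0:
        raise ValueError("pulse times must be increasing")
    mean_period_ms = span_ms / (len(pulse_times_ms) - 1)
    return 1000.0 / mean_period_ms


# Hypothetical striations: widely spaced (low pitch) vs. tightly spaced.
print(f0_from_pulses([0, 10, 20, 30, 40]))  # → 100.0
print(f0_from_pulses([0, 5, 10, 15]))       # → 200.0
```

Wide spacing means a long period and therefore a low F0, which is the observation used above to argue that the long stretch is really two vowels.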
Lower-Case Z
[z], IPA 133
So the striations continue, albeit in weaker form, all through the following amplitude dip (from about 700 msec to 750
msec or so?). So whatever it is, it's a consonant and it's voiced. But up above the voicing bar, there's no evidence of
periodicity, so no resonance to speak of. So there must be a very tight constriction somewhere. And it's noisy, so it's a
close constriction, but not a closure. So we're talking about a fricative. Voiced, but very noisy. The noise is not
particularly organized into bands. In fact, it's one broad band. It's a trifle weaker in the lower frequencies than the
higher frequencies (note the relative lightness of the noise just around and below 1000 Hz compared to anywhere
above), so this looks like it's tilted to the high frequencies. Very high frequencies, without any tilt toward the F2 or F3
region. So there you go. [s]-shaped noise, but voiced.
Barred I
[ɨ], IPA 317
And another short little vowel, overlapped in the high frequencies with a bit of the noise from the fricative. Or maybe
the noise is coming from the upcoming closure. Or both. Hmm. So this is amazingly reduced.
Lower-Case T
[t], IPA 103
Nice sharp gap so obviously we're dealing with some kind of plosive. There's not a lot going on in terms of transitions
suggesting anything in particular. On the other hand, if you look at the release noise burst, it's very sharp, broad band,
and evidently [s]-shaped. Although this may be in part a product of the following frication. But whatever. Believe it's
alveolar, or at least coronal, or remain agnostic. When it comes to parsing the upcoming fricative your choices will be
limited.
Esh
[ʃ], IPA 134
So here we go. We've got some very loud friction here. No voicing bar, but with that much noise, you wouldn't really
expect any voicing. The frication is very loud, but you'll notice it isn't one very broad band, but has some formant-like
shaping to it. It's loudest not off the top of the spectrogram (i.e. between 4-6-8-12 kHz), but seems loudest in the F2-F3-
F4 bands. And the F2 band is pretty noisy, while below it the energy drops off sharply. That's pretty typical of post-
alveolar [ʃ].
Lower-Case I
[i], IPA 301
So it's tough to tell where F2 is. You have to surmise from that falling transition afterwards that it's really, really high,
around 2200 Hz or so. It's almost merged with the F3, but that's not supposed to happen, so the combined band is still
wider than you'd expect a single band to be, but at this bandwidth there's no telling where the separation is. So the
edges of the filter overlap slightly. Get over it. So that's the F2, where's the F1? Low low low, I say. We could argue about
that, but TMSAISTI.
Lower-Case V
[v], IPA 129
Another voiced fricative here, from 1075 to 1125 msec or thereabout. Nice striations at the bottom, but no periodicity to
speak of above. This is a very loud fricative--it has about the same energy as the previous [z]. But spectrally, this looks
different. It doesn't have any tilt to it at all. It just looks white, in the sense of having equal energy at all frequencies.
Sort of unfiltered. Well, probably this is louder than it should be--I may have been spitting into the microphone or
something. The unfiltered-ness is a huge clue though. In order to be unfiltered, your source has to be uncoupled from
the resonators of the vocal tract. Which means it has to have a tight closure, and no vocal-tract-tubey-volumes in front
for the energy to bounce around. So this has to be at the teeth or lips. Given that this is English, the lips (bilabial) is
unlikely. It would be really helpful if the transitions on either side looked more labial, but they don't. Which might
make us think coronal, just by default. But then we'd be wrong. So let's just keep both [v] and [ð] in mind until we can
make a word out of it.
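One common way to put a number on 'white' is spectral flatness: the geometric mean of the power values divided by their arithmetic mean, which is 1.0 for a perfectly flat spectrum and drops as energy piles up in some bands. The Python sketch below is only an illustration of that measure; the function name and both example spectra are made up, and nothing here was measured from the spectrogram.

```python
import math


def spectral_flatness(power):
    """Geometric mean over arithmetic mean of positive power values."""
    if not power or min(power) <= 0:
        raise ValueError("power values must be positive and non-empty")
    log_mean = sum(math.log(p) for p in power) / len(power)
    return math.exp(log_mean) / (sum(power) / len(power))


flat_noise = [2.0] * 8                                          # unfiltered-looking
tilted_noise = [1.0, 1.0, 1.0, 1.0, 10.0, 10.0, 10.0, 10.0]     # high-frequency tilt
print(round(spectral_flatness(flat_noise), 3))
print(round(spectral_flatness(tilted_noise), 3))
```

A sibilant tilted toward the high frequencies would score lower than the flat case, which is the spectral difference between this fricative and the previous [z] that the paragraph above is pointing at.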
Schwa
[ə], IPA 322
Very short, indeterminate vowel. Moving on.
Lower-Case B
[b], IPA 102
Another gap, this one rather long, although since we're approaching the end of the utterance that might be
lengthening of the final syllable. There's a nice, clean gap in most frequencies, but if you look at the bottom, there's an
awful lot of perseverative voicing. More than you'd get if there were a nice abduction gesture associated with an
underlying voiceless stop. So this is probably voiced. It's a little annoying that the transitions are so ambiguous. The F2
in the preceding vowel seems to be coming down, well below the 1700-1800 Hz alveolar locus we usually look for with
alveolars. So that looks labial. F3? Seems to be high, if anything. Ya gotta love coproduction messing up all your cues. So
on the balance, I'm going to say bilabial. The F2 isn't even close to alveolar or velar looking. The F3 is ambiguous, but I'll
attribute it to coproduction with ...
1. HE/SHE
2. CHAINS/TRAINS
3. MEEK/WEAK
4. LEADERS/READERS
So the real trick here is to work out whether the first consonant is [h] or [ʃ], whether the second onset is [tʃ] or
[tʰɹ̥], etc.
I'll only be discussing differential cues this time, again, just because of time constraints. (I will leave it as an exercise for
the reader to segment the 'known' segments and work out what cues there are as to their identities.)
Esh
[ʃ], IPA 134
From about 75 - 225 msec. This looks more like [ʃ] than [h] for a couple of reasons. The first is that it's too loud. This has
absolute amplitude like a vowel rather than a consonant. So this very loud frication is tilted to the higher frequencies,
typical of sibilants in general. This looks like [ʃ] rather than [s] since it has very little energy below F2, below which it
drops off fairly sharply ([s] has broad band noise that may diminish at the lower frequencies, but it'll do so more
gradually). The fact that it drops off right below F2 is suspicious, if you were wondering. Also an [s] would not have that
strength specifically in F2/F3/F4, but presumably would have a single broad band centered much higher. An [h]
would have less energy over all, and wouldn't have any kind of discontinuity with the following vowel (except in terms
of voicing). If you notice, the "F2" in the fricative doesn't match that in the vowel.
Lower-Case T + Right Superscript H, Turned R + Under-Ring
[tʰɹ̥], IPA 103 + 404, 151 + 402
350 msec to 450 msec or thereabouts. The choice here is really between an [ʃ] release to the affricate or a voiceless [ɹ̥].
I'll duck the whole question of segments and affricates and so on. Okay, so the gap for the plosive goes from about 325
msec to the release somewhere between 375 and 400 msec. The release frication probably runs from about the release
for between 25 and 50 msec. The 'center' of the /r/ moment, if you follow me, is around 425 msec. Notice that by the
time voicing kicks on at about 450, the formants are already moving fast. So our choices for this bit, from 400 to 450
msec or so are the /r/ (devoiced due to the aspiration) or Esh. Notice the intensity of the noise--on release, it's nice
and sibilant. It's centered pretty low, sibilant-wise, and looks a lot like the previous Esh. But you'll notice the intensity
drops off fairly quickly, instead of being nice and sustained through the voicelessness, and also that the noise is in the
shape of the following formants. The F2 starts up wherever it starts on release (around 1900 Hz or so), falling rapidly to
just below 1500 Hz. The F3 falls out, but notice in the release how the corresponding band is definitely falling.
Extrapolating or interpolating, or whatever, from the angles of the transitions on either side, it looks like the F3 drops
to just below 2000 Hz, but there's not a lot of evidence that it really gets there. But those transitions in F3 can only be
due to rhoticity. And the lowness of the F3 and the closeness of F2 and F3 together explain the esh-shaped-ness to the
release noise--the center of the energy is being pulled down by the low formants. But this explains why there seem to
be people who have a /tr/ goes to [tʃ] rule and/or /dr/ goes to "jr". For comparison, notice how, while diminishing, the
esh-noise in the comparison spectrogram is more or less stable right through until the voicing kicks in.
Lower Case W
[w], IPA 170
Approximant or nasal? We're looking at the fully-voiced segment from about 725 msec to just past 800 msec. It's got
less energy in the voicing bar than in the following vowel, but that's typical of both nasals and close approximants. The
transitions are mostly bilabial, although F3 isn't helping much. So nasal or not? Well, not. Nasals don't have to have
'sharp' edges, but prevocalically they usually do. See that moment near 800 msec in the comparison spectrogram. The
edge here is the velum closing--at that moment, the acoustics change suddenly. The energy that was being lost in
the nasalization is suddenly regained, the main resonances change--notice how the formants 'pop on' without
transitioning. Here, the formants are all transition, suggesting something oral throughout, with continuously
changing articulators transitioning from the /w/ moment to the following vowel.
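The 'sharp edge' test above can be sketched as a check on the amplitude envelope: a nasal release shows one abrupt step, while an approximant ramps gradually. This is only an illustration. The function names, the 6 dB threshold, and the envelope values are assumptions made up for the example, not measurements from these spectrograms.

```python
def largest_jump_db(envelope_db):
    """Largest frame-to-frame change (in dB) in an amplitude envelope."""
    return max(abs(b - a) for a, b in zip(envelope_db, envelope_db[1:]))


def looks_sharp_edged(envelope_db, jump_db=6.0):
    """True if the envelope contains an abrupt step.

    jump_db is an assumed threshold, not a value taken from the text.
    """
    return largest_jump_db(envelope_db) >= jump_db


# Hypothetical envelopes sampled every 10 msec (values invented for illustration).
nasal_then_vowel = [52, 52, 53, 52, 64, 65, 65]  # amplitude 'pops on' at the release
glide_then_vowel = [52, 54, 56, 58, 60, 62, 64]  # all transition, no step

print(looks_sharp_edged(nasal_then_vowel))  # True
print(looks_sharp_edged(glide_then_vowel))  # False
```

A real decision would also look at whether the formants themselves jump, which an amplitude envelope alone can't show.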
Tilde L (Dark L)
[ɫ], IPA 209
Finally another lesson in approximants, this time /r/ and /l/. We're looking at the moment that begins when the
voicing kicks on around 1000 msec, and going until the upper frequency periodicity really becomes clear, at about 1075
msec. Again, this doesn't look nasal due to the continuity of the whole thing. That and the F1/voicing bar complex is
too continuous (in both amplitude and frequency) to indicate a sudden addition or loss of a cavity. So what's the
difference between a North American /r/ and an /l/? The F3. Lowered F3 for /r/, raised F3 (or sometimes F4--ideally
both) for /l/. So where's the F3? F1 is down just below 500 Hz. F2 is just above 1000 Hz, and F3 is way up there around
2750 Hz. It's falling a little, so by the time the upper frequency periodicity kicks on it's already almost back down to
2500, but you can still see how high it was in the noisy, semiperiodic energy during the approximant. So that's it. Raised
F3.
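As a rough sketch, the F3 rule of thumb for North American /r/ versus /l/ can be written as a comparison against a neutral F3. The 2500 Hz neutral value and the 200 Hz margin are assumptions chosen for illustration, and the function name is hypothetical. Real values depend on the speaker.

```python
def classify_liquid(f3_hz, neutral_f3_hz=2500.0, margin_hz=200.0):
    """Guess /r/ vs /l/ from F3 alone.

    Lowered F3 suggests /r/, raised F3 suggests /l/.
    neutral_f3_hz and margin_hz are illustrative assumptions.
    """
    if f3_hz < neutral_f3_hz - margin_hz:
        return "r"
    if f3_hz > neutral_f3_hz + margin_hz:
        return "l"
    return "unclear"


print(classify_liquid(2750))  # l  (F3 'way up there')
print(classify_liquid(1650))  # r  (F3 pulled well below neutral)
print(classify_liquid(2500))  # unclear
```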
labeled spectrogram
This page last modified: 11/08/2009 22:57:37 N"He chains meek readers."
Lower-case M
[m], IPA 114
We start with a clear sonorant consonant of some kind, fully voiced, nicely striated, and with nice clear resonances all
the way up. The formants are flat, and there's a nice clear zero about 750 Hz, both of which suggest a nasal. The pole
(formant) around 1100 Hz suggests a bilabial (at least for my voice--an alveolar nasal usually has a pole somewhat
higher, closer to 1400 or 1500 Hz, and a velar nasal a) wouldn't be initial in an English utterance and b) would have more
evidence of velar transitions in the following vowel).
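The pole comparison above amounts to a nearest-reference lookup. A minimal sketch, assuming the speaker-specific figures quoted in this paragraph (about 1100 Hz for the bilabial, taken here as 1450 Hz for the alveolar, the midpoint of '1400 or 1500'). The dictionary and function names are hypothetical. The velar is left out because the text settles it from transitions, not from a pole value.

```python
# Speaker-specific reference poles in Hz, read off the paragraph above.
# 1450 is an assumed midpoint of the '1400 or 1500 Hz' range.
REFERENCE_POLES_HZ = {"m": 1100.0, "n": 1450.0}


def guess_nasal_place(pole_hz, references=REFERENCE_POLES_HZ):
    """Return the nasal whose reference pole is closest to the measured pole."""
    return min(references, key=lambda symbol: abs(references[symbol] - pole_hz))


print(guess_nasal_place(1100))  # m
print(guess_nasal_place(1500))  # n
```

A pole that falls between the two references is a weak cue, and the transitions have to carry the decision.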
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
I love the overlap between this and the following consonant, but whatever. F1 is at about 500 Hz or just higher. F2 is up
around 1100 Hz or thereabouts. F3 is high for some reason. But since this is a vowel we're not going to worry about F3.
Just put it out of your mind. Don't let it consume you for twenty minutes like I just did. So we've got the F1 of a mid-ish
vowel, and the F2 of something fairly back and/or round. The F1 and F2 seem to move downward slightly (hence the
transcription as a diphthong) but you'll notice that the upper frequencies are taken over by the incipient sibilant noise
coming up. Gestural overlap? Spreading? Whatever. The illusion of segments. Moving on.
Lower-case S
[s], IPA 132
Well, since it's September, we'll review. From about 350 to almost 500 msec. This is a fricative (random, snowy 'noise').
It's voiceless (no striations or energy in the low-frequency 'voicing bar'). And the noise is in a single, very broad band
(unfiltered by a lot of vocal tract resonances) which suggests that it's relatively forward in the vocal tract. It's very loud
(and broad band) which suggests sibilance, and centered in the very high frequencies, which suggests alveolar (the
postalveolar sibilant is usually centered in the F2/F3 range rather than above the F4 range). So this must be an [s]. [s] is
your friend, spectrographically speaking.
Lower-case T
[t], IPA 103
Our first real plosive. From about 475 msec to the release burst at about 525 msec there's a gap in the spectrogram,
indicating no airflow, no resonance, no voicing, squat. It's got a short VOT (not even 25 msec) so it's unaspirated.
Voiceless goes without saying, right? (Study question: Why?) The F2 transition starts at about 1600 Hz and falls, the F3
transition starts around 2400 Hz and again falls. So we have 'uppy' pointing transitions (pointing into the gap, that is)
and so this is probably an alveolar. The noise in the VOT is a little low (we'd like to see more [s]-looking release noise
following a [t]), but there's a coarticulatory thing going on....
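The VOT reasoning here is simple arithmetic: voicing onset time minus burst time, compared against a short-lag cutoff. A minimal sketch follows. The 25 msec cutoff echoes the 'not even 25 msec' remark above but is still an assumed boundary, and the 545 msec voicing onset in the example is hypothetical.

```python
def voice_onset_time_ms(burst_ms, voicing_onset_ms):
    """VOT: time from the release burst to the start of voicing."""
    return voicing_onset_ms - burst_ms


def aspiration_label(vot_ms, short_lag_max_ms=25.0):
    """Label a plosive from its VOT. short_lag_max_ms is an assumed cutoff."""
    if vot_ms < 0:
        return "prevoiced"
    if vot_ms <= short_lag_max_ms:
        return "unaspirated"
    return "aspirated"


# Burst at about 525 msec (from the text); voicing onset at 545 msec is made up.
vot = voice_onset_time_ms(525, 545)
print(vot, aspiration_label(vot))  # 20 unaspirated
```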
Schwa + Turned R
[əɹ], IPA 322 + 151
Okay, I've transcribed a diphthong here because there was just no place to segment. Sorry. I've also used a deceptive
sequence of symbols--for a lot of people, the sequence schwa-r is a shorthand for the symbol schwa-r (i.e. [ɚ] IPA 327,
for which I always use turned-r with the syllabicity diacritic, i.e. [ɹ̩].) But here we have something that looks and sounds
like a diphthong. So I've transcribed it as such. F1 is in the mid-region. F2 is neutral (and falling). F3 is sort of neutral
but also falling. The end of the F3 fall is way below 1800 Hz, which accounts for the F2 fall as well--that is, that's
rhoticity, i.e. approximant /r/ in North American English. But there's a non-rhotic vowel in front of it. So it's a
diphthong.
See, this is a syllabic /r/. It's not a vowel like schwa 'combined' with some diphthongy rhoticity. It's just a vowel. F1 is
mid-ish, F2 is as neutral as it can get, considering the F3 is around 1600 or 1700 Hz. An F3 that low can only be an /r/.
Lower-case F
[f], IPA 128
What we have here is another voiceless fricative. Now take a moment and compare it to the previous [s]. Broad band
noise but not of sibilant amplitude. So probably fairly far forward in the vocal tract. Given that this is English this
means labiodental or (inter)dental. Any other clues? The F2 in the preceding /r/ seems to transition downward, just a
tad, while the F3 is rising, slightly. The only reason for the F2 to not be transitioning in the same direction as the F3 is if
it's a labial transition. The transitions on the other side of the fricative all point down (that is, rise into the vowel), which
also makes this look bilabial. Now, we might discount the F3 transition as just 'rising' from the low position for the /r/.
But the F2 transition(s) still look(s) labial. Vaguely. So probably [f]. The double clunky thing at the onset of the vowel is
probably just a clunky thing.
Barred I
[ɨ], IPA 317
Absurdly short vowel. Reduced. Ignore it. Well, don't ignore it, but don't waste any time trying to identify it. Move on.
Lower-case K
[k], IPA 109
Another long gap, but now there's another double clunky thing at about 1100 msec, in the F3/F4 range. Not much of a
cue, but double burstiness is sometimes indicative of a velar release. Then again, there's another double-bursty-looking
thing at 1150 msec (or so) which I'm going to claim is a red herring (or rather, that the usual explanation for velar
double-bursting doesn't account for the other double clunk (either of them), but my explanation will). Anyway, that's
really the only clue that there's something else going on here, or that it's a velar release into another plosive. So if you
caught it, great; if you didn't, you'll have to insert it through lexical identification later.
Lower-case T
[t], IPA 103
The release here is good and sharp. It looks doubled, but I think most of the energy is in the F3, rather than in a pinched
F2/F3 combination. F2 seems to start at about 1500 Hz. So what's going on? I'm going to suggest the release at 1150
msec is coronal, and that the double clunkiness we see is the result of ...
Tilde L (Dark L)
[ɫ], IPA 209
... lateral release. I was being persnickety with the half-under-ring diacritic but the point is that there's dark /l/ here,
partially (or fully) devoiced by the aspiration following the /t/. The lowness of the F2 transition (relative to the
expected frequency of 1700-1800 Hz) is compatible with rounding, but I'm going to suggest that it's the result of
velarization of the lateral. The second clunk is the other side of the lateral releasing. So my story on double clunks is
that they involve two sides opening at different moments. So velars and laterally released [t] are most likely to have a
double burst. The standard story is that the long closure associated with velars (and dentals) causes a high-velocity
airflow on release, and a Bernoulli 'clunk' immediately after release. I think I'm right, but I'm apparently the only one.
And sometimes a clunk is just a clunk. The vocal tract is a juicy place, after all.
Lower-case I
[i], IPA 301
So after all that, we end up with an F2 up above 2000 Hz (way up above) and an F1 which is quite a bit lower than the
mid-ish F1s we've been seeing. So this is a highish vowel, amazingly front. /i/ or /e/. In this case [i]. Trust me.
Lower-case E
[e], IPA 302
Now this is a flat /e/. Not diphthongy. F1 is mid or just low, F2 is around 2000 Hz and relatively stable. Not obviously a
diphthong. So there.
Lower-case P
[p], IPA 101
A gap of about 100 msec. With some perseverative voicing, but not enough to worry about. The burst is not amazingly
sharp, and it seems to be loudest in the low frequencies. If I work hard enough I can convince myself that the F2 and F3
transitions into this gap are bilabial, but they're not obviously so on the other side. At least they're not obviously
anything else....
Schwa
[ə], IPA 322
Another absurdly short vowel, mostly transition, so call it reduced and move on.
Lower-case B
[b], IPA 102
Well, here's a gap. Notice that the perseverative voicing here is more 'voiced'. Probably meaningful, although there's no
guarantee. The transitions into this look bilabial, at least the F2 does. The F3 and F4 transitions out of this gap and into
the following vowel are also suggesting bilabial more than anything else. So potentially voiced and bilabial.
Schwa
[ə], IPA 322
I hear a vowel here, so I guess it's a schwa. But it's really so totally coarticulated with all but the contact-part of the
following lateral, that I can't really blame you for wondering what the heck I'm talking about.
Tilde L (Dark L)
[ɫ], IPA 209
Okay, so here's the trick. The formants, such as they are, indicate a mid-ish vowel (neutral F1), and a very back or round
tongue body F2 well below 1000 Hz. Ideally we'd like to see F3 (or at least F4) rise to above neutral for a lateral, but no
such luck. But it can't be an /r/ with an F3 like that, and it certainly can't be a /j/ with an F2 like that. So that leaves the
lateral and the labial-velar. So which is more likely to a) follow schwa and b) form a word with the preceding? But you
can see how dark /l/s and /w/ or /u/ shaped vowels can resemble one another....
Last modified: 11/08/2009 22:57:43
Epsilon
[ɛ], IPA 303
So the vowel starts just before 250 msec and goes on for about 100 msec. The F1 is a little low of mid, suggesting a
slightly higher mid vowel, which is weird for me if this is [ɛ]. But anyway, this vowel looks quite central, and so this
looks very schwa-like. But judging from the amplitude it must be stressed, and if it's stressed, this can't be my [ʌ],
which is typically low. So this is probably mid or high, and otherwise non-descript. Oh well. The falling formants are
clearly transitional, since they mostly all do it, so they don't help. Not long and not tense.
Lower-Case V
[v], IPA 129
Well, it's very weak, but there's frication throughout this gap up to 400 msec. There's also voicing, so this is either a
very weak voiced stop or a weak voiced fricative. The transitions in and out all suggest bilabial (although the F3 doesn't
help--more later), so, since bilabial fricatives are not an option, as this is my English, labiodental is not a stretch.
Turned R
[ɹ], IPA 151
So there's that F3, clearly transitioning way down in the previous vowel, and there it is here, down at about 1600 Hz or
so. So this must be an /r/ of some kind. Nuff said, I guess.
Schwa
[ə], IPA 322
Transcribing this as a vowel is merely a convenience. There seems to be 'something' between the /r/ and the following
segment, but exactly what is open to interpretation.
Tilde L (Dark L)
[ɫ], IPA 209
So from about 475 to 550 msec or so, there's a dip in amplitude, accompanied by an apparent zero in F2, and a relatively
high F3. Very high considering the previous /r/. I'm not quite sure what's going on in F2, but the raised F3 is usually a
good indicator of the lateral. And the F2, if it's anywhere, is down there below 1000 Hz, so it must be dark.
Barred I
[ɨ], IPA 317
Again, this is a bit of vowel. I made a mistake transcribing it as barred-i--I think I must have misread the F3 as an F2, but
that's idiotic, since even the highest F2 can't get up that high. But it's a transitional vowel more than anything else. So
there.
Lower Case W
[w], IPA 170
So here's another attenuated, presumably consonantal articulation, but fully and clearly voiced. So this is almost
undoubtedly a sonorant, but very, very close. The low F2 is consistent only with something very round and very back,
and the following F2 transition is typical of [w], so there you go.
Turned R
[ɹ], IPA 151
Well, here's another /r/. Low F3, though not as low as previously. I've been noticing the bandwidths of initial /r/s
being very narrow, but that may just be me doing stuff to do that. Anyway, this looks like a typical /r/ in coda position,
with the higher (closer to F3) F2 than in other positions.
Lower-Case D
[d], IPA 104
Gap. Probably a plosive of some kind. Very voiced, which is interesting. There seems to be a falling F4 (or something),
but the F3, if anything, has a rising transition into this gap. But then it would be, since it starts so low. The F2 is
ambiguous to say the least. So on balance, I think the alveolar guess is just a default thing. The release is even weak,
so it's not clear if that fricative coming up is just release (which would tell us a lot about the place of this plosive) or if
it's a fricative.
Lower-Case Z
[z], IPA 133
Well, if there's a fricative, it must be a sibilant. Look at that frequency. And it must be alveolar. Same reason. And
voiced.
Lower-Case G
[g], IPA 110
Well, there's another gap here, with voicing. Nice loud, but noisy release. Transitions in and out have F2 and F3 close
together, so velar is probably the best guess. Transitioning from back to front velar is apparent from the frequency of
the 'pinch' on either side.
Small Capital I
[ɪ], IPA 319
So if it ends up as a front velar, this vowel must be a front vowel. And it is. Quite front, at least at the beginning. And quite
high, judging from the low F1. The F2 transitions down, whereas I'd expect an [i] to have more of a steady state or
trend upward, at least until it starts to transition into a following consonant. So this is probably lax/short/whatever you
want to call it.
Script V
[ʋ], IPA 150
Well, this looks short, and vaguely flap-like, being a short, fully voiced 'gap' looking thing. But while the amplitude
attenuation is appropriate for a flap, the sonorousness is not. The formants may dip away, but they don't 'stop' the way
they would/might in a proper flap. Then there are the transitions. All falling. So this looks (bi)labial again. Not really a
good fricative like the previous one, but an approximant-y looking fricative. And again probably labiodental over
bilabial, just because this is English.
Schwa
[ə], IPA 322
Okay, now this looks like a schwa. Formants at 500, 1500, 2500 and--well, short of 3500, but what the heck.
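Those round numbers are what the textbook uniform-tube model predicts for a neutral vocal tract: a tube closed at one end resonates at odd multiples of c/4L. The sketch below assumes a 17.5 cm tract and a speed of sound of 35,000 cm/s. Both are standard illustrative values, not measurements from this speaker.

```python
def tube_formants_hz(length_cm=17.5, speed_cm_per_s=35000.0, count=4):
    """Resonances of a uniform tube closed at one end: F_n = (2n - 1) * c / (4 * L).

    length_cm and speed_cm_per_s are assumed textbook values.
    """
    return [(2 * n - 1) * speed_cm_per_s / (4.0 * length_cm)
            for n in range(1, count + 1)]


print(tube_formants_hz())  # [500.0, 1500.0, 2500.0, 3500.0]
```

A shorter tube shifts every resonance up by the same ratio, which is one reason absolute formant values differ between speakers.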
Lower-Case N
[n], IPA 116
Another segment that has roughly the duration of a flap, although it might be just a tad long. And considering the
length, it's fully sonorant (with resonances), so probably not a tap. The attenuation therefore is probably close
articulation, and the discontinuity in the frequency/bandwidth of the formants, not to mention the apparent zero
below "F2" suggest a nasal. No evidence of velar pinch or a single F2/F3 range pole, and the pole is up around 1500 Hz,
too high to be my bilabial. Only one choice left.
Lower-Case A + Upsilon
[aʊ], IPA 304 + 321
Well, abstracting away from the first 50 msec or so, the F1 here is fairly high, indicating a fairly low vowel. The F2 starts
in the central range, and goes down, indicating increasing rounding and/or backing. So this is probably a diphthong
[aʊ]. I wish I could see the F1 dropping a little, to suggest going from low-to-high, vowel height-wise, but whatever. I
don't regard /aO/ a likely diphthong in this case.
Lower Case W
[w], IPA 170
Well, the voicing starts at about 75 msec, but the upper formants don't really kick on for another 50 msec or so. So
there's something here, something less 'open' than a vowel. But it doesn't look gappy or fricativey, so that leaves nasal
or approximant. The fact that the F1 is 'full' and not damped in any obvious way suggests approximant. The F2 starts
very (very, very) low, so this can't be a [j]. The F3 doesn't seem to be doing anything. Certainly not low enough to be an
[ɹ], it doesn't look raised either. So it looks like back and round is the only good choice.
Lower-case I
[i], IPA 301
So even though the F2 is apparently being pulled down on both sides, from the front by the preceding [w] and on the
right by whatever that transition is doing, it reaches an extremum (in this case a maximum) well above 2000 Hz, maybe
even 2200 Hz. Whenever you see a male voice with an F2 above 2200 Hz, it can only really be an [i].
So the astute spectrogram reader will be saying to itself, "[wi]. Hmm. And this is a declarative English sentence, so I'm
probably looking for some kind of NP at the beginning. Hmm."
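The vowel readings in these walkthroughs follow two rules of thumb: F1 runs inversely to vowel height, and F2 tracks frontness (low F2 meaning back and/or round). A coarse sketch follows. The 500 Hz and 1500 Hz reference points and the 150 Hz tolerance are assumptions for a male voice, and the function name is hypothetical.

```python
def describe_vowel(f1_hz, f2_hz, mid_f1=500.0, neutral_f2=1500.0, tolerance=150.0):
    """Return a coarse (height, backness) description from F1 and F2.

    All reference values are illustrative assumptions, not speaker measurements.
    """
    if f1_hz < mid_f1 - tolerance:
        height = "high"       # low F1 means a high vowel
    elif f1_hz > mid_f1 + tolerance:
        height = "low"        # high F1 means a low vowel
    else:
        height = "mid"

    if f2_hz > neutral_f2 + tolerance:
        backness = "front"
    elif f2_hz < neutral_f2 - tolerance:
        backness = "back"     # or round; F2 alone can't separate the two
    else:
        backness = "central"
    return height, backness


print(describe_vowel(300, 2200))  # ('high', 'front')
print(describe_vowel(500, 1500))  # ('mid', 'central')
print(describe_vowel(750, 1000))  # ('low', 'back')
```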
Lower-case D
[d], IPA 104
Ah, another gap. But this one should strike you as very long. So maybe something is going on here. Look at that voicing.
It's strong, as if it was really voicing and not just perseveration of the vowel's voicing. So this might be a voiced stop. Or
part of this might be a voiced stop. I arbitrarily segmented the gap along with the voicing, just cuz, but that means we
have to look at only the left-side transitions for a cue to place of this stop (since any right-side transitions will be
covered up by the proposed following stop). So the F2 seems to be rising, but that last pulse looks like it's dropped a
little. F3 doesn't seem to be doing much. F4 seems to be rising, sort of, but I'm not sure what that means. Well, at least
we know it's voiced. Probably not velar. Not amazingly labial looking, and statistically [d] is more likely than [b] post-
vocalically anyway.
So now the astute spectrogram reader (hereafter to be known as the ASR) will be thinking, "[wi] might be a pronoun,
which might be a good subject, and now we have [pʰeɪd], which might just make a decent verb. Hmm."
So on to the voiceless side of this gap thing. Wow, talk about voiceless. Big huge release followed by very strong
aspiration. How much more voiceless can you get? The noise is [s]-shaped, i.e. broad band, strongest in the very high
frequencies, so this is probably an aspirated [t].
We have something short of 100 msec of vowel where, starting from the onset of voicing and ending at that, well, let's
just call it a 'discontinuity' for the moment, just shy of 800 msec. At the beginning, the F1 is mid or low-of-mid,
suggesting something moderately high. The F2 is, well, low-of-mid and dropping, although the drop may just be
transition. F3 is just hanging out, but F4 is definitely heading downward. So all in all this looks like rounding from
something vaguely high and not at all front to something rounder or backer. Or maybe it's just transition.
The ASR at this point will be recalling that in my west-coast USA voice, /u/ is not particularly round or back, and
following coronals I will have that merged /u-ju/ thing. And this is almost definitely post-coronal.
Lower-case M
[m], IPA 114
Well, there's an abrupt discontinuity as I mentioned before, one that involves reduced amplitude and steadying of
resonant frequencies for not quite 100 msec, when about 875 msec or so there's a 'symmetrical' moment, where the
amplitude and formant movement suddenly start up again. So we've got something resonant, but of reduced amplitude
(compared to the surrounding vowels). And unlike your average approximant, the edges are quite sharp, and there's no
movement or anything happening during the 'closure'. Which is pretty good indication of a nasal. The transitions all
suggest labial, as does the relatively low pole, or whatever that is, at about 800 or 900 Hz. (My coronal pole is closer to
1000 Hz.)
Turned V
[ʌ], IPA 314
Well, without being distracted by the voicing bar, the F1 here is moving up from a middish kind of vowel to something
that is pretty definitely low. The movement may again just be transition from the preceding labial, but whatever. The
F2 is definitely low to start with and moves, well, to the mid-range. F3 and F4 just don't tell us much. So this is a mid-to-
lowish vowel of indeterminate back-to-centralness. How's that for a description?
Lower-case T
[t], IPA 103
Gap. Plosive. Rising F2, but not much indication of a lowering F3, I guess, so whatever this is it probably ain't labial and
it probably ain't velar. That and the release looks sibilant again.
Esh
[ʃ], IPA 134
Well, I was convinced earlier that this was definitely an esh, but now I just don't see it. I'm tempted to point at that F2
or F3 shaping of the noise, but there's some of that in the previous aspiration that I just called aspiration. The noise is a
little clunky, suggesting spittle more than anything else. And I just don't see the usual sign of esh-ness, which is an
absence of noise in the lower frequencies. So I just don't know. If you think that's just more aspiration, make a word out
of it and get back to me.
Lower-case F
[f], IPA 128
But there's definitely something going on beyond just aspiration, cuz otherwise it (or the fricative release of the
affricate, or whatever) would go on for almost 200 msec, which is just too long. So I think there's a qualitative change
here, in the form of the formants which sort of go away in favor of very diffuse, unshaped, unfiltered noise in this
segment. The formants that we can see all point down on the left, and on the right most of them are rising out of the
fricative, so this is all consistent with labial. Or, since this is an English labial fricative, labiodental. And voiceless, of
course. No one is thinking voiced, right?
The ASR will be going crazy trying to make a quantity (that can be 'paid') out of the sequence [t] vowel [m] vowel [t]
fricative [f], until it starts to sound it out.
Eth
[ð], IPA 131
So there's a shortish, vaguely gappy thing, but with full voicing. Could be a nasal flap, but the upper frequencies a) are
there and b) are noisy. Pretty good indicator of some kind of fricative. Or at least something oral. Transitions aren't
telling us anything. I mean nothing, in the sense of no information, and not just ambiguous. Which is often more
consistent with coronal (not to say alveolar) than anything else. So think of all the voiced, coronal, flappy fricatives you
can, and plug one in.
Turned V + Upsilon
[ʌʊ], IPA 314 + 321
Do not ask me why I have this vowel. It might be allophonic (following eth) or it might be isolated to this word, or it
might just be me watching too many Britcoms on TV. But there it is. Mid throughout. Starting just back (or round) of
neutral and moving oh-so-slightly backer or rounder. The extreme length is presumably phrase-final lengthening
(combined with some phonological lengthening, but we'll come back to that) so that's not all that odd. The loss of
amplitude doesn't really look like nasality so much as just overall loss of amplitude, again consistent with just being
phrase-final. But get as far as mid and obviously not front, and you're doing pretty well.
Lower-case Z + Under-Ring
[z̥], IPA 133 + 402
Ah, an [s]. It's broad band, it's concentrated in the very high frequencies. But it's kinda short for something that's being
phrase-finally lengthened. And it's kind of weak for a phrase-final boost. So maybe this is a [z], but devoiced. Which is
what it is. This would explain a) the obvious lack of voicing, b) the weird length (short because it's voiced and
lengthened from short because it's phrase final) and c) the incredible lengthening of the preceding vowel. Calling this a
devoiced (or voiceless) [z] is what we call an 'elegant solution'. Yeehaw.
Glottal Stop
[ʔ], IPA 113
Well, glottal stops are nonphonemic in English, but this is phonetics. There's some noise, or something, in formant-
looking frequencies, before 100 msec, something that looks like a glottal pulse or two (depending on where you look)
just after 100 msec, and then regular voicing kicks in. So unless you believe this is an [h], which I suppose it could be,
you have to account for this. It doesn't look like aspiration (unless you believe this is an [h], which it isn't), so this can't
be the release of a plosive. So if it's not an [h], and it better not be, this is just the glottal 'attack' of a vowel-initial
utterance.
Schwa
[ə], IPA 322
So the vowel here starts just after 100 msec, and goes on, sort of, until almost 200 msec. It's actually very low and back,
but I swear I hear it as a schwa and not at all like an [ɑ], which is really what this looks like to me. So if I were working
from just the spectrogram, this is an incredibly short [ɑ], and the only real reason for it to be so short is that it's
reduced. So I still might call it a schwa. But check out these formants, because they'll come back to haunt us in a
moment.
Hooktop H
[ɦ], IPA 147
Well, there's a dive in amplitude here, and an increase in the noise above 500 Hz, but if you notice, this is pretty much
voiced throughout (though perhaps only passively). The noise is sort of formant-shaped, if you know what I mean,
which is pretty characteristic of [h]. But voiced.
Lower-case B
[b], IPA 102
Well, the transitions out of the preceding vowel are definitely falling into this. The F2 transition is clearly falling below
1500 Hz, which is already lower than you might expect for an alveolar. So this is pretty clearly a [b]. It's even fully
voiced. Ignore the transitions out. They'll just confuse you.
Barred I
[ɨ], IPA 317
Well, there's something here beyond just transition between the offglide of the previous diphthong and the amplitude
dive between 750 and 775 msec. So there's a short little vowel there. Probably reduced from something.
Lower-case Z
[z], IPA 133
If you look at the very top of the visible part of the spectrogram, there's noise. It's strongest way there and trails off as
you go down in frequency. The noise doesn't seem to be well supported by the resonances, i.e. there's no formant-like
organization to the noise that is continuous with the vowel formants on both sides. So there's got to be a fairly close
articulation here, probably fricative or there wouldn't be that much noise, I suppose. And pretty much voiced
throughout. There's not a lot of really good transitional information in the surrounding vowels, but luckily the noise is
clearly [s]-shaped. But voiced.
Schwa
[ə], IPA 322
Another short, weak little vowel, this time from just before 800 msec to 850 or so. Probably reduced.
Script A
[ɑ], IPA 305
On the subject of stress, it's worth noting that the voicing striations in the following vowel are far apart, indicating
relatively low pitch. So disconnect 'high pitch' from your notion of stress, and replace it, if you must, with the notion of
'pitch accent' or 'pitch excursion'. Ennyhoo, we've got a great long vowel here. Quite high F1, so very low vowel. Very
low F2, indicating backness and/or rounding. The backness is enhanced here for contextual reasons (the velarized [l] to
follow, but that's for later). I don't have a phonemic [ɔ] or the Canadian [ɒ], so that limits the choices.
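The link between striation spacing and pitch is just the reciprocal: each vertical striation is one glottal pulse, so wider spacing means a longer period and a lower F0. A small sketch, using made-up pulse times rather than readings from this spectrogram (the function name is hypothetical):

```python
def f0_from_pulses(pulse_times_ms):
    """Estimate F0 in Hz from the times (msec) of successive glottal pulses."""
    periods = [b - a for a, b in zip(pulse_times_ms, pulse_times_ms[1:])]
    mean_period_ms = sum(periods) / len(periods)
    return 1000.0 / mean_period_ms


# Hypothetical striation times in msec.
print(f0_from_pulses([0, 10, 20, 30]))  # 100.0  (striations far apart: low pitch)
print(f0_from_pulses([0, 5, 10]))       # 200.0  (striations close together: high pitch)
```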
Lower-case T
[t], IPA 103
Well, it's mushy, but there you go. The main thing here is that there's a sudden cessation of voicing, and so approaching
the following sibilant you've got a sharp gap. Moving on.
Esh
[ʃ], IPA 134
Well, this is clearly a fricative, and probably sibilant. It's concentrated in the higher frequencies, but not really in the
highest frequencies. I don't know offhand if this is characteristic of /t/ shaping of esh noise in the affricate
(broadening the band of the noise, and pulling up the pole from the F2-F3-range cut-off) or if this is just how esh-noise
is really shaped. [s]-noise is usually concentrated well above 4000 Hz, and this noise is clearly centered between 3000 and
4000 Hz. So it's an esh, and this is an affricate. You can tell because of the sharp onset. You get a sharp onset because of
the preceding 'gap'. TMSAISTI.
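One way to put a number on 'where the noise is centered' is a spectral centroid: the amplitude-weighted mean frequency of the frication. The sketch below uses the 4000 Hz figure from this section as the dividing line between [s] and esh. Treating it as a hard boundary is an assumption, and the spectra are invented for illustration.

```python
def spectral_centroid_hz(freqs_hz, amplitudes):
    """Amplitude-weighted mean frequency of a (coarse) spectrum."""
    total = sum(amplitudes)
    return sum(f * a for f, a in zip(freqs_hz, amplitudes)) / total


def guess_sibilant(centroid_hz, boundary_hz=4000.0):
    """[s] if the noise is centered above boundary_hz, otherwise esh.

    boundary_hz is an assumed cutoff based on the figures in the text.
    """
    return "s" if centroid_hz > boundary_hz else "esh"


# Hypothetical noise spectra: frequency bins in Hz, relative amplitudes.
esh_like = spectral_centroid_hz([2000, 3000, 4000, 5000], [1, 4, 4, 1])
s_like = spectral_centroid_hz([3000, 5000, 7000], [1, 2, 1])
print(esh_like, guess_sibilant(esh_like))  # 3500.0 esh
print(s_like, guess_sibilant(s_like))      # 5000.0 s
```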
Epsilon
[ɛ], IPA 303
Lowish vowel, but not as low as it could be (as indicated by the high F1), basically neutral F2. Not much going on in F3.
So frankly, this looks like an [a]. But it ain't. I've decided the attenuation that I always get in the middles of final
(stressed) vowels is just the low boundary tone. If this were further back (or rather 'earlier') in the utterance, I'd swear
there had to be a lateral or something here. But since we can attribute the amplitude change to the pitch excursion, we
should.
Lower-case S
[s], IPA 132
Nice little bit of noise. Centered way up above where it was in the esh. So there.
Lower-case T
[t], IPA 103
Well, there's a real gap. What makes this look like a [t] is the release/aspiration phase. It's [s]-shaped, indicating
alveolar airflow.
This page last modified: 11/08/2009 22:57:23
Egad, this turns out to be a hard spectrogram, because there's a dearth of positive clues, and a lot of ambiguity.
Welcome to the real world, gang.
Ash
[æ], IPA 325
Lowish vowel (higher than 500 Hz F1), not outrageously back (or the F2 would be lower), so this could be something
lower than mid and fronter than, well, back. Good candidates are [æ] or [ɛ].
Lower-case T
[t], IPA 103
Nice little gap (okay mushy in the lower frequencies, but whatever), apparently voiceless with a nice sharp release. So
this is almost undoubtedly a voiceless plosive of some variety. The release is a little ambiguous, being strongest in the
F3 region. A little low for alveolar, a little high for bilabial or velar. The transitions aren't really helping. No obvious
downtrends like bilabials, no obvious pinch like velar, no obvious lift to F3 like alveolars. But the F2 doesn't seem to
move at all, and it's hovering sort of above 1500 Hz. This is near the locus for alveolar transitions, but it would be nice to
have some serious positive evidence. How about this. How many words like 'thep' or 'thek' can you think of?
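The three transition cues listed here (downtrends for bilabials, a pinch for velars, a lift for alveolars) can be sketched as a decision rule on F2 and F3 measured at the vowel midpoint and at the edge of the closure. This is a deliberately crude illustration. The function name, the 100 Hz minimum change, and the example values are assumptions, and a flat transition comes back as 'ambiguous', which is about where this plosive leaves us.

```python
def guess_stop_place(f2_mid, f2_edge, f3_mid, f3_edge, min_change_hz=100.0):
    """Guess plosive place from F2/F3 movement toward the closure.

    Positive change means the formant rises into the gap.
    min_change_hz is an assumed threshold for a 'real' transition.
    """
    d2 = f2_edge - f2_mid
    d3 = f3_edge - f3_mid
    if d2 >= min_change_hz and d3 <= -min_change_hz:
        return "velar"      # F2 up, F3 down: the 'pinch'
    if d2 <= -min_change_hz and d3 <= -min_change_hz:
        return "bilabial"   # everything trends down into the gap
    if d2 >= min_change_hz or d3 >= min_change_hz:
        return "alveolar"   # rising transitions, no pinch
    return "ambiguous"


print(guess_stop_place(1500, 1500, 2500, 2500))  # ambiguous
print(guess_stop_place(1400, 1600, 2400, 2600))  # alveolar
print(guess_stop_place(1500, 1200, 2500, 2300))  # bilabial
print(guess_stop_place(1600, 1900, 2600, 2200))  # velar
```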
Schwa
[ə], IPA 322
Short vowel, mostly transition in F2. Call it schwa and move it on.
Lower-case D
[d], IPA 104
Well, if I didn't know better, I'd suggest this was a nasal. But it's not. It's not really resonant enough, considering how
voiced it clearly is. But good guess. So if it isn't a nasal, it must be a plosive. I guess. Again, transitions aren't telling us a
great deal, except that they're 'consistent' with alveolar. At least it can only be voiced.
Lower-case B
[b], IPA 102
Ditto this, as voicing is concerned. This is a long, long stretch for a single voiced consonant. There also seems to be
some kind of amplitude discontinuity just before 700 msec, if that means anything. The release (at about 750 msec) is a
little clean to be alveolar, even though it seems to be broad band and concentrated, if anything, in the high
frequencies. But the transitions into the following vowel are totally inconsistent with that. The F2 clearly starts quite
low, and the F3 and F4 all definitely point down (toward the plosive, i.e. they rise into the vowel), which is most
consistent with bilabials.
Lower-case I
[i], IPA 301
Well, the F1 doesn't move a lot, it seems to stay sort of low. So this is a fairly high vowel throughout. The F2 extremum
(800 to 825 msec) is way high, at least 2200 Hz. The only thing that ever gets that front is [i]. And I don't usually produce
offglides that front. So this is probably [i] and not [ei] or something like that, and the F2 movement is entirely
coarticulation. That's my story.
Lower-case W
[w], IPA 170
Well, the swooping F2 can only mean extreme backness and rounding. The loss of amplitude in the higher frequencies
suggests initial /w/, although there's really nothing here to rule out a full-on [u], since a very tightly-rounded high [u]
could damp the higher frequencies like this. The fact that there's another vowel on the other side might sway the
decision.
Turned V
[ʌ], IPA 314
This is short, and could be another schwa, but the F1 definitely approaches the lower vowel range (compare it with the
first vowel in this spectrogram), and the F2 doesn't rise the way it might. So this is fairly back, even at the end.
Lower-mid, and back.
Lower-case N
[n], IPA 116
This is a nice little nasal. Sharpish edges, but fully voiced and clearly resonating. Nice little zero at about 750 Hz (and a
few more as you go up). The pole is up around 1200 Hz or so. This is a little low for my [n], but is a little high for my [m].
No apparent velar pinch, nor bilabial transitions, so in the end this is consistent with [n]. If it were a little shorter, it
would look like a nasal flap, so it would definitely be alveolar.
Schwa
[ə], IPA 322
Short vowel. Moving on.
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
So starting at the moment before 1400 msec where everything changes, F1 seems to be mid, and drifts downward a little. So
this vowel goes from mid to higher-mid. The F2 starts sort of low and gets lower. So there's movement from backish to
backer, and/or roundish to rounder. Maybe both. Ignore the F3 transition, which is just transition. Mid and backish to
higher and rounder. It's worth noting, I guess, that there really seem to be two targets here, and two different pitches of
voice too.
Lower-case T
[t], IPA 103
Ah, gaps. Well, this one isn't bad, from the point of view of transitional information. There's a nice little closure
transient, for once, between 1525 and 1550 msecs, and the good news is that the energy in F3 is definitely higher than it
was when the harmonics in F3 seemed to turn off. So what we have here is a rising transition. F2 rises as well, and
that's pretty typical of alveolars. That and the following fricative pretty much make this a dead cert.
Esh
[ʃ], IPA 134
Sibilant fricative, which by the way means it's high amplitude and high frequency. Actually, I think 'sibilant' means that
it's produced by directing a jet of air at the teeth, but this is acoustic phonetics, not articulatory. This fricative, unlike
the prototypical /s/, is not obviously highest in amplitude in the very high frequencies, but definitely has lower
frequency centers, in the F2 and F3 region. Also the energy drops off sharply below the F2 region, leaving basically no
energy below 1500 Hz or so. Typical of Esh. Note also the sharp onset upon release of the preceding plosive. So this and
the preceding plosive together are an affricate [tʃ].
This page last modified: 11/08/2009 22:57:21
Turned R
[ɹ], IPA 151
This is what I call a type DA /r/. Well, first things first. The F1 is quite low, the F2 is pretty low, but that F3 is just
freakin' *low*. So it's an /r/. My type D is one that has no steady state, i.e. the F3 is absolutely always moving and seems
to start at its minimum (instead of having an extremum in the middle of something). It's type A in the sense that it has
three serious formants, and a clear duration prior to the kicking in of the upper formants (which I take to be the
'beginning' of the vowel, or the 'release' of some constriction or other associated with the /r/).
Lower-case O
[o], IPA 307
Okay, so once the F2 decides where it's supposed to be going we've got a moderately flat F2, overall. The F3 is just too busy
trying to get back up to where it thinks it was supposed to be all along to tell us anything useful. The F1 is sort of lower
than for a basic mid vowel, but it's definitely higher than it was where it started in the /r/. So mid-to-high vowel.
Mostly back and/or round.
Lower-case D
[d], IPA 104
Voiced plosive. Clear voicing bar lasting quite a while, and no resonance. Transitions (from the [o]) pointing up.
Probably coronal.
Lower-case S
[s], IPA 132
Well, this is a little short fricative. The very high frequency noise is suggestive of an /s/, even though its duration and
overall intensity aren't all that compelling. Still, there's a plosive following, so maybe that's hiding. Also, it's an
incredibly weak position. But whatever. The duration and weakness may be indications of the 'underlying' /z/ (by
which I mean the phonologically predicted [z] allophone of the plural marker), and this may be better transcribed as a
de-voiced [z]. But I didn't.
Schwa
[ə], IPA 322
Tiny short vowel, transcribed as schwa and otherwise not worried about. I'm glad I noticed this. I probably should have
marked it as a barred-i, but it was all I could do to notice it was there.
Lower-case N
[n], IPA 116
From about 375 to about 550 msec or so, there's some serious voicing going on. Most of it has formants and is therefore
resonant. So this is some kind of sonorant. The formants actually look pretty good, but they are separated by areas of
no energy, indicating the presence of zeroes, as in a nasal. So probably that's what this is. The F2 is at about 1500 Hz,
which is about where my F2 usually is for [n].
Lower-case B
[b], IPA 102
This is a shortish gap, but enough to have a clear release to it. The nasal isn't doing much in terms of transitions, but
that's the nature of nasals. For place information we'll look at the burst and the following transitions. And it looks to
me like those transitions are pointing down (that is, rising, out of the gap), which indicates bilabial. Of course that's just
a guess, since if you look at where the F2 and F3 end up, it's not like they could be heading anywhere but up out of the
gap. But they look sort of smooth, so I'll say bilabial. If they were coronal, the F2 would start a trifle lower, and if they
were velar, I'd think the F3 might start a little lower.
Lower-case I
[i], IPA 301
Well, we've got a low, low F1, and a high, high F2. So this is [i].
Glottal Stop
[ʔ], IPA 113
Well, not so much a stop, in the sense of a gap, and certainly not plosive in the usual sense, but the absence of a voicing
bar (sort of) and the irregular pulse pattern in the upper frequencies is indicative of creak or glottality or whatever you
want to call it. So we've probably got a vowel-initial word coming up, probably phrase initial too.
Lower-case S
[s], IPA 132
Now that is a fricative. Look at that. Probably longer than 100 msec of voiceless fricative, with more noise on either side.
The noise is extremely broad-band, and centered in the higher frequencies. This suggests [s], which is what I'll say it is,
but frankly with something this long I'd expect the amplitude, especially in the higher frequencies, to be a lot stronger.
Then again, maybe it is, off the top of the spectrogram. Or would have been if we hadn't low-passed before sampling.
Nyquist, you know.
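The Nyquist remark can be made concrete with a little arithmetic. This is only a minimal sketch: the page never says what sampling rate was used, so the 11025 Hz below is an illustrative assumption, and both function names are made up for the example. The point is that anything above half the sampling rate cannot be represented, and without the low-pass filter it would fold back down into the visible range instead of just disappearing.

```python
def nyquist_hz(sample_rate_hz: float) -> float:
    """Highest frequency that can be represented at this sampling rate."""
    return sample_rate_hz / 2.0


def aliased_frequency_hz(signal_hz: float, sample_rate_hz: float) -> float:
    """Where a pure tone would show up if sampled WITHOUT low-pass filtering."""
    folded = signal_hz % sample_rate_hz
    # Frequencies above the Nyquist limit mirror back below it.
    return min(folded, sample_rate_hz - folded)


# Assumed sampling rate, for illustration only (not stated on the page).
rate = 11025.0
print(nyquist_hz(rate))                    # upper limit of the spectrogram
print(aliased_frequency_hz(7000.0, rate))  # a 7000 Hz component folds down
print(aliased_frequency_hz(3000.0, rate))  # below the limit: unchanged
```

So very high frequency [s] noise above the limit is simply gone from the picture once the signal has been low-passed, which is why the amplitude may look weaker than expected.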
Lower-case I
[i], IPA 301
Another incredibly low F1 accompanied by an incredibly high F2. (By incredibly high, I mean well up above 2000 Hz, at
least in my voice). But then after reaching a max around 1150 msec, it starts to drop. F1 seems to moderate about that
moment as well, so I'm calling that a separate segment. The pattern of the movement is just not characteristic of a
transition, so this must be a separate thing. He says.
Barred I
[ɨ], IPA 317
So what is it? Heck if I know. The F1 is still sort of low, but the bandwidth has changed. The F2 is moving without any
indication of trying to get anywhere in particular either in terms of having an 'inflection point' or even a place where it
starts to slow down as it approaches its target. Almost as if it doesn't have a target. Which is one of the descriptions of
vowel-reduction (or rather the acoustic manifestation of vowel reduction) in English. This one I have marked as a
barred-i, because it seems to me the F2 stays above 1500 Hz, and the F1 is always below 500, so the F2 is always closer to
F3 than F1, and following Keating et al. (1994), I transcribe it as barred-i.
Lower-case N
[n], IPA 116
This one is less obviously zero-ey than the preceding one, but you can still see the total reduction in amplitude
characteristic of nasals. I'd probably mistake this one for an [m], since what we can see of the F2 is low (just above 1000
Hz). But the F3 transition into the following vowel doesn't look bilabial at all. I think this is actually a nasalized
flap kind of thing, and at the tail end (as the tongue is retreating) the oral resonance can be shaped by the lip rounding (in
preparation for the following sound). But I'm not sure what's actually going on here. Definitely a nasal, and, well, you'll
get further if you guess [n] than [m], if you are trying to make a sentence out of all this.
Lower-case W
[w], IPA 170
Well, let's start with the F3. Looks like it's rising. Don't ask me why. The following vowel seems to have a neutral F3 and
an F4, where the F4 is continuous with this F3. So I choose to ignore the evidence of the F3 and look at the rest of it. The
F2 is about as low as it can possibly get, well down below 1000 Hz. So this must be as back and as round as anything I can
produce. The transition out from this minimum is pretty straight, which is more characteristic of a /w/ retreat-from-rounding
transition than anything else. The raised F3 might indicate [l] (in this case, a dark [l], of course), but since I
can't tell if that's a raised F3 or a really low F4 (conceivably consistent with rounding) I'll again ignore the F3 and just
assume this is a [w].
Barred I
[ɨ], IPA 317
Short vowel. This one is even more schwa-like than the one I marked as a schwa. What was I thinking?
Lower-case N
[n], IPA 116
It might be easy to miss, but there's resonances at F2 and F3 above the voicing bar/F1 thing. If those were absent, I
might be inclined to regard the F1 thingy as perseverative voicing into an oral plosive. But the upper resonances are
there, so there must be a nasal in here. Again, the F2 is up around 1500 Hz, indicative of [n].
Small Capital I
[ɪ], IPA 319
So looking at the resonances, there's an F1 somewhere below 500 Hz. Not outrageously low, but definitely lower than
mid, which makes this a higher-than-mid vowel. But not outrageously high. F2 is nice and high, up around 1750 or 1800
Hz, indicating something quite front, but not outrageously front.
Lower-Case S
[s], IPA 132
Well, this at least we know is a fricative. It's fairly broad band and concentrated in the very high frequencies. It's a little
suspicious in that you can see the F2 travelling through it, and there isn't much in the way of energy below the F2
resonance. Usually that kind of drop off in the amplitude profile is characteristic of an Esh, but the center of the energy
here is just too high for that. So it must be an [s].
Esh
[ʃ], IPA 134
On the other hand, this is more plausibly an Esh. The center of the energy is still a little high, but at least it's in the
visible frequencies. (By the way, in case anyone is wondering, I regard the 0-4500 Hz range as the 'visible frequencies' just
because it's what I'm used to looking at. Most linguistic information is below 3000 Hz for men and about 4000 for women,
except for the odd high center and noise information like here.) I'd like somebody to check these "assimilations",
maybe in Lisa Zsiga's work, about whether this is how these sequences look--the alveolar loses its low frequencies and
the postalveolar's center is pulled up. But that's how it looks to me.
Lower-Case O
[oʊ], IPA 307 + 321
Well, there's a mid-looking F1 for you. And pretty freaking flat too. F2 starts, well, sort of neutral--I think this is the
frontness of the preceding fricative at war with the backness/rounding of the vowel, and gets slightly backer and
rounder. I've really only got one vowel that does this, and it's /o/. So there you go.
Lower-Case Z
[z̥], IPA 133 + 402
Now this is what an [s] looks like. Well, actually, [s]s are longer and louder. This is a devoiced [z]. Note the sibilant
pattern, but short.
Lower-Case P
[p], IPA 101
Well, there's a nice little gap. You can see a closing transient at about 725 msec, which is interesting. Note that it's only
really obvious as a blip at the bottom. That's going to be important. Just before 800 msec, there's a release--it's very
weak, and noiseless, so it's probably not coronal. It's way too clean to be a velar release. So that leaves (bi)labial, which
gets further support from the low frequency clunk at the closure. There's not a lot that clunks down there at the low
frequencies except labial stuff.
Barred I
[ɨ], IPA 317
Vowels that are this short and relatively low amplitude are almost always a) lax, b) reduced and/or c) just plain not
worth bothering with. Take it as some version of schwa and move on.
Lower-Case N
[n], IPA 116
Well, following that moment of obvious vowel, whatever it was, there's some more voicing, but much weaker resonances,
and there's something 'discontinuous' about the resonances. This should be ringing some kind of bell. Sonorant,
weaker-than-vowel resonances, nice little zero down there around 800-900 Hz... So this must be a nasal. If you know
my voice, this can't be [m], because what resonance there is is up about 1400-1500 Hz, and my [m] resonance is closer to
1000-1100 Hz. Eng is unlikely, given that F2 transition in the barred I or whatever it is. Not only is there no evidence of
velar pinch, there's just no way for it to be compatible with velar pinch.
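The pole-frequency reasoning above can be written down as a rough rule of thumb. This is a sketch only: the ranges (roughly 1000-1100 Hz for [m], 1400-1500 Hz for [n]) are the ones quoted for my voice and will not transfer to other speakers, the function name is hypothetical, and velars are deliberately left out because the cue for those is velar pinch in the transitions, not the pole.

```python
def guess_nasal_place(pole_hz: float) -> str:
    """Guess [m] vs [n] from the nasal pole alone (speaker-specific ranges)."""
    if pole_hz <= 1100.0:
        return "m"          # pole around 1000-1100 Hz or below: bilabial
    if pole_hz >= 1400.0:
        return "n"          # pole around 1400-1500 Hz: alveolar
    return "ambiguous"      # in between: look at transitions instead


for pole in (1050.0, 1200.0, 1450.0):
    print(pole, guess_nasal_place(pole))
```

A pole in the gap between the two ranges is exactly the case where the transitions and top-down knowledge have to do the work.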
Lower-Case S
[s], IPA 132
See how much longer this fricative is? I wish it were higher in amplitude, but oh well. But it has the typical profile of an
[s].
Epsilon
[ɛ], IPA 303
Well, it's tough to see the F1, but I'm thinking it's that thing that moves sort of upward from about 550 Hz at about 1100
msec to about, oh, 750 Hz about 50 msec later. Which makes this a lower-than-mid vowel, but in no way 'low'. F2 starts
sort of neutral but is being drawn down by the low target in the following segment. So wishing really, really hard, you
come to believe this is a lowish, but not amazingly backish vowel. So that would make it Ash or Epsilon. And this isn't
really long enough to be Ash.
Tilde L (Dark L)
[ɫ], IPA 209
So there's no zero, at least in the low frequencies, but a distinct loss of amplitude. So this isn't a nasal, but there's
something here that's relatively 'close' and damping the amplitude. Sonorant consonants spring to mind. That low F2
suggests something back, which suggests [w] or dark [l]. I'd hope the F2 of [w] would get lower than this, but you can't
always count on that. But [w] wouldn't have such a weird (and asymmetrical) effect on the onset transition (compared
to the offset transition). In a perfect world the F3 or F4 would be raised to tell you this really was dark [l], but in life
there is ambiguity.
Barred I
[ɨ], IPA 317
And here's another short vowel, and again it's mostly transition. Skip it. After you notice the velar pinch.
Eng
[ŋ], IPA 119
And front velar pinch at that! With a nice little zero, fuzzy F1 especially on the left. So probably nasal, likely a coda
nasal, and almost definitely velar.
Lower-Case A
[aʊ], IPA 304 + 321
Well, the F1, what we can see of it, is pretty high, indicating a very low vowel. Starts frontish and ends quite back and
round. And this is actually quite long. I wish my voicing didn't die out like it always does on these final falling
intonation thingies, or you'd see the F1 and F2 targets toward the end more clearly. But of the three 'true' diphthongs,
only one goes from front to back.
Lower-Case T
[tʰ], IPA 103 + 404
Someone pointed out to me recently that I've been marking these as 'aspirated', when in fact they're just strongly
released--'aspiration' by definition is VOT, and you can't have VOT without some V coming on at some point. And since
this is utterance final, this is just a release. But the release is [s] shaped, which is typical of alveolar releases (for extra
credit, explain why). And voiceless, of course.
This page last modified: 11/08/2009 22:57:25
Lower-case J
[j], IPA 153
Well, I don't know if this is a separate moment or not, but the first few glottal pulses in this thing are greater in
amplitude (or the bandwidths of the formants are wider, or something) than in the transition/vowel thing. So I
segmented it. The F1 is quite low, as F1s go, and the F2 is up above 2000 Hz, up in [i] territory. Coming before a vowel or
something, I follow IPA common practice and transcribe it as an approximant.
Barred U
[ʉ], IPA 318
I'm not sure if I've ever used this symbol before in my own voice. I'd ordinarily transcribe this vowel as a barred-i or
something, but I'm pretty sure it's rounded or rounding, partly in deference to the underlying rounding I usually lose
for /u/, but also in anticipation of the following bilabial. F1 is still low (high vowel), round(ing) but not at all back.
Belted L
[ɬ], IPA 148
Okay, I transcribed this in the spectrogram as a dark l, which it is, but it's also vastly voiceless (due mostly to the
aspiration of the preceding plosive), so as long as I was playing with my new Unicode markup database, I figured I'd go
for yet another symbol I'm not sure I've ever used before. Belted-l represents a voiceless lateral approximant and/or a
voiceless lateral fricative. More than one person has argued that there is not and cannot be a contrast between those
two things, so this symbol seems to have avoided the IPA's attempt to disambiguate its approximant and fricative
symbols. Now, how do I know there's anything here at all. First, there's the matter of the aspiration, which seems both
long and loud for just plain aspiration. Second, the F3 in the release is at about 2750 Hz, which is distinctly higher than
the 2550 Hz it is in the vowel. Which for me is as much evidence of an [l] as I'm likely to get. Positing a dark /l/ (or a [w],
I guess) here will also allow me to explain the otherwise weird displacement of the transition into the following vowel
rather than having it happen all at once on the release of the plosive. TMSAISTI.
Lower-case D
[d], IPA 104
Well, another one of these mushy gappy things, this one looking more voiced than the other one. The F3 transition is kind
of ambiguous, as is the F2, but the F2 transition clearly 'stops' before it gets too far below 1800 Hz, suggesting an
alveolar locus. But....
Ash
[æ], IPA 325
I've always been bothered by the spelling of 'ash', since it's the 'English' spelling/calque of aesc. Which brings up
another point. I use the symbol names from Pullum & Ladusaw (1996). The IPA does not have official names for most of
its symbols. Unicode very carefully names each of its symbols, with long descriptive names that are meant to avoid
as much ambiguity as possible. I think they call this 'Latin small letter ae' or 'Latin small ligature ae'. I'll keep using
'ash', after P&L, much as I hate it. Anyway, we've got something that approaches very low for a vowel (high F1) round
about 750 msec or so, but the F2 indicates something not at all back. I'm wondering how often I see that falling F3 thing
during my /ae/s. I think I see it a lot, but I'm not sure.
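The Unicode name question is easy to check rather than guess at. A minimal sketch using Python's standard `unicodedata` module (the helper name `unicode_name` is just for this example): for the ash it reports LATIN SMALL LETTER AE. The other symbols in the list are ones used on these pages; their names are printed rather than asserted here.

```python
import unicodedata


def unicode_name(symbol: str) -> str:
    """Return the official Unicode character name for a single symbol."""
    return unicodedata.name(symbol)


# Ash, barred i, barred u, belted l, turned script a
for symbol in ["\u00e6", "\u0268", "\u0289", "\u026c", "\u0252"]:
    print(f"U+{ord(symbol):04X}  {symbol}  {unicode_name(symbol)}")
```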
Lower-case T
[t], IPA 103
Ah, gaps. This one seems to be slightly pre-glottalized, or at least comes at the end of something with very low pitch.
Now that I look at them again, the transitions are a little ambiguous, which is a good indicator that 'alveolar' is as good
a guess as anything else. That and the glottalization, which is more prominent/common with alveolars than other
places. There's a hint of a release at about 900 msec up at the top of the spectrogram, which I took to be indicative of
plosion.
Lower-case S
[s], IPA 132
Meanwhile, after all that plosion, there's some herkin' fricative going on. Very long, quite high in amplitude, very broad
band and concentrated in the highest frequencies. Very typical of [s].
Script A + Tilde
[ɑ̃], IPA 305 + 424
Well, the nasalization is not represented by the usual zero, but just by the general fuzziness of the formant structure.
Which is not helped by the high frequency F0, but there you go. The F1 you can see is about as high as it can get, and
the F2 is about as low as it can get and still be F2, if you follow me. So this is a very low, very back vowel. But, unlike my
Canadian colleagues, not round. How you can tell that I have no idea, since I don't have [ɒ] for you to compare it with.
(Hmm, on my browser, in my preferred font (Gentium) this symbol isn't popping up. It's turned-script-a, or supposed to
be.).
Eng
[ŋ], IPA 119
Well, there's definitely a change in amplitude, along with the wiping out of F1, both pretty typical of nasals. It doesn't
look at all velar (compare the velar transitions for the following consonant), but this is where top down knowledge of
English will come in handy--bilabial [m] is unlikely here, and [n] would likely flap in this environment.
Barred I
[ɨ], IPA 317
Short vowel, F2 closer to F3 than F1. If I had to call it a real vowel, I'd have called it small-cap I, but given the H pitch
accent on the preceding vowel and the length of the following, I'm betting we should see this as stressless, if not
reduced, regardless of what we think it's supposed to be.
Lower-case K
[k], IPA 109
Well, you can't get any more velar, and front velar at that, than this. Mostly voiceless gap, with a short VOT.
Epsilon
[ɛ], IPA 303
Well, this looks like another [ae]. It's hard to tell exactly what the F1 is doing since it looks like it's headed straight up in
the picture, but you can also see the bandwidth fuzzing out on you (an indicator of nasality which obviously I missed
the first time around when I did the transcription), so it could be doing just about anything. Well, anything mid-to-low
and not at all back.
Lower-case N
[n], IPA 116
Transitions aren't helpful again, and there isn't enough information in terms of poles to tell what's going on. There's
definitely something long and voiced here, probably sonorant judging by the regularity of the voicing, but beyond that
I have no idea. In the absence of a better guess, pick the alveolar.
This page last modified: 11/08/2009 22:57:23
Ash
[æ], IPA 325
Nice clear formants. F1 is very high, let's say 800 Hz or so. The F2 is a hair lower than neutral, let's say 1300 or so. Not
quite low enough to be really, really back, but not really what you'd call amazingly front. I'll have to listen to this vowel
again, but I'd say this was pretty central(ized), judging from the F2 frequency. Remember how centralized, relative to
other dialects, this vowel is in the western US.
Lower-Case F
[f], IPA 128
This is an interesting lesson in acoustics. The periodicity in F3 seems to leave off at about 225 msec. But the voicing
doesn't end for almost 50 more msecs. So what seems to be happening is that as the constriction increases, the upper-
frequency harmonics are getting suppressed. I'm not sure what lip-teeth compression does to radiation, but I'm
wondering if there's either an acoustic or an aerodynamic change in spectral slope here. Anyway, there's friction
starting about 225 msec or so, and clearly voiceless friction starting 275 msec or so. It's not particularly loud friction, so
this isn't sibilant. There's some organization in the resonant frequencies, but not the kind of support you'd get with [h].
So this is probably a voiceless labiodental or interdental. I'm not sure how to tell the difference. The formant
transitions aren't really giving us much information. Odds are against the interdental, just because it's a coda of a
stressed syllable. I think.
Lower-Case T
[t], IPA 103
Nice little voiceless gap from about 300 msec to the release just after 350 msec. Interestingly enough, there seems to be
an alveolar-shaped (that is, broad band, tilted to the very high frequencies) *closure* transient. There's a nice sharp
release at about 350-375 msec or so. The release is a little odd, centered in the F3 region, or possibly showing signs of
F3/F4 pinch. F3/F4 pinch is sometimes associated with dentality (velar pinch is F2/F3), but I haven't seen it enough to
be sure about its value as a cue. But the center frequency is a little low for an alveolar burst, and might be getting into the
velar-burst range. But there's no involvement with F2, which you'd expect with a velar, there's no pinch, and the burst is
sharp and fairly clean--not at all mushy or doubley-looking. So this is an alveolar burst. The lowness of the center
might have to do with the upcoming low F3 (i.e. a long front cavity?) or liprounding (i.e. a long front cavity?). But I
don't know.
Barred I
[ɨ], IPA 317
Tiny short vowel, barely four or five pulses long. We don't waste a lot of time on these. Transcribed as a reduced vowel,
following Keating et al (1994), barred-i iff the F2 is closer to the F3 than the F1, schwa elsewise.
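The reduced-vowel convention is simple enough to state as a comparison. A minimal sketch of the rule exactly as described above (barred-i if and only if F2 is closer to F3 than to F1, schwa otherwise); the function name and the formant values in the examples are made up for illustration, not measured from any spectrogram here.

```python
def reduced_vowel_symbol(f1_hz: float, f2_hz: float, f3_hz: float) -> str:
    """Pick the transcription for a reduced vowel from its formants."""
    distance_to_f3 = abs(f3_hz - f2_hz)
    distance_to_f1 = abs(f2_hz - f1_hz)
    # Barred-i iff F2 sits closer to F3 than to F1; schwa elsewise.
    return "\u0268" if distance_to_f3 < distance_to_f1 else "\u0259"


# Illustrative values only.
print(reduced_vowel_symbol(450.0, 1600.0, 2500.0))  # F2 nearer F3
print(reduced_vowel_symbol(500.0, 1200.0, 2500.0))  # F2 nearer F1
```

With F1 below 500 Hz and F2 above 1500 Hz, F2 will always come out closer to an F3 near 2500 Hz, which is the shortcut used in the longer discussion of barred-i.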
Lower-Case N
[n], IPA 116
On the other hand, the voicing continues even though at about 600 msec the amplitude takes a sharp dive. This is a nice
nasal-y looking thing. Reduced overall amplitude, reduced formant amplitudes, and a nice clean zero between the lower
resonances. The F1 is mostly neutral or low of neutral, typical of nasals, and the F2 is nice and high (relative to nasal
poles) at about 1500, which in my voice is a very nice, clean alveolar [n]. (Velars show more F2/F3 pinchiness than we
see here, and labials always have their pole much lower, around 1000 Hz or even just below.) That "clunk" at about 625
msec (in the F2, and from the F3 all the way up) is a phenomenon known technically in the biz as a "clunk". Clunks can
happen any time, but for some reason they often happen in nasals. They're due to something viscous (saliva or some
other fluid somewhere in the vocal tract) flying around somewhere at the wrong moment. Distracting in a spectrogram,
but so obviously an anomaly (unless it happens where you might be wondering if it's a release transient or something)
that they can safely be ignored.
Lower-Case S
[s], IPA 132
So, even if you're a beginner, you should at this point be able to tell that there's something going on from about 750
msec all the way to 900 msec. It's voiceless (no striations at the bottom). It's noisy--the energy is snowy and random,
not organized in nice striations. It's very broad band--there's no formanty-organization. It's centered (darkest) in the
very high frequencies. So this is very loud (dark), very high pitched (as noise goes) and very long. Sounds like a
classic sibilant to me. In fact clearly an [s]. Even though the energy cuts off (sort of) below F2, if this were an esh the
noise would be centered lower down, in the F3 F2 range, down to the cut-off frequency (around F2).
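"Centered" can be given a number if you want one. One common way, assumed here and not something used to make these spectrograms, is the amplitude-weighted mean frequency (spectral centroid) of the noise. The sketch below uses invented spectra purely to show the direction of the difference: energy piled up at the top gives a higher center, as for [s], and energy piled up around F2/F3 gives a lower one, as for esh.

```python
def spectral_centroid_hz(spectrum):
    """Amplitude-weighted mean frequency of (frequency_hz, magnitude) pairs."""
    total = sum(magnitude for _, magnitude in spectrum)
    if total == 0:
        raise ValueError("spectrum has no energy")
    return sum(freq * magnitude for freq, magnitude in spectrum) / total


# Invented magnitudes, for illustration only.
s_like = [(2000.0, 1.0), (3000.0, 2.0), (4000.0, 5.0)]    # strongest at the top
esh_like = [(2000.0, 5.0), (3000.0, 2.0), (4000.0, 1.0)]  # strongest lower down

print(spectral_centroid_hz(s_like))
print(spectral_centroid_hz(esh_like))
```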
Turned M
[ɯ], IPA 316
If you are a beginner, or if you're not familiar with the west coast US vowel system (or Japanese...) this vowel will
mystify you. But I'll try to explain. Starting about 900 msec all the way to about 1150 or 1175 msec, there's some very
high pitched voicing going on. The F1 (lowest formant) is sort of low, at least lower than neutral (around 500 Hz), so this
vowel is higher-than-mid. The F2 starts basically neutral (near 1500 Hz), maybe a little lower (backer) and moves a little
lower (backer). F3 is nice and flat in more or less its neutral range (about 2500 Hz) as is the F4. So we've got a highish,
central-to-back and moving backer-or-rounder vowel. So this is my /u/. There's nothing particularly round about it, or
alternatively it might be round but then there's nothing particularly back about it. So take your pick. I've transcribed it
as back and unround, but that's my intuition, not anything measurable. In southern California, the primary effect is
definitely unrounding, although the 'centralizing' of the F2 is achieved in other dialects of US English by centralizing
the tongue but maintaining rounding. Go fig.
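The "compare to neutral" reasoning used for vowels throughout can be sketched as a couple of comparisons. The neutral values (F1 around 500 Hz, F2 around 1500 Hz) are the ones quoted above; the tolerance bands and the function name are assumptions added for the example, since where "sort of low" ends is a judgment call. Note that a low F2 cannot distinguish backing from rounding, which is the whole problem with this vowel.

```python
NEUTRAL_F1_HZ = 500.0
NEUTRAL_F2_HZ = 1500.0


def describe_vowel(f1_hz, f2_hz, f1_tolerance_hz=100.0, f2_tolerance_hz=200.0):
    """Rough height and backness labels relative to neutral formant values."""
    # Low F1 means a high vowel; high F1 means a low vowel.
    if f1_hz < NEUTRAL_F1_HZ - f1_tolerance_hz:
        height = "higher than mid"
    elif f1_hz > NEUTRAL_F1_HZ + f1_tolerance_hz:
        height = "lower than mid"
    else:
        height = "mid"
    # High F2 means front; low F2 means back and/or round.
    if f2_hz > NEUTRAL_F2_HZ + f2_tolerance_hz:
        backness = "front"
    elif f2_hz < NEUTRAL_F2_HZ - f2_tolerance_hz:
        backness = "back and/or round"
    else:
        backness = "central"
    return height, backness


# Illustrative values only.
print(describe_vowel(300.0, 2200.0))
print(describe_vowel(800.0, 900.0))
```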
Eth
[ð], IPA 131
Well, the energy in the very low frequencies is 'voicing bar', even though the frequency is such you can't really see the
individual striations. So it's voiced, whatever it is. It could be a mushy stop, but the noise isn't really organized the way
I'd expect. So it's probably a fricative. Voiced. Definitely not sibilant. So again we've got something that is most likely
labiodental or interdental. Here, the transitions are being a little more helpful. There's definitely a 'lift' in F4, and no
evidence of anything remotely labial about any of the transitions. So on the balance, the (inter)dental is more likely
here based on the cues, although it's pretty unlikely statistically. That should make this word really easy to identify--no
near neighbors... ;-)
Schwa
[ə], IPA 322
I probably should have transcribed a glottal stop in here, as there's definitely some creaky voice going on here. But oh
well. The vowel here is sort of short and the creakiness doesn't make it any easier. So in the end, given the great length
of the preceding and following vowels, I'd say this was reduced and move on.
Lower-Case N
[n], IPA 116
Well, it's not as long as the last one, but this is another nasal. From about 1350 to 1400 msec. Or thereabouts. Following
those three or four clear periods of voicing. The F2 again is up around 1500 Hz, at least if you can see it. So this really
can only be [n].
Lower-Case D
[d], IPA 104
Keating et al (1994) distinguished closures from releases (in similar fashion to Steriade's Aperture Theory model of
stops), which would be a handy thing to be able to do here. This is an oral release (see the nice sharp burst) of
something that doesn't seem to have much in the way of an oral stop component. Nasal stop with oral release. But
that's not an option the IPA gives us (I'm not suggesting it should, it just underscores the theoretical constraints
imposed by a strictly segmental model like IPA transcription), so there you go. The release characteristics are consistent
with alveolar. In case homorganicity wasn't an option. Given the following segment, it probably was.
Lower-Case H
[h], IPA 146
This will be controversial. Because there's some very clear voicing starting from the release of the previous stop, at 1400
msec, that goes on for almost 100 msec. There's a dip in the voicing amplitude from 1475 to 1550 msec or so. And then
the voicing comes back up. But if you look at the upper frequencies, there's no periodicity to speak of in the formants.
So what we have here is a mostly voiced [h]. Which I should have transcribed as such, but I was paying more attention
to the noise than I was the voicing bar. It's not unusual for intervocalic /h/s to be fully voiced, but this is just bizarre.
But it's an [h], voiced or not. Note the formanty organization of F2 and F3.
Lower-Case I
[i], IPA 301
Well, this will be controversial too. I'm guessing the 'real' vowel is really just the beginning of this, i.e. when the voicing
kicks back on at 1550 or whenever, up to when the F2 starts to dive, around 1675 msec or so, and the rest of the vowel is
just transition. But whatever. The F1 is low, the F2 is unbelievably high (especially in the preceding /h/), which is typical
of [i]. The diving F2 is transition to the following consonant.
Tilde L (Dark L)
[ɫ], IPA 209
Speaking of which, this is weird again. There's a sharp discontinuity in the F3 and F4, which makes this look like a
sudden aperture change, but the F2 and F1 keep their energy and maintain it longer than that. So I don't know where
the 'boundary' is. There probably isn't one (again one of the limitations of the segmental model). But by the time you
get to the end, the F1 has moderated to neutral, the F2 has lowered to about 1000 Hz which clearly indicates backing or
velarization, the F3 has risen again to well above the neutral frequency it has for most of the vowels. So this is another
velarized /l/.
This page last modified: 11/08/2009 22:57:36
Trivium: It was going to be 'they're made from reclaimed barnboard', but I couldn't get a "fr" I was happy with, and I
decided 'barn board' would just be too much rhoticity for anyone but me to be interested in.
Okay, I'll tell you. This is /h/. Voiceless source, either glottal or epiglottal friction, exciting all the open cavities of the
vocal tract, just as voicing would. Review source-filter theory.
Segmental cues
I've tried to follow the current E_ToBI transcription conventions, with a few adjustments. Rather than a separate
orthographic tier, I've aligned the Break Indices to the segmental transcription. I've combined the Tonal Tier and the
Break Index Tier as a single line. I align word-level (*) tones with the middle of the marked vowel, but phrase accent (-)
tones and boundary (%) tones to the left of their appropriate Break Index.
"the"
Break Index: 0
My range really seems to bottom out just above 100 Hz at the beginning of an utterance, and just below at the end. This
is nice and flat and low, hence L*.
"bus"
Break Index: 1
Lexical word, and it seems to get a high pitch (although not as high as most other HiF0s in my voice), hence H*.
"leaves"
Break Index: 3
I had to give this its own H*, first because it's a lexical word at the end of a phrase, so it needed something of its own,
but it's oddly displaced to the beginning of the vowel. I don't think this is just interpolation between the preceding H*
and the following L-. When I said it, and listened to it, it sure felt like an H+L, of whichever * variety, but those aren't
allowed any more, and we can get the L from the phrase accent (i.e. the - tone) associated with the 3BI. So that's what I
did.
"on"
Break Index: 0
'Nother one.
"the"
Break Index: 0
Ditto. Function words, short, stressless. I figure, what else is a 0BI for?
"half"
Break Index: 0
Okay, once again, I appeal to the ToBI people to rule on this one. I've marked this with a 0BI, because this is the first
'word' in a compound, i.e. there's no lexical word boundary here. On the other hand, I think this word gets its own H*,
perhaps because it's potentially contrastive or focussed, so there you go. So would that be a 1BI? or even a 2BI? Or
would I not bother, and just mark both parts of the compound as one word without a BI in between, in which case what
the heck would the H on this word be?
"hour"
Break Index: 4
Similar to the preceding, a lexical word gets an H* (or some other *) of its own, followed by phrase accent (-) and
boundary tone (%), in this case both L, resulting in the quick but extended fall on this syllable.
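For anyone who wants the combined tonal/break-index line in machine-readable form, here is a minimal Python sketch. It is not the tool used to make these labels. The names `Word` and `combined_tier` are hypothetical, and the labels are simply the ones argued for above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    text: str
    break_index: int                    # 0-4, the break index after the word
    pitch_accent: Optional[str] = None  # word-level (*) tone, e.g. "H*"
    edge_tones: str = ""                # phrase accent (-) and boundary (%) tones


def combined_tier(words):
    """Render one line per word: accent, then edge tones to the left of the BI."""
    lines = []
    for w in words:
        parts = [w.pitch_accent or "", w.edge_tones, str(w.break_index)]
        lines.append(f"{w.text}: " + " ".join(p for p in parts if p))
    return lines


# Labels as discussed above for "the bus leaves on the half hour".
utterance = [
    Word("the", 0, "L*"),
    Word("bus", 1, "H*"),
    Word("leaves", 3, "H*", "L-"),
    Word("on", 0),
    Word("the", 0),
    Word("half", 0, "H*"),
    Word("hour", 4, "H*", "L-L%"),
]
```

Calling `combined_tier(utterance)` gives one string per word, with the edge tones sitting just left of the break index, as in the alignment convention described at the top of this page.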
This page last modified: 11/08/2009 22:57:13 Support Free Speech
Segmental cues
I've tried to follow the current E_ToBI transcription conventions, with a few adjustments. Rather than a separate
orthographic tier, I've aligned the Break Indices to the segmental transcription. I've combined the Tonal Tier and the
Break Index Tier as a single line. I align word-level (*) tones with the middle of the marked vowel, but phrase accent (-)
tones and boundary (%) tones to the left of their appropriate Break Index.
"I"
Break Index: 1
My range really seems to bottom out just above 100 Hz at the beginning of an utterance, and just below at the end. This
is nice and flat and low, hence L*.
"hope"
Break Index: 1
H*, accounting for the peak.
"it's"
Break Index: 0
0BIs are essentially orthographic word boundaries apparently unmarked by any prosodic effects. There's nothing going
on here except interpolation of pitch between the previous H* and some following L.
"in"
Break Index: 0
'Nother one.
"my"
Break Index: 1
Well, there has to be a L somewhere, and I pick here. It's the bottom of the pitch range (except for a very weird
boundary tone to follow), it's clearly separated from the following stuff, and, well, there needed to be one.
"desk"
Break Index: 4
These utterance-final monosyllables get really complicated. This is a lexical content word, and in this utterance it's
fairly important. So it gets a lexical prominence, H*. It's also the end of a phrase and the end of an utterance, so it gets
boundary tones (okay, technically a phrase accent and a boundary tone) associated with each. One or the other has to
be L, since the pitch obviously falls during this vowel. So (since H*+L lexical tones are disallowed), I've chosen L- and L%.
The weird part is that the pitch bottoms out so sharply during the first half of the vowel, basically I lose voicing
altogether. Probably the result of combining final low pitch ranges with the aerodynamic requirements of the following
voiceless fricative, my voicing just dies out in the middle of the vowel. Go fig.
This page last modified: 11/08/2009 22:57:13 Support Free Speech
Segmental cues
Lower-case D and Lower-case T + Right Superscript H, IPA 104 and IPA 103 + 404
SIL [dtH], Unicode [dtʰ]
So I hope we all recognize the 'gap' in the spectrogram from about 550 to about 625 msec. This stop is a little weird. It's
fairly long. It's clearly voiced for the first 75 msec or so and then something else happens. If you look at the right edge,
there's a very sharp burst, more characteristic of voiceless stops, followed by some moderate aspiration. So it's going to
turn out that this is two stops, the first voiced, the second voiceless and weakly aspirated. The place of the second stop
you can read off its aspiration and I see I've fallen into old bad habits by putting the aspiration in its own segment. But
notice the strongest noise in the aspiration is at the highest frequencies--it looks like an /s/, but it's too short to be an
/s/. So it must be aspiration and it must be alveolar. The first stop is more ambiguous. The F2 in the vowel is so high it
can't go anywhere but down. The F3 also looks like it's heading down a little, but it's not 'pinching' with the F2, so velar
can be ruled out. So here's where you use your top-down information. What's more likely: [nib] or [nid]?
By the way, I may have forgotten to mention that the sharpness of the left edges of these nasals probably indicates that
these are syllable-initial rather than syllable-final. Syllable-final (coda) nasals typically cause nasalization of a
preceding (tautosyllabic) vowel, and hence the left 'edge' of the nasal may be less distinct.
Lower-case L + Mid Tilde, IPA 209 (precomposed, but I'm not sure why)
SIL [lò], Unicode [ɫ]
All /l/s in English are dark. This one is. See the F2. Low. That indicates backness. That indicates velarization. Or
'darkness'. This one is fully voiced, and resonant, it doesn't have a good zero or sharp edges of a nasal, and it's got the
raised F3 I associate with /l/s.
Once again, I've played with the current E_ToBI transcription conventions. Rather than a separate orthographic tier,
I've aligned the Break Indices to the segmental transcription. I've combined the Tonal Tier and the Break Index Tier as a
single line. I align word-level (*) tones with the middle of the marked vowel, but phrase accent (-) tones and boundary
(%) tones to the left of their appropriate Break Index.
"they"
Break Index: 1
Pronoun, but under some kind of focus. This isn't traditional focus. This is sort of idiomatic. But it's vaguely contrastive.
Anyway, there's clearly a high associated with this syllable, and true to E-ToBI form, it's realized relatively late. It starts
very low, making me wonder if this is a scooped accent, L+H*, which is probably appropriate for this rhetorical position.
"need"
Break Index: 1
I assume since this is the main verb of the upstairs clause, it counts as a lexical word for purposes of break indexing. So
I've given it a 1. And I've given it an L* to capture the pitch which drops sharply after the preceding H*.
"to"
Break Index: 0
Cliticized, if that's still the word for it. This word is reduced to almost nothing, doesn't get any degree of stress, isn't
under any kind of focus, and certainly doesn't get its own pitch accent.
"make"
Break Index: 1
Again, this is a main verb, in the downstairs clause, and I think counts as a word. No obvious pitch *changes* go on
here, so I assume this is just a L*.
"a"
Break Index: 0
Another (pro?)clitic, undeserving and reduced.
"new"
Break Index: 1
Okay, I think this is a word, in the sense of getting a 1BI. But it really doesn't look like it gets a pitch accent. From the L*
on the preceding word, to the H* following, this looks like simple interpolation. So I haven't given it a pitch accent. So
there.
"list"
Break Index: 4
Nice high pitch on this, so it gets an H*. It being at the end of an utterance, it gets a 4BI, and both a phrase accent (L-)
and a boundary tone (L%). I'm a little concerned about the placement of the L-, which is supposed to be attracted to the
edge of the phrase, and therefore (I think) is supposed to be indistinguishable from just the L%. But since the high pitch
on this word is actually toward the end of the first *half* of the vowel, I think (in this model) we need a L autosegment
of some kind to get this contour. And last time I checked H*+L isn't used any more. I think HLs are never supposed to
surface as falls, but just trigger downstep on a following H--thanks to the grad students in the Intonation and Prosody
seminar last term for helping me finally get this use of HL. I still don't really believe in downstep. But since what I'm
doing here is really 'transcribing' rather than doing a strong phonological analysis, I'll make use of the existing Ls in
the string rather than introducing a new one. Hence I slide the L- over into the vowel and away from the boundary, and
pretend like I know what I'm doing.
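The bookkeeping behind all this (a phrase accent goes with a 3BI, a phrase accent plus a boundary tone goes with a 4BI) can be checked mechanically. The following is a rough Python sketch under that reading. The function name `edge_tone_problems` is hypothetical, and the assumption that lower break indices carry no edge tones at all is mine, not something argued for above.

```python
import re


def edge_tone_problems(break_index, edge_tones):
    """Return a list of complaints about the edge tones at a given break index."""
    has_phrase_accent = re.search(r"[HL]-", edge_tones) is not None
    has_boundary_tone = re.search(r"[HL]%", edge_tones) is not None
    problems = []
    if break_index >= 3 and not has_phrase_accent:
        problems.append("missing phrase accent (-)")
    if break_index == 4 and not has_boundary_tone:
        problems.append("missing boundary tone (%)")
    # Assumption: break indices below 3 (or below 4) carry no such tones.
    if break_index < 3 and has_phrase_accent:
        problems.append("phrase accent (-) below a 3BI")
    if break_index < 4 and has_boundary_tone:
        problems.append("boundary tone (%) below a 4BI")
    return problems
```

For the utterance-final word with its 4BI, `edge_tone_problems(4, "L-L%")` comes back empty. Note that the check only looks at which tones are present, so sliding the L- into the vowel, as done here, goes unnoticed.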
This page last modified: 11/08/2009 22:57:12
Lower-case M
[m], IPA 114
Turned R
[ɹ], IPA 151
Lower-case M
[m], IPA 114
Schwa
[ə], IPA 322
Schwa
[ə], IPA 322
Schwa
[ə], IPA 322
Lower-case Z
[z], IPA 133
Schwa
[ə], IPA 322
Lower-case N
[n], IPA 116
Lower-case T
[t], IPA 103
Lower-case K
[k], IPA 109
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
Lower-case W
[w], IPA 170
Small Capital I
[ɪ], IPA 319
Theta
[θ], IPA 130
Lower-case T
[t], IPA 103
Esh
[ʃ], IPA 134
Lower-case I
[i], IPA 301
Lower-case S
[s], IPA 132
This page last modified: 11/08/2009 22:57:19 Support Free Speech
Eth + Under-Ring
[ð̥], IPA 131 + 402
I give up. When in doubt, if it doesn't start with a glottal stop, and this doesn't, guess Eth. It won't make any difference.
Just guess. But here, I worked for I don't know how many utterances trying to get a) noise and b) voicing. Well, at least
this time there is noise. Sort of. It's way too weak and short to be an initial sibilant, but from just its frequency, that's
what it looks like.
Schwa
[ə], IPA 322
Vowel. When in doubt, guess schwa.
Tilde L (Dark L)
[ɫ], IPA 209
Under-Ring
[ ̥], IPA 402A
Lower-Case E + Small Capital I
[eɪ], IPA 302 + 319
Lower-Case N
[n], IPA 116
Lower-Case T
[tʰ], IPA 103 + 404
Upsilon
[ʊ], IPA 321
Lower-Case K
[k], IPA 109
Lowering Sign
[ ̞], IPA 430
Turned Script A
[ɒ], IPA 313
Lower-Case F
[f], IPA 128
Turned Script A
[ɒ], IPA 313
Lower-Case N
[n], IPA 116
Lower-Case T
[tʰ], IPA 103 + 404
Script A
[ɑ], IPA 305
Small Capital I
[ɪ], IPA 319
Lower-Case M
[m], IPA 114
This page last modified: 11/08/2009 22:57:34 Support Free Speech
Technically, it's January. So even though this is the solution to December 2006, I'm embarking on my January 2007
policy of no longer marking up IPA characters as being in a special font. From here on out, you must view my page(s) in
an IPA-enabled Unicode-compliant font. I recommend Gentium from among the available freeware fonts. My style sheet
automagically prefers Gentium if you have it loaded on your system. See my list of supported fonts for more
information.
Lower-case J
[j], IPA 153
Starting at about 100 msec is a period of voicing. It's sort of weak, in that it gets a lot stronger in the following vowel,
and there's no evidence of energy between the voicing bar/F1 or whatever it is and the F2, which is up above 2000
Hz. So while voiced and sonorant (i.e. with resonances, indicating an open vocal tract), it's not a vowel. So it has to be a
nasal or an approximant of some kind. Could be a nasal (sonorant, overall weak amplitude, and an apparent zero above
F1), but that doesn't jibe with the F2. More than anything else, this looks like a consonant version of a high front vowel [i].
Can anyone say palatal approximant?
Barred I
[ɨ], IPA 317
Well, the F1/voicing bar whatever thingy is stronger here (from about 150 msec to about 225 msec), and the striations
between F1 and F2 come in, but it's still weak, compared with vowels coming later on. Weakness, in vowel amplitude, is
a correlate of lack-of-stress, so this is probably a reduced vowel. F2 is closer to F3, so following Keating et al (1994), it's
transcribed as barred-i.
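One literal way to write down that decision is sketched below in Python. The function name is hypothetical, and reading 'closer to F3' as 'closer to F3 than to F1, in plain Hz' is my assumption, not necessarily the criterion in Keating et al (1994).

```python
def reduced_vowel_symbol(f1_hz, f2_hz, f3_hz):
    """Pick a symbol for a reduced vowel from where F2 sits between F1 and F3."""
    # Assumption: "closer to F3" means closer to F3 than to F1, measured in Hz.
    if (f3_hz - f2_hz) < (f2_hz - f1_hz):
        return "ɨ"  # barred-i
    return "ə"      # schwa
```

With made-up values, `reduced_vowel_symbol(400, 1900, 2600)` returns barred-i, because F2 sits 700 Hz below F3 but 1500 Hz above F1.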
Lower-case D
[d], IPA 104
Well, there's some weak voicing down at the bottom, but absolutely nothing above that, and just before 300 msec
there's a nice sharp burst. The burst indicates that this has to be a plosive, since only an obstruent has pressure that
releases in a burst like that. The F3 is not telling us a great deal. The F2 transitions are headed toward that 'around 1700
Hz' area often associated with alveolar transitions. So on the balance, this is probably an alveolar, although I could
entertain an argument for a front velar. Although the burst is a little 'sharp' for that...
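The 'headed toward 1700 Hz' idea can be put as a tiny check. This is a hedged Python sketch, not a real place-of-articulation classifier. The function name and the two-point comparison are my own simplification, and only the 1700 Hz figure comes from the text above.

```python
def f2_heads_toward_locus(f2_at_stop_edge_hz, f2_in_vowel_hz, locus_hz=1700):
    """True if F2 is nearer the locus at the stop edge than in the vowel proper."""
    # locus_hz defaults to the 'around 1700 Hz' figure mentioned above.
    edge_distance = abs(f2_at_stop_edge_hz - locus_hz)
    vowel_distance = abs(f2_in_vowel_hz - locus_hz)
    return edge_distance < vowel_distance
```

A True result only says the transition is compatible with an alveolar. It says nothing about the burst, which still has to be read separately.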
Lower-case O + Upsilon
[oʊ], IPA 307 + 321
So, abstracting away from the transitions from the plosive, the F1 seems to hit a 'moment' at about 350 msec around
600 Hz, and then starts to head back down. Maxima/minima 'turning point' 'moments' like that usually indicate a
'target' of some kind has been reached (or undershot) between other targets, so this vowel starts either lower-mid or
low, and then moves toward someplace higher in the space. The F2 doesn't hit its 'moment' at the same time, so
chances are this is one coordinated movement rather than two distinct targets. The F2 'moment' is a low just around
900 Hz (very back and/or round) at a 'moment' when the F2 seems to straighten out. So this moment is mid-to-higher-
mid and backish/roundish. So going from somewhere sort of back and lower-mid (at the first F1 'moment') to higher
and backer/round (at the F2 'moment') is something like a backish mid, tense vowel, diphthongized to something
higher and rounder (or backer). Or [oʊ].
Lower-case N
[n], IPA 116
Well, there's something going on between the time the resonances cut off (at about 450 msec) and the bursty thing
(maybe it's just a pulse) at 500 msec. There's a sharp reduction in voicing amplitude and resonance, but there's some
evidence of open-ness (in the vocal-tract resonating sense) in the F3(?) range above 2000 Hz. It's weak but it's there.
And not much else until you get well above 4000 Hz, and there's something noisy happening in the low frequencies. So
what is this? Well, it's open so we're talking vowel, approximant, or nasal. The fact that the F3 seems to drop into it
while the F2 seems to rise would suggest velar pinch. Which we'd be thrilled about, because the confirming evidence is
the apparent double-burst at 500 msec, right there in the F2/F3 pinch range, where we'd expect a velar release to be. So
we'd conclude that we're looking at a velar nasal followed by a velar plosive release. Homorganicity and all that. But
this would be a red herring, because there's no evidence of velar transition after the releasey thing. So we'll have to come
up with another hypothesis. Which would either be [mb] or [nd]. And there's not a lot to tell us that's unequivocal.
Lower-case T
[t], IPA 103
So on the end of that nasal thingy is a burst, so there's a stop homorganic (same place of articulation) with (to?) the nasal.
Transcribed here as voiceless, because I convinced myself there was a short VOT, but now I'm not sure. It's ambiguous,
as I said before, because a) the burst is double, and in the right range, but the transitions don't match up with anything
velar. The F3 isn't doing much, and the F2 is so co-produced with the following, um, thing, that it's not telling us much.
So in the end, we'll rely on lexical access to fill this one in later.
Lower-case W
[w], IPA 170
So going from that moment of burst to about 550 msec, there's increasing amplitude and sonority. So I call that a thing.
Weaker than a vowel, probably an approximant, since in all other respects it's continuous with the vowel. The F1 is low.
The F2 is low. The F3 is just sitting there. So we've got something very back/round and close.
Script A
[ɑ], IPA 305
Well the F1 and F2 are pretty much all transition here, but there are some things to be gleaned. Note that the F1 rises
from its pushed-down-by-the-F2 position at the start to about 700 Hz around 675 msec. And then it levels off, or even
drops a little. So there's something lower-mid or low that it's heading toward. The F2 is still low there, so it's possibly
still being pushed down a little, but the point is that there's a moment in there we need to pay attention to. And at that
moment, the F1 indicates a fairly low vowel, and about as back as it can get.
Lower-case N
[n], IPA 116
And now there's another of these things. This one is less ambiguous, though not by much. F2 is at least rising so it can't
be labial and the F3 is just sitting there, so this is unlikely to be velar. So this is probably alveolar. There's stuff going on
that looks a little like voicing at the bottom, but otherwise this looks like a plosive, down to the sharp, alveolar-looking
release. So probably a nasal, with a following plosive...
Lower-case T
[t], IPA 103
... homorganic of course. The release is nice and alveolar-looking, with its sharp onset and high-frequency-tilted noise.
Finally.
Barred I
[ɨ], IPA 317
And here's another shortish, weakish vowel (hey, at least it looks like a vowel).
Lower-case T
[t], IPA 103
So around 825-850 msec or so, everything kind of stops. No voicing, no energy, no noise, anywhere. There's no closure
transient, but then how often are we lucky enough to get one of those? The release happens at about 900 msec, and
while weak is very sharp and in that high-frequency alveolar-looking range again.
Esh
[ʃ], IPA 134
This noise is interesting, since it's very [s]-shaped. It's a single, broad band, centered around 3000-3500 Hz, depending
on where exactly you look. That's a little low for an [s], but whatever. The real give-away is the fact that the noise stops
dead around 1800 Hz, that is just below F2. Which is almost always a classic indicator of a postalveolar fricative, i.e. [ʃ].
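As a rule of thumb, the cut-off test might look like the following Python sketch. The function name and the 300 Hz tolerance are illustrative assumptions. The only thing taken from the text is the idea that noise stopping dead near F2 points to a postalveolar.

```python
def guess_sibilant(noise_floor_hz, f2_hz, tolerance_hz=300):
    """Guess [ʃ] when the frication noise cuts off near the neighbouring F2."""
    # Assumption: "near" means within tolerance_hz, an illustrative value only.
    if abs(noise_floor_hz - f2_hz) <= tolerance_hz:
        return "ʃ"
    return "s"
```

So `guess_sibilant(1800, 2000)` comes out as esh, while noise that only starts far above F2 falls through to [s]. It ignores where the noise is centered, which also matters.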
Lower-case I
[i], IPA 301
So for almost 100 msec, we've got a fairly flat, stable vowel. Yay. F1 is very low, F2 is very high (2100 Hz or over!), which
can really only be an [i]. As high (low F1) and as front (high F2) as can be.
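The 'low F1 means high, high F2 means front' reasoning used throughout can be caricatured in a few lines of Python. This is only a sketch. The function name is hypothetical and every cut-off in it is an illustrative assumption rather than a value from these pages, and real cut-offs depend on the speaker.

```python
def describe_vowel(f1_hz, f2_hz):
    """Map F1 to a height label and F2 to a backness label."""
    # All cut-offs below are illustrative assumptions, not values from the text.
    if f1_hz < 350:
        height = "high"
    elif f1_hz < 500:
        height = "higher-mid"
    elif f1_hz < 650:
        height = "lower-mid"
    else:
        height = "low"

    if f2_hz > 1900:
        backness = "front"
    elif f2_hz > 1200:
        backness = "central"
    else:
        backness = "back or round"
    return height, backness
```

A vowel with a very low F1 and an F2 of 2100 Hz comes out as high and front, which is the [i] reading above. The 'back or round' label is deliberately vague, since a low F2 alone can't separate backing from rounding.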
Lower-case P
[p], IPA 101
Another gap, indicating another plosive. Transition wise, there's not a lot going on. F3 is coming down just a little, and I
can convince myself that F2 is as well, although that may just be me and my imagination running wild (armed with the
knowledge of what's really going on here as well...). The release burst is again sort of sharp and tilted to the high
frequencies, but that doesn't jibe with the apparently labial looking transitions. Hmm. I'm trying hard to force this to
look bilabial, but except for making a big deal of the low-frequency components of the release transient (which aren't
really missing in the previous alveolar releases, so it would be a lot of handwaving) I'm not having much luck.
Lower-case S
[s], IPA 132
On the other hand, the [s]-shaped tilt to the burst noise might be influenced by the high-frequency tilt to this noise.
Note that it isn't completely contiguous with the burst, which may suggest that this isn't just a [ts] kind of transition.
Anyway, note the off-the-top center of this noise. That's more typical of an [s] than the [ʃ] we saw earlier.
Turned A
[ɐ], IPA 324
I'm sick of using turned V [ʌ] (which the IPA defines as Cardinal 14, the unrounded version of open-o) for this vowel.
The vowels traditionally transcribed as turned-v in English are historically related to short o (and short u), but in my
dialect and in Canadian English there's nothing back about it. The turned (print) a symbol [ɐ] represents (in strict IPA
style) a central vowel of indeterminate height between lower-mid and low. So it is with this vowel. The F1 is a little
higher than 500 Hz, so vaguely lower-mid. The F2 is a little low of central, so vaguely back, but not at all round. So take
your pick. I think the F2 is being pulled down a little here by the following consonant, but that's just me. If you don't
like it, keep using turned-v, but you're unlikely to see it again here.
Lower-case B
[b], IPA 102
F2 and F4 are pointed down. F3 may be or may not be. But F2 and F4 both look labial. The gap is clearly voiced (look at
those nice clear striations), so we're talking [b].
Lower-case S
[s], IPA 132
Stronger than the last one, but clearly high-frequency and broad band, and even though the lower frequencies are
attenuated, there's not the abrupt cut-off at F2 we associated with postalveolars.
Lower-case T
[t], IPA 103
Shortish gap here, with a sharp release (gosh, it looks like the labial release from before, huh?) but the noise in the
short VOT is [s]-shaped, which we really only ever see with alveolar releases.
Barred I
[ɨ], IPA 317
Shortest vowel of the spectrogram. Don't sweat the small stuff.
Barred U
[ʉ], IPA 318
Again, getting strict about my IPA. This is not a back vowel. It is however quite round. So even though this has the same
formants as the barred-i we've seen (and similar to the small-cap i's we may be used to seeing), the down-trend in the F2
indicates increasing rounding (or backness) during the articulation of this vowel. Which is typical of post-alveolar /u/
(reflecting its merger with /ju/ in my dialect). But as round as it gets, it don't get anywhere near 'back'. So transcribed
as round(ing) and central. And high more than anything else.
Lower-case T
[t], IPA 103
One final plosive, preceded with creakiness (utterance-finally, this could just be low pitch, but more likely it's the
combination of low pitch and glottalizing a coda plosive). The release is sharp, and tilted to the high(er) frequencies
(possibly brought down a little by lip rounding from the preceding vowel?). Noise like that is atypical of final velars or
labials.
Last modified: 11/08/2009 22:57:46 Support Free Speech