
As Easy As A-B-Schwa

What sounds and letters are most likely to trip up contestants at the National Spelling Bee?

By Ben Blatt

This week 281 students between the ages of 8 and 15 will assemble outside Washington, D.C., for
the Scripps National Spelling Bee. Over the last 10 years, it's taken an average of 645 words and 5,680
letters to weed out the wannabes from the one who outspells them all. Looking at past trends, we can take a
shot at predicting which letters and sounds will cause contestants to go home D-E-F-E-A-T-E-D and br-

Thanks to the folks at the National Spelling Bee (who sent me complete records for the last decade) and
Merriam-Webster (which provided their pronunciations), I've been able to compile statistics on all of the
words that have been spelled correctly (there are 5,042 of them) and incorrectly (1,409) during the
traditional oral rounds. (I didn't look at words that were part of the bee's written test.) So, what's most likely
to throw a speller off?

(1) ________

Looking at length more systematically, the number of letters in a word seems to have little correlation with
spelling difficulty. Roughly half of the words in the bee have nine or more letters. These words were spelled
correctly 78 percent of the time. By comparison, those with eight or fewer letters were spelled correctly 79
percent of the time. In the first two oral rounds, which include a greater mix of weak and strong spellers, the
effect of word length is more pronounced.

(2) ___________

If not length, what causes the most spelling hiccups? To answer this, I grouped spelling mistakes into three
categories of my own design. The first is a substitution, such as spelling atrabilious as atribilious,
mistakenly subbing an I for the second A. The second is a deletion: spelling ecchymosis as echymosis,
erroneously removing a C. The third is an insertion: spelling vacillant as vascillant, adding an S that
shouldn't be there. Multiple mistakes were recorded if a speller, for example, had a substitution and an
insertion error in the same word.
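
The three categories above map closely onto the operations of a standard sequence comparison. The sketch below is one way to reproduce the labels with Python's built-in `difflib`. It is an illustration, not the method used for this analysis, and the function name `classify_errors` is hypothetical. One caveat: when several adjacent letters are wrong, `difflib` may report a single multi-letter "replace" block, so a real tally would need a rule for splitting such blocks.

```python
from difflib import SequenceMatcher


def classify_errors(correct, attempt):
    """Label each difference between the correct spelling and the attempt.

    Returns a list of (category, expected_letters, given_letters) tuples.
    """
    errors = []
    # autojunk=False keeps difflib from ignoring very frequent letters.
    matcher = SequenceMatcher(None, correct, attempt, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            errors.append(("substitution", correct[i1:i2], attempt[j1:j2]))
        elif tag == "delete":
            errors.append(("deletion", correct[i1:i2], ""))
        elif tag == "insert":
            errors.append(("insertion", "", attempt[j1:j2]))
        # "equal" blocks are letters the speller got right, so they are skipped.
    return errors


# The three examples from the text:
for correct, attempt in [
    ("atrabilious", "atribilious"),
    ("ecchymosis", "echymosis"),
    ("vacillant", "vascillant"),
]:
    print(correct, attempt, classify_errors(correct, attempt))
```

Because every non-matching block is recorded separately, a word with both a substitution and an insertion yields two entries, which matches the counting rule described above.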

It was not possible to categorize every single mistake or every single word. For example, in one case a
speller began to spell idiosyncratically as I-O and immediately realized his mistake; he finished by spelling
the word as I-O-Q-R-S-Z-3-cuatro-F-L-V-R-Q. This word was tossed from the analysis, but the vast
majority of the 6,451 words from the last 10 years stayed in.

Most mistakes, by my categorization, were substitutions. Just short of 70 percent of spelling errors were
caused by subbing in one letter for another, while 19 percent were deletions and 11 percent insertions.

(3) _____________

While J is a tricky letter, it appears in only a small fraction of spelling bee words, about 2 percent. Last
year's winner, Arvind Mahankali, spelled 14 words onstage and never received a word with a J, Q, or Y.

Most people are forced to give up their spelling dreams because of trouble with vowels. In the substitution
category, the five letters that were most likely to be missed are E, I, A, O, and Y. While these letters are
common to begin with, representing 39 percent of all letters in spelling bee words, they make up a
disproportionate 74 percent of all errors.

(4) _____________

Treating all deletions, insertions, and specific letter-for-letter substitutions as separate mistakes, I counted
140 unique error categories in the last 10 spelling bees. Below are the 30 most likely reasons a speller will
hear the elimination ding. (If multiple mistakes occurred in one word, all were counted.)

To make sense of this data, I talked to Arjun Modi, a two-time National Spelling Bee participant who placed
17th in 2005. When I showed him the letters in the chart above, he offered a simple explanation: ə.

(5) ______________

From the data provided, I matched 1,100-plus spelling mistakes in the last 10 years to the official
Merriam-Webster pronunciations. (If there were multiple pronunciations, only the first was used.) Individual
characters were each counted as one sound, except that ch, sh, th, and zh were each treated as a single
sound. Of the more than 1,100 mistakes in the data set, 35 percent occurred on the ə sound. The
next-biggest offender was s, at 8 percent, followed closely by ē (e.g., the long E in beep), k, and i.

(6) _________________

The schwa causes a mistake about 7.5 percent of the time it appears. A few other sounds come close (
triggers errors 6.7 percent of the time, while comes in at 6.2 percent), but most others don't compare. S, k,
and i all cause a speller to go home less than 3.5 percent of the time, and t and l (which are both in the top 10
for total errors caused) caused an incorrect spelling less than 2 percent of the time.
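
A sound's share of all mistakes and its error rate per appearance are different quantities, and the two schwa figures quoted in this article can be combined to see how they relate. The sketch below is a back-of-the-envelope estimate only. It treats 1,100 as the mistake total even though the data set holds "more than 1,100", and the function name `implied_counts` is hypothetical. On those assumptions, the schwa accounts for roughly 385 mistakes across about 5,100 appearances.

```python
# Floor for the matched mistakes; the article says "more than 1,100".
TOTAL_MISTAKES = 1100


def implied_counts(share_of_mistakes, error_rate, total=TOTAL_MISTAKES):
    """Estimate (mistakes, appearances) for one sound.

    share_of_mistakes: fraction of all mistakes that fall on the sound.
    error_rate: fraction of the sound's appearances that end in a mistake.
    """
    mistakes = share_of_mistakes * total
    appearances = mistakes / error_rate
    return mistakes, appearances


# Schwa: 35 percent of all mistakes, 7.5 percent error rate per appearance.
schwa_mistakes, schwa_appearances = implied_counts(0.35, 0.075)
print(round(schwa_mistakes), round(schwa_appearances))
```

A common sound can rack up many mistakes even at a low error rate, which is why both figures are needed to judge how treacherous a sound really is.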

(7) ________________

A. Vowels also cause trouble in the deletion and insertion categories. The most common insertion was
adding an extra E. The most common deletion: giving no letter where there should have been an E.

B. You might suspect that longer words are more likely to trip up contestants. The two longest words in
the data set were 17 letters apiece: triboluminescence and idiosyncratically, both of which sent their
spellers home. But long words aren't always so tricky. Five of the eight 16-letter words were spelled
correctly, Michelangelesque and sphygmomanometer among them. And of the two shortest words to
appear in the spelling bee in the last 10 years, gbo and rya, only the former was spelled correctly.

C. The ə is an orthographic representation of the schwa, a ubiquitous and bland vowel sound: it's
the uh in dull. Modi describes it as the most difficult to get right, since it's one sound and hard to
pick up since it is unstressed. What makes it particularly troubling, for spellers, is that it can take
the form of every vowel. In last year's bee, the schwa threw off spellers when it should've been an A
(cyanophycean misspelled as cyanophycein), an E (zenaida misspelled as zaneida), an I
(cabotinage misspelled as cabotonnage), an O (melocoton misspelled as melecaton), a U
(kuruma misspelled as kurama) and a Y (doryline misspelled as doraline). The top three runners-up
in last year's bee were all eliminated when they used the wrong vowel to spell out the ə sound.

D. The schwa is the most error-causing sound in terms of total mistakes triggered, but it's also a very
common sound. Does ə cause the most total mistakes because it is the most common sound or
because it is likely to cause the spellers to flub? It turns out the answer is both: The schwa both
causes the most total errors and causes errors at a higher rate than any other sound.

E. It's possible these statistics are the result of pure chance. Though more than 1,700 words have been
spelled correctly in the third oral round and beyond, the difference between above-average-length
words and below-average-length words barely misses out on statistical significance. Regardless, the
fact that long words and short words are spelled correctly at roughly the same rate shows that, in
general, the word pickers are doing a good job. Though word lengths can vary, ideally all words in a
given round should be of the same difficulty.

F. When America's top spellers line up this week, it will only be a matter of time before the first one
succumbs to the ə. If they're lucky, perhaps they won't get a ə and they'll be able to sneak on by.
Maybe they'll even get the word schwa: using Merriam-Webster's first pronunciation, it's spoken
with an ä.

G. More telling than the type of error are the letters involved. Proportional to how often the letter
appears, J is the thorniest letter in the alphabet. Roughly 9 percent of the time, a J was incorrectly
swapped out for another letter, as when jardiniere was incorrectly spelled as gardiniere. On the other
end of the continuum, N was used more than 3,500 times in almost 2,900 different words and not
once substituted for the wrong letter. The letter B was used 1,005 times and was only switched with
the wrong letter once, when dysbarism was spelled dysporism.