OF LEXICOGRAPHY, COMPUTERS, AND NORMS

With reference to John Willinsky's excellent paper on how commercial dictionary houses go about collecting citations ("Cutting English on the Bias: Five Lexicographers in Pursuit of the New," American Speech 63: 44-66), I would like to point out that electronic databases, no matter how many billions of words they contain, are also highly selective instruments for collecting data. Nexis has nothing before 1975 and is therefore seriously flawed by underrepresenting (or not representing at all) older usages. Databases are collected from tapes or other forms that are easily convertible to electronic storage; sources that are too local, too specialized, or produced in an incompatible form are ignored. The billions of words in Nexis and other databases represent a mere drop of water compared to the ocean of discourse that occurs daily, and statistical data of frequency, sense, or usage can only measure the items included within their survey, not the English language. These databases are obviously biased toward edited, mainstream copy, notably including The New York Times, The Washington Post, the Los Angeles Times, and The Christian Science Monitor: in short, the same kinds of sources Willinsky found in his study of the manual collection of citations. Although one can search within particular categories and therefore hope to get better frequency information in Nexis, Mead Data Central's biggest
customers are not lexicographers or linguists, and the categories are not
distinguished by any fine-tuned categorization such as that of the Brown Corpus. Fred Shapiro's paper to which Willinsky refers (American Speech 61: 139-46) also points up other limitations of computer searches, such as the failure to distinguish between homographs or senses or to recognize functional shift. No one can discount the importance of focused computerized searches in lexicography or the increasing role electronically stored linguistic corpora will have in the future, but they suffer from some of the same limitations Willinsky found in manual collection, and some others unique to their own form.

Reliance on databases on the one hand, and software programs on the other, brings to mind a larger problem that goes well beyond lexicography. I am told by my professor friends how much student spelling has improved on term papers by the use of software spelling programs, but this improvement has not been matched by any noticeable improvement in student spelling on blue-book exams written in class. Evidently the spelling programs, like arithmetic calculators, simply relieve the student of the effort to learn, breeding reliance on the electronic tool. Some new software programs purport to help the student fashion grammatically correct sentences and check diction. We can expect in the future, as CD-ROMs come on the market in greater numbers with their vast storage capacity, programs that will tie in to encyclopedias so that one can find, at the press of a key, articles dealing with Saussure or Mencken or Romanesque architecture or twentieth-century Russian literature. Now, my question is, who will make the selections of what to include and what to exclude? How accurate will this information be? How indisputable will it (can it) be? There may be little harm in Mead Data Central's choosing the sources that limit our search for lexicographic citations, but there is a great deal of harm in limiting the sources by which students acquire an education. The long-range effect of these increasingly aggressive, anonymously written programs will be to remove the student from the intimate environment traditionally associated with formal education. Education will become, and perhaps has already become, remote and centralized, based on principles of selection, pedagogical policies, and scholarly approaches that are by no means universally held and about which educators and scholars may differ profoundly.

I am such an extremist on this point that I detest even the idea of spelling-correction programs. If they do not serve any heuristic purpose, they are pernicious by artificially limiting the range of spelling choices. Teachers may be happier with better-spelt (a disapproved spelling, no doubt, in any American program) papers, but they are getting papers edited by someone other than the student, based on criteria of which neither the student (who couldn't care less anyway) nor the teacher has any knowledge. We thus artificially limit language change, suppress colorful neologisms because they are not in the computer's dictionary of approved forms, stop in their tracks unorthodox syntactic experiments and idiosyncratic diction, and push all our students toward a common center of officially endorsed usages.

What centuries of misapplied effort toward the establishment of an official American orthodoxy of approved usages have failed to accomplish may have quietly arrived, without anyone's noticing, in the guise of the omnipresent computer monitor and keyboard, inoffensively putty in color, almost invisible.

