IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-12, NO. 3, JULY 1966

Book Reviews

Learning Machines, Nils J. Nilsson. (McGraw-Hill Book Co., Inc., New York, N. Y., 1965. 137 pp., $10.00.)

Under the all-embracing title of Learning Machines, this book is an easy-to-read, well-presented survey of one approach to one aspect of a subclass of learning machines. The subclass of learning machines treated in this book is that "which can be trained to recognize patterns." This type of machine is fed sets of "patterns" from several pattern classes and is trained to recognize the class membership of new patterns by use of stored information extracted from the data through training.

The approach to pattern recognition used in the book involves the use of discriminant functions (primarily linear discriminants). These partition the space of measured pattern attributes into regions by means of hyperplanes of adjustable locations and orientations.
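For the reader unfamiliar with the idea, the geometry is easily made concrete (the following illustration is this reviewer's, not taken from the book). A linear discriminant over $d$ measured attributes $x_1, \ldots, x_d$ has the form

$$ g(\mathbf{x}) \;=\; \mathbf{w}^{T}\mathbf{x} + w_0 \;=\; \sum_{i=1}^{d} w_i x_i + w_0 , $$

and the decision surface $g(\mathbf{x}) = 0$ is a hyperplane whose orientation is fixed by the weight vector $\mathbf{w}$ and whose location is fixed by the threshold $w_0$; a pattern is assigned to one class if $g(\mathbf{x}) > 0$ and to the other otherwise.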
The aspect of pattern recognition treated by the book is limited to the processing of measurements for training and recognition. It is not concerned with the measurement selection problem.

After an introduction to the subject, some important discriminant functions are discussed in Chapter 2. Linear discriminants, quadratic discriminants (or minimum distance classifiers), and certain types of polynomial discriminants are covered. The foundations of the pertinent aspects of decision theory are outlined in the third chapter. Since in decision theory the probability densities describing the distributions of the sample patterns are assumed to be known, except for a few unknown parameters which must be estimated from available data, the author discusses decision-theoretic techniques under the heading "Parametric Training Methods." The Gaussian assumption on the functional form of the prevailing probability densities and the estimation of the means of the Gaussian process from samples play a dominant role in this chapter. Chapters 4 and 5 are very important, for they give construction procedures and proofs for partitioning the space with threshold logic units (hyperplanes). A brief discussion of cascaded (or layered) machines is given in Chapter 6, and methods for seeking out the modes of the probability distributions are mentioned in the final chapter.
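The heart of Chapters 4 and 5 is error-correction training of threshold logic units. A minimal sketch of a fixed-increment procedure of that general kind is given below; the code and its names are this reviewer's illustration, not Nilsson's.

```python
# A sketch (this reviewer's illustration, not code from the book) of
# fixed-increment error-correction training for one threshold logic unit.
# Each pattern is augmented with a constant 1 so that the threshold is
# absorbed into the weight vector; class labels are +1 or -1.

def train_tlu(patterns, labels, passes=100):
    """Returns a weight vector separating the two classes,
    provided they are linearly separable."""
    aug = [list(x) + [1.0] for x in patterns]       # absorb the threshold
    w = [0.0] * len(aug[0])
    for _ in range(passes):
        errors = 0
        for x, y in zip(aug, labels):
            s = sum(wi * xi for wi, xi in zip(w, x))
            if y * s <= 0:                          # misclassified sample
                w = [wi + y * xi for wi, xi in zip(w, x)]
                errors += 1
        if errors == 0:                             # a full error-free pass:
            break                                   # a separating plane is found
    return w

# Two linearly separable classes in the plane:
weights = train_tlu([(0, 0), (0, 1), (2, 2), (3, 1)], [-1, -1, +1, +1])
```

The "proofs" the chapters supply include convergence results for procedures of exactly this kind on linearly separable training sets.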
While no examples are drawn from the real world, the frequent use of a graphic, geometrical portrayal of the ways in which the vector space is partitioned by the different methods discussed does much to illustrate the properties (and the limitations) of the techniques presented.

While the above and other limitations of the scope of coverage and depth of treatment are mentioned by the author at appropriate places, a student of learning machines (or of pattern recognition) who is not familiar with the literature and would attempt to acquaint himself with the field through this book would be left with an incomplete picture of the state of the art.

Learning Machines is conspicuous by its omission of subject matter for such an embracing title. Subjects not discussed include learning on unlabeled inputs (without a teacher), learning to discriminate among a set of mutually nonexclusive hypotheses, the dependence of adaptive techniques on the order in which inputs are introduced, the relationship between work in artificial intelligence and pattern recognition, etc. This reviewer sympathizes with Nilsson and admits that the present state of the art does not permit one to say a great deal about the above topics. Failure to mention them, even if only to point out their existence, however, tends to make the reader believe that the main concern of the field of "learning machines" is to determine how to place K − 1 hyperplanes in an N-dimensional vector space to separate K finite sets of labeled vectors into their respective categories. For this reason, as a textbook or reference book on learning machines, this book fails to cover the subject adequately by virtue of its intentionally narrow scope. As a technical treatise or monograph on the application of (linear) discriminants to the problems of learning to recognize patterns from a finite number of samples, the book lacks depth and presents little that has not already been published. The problems that are treated and the solutions and ideas that are presented, however, are presented clearly, simply, and with an honest statement of their limitations. This is a refreshing change from much of the published literature of this still-obscure field.

GEORGE S. SEBESTYEN
Litton Systems, Inc.
Waltham, Mass.

Topics in Communication Theory, David Middleton. (McGraw-Hill Book Co., Inc., New York, N. Y., 1964. 107 text pp. + 8 index pp. + iv pp. + 2 bibliography pp. + 5 appendix pp. + 3 pp. of Glossary of Principal Symbols + references by chapter. Illustrations, 5½ × 8. $5.95, hard cover; $2.95, paperback.)

The stated purpose of this monograph is to provide an introductory treatment of several basic problems commonly occurring in communication theory, with emphasis on formulation and method rather than on detailed analysis and specialized results. Rather than attempting to cover a wide range of possible topics in communication theory (e.g., coding and information measures, noise theory, sequential methods, signal analysis, and random processes), the author limits himself to problems in the areas of detection of signals in noise, extraction of signals from noise, and the interpretation of optimum system structure for detection and extraction, treated from a statistical decision theory point of view. In this sense, perhaps the title of the monograph is misleading. The monograph was prepared for the reader with an adequate mathematical background in the elements of probability theory and statistics, matrices, Fourier transforms, calculus of variations, and advanced calculus.
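To fix ideas on what treatment "from a statistical decision theory point of view" means here, a standard formulation may help (the convention below is a common textbook one; the symbols are not necessarily Middleton's). For the binary detection problem of deciding between noise alone ($H_0$) and signal plus noise ($H_1$) from received data $\mathbf{x}$, the Bayes-optimum detector compares the likelihood ratio with a threshold set by the prior probabilities and the assigned costs:

$$ \Lambda(\mathbf{x}) \;=\; \frac{p(\mathbf{x}\mid H_1)}{p(\mathbf{x}\mid H_0)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \frac{P(H_0)\,(C_{10}-C_{00})}{P(H_1)\,(C_{01}-C_{11})} , $$

where $C_{ij}$ is the cost of deciding $H_i$ when $H_j$ is in force. Extraction (estimation of signal parameters) is formulated analogously, with a cost function over estimation errors.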
Rather than serving as an introduction (in the usual sense of an introduction) to his earlier and more comprehensive treatment in Part IV of An Introduction to Statistical Communication Theory (New York: McGraw-Hill, 1960), as stated in the Preface, this monograph is closer to a guided tour of Middleton's previous works in the area of statistical decision theory applied to detection and extraction of signals in noise. This monograph provides the author with a vehicle for reformulating and presenting his earlier work in a package whose size and scope make it appealing to the well-prepared nonspecialist in the area. If the reader is prepared to accept the material presented at face value, Middleton has accomplished his goal admirably (with one exception noted below). The book is liberally endowed with introductory, summarizing, and generalizing material; methodology; and the results of some of the more important problems in signal detection and extraction, with emphasis on canonical treatments. However, if the reader desires more detailed development, proof, and back-up source material, he is obliged to consult the references, primarily An Introduction to Statistical Communication Theory and a Rand Memorandum, RM-4687-PR, entitled Canonically Optimum Threshold Detection (Rand Corporation, Santa Monica, Calif., November 1965). The author has made this task as easy as possible by referencing pertinent steps omitted and results quoted (even to section and subsection in the case of references to his earlier text).

Two possible hindrances to the uninitiated reader are notational