Geol. Mag. 122 (2), 1985, pp. 203-204.

Printed in Great Britain

ESSAY REVIEW

Pattern recognition and pattern analysis


R. F. CHEENEY
Grant Institute of Geology, University of Edinburgh, West Mains Road, Edinburgh EH9 3JW, U.K.

WOLFF, D. D. & PARSONS, M. L. 1983. Pattern Recognition Approach to Data Interpretation. xiii + 223 pp. New York, London: Plenum Press. Price U.S. $29.50. ISBN 0 306 41302 7.

To many, pattern recognition training must start with the childhood study of bedroom wallpaper and the analogy is not lost as we witness increasingly abstract designs in later years. Even if our knowledge of the underlying pattern-generation process is limited or absent, the strong visual impact and the innate need to find order within seeming disorder provides a powerful incentive towards analysing the structure using the most powerful tools we have, the human eye-brain system. Having found regularity, we may be tempted to bestow values approaching philatelic proportions on any defects or irregularities.

Thus my interest was aroused on noting 'pattern recognition' in the book's title, although later to be somewhat dimmed as I argue below. But first, I think it important to outline my interpretation of pattern analysis as it may influence geological and related studies, and to assist I start with the 'abstract design', Figure 1.1, which might be an x-y plot of anybody's data. By way of a guide, I might take an admirable introduction: A Profile of Pattern Analysis, prepared by the British Pattern Recognition Association for the Science and Engineering Research Council (Report number RL-83-086, September 1983).

Pattern analysis is defined as a field that includes pattern recognition, image processing, computer vision, waveform analysis and speech understanding. Its unified study is spurred by increasing automation in data-generating activity which in our field might include X-ray fluorescence and diffraction analyses, satellite and airborne remote sensing, geophysical exploration, natural seismicity, hydrogeological and geothermal evaluations and so on. Pattern analysis methods aim to extract information about objects/entities/processes/events from patterns arising from their images/imprints/histories. The information might include classification and enumeration of objects/entities, relative and absolute location and orientation of objects (in whatever frame of reference: geographical, petrogenetic, etc.) and measurement of objects. If time-dependent data are included, then pattern-analysis methods may yield descriptions of object spaces (e.g. sedimentation basins), identification of processes, prediction of future paths and tendencies. The temporal/spatial/spectral dimensionality of problems should not be considered restricted in pattern analysis studies, so problems should range from, for example, a one-dimensional trace of a property at fixed time, through to a multicomponent chemical/colour study of water masses over extended time.

There exist, of course, technical problems. By analogy with attempted radio reception, random variation (noise) in the sampled measurement (signal) is ubiquitous. Likewise, deformations introduced by non-linearities in the recording/measuring system. Composition comprises a triplet of technical problems, placement, occlusion/superposition and interaction. Placement is the searching for the object/event against a noisy background and is loosely allied to segmentation - the attempt to recognize homogeneous regions within the field. Occlusion/superposition is the covering of one object/event by another or by part of itself, e.g. mutual interference of seismic reflections. Interaction relates to the mutual modification of neighbouring events/objects.

A successful application of pattern analysis methods may lead to a classification of objects/events/processes but certainly to improved representation for human consumption, in that all redundancies are stripped away. Figure 1.1 may be taken as a projection onto two-dimensional paper of a four-dimensional pattern space in which the position of each point is fixed according to the four features that have been measured on each specimen/individual at some place/time. The projection is chosen so as to maximize the representation of the diversity of the sample, but despite this we are presented with the initial problem that in our haste to make the data readily available to our eye-brain complexes, we have to restrict the dimensionality of the analysis and thereby introduce artificial occlusions/superpositions.

If we take the position of each point in four-dimensional pattern space as defining an individual pattern, then we may be justified in supposing that similar patterns will tend to cluster together, yet retain some variability due to natural dispersion. Of course, this pre-supposes that we have given careful consideration to the metric by which we locate points in pattern space in the first place, since we are not limited to conventional ruler-like scales of measurement. Clustering algorithms are numerous: one has been used in the four-dimensional space of the example to generate Figure 1.2 from the data represented in Figure 1.1. Depending now on intuition, supported statistically, we may distinguish four or five groupings in pattern space - a tolerable segmentation has been achieved. In some applications the analysis might end here, wanting further information on the structure/context/syntax of the sampled population. However, our sample is of reflectances in four spectral bands obtained by satellite over the Solway Firth, southwest Scotland, so we may add information from a geographical reference frame by presenting a map-like image, as in Figure 1.3. In this simple example, the reference frame is known, but it could be an objective of pattern analysis methods to learn something of the structure of an unknown frame. Thus the sample data of Figures 1.1-1.3 were taken at regular (fixed) intervals over the ground. Figure 1.4, on the other hand, is
a sample of similar data from the same area taken at random, with no record of geographical position. The identity of structure with Figure 1.2 is reassuring.

Wolff & Parsons have covered a part of the ground required for such analyses, but have restricted themselves almost entirely to the practical methodology associated with better-known computer software packages that enable statistical studies and various unsupervised/supervised (clustering) techniques. Consequently, the book is likely to be of interest to relative newcomers to the field who seek a comparative assessment of these widely-available packages, rather than to experienced analysts who may be searching for innovative techniques. However, I would not go so far as the authors themselves in claiming the book out of date before publication. I am convinced many problems in the geological sciences have yet to yield to these methods and they are ignored at peril.

Specifically, the book is divided into four unequal parts. Part I (15 pages) broaches philosophical considerations and introduces the five principal computer packages: BMDP (Biomedical Computer Program), SPSS (Statistical Package for the Social Sciences), ARTHUR, CLUSTAN and SAS. Apart from the Scots-grown CLUSTAN, these are of North American origin and ARTHUR and SAS may not be known widely. Inevitably, these packages cover common ground and in part II (91 pages) useful comparisons are drawn, especially with respect to utility, and techniques are described in non-mathematical, sometimes (unfortunately) misleadingly simplistic terms. In this latter connection, the reader is advised that a background in introductory statistics might be an advantage in recognizing these shortcomings. Part III (52 pages) is entitled 'implementation' and deals with the nuts and bolts of data preparation, program control, file management, etc. while part IV (12 pages) reviews applications in the natural sciences, including 27 references in the field of geology and earth science. Nine appendices give a glossary of pattern recognition definitions and a list of 27 reference books together with brief notes on a miscellany of topics such as package availability, missing values, standardization, database description, non-parametric statistics, multivariate normal distributions and summaries of program facilities. Methods are illustrated throughout by reference to an example of an air pollution study, the data for which are included in another appendix. Illustrations are mostly in the form of tables, with rather fewer graphs, mostly scatter diagrams, some histograms.

Throughout, the emphasis tends towards the statistical approach to data analysis, not all of which might be thought of as falling into the field of pattern recognition. However, these are areas of rapid development, currently attracting substantial resource involvement and it would be disingenuous not to commend the authors' efforts in bringing together these well-tried methods in a subtle new guise attractive to the interested newcomer.

Figure 1. Stages in a pattern analysis study based on reflectances measured by satellite-borne sensors over the intertidal flats of the Solway Firth, southwest Scotland. (1.1) A sub-sample of 90 reflectances from a sample of 1440 measured in four spectral bands at points on a square grid and subjected to a principal component analysis. First two components displayed. (1.2) Same data clustered by Ward's method. The heights of the 'bridges' connecting clusters (or groups of clusters) are a measure of their separability. (1.3) Clusters of Figure 1.2 extrapolated back to the sample of 1440 and displayed in original 'geographical' coordinates. Overlaps indicate 'fuzzy' edges to the clusters, blanks indicate unclassified regions. Dashed ornament, open water and creeks; other ornaments, various intertidal zones. (1.4) Clusters derived from a random sub-sample of 90 from the original 1440 with no reference to geographical position. The basic pattern of clusters is repeated.
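
As a rough illustration of the first stage named in the caption to Figure 1 (a principal component analysis of four-band data, with the first two components displayed), the following is a minimal pure-Python sketch. It rests on stated assumptions: the data are synthetic and hypothetical, not the Solway Firth reflectances, and the function names and the choice of power iteration with deflation are made up for the example, since the review does not say how the components were computed.

```python
import math
import random

def covariance(rows):
    """Return (mean vector, sample covariance matrix) for equal-length rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[k] for r in rows) / n for k in range(d)]
    centred = [[r[k] - mean[k] for k in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centred) / (n - 1) for b in range(d)]
           for a in range(d)]
    return mean, cov

def matvec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(row[b] * v[b] for b in range(len(v))) for row in m]

def dominant_eigenpair(m, iters=500):
    """Power iteration on a symmetric matrix; returns (eigenvalue, unit vector)."""
    v = [float(k + 1) for k in range(len(m))]  # arbitrary non-zero start
    for _ in range(iters):
        w = matvec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, matvec(m, v)))
    return lam, v

def first_two_components(rows):
    """Project rows onto their first two principal components."""
    mean, cov = covariance(rows)
    d = len(cov)
    lam1, v1 = dominant_eigenpair(cov)
    # Deflation: subtract the first component so the second becomes dominant.
    rest = [[cov[a][b] - lam1 * v1[a] * v1[b] for b in range(d)]
            for a in range(d)]
    lam2, v2 = dominant_eigenpair(rest)
    scores = [(sum((r[k] - mean[k]) * v1[k] for k in range(d)),
               sum((r[k] - mean[k]) * v2[k] for k in range(d))) for r in rows]
    explained = (lam1 + lam2) / sum(cov[k][k] for k in range(d))
    return scores, (v1, v2), explained

# Hypothetical stand-in for 90 four-band reflectances: two latent factors
# (t and u) plus independent noise in each band.
random.seed(1)
rows = []
for _ in range(90):
    t, u = random.gauss(0, 10), random.gauss(0, 4)
    rows.append((50 + t + random.gauss(0, 1), 60 + t + random.gauss(0, 1),
                 40 + u + random.gauss(0, 1), 30 - u + random.gauss(0, 1)))

scores, (v1, v2), explained = first_two_components(rows)
print(f"first two components carry {explained:.1%} of the variance")
```

Plotting the pairs in `scores` would give a scatter of the same kind as Figure 1.1. The `explained` fraction indicates how much of the sample's diversity survives the projection onto two dimensions, which is the loss the text describes as artificial occlusion/superposition.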

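
Stages 1.2 and 1.3 of the caption to Figure 1, clustering a sub-sample by Ward's method and carrying the clusters back to the full sample, might be sketched along the following lines. This is an illustration under assumptions rather than the procedure actually used for the figure: the data are synthetic, the sub-sample is drawn at random, the function names are hypothetical, and the nearest-centroid rule for the extrapolation is assumed because the review does not state how the clusters were extended. Unlike Figure 1.3, this simple rule produces no overlaps and no unclassified points.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of feature vectors."""
    return tuple(sum(col) / len(points) for col in zip(*points))

def ward(points, n_clusters):
    """Merge clusters by Ward's criterion until n_clusters remain.

    Returns (clusters, heights). Each cluster is a list of indices into
    points; heights holds the growth in within-cluster sum of squares at
    each merge, i.e. the 'bridge' heights of the dendrogram.
    """
    clusters = [[i] for i in range(len(points))]
    cents = [tuple(p) for p in points]
    heights = []
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ni, nj = len(clusters[i]), len(clusters[j])
                # Increase in sum of squares if clusters i and j are merged.
                cost = ni * nj / (ni + nj) * sq_dist(cents[i], cents[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        heights.append(cost)
        clusters[i] = clusters[i] + clusters[j]
        cents[i] = centroid([points[k] for k in clusters[i]])
        del clusters[j], cents[j]  # j > i, so index i is unaffected
    return clusters, heights

def nearest(point, cents):
    """Index of the centroid closest to point."""
    return min(range(len(cents)), key=lambda c: sq_dist(point, cents[c]))

# Hypothetical stand-in for 1440 four-band reflectances: four synthetic,
# well-separated groups, NOT the Solway Firth data.
random.seed(0)
means = [(20, 20, 20, 20), (80, 30, 60, 40), (40, 90, 30, 80), (90, 90, 90, 90)]
full = [tuple(random.gauss(m, 3.0) for m in means[g % 4]) for g in range(1440)]
sub = random.sample(full, 90)  # sub-sample of 90, as in the caption

clusters, heights = ward(sub, 4)
cents = [centroid([sub[k] for k in c]) for c in clusters]
labels = [nearest(p, cents) for p in full]  # extrapolate to all 1440 points
print("cluster sizes in sub-sample:", sorted(len(c) for c in clusters))
```

If the 1440 points lie on a square grid, reshaping `labels` to the grid dimensions gives a map-like image of the kind shown in Figure 1.3. The search over all cluster pairs at every merge is deliberately naive; it is adequate for 90 points but would be slow on the full sample.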
http://journals.cambridge.org Downloaded: 24 Apr 2016 IP address: 130.132.123.28
