Concepts of Pattern Recognition

Basic Pattern Recognition Concept
Xiaojun Qi

When a person perceives a pattern, he
makes an inductive inference and
associates this perception with some
general concepts or clues which he has
derived from his past experience.
Thus, the problem of pattern recognition
may be regarded as one of discriminating
the input data, not between individual
patterns but between populations, via the
search for features or invariant attributes
among members of a population.

Pattern recognition can be defined as the
categorization of input data into identifiable
classes via the extraction of significant
features or attributes of the data from a
background of irrelevant detail.

Pattern: A pattern is the description of an object.
According to the nature of the patterns to
be recognized, we may divide our acts of
recognition into two major types:
The recognition of concrete items
The recognition of abstract items

The study of pattern recognition problems
may be logically divided into two major
categories:
The study of the pattern recognition capability
of human beings and other living organisms.
(Psychology, Physiology, and Biology)
The development of theory and techniques for
the design of devices capable of performing a
given recognition task for a specific
application. (Engineering, Computer, and
Information Science)

Task of Classification     Input Data                   Output Response
Character Recognition      Optical signals or strokes   Name of character
Speech Recognition         Acoustic waveforms           Name of word
Speaker Recognition        Voice                        Name of speaker
Weather Prediction         Weather maps                 Weather forecast
Medical Diagnosis          Symptoms                     Disease
Stock Market Prediction    Financial news and charts    Predicted market ups and downs

Pattern Class: It is a category determined by
some given common attributes.
Pattern: It is the description of any member of a
category representing a pattern class. When a
set of patterns falling into disjoint classes is
available, it is desired to categorize these
patterns into their respective classes through the
use of some automatic device.
The basic functions of a pattern recognition
system are to detect and extract common
features from the patterns describing the objects
that belong to the same pattern class, and to
recognize this pattern in any new environment
and classify it as a member of one of the pattern
classes under consideration.

The second problem concerns the extraction of
characteristic features or attributes from the
received input data and the reduction of the
dimensionality of pattern vectors. (This is often
referred to as the preprocessing and feature
extraction problem.)
The features of a pattern class are the characterizing
attributes common to all patterns belonging to that
class. Such features are often referred to as intraset
features.
The features which represent the differences between
pattern classes may be referred to as the interset
features. The elements of intraset features which are
common to all pattern classes under consideration
carry no discriminatory information and can be ignored.
The extraction of features has been recognized as an
important problem in the design of pattern recognition
systems.

The patterns to be recognized and
classified by an automatic pattern
recognition system must possess a set of
measurable characteristics.
Correct recognition will depend on
The amount of discriminating information
contained in the measurements;
The effective utilization of this information.

Fundamental Problems in Pattern Recognition System Design
The first one is concerned with the representation of
input data which can be measured from the objects to
be recognized.
The pattern vectors contain all the measured
information available about the patterns. The
measurements performed on the objects of a pattern
class may be regarded as a coding process which
consists of assigning to each pattern characteristic a
symbol from the alphabet set.
When the measurements yield information in the
form of real numbers, it is often useful to think of a
pattern vector as a point in an n-dimensional
Euclidean space.
The set of patterns belonging to the same class
corresponds to an ensemble of points scattered
within some region of the measurement space.

The third problem involves the determination of
optimum decision procedures, which are needed
in the identification and classification process.
If complete a priori knowledge about the patterns to
be recognized is available, the decision functions may
be determined with precision on the basis of this
information.
If only qualitative knowledge about the patterns is
available, reasonable guesses of the forms of the
decision functions can be made and then adjusted
as necessary.
If there exists little, if any, a priori knowledge about
the patterns to be recognized, a training or learning
procedure is needed.
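
One possible training procedure for the case with little a priori knowledge is sketched below. The slides do not name a specific algorithm, so the choice of the perceptron rule, the function names, and the four training patterns are assumptions made for illustration only.

```python
# Assumed example: perceptron learning of a linear decision function
# d(x) = w . x + b from labeled training patterns (labels are +1 or -1).

def train_perceptron(samples, epochs=100, rate=1.0):
    """Return weights [w_1, ..., w_n, b]; the last entry is the bias."""
    dim = len(samples[0][0])
    w = [0.0] * (dim + 1)
    for _ in range(epochs):
        errors = 0
        for x, label in samples:
            xa = list(x) + [1.0]  # augment the pattern with a constant 1
            d = sum(wi * xi for wi, xi in zip(w, xa))
            if label * d <= 0:  # misclassified (or on the boundary)
                w = [wi + rate * label * xi for wi, xi in zip(w, xa)]
                errors += 1
        if errors == 0:  # every training pattern is classified correctly
            break
    return w

def decide(w, x):
    """Evaluate the learned decision function and return +1 or -1."""
    d = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if d > 0 else -1

# Hypothetical, linearly separable training set.
TRAINING = [((2.0, 2.0), 1), ((3.0, 3.0), 1), ((0.0, 0.0), -1), ((1.0, 0.0), -1)]
WEIGHTS = train_perceptron(TRAINING)
```

The loop stops early once a full pass produces no corrections, which happens after a finite number of passes only when the two classes are linearly separable.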

Design Concepts and Methodologies
Membership-roster Concept
Characterization of a pattern class by a roster
of its members suggests automatic pattern
recognition by template matching.
The membership-roster approach will work
satisfactorily under the condition of nearly
perfect pattern samples.
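
Template matching under the membership-roster concept might look like the sketch below. The 3x3 binary templates, the class labels, and the mismatch threshold are hypothetical; they only illustrate why nearly perfect samples are required.

```python
# Assumed example: each class is represented by one stored template.
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}

def mismatch(a, b):
    """Count the pixels at which two equally sized binary images differ."""
    return sum(pa != pb
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))

def match_template(image, templates=TEMPLATES, max_mismatch=0):
    """Return the label of the closest template, or None if the best
    template still differs in more than max_mismatch pixels."""
    label = min(templates, key=lambda name: mismatch(image, templates[name]))
    if mismatch(image, templates[label]) <= max_mismatch:
        return label
    return None
```

With `max_mismatch=0`, a single flipped pixel is enough to reject an otherwise correct pattern; raising the threshold tolerates some noise at the cost of more confusions between similar templates.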

Common-property Concept
Characterization of a pattern class by
common properties shared by all of its
members suggests automatic pattern
recognition via the detection and processing
of similar features.
The basic assumption in this method is that
the patterns belonging to the same class
possess certain common properties or
attributes which reflect similarities among
these patterns.

Clustering Concept
When the patterns of a class are vectors
whose components are real numbers, a
pattern class can be characterized by its
clustering properties in the pattern space.
If the classes are characterized by clusters
which are far apart, simple recognition
schemes such as the minimum-distance
classifiers may be successfully employed.
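
A minimum-distance classifier of the kind mentioned above can be sketched as follows. The sample points and the use of the class mean as the cluster prototype are assumptions for illustration.

```python
import math

def class_means(samples):
    """samples maps a class label to a list of pattern vectors.
    Returns the mean vector (cluster prototype) of each class."""
    means = {}
    for label, vectors in samples.items():
        n = len(vectors)
        means[label] = [sum(col) / n for col in zip(*vectors)]
    return means

def classify(x, means):
    """Assign x to the class whose mean is nearest in Euclidean distance."""
    return min(means, key=lambda label: math.dist(x, means[label]))

# Hypothetical, well-separated clusters in a 2-D pattern space.
SAMPLES = {
    "A": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "B": [(5.0, 5.0), (6.0, 5.0), (5.0, 6.0), (6.0, 6.0)],
}
MEANS = class_means(SAMPLES)
```

Here `classify((1.0, 2.0), MEANS)` falls nearer the mean of class A and `classify((4.0, 5.0), MEANS)` nearer the mean of class B. The scheme degrades as the clusters move closer together or overlap.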

Advantages of the Common-property Concept over
the Membership-roster Concept
The storage requirement for the features of a
pattern class is much less severe than that for
all the patterns in the class.
Significant pattern variations cannot be
tolerated in template matching. If all the
features of a class can be determined from
sample patterns, the recognition process
reduces simply to feature matching.

Overlapping clusters are the result of:
A deficiency in observed information;
The presence of measurement noise.

The degree of overlapping can often be
minimized by:
Increasing the number and the quality of
measurements performed on the patterns of a
class.

When the clusters overlap, it becomes
necessary to utilize more sophisticated
techniques for partitioning the pattern space.

The basic design concepts for automatic
pattern recognition described above may
be implemented by three principal
categories of methodology:
Heuristic;
Mathematical;
Linguistic or syntactic.

Heuristic Methods: The heuristic approach is
based on human intuition and experience,
making use of the membership-roster and
common-property concepts.
A system designed using this principle generally
consists of a set of ad hoc procedures developed
for specialized recognition tasks.
Decision is based on ad hoc rules.
Example: Character recognition (Detection of
features such as the number and sequence of
particular strokes)

Mathematical Methods: The mathematical approach is
based on classification rules which are formulated and
derived in a mathematical framework,
making use of the common-property and
clustering concepts.
Deterministic approach:
Does not employ explicitly the statistical properties of
the pattern classes.

Statistical approach:
It is formulated and derived in a statistical framework.
Example: Bayes classification rule and its variations.
This rule yields an optimum classifier when the
probability density function of each pattern population
and the probability of occurrence of each pattern class
are known.
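
The Bayes classification rule can be illustrated with a small sketch in which both required quantities are known. The one-dimensional Gaussian densities, their parameters, and the prior values are assumptions chosen for the example, not values from the slides.

```python
import math

def gaussian_pdf(x, mean, std):
    """Probability density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def bayes_classify(x, classes):
    """classes maps a label to (prior, mean, std).
    Choose the class that maximizes prior * p(x | class)."""
    def score(label):
        prior, mean, std = classes[label]
        return prior * gaussian_pdf(x, mean, std)
    return max(classes, key=score)

# Hypothetical two-class problem, first with equal and then unequal priors.
EQUAL = {"w1": (0.5, 0.0, 1.0), "w2": (0.5, 4.0, 1.0)}
SKEWED = {"w1": (0.9, 0.0, 1.0), "w2": (0.1, 4.0, 1.0)}
```

With equal priors the decision boundary lies midway between the two means, so `x = 2.3` is assigned to `w2`. When `w1` is made much more probable a priori, the boundary shifts toward `w2` and the same measurement is assigned to `w1`.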

Linguistic (Syntactic) Methods: Characterization
of patterns by primitive elements (subpatterns)
and their relationships suggests automatic
pattern recognition by the linguistic or syntactic
approach, making use of the common-property
concept.
A pattern can be described by a hierarchical structure
of subpatterns analogous to the syntactic structure of
languages. This permits application of formal
language theory to the pattern recognition problem.
This approach is particularly useful in dealing with
patterns which cannot be conveniently described by
numerical measurements or are so complex that local
features cannot be identified and global properties
must be used.
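
A very small sketch of the syntactic idea follows. Each pattern is written as a string of primitives and each class is described by a grammar; here regular expressions stand in for regular grammars, the simplest family in formal language theory. The primitives `h` (horizontal stroke) and `v` (vertical stroke) and the two class grammars are hypothetical.

```python
import re

# Assumed example: one grammar per pattern class, over the primitives h and v.
GRAMMARS = {
    "rectangle": re.compile(r"h+v+h+v+"),  # side, side, side, side
    "corner": re.compile(r"v+h+"),         # a vertical run, then a horizontal run
}

def parse(sentence, grammars=GRAMMARS):
    """Return the first class whose grammar accepts the whole primitive
    string, or None if no grammar accepts it."""
    for name, grammar in grammars.items():
        if grammar.fullmatch(sentence):
            return name
    return None
```

Patterns with nested or recursive structure need context-free or richer grammars and a proper parser; the regular case is shown only because it fits in a few lines.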

Examples of Automatic Pattern Recognition Systems
In a supervised learning environment, the
system is taught to recognize patterns by means
of various adaptive schemes. The essentials of
this approach are a set of training patterns of
known classification and the implementation of
an appropriate learning procedure.
The unsupervised pattern recognition
techniques are applicable to the situations
where only a set of training patterns of unknown
classification may be available.

Character Recognition:
Technique Used: Rather than being compared with
pre-stored patterns, hand-printed characters are
analyzed as combinations of common features, such as
curved lines, vertical and horizontal lines,
corners, and intersections.

Automatic Classification of Remotely Sensed Data:
Examples: Land use, crop inventory, crop-disease
detection, forestry, monitoring of air and water
quality, geological and geographical studies, and
weather prediction, plus a score of other
applications of environmental significance.
Technique Used: Bayes classifier

Biomedical Applications:
Technique Used: Pattern primitives, such as
long arcs, short arcs, and semi-straight
segments, which characterize the
chromosome boundaries, are defined. When
combined, these primitives form a string or
symbol sentence which can be associated
with a so-called pattern grammar. There is
one grammar for each type (class) of
chromosome.

Fingerprint Recognition:
Technique Used: It detects tentative minutiae
and records their precise locations and
angles.

Nuclear Reactor Component Surveillance:
Technique Used: Detect the clusters of pattern
vectors by iterative applications of a
cluster-seeking algorithm.
The data cluster centers and associated
descriptive parameters, such as cluster
variances, can then be used as templates against
which measurements are compared at any given
time in order to determine the status of the plant.
Significant deviations from the pre-established
characteristic normal behavior are flagged as
indications of abnormal operating conditions.
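
The surveillance scheme described above can be sketched as two steps: find cluster centers from normal-operation data, then flag measurements that lie too far from every center. The slides do not specify the cluster-seeking algorithm or what counts as a significant deviation, so the K-means style iteration, the three-spread threshold, and the sample readings below are all assumptions.

```python
import math

def nearest(x, centers):
    """Index of the cluster center closest to x (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))

def seek_clusters(points, centers, max_iter=50):
    """Assign points to the nearest center, move each center to the mean
    of its points, and repeat until the centers stop changing."""
    centers = [list(c) for c in centers]
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        updated = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated
    return centers

def spreads(points, centers):
    """Root-mean-square distance of each cluster's points from its center."""
    sums = [0.0] * len(centers)
    counts = [0] * len(centers)
    for p in points:
        i = nearest(p, centers)
        sums[i] += math.dist(p, centers[i]) ** 2
        counts[i] += 1
    return [math.sqrt(s / n) if n else 0.0 for s, n in zip(sums, counts)]

def is_abnormal(x, centers, spread, factor=3.0):
    """Flag x when it lies more than factor * spread from its nearest center."""
    i = nearest(x, centers)
    return math.dist(x, centers[i]) > factor * spread[i]

# Hypothetical measurements recorded during normal operation.
NORMAL = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
CENTERS = seek_clusters(NORMAL, [NORMAL[0], NORMAL[3]])
SPREAD = spreads(NORMAL, CENTERS)
```

A new reading such as `(1.1, 1.0)` stays within the first cluster's spread, while `(3.0, 3.0)` lies far from both centers and would be flagged.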

A Simple Pattern Recognition Model
A simple scheme for pattern recognition
consists of two basic components:
Sensor: It is a device which converts a
physical sample to be recognized into a set of
quantities which characterize the sample.
Categorizer: It is a device which assigns each
of its admissible inputs to one of a finite
number of classes or categories by computing
a set of decision functions.
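
A categorizer of this kind can be sketched as one decision function per class, with the input assigned to the class whose function is largest. The linear form of the functions, the maximum-selection convention, and the weights below are assumptions for illustration; the sensor is represented simply by the measurement vector it would produce.

```python
def categorize(x, functions):
    """functions maps a class label to (w, b), defining d(x) = w . x + b.
    Return the label whose decision function is largest at x."""
    def d(label):
        w, b = functions[label]
        return sum(wi * xi for wi, xi in zip(w, x)) + b
    return max(functions, key=d)

# Hypothetical decision functions for three classes in a 2-D measurement space.
FUNCTIONS = {
    "class1": ((1.0, 0.0), 0.0),
    "class2": ((0.0, 1.0), 0.0),
    "class3": ((-1.0, -1.0), 0.0),
}
```

For the measurement `(2.0, 1.0)` the three functions evaluate to 2, 1, and -3, so the categorizer returns `class1`.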

We assume that the a priori probabilities
for the occurrence of each class are equal,
that is, it is just as likely that x comes from
one class as from another.
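
Under the Bayes classification rule, this assumption removes the priors from the decision. The short derivation below uses notation that is assumed rather than taken from the slides: M classes written as omega_1, ..., omega_M, class-conditional densities p(x | omega_i), and priors P(omega_i).

```latex
% Bayes rule: assign x to class \omega_i when
\[
  p(x \mid \omega_i)\, P(\omega_i) > p(x \mid \omega_j)\, P(\omega_j)
  \quad \text{for all } j \neq i .
\]
% With equal priors, P(\omega_i) = 1/M for every i, the common factor cancels:
\[
  p(x \mid \omega_i) > p(x \mid \omega_j)
  \quad \text{for all } j \neq i .
\]
```

The classifier then depends only on the class-conditional densities, that is, on how well each class explains the observed x.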
