
INTRODUCTION TO PATTERN RECOGNITION
CSE555, Fall 2021
Chapter 1, DHS

MOST OF THE SLIDES IN THIS COURSE ARE FROM PROF. DAS BHATTACHARJEE, SREYASEE AND PROF. SIWEI LYU

syllabus


Course Information
• Register for the class on Piazza, our main resource for discussion and communication
• Python tutorial will be out today


Key Points to Remember

• Come to lectures and pay attention. This will definitely make your subsequent tasks easier.
• Study regularly, as I would love to see each of my students finishing this course with flying colors.
• Revise via the textbook (immediately).
• Make a habit of writing in pen and paper (typing does not help much) to make sure you have understood a topic well.
• At any time you are stuck, your first job will be to “define your problem” for us.
• Your first contact can be your TA, but please do not ever shy away from just dropping by my office hours (though scheduling an appointment would help me set an exclusive slot for you). Just do not avoid things and leave them till the exams.
• I am here to help you learn. So, please help me to help you.


Relationships

https://www.analyticsvidhya.com/blog/2015/07/difference-machine-learning-statistical-modeling/
https://levity.ai/blog/difference-machine-learning-deep-learning

Can you tell me which one is fake and which one is real?


Eye


Look at their eyes


Results

Similar Different
Human Perception
Humans have developed highly sophisticated skills for sensing their environment and taking actions according to what
they observe, e.g.,
• Recognizing faces
• Understanding spoken language
• Reading and recognizing handwritten digits/characters
• Speaker identification
• Remote sensing
• EEG/ECG analysis
• DNA sequence identification
• Personal identification through iris scanning and fingerprinting
• …
We would like to give similar capabilities to machines, i.e., the requirement is for automated machine recognition of objects, signals, and images to support automated machine decision-making.

What is a Pattern?

A pattern is an entity, vaguely defined, that could be given a name, e.g.,
• human face
• speech signal
• handwritten word
• EEG/ECG signal
• DNA sequence
• fingerprint image, iris, …
Pattern recognition is the study of how machines can
• observe the environment,
• learn to distinguish patterns of interest,
• make sound and reasonable decisions about the categories of the patterns.

Steps of Pattern Recognition

Step 0: Formulate the real-world classification/decision-making problem
Step 1: Data Acquisition: How to acquire data, how much data should be acquired?
Step 2: Preprocessing: Noise removal, filtering, normalization, outlier removal
Step 3: Feature Extraction: Extraction of relevant features from data
Step 4: Feature Selection: Selection of minimum set of most relevant features
Step 5: Model Selection & Training: Choosing classification model and fitting it with training data
Step 6: Evaluation: Generalization performance? Confidence in the estimation of generalization performance?
Step 7: Solution.
Traditional pattern recognition concerns feature selection and model selection, whereas in modern systems,
automatic feature extraction is possible with deep neural networks.
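
The steps above can be sketched end to end on synthetic data. This is only an illustration under stated assumptions: the two-class data, the class names "A" and "B", the standardization step and the nearest-class-mean classifier are all made up for brevity and are not taken from the slides.

```python
import random
import statistics

random.seed(0)

# Step 1: data acquisition (synthetic stand-in; all values are made up).
# Feature 0 carries class information, feature 1 is pure noise.
def acquire(n_per_class):
    data = []
    for label, center in (("A", 2.0), ("B", 6.0)):
        for _ in range(n_per_class):
            data.append(([random.gauss(center, 1.0), random.gauss(0.0, 1.0)], label))
    random.shuffle(data)
    return data

# Step 2: preprocessing (standardize each feature with training statistics).
def fit_scaler(samples):
    columns = list(zip(*(x for x, _ in samples)))
    return [(statistics.mean(c), statistics.pstdev(c)) for c in columns]

def scale(x, scaler):
    return [(v - m) / s for v, (m, s) in zip(x, scaler)]

# Steps 3-4: feature selection (keep the feature whose class means differ most).
def select_feature(samples):
    def gap(j):
        a = [x[j] for x, y in samples if y == "A"]
        b = [x[j] for x, y in samples if y == "B"]
        return abs(statistics.mean(a) - statistics.mean(b))
    return max(range(len(samples[0][0])), key=gap)

# Step 5: model training (nearest class mean on the selected feature).
def train(samples, j):
    return {label: statistics.mean([x[j] for x, y in samples if y == label])
            for label in ("A", "B")}

def predict(model, x, j):
    return min(model, key=lambda label: abs(x[j] - model[label]))

# Step 6: evaluation on held-out data.
data = acquire(100)
train_set, test_set = data[:150], data[150:]
scaler = fit_scaler(train_set)
train_scaled = [(scale(x, scaler), y) for x, y in train_set]
test_scaled = [(scale(x, scaler), y) for x, y in test_set]
j = select_feature(train_scaled)
model = train(train_scaled, j)
accuracy = sum(predict(model, x, j) == y for x, y in test_scaled) / len(test_scaled)
print(f"selected feature {j}, test accuracy {accuracy:.2f}")
```

The scaler and the selected feature are fitted on the training split only, so the test accuracy is an estimate of generalization performance (Step 6) rather than of training performance.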

Key Concepts of Pattern Recognition
feature: variable believed to carry discriminating and characterizing information about an object to be identified
feature vector: the features of one object collected into a single vector
class/label: category to which a given object belongs
pattern: features + class
training data, testing data
cost function: quantitative measure representing the cost of making a classification error
true positive vs. false positive
training performance vs. generalization performance
supervised vs. unsupervised algorithms
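
Two of the concepts above, the cost function and true vs. false positives, can be made concrete with a small sketch. The labels, the predictions and the cost values below are hypothetical and chosen only for illustration.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count true/false positives and negatives for one class treated as 'positive'."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif t == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def total_cost(y_true, y_pred, cost):
    """Sum the cost of every (true label, predicted label) pair."""
    return sum(cost[(t, p)] for t, p in zip(y_true, y_pred))

# Hypothetical labels and predictions.
y_true = ["salmon", "salmon", "bass", "bass", "bass", "salmon"]
y_pred = ["salmon", "bass", "bass", "salmon", "bass", "salmon"]

# Hypothetical cost table keyed by (true, predicted); correct decisions cost nothing.
cost = {
    ("salmon", "salmon"): 0.0,
    ("bass", "bass"): 0.0,
    ("salmon", "bass"): 1.0,
    ("bass", "salmon"): 5.0,
}

tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive="salmon")
print(tp, fp, fn, tn, total_cost(y_true, y_pred, cost))
```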

Problems & Algorithms

Classification: supervised prediction of categorical labels. Examples: support vector classifier, linear discriminant analysis, decision tree.
Clustering: unsupervised prediction of categorical labels. Examples: k-means, kernel principal component analysis.
Regression: prediction of real-valued labels. Examples: linear regression, generalized linear regression, principal component analysis, neural networks/deep learning, support vector regression.
Sequence labeling: prediction of a sequence of categorical or real-valued labels. Examples: Kalman filter, particle filter, hidden Markov model, conditional random fields.
Ensemble learning: meta-algorithm combining multiple learning algorithms.
Graphical model: structured labels.
Neural network: general-purpose learning, usable in both supervised and unsupervised settings.
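
To illustrate what "unsupervised" means in the list above, here is a minimal one-dimensional k-means sketch. It groups points without ever seeing a label. The data points and the initial centers are made-up assumptions.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on scalars: assign each point to its nearest center, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers

# Made-up measurements forming two visible groups; no class labels are used.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers = kmeans_1d(points, centers=[0.0, 6.0])
print(centers)
```

A classifier would instead be given the label of each training point and would learn a rule that reproduces those labels.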

Machine Perception

• Accurate pattern recognition by machine would be immensely useful.
• In solving such problems, we gain a deeper understanding of and appreciation for pattern recognition systems in the natural world, most particularly in humans.
• For some applications, such as speech and visual recognition, our design efforts may in fact be influenced by knowledge of how these problems are solved in nature, both in the algorithms we employ and in the design of special-purpose hardware.

Applications

An Example System

Many visual inspection systems are like this: Circuit board, fruit, OCR, etc.

Feature Extraction

• Features are the intrinsic characteristics of an object (or object type) that set it apart from other objects (or object types).
• Feature extraction and representation aim at:
  • Identifying relevant and distinguishing characteristics of a pattern
  • Achieving data reduction and abstraction
An Example Problem

A fish packing plant wants to avoid mislabeling “sea bass” as “salmon.”
They want to sort incoming fish on the conveyor belt according to species.
Questions of Interest:
• What information should be preserved to help distinguish the two species?
  • Length, width, weight, number and shape of fins, tail shape, etc.
• What can pose potential challenges?
  • Images captured by the camera may appear different due to lighting conditions, position of the fish on the conveyor belt, camera noise, etc.
The decision process is performed as: image capture -> segment image -> take all the measurements -> make decision
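
The decision process above can be sketched on a toy grayscale image. Everything here is an assumption made for illustration: the image values, the dark conveyor belt, the background cutoff, the lightness threshold, and the choice that lighter fish are called sea bass.

```python
def segment(image, background_max=10):
    """Return (row, col) coordinates of pixels brighter than the (assumed dark) belt."""
    return [(r, c) for r, row in enumerate(image)
            for c, value in enumerate(row) if value > background_max]

def measure(image, pixels):
    """Take simple measurements of the segmented fish region."""
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    return {
        "length": max(cols) - min(cols) + 1,     # horizontal extent in pixels
        "width": max(rows) - min(rows) + 1,      # vertical extent in pixels
        "lightness": sum(image[r][c] for r, c in pixels) / len(pixels),
    }

def decide(features, lightness_threshold=100):
    """Hypothetical rule: lighter fish are labeled sea bass."""
    return "sea bass" if features["lightness"] > lightness_threshold else "salmon"

# "Image capture": a made-up 4x6 image, 0 = belt, larger values = fish.
image = [
    [0,   0,   0,   0,   0, 0],
    [0, 120, 130, 125,   0, 0],
    [0, 110, 140, 135, 115, 0],
    [0,   0,   0,   0,   0, 0],
]

features = measure(image, segment(image))
label = decide(features)
print(features, label)
```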

An Example Problem: Feature Selection

• Lightness may be one of the best visual features in isolation.
• To improve the accuracy, we still want to add more features (information) that may help discriminate “salmon” from “sea bass”.
  • Observe that sea bass are typically longer than salmon.
  • So, length can be another feature to discriminate between the two types of fish.
The question still remains: what is the best threshold?
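
One simple way to look for a threshold on a single feature is to try every midpoint between consecutive training values and keep the one with the fewest training errors. The sketch below does this; the lightness values and labels are made up, and the rule "above the threshold means sea bass" is an assumption.

```python
def best_threshold(values, labels, above="sea bass", below="salmon"):
    """Exhaustively search midpoints for the threshold with the fewest training errors."""
    candidates = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(candidates, candidates[1:])]

    def errors(t):
        return sum((above if v > t else below) != y for v, y in zip(values, labels))

    best = min(midpoints, key=errors)  # first midpoint with the minimum error count
    return best, errors(best)

# Made-up training data: one feature (lightness) per fish.
lightness = [2.0, 3.0, 3.5, 4.0, 5.0, 5.5, 6.0, 7.0]
species = ["salmon", "salmon", "salmon", "sea bass",
           "salmon", "sea bass", "sea bass", "sea bass"]

threshold, n_errors = best_threshold(lightness, species)
print(threshold, n_errors)
```

Because the two classes overlap, no threshold reaches zero errors here. A threshold chosen this way is tuned to the training samples and counts all mistakes as equally bad.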

Feature Selection


Is lightness a better feature?

Estimating the Cost of Error

Consequences of our actions may not be equally costly. So far we have tacitly assumed that deciding the fish is a
sea bass when in fact it is a salmon is just as undesirable as the converse. Such a
symmetry in the cost is often, but not invariably, the case.
• Mistaking salmon for sea bass may not be as expensive a mistake as the other way around.
Does this information help our decision?
• There should be a single overall cost associated with our decision, and our
task is to make a decision rule that minimizes this cost.
• But relying on a single feature may have other challenges. For example, in order to
play safe, we may allow more salmon to be misidentified as sea bass and end up
making losses.
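
The effect of unequal costs on the decision rule can be sketched by choosing the threshold that minimizes total cost instead of the error count. The data and the two cost values are hypothetical. The rule "above the threshold means sea bass" is again an assumption.

```python
def min_cost_threshold(values, labels, cost_bass_as_salmon, cost_salmon_as_bass):
    """Search midpoints for the threshold with the lowest total misclassification cost."""
    candidates = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(candidates, candidates[1:])]

    def cost(t):
        total = 0.0
        for v, y in zip(values, labels):
            predicted = "sea bass" if v > t else "salmon"
            if y == "sea bass" and predicted == "salmon":
                total += cost_bass_as_salmon
            elif y == "salmon" and predicted == "sea bass":
                total += cost_salmon_as_bass
        return total

    best = min(midpoints, key=cost)
    return best, cost(best)

# Made-up lightness values and labels.
lightness = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
species = ["salmon", "salmon", "sea bass", "salmon",
           "salmon", "sea bass", "sea bass", "sea bass"]

symmetric = min_cost_threshold(lightness, species, 1.0, 1.0)
asymmetric = min_cost_threshold(lightness, species, 5.0, 1.0)
print(symmetric, asymmetric)
```

With equal costs the search settles on 5.5. When calling a sea bass "salmon" is five times as costly, the threshold drops to 2.5, so more salmon are misidentified as sea bass in exchange for fewer expensive mistakes.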

Add More Features for Representation

• We can use two features for making our decision:
  • Lightness (x1)
  • Length (x2)
• Each fish image is then represented as a single point (feature vector) x = (x1, x2).
• This is a two-dimensional feature space.
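
As a sketch of how a point in this two-dimensional feature space can be classified, the code below applies a straight-line decision boundary. The weights, the offset and the two example fish are hypothetical values, not measurements from the slides.

```python
def linear_decision(x, w, b):
    """Classify a feature vector x = (x1, x2) by the sign of w . x + b."""
    score = w[0] * x[0] + w[1] * x[1] + b
    return "sea bass" if score > 0 else "salmon"

# Hypothetical boundary: the line x1 + x2 = 10 in the (lightness, length) plane.
w, b = (1.0, 1.0), -10.0

fish_a = (3.0, 4.0)  # (lightness x1, length x2), made-up values
fish_b = (7.0, 6.0)

print(linear_decision(fish_a, w, b), linear_decision(fish_b, w, b))
```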

Example: Two Feature Representation


Can you tell me which one is fake and which one is real?

Pupil

Note that the pupils of the real eyes have strongly circular or
elliptical shapes (yellow), while those of the fake eyes have
irregular shapes (red).
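
Shape regularity can be turned into a numeric feature. The sketch below uses circularity, 4πA/P² (1 for a perfect circle, smaller for ragged outlines), as a simple stand-in. It is an assumption made for illustration and not the method of the work referenced on the Results slide; an elongated ellipse would also score below 1 under this measure. The two contours are synthetic.

```python
import math

def polygon_area_perimeter(points):
    """Shoelace area and perimeter of a closed polygon given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2, perimeter

def circularity(points):
    """4*pi*A / P^2: close to 1 for a circle, lower for irregular outlines."""
    area, perimeter = polygon_area_perimeter(points)
    return 4 * math.pi * area / perimeter ** 2

def contour(radius_fn, n=72):
    """Sample a closed outline whose radius at step k is radius_fn(k)."""
    return [(radius_fn(k) * math.cos(2 * math.pi * k / n),
             radius_fn(k) * math.sin(2 * math.pi * k / n)) for k in range(n)]

regular = contour(lambda k: 1.0)                            # near-circular outline
irregular = contour(lambda k: 1.0 if k % 2 == 0 else 0.7)   # jagged outline

print(round(circularity(regular), 3), round(circularity(irregular), 3))
```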

Look at their eyes


Results

Real: regular shape. Fake: irregular shape.
https://mobile.twitter.com/ak92501/status/1433250799749054465
https://arxiv.org/abs/2109.00162


Is Adding More Features Always a Solution?

Does adding more features always improve the results?
• Avoid unreliable features.
• Be careful about correlations with existing features.
• Be careful about measurement costs.
• Be careful about noise in the measurements.
• Is there some challenge (curse) in working in very high dimensions?
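
One way to see why very high dimensions are difficult is to count how quickly the feature space empties out. The sketch below computes two standard quantities: the fraction of a hypercube's volume occupied by its inscribed ball, and the number of grid cells needed to cover the space at a fixed resolution. The function names are illustrative.

```python
import math

def ball_to_cube_ratio(d):
    """Volume of the unit ball in d dimensions divided by the volume of the cube [-1, 1]^d."""
    ball = math.pi ** (d / 2) / math.gamma(d / 2 + 1)
    return ball / 2 ** d

def cells_needed(bins_per_axis, d):
    """Number of grid cells when every feature axis is split into the same number of bins."""
    return bins_per_axis ** d

for d in (2, 5, 10, 20):
    print(d, ball_to_cube_ratio(d), cells_needed(10, d))
```

As d grows, almost all of the cube's volume lies in its corners, and the number of cells (and hence the amount of training data needed to populate them) grows exponentially.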

Trying a Different Decision Rule?

Should we choose a decision boundary more complex than a simple straight line?

Generalization

The central aim of designing a classifier is to suggest actions when presented with
novel patterns, i.e., fish not yet seen. This is the issue of generalization.

It is unlikely that the complex decision boundary would provide good generalization,
since it seems to be “tuned” to the particular training samples, rather than some
underlying characteristics or true model of all the sea bass and salmon that will
have to be separated.
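
The gap between training and generalization performance can be sketched with a deliberately over-flexible rule. Below, a 1-nearest-neighbor classifier (which memorizes every training fish) is compared with a single fixed threshold on one synthetic feature. The class distributions, the sample sizes and the threshold value are all assumptions made for the illustration.

```python
import random

random.seed(1)

def sample(n):
    """Draw n synthetic fish with one overlapping feature (made-up distributions)."""
    data = []
    for _ in range(n):
        label = random.choice(["salmon", "sea bass"])
        center = 4.0 if label == "salmon" else 6.0
        data.append((random.gauss(center, 1.5), label))
    return data

def one_nn(train, x):
    """Complex rule: copy the label of the closest training sample."""
    return min(train, key=lambda item: abs(item[0] - x))[1]

def threshold_rule(x, t=5.0):
    """Simple rule: one fixed threshold halfway between the assumed class centers."""
    return "sea bass" if x > t else "salmon"

def accuracy(predict, data):
    return sum(predict(x) == y for x, y in data) / len(data)

train, test = sample(200), sample(2000)
nn_train = accuracy(lambda x: one_nn(train, x), train)
nn_test = accuracy(lambda x: one_nn(train, x), test)
simple_train = accuracy(threshold_rule, train)
simple_test = accuracy(threshold_rule, test)

print(f"1-NN:      train {nn_train:.2f}, test {nn_test:.2f}")
print(f"threshold: train {simple_train:.2f}, test {simple_test:.2f}")
```

The memorizing rule is perfect on the fish it has already seen but loses accuracy on new fish, while the simple rule typically performs about equally well on both sets.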

How to obtain THE Best Decision Boundary?

Is there a tradeoff between the complexity of the decision boundary and the performance of the learned model in the test
environment?

Subtasks of Pattern Classification

• Feature Extraction
• Feature Processing in presence of Noise
• Feature Selection with the use of Prior Knowledge in
• Handling Missing Features
• Avoiding Overfitting
• Model Selection
• Classification
• Computational Complexity
• Context Invariance

Learning & Adaptation

• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning

