
Lecture 8

Discrimination for Two Multivariate Normal Populations
Suppose there are two multivariate normal populations, say, $\pi_1$ that is $N_p(\mu_1, \Sigma_1)$ and $\pi_2$ that is $N_p(\mu_2, \Sigma_2)$. Suppose a new observation vector $x$ is known to come from either $\pi_1$ or $\pi_2$. A rule is needed that can be used to predict from which of the two populations $x$ is most likely to have come.

Mahalanobis Distance Rule


The density of population $\pi_k$ is

$$f_k \equiv f(x \mid \mu_k, \Sigma_k) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_k|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu_k)^T \Sigma_k^{-1}(x-\mu_k)\right), \quad k = 1, 2.$$

The squared Mahalanobis distance of $x$ from population $\pi_k$ is

$$d_k = (x-\mu_k)^T \Sigma_k^{-1}(x-\mu_k).$$

Rule: assign $x$ to the population with the smaller $d_k$.
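A minimal NumPy sketch of this rule; the means, covariance matrices, and the point x below are made-up illustrative values, not data from the lecture.

```python
import numpy as np

def mahalanobis_sq(x, mu, Sigma):
    """Squared Mahalanobis distance d = (x - mu)^T Sigma^{-1} (x - mu)."""
    diff = x - mu
    return float(diff @ np.linalg.solve(Sigma, diff))

# Illustrative population parameters (assumptions, not from the lecture).
mu1, Sigma1 = np.array([0.0, 0.0]), np.array([[1.0, 0.3], [0.3, 1.0]])
mu2, Sigma2 = np.array([2.0, 1.0]), np.array([[1.5, 0.0], [0.0, 0.5]])

x = np.array([1.0, 0.2])
d1 = mahalanobis_sq(x, mu1, Sigma1)
d2 = mahalanobis_sq(x, mu2, Sigma2)

# Assign x to the population with the smaller Mahalanobis distance.
print("d1 =", d1, "d2 =", d2, "-> population", 1 if d1 <= d2 else 2)
```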

Posterior Probability Rule

Choose $\pi_1$ if

$$P(\pi_1 \mid x) \ge P(\pi_2 \mid x), \quad \text{where } P(\pi_k \mid x) = \frac{f_k\, p_k}{f_1 p_1 + f_2 p_2}$$

and $p_k$ is the prior probability of $\pi_k$.

In particular, when $\Sigma_1 = \Sigma_2$,

$$P(\pi_k \mid x) = \frac{\exp(-\tfrac{1}{2} d_k)\, p_k}{\exp(-\tfrac{1}{2} d_1)\, p_1 + \exp(-\tfrac{1}{2} d_2)\, p_2}.$$
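A sketch of the posterior probability rule; the priors p1, p2 and the normal parameters below are hypothetical values chosen only for illustration.

```python
import numpy as np

def density(x, mu, Sigma):
    """Multivariate normal density f(x | mu, Sigma)."""
    p = len(mu)
    diff = x - mu
    d = diff @ np.linalg.solve(Sigma, diff)                 # Mahalanobis distance
    norm = (2 * np.pi) ** (p / 2) * np.linalg.det(Sigma) ** 0.5
    return np.exp(-0.5 * d) / norm

# Illustrative parameters and priors (assumptions, not from the lecture).
mu1, Sigma1 = np.array([0.0, 0.0]), np.eye(2)
mu2, Sigma2 = np.array([2.0, 1.0]), np.eye(2)
p1, p2 = 0.7, 0.3

x = np.array([1.0, 0.2])
f1, f2 = density(x, mu1, Sigma1), density(x, mu2, Sigma2)

# P(pi_k | x) = f_k p_k / (f_1 p_1 + f_2 p_2); choose pi_1 if its posterior is larger.
post1 = f1 * p1 / (f1 * p1 + f2 * p2)
print("P(pi_1 | x) =", post1, "-> population", 1 if post1 >= 0.5 else 2)
```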

A General Discriminant Rule (Two Populations)

Expected (Average) Cost of Misclassification:

$$\mathrm{ECM} = C(2 \mid 1)\, P(2 \mid 1)\, p_1 + C(1 \mid 2)\, P(1 \mid 2)\, p_2,$$

where $C(j \mid i)$ is the cost of classifying into $\pi_j$ an observation that actually comes from $\pi_i$, and $P(j \mid i)$ is the probability of that misclassification.

Goal: Have ECM as small, or nearly as small, as possible.


Result: The regions $R_1$ and $R_2$ that make ECM small are:

$$R_1: \ \frac{f_1(x)}{f_2(x)} \ge \frac{C(1 \mid 2)\, p_2}{C(2 \mid 1)\, p_1}, \qquad R_2: \text{ otherwise.}$$
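A sketch of the minimum-ECM rule as a likelihood-ratio threshold; the costs C(1|2) and C(2|1), the priors, and the density values passed in are assumed for illustration.

```python
import numpy as np

# Hypothetical misclassification costs and priors (assumptions for illustration).
c12 = 5.0   # C(1|2): cost of classifying into pi_1 an observation from pi_2
c21 = 1.0   # C(2|1): cost of classifying into pi_2 an observation from pi_1
p1, p2 = 0.6, 0.4

def classify_ecm(f1_x, f2_x):
    """R1 : f1(x)/f2(x) >= [C(1|2) p2] / [C(2|1) p1]; otherwise R2."""
    threshold = (c12 * p2) / (c21 * p1)
    return 1 if f1_x / f2_x >= threshold else 2

# With arbitrary example density values f1(x) and f2(x):
print(classify_ecm(f1_x=0.05, f2_x=0.02))   # likelihood ratio 2.5 vs threshold 10/3 -> 2
```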

Classification with Normal Populations When $\Sigma_1 = \Sigma_2$

$$f_i(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu_i)^T \Sigma^{-1}(x-\mu_i)\right)$$

$$R_1: \ \frac{f_1(x)}{f_2(x)} \ge \frac{C(1 \mid 2)\, p_2}{C(2 \mid 1)\, p_1}$$

$$R_1: \ \exp\!\left(-\frac{1}{2}(x-\mu_1)^T \Sigma^{-1}(x-\mu_1) + \frac{1}{2}(x-\mu_2)^T \Sigma^{-1}(x-\mu_2)\right) \ge \frac{C(1 \mid 2)\, p_2}{C(2 \mid 1)\, p_1}$$

$$R_1: \ d_1^* \le d_2^*, \quad \text{where } d_i^* = \frac{1}{2}(x-\mu_i)^T \Sigma^{-1}(x-\mu_i) - \ln\!\big[p_i\, C(j \mid i)\big]$$

or

$$R_1: \ x^T \Sigma^{-1}\mu_1 - \frac{1}{2}\mu_1^T \Sigma^{-1}\mu_1 + \ln\!\big[p_1\, C(2 \mid 1)\big] \ \ge\ x^T \Sigma^{-1}\mu_2 - \frac{1}{2}\mu_2^T \Sigma^{-1}\mu_2 + \ln\!\big[p_2\, C(1 \mid 2)\big]$$

LDA: the discriminant rule is linear in $x$.
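A sketch of the linear (LDA) rule above, assuming a common covariance matrix; all parameter values are illustrative assumptions, not data from the lecture.

```python
import numpy as np

# Illustrative parameters with a common covariance matrix (assumptions).
mu1, mu2 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])
p1, p2 = 0.5, 0.5
c12, c21 = 1.0, 1.0          # C(1|2), C(2|1)

Sigma_inv = np.linalg.inv(Sigma)

def linear_score(x, mu, prior, cost):
    """x^T Sigma^{-1} mu - (1/2) mu^T Sigma^{-1} mu + ln(prior * cost)."""
    return x @ Sigma_inv @ mu - 0.5 * mu @ Sigma_inv @ mu + np.log(prior * cost)

x = np.array([0.8, 0.6])
s1 = linear_score(x, mu1, p1, c21)   # uses ln[p1 C(2|1)]
s2 = linear_score(x, mu2, p2, c12)   # uses ln[p2 C(1|2)]

# R1: s1 >= s2 (the decision boundary is linear in x).
print("assign to population", 1 if s1 >= s2 else 2)
```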

Classification with Normal Populations When $\Sigma_1 \ne \Sigma_2$

$$f_i(x) = \frac{1}{(2\pi)^{p/2}\,|\Sigma_i|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1}(x-\mu_i)\right)$$

$$R_1: \ \frac{f_1(x)}{f_2(x)} \ge \frac{C(1 \mid 2)\, p_2}{C(2 \mid 1)\, p_1}$$

$$R_1: \ \frac{|\Sigma_2|^{1/2}}{|\Sigma_1|^{1/2}} \exp\!\left(-\frac{1}{2}(x-\mu_1)^T \Sigma_1^{-1}(x-\mu_1) + \frac{1}{2}(x-\mu_2)^T \Sigma_2^{-1}(x-\mu_2)\right) \ge \frac{C(1 \mid 2)\, p_2}{C(2 \mid 1)\, p_1}$$

$$R_1: \ d_1^{**} \le d_2^{**}, \quad \text{where } d_i^{**} = \frac{1}{2}(x-\mu_i)^T \Sigma_i^{-1}(x-\mu_i) + \frac{1}{2}\ln|\Sigma_i| - \ln\!\big[p_i\, C(j \mid i)\big]$$

QDA: the discriminant rule is quadratic in $x$.
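A sketch of the quadratic (QDA) scores $d_i^{**}$; the unequal covariance matrices, priors, and costs below are assumed values for illustration.

```python
import numpy as np

def quadratic_score(x, mu, Sigma, prior, cost):
    """d** = (1/2)(x-mu)^T Sigma^{-1}(x-mu) + (1/2) ln|Sigma| - ln(prior * cost)."""
    diff = x - mu
    _, logdet = np.linalg.slogdet(Sigma)
    return 0.5 * diff @ np.linalg.solve(Sigma, diff) + 0.5 * logdet - np.log(prior * cost)

# Illustrative parameters with unequal covariances (assumptions, not lecture data).
mu1, Sigma1 = np.array([0.0, 0.0]), np.array([[1.0, 0.0], [0.0, 1.0]])
mu2, Sigma2 = np.array([2.0, 1.0]), np.array([[2.0, 0.5], [0.5, 0.8]])
p1, p2 = 0.5, 0.5
c12, c21 = 1.0, 1.0

x = np.array([1.0, 0.2])
d1 = quadratic_score(x, mu1, Sigma1, p1, c21)
d2 = quadratic_score(x, mu2, Sigma2, p2, c12)

# R1: d1** <= d2**; the decision boundary is quadratic in x.
print("assign to population", 1 if d1 <= d2 else 2)
```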

Fisher (Canonical) Linear Discriminant Analysis

Fisher's LD and LDA


They become the same when:
(1) the prior probabilities are equal,
(2) the misclassification costs are equal,
(3) the class-conditional densities share a common covariance matrix, and
(4) both class-conditional densities are multivariate normal.
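A sketch of the sample version of Fisher's linear discriminant: the direction $a = S_{\text{pooled}}^{-1}(\bar{x}_1 - \bar{x}_2)$ and the midpoint cutoff $m = \tfrac{1}{2} a^T(\bar{x}_1 + \bar{x}_2)$. The training data below are simulated, not the lecture's examples; under conditions (1)-(4) above this allocation agrees with LDA.

```python
import numpy as np

rng = np.random.default_rng(0)
# Simulated training samples for the two groups (hypothetical data, common covariance).
X1 = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.3], [0.3, 1.0]], size=50)
X2 = rng.multivariate_normal([2.0, 1.0], [[1.0, 0.3], [0.3, 1.0]], size=50)

xbar1, xbar2 = X1.mean(axis=0), X2.mean(axis=0)
S1, S2 = np.cov(X1, rowvar=False), np.cov(X2, rowvar=False)
n1, n2 = len(X1), len(X2)
S_pooled = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)

# Fisher's discriminant direction and the midpoint cutoff.
a = np.linalg.solve(S_pooled, xbar1 - xbar2)
m = 0.5 * a @ (xbar1 + xbar2)

x = np.array([1.0, 0.4])
# Allocate x to group 1 if a^T x >= m.
print("assign to group", 1 if a @ x >= m else 2)
```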

Example 1 (page 221)


Dr. Michael Finnegan, professor of anthropology at Kansas State University, was interested in determining whether wild turkeys could be distinguished from domestic turkeys by considering measurements of certain bones. A man had been arrested for stealing many of his neighbor's turkeys. The man claimed that the turkey meat in his freezer actually came from wild turkeys. Bone measurements were made on turkeys confiscated from the suspect, and Professor Finnegan was asked to help decide whether these turkeys were domestic or wild.

Example 2
We have four groups of wheat kernels. Groups 1 and 2 were grown in one location, and groups 3 and 4 were grown in another location. In addition, groups 1 and 3 were of a variety called Arkan, while groups 2 and 4 were of a variety called Arthur. Arkan has a higher level of protein and thus has a higher market value.
There is some concern that an unscrupulous person might try mixing Arthur with Arkan and then sell the combination of the two varieties as all Arkan to get a higher price.
We want to use physical measurements of wheat kernels to discriminate among the groups of kernels. We have taken measurements of the area, perimeter, length, and breadth of each kernel.
