Lecture 8
Today
to each class
Example in 2D
Similarly, $\tilde{\mu}_2 = v^t \mu_2$
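As a quick numerical check of the relation between a class mean and its projection, the sketch below projects the Class 1 samples from the example data later in these notes onto a direction and compares the mean of the projections with the projection of the mean. The direction `v` is an arbitrary choice made for illustration only.

```python
import numpy as np

# Class 1 samples from the example data in these notes (one sample per row).
c1 = np.array([[1, 2], [2, 3], [3, 3], [4, 5], [5, 5]], dtype=float)

# Hypothetical projection direction, chosen only for illustration.
v = np.array([1.0, 2.0])

y1 = c1 @ v                # projected samples y_i = v^t x_i
mu1 = c1.mean(axis=0)      # class mean in the original space
mu1_tilde = y1.mean()      # mean of the projected samples

# The mean of the projections equals the projection of the mean.
print(mu1_tilde, v @ mu1)
```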
Fisher Linear Discriminant
How good is $|\tilde{\mu}_1 - \tilde{\mu}_2|$ as a measure of separation?
The larger $|\tilde{\mu}_1 - \tilde{\mu}_2|$, the better the expected separation.
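[Figure: class means $\mu_1$, $\mu_2$ and their projections $\tilde{\mu}_1$, $\tilde{\mu}_2$ onto two directions, one giving small variance and one giving large variance.]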
Fisher Linear Discriminant
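We need to normalize $|\tilde{\mu}_1 - \tilde{\mu}_2|$ by a factor proportional to the variance.
Given samples $z_1, \dots, z_n$, the sample mean is $\mu_z = \frac{1}{n} \sum_{i=1}^{n} z_i$.
Define their scatter as $s = \sum_{i=1}^{n} (z_i - \mu_z)^2$, which is the sample variance multiplied by $n$.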
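$$J(v) = \frac{(\tilde{\mu}_1 - \tilde{\mu}_2)^2}{\tilde{s}_1^2 + \tilde{s}_2^2}$$
where $\tilde{s}_1^2$ and $\tilde{s}_2^2$ are the scatters of the projected samples of classes 1 and 2.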
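The criterion J(v) can be evaluated directly from the projected samples. The following is a minimal sketch, assuming the samples of each class are stored as rows of a NumPy array; the function name `fisher_criterion` is hypothetical, and the data are the two classes from the example later in these notes.

```python
import numpy as np

def fisher_criterion(X1, X2, v):
    """Evaluate J(v) = (mu1~ - mu2~)^2 / (s1~^2 + s2~^2) from projections."""
    y1, y2 = X1 @ v, X2 @ v                 # project both classes onto v
    m1, m2 = y1.mean(), y2.mean()           # projected class means
    s1_sq = np.sum((y1 - m1) ** 2)          # scatter of projected class 1
    s2_sq = np.sum((y2 - m2) ** 2)          # scatter of projected class 2
    return (m1 - m2) ** 2 / (s1_sq + s2_sq)

c1 = np.array([[1, 2], [2, 3], [3, 3], [4, 5], [5, 5]], dtype=float)
c2 = np.array([[1, 0], [2, 1], [3, 1], [3, 2], [5, 3], [6, 5]], dtype=float)

# Compare projecting onto the first axis with projecting onto the second.
J_x = fisher_criterion(c1, c2, np.array([1.0, 0.0]))
J_y = fisher_criterion(c1, c2, np.array([0.0, 1.0]))
print(J_x, J_y)
```

For this data the second axis gives the larger value of J. Note also that J(v) does not change when v is rescaled, so only the direction of v matters.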
$$S_1 = \sum_{x_i \in \text{Class 1}} (x_i - \mu_1)(x_i - \mu_1)^t$$
$$S_2 = \sum_{x_i \in \text{Class 2}} (x_i - \mu_2)(x_i - \mu_2)^t$$
Fisher Linear Discriminant Derivation
Now define the within-class scatter matrix
$$S_W = S_1 + S_2$$
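$$
\begin{aligned}
\tilde{s}_1^2 &= \sum_{y_i \in \text{Class 1}} \left(v^t x_i - v^t \mu_1\right)^2 \\
&= \sum_{y_i \in \text{Class 1}} \left(v^t (x_i - \mu_1)\right)^t \left(v^t (x_i - \mu_1)\right) \\
&= \sum_{y_i \in \text{Class 1}} \left((x_i - \mu_1)^t v\right)^t \left((x_i - \mu_1)^t v\right) \\
&= \sum_{y_i \in \text{Class 1}} v^t (x_i - \mu_1)(x_i - \mu_1)^t v = v^t S_1 v
\end{aligned}
$$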
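The identity between the scatter of the projected Class 1 samples and the quadratic form in S1 can be checked numerically. This is a small sketch using the Class 1 data from the example later in these notes; the direction `v` is an arbitrary, hypothetical choice.

```python
import numpy as np

c1 = np.array([[1, 2], [2, 3], [3, 3], [4, 5], [5, 5]], dtype=float)
v = np.array([1.0, 2.0])   # hypothetical direction, for illustration only

# Scatter matrix S1 = sum_i (x_i - mu1)(x_i - mu1)^t
d = c1 - c1.mean(axis=0)
S1 = d.T @ d

# Scatter of the projected samples y_i = v^t x_i
y1 = c1 @ v
s1_tilde_sq = np.sum((y1 - y1.mean()) ** 2)

# Quadratic form v^t S1 v; it should match the projected scatter.
quad_form = v @ S1 @ v
print(s1_tilde_sq, quad_form)
```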
Fisher Linear Discriminant Derivation
Similarly, $\tilde{s}_2^2 = v^t S_2 v$.
Therefore $\tilde{s}_1^2 + \tilde{s}_2^2 = v^t S_1 v + v^t S_2 v = v^t S_W v$.
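Define the between-class scatter matrix
$$S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^t$$
The separation of the projected means can then be written as
$$(\tilde{\mu}_1 - \tilde{\mu}_2)^2 = (v^t \mu_1 - v^t \mu_2)^2 = v^t (\mu_1 - \mu_2)(\mu_1 - \mu_2)^t v = v^t S_B v$$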
Fisher Linear Discriminant Derivation
Thus our objective function can be written:
$$J(v) = \frac{(\tilde{\mu}_1 - \tilde{\mu}_2)^2}{\tilde{s}_1^2 + \tilde{s}_2^2} = \frac{v^t S_B v}{v^t S_W v}$$
Maximize J(v) by taking the derivative w.r.t. v and setting it to 0:
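$$
\begin{aligned}
\frac{d}{dv} J(v) &= \frac{\left(\frac{d}{dv} v^t S_B v\right) v^t S_W v - \left(\frac{d}{dv} v^t S_W v\right) v^t S_B v}{\left(v^t S_W v\right)^2} \\
&= \frac{(2 S_B v)\, v^t S_W v - (2 S_W v)\, v^t S_B v}{\left(v^t S_W v\right)^2} = 0
\end{aligned}
$$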
Fisher Linear Discriminant Derivation
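Need to solve $v^t S_W v\,(S_B v) - v^t S_B v\,(S_W v) = 0$. Dividing by $v^t S_W v$:
$$\frac{v^t S_W v\,(S_B v)}{v^t S_W v} - \frac{v^t S_B v\,(S_W v)}{v^t S_W v} = 0$$
$$S_B v - \lambda\, S_W v = 0, \qquad \lambda = \frac{v^t S_B v}{v^t S_W v}$$
$$S_B v = \lambda S_W v$$
This is a generalized eigenvalue problem.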
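For two classes the equation above can be solved without an eigenvalue routine. The short derivation below is a sketch under two assumptions that are not stated in the slides: that $S_W$ is invertible and that $\lambda \neq 0$. The scalar $\alpha$ is introduced here only for convenience.

```latex
\begin{aligned}
S_B v &= (\mu_1 - \mu_2)\,\underbrace{(\mu_1 - \mu_2)^t v}_{\alpha \text{ (a scalar)}} = \alpha\,(\mu_1 - \mu_2) \\
\lambda S_W v &= \alpha\,(\mu_1 - \mu_2) \\
v &= \frac{\alpha}{\lambda}\, S_W^{-1} (\mu_1 - \mu_2) \;\propto\; S_W^{-1} (\mu_1 - \mu_2)
\end{aligned}
```

Since J(v) depends only on the direction of v, the scalar factor $\alpha / \lambda$ can be dropped.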
Fisher Linear Discriminant Example
Data
Class 1 has 5 samples c1=[(1,2),(2,3),(3,3),(4,5),(5,5)]
Class 2 has 6 samples c2=[(1,0),(2,1),(3,1),(3,2),(5,3),(6,5)]
Arrange data in 2 separate matrices
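$$c_1 = \begin{bmatrix} 1 & 2 \\ \vdots & \vdots \\ 5 & 5 \end{bmatrix}, \qquad c_2 = \begin{bmatrix} 1 & 0 \\ \vdots & \vdots \\ 6 & 5 \end{bmatrix}$$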
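With the data arranged as matrices, the quantities from the derivation can be computed in a few lines. The sketch below assumes $S_W$ is invertible and uses the fact that $S_B v$ always points along $\mu_1 - \mu_2$, so that $v \propto S_W^{-1}(\mu_1 - \mu_2)$ satisfies $S_B v = \lambda S_W v$. The helper name `scatter` is hypothetical.

```python
import numpy as np

c1 = np.array([[1, 2], [2, 3], [3, 3], [4, 5], [5, 5]], dtype=float)
c2 = np.array([[1, 0], [2, 1], [3, 1], [3, 2], [5, 3], [6, 5]], dtype=float)

def scatter(X):
    """Scatter matrix sum_i (x_i - mu)(x_i - mu)^t for samples in rows of X."""
    d = X - X.mean(axis=0)
    return d.T @ d

mu1, mu2 = c1.mean(axis=0), c2.mean(axis=0)
S_W = scatter(c1) + scatter(c2)          # within-class scatter
diff = mu1 - mu2
S_B = np.outer(diff, diff)               # between-class scatter

# Direction v = S_W^{-1} (mu1 - mu2), computed without forming the inverse.
v = np.linalg.solve(S_W, diff)

# Check that v satisfies S_B v = lambda S_W v with lambda = J(v).
lam = (v @ S_B @ v) / (v @ S_W @ v)
residual = S_B @ v - lam * (S_W @ v)
print(v, lam, residual)
```

Using `np.linalg.solve` rather than an explicit matrix inverse is a numerical convenience, not something the slides prescribe.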
Objective function: $J(V) = \dfrac{\det(V^t S_B V)}{\det(V^t S_W V)}$
The within-class scatter matrix $S_W$ is
$$S_W = \sum_{i=1}^{c} S_i = \sum_{i=1}^{c} \sum_{x_k \in \text{class } i} (x_k - \mu_i)(x_k - \mu_i)^t$$
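The sum over classes translates directly into a loop. This is a minimal sketch, assuming each class is given as a NumPy array with one sample per row; the function name `within_class_scatter` is hypothetical, and the two-class example data from these notes is used only as a test input.

```python
import numpy as np

def within_class_scatter(classes):
    """Return S_W = sum over classes of sum_k (x_k - mu_i)(x_k - mu_i)^t."""
    dim = classes[0].shape[1]
    S_W = np.zeros((dim, dim))
    for X in classes:
        d = X - X.mean(axis=0)   # deviations from this class's own mean
        S_W += d.T @ d           # add this class's scatter matrix S_i
    return S_W

c1 = np.array([[1, 2], [2, 3], [3, 3], [4, 5], [5, 5]], dtype=float)
c2 = np.array([[1, 0], [2, 1], [3, 1], [3, 2], [5, 3], [6, 5]], dtype=float)

S_W = within_class_scatter([c1, c2])
print(S_W)
```

The same function works for any number of classes, since each class contributes its scatter around its own mean.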