Minimum-Error-Rate Classification
$$R(\alpha_i \mid x) = \sum_{j \neq i} P(\omega_j \mid x) = 1 - P(\omega_i \mid x)$$
The risk corresponding to this loss function is the average probability of error.
• Minimizing the risk requires maximizing P(ωi | x)
(since R(αi | x) = 1 − P(ωi | x))
For minimum error rate:
Decide ωi if P(ωi | x) > P(ωj | x) ∀ j ≠ i
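The decision rule above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the helper names `posteriors` and `decide_min_error` and all numbers are hypothetical, and the class-conditional likelihoods are assumed to be already evaluated at the observed x.

```python
import numpy as np

def posteriors(likelihoods, priors):
    """Bayes rule: P(w_i | x) = p(x | w_i) P(w_i) / sum_j p(x | w_j) P(w_j)."""
    joint = np.asarray(likelihoods, dtype=float) * np.asarray(priors, dtype=float)
    return joint / joint.sum()

def decide_min_error(likelihoods, priors):
    """Pick the class with the largest posterior.

    Returns (class index, conditional risk), where the conditional risk
    under the minimum-error-rate rule is 1 - P(w_i | x).
    """
    post = posteriors(likelihoods, priors)
    i = int(np.argmax(post))
    return i, float(1.0 - post[i])

# Hypothetical values of p(x | w_i) at one observed x, and priors P(w_i).
likelihoods = [0.2, 0.5, 0.1]
priors = [0.5, 0.3, 0.2]

choice, risk = decide_min_error(likelihoods, priors)
print("decide class index", choice, "with conditional risk", risk)
```

Note that the class with the largest prior is not chosen here; the likelihood of the second class outweighs it.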
A classifier is a machine that computes c discriminant functions.
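For two categories, a single discriminant function g(x) suffices (decide ω1 if g(x) > 0). Two common forms, which give the same decisions, are
$$g(x) = P(\omega_1 \mid x) - P(\omega_2 \mid x)$$
$$g(x) = \ln \frac{P(x \mid \omega_1)}{P(x \mid \omega_2)} + \ln \frac{P(\omega_1)}{P(\omega_2)}$$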
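A small sketch can show that the posterior-difference form and the log-ratio form of g(x) lead to the same decisions. It assumes, purely for illustration, univariate normal class-conditional densities with hypothetical means, standard deviations, and priors; the function names are not from the slides.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def g_log_ratio(x, params1, params2, prior1, prior2):
    """g(x) = ln[p(x|w1) / p(x|w2)] + ln[P(w1) / P(w2)]."""
    l1 = normal_pdf(x, *params1)
    l2 = normal_pdf(x, *params2)
    return math.log(l1 / l2) + math.log(prior1 / prior2)

def g_posterior_diff(x, params1, params2, prior1, prior2):
    """g(x) = P(w1 | x) - P(w2 | x), computed with Bayes rule."""
    j1 = normal_pdf(x, *params1) * prior1
    j2 = normal_pdf(x, *params2) * prior2
    return (j1 - j2) / (j1 + j2)

# Hypothetical (mean, standard deviation) for each class, equal priors.
class1 = (0.0, 1.0)
class2 = (2.0, 1.0)
xs = [-1.0, 0.5, 1.5, 3.0]

# Decide w1 when g(x) > 0, otherwise w2.
decisions = [1 if g_log_ratio(x, class1, class2, 0.5, 0.5) > 0 else 2 for x in xs]
print(decisions)
```

The two functions differ in value but always agree in sign, which is all the decision rule uses.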
CSE 555: Srihari 7
The Normal Distribution
A bell-shaped distribution defined by the probability density function
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$$
If the random variable X follows a normal distribution, then
• The probability that X will fall into the interval (a, b) is given by
$$\int_a^b p(x)\,dx$$
• Expected, or mean, value of X is
$$E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx = \mu$$
• Variance of X is
$$\mathrm{Var}(x) = E[(x-\mu)^2] = \int_{-\infty}^{\infty} (x-\mu)^2\,p(x)\,dx = \sigma^2$$
• Standard deviation of X, the square root of the variance, is σx = σ
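The three integrals above can be checked numerically. The sketch below assumes hypothetical values µ = 1 and σ = 2 and a hypothetical interval (0, 3); it uses a hand-written trapezoid rule (`trapezoid` is a local helper, not a library call) and compares the interval probability against the closed form based on the error function.

```python
import math
import numpy as np

def normal_pdf(x, mu, sigma):
    """Univariate normal density, vectorized over x."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)

def trapezoid(y, x):
    """Trapezoid-rule approximation of the integral of y over the grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def normal_cdf(v, mu, sigma):
    """Closed-form cumulative distribution via the error function."""
    return 0.5 * (1.0 + math.erf((v - mu) / (sigma * math.sqrt(2.0))))

mu, sigma = 1.0, 2.0  # hypothetical parameters

# Mean and variance: integrate over a range wide enough to hold the tails.
x = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 200001)
p = normal_pdf(x, mu, sigma)
mean = trapezoid(x * p, x)
var = trapezoid((x - mu) ** 2 * p, x)

# Probability of falling into (a, b).
a, b = 0.0, 3.0
xs = np.linspace(a, b, 20001)
prob_ab = trapezoid(normal_pdf(xs, mu, sigma), xs)
prob_ab_exact = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

print(mean, var, prob_ab, prob_ab_exact)
```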
Relationship between Entropy and Normal Density
Entropy of a distribution
$$H(p(x)) = -\int_{-\infty}^{\infty} p(x) \ln p(x)\,dx$$
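As a numerical illustration, taking the entropy with the usual leading minus sign, the entropy of a normal density in nats is ½ ln(2πeσ²), and it exceeds the entropy of a uniform density with the same variance. The sketch below assumes a hypothetical σ = 1.5 and uses a local trapezoid-rule helper; it is a sanity check, not a proof of the maximum-entropy property.

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Univariate normal density, vectorized over x."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z**2) / (np.sqrt(2.0 * np.pi) * sigma)

def trapezoid(y, x):
    """Trapezoid-rule approximation of the integral of y over the grid x."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

sigma = 1.5  # hypothetical standard deviation

# Numerical entropy H = -integral of p ln p, in nats.
x = np.linspace(-12 * sigma, 12 * sigma, 400001)
p = normal_pdf(x, 0.0, sigma)
h_numeric = -trapezoid(p * np.log(p), x)

# Closed form for the normal density.
h_closed = 0.5 * np.log(2.0 * np.pi * np.e * sigma**2)

# A uniform density of width w has variance w^2 / 12 and entropy ln(w);
# matching the variance gives w = sigma * sqrt(12).
h_uniform = np.log(sigma * np.sqrt(12.0))

print(h_numeric, h_closed, h_uniform)
```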
With 80% confidence, a standard normal r.v. will lie in the two-sided interval [−1.28, 1.28].
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right],$$
where:
µ = mean (or expected value) of x
σ² = expected squared deviation, or variance
The univariate normal distribution has roughly 95% of its area in the range |x − µ| < 2σ.
The peak of the distribution has value p(µ) = 1/(√(2π) σ).
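The area and peak statements can be verified with the error function, since P(|x − µ| < kσ) = erf(k/√2) for any normal distribution. The values of µ and σ below are hypothetical, and `area_within` is an illustrative helper name.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def area_within(k):
    """P(|x - mu| < k * sigma); independent of mu and sigma."""
    return math.erf(k / math.sqrt(2.0))

mu, sigma = 3.0, 0.5  # hypothetical parameters

peak = normal_pdf(mu, mu, sigma)   # density at the mean
area_2sigma = area_within(2.0)     # roughly 95%
area_128 = area_within(1.28)       # roughly 80%

print(peak, area_2sigma, area_128)
```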
Multivariate density
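The multivariate normal density in d dimensions is
$$p(x) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}} \exp\left[-\frac{1}{2}(x-\mu)^t\,\Sigma^{-1}(x-\mu)\right]$$
abbreviated as
p(x) ~ N(µ, Σ)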
where:
x = (x1, x2, …, xd)^t (t stands for the transpose vector form)
µ = (µ1, µ2, …, µd)^t mean vector
Σ = d × d covariance matrix
|Σ| and Σ⁻¹ are its determinant and inverse, respectively
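A minimal sketch of evaluating the multivariate normal density with NumPy follows, using the standard form with normalizer (2π)^(d/2) |Σ|^(1/2). The function name `mvn_pdf` and the example values are hypothetical. The example uses a diagonal covariance so the result can be compared with a product of univariate densities.

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Density of N(mu, cov) at x.

    p(x) = exp(-0.5 * (x-mu)^t cov^{-1} (x-mu)) / ((2 pi)^(d/2) |cov|^(1/2))
    """
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    d = mu.shape[0]
    diff = x - mu
    # Solve cov * v = diff instead of forming the inverse explicitly.
    r2 = diff @ np.linalg.solve(cov, diff)
    norm = (2.0 * np.pi) ** (d / 2.0) * np.sqrt(np.linalg.det(cov))
    return float(np.exp(-0.5 * r2) / norm)

# Hypothetical 2-D example with independent components (diagonal covariance).
mu = [0.0, 0.0]
cov = [[1.0, 0.0],
       [0.0, 4.0]]
density = mvn_pdf([1.0, 2.0], mu, cov)
print(density)
```

Using `np.linalg.solve` rather than `np.linalg.inv` is a common numerical choice; either gives the same quadratic form for a well-conditioned Σ.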
• Formal Definitions
$$\mu = E[x] = \int x\,p(x)\,dx$$
$$\Sigma = E[(x-\mu)(x-\mu)^t] = \int (x-\mu)(x-\mu)^t\,p(x)\,dx$$
Covariance matrix Σ:
Diagonal elements are variances of the variables
Off-diagonal elements are covariances of pairs of variables
Statistical independence means off-diagonal elements are zero
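The structure of the covariance matrix can be illustrated by estimating it from samples. The sketch below draws two independent normal variables (hypothetical means and standard deviations, fixed random seed) and replaces the expectations in the formal definitions with sample averages; the off-diagonal entries come out close to zero, as expected for independent variables.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two independent variables with hypothetical parameters.
x1 = rng.normal(0.0, 1.0, size=n)   # variance 1
x2 = rng.normal(5.0, 2.0, size=n)   # variance 4
samples = np.column_stack([x1, x2])  # shape (n, 2), one sample per row

# Sample versions of mu = E[x] and Sigma = E[(x - mu)(x - mu)^t].
mu_hat = samples.mean(axis=0)
centered = samples - mu_hat
sigma_hat = centered.T @ centered / n

print(mu_hat)
print(sigma_hat)
```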
Multivariate Normal Density
Loci of points of constant density are hyperellipsoids.
Samples drawn from a 2-D Gaussian lie in a cloud centered at the mean µ.
Ellipses show lines of equal probability density of the Gaussian.
Linear Combinations of Normally Distributed Variables are Normally Distributed
Whitening Transform
If Φ is the matrix whose columns are the orthonormal eigenvectors
of Σ, and Λ is the diagonal matrix of the corresponding eigenvalues,
then the transformation A_w = ΦΛ^(−1/2) applied to the coordinates
ensures that the transformed distribution has covariance matrix equal
to the identity matrix.
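The whitening transform can be sketched with NumPy's symmetric eigendecomposition. The covariance matrix below is a hypothetical example, and the coordinates are assumed to transform as y = A_w^t x, so the covariance of y is A_w^t Σ A_w.

```python
import numpy as np

# Hypothetical symmetric positive-definite covariance matrix.
cov = np.array([[4.0, 1.5],
                [1.5, 1.0]])

# eigh handles symmetric matrices: it returns the eigenvalues and a matrix
# whose columns are orthonormal eigenvectors.
eigvals, phi = np.linalg.eigh(cov)

# A_w = Phi * Lambda^(-1/2)
a_w = phi @ np.diag(eigvals ** -0.5)

# Covariance after the transform y = A_w^t x.
whitened_cov = a_w.T @ cov @ a_w
print(np.round(whitened_cov, 12))
```

The result is the identity because Φ^t Σ Φ = Λ, and scaling each axis by the inverse square root of its eigenvalue normalizes the variances to one.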
The action of a linear transformation on the feature space converts an arbitrary normal
distribution into another normal distribution. One transformation, A, takes the source
distribution into the distribution N(A^tµ, A^tΣA). Another linear transformation, a projection P onto
a line defined by vector a, leads to N(µ, σ²) measured along that line. While the
transforms yield distributions in a different space, they are shown superimposed on the
original x1-x2 space. A whitening transform, A_w, leads to a circularly symmetric Gaussian,
here shown displaced.
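The statement that y = A^t x is distributed as N(A^tµ, A^tΣA) can be checked empirically. The sketch below uses hypothetical µ, Σ, and A, with a fixed random seed, and also projects onto a unit vector a, for which the mean and variance along the line are a^tµ and a^tΣa.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical source distribution and linear transformation.
mu = np.array([1.0, 2.0])
cov = np.array([[2.0, 0.8],
                [0.8, 1.0]])
A = np.array([[1.0, 0.5],
              [-1.0, 2.0]])

x = rng.multivariate_normal(mu, cov, size=200_000)  # one sample per row

# Each row of x @ A is (A^t x)^t, so this applies y = A^t x to every sample.
y = x @ A
y_mean = y.mean(axis=0)
y_cov = np.cov(y, rowvar=False)

# Projection onto the line defined by a unit vector a.
a = np.array([3.0, 4.0]) / 5.0
z = x @ a
z_mean, z_var = z.mean(), z.var()

print(y_mean, A.T @ mu)
print(y_cov, A.T @ cov @ A)
print(z_mean, a @ mu, z_var, a @ cov @ a)
```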
Mahalanobis Distance
$$r^2 = (x-\mu)^t\,\Sigma^{-1}(x-\mu)$$
is the squared Mahalanobis distance from x to µ.
Contours of constant density are hyperellipsoids of constant Mahalanobis distance.
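A short sketch of the squared Mahalanobis distance r² = (x − µ)^t Σ⁻¹ (x − µ) follows. The function name and the numbers are hypothetical. The two test points are at different Euclidean distances from the mean but at the same Mahalanobis distance, so they lie on the same constant-density ellipse.

```python
import numpy as np

def mahalanobis_sq(x, mu, cov):
    """Squared Mahalanobis distance r^2 = (x - mu)^t cov^{-1} (x - mu)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mu, dtype=float)
    return float(diff @ np.linalg.solve(np.asarray(cov, dtype=float), diff))

# Hypothetical 2-D Gaussian: variance 4 along x1, variance 1 along x2.
mu = np.array([0.0, 0.0])
cov = np.diag([4.0, 1.0])

r2_a = mahalanobis_sq([2.0, 0.0], mu, cov)  # Euclidean distance 2
r2_b = mahalanobis_sq([0.0, 1.0], mu, cov)  # Euclidean distance 1

# With the identity covariance, r^2 reduces to squared Euclidean distance.
r2_euclid = mahalanobis_sq([3.0, 4.0], mu, np.eye(2))

print(r2_a, r2_b, r2_euclid)
```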
Samples drawn from a 2-D Gaussian lie in a cloud centered at the mean µ.
Ellipses show lines of equal probability density of the Gaussian