
AIPC404 Fundamentals of Machine Learning
UNIT 1
Bayesian Decision Theory and Normal Distribution

Staff In-charge
Dr. M. Kalaiselvi Geetha
Professor
Department of CSE

Univariate and Multivariate Normal Densities

Dr. M. Kalaiselvi Geetha, Professor, Dept. of CSE, AU
Normal Distribution
The normal distribution, or Gaussian distribution, is a continuous
probability distribution that describes data distributed normally with
mean μ and variance σ².

A variable x that is distributed normally with mean μ and variance σ²
is written x ~ N(μ, σ²).
It can be categorized into:
• Univariate density:
It involves a single variable (one dimension), i.e., the distribution of one
variable. Example: height of a person
• Multivariate density:
It involves more than one variable (two or more dimensions).
Example: height and weight of a person

Mean of The Dataset
• Consider the example data set
X = 1, 4, 2, 12, 15, 25, 67, 65, 6, 98
• The centroid of the points is defined by the
mean of each variable
• Mean is measured as
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
where n is the number of values in the data set
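As a quick check of the definition, the mean of the example data set can be computed directly. This is a minimal sketch in plain Python; the variable names are illustrative and not part of the slides.

```python
# Example data set from the slide
X = [1, 4, 2, 12, 15, 25, 67, 65, 6, 98]

# Mean = sum of the values divided by the number of values
n = len(X)
mean_X = sum(X) / n

print(mean_X)  # 29.5
```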

Drawbacks of Mean
• The mean does not give information about the
spread of the data
• Two data sets with different spreads may
have the same mean
• Example:
a = [3, 1, 24, 12] and b = [11, 9, 7, 13]
Mean of a = mean of b = 10

Standard Deviation
• The Standard Deviation (SD) of a data set is a
measure of the spread of the data
• SD can be described as the typical (root-mean-square) distance
of a data point from the mean of the data set

Standard Deviation
• High standard deviation
– data are spread over a large range of values
• Low standard deviation
– data points are very close to the mean

Standard Deviation – Example 1
a = [3, 1, 24, 12], SD = 9.08
b = [11, 9, 7, 13], SD = 2.236
Mean = 10 for both
a has a larger SD than b

[Figure: number line from 1 to 25 with the mean marked]

Reason:
The data in a are spread farther from the mean
Standard Deviation – Example 2
c = [15, 15, 15, 15, 15], SD = 0

Reason:

• All values are the same, so there is no spread

• None of the data points deviates from the mean

Variance
• The spread of data can also be measured using
another measure called variance.
• The variance of a variable is the average squared
deviation of its n values around the mean of that
variable
$$\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i-\mu)^2$$
• SD is the square root of the variance
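The SD values quoted for the example sets a, b and c can be reproduced from these definitions. The sketch below assumes the variance divides by n (not n − 1), which is what matches the quoted figures; the helper names are illustrative.

```python
import math

def mean(values):
    return sum(values) / len(values)

def variance(values):
    """Average squared deviation from the mean (divides by n)."""
    mu = mean(values)
    return sum((v - mu) ** 2 for v in values) / len(values)

def std_dev(values):
    """Standard deviation = square root of the variance."""
    return math.sqrt(variance(values))

a = [3, 1, 24, 12]
b = [11, 9, 7, 13]
c = [15, 15, 15, 15, 15]

# a and b share the same mean but have very different spreads;
# c has no spread at all.
for name, data in (("a", a), ("b", b), ("c", c)):
    print(name, mean(data), round(variance(data), 2), round(std_dev(data), 3))
```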


Drawbacks
• Standard deviation and variance apply only to
one-dimensional data sets
• A goal of statistical analysis is to find the relationship
between the dimensions of the data.
Example:
whether a student who has attended more classes
has a high attendance percentage or not.

Covariance
• A measure is needed to see how two
different dimensions are related to each other
• The degree to which the data points are linearly
correlated is represented by their covariance
• Covariance (σ) for two-dimensional data
$$\sigma_{12} = \frac{1}{n}\sum (x_1-\mu_1)(x_2-\mu_2)$$


Mean Vector and Covariance

Mean vector and covariance for more than one dimension

$$\mu = \begin{bmatrix}\mu_1\\ \mu_2\\ \mu_3\\ \vdots\\ \mu_n\end{bmatrix},\qquad
\Sigma = \begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n}\\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n}\\
\sigma_{31} & \sigma_{32} & \cdots & \sigma_{3n}\\
\vdots & \vdots & \ddots & \vdots\\
\sigma_{n1} & \sigma_{n2} & \cdots & \sigma_{nn}
\end{bmatrix}$$

Covariance Matrix
• For n-dimensional data, the covariance matrix
has (n × n) elements
• If the data is three-dimensional, the covariance
matrix has (3 × 3) elements
• For two-dimensional data,
$$\Sigma = \begin{bmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22}\end{bmatrix}$$

Significance of Covariance
• If a covariance value in the matrix is
– positive: both dimensions increase together
– negative: as one dimension increases, the other
decreases
– zero: the two dimensions are uncorrelated (no linear relationship);
for jointly normal variables this means they are independent of each other
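The three cases can be illustrated with a small sketch. The data below are hypothetical values chosen only to make the signs obvious, and the `covariance` helper (1/n normalization) is an assumption for illustration rather than something given in the slides. The last case shows a variable that depends on x yet has zero covariance with it, because the relationship is not linear.

```python
def covariance(x, y):
    """Covariance of two equally long sequences, using 1/n normalization."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / n

x = [-2, -1, 0, 1, 2]            # hypothetical values
rising = [2 * v for v in x]      # increases as x increases
falling = [-2 * v for v in x]    # decreases as x increases
squared = [v * v for v in x]     # depends on x, but not linearly

print(covariance(x, rising))     # 4.0  (positive)
print(covariance(x, falling))    # -4.0 (negative)
print(covariance(x, squared))    # 0.0  (zero)
```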

Univariate Density
The probability density function for the univariate density is written as
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]$$
where μ is the mean and σ² is the variance.

• Mean (μ):
It is the average of the given feature x and is given by
$$\mu = \frac{\sum x}{n}$$
• Variance (σ²):
The spread of the data can be measured using the variance.
$$\sigma^2 = \frac{\sum (x-\mu)^2}{n}$$
Univariate Density (contd…)
The normal distribution is symmetrical about the mean and is a
"bell-shaped curve".

The peak of the univariate normal distribution occurs at x = µ and
its value is
$$\frac{1}{\sqrt{2\pi}\,\sigma}$$

The width of the univariate normal distribution is proportional to the
standard deviation (σ). At x = µ ± σ the density is 0.607 times the peak value,
$$\frac{0.607}{\sqrt{2\pi}\,\sigma}$$

Fig. 1 : Univariate normal distribution (P(x) against x, with x marked at µ-2σ, µ-σ, µ, µ+σ, µ+2σ)
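The properties of the curve can be checked numerically. The following is a minimal sketch of the univariate density; the function name `normal_pdf` and the values chosen for the mean and standard deviation are illustrative assumptions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

mu, sigma = 0.0, 2.0   # illustrative values, not from the slides

# The peak is at x = mu and equals 1 / (sqrt(2*pi) * sigma)
peak = normal_pdf(mu, mu, sigma)
print(peak)

# One standard deviation away, the density is about 0.607 of the peak
ratio = normal_pdf(mu + sigma, mu, sigma) / peak
print(round(ratio, 3))

# The curve is symmetric about the mean
left = normal_pdf(mu - 1.5, mu, sigma)
right = normal_pdf(mu + 1.5, mu, sigma)
print(math.isclose(left, right))
```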
Example for univariate density:
Height (h) of males (adult): 165 170 160 154 175 155 167 177 158 178

Fit a univariate normal distribution (Gaussian) for h.

$$\text{Mean } (\mu) = \frac{\sum h}{n} = \frac{1}{10}(165 + 170 + \cdots + 178) = \frac{1659}{10} = 165.9$$

$$\text{Variance } (\sigma^2) = \frac{\sum (h-\mu)^2}{n} = \frac{1}{10}\left[(165-165.9)^2 + \cdots + (178-165.9)^2\right] = 72.89$$

Fig. 2 : Univariate normal distribution for height of males (adult)
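Fitting the Gaussian amounts to computing the mean and variance of the heights. A minimal sketch, assuming the variance divides by n as in the formula for σ²; the variable names are illustrative.

```python
import math

heights = [165, 170, 160, 154, 175, 155, 167, 177, 158, 178]

n = len(heights)
mu = sum(heights) / n                            # mean
var = sum((h - mu) ** 2 for h in heights) / n    # variance (divides by n)
sigma = math.sqrt(var)                           # standard deviation

print(round(mu, 1), round(var, 2), round(sigma, 2))  # 165.9 72.89 8.54
```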
Univariate Density (contd…)
Test data:
i) Height (h) = 100

Find the probability density P(100) of being an adult and compare it with the
threshold (T) = 0.00005. The standard deviation is σ = √72.89 ≈ 8.54.
$$P(100) = \frac{1}{\sqrt{2 \times 3.14}\times 8.54}\exp\left[-\frac{1}{2}\left(\frac{100-165.9}{8.54}\right)^2\right] = 5.485 \times 10^{-15} \approx 0$$

Result: P(100) < T, so the height 100 does not fit the normal distribution
and the person is not an adult.
ii) Height (h) = 160

Find the probability density P(160) of being an adult and compare it with the
threshold (T) = 0.00005.
$$P(160) = \frac{1}{\sqrt{2 \times 3.14}\times 8.54}\exp\left[-\frac{1}{2}\left(\frac{160-165.9}{8.54}\right)^2\right] \approx 0.037$$

Result: P(160) > T, so the height 160 fits the normal distribution
and the person is an adult.
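The two test cases can be reproduced with a short sketch. It uses the fitted mean 165.9, standard deviation 8.54 and threshold T = 0.00005 from the example; `normal_pdf` and `is_adult` are hypothetical helper names introduced here for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

MU, SIGMA = 165.9, 8.54   # fitted values from the height example
T = 0.00005               # threshold from the example

def is_adult(height):
    """Hypothetical helper: accept the height when its density exceeds T."""
    return normal_pdf(height, MU, SIGMA) > T

for h in (100, 160):
    print(h, normal_pdf(h, MU, SIGMA), is_adult(h))
```

The density at 100 comes out on the order of 1e-15, far below T, while the density at 160 is a few hundredths, well above T.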
Multivariate Density
The general multivariate normal density in d dimensions is written as
$$P(\mathbf{x}) = \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\exp\left[-\frac{1}{2}(\mathbf{x}-\mu)^t\,\Sigma^{-1}(\mathbf{x}-\mu)\right]$$

where
x is a d-dimensional column vector
μ is the d-dimensional mean vector
Σ is the d-by-d covariance matrix
|Σ| is the determinant of the covariance matrix
Σ⁻¹ is the inverse of the covariance matrix
(x − μ)ᵗ is the transpose of (x − μ)
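Mean Vector (µ) and Covariance Matrix (∑)
$$\mu = \begin{bmatrix}\mu_1\\ \mu_2\\ \vdots\\ \mu_d\end{bmatrix},\qquad
\Sigma = \begin{bmatrix}
\sigma_{11} & \sigma_{12} & \cdots & \sigma_{1d}\\
\sigma_{21} & \sigma_{22} & \cdots & \sigma_{2d}\\
\vdots & \vdots & \ddots & \vdots\\
\sigma_{d1} & \sigma_{d2} & \cdots & \sigma_{dd}
\end{bmatrix}$$
$$\sigma_{ij} = \sigma_{ji} = \frac{1}{n}\sum_{1}^{n}(x_i-\mu_i)(x_j-\mu_j)$$
σij is the covariance between xi and xj (the variance of xi when i = j),
i, j = 1…d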
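As a consistency check, setting d = 1 in the general multivariate density should give back the univariate formula. With a single variable the covariance matrix reduces to the scalar σ², so the reduction can be written out as follows.

```latex
\Sigma = \sigma^2, \qquad |\Sigma|^{1/2} = \sigma, \qquad \Sigma^{-1} = \frac{1}{\sigma^2}
\;\Longrightarrow\;
P(x) = \frac{1}{(2\pi)^{1/2}\,\sigma}
       \exp\left[-\frac{1}{2}\,\frac{(x-\mu)^2}{\sigma^2}\right]
     = \frac{1}{\sqrt{2\pi}\,\sigma}
       \exp\left[-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right]
```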
Multivariate Density (contd…)
The covariance matrix (∑) is a symmetric matrix. Its diagonal elements are the
variances of the individual components of x, which cannot be negative.
The off-diagonal elements are the covariances, which can be positive or negative.
Statistically Dependent Variables:
Variables for which knowing one gives information about the other, for example
because they are causally related, are called statistically dependent variables.
Example: engine temperature and oil temperature

Statistically Independent Variables:
Variables that are not related in this way are called statistically
independent variables.
Example: oil pressure in engine and air pressure in tire
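One consequence of statistical independence is that the joint density is the product of the individual densities; for normal variables this corresponds to a covariance matrix whose off-diagonal entries are zero. The sketch below checks this for two variables. The numbers are illustrative only (they are not measurements from the slides), and both function names are assumptions made for this example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    coeff = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return coeff * math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def bivariate_pdf_diagonal(x1, x2, mu1, mu2, s1, s2):
    """Bivariate normal density when the covariance matrix is diag(s1^2, s2^2)."""
    det = (s1 ** 2) * (s2 ** 2)   # determinant of a diagonal matrix
    # (x - mu)^t Sigma^{-1} (x - mu) with Sigma^{-1} = diag(1/s1^2, 1/s2^2)
    quad = (x1 - mu1) ** 2 / s1 ** 2 + (x2 - mu2) ** 2 / s2 ** 2
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

# Illustrative numbers only
mu1, s1 = 30.0, 4.0
mu2, s2 = 2.0, 0.5
x1, x2 = 33.0, 1.8

joint = bivariate_pdf_diagonal(x1, x2, mu1, mu2, s1, s2)
product = normal_pdf(x1, mu1, s1) * normal_pdf(x2, mu2, s2)
print(joint, product, math.isclose(joint, product))
```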
Multivariate Density (contd…)
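If the variables are statistically independent, the covariances are zero and
the covariance matrix is a diagonal matrix.
$$\Sigma = \begin{bmatrix}
\sigma_1^2 & 0 & \cdots & 0\\
0 & \sigma_2^2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & \sigma_d^2
\end{bmatrix},\qquad
\Sigma^{-1} = \begin{bmatrix}
\frac{1}{\sigma_1^2} & 0 & \cdots & 0\\
0 & \frac{1}{\sigma_2^2} & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & \frac{1}{\sigma_d^2}
\end{bmatrix}$$
$$|\Sigma| = \sigma_1^2 \times \sigma_2^2 \times \cdots \times \sigma_d^2$$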

Multivariate Density (Bivariate density):

P(x) for two-dimensional (bivariate) data is a bell- or hill-shaped
surface over the two-dimensional plane (x1, x2).

The peak of the bivariate normal distribution occurs at (x1, x2) = (μ1, μ2)
and its value is
$$\frac{1}{2\pi\,|\Sigma|^{1/2}}$$

The shape of the hump depends on the two variances and on the
correlation coefficient (ρ), given by
$$\rho = \frac{\sigma_{12}}{\sigma_1\,\sigma_2}$$
Example for Multivariate Density (Bivariate density):

Height (h) of males: 165 170 160 154 175 155 167 177 158 178
Weight (w) of males: 78 71 60 53 72 51 64 65 55 69

Fit a bivariate normal distribution (Gaussian) for h and w.

Mean Vector (µ)
$$\mu_1 = \frac{\sum h}{n} = \frac{1}{10}(165 + 170 + \cdots + 178) = 165.9$$
$$\mu_2 = \frac{\sum w}{n} = \frac{1}{10}(78 + 71 + \cdots + 69) = 63.8$$
$$\mu = \begin{bmatrix}\mu_1\\ \mu_2\end{bmatrix} = \begin{bmatrix}165.9\\ 63.8\end{bmatrix}$$
Bivariate Density (contd…)
Covariance Matrix (∑)
$$\Sigma = \begin{bmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22}\end{bmatrix},\qquad
\sigma_{ij} = \sigma_{ji} = \frac{1}{n}\sum_{1}^{n}(x_i-\mu_i)(x_j-\mu_j),\quad i, j = 1 \text{ to } 2$$

$$\sigma_{11} = \frac{1}{10}\left[(165-165.9)^2 + (170-165.9)^2 + \cdots + (178-165.9)^2\right] = 72.89$$

$$\sigma_{12} = \frac{1}{10}\left[(165-165.9)(78-63.8) + (170-165.9)(71-63.8) + \cdots + (178-165.9)(69-63.8)\right] = 52.78$$

$$\sigma_{21} = \frac{1}{10}\left[(78-63.8)(165-165.9) + (71-63.8)(170-165.9) + \cdots + (69-63.8)(178-165.9)\right] = 52.78$$

$$\sigma_{22} = \frac{1}{10}\left[(78-63.8)^2 + (71-63.8)^2 + \cdots + (69-63.8)^2\right] = 72.16$$

$$\Sigma = \begin{bmatrix}72.89 & 52.78\\ 52.78 & 72.16\end{bmatrix}$$
Bivariate Density (contd…)

Fig. 3 : Bivariate normal distribution for height and weight of males
Bivariate Density (contd…)
Test data:
i) Height, weight = 75, 25

Find the probability density P(75, 25) of being an adult and compare it with the
threshold (T) = 0.000005.

P(75, 25) = 1.26e-29 ≈ 0
Result: P(75, 25) < T, so this height and weight do not fit the bivariate normal
distribution and the person is not an adult.

ii) Height, weight = 160, 60

Find the probability density P(160, 60) of being an adult and compare it with the
threshold (T) = 0.000005.

P(160, 60) = 0.0023

Result: P(160, 60) > T, so this height and weight fit the bivariate normal
distribution and the person is an adult.
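The bivariate test can be sketched in plain Python using the closed-form inverse of a 2×2 matrix. The `fit` and `bivariate_pdf` helpers are assumptions introduced for illustration. The quoted values 1.26e-29 and 0.0023 appear to correspond to a covariance normalized by n − 1; with the 1/n normalization of the worked covariance matrix the densities differ slightly (about 0.0025 for the second point). The comparison against T gives the same decision either way.

```python
import math

heights = [165, 170, 160, 154, 175, 155, 167, 177, 158, 178]
weights = [78, 71, 60, 53, 72, 51, 64, 65, 55, 69]

def fit(xs, ys, ddof):
    """Mean vector and 2x2 covariance; ddof=0 divides by n, ddof=1 by n - 1."""
    n = len(xs)
    m1, m2 = sum(xs) / n, sum(ys) / n
    d = n - ddof
    s11 = sum((x - m1) ** 2 for x in xs) / d
    s22 = sum((y - m2) ** 2 for y in ys) / d
    s12 = sum((x - m1) * (y - m2) for x, y in zip(xs, ys)) / d
    return (m1, m2), (s11, s12, s22)

def bivariate_pdf(x1, x2, mean, cov):
    """Bivariate normal density for a symmetric 2x2 covariance (s11, s12, s22)."""
    m1, m2 = mean
    s11, s12, s22 = cov
    det = s11 * s22 - s12 * s12
    d1, d2 = x1 - m1, x2 - m2
    # (x - mu)^t Sigma^{-1} (x - mu), using the closed-form 2x2 inverse
    quad = (s22 * d1 * d1 - 2.0 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

T = 0.000005   # threshold from the example

for ddof in (0, 1):
    mean, cov = fit(heights, weights, ddof)
    for point in ((75, 25), (160, 60)):
        p = bivariate_pdf(point[0], point[1], mean, cov)
        print("ddof =", ddof, point, p, "adult" if p > T else "not adult")
```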
