
Principal Component Analysis
Dr.-Ing. Maggie Mashaly
Contents
Ø Importance of PCA
Ø Dimensionality Reduction
Ø Linear Transformation
Ø PCA Steps
Importance of PCA

Correlation
[Figure: two scatter plots of $x_2$ versus $x_1$ — the left shows uncorrelated data, the right shows correlated data.]
Main Questions
› How to remove correlation?
› What are the dimensions with the most information?
› How can we provide a better scaling or normalization?
Principal Component Analysis
The goal is to identify the principal directions in which the data varies.
q Better data representation
Ø Reduce the number of features
Ø Reduce the relation between features
q Eliminate non-principal data
q Can be used for compression applications
Principal Component Analysis: 2D Example
[Figure: 2D scatter of data in the ($x_1$, $x_2$) plane with the principal direction of data variation $w_1$ drawn through the point cloud. The variation of the data along $w_1$ is very high, while the variation along the orthogonal direction is low, so projecting the data onto the principal direction preserves most of the information.]
Dimensionality Reduction
We want to find a linear transformation from an m-dimensional feature space to a k-dimensional one, i.e. reduce the number of features from m to k.
[Figure: 2D data in the ($x_1$, $x_2$) plane projected onto a single direction $w_1$ — a reduction from 2D to 1D.]
Linear Transformation

$$
x = \begin{bmatrix}
x_1^{(1)} & x_2^{(1)} & x_3^{(1)} & \dots & x_m^{(1)} \\
x_1^{(2)} & x_2^{(2)} & x_3^{(2)} & \dots & x_m^{(2)} \\
x_1^{(3)} & x_2^{(3)} & x_3^{(3)} & \dots & x_m^{(3)} \\
\vdots & & & & \vdots \\
x_1^{(n)} & x_2^{(n)} & x_3^{(n)} & \dots & x_m^{(n)}
\end{bmatrix}
\qquad
W = \begin{bmatrix}
w_{1,1} & w_{1,2} & w_{1,3} & \dots & w_{1,k} \\
w_{2,1} & w_{2,2} & w_{2,3} & \dots & w_{2,k} \\
w_{3,1} & w_{3,2} & w_{3,3} & \dots & w_{3,k} \\
\vdots & & & & \vdots \\
w_{m,1} & w_{m,2} & w_{m,3} & \dots & w_{m,k}
\end{bmatrix}
$$

$x$ is the $n \times m$ data matrix (n data points, m features) and $W$ is the $m \times k$ transformation matrix. The transformed data

$$Z = xW$$

is $n \times k$: the same n data points, now with k features.
Example

$$
x = \begin{bmatrix} 0 & 1 \\ 3 & 2 \\ -1 & -2 \end{bmatrix}
\qquad
w = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\qquad
Z = xw = \begin{bmatrix} 1 \\ 5 \\ -3 \end{bmatrix}
$$

[Figure: the three 2D data points plotted in the ($x_1$, $x_2$) plane, and their 1D projections $Z$ plotted on a line.]
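As a concrete check of this example, here is a minimal sketch in NumPy (assumed here; the slides do not prescribe any library) computing $Z = xW$:

```python
import numpy as np

# Data matrix: 3 data points (rows) with 2 features (columns)
x = np.array([[ 0,  1],
              [ 3,  2],
              [-1, -2]])

# Transformation matrix: maps 2 features to 1 feature
w = np.array([[1],
              [1]])

# Z = xW projects each 2D data point onto a single direction
Z = x @ w
print(Z.ravel())  # [ 1  5 -3]
```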
Finding the right Transformation
› Let us first forget about reducing the dimensions and focus on reducing the relation between features.
› The relation between features is described by the covariance matrix:
› $\mathrm{Cov}(x, x) = x^{T} x$
› The best transformation would give no relation between any two features:
– (this means that the covariance matrix is an identity matrix)
– $\mathrm{Cov}(z, z) = z^{T} z = I$
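As a small illustration (a sketch of my own, not taken from the slides), the covariance matrix of centered but correlated data has large off-diagonal entries, which is exactly what the transformation is meant to remove:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated features: x2 is roughly twice x1 plus some noise
x1 = rng.normal(size=500)
x2 = 2.0 * x1 + 0.3 * rng.normal(size=500)
x = np.column_stack([x1, x2])

# Center each feature around its mean
x = x - x.mean(axis=0)

# Covariance matrix: large off-diagonal entries indicate correlated features
cov = x.T @ x / len(x)
print(cov)
```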
Finding the right Transformation
The correlated $n \times m$ data $x$ (m features) is transformed by $Z = xW$ into the uncorrelated $n \times k$ data $z$ (k features) with $z^{T} z = I$:

$$z^{T} z = W^{T} x^{T} x\, W = I \qquad [k \times m]\,[m \times n]\,[n \times m]\,[m \times k]$$
Finding the right Transformation
› $\mathrm{cov}(x, x)\, U = U\lambda$
› where $U$ is the matrix of eigenvectors and $\lambda$ is the diagonal matrix of eigenvalues
› $U^{T} \mathrm{cov}(x, x)\, U = \lambda$
› $U^{T} \mathrm{cov}(x, x)\, U = \lambda^{\frac{1}{2}}\, I\, \lambda^{\frac{1}{2}}$
› $\lambda^{-\frac{1}{2}}\, U^{T} \mathrm{cov}(x, x)\, U\, \lambda^{-\frac{1}{2}} = I$
› $\left(U \lambda^{-\frac{1}{2}}\right)^{T} \mathrm{cov}(x, x)\, \left(U \lambda^{-\frac{1}{2}}\right) = I$

so the transformation $W = U \lambda^{-\frac{1}{2}}$ gives uncorrelated features with $W^{T} x^{T} x\, W = I$.
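A quick numerical check of this derivation (a sketch with my own variable names): building $W = U\lambda^{-1/2}$ from the eigendecomposition of the covariance matrix does produce transformed data with identity covariance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Correlated 2-feature data, centered around zero
x = rng.multivariate_normal([0, 0], [[3.0, 2.0], [2.0, 2.0]], size=2000)
x = x - x.mean(axis=0)

# Covariance as defined on the slides (no 1/n factor here)
cov = x.T @ x

# Eigendecomposition: cov U = U lambda
lam, U = np.linalg.eigh(cov)

# W = U lambda^(-1/2), the transformation derived above
W = U @ np.diag(lam ** -0.5)
z = x @ W

print(np.round(z.T @ z, 6))  # ~ identity matrix: uncorrelated, unit-variance features
```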
Singular Value Decomposition
› $\mathrm{cov}(x, x)\, U = U\lambda$
› $\mathrm{cov}(x, x) = U \lambda\, U^{T}$
› $\mathrm{SVD}(\mathrm{cov}(x, x)) = U S V^{T}$
› Because of the characteristics of the covariance matrix (it is symmetric and positive semi-definite), $V = U$ and $S = \lambda$, so the SVD directly yields the eigenvector matrix $U$ and the eigenvalues.
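For instance (a brief sketch), applying `np.linalg.svd` to a covariance matrix recovers the same eigenvectors and eigenvalues as the eigendecomposition, because the matrix is symmetric and positive semi-definite:

```python
import numpy as np

# A symmetric, positive semi-definite covariance matrix
cov = np.array([[3.0, 2.0],
                [2.0, 2.0]])

U_svd, S, Vt = np.linalg.svd(cov)   # cov = U S V^T
lam, U_eig = np.linalg.eigh(cov)    # cov U = U lambda (eigenvalues ascending)

print(S)                            # singular values, sorted descending
print(lam[::-1])                    # same values as S
print(np.allclose(U_svd, Vt.T))     # True: V = U for a symmetric PSD matrix
```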
PCA Steps
– Center each feature around the feature average:
$$x_i \leftarrow x_i - E(x_i)$$
– Normalize the data (optional):
$$x_i \leftarrow \frac{x_i - E(x_i)}{\sigma_i}$$
– Calculate the covariance matrix of $x$:
$$\mathrm{cov}(x, x) = \frac{1}{n}\, x^{T} x$$
– Use singular value decomposition to calculate the eigenvector matrix $U$.
– If reduction is to be applied, pick the first $k$ columns of the $U$ matrix to obtain $U_{\mathrm{reduce}}$.
– Obtain the new transformed data (for each data point $x$):
$$z = U_{\mathrm{reduce}}^{T}\, x$$
– Note that in doing so we ignored the eigenvalues, which represent the significance of each dimension.

A code sketch of these steps is given below.
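Putting the steps together, here is a minimal NumPy sketch of the procedure (the function name `pca_transform` and all variable names are mine, not from the slides; the projection is written in matrix form as `x_centered @ U_reduce`, which is $z = U_{\mathrm{reduce}}^{T} x$ applied to every data point at once):

```python
import numpy as np

def pca_transform(x, k, normalize=False):
    """Project the n x m data matrix x onto its first k principal components."""
    # Step 1: center each feature around its average
    mean = x.mean(axis=0)
    x_centered = x - mean

    # Step 2 (optional): normalize each feature by its standard deviation
    if normalize:
        x_centered = x_centered / x_centered.std(axis=0)

    # Step 3: covariance matrix of the centered data
    cov = x_centered.T @ x_centered / len(x)

    # Step 4: SVD of the covariance matrix gives the eigenvector matrix U
    U, S, Vt = np.linalg.svd(cov)

    # Step 5: keep only the first k columns of U
    U_reduce = U[:, :k]

    # Step 6: transform the data (z = U_reduce^T x for each data point)
    Z = x_centered @ U_reduce
    return Z, U_reduce, mean

# Example: reduce 3-feature data to 2 features
rng = np.random.default_rng(0)
x = rng.normal(size=(100, 3))
Z, U_reduce, mean = pca_transform(x, k=2)
print(Z.shape)  # (100, 2)
```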
Obtaining the Original Dimension Approximation
› $z = U_{\mathrm{reduce}}^{T}\, x$
› $x_{\mathrm{approx}} = U_{\mathrm{reduce}}\, z$
› In this case the approximation error can be calculated as
$$\frac{1}{n} \sum_{i=1}^{n} \left\lVert x^{(i)} - x_{\mathrm{approx}}^{(i)} \right\rVert^{2}$$
› Percentage error:
$$\frac{\dfrac{1}{n} \sum_{i=1}^{n} \left\lVert x^{(i)} - x_{\mathrm{approx}}^{(i)} \right\rVert^{2}}{\dfrac{1}{n} \sum_{i=1}^{n} \left\lVert x^{(i)} \right\rVert^{2}}$$
› We are interested in a very small percentage error.
[Figure: the reduction $z = U_{\mathrm{reduce}}^{T}\, x$ and the reconstruction $x_{\mathrm{approx}} = U_{\mathrm{reduce}}\, z$ shown side by side as a round trip between the original and the reduced space.]
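A rough, self-contained sketch of the reconstruction and the percentage error (the helper name `percentage_error` and the toy data are mine):

```python
import numpy as np

def percentage_error(x_centered, U_reduce):
    """Percentage reconstruction error after projecting onto U_reduce."""
    Z = x_centered @ U_reduce          # compress: z = U_reduce^T x per data point
    x_approx = Z @ U_reduce.T          # reconstruct: x_approx = U_reduce z
    # Ratio of average squared reconstruction error to average squared norm
    return np.sum((x_centered - x_approx) ** 2) / np.sum(x_centered ** 2)

# Toy example: strongly correlated 2D data compressed down to 1D
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1)) @ np.array([[1.0, 2.0]]) + 0.05 * rng.normal(size=(200, 2))
x_centered = x - x.mean(axis=0)

cov = x_centered.T @ x_centered / len(x)
U, S, Vt = np.linalg.svd(cov)
U_reduce = U[:, :1]

print(percentage_error(x_centered, U_reduce))  # very small, well below 0.01
```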
How to Choose K?
› Iteratively increase k from 1 to m and calculate the error.
› Pick the k that ensures an error smaller than a predefined value.
› Luckily, the eigenvalues returned for the covariance matrix provide the contribution of each dimension, and hence give the same information as calculating the error explicitly (see the sketch below).
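As a sketch of that shortcut (the helper name `choose_k` is mine), the eigenvalues can be used directly: the percentage error equals the fraction of eigenvalue mass that is discarded, so the smallest acceptable k can be read off a cumulative sum.

```python
import numpy as np

def choose_k(eigenvalues, max_error=0.01):
    """Smallest k whose discarded eigenvalue mass stays below max_error."""
    # Eigenvalues sorted descending = variance captured by each component
    lam = np.sort(eigenvalues)[::-1]
    retained = np.cumsum(lam) / np.sum(lam)      # variance kept by the first k components
    # Percentage error = 1 - retained variance; find the first k meeting the target
    return int(np.argmax(retained >= 1.0 - max_error)) + 1

# Example: most of the variance sits in the first two components
print(choose_k(np.array([5.0, 3.0, 0.05, 0.01])))  # -> 2
```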


Thank You
