Contents
› Importance of PCA
› Dimensionality Reduction
› Linear Transformation
› PCA Steps
Dr.- Ing. Maggie Mashaly
Importance of PCA
Correlation

[Figure: two scatter plots of $x_2$ against $x_1$ — left: uncorrelated data; right: correlated data]
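As an illustrative sketch (not from the slides), the difference between the two scatter plots can be quantified with the correlation coefficient; the variable names and constants below are assumptions for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Uncorrelated case: x1 and x2 drawn independently
x1 = rng.normal(size=1000)
x2 = rng.normal(size=1000)
print(np.corrcoef(x1, x2)[0, 1])        # near 0

# Correlated case: x2 depends linearly on x1 plus a little noise
x2_corr = 2.0 * x1 + 0.5 * rng.normal(size=1000)
print(np.corrcoef(x1, x2_corr)[0, 1])   # near 1
```

A correlation near 0 corresponds to the left plot, near 1 to the right plot.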
Main Questions
› How to remove correlation?

[Figure: correlated data with $w_1$, the principal direction of data variation; the variation in the data along this direction is very high]
Dimensionality Reduction
We are interested in finding a linear transformation from an m-dimensional space to an n-dimensional space (reducing the number of features from m to n).

[Figure: 2-D data projected onto the direction $w_1$, reducing it to 1-D]
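A minimal sketch (assumed NumPy example, not from the slides) of reducing 2-D data to 1-D by projecting onto a single direction $w_1$; the direction chosen here is arbitrary for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
# Correlated 2-D data, one sample per row: shape (500, 2)
x = np.column_stack([x1, 0.5 * x1 + 0.1 * rng.normal(size=500)])

w1 = np.array([[1.0], [0.5]])       # a chosen projection direction (m=2 -> n=1)
w1 = w1 / np.linalg.norm(w1)        # normalize to unit length
z = x @ w1                          # each 2-D sample becomes one number
print(x.shape, "->", z.shape)       # (500, 2) -> (500, 1)
```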
Linear Transformation

$$x = \begin{bmatrix} x_1^1 & x_2^1 & x_3^1 & \dots & x_m^1 \\ x_1^2 & x_2^2 & x_3^2 & \dots & x_m^2 \\ x_1^3 & x_2^3 & x_3^3 & \dots & x_m^3 \\ \vdots & & & & \vdots \\ x_1^p & x_2^p & x_3^p & \dots & x_m^p \end{bmatrix} \qquad W = \begin{bmatrix} w_{1,1} & w_{1,2} & w_{1,3} & \dots & w_{1,k} \\ w_{2,1} & w_{2,2} & w_{2,3} & \dots & w_{2,k} \\ w_{3,1} & w_{3,2} & w_{3,3} & \dots & w_{3,k} \\ \vdots & & & & \vdots \\ w_{m,1} & w_{m,2} & w_{m,3} & \dots & w_{m,k} \end{bmatrix}$$

Data matrix (one sample per row, m features) and transformation matrix (m × k).
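The matrix shapes above can be sketched in NumPy (dimensions chosen arbitrarily for illustration): a data matrix with one sample per row times an m × k transformation matrix yields the transformed data.

```python
import numpy as np

p, m, k = 6, 4, 2                 # 6 samples, 4 original features, 2 new features
rng = np.random.default_rng(2)
x = rng.normal(size=(p, m))       # data matrix
W = rng.normal(size=(m, k))       # transformation matrix
z = x @ W                         # each row of z is one transformed sample
print(z.shape)                    # (6, 2)
```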
Finding the right Transformation
› Let us first forget about reducing the dimensions and focus on removing the correlation between features
› The relation between features is described by the covariance matrix
› $Cov(x, x) = x^T x$

Correlated $x$ (n × m, m features) $\;\xrightarrow{\;Z = xW\;}\;$ Uncorrelated $z$ (n × k, k features, with $z^T z = I$)
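A short sketch (assumed example, not from the slides) of the covariance matrix of centered data: off-diagonal entries far from zero signal correlated features, which is exactly what the transformation should remove.

```python
import numpy as np

rng = np.random.default_rng(3)
a = rng.normal(size=1000)
# Two correlated features, one sample per row
x = np.column_stack([a, 0.8 * a + 0.2 * rng.normal(size=1000)])
x = x - x.mean(axis=0)            # center each feature
cov = x.T @ x / x.shape[0]        # m x m covariance matrix
print(cov)                        # off-diagonal entries are far from zero
```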
› $cov(x, x)\, U = U\lambda$
› Where $U$ is the eigenvector matrix and $\lambda$ is the diagonal matrix of eigenvalues
› $U^T cov(x, x)\, U = \lambda$
› $U^T cov(x, x)\, U = \lambda^{1/2}\, I\, \lambda^{1/2}$
› $\lambda^{-1/2}\, U^T cov(x, x)\, U\, \lambda^{-1/2} = I$
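The last identity above can be checked numerically; this is an assumed NumPy sketch (data construction is arbitrary for illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
a = rng.normal(size=(1000, 3))
a[:, 1] += 0.5 * a[:, 0]                # make two features correlated
a = a - a.mean(axis=0)                  # center
cov = a.T @ a / a.shape[0]

lam, U = np.linalg.eigh(cov)            # eigenvalues (vector) and eigenvectors
scale = np.diag(lam ** -0.5)            # lambda^(-1/2) as a diagonal matrix
I_hat = scale @ U.T @ cov @ U @ scale   # lambda^(-1/2) U^T cov U lambda^(-1/2)
print(np.allclose(I_hat, np.eye(3)))    # identity, as derived
```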
Singular Value Decomposition
› $cov(x, x)\, U = U\lambda$
› $cov(x, x) = U \lambda U^T$
› $SVD(cov(x, x)) = U S V^T$
› Because the covariance matrix is symmetric and positive semi-definite, $U = V$ (and $S = \lambda$)
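A sketch (assumed example) verifying the claim above: for a symmetric positive semi-definite covariance matrix, the SVD's left and right singular vectors coincide and the singular values equal the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(5)
a = rng.normal(size=(500, 3))
a[:, 1] += 0.5 * a[:, 0]          # correlated features
a = a - a.mean(axis=0)
cov = a.T @ a / a.shape[0]        # symmetric PSD matrix

U, S, Vt = np.linalg.svd(cov)
print(np.allclose(U, Vt.T, atol=1e-6))                             # U = V
print(np.allclose(np.sort(S), np.sort(np.linalg.eigvalsh(cov))))   # S = eigenvalues
```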
PCA Steps
– Center each feature around the feature average: $x_i = x_i − E(x_i)$
– Normalize the data (optional): $x_i = \dfrac{x_i − E(x_i)}{\sigma_i}$
– Calculate the covariance matrix of x: $cov(x, x) = \dfrac{1}{n} x^T x$
– Use singular value decomposition to calculate the eigenvector matrix U
– If reduction is to be applied, pick the first k columns of the U matrix to obtain $U_{reduced}$
– Obtain the new transformed data by $z = U_{reduced}^T\, x$
– Note that in doing so we ignored the eigenvalues, which represent the significance of each dimension
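The steps above can be sketched as a small NumPy function (an assumed implementation with rows as samples, so the projection is written $z = x\,U_{reduced}$ rather than $U_{reduced}^T x$; `pca` and the test data are names invented for this example):

```python
import numpy as np

def pca(x, k):
    x = x - x.mean(axis=0)               # 1. center each feature
    cov = x.T @ x / x.shape[0]           # 2. covariance matrix (1/n) x^T x
    U, S, _ = np.linalg.svd(cov)         # 3. SVD gives eigenvectors U (sorted by eigenvalue)
    U_red = U[:, :k]                     # 4. keep the first k columns
    z = x @ U_red                        # 5. transformed (reduced) data
    return z, U_red

rng = np.random.default_rng(6)
a = rng.normal(size=(200, 1))
# Three nearly linearly dependent features -> one component suffices
x = np.hstack([a, 2 * a + 0.01 * rng.normal(size=(200, 1)), -a])
z, U_red = pca(x, 1)
print(x.shape, "->", z.shape)            # (200, 3) -> (200, 1)
```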
Obtaining the Original Dimension Approximation
› $z = U_{reduced}^T\, x$
› $x_{approx} = U_{reduced}\, z$
› In this case the approximation error can be calculated as
$$\frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} − x_{approx}^{(i)} \right\|^2$$
› Percentage error
$$\frac{\frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} − x_{approx}^{(i)} \right\|^2}{\frac{1}{m} \sum_{i=1}^{m} \left\| x^{(i)} \right\|^2}$$
› We are interested in a very small percentage error
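A sketch of the reconstruction and percentage error above (an assumed NumPy example with rows as samples; data and names are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(7)
a = rng.normal(size=(300, 1))
# Two almost perfectly dependent features
x = np.hstack([a, 3 * a + 0.01 * rng.normal(size=(300, 1))])
x = x - x.mean(axis=0)

cov = x.T @ x / x.shape[0]
U, _, _ = np.linalg.svd(cov)
U_red = U[:, :1]                       # keep k = 1 component

z = x @ U_red                          # reduce to 1-D
x_approx = z @ U_red.T                 # reconstruct in the original dimension

# The 1/m factors cancel between numerator and denominator
pct_err = np.sum((x - x_approx) ** 2) / np.sum(x ** 2)
print(pct_err)                         # very small: almost all variance retained
```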