Dimensionality Reduction
Machine Learning
Overview
Motivation
Principal Component Analysis (PCA) Problem Formulation
The Algorithm
Choosing k
Applying PCA
Motivation
Motivation :: Data Compression (1)
[Figure: scatter plot of data points; one axis is labelled Pilot Skill]
Motivation :: Data Compression (2)
Original features per country: GDP (Trillion $), Per Capita GDP (K $), HDI, Life Expectancy, Poverty Index, Mean Household income (K $), ….

Reduced representation:
Country   $z_1$   $z_2$
USA       2       1.5
…         ….      ….
Data Visualization
Quiz
Principal Component Analysis (PCA)
PCA Problem Formulation (1)
Perform mean normalization / feature scaling before PCA.
[Figure: scatter plot of data points]
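A minimal sketch of the preprocessing step named on this slide. The function name `normalize_features` and the sample values are illustrative assumptions, not taken from the slides; dividing by the sample standard deviation is one common choice of feature scaling.

```python
import numpy as np

def normalize_features(X):
    """Subtract each column's mean and divide by its sample standard deviation."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0, ddof=1)
    # Guard against constant columns to avoid division by zero.
    sigma = np.where(sigma == 0, 1.0, sigma)
    return (X - mu) / sigma, mu, sigma

# Hypothetical data with two features on very different scales.
X_raw = np.array([[2100.0, 3.0],
                  [1600.0, 2.0],
                  [2400.0, 4.0],
                  [1400.0, 2.0]])

X_norm, mu, sigma = normalize_features(X_raw)
print(X_norm.mean(axis=0))         # each column now has (numerically) zero mean
print(X_norm.std(axis=0, ddof=1))  # and unit sample standard deviation
```

The returned `mu` and `sigma` would be kept so that the same transformation can be applied to new examples later.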
PCA Problem Formulation (2) to (7)
[Figure sequence: scatter plots of the data points; slide (5) labels the Principal Component.]
• Reduce from 2-dim to 1-dim: Find a direction (vector $u^{(1)} \in \mathbb{R}^n$) onto which to project the data so as to minimize the projection error.
• Reduce from n-dim to k-dim: Find k vectors $u^{(1)}, u^{(2)}, \dots, u^{(k)}$ onto which to project the data so as to minimize the projection error.
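The 2-dim to 1-dim case can be checked numerically. The sketch below assumes synthetic data and a hypothetical helper `projection_error`; it scans candidate directions and compares the best one with the top eigenvector of the covariance matrix, which is the direction PCA would return.

```python
import numpy as np

def projection_error(X_centered, u):
    """Mean squared distance between each point and its projection onto direction u."""
    u = u / np.linalg.norm(u)
    projected = np.outer(X_centered @ u, u)
    return np.mean(np.sum((X_centered - projected) ** 2, axis=1))

# Synthetic, correlated 2-D data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2)) @ np.array([[2.0, 0.0], [1.0, 0.5]])
Xc = X - X.mean(axis=0)

# Brute-force search over directions at 1-degree steps.
angles = np.linspace(0.0, np.pi, 181)
errors = [projection_error(Xc, np.array([np.cos(a), np.sin(a)])) for a in angles]
best = angles[int(np.argmin(errors))]
u_best = np.array([np.cos(best), np.sin(best)])

# Direction of largest variance from the covariance matrix.
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
u_pca = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order

print(abs(u_best @ u_pca))  # close to 1: the two directions agree up to sign
```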
PCA is not Linear Regression
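The distinction can be made concrete: linear regression minimizes vertical errors in predicting y from x, while PCA minimizes orthogonal distances to a line and treats both variables symmetrically. The sketch below uses synthetic data (an assumption for illustration) and shows that the two fitted lines generally have different slopes.

```python
import numpy as np

# Synthetic data: y depends on x plus noise (illustrative only).
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 0.5 * x + rng.normal(scale=0.5, size=300)
x, y = x - x.mean(), y - y.mean()

# Linear regression of y on x: minimizes squared vertical errors.
slope_reg = (x @ y) / (x @ x)

# PCA: first principal direction minimizes squared orthogonal distances.
C = np.cov(np.stack([x, y]))
eigvals, eigvecs = np.linalg.eigh(C)
u = eigvecs[:, -1]
slope_pca = u[1] / u[0]

print(slope_reg, slope_pca)  # the PCA line is steeper than the regression line here
```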
Training Set:
3 5
9 7
6 5
11 8
8 6
Numerical Example – II
1. Compute Covariance Matrix

𝑿 𝒀
3 5
9 7
6 5
11 8
8 6

$\mathrm{Cov}(X, Y) = \dfrac{\sum_{i=1}^{m} (x_i - \mu_X)(y_i - \mu_Y)}{m - 1}$
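Step 1 applied to the five training examples can be sketched as follows, using the sample covariance with the m − 1 denominator from the formula. The variable names are illustrative.

```python
import numpy as np

# Training set from the slides: columns are X and Y.
data = np.array([[3, 5], [9, 7], [6, 5], [11, 8], [8, 6]], dtype=float)

m = data.shape[0]
mu = data.mean(axis=0)       # column means: 7.4 and 6.2
centered = data - mu

# Sample covariance matrix: sum of products of deviations divided by (m - 1).
cov = centered.T @ centered / (m - 1)
print(cov)                   # approximately [[9.3, 3.65], [3.65, 1.7]]
```

`np.cov(data, rowvar=False)` gives the same matrix, since NumPy also divides by m − 1 by default.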
Numerical Example – III
1. Compute Covariance Matrix
2. Perform Eigen Value Decomposition
3. Select the No. of Principal Components
4. Reduce the Dimension

𝑿 𝒀
3 5
9 7
6 5
11 8
8 6

$\mathrm{Cov}(X, Y) = \dfrac{\sum_{i=1}^{m} (x_i - \mu_X)(y_i - \mu_Y)}{m - 1}$
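Step 2 for the same training set can be sketched with `np.linalg.eigh`, which is suited to symmetric matrices such as a covariance matrix. Sorting the eigenpairs in descending order is a convention assumed here so that the first column is the first principal component.

```python
import numpy as np

data = np.array([[3, 5], [9, 7], [6, 5], [11, 8], [8, 6]], dtype=float)
cov = np.cov(data, rowvar=False)          # sample covariance, divides by m - 1

# eigh returns eigenvalues in ascending order with eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(cov)

# Reorder so the largest eigenvalue (most variance) comes first.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print(eigvals)        # approximately [10.769, 0.231]
print(eigvecs[:, 0])  # first principal direction; its sign is arbitrary
```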
Numerical Example – V
1. Compute Covariance Matrix
2. Perform Eigen Value Decomposition
3. Select the No. of Principal Components
4. Reduce the Dimension

𝑿 𝒀
3 5
9 7
6 5
11 8
8 6
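The surviving slide text does not show how the number of components is selected. One common criterion, assumed here, is to keep the smallest k whose eigenvalues account for a chosen fraction of the total variance; the 0.95 threshold below is an illustrative assumption.

```python
import numpy as np

data = np.array([[3, 5], [9, 7], [6, 5], [11, 8], [8, 6]], dtype=float)
eigvals = np.linalg.eigvalsh(np.cov(data, rowvar=False))[::-1]  # descending order

# Fraction of total variance explained by each component, and cumulatively.
ratios = eigvals / eigvals.sum()
cumulative = np.cumsum(ratios)

threshold = 0.95  # assumed target fraction of variance to retain
k = int(np.searchsorted(cumulative, threshold) + 1)

print(ratios)  # first component explains roughly 98% of the variance
print(k)       # 1
```

With this data a single component already retains about 98% of the variance, which is consistent with reducing the example to one dimension.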
Numerical Example – XII
1. Compute Covariance Matrix
2. Perform Eigen Value Decomposition
3. Select the No. of Principal Components
4. Reduce the Dimension
𝑿 𝒀 𝒁
3 5 -4.53
9 7 1.783
6 5 -1.7468
11 8 4.012
8 6 0.4822
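The Z column can be reproduced by centering the data and projecting it onto the first principal direction. In this sketch the eigenvector's sign is fixed so that its first entry is positive; that choice is an assumption made to match the table, since an eigenvector's sign is otherwise arbitrary.

```python
import numpy as np

data = np.array([[3, 5], [9, 7], [6, 5], [11, 8], [8, 6]], dtype=float)
centered = data - data.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(data, rowvar=False))
u1 = eigvecs[:, np.argmax(eigvals)]  # direction with the largest eigenvalue

# Fix the sign so the result lines up with the table above.
if u1[0] < 0:
    u1 = -u1

# Project each centered example onto u1 to get its 1-D coordinate.
z = centered @ u1
print(np.round(z, 3))
```

The printed values agree with the Z column to about three decimal places.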
Applying PCA
Supervised Learning Speedup
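Training set: $(x^{(1)}, y^{(1)}), (x^{(2)}, y^{(2)}), \dots, (x^{(m)}, y^{(m)})$
(say, computer vision, where the input is a 100 x 100 image)
Extract inputs
○ Unlabelled data set: $x^{(1)}, x^{(2)}, \dots, x^{(m)}$
Apply Hypothesis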
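A rough sketch of this workflow under stated assumptions: random arrays stand in for image data (100 features rather than 10,000, to keep it fast), and `fit_pca` and `apply_pca` are hypothetical helper names. The mapping is learned from the training inputs only and then reused unchanged on new inputs, which is the usual practice.

```python
import numpy as np

def fit_pca(X, k):
    """Learn the mean and the top-k principal directions from training inputs."""
    mu = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X - mu, rowvar=False))
    top = np.argsort(eigvals)[::-1][:k]
    return mu, eigvecs[:, top]

def apply_pca(X, mu, U):
    """Map inputs to the reduced representation using a previously learned mapping."""
    return (X - mu) @ U

# Hypothetical stand-in data: 50 labelled training examples, 10 new examples.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 100))
y_train = rng.integers(0, 2, size=50)
X_new = rng.normal(size=(10, 100))

mu, U = fit_pca(X_train, k=20)       # fit on training inputs only
Z_train = apply_pca(X_train, mu, U)  # new training set is (Z_train, y_train)
Z_new = apply_pca(X_new, mu, U)      # same mapping applied before predicting

print(Z_train.shape, Z_new.shape)    # (50, 20) (10, 20)
```

A learning algorithm would then be trained on `(Z_train, y_train)` and its hypothesis applied to `Z_new`.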
Applications of PCA
Compression
○ Reduce memory/disk needed to store data
○ Speed up learning algorithm
Visualization
○ k = 2 or k = 3
Bad Use of PCA (1)