20090504 StudyGroup : PCA

Toolkits of PCA

Study Group

Presenter : Chin-Hui Chen

Theory :

◦ 1. Scenario

◦ 2. What is PCA?

◦ 3. How to minimize Squared-Error ?

◦ 4. Dimensionality Reduction

Toolkit :

◦ A list of PCA toolkits

◦ Demo

Consider a 2-dimensional space

Principal component analysis (PCA) involves a mathematical procedure that transforms a number of possibly correlated variables into a smaller number of uncorrelated variables called “principal components”.

What can PCA do ?

◦ Dimensionality Reduction

For example :

◦ e.g. {x1, x2, x3, x4} ; xi = (v1, v2)

◦ A set of M basis vectors for projection

◦ e.g. {u1}

The basis vectors are orthonormal (unit length, pairwise inner product 0)

M << D (each point is represented in M dimensions instead of D)

◦ e.g. xi = (p1)
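
As a minimal sketch of the projection step, the snippet below projects four 2-D points onto a single unit basis vector u1, so each xi = (v1, v2) becomes xi = (p1). The data values and the choice of u1 are assumptions made up for illustration; they are not from the slides, and no mean subtraction is done at this stage.

```python
import math

def dot(u, v):
    """Inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical 2-D data set {x1, x2, x3, x4}, each xi = (v1, v2)
points = [(2.0, 1.0), (3.0, 2.0), (4.0, 3.0), (5.0, 4.0)]

# Hypothetical basis {u1}: one unit vector along the diagonal
u1 = (1 / math.sqrt(2), 1 / math.sqrt(2))
assert abs(dot(u1, u1) - 1.0) < 1e-12  # unit length

# Each point is now described by a single coordinate p1 = u1 . xi
projected = [dot(x, u1) for x in points]
print(projected)
```

With more than one basis vector, each point would get one coordinate per vector, and the vectors would also have to be pairwise orthogonal.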

What is PCA ? (2)

Consider a D-dimensional space

◦ Given n points : {x1, x2, …, xn}

◦ xi is a D-dim vector

How to

◦ 1. Find a point that minimizes the squared error

◦ 2. Find a line that minimizes the squared error

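◦ Goal : Find x0 s.t. min. $J_0(x_0) = \sum_{k=1}^{n} \lVert x_0 - x_k \rVert^2$

◦ Let $m = \frac{1}{n} \sum_{k=1}^{n} x_k$ (the sample mean).

How to ? - Point

∴ x0 = m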
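
The step from the goal to the result can be filled in with a short derivation. The label J_0 for the squared-error criterion is an assumption here; m is the sample mean of the n points.

```latex
\begin{aligned}
J_0(x_0) &= \sum_{k=1}^{n} \lVert x_0 - x_k \rVert^2
          = \sum_{k=1}^{n} \lVert (x_0 - m) - (x_k - m) \rVert^2 \\
         &= n \lVert x_0 - m \rVert^2
            - 2 (x_0 - m)^t \sum_{k=1}^{n} (x_k - m)
            + \sum_{k=1}^{n} \lVert x_k - m \rVert^2 \\
         &= n \lVert x_0 - m \rVert^2 + \sum_{k=1}^{n} \lVert x_k - m \rVert^2
\end{aligned}
```

The middle term vanishes because the deviations from the mean sum to zero. The last term does not depend on x0, so the criterion is smallest when x0 = m.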

◦ 1. Find a point that minimizes the squared error

◦ 2. Find a line that minimizes the squared error

L : xk’ − x0 = ak e

xk’ = x0 + ak e = m + ak e

L : xk’ = m + ak e (e : unit vector along the line, ak : signed distance of xk’ from m)

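Goal :

Find a1…an that minimize $J_1(a_1, \dots, a_n, e) = \sum_{k=1}^{n} \lVert (m + a_k e) - x_k \rVert^2$

How to ? – Line

Differentiate with respect to each ak and set the result to zero: $\partial J_1 / \partial a_k = 2 a_k - 2 e^t (x_k - m) = 0$, so ak = et(xk-m)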
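
The result ak = et(xk-m) can be checked numerically. The sketch below fixes a hypothetical line (through m, along a unit vector e) and one hypothetical point xk, then confirms that moving away from ak in either direction increases the squared error. All numbers and the helper name `line_error` are illustrative assumptions.

```python
import math

def dot(u, v):
    """Inner product of two vectors given as sequences."""
    return sum(a * b for a, b in zip(u, v))

def line_error(a, x, m, e):
    """Squared distance between x and the point m + a*e on the line."""
    return sum((mi + a * ei - xi) ** 2 for mi, ei, xi in zip(m, e, x))

m = (4.0, 5.0)                              # hypothetical point on the line
e = (1 / math.sqrt(2), 1 / math.sqrt(2))    # hypothetical unit direction
xk = (5.0, 7.0)                             # hypothetical data point

# Closed-form coefficient: projection of (xk - m) onto e
ak = dot(e, [xi - mi for xi, mi in zip(xk, m)])
best = line_error(ak, xk, m, e)

# Any other coefficient gives a larger squared error
for delta in (-0.5, -0.01, 0.01, 0.5):
    assert line_error(ak + delta, xk, m, e) > best

print(ak, best)
```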

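How to ? – Line

Then, how about e ? Substituting ak = et(xk-m) into J1 :

$J_1(e) = -\sum_{k=1}^{n} [e^t (x_k - m)]^2 + \sum_{k=1}^{n} \lVert x_k - m \rVert^2$

The second term is independent of e.

Let $S = \sum_{k=1}^{n} (x_k - m)(x_k - m)^t$

How to ? – Line

J’1(e)= -etSe (the part of J1 that depends on e)

Use a Lagrange multiplier : maximize etSe subject to ete = 1

$u = e^t S e - \lambda (e^t e - 1)$, $\partial u / \partial e = 2 S e - 2 \lambda e = 0$ -> Se = λe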

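How to ? – Line

◦ What is S ?

Covariance matrix of the data (computed from the centered points xk-m; a constant factor such as 1/n does not change its eigenvectors)

◦ Assume D-dim : S is a D × D matrix

How to ? – Line

Se = λe, and we know S.

Then, what is e ? An eigenvector of S (the one with the largest eigenvalue λ, since etSe = λ).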

How to ? – Line

Summary :

◦ Find a line : xk’= m + ake

ak = et(xk-m)

Se = λe ; e = eigenvectors of covariance matrix.

◦ In a D-dim space, S has D eigenvectors.
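
The summary can be sketched for the 2-D case in plain Python. The data set is hypothetical, and S is taken to be the sum of outer products of the centered points. For a symmetric 2×2 matrix the direction of the largest eigenvalue is available in closed form as the angle 0.5·atan2(2·Sxy, Sxx − Syy); this shortcut is used here instead of a general eigen solver.

```python
import math

# Hypothetical 2-D data set
points = [(1.0, 2.0), (3.0, 3.0), (5.0, 7.0), (7.0, 8.0)]
n = len(points)

# Mean m and centered points xk - m
m = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
centered = [(p[0] - m[0], p[1] - m[1]) for p in points]

# S = sum of (xk - m)(xk - m)^t, a symmetric 2x2 matrix
sxx = sum(c[0] * c[0] for c in centered)
sxy = sum(c[0] * c[1] for c in centered)
syy = sum(c[1] * c[1] for c in centered)

# Unit eigenvector e for the largest eigenvalue (closed form in 2-D)
theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
e = (math.cos(theta), math.sin(theta))

# Eigenvalue lambda = e^t S e
Se = (sxx * e[0] + sxy * e[1], sxy * e[0] + syy * e[1])
lam = e[0] * Se[0] + e[1] * Se[1]

# Coefficients ak = e^t (xk - m) and points on the line xk' = m + ak e
a = [e[0] * c[0] + e[1] * c[1] for c in centered]
on_line = [(m[0] + ak * e[0], m[1] + ak * e[1]) for ak in a]

# Remaining squared error after fitting the line
error = sum((q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2
            for p, q in zip(points, on_line))
print(m, e, lam, error)
```

The remaining error equals the total scatter minus λ, which is why picking the eigenvector with the largest eigenvalue gives the best line.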

How to ? – conclusion

Dimensionality Reduction

Consider a 2-dim space …

X1 = (a,b)

X2 = (c,d)

X1 = (a’,b’)

X2 = (c’,d’)

We are going to do …

X1 = (a’)

X2 = (c’)

Dimensionality Reduction

We want to prove :

◦ The new axes of the data are uncorrelated.

◦ Data : {x1, x2, … ,xn}

◦ Let X=[x1-m x2-m … xn-m] (one centered point per column), m = mean, so that S = XXT

◦ Eigendecomposition : Se = λe gives the eigenvectors {e1,…,em}

Dimensionality Reduction

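E = [e1 e2 … em]

$SE = [\lambda_1 e_1 \;\; \lambda_2 e_2 \;\; \dots \;\; \lambda_m e_m]$

$= [e_1 \;\; e_2 \;\; \dots \;\; e_m] \, \mathrm{diag}(\lambda_1, \lambda_2, \dots, \lambda_m)$

= ED, where D = diag(λ1, …, λm)

S = EDE-1 (= EDET, since the eigenvectors are orthonormal)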

Dimensionality Reduction

We want to know the new covariance matrix of the projected vectors.

E = [e1 e2 … em]

Y = ETX

$S_Y = Y Y^T = E^T X X^T E = E^T S E = E^T (E D E^{-1}) E = D$ (using S = XXT and ETE = I)

Dimensionality Reduction

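SY =D

1. SY is diagonal -> the new axes are uncorrelated

2. The better an axis represents the data -> the larger the variance along that axis -> the larger λ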
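
The claim SY = D can be checked numerically in 2-D. This is a sketch on a hypothetical data set: it builds both eigenvectors of S from the closed-form principal angle (valid only for a symmetric 2×2 matrix), projects the centered points with Y = ETX, and recomputes the scatter matrix of the projected points. The helper name `scatter` is an illustrative assumption.

```python
import math

def scatter(vectors):
    """Sum of outer products v v^T for a list of 2-D vectors."""
    sxx = sum(v[0] * v[0] for v in vectors)
    sxy = sum(v[0] * v[1] for v in vectors)
    syy = sum(v[1] * v[1] for v in vectors)
    return [[sxx, sxy], [sxy, syy]]

# Hypothetical 2-D data set
points = [(1.0, 2.0), (3.0, 3.0), (5.0, 7.0), (7.0, 8.0)]
n = len(points)
m = (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)
X = [(p[0] - m[0], p[1] - m[1]) for p in points]  # centered points

S = scatter(X)

# Orthonormal eigenvectors e1, e2 of the symmetric 2x2 matrix S
theta = 0.5 * math.atan2(2 * S[0][1], S[0][0] - S[1][1])
e1 = (math.cos(theta), math.sin(theta))
e2 = (-math.sin(theta), math.cos(theta))

# Y = E^T X: coordinates of each centered point along e1 and e2
Y = [(e1[0] * x[0] + e1[1] * x[1], e2[0] * x[0] + e2[1] * x[1]) for x in X]

S_Y = scatter(Y)
print(S_Y)  # off-diagonal entries are numerically zero
```

The diagonal of S_Y holds the two eigenvalues of S, largest first, so the first new axis carries most of the spread in the data.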

Dimensionality Reduction

Conclusion :

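If we want to reduce the dimension from D to M (M<<D) :

1. Find S (the covariance matrix of the data)

2. Compute the eigenvalues and eigenvectors of S

3. Select the eigenvectors of the top M eigenvalues

4. Project the data onto them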
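
The four steps can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the method of any toolkit listed in these slides: the 3-D data set is hypothetical, S is the unnormalized scatter matrix, and the eigenvectors are found with power iteration plus deflation, a simple technique swapped in to avoid external libraries. The function names are made up for this example.

```python
import math
import random

def mean_vector(points):
    n, d = len(points), len(points[0])
    return [sum(p[i] for p in points) / n for i in range(d)]

def scatter_matrix(points, m):
    """Step 1: S = sum of (x - m)(x - m)^T over all points."""
    d = len(m)
    S = [[0.0] * d for _ in range(d)]
    for p in points:
        c = [p[i] - m[i] for i in range(d)]
        for i in range(d):
            for j in range(d):
                S[i][j] += c[i] * c[j]
    return S

def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def top_eigenpairs(S, M, iters=1000, seed=0):
    """Steps 2-3: top M (eigenvalue, eigenvector) pairs of symmetric S,
    found by power iteration with deflation."""
    rng = random.Random(seed)
    d = len(S)
    A = [row[:] for row in S]
    pairs = []
    for _ in range(M):
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = mat_vec(A, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = sum(x * y for x, y in zip(v, mat_vec(A, v)))
        pairs.append((lam, v))
        # Deflate: remove the component just found from A
        for i in range(d):
            for j in range(d):
                A[i][j] -= lam * v[i] * v[j]
    return pairs

def project(points, m, pairs):
    """Step 4: coordinates of each centered point along the eigenvectors."""
    return [[sum(e[i] * (p[i] - m[i]) for i in range(len(m)))
             for _, e in pairs] for p in points]

# Hypothetical data: D = 3, reduced to M = 2
points = [(1.0, 2.0, 0.5), (2.0, 4.1, 0.4), (3.0, 5.9, 0.6),
          (4.0, 8.2, 0.5), (5.0, 9.8, 0.4), (6.0, 12.1, 0.6)]
m = mean_vector(points)
S = scatter_matrix(points, m)
pairs = top_eigenpairs(S, M=2)
projected = project(points, m, pairs)
print([lam for lam, _ in pairs])
```

Power iteration converges slowly when two eigenvalues are nearly equal, so a library eigen solver is usually the better choice for real data.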

Dimensionality Reduction

Toolkits

C & Java

◦ Fionn Murtagh's Multivariate Data Analysis Software and Resources

◦ http://astro.u-strasbg.fr/~fmurtagh/mda-sw/

Perl

◦ PDL::PCA

Matlab

◦ Statistics Toolbox™ : princomp

Weka

◦ weka.attributeSelection.PrincipalComponents

(http://www.laps.ufpa.br/aldebaro/weka/feature_selection.html)

C & Java

◦ Fionn Murtagh's Multivariate Data Analysis Software and Resources

◦ http://astro.u-strasbg.fr/~fmurtagh/mda-sw/

C:

Download: pca.c

Compile: cc pca.c -lm -o pcac

Run: ./pcac spectr.dat 36 8 R > pcaout.c.txt

Java :

Download: JAMA, PCAcorr.java

Compile: javac -classpath Jama-1.0.2.jar PCAcorr.java

Run: java PCAcorr iris.dat > pcaout.java.txt

