
Based on:
1. "Statistics and Data Analysis in Geology", J.C. Davis, New York, John Wiley & Sons, 2nd ed., 1996
2. "Chemometrics: A Textbook", D.L. Massart, B.G.M. Vandeginste, S.N. Deming, Y. Michotte, and L. Kaufman, Amsterdam, Elsevier, 1988
3. "Course note: Multivariate Data Analysis and Chemometrics", B. Jørgensen, Department of Statistics, University of Southern Denmark, 2003
4. "Multi- and Megavariate Data Analysis: Principles and Applications", L. Eriksson, E. Johansson, N. Kettaneh-Wold, and S. Wold, UMETRICS, 2001
5. Matlab online manual, The MathWorks, Inc.

**PART II: Procedures of PCA**

1. 1st Step: Pre-treatment of Data Matrix-Scaling
2. 2nd Step: Calculation of covariance matrix
3. 3rd Step: Calculation of eigenvalues and eigenvectors of covariance matrix
4. 4th Step: Calculation of scores

1st Step: Pre-treatment of Data Matrix
1) Pre-treatment of Data Matrix-Scaling (adapted from "Multi- and Megavariate Data Analysis: Principles and Applications", L. Eriksson, E. Johansson, N. Kettaneh-Wold, and S. Wold, UMETRICS, 2001)
2) Example of Data Matrix: X

1) Pre-treatment of Data Matrix-Scaling
• Unless the data are normalized, a variable with a large variance will dominate.
• The most common scaling technique is unit variance (UV) scaling: each column (variable) of the data matrix is divided by the standard deviation $s_k$ of that variable (rows are samples, columns are variables):

$$\begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix} \;\longrightarrow\; \begin{bmatrix} a_{11}/s_{k1} & \cdots & a_{1n}/s_{kn} \\ \vdots & & \vdots \\ a_{m1}/s_{k1} & \cdots & a_{mn}/s_{kn} \end{bmatrix}$$

After UV scaling, the length (standard deviation) of each variable is identical.

1) Pre-treatment of Data Matrix-Scaling (continued)
After UV scaling the length of each variable is identical; however, the mean values still remain different. Therefore mean centering is applied as a second part of the data pre-treatment:
Step 1) The average value of each variable is calculated.
Step 2) That average is subtracted from the data.

1) Pre-treatment of Data Matrix-Scaling (continued)
Note: Unit Variance Scaling + Mean Centering = Auto-Scaling.
Sometimes we don't need UV scaling, e.g., when all variables share the same unit, as with spectroscopic data.
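The pre-treatment steps above (UV scaling followed by mean centering, i.e. auto-scaling) can be sketched in NumPy. The slides use Matlab; this Python equivalent and its small 4×2 matrix are illustrative, not the course data set.

```python
import numpy as np

# Illustrative 4x2 data matrix (rows = samples, columns = variables)
X = np.array([[2.1, 10.0],
              [1.5, 40.0],
              [0.9, 25.0],
              [1.2, 55.0]])

# Step 1: unit variance (UV) scaling -- divide each column by its
# standard deviation s_k (ddof=1 matches the (n-1) denominator used later)
s = X.std(axis=0, ddof=1)
X_uv = X / s

# Step 2: mean centering -- subtract the column means
X_auto = X_uv - X_uv.mean(axis=0)   # UV scaling + mean centering = auto-scaling

# After auto-scaling every variable has mean 0 and unit variance
print(X_auto.mean(axis=0))          # ~[0, 0]
print(X_auto.var(axis=0, ddof=1))   # ~[1, 1]
```

Note that mean centering does not change the variance, so the order of the two steps does not affect the result.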

2) Example of Data Matrix: X
As an example, suppose we have bivariate observations: 20 samples (sample nos. 1-20, the elements H, Li, Be, B, C, N, O, F, Na, Mg, Al, Si, P, S, Cl, K, Ca, Sc, Ti, and V) with 2 variables, the Martynov-Batsanov electronegativity (X1) and Zunger's pseudopotential core radii sum (X2). The size of the data matrix will be 20×2.
(Table: sample no., element, X1, and X2 for each of the 20 samples.)

2) Example of Data Matrix: X (continued)
(Figure: scatter plot of X, X2 versus X1, with both axes running from 0.0 to 4.0.)

2) Example of Data Matrix: X (continued)
If we choose mean centering as the scaling method, the average of each variable (X1: 1.9995, X2: 1.618) is subtracted from the data, so the new frame after mean centering is centered at the origin.
(Figure: the scatter plot before and after the scaling, with the new frame drawn through the variable means.)

2nd Step: Calculation of covariance matrix
1) Calculation of Covariance matrix (S) of Data Matrix (X)
2) Property of Covariance matrix (S) of Data Matrix (X)

1) Calculation of Covariance matrix (S) of Data Matrix (X):
By the definition of the covariance matrix, for a mean-centered data matrix X with m rows and n columns,

$$S = \operatorname{cov}(X) = \frac{1}{m-1}X^{T}X$$

and S is n×n. From our example, we have 20 rows and 2 columns; therefore the size of the covariance matrix S is 2 by 2:

$$S = \begin{bmatrix} 0.6881 & -0.5929 \\ -0.5929 & 0.9026 \end{bmatrix}$$

This is the covariance matrix from both the original and the scaled (mean-centered) data matrix: mean centering does not change the covariances.
How do we get a covariance matrix, S, via computer? Use the command cov(a) in Matlab, where a is a data matrix.
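The slide's Matlab command cov(a) has a NumPy equivalent; a sketch with illustrative random data (np.cov expects variables in rows by default, hence rowvar=False for a samples×variables matrix):

```python
import numpy as np

# Illustrative data matrix X (m samples x n variables), then mean centering
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Xc = X - X.mean(axis=0)

m = Xc.shape[0]
S_manual = (Xc.T @ Xc) / (m - 1)     # S = (1/(m-1)) X^T X, as on the slide

# Same result with the built-in (the analogue of Matlab's cov(a))
S_builtin = np.cov(X, rowvar=False)  # centering is done internally

print(np.allclose(S_manual, S_builtin))  # True
```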

2) Property of Covariance matrix (S) of Data Matrix (X):

$$S = \begin{bmatrix} 0.6881 & -0.5929 \\ -0.5929 & 0.9026 \end{bmatrix}$$

The diagonal entries are the variance of X1 (0.6881) and the variance of X2 (0.9026); the off-diagonal entries are the covariance between X1 and X2, whose negative sign indicates an inverse correlation.
• Variance (a one-dimensional concept): a measure of the spread of data in a given data set,

$$\operatorname{var}(X) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(X_i - \bar{X})}{n-1}$$

• Covariance (a multi-dimensional concept): a measure of the spread of data between dimensions (variables),

$$\operatorname{cov}(X, Y) = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n-1}$$

where n is the sample number and $\bar{X}$, $\bar{Y}$ are the means of the sets X and Y, respectively.
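The var and cov formulas above can be written out directly and checked against the library routines; the small vectors here are illustrative.

```python
import numpy as np

x = np.array([2.5, 0.5, 2.2, 1.9, 3.1])
y = np.array([2.4, 0.7, 2.9, 2.2, 3.0])
n = len(x)

# var(X) = sum_i (X_i - Xbar)(X_i - Xbar) / (n - 1)
var_x = np.sum((x - x.mean()) ** 2) / (n - 1)

# cov(X, Y) = sum_i (X_i - Xbar)(Y_i - Ybar) / (n - 1)
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (n - 1)

print(np.isclose(var_x, x.var(ddof=1)))        # matches the library variance
print(np.isclose(cov_xy, np.cov(x, y)[0, 1]))  # off-diagonal of the 2x2 cov matrix
```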

2) Property of Covariance matrix (S) of Data Matrix (X) (continued)
If we have an n-dimensional data set, we will have $\frac{n!}{2(n-2)!}$ distinct covariance values. For a 3-dimensional data set (x, y, z), the covariance matrix S will be

$$S = \begin{bmatrix} \operatorname{cov}(x,x) & \operatorname{cov}(x,y) & \operatorname{cov}(x,z) \\ \operatorname{cov}(y,x) & \operatorname{cov}(y,y) & \operatorname{cov}(y,z) \\ \operatorname{cov}(z,x) & \operatorname{cov}(z,y) & \operatorname{cov}(z,z) \end{bmatrix}$$

From the eigenvalue properties, we already know:
1. Since cov(a,b) = cov(b,a), the covariance matrix S is symmetrical.
2. For symmetric matrices, the eigenvectors are always at right angles to each other: ORTHOGONAL! Therefore the eigenvectors of a covariance matrix are orthogonal, a VERY IMPORTANT concept in PCA.
3. If we measure m variables, we can compute an m×m covariance matrix; then we can extract m eigenvalues and m eigenvectors.
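Points 1-3 can be checked numerically using the document's own 2×2 covariance matrix S; a small sketch:

```python
import numpy as np

S = np.array([[ 0.6881, -0.5929],
              [-0.5929,  0.9026]])

# 1. S is symmetrical: cov(a, b) = cov(b, a)
assert np.allclose(S, S.T)

# 2. Eigenvectors of a symmetric matrix are orthogonal
eigvals, P = np.linalg.eigh(S)      # eigh: decomposition for symmetric matrices
dot = P[:, 0] @ P[:, 1]
print(abs(dot) < 1e-12)             # True: the eigenvectors are at right angles

# 3. m variables -> m x m covariance matrix -> m eigenvalue/eigenvector pairs
print(len(eigvals) == S.shape[0])   # True
```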

3rd Step of PCA: Calculation of eigenvalues and eigenvectors of covariance matrix
1) Calculation of Eigenvalues of Covariance matrix (S)
2) Calculation of the corresponding Eigenvectors of Covariance matrix (S)
3) Graphical representations of eigenvalues and eigenvectors
4) Summary of eigenvalues and eigenvectors
5) Advantages of PCA

1) Calculation of Eigenvalues of Covariance matrix (S)

$$S = \begin{bmatrix} 0.6881 & -0.5929 \\ -0.5929 & 0.9026 \end{bmatrix}$$

S is symmetrical, so its eigenvectors are orthogonal (90°). The eigenvalues follow from the characteristic equation:

$$\begin{vmatrix} 0.6881-\lambda & -0.5929 \\ -0.5929 & 0.9026-\lambda \end{vmatrix} = 0$$

$$(0.6881-\lambda)(0.9026-\lambda) - (0.5929)^2 = 0$$

$$\lambda^2 - 1.5907\lambda + 0.27 = 0$$

$$\lambda = \frac{1.5907 \pm \sqrt{(1.5907)^2 - 4 \times 0.27}}{2}$$

$$\lambda_1 = 1.3978, \qquad \lambda_2 = 0.1928$$

These are the eigenvalues of covariance matrix S. See the next slide for the corresponding eigenvectors.
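The quadratic above can be solved directly to check the eigenvalues; the coefficients are trace(S) = 1.5907 and det(S) ≈ 0.27, so this is a sketch of the same hand calculation.

```python
import numpy as np

S = np.array([[ 0.6881, -0.5929],
              [-0.5929,  0.9026]])

# Characteristic polynomial: lambda^2 - trace(S)*lambda + det(S) = 0
tr, det = np.trace(S), np.linalg.det(S)
disc = np.sqrt(tr**2 - 4 * det)
lam1 = (tr + disc) / 2
lam2 = (tr - disc) / 2
print(round(lam1, 4), round(lam2, 4))  # ~1.3979 and 0.1928 (the slide rounds to 1.3978)
```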

2) Calculation of the corresponding Eigenvectors of Covariance matrix (S)
For eigenvalue $\lambda_1 = 1.3978$:

$$\begin{bmatrix} 0.6881-1.3978 & -0.5929 \\ -0.5929 & 0.9026-1.3978 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} -0.7097 & -0.5929 \\ -0.5929 & -0.4952 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$

For eigenvalue $\lambda_2 = 0.1928$:

$$\begin{bmatrix} 0.6881-0.1928 & -0.5929 \\ -0.5929 & 0.9026-0.1928 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0.4953 & -0.5929 \\ -0.5929 & 0.7098 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$

Is it possible to solve these systems uniquely by hand? No: each system is singular (its two rows are proportional), so the eigenvector is determined only up to an arbitrary scale factor. So computerized solutions are indispensable! See next slide!

2) Calculation of the corresponding Eigenvectors of Covariance matrix (S)
In Matlab, use the eigenvector-decomposition function. The command is [P, Λ] = eig(S). Then you will get the following in Matlab:

$$\Lambda = \begin{bmatrix} 0.1928 & 0 \\ 0 & 1.3978 \end{bmatrix}$$

These are the eigenvalues that we already calculated. The corresponding eigenvectors are the columns of

$$P = \begin{bmatrix} -0.7675 & -0.6411 \\ -0.6411 & 0.7675 \end{bmatrix}$$

i.e., for $\lambda_1 = 1.3978$ the eigenvector is $(-0.6411, 0.7675)$, and for $\lambda_2 = 0.1928$ it is $(-0.7675, -0.6411)$.
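The Matlab call [P, Λ] = eig(S) corresponds to numpy.linalg.eig; a sketch. Note that NumPy returns the eigenvalues in no guaranteed order and each eigenvector only up to sign, which is exactly the mirror-image effect discussed on the next slide.

```python
import numpy as np

S = np.array([[ 0.6881, -0.5929],
              [-0.5929,  0.9026]])

eigvals, P = np.linalg.eig(S)        # columns of P are the eigenvectors
order = np.argsort(eigvals)[::-1]    # sort so that lambda1 >= lambda2
eigvals, P = eigvals[order], P[:, order]

print(np.round(eigvals, 4))          # ~[1.3979, 0.1928]
print(np.round(P, 4))                # columns ~(0.6411, -0.7675) and (0.7675, 0.6411), up to sign

# Check the defining relation S v = lambda v for each pair
for lam, v in zip(eigvals, P.T):
    assert np.allclose(S @ v, lam * v)
```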

2) Calculation of the corresponding Eigenvectors of Covariance matrix (S) (continued)
For $\lambda_1 = 1.3978$, the eigenvector $(-0.6411, 0.7675)$ solves

$$\begin{bmatrix} -0.7097 & -0.5929 \\ -0.5929 & -0.4952 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$

but the eigenvector can also be $x_1 = 0.6411$ and $x_2 = -0.7675$.
For $\lambda_2 = 0.1928$, the eigenvector $(-0.7675, -0.6411)$ solves

$$\begin{bmatrix} 0.4953 & -0.5929 \\ -0.5929 & 0.7098 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0$$

but the eigenvector can also be $x_1 = 0.7675$ and $x_2 = 0.6411$.
Note: an eigenvector is only defined up to its sign. Because of this effect, we sometimes get mirror images in PCA!

3) Graphical representations of eigenvalues and eigenvectors
• Covariance matrix of the scaled data matrix:

$$S = \begin{bmatrix} 0.6881 & -0.5929 \\ -0.5929 & 0.9026 \end{bmatrix}$$

• Eigenvalues of S: $\lambda_1 = 1.3978$ and $\lambda_2 = 0.1928$, which we already know. The eigenvalues are the lengths of the principal axes of the ellipse (cf. in 3D, an ellipsoid).
• Eigenvectors of S: the orientations of the principal axes of the ellipse.
For $\lambda_1 = 1.3978$, eigenvectors $(-0.6411, 0.7675)$ or $(0.6411, -0.7675)$: the slope of the major axis is the ratio of the eigenvector components, $-0.7675/0.6411$ (negative slope).
For $\lambda_2 = 0.1928$, eigenvectors $(-0.7675, -0.6411)$ or $(0.7675, 0.6411)$: the slope of the minor axis is $0.6411/0.7675$ (positive slope).

3) Graphical representations of eigenvalues and eigenvectors (continued)
(Figure: the mean-centered data (both axes from -2.5 to 2.5) with the ellipse and its principal axes overlaid. PC1 (Principal Component 1): the major axis, whose slope is the ratio of the eigenvector of the largest eigenvalue ($\lambda_1$). PC2 (Principal Component 2): the minor axis, whose slope is the ratio of the eigenvector of the second-largest eigenvalue ($\lambda_2$). The two axes are orthogonal (90°), and the eigenvalues give the lengths of the principal axes of the ellipse.)

4) Summary of eigenvalues and eigenvectors
From the covariance matrix

$$S = \begin{bmatrix} 0.6881 & -0.5929 \\ -0.5929 & 0.9026 \end{bmatrix}$$

we know the following:
1. Variable X1 contributes 0.6881/1.5907 = 43.26% of the total variance.
2. Variable X2 contributes 0.9026/1.5907 = 56.74% of the total variance.
3. The total variance (the trace of the matrix) is 0.6881 + 0.9026 = 1.5907.
How about the eigenvalues

$$\Lambda = \begin{bmatrix} 0.1928 & 0 \\ 0 & 1.3978 \end{bmatrix}?$$

1. The total of the eigenvalues is 0.1928 + 1.3978 = 1.5906, the same as the total variance.
2. The eigenvalues represent the lengths of the two principal axes of the ellipse; therefore those axes represent the total variance of the data set.
3. The first principal axis contains 1.3978/1.5906 = 87.88% of the total variance; the second principal axis represents 0.1928/1.5906 = 12.12%.
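The bookkeeping above (trace = total variance = sum of eigenvalues; each eigenvalue's share = variance explained by that axis) can be reproduced in a few lines; a NumPy sketch:

```python
import numpy as np

S = np.array([[ 0.6881, -0.5929],
              [-0.5929,  0.9026]])

eigvals = np.sort(np.linalg.eigvalsh(S))[::-1]   # [lambda1, lambda2], descending

total_variance = np.trace(S)                     # 0.6881 + 0.9026 = 1.5907
print(np.isclose(eigvals.sum(), total_variance)) # True: trace = sum of eigenvalues

explained = 100 * eigvals / eigvals.sum()
print(np.round(explained, 2))                    # ~[87.88, 12.12] -> PC1 carries ~87.88%
```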

5) Advantages of PCA
From this example, let's suppose we need to reduce our system to only one variable.
• If we discard either variable X2 or X1, we lose 56.74% or 43.26% of the total variance.
• If, however, we convert our data set to scores on the first principal axis (PC1), we lose only 12.12% of the variation in our data set.
This is a big advantage of PCA!
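The claim above can be demonstrated numerically: projecting onto PC1 and reconstructing, the fraction of variation lost is exactly the smaller eigenvalue's share of the total. The data here are random and illustrative, not the 20-element data set.

```python
import numpy as np

# Illustrative correlated 2-variable data, then mean centering
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2)) @ np.array([[1.0, 0.0], [-0.6, 0.7]])
Xc = X - X.mean(axis=0)

S = np.cov(Xc, rowvar=False)
eigvals, P = np.linalg.eigh(S)        # ascending: eigvals[0] is the smaller one
p1 = P[:, -1]                         # eigenvector of the largest eigenvalue (PC1)

# Keep only the PC1 scores, then map back to the original axes
scores1 = Xc @ p1
X_approx = np.outer(scores1, p1)

lost = np.sum((Xc - X_approx) ** 2) / np.sum(Xc ** 2)
print(np.isclose(lost, eigvals[0] / eigvals.sum()))  # True: loss = lambda2 / total
```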

4th Step of PCA: Calculation of scores
1) Mathematical representations of transformation of axes
2) Calculation of scores
3) Proof of scores by overlapping onto the scaled data set

1) Mathematical representations of transformation of axes from X1-X2 to PC1-PC2 (PC: Principal Component)
(Figure: the mean-centered data with the PC1 and PC2 axes drawn. The slope of the major axis is the ratio of the eigenvector of the largest eigenvalue ($\lambda_1$); the slope of the minor axis is the ratio of the eigenvector of the second-largest eigenvalue ($\lambda_2$).)

$$PC_1 = \alpha_1 X_1 + \alpha_2 X_2$$

where the $\alpha$'s are the elements of the first eigenvector, and

$$PC_2 = \beta_1 X_1 + \beta_2 X_2$$

where the $\beta$'s are the elements of the second eigenvector.
For the 1st principal axis: $PC1_i = 0.6411\,X1_i - 0.7675\,X2_i$
For the 2nd principal axis: $PC2_i = 0.7675\,X1_i + 0.6411\,X2_i$
The PC values are the scores; the eigenvector elements are the loadings.

2) Calculation of scores
For the 1st principal axis: $PC1_i = 0.6411\,X1_i - 0.7675\,X2_i$
For the 2nd principal axis: $PC2_i = 0.7675\,X1_i + 0.6411\,X2_i$
The PC values are the scores; the eigenvector elements are the loadings. Therefore, in matrix form,

$$[X]\left([P]^{T}\right)^{-1} = [T]$$

and, since the loading matrix P is orthonormal, $([P]^{T})^{-1} = [P]$, so $[T] = [X][P]$, where
[T] is the n×m matrix of principal-component scores,
[X] is the n×m matrix of observations (the scaled original data matrix), and
[P] is the m×m square matrix of eigenvectors, or loading matrix.
(The slide then lists the resulting 20×2 score matrix T for the example data.)
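The score calculation T = X·P can be sketched directly. The loading matrix below uses the eigenvectors derived above; the three observation rows are illustrative, since the full 20×2 data matrix did not survive extraction.

```python
import numpy as np

# Loadings: columns are the eigenvectors for lambda1 and lambda2 (slide values)
P = np.array([[ 0.6411, 0.7675],
              [-0.7675, 0.6411]])

# A few illustrative mean-centered observations (not the real element data)
X = np.array([[ 0.10,  0.35],
              [-1.70,  0.81],
              [ 0.55, -0.48]])

# Since P is orthonormal, ([P]^T)^-1 = [P], so the slide's formula reduces to T = X P
T = X @ P

# Each row of T holds the scores: PC1_i = 0.6411*X1_i - 0.7675*X2_i, etc.
assert np.allclose(T[:, 0], 0.6411 * X[:, 0] - 0.7675 * X[:, 1])
print(np.allclose(P.T @ P, np.eye(2), atol=1e-3))  # True: loadings are orthonormal
```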

3) Proof of scores by overlapping onto the scaled data set
(Figure: the original (scaled) data set in the X1-X2 frame next to the score plot in the PC1-PC2 frame; the score plot is simply a rotation of the scaled data set onto the principal axes.)

