NEED FOR PCA
➢ High-dimensional data is hard to process because its many features are often redundant or inconsistent.
➢ Extra dimensions increase computation time and make data processing harder.

BENEFITS OF PCA
➢ Better perspective and less complexity.
➢ Better visualization.
➢ Size reduction.

WHAT IS PCA?
The main idea of principal component analysis (PCA) is to reduce the dimensionality of a data set consisting of many variables that are correlated with each other, whether strongly or weakly, while retaining as much of the variation present in the data set as possible.

STEPS TO PERFORM PCA
STEP 1: STANDARDIZATION OF THE DATA
STEP 2: COMPUTE THE COVARIANCE MATRIX
STEP 3: CALCULATE THE EIGENVECTORS AND EIGENVALUES
STEP 4: COMPUTING THE PRINCIPAL COMPONENTS
STEP 5: REDUCING THE DIMENSIONS OF THE DATA

STANDARDIZATION OF DATA

COVARIANCE MATRIX

EIGENVECTORS AND EIGENVALUES
➢ Principal components are a new set of variables constructed from the initial set of variables. They compress and retain most of the useful information that was scattered among the initial variables.
➢ Eigenvectors are vectors whose direction does not change when a linear transformation is applied to them; the transformation only scales them.
➢ Eigenvalues are the scalars by which the respective eigenvectors are scaled: Av = λv for a matrix A, eigenvector v, and eigenvalue λ.

COMPUTING THE PRINCIPAL COMPONENTS
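As a rough illustration, the five steps listed earlier can be sketched in Python with NumPy. This is a minimal sketch, not a reference implementation: the function name pca, the toy data, and the choice to leave constant features unscaled are assumptions made for this example only.

```python
import numpy as np


def pca(X, n_components):
    """Reduce X (n_samples x n_features) to n_components dimensions."""
    X = np.asarray(X, dtype=float)

    # Step 1: standardize each feature to zero mean and unit variance.
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # assumption: leave constant features unscaled
    Z = (X - mean) / std

    # Step 2: covariance matrix of the standardized features.
    cov = np.cov(Z, rowvar=False)

    # Step 3: eigenvalues and eigenvectors (eigh suits symmetric matrices).
    eigenvalues, eigenvectors = np.linalg.eigh(cov)

    # Step 4: order the components by decreasing eigenvalue, so PC1 is first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Step 5: keep the leading components and project the data onto them.
    components = eigenvectors[:, :n_components]
    projected = Z @ components
    explained_ratio = eigenvalues[:n_components] / eigenvalues.sum()
    return projected, components, explained_ratio


# Toy data (made up for illustration): the third feature nearly copies the first.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, b, a + 0.1 * rng.normal(size=200)])

projected, components, explained_ratio = pca(X, n_components=2)
print(projected.shape)   # 200 samples, now described by 2 components
print(explained_ratio)   # share of total variance captured by PC1 and PC2
```

np.linalg.eigh is used here because a covariance matrix is symmetric; it returns eigenvalues in ascending order, which is why the sketch re-sorts them before selecting components.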
➢ PC1 is the most significant principal component and captures the maximum possible variation in the data.
➢ PC2 is the second most significant principal component and captures the maximum of the remaining variation, and so on.

THANK YOU