ExNo 5
Aim
To write a program to illustrate Principal Component Analysis.
Procedure
Step 1: Standardize the dataset.
Step 2: Calculate the covariance matrix for the features in the dataset.
Step 3: Calculate the eigenvalues and eigenvectors for the covariance matrix.
Step 4: Sort eigenvalues and their corresponding eigenvectors.
Step 5: Pick k eigenvalues and form a matrix of eigenvectors.
Step 6: Transform the original matrix.
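The six steps above can be sketched directly in NumPy. This is a minimal illustration under stated assumptions, not the program used in this exercise: the function name pca_project, the synthetic 100 x 5 data, and the choice k = 2 are all hypothetical.

```python
import numpy as np


def pca_project(X, k):
    """Project X (samples x features) onto its top-k principal components."""
    # Step 1: standardize each feature to zero mean and unit variance.
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X - mean) / std

    # Step 2: covariance matrix of the standardized features.
    cov = np.cov(Z, rowvar=False)

    # Step 3: eigenvalues and eigenvectors (eigh suits symmetric matrices).
    eigvals, eigvecs = np.linalg.eigh(cov)

    # Step 4: sort eigenvalues, and their eigenvectors, in descending order.
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]

    # Step 5: keep the eigenvectors of the k largest eigenvalues.
    W = eigvecs[:, :k]

    # Step 6: transform the standardized data into the new k-dimensional space.
    return Z @ W, eigvals, W


# Hypothetical data: 100 samples, 5 features, the last nearly a copy of the first.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
X[:, 4] = X[:, 0] + 0.1 * rng.normal(size=100)

projected, eigvals, W = pca_project(X, k=2)
print(projected.shape)
print(eigvals / eigvals.sum())  # fraction of variance explained per component
```

Using np.linalg.eigh rather than np.linalg.eig is a design choice here: the covariance matrix is symmetric, so eigh returns real eigenvalues and orthonormal eigenvectors.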
Program
Output Interpretation
>1 0.542 (0.048)
>2 0.713 (0.048)
>3 0.720 (0.053)
>4 0.723 (0.051)
>5 0.725 (0.052)
>6 0.730 (0.046)
>7 0.805 (0.036)
>8 0.800 (0.037)
>9 0.814 (0.036)
>10 0.816 (0.034)
>11 0.819 (0.035)
>12 0.819 (0.038)
>13 0.819 (0.035)
>14 0.853 (0.029)
>15 0.865 (0.027)
>16 0.865 (0.027)
>17 0.865 (0.027)
>18 0.865 (0.027)
>19 0.865 (0.027)
>20 0.865 (0.027)
Each line of the output lists the number of principal components kept, followed by the classification accuracy of the model. Accuracy generally rises as the number of dimensions increases, from 0.542 with one component to 0.865 with 15. On this dataset, the results suggest a trade-off between the number of dimensions and the classification accuracy of the model.

Interestingly, there is no improvement beyond 15 components: accuracy stays at 0.865 for 15 through 20 components. This matches our definition of the problem, where only the first 15 components contain information about the class and the remaining five are redundant.

A new row of data with 20 columns is provided, automatically transformed to 15 components, and fed to the logistic regression model in order to predict the class label.
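The evaluation and prediction described above might look like the following scikit-learn sketch. It is an assumption-laden illustration, not the original program: the dataset parameters (1000 samples, random_state=7), the cross-validation settings, the helper name evaluate, and the reuse of the first training row as the "new" row are all chosen here for the example, so the printed scores need not match the listing above.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline

# Assumed synthetic problem: 20 features, 15 informative and 5 redundant.
X, y = make_classification(n_samples=1000, n_features=20, n_informative=15,
                           n_redundant=5, random_state=7)


def make_model(n_components):
    """PCA followed by logistic regression, chained in one pipeline."""
    return Pipeline([
        ("pca", PCA(n_components=n_components)),
        ("clf", LogisticRegression(max_iter=1000)),
    ])


def evaluate(n_components):
    """Mean and standard deviation of cross-validated accuracy."""
    cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=1)
    scores = cross_val_score(make_model(n_components), X, y,
                             scoring="accuracy", cv=cv)
    return scores.mean(), scores.std()


# Compare a few component counts (a subset, to keep the run short).
for k in (5, 10, 15, 20):
    acc_mean, acc_std = evaluate(k)
    print(f">{k} {acc_mean:.3f} ({acc_std:.3f})")

# Fit the 15-component pipeline on all data, then predict one 20-column row.
final = make_model(15)
final.fit(X, y)
row = X[:1]  # stands in for a new, unseen row of 20 columns
pred = final.predict(row)
print("Predicted class:", pred[0])
```

Because PCA sits inside the pipeline, the 20-column row is reduced to 15 components automatically before it reaches the classifier, and the projection is refit on each training fold during cross-validation.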
Result
Thus the program to illustrate Principal Component Analysis was executed successfully.