ML Exp6
PART A

(PART A: TO BE REFERRED BY STUDENTS)

Experiment No. 6
A.1 Aim:
To implement Principal Component Analysis (PCA).

A.2 Prerequisite:
Python Basic Concepts

A.3 Outcome:
Students will be able to implement Principal Component Analysis.

A.4 Theory:

Principal Component Analysis (PCA) is a popular unsupervised learning technique for reducing the
dimensionality of data. It increases interpretability while minimizing information loss. It helps
identify the most significant features in a dataset and makes the data easy to plot in 2D and 3D.
PCA does this by finding a sequence of linear combinations of the original variables.

Each principal component is a straight line, with a direction and a magnitude, that captures as
much of the variance in the data as possible. The components are mutually orthogonal
(perpendicular), and projecting the data onto them gives an orthogonal projection onto a
lower-dimensional space.
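
In matrix form (this notation is added here for clarity and is not from the lab manual): if $X$ is the standardized $n \times d$ data matrix and the columns of $W = [w_1, \dots, w_k]$ are the top-$k$ unit eigenvectors of its covariance matrix, the projected data are

$$ Z = X W, \qquad W^{\top} W = I_k, $$

so each row of $Z$ holds the coordinates of one sample along the first $k$ principal components.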

Common applications include:

● PCA is used to visualize multidimensional data.

● It is used to reduce the number of dimensions in healthcare data.

● PCA can help resize an image.

● It can be used in finance to analyze stock data and forecast returns.

● PCA helps to find patterns in high-dimensional datasets.


PCA is computed in four steps (a NumPy sketch of these steps follows the list):

1. Normalize the data

Standardize the data before performing PCA so that each feature has mean 0 and variance 1.

2. Build the covariance matrix

Construct the square matrix whose entries are the pairwise covariances between the features of the
standardized dataset.

3. Find the Eigenvectors and Eigenvalues

Compute the eigenvectors (unit vectors) and eigenvalues of the covariance matrix. Each eigenvalue
is the scalar by which the covariance matrix scales its eigenvector, and it measures the variance
captured along that direction.
4. Sort the eigenvectors by their eigenvalues, from highest to lowest, and select the desired
number of principal components.
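
The following is a minimal from-scratch sketch of these four steps in NumPy. The toy random matrix, the variable names, and the choice k = 2 are illustrative assumptions, not part of the lab manual.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))                 # toy data: 100 samples, 4 features

# Step 1: standardize each feature to mean 0, variance 1
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 2: covariance matrix of the standardized features (4 x 4)
cov = np.cov(X_std, rowvar=False)

# Step 3: eigenvalues and eigenvectors (eigh handles symmetric matrices)
eigvals, eigvecs = np.linalg.eigh(cov)

# Step 4: sort by eigenvalue, highest first, and keep the top k components
order = np.argsort(eigvals)[::-1]
k = 2                                         # illustrative choice
W = eigvecs[:, order[:k]]                     # projection matrix (4 x 2)
X_pca = X_std @ W                             # data in the 2D principal subspace

print("Explained variance ratio:", eigvals[order[:k]] / eigvals.sum())

Up to the sign of each component, X_pca here matches what sklearn's PCA(n_components=2).fit_transform(X_std) would return on the same matrix.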

In short, PCA is a widely used unsupervised learning method for performing dimensionality
reduction.
PART B
(PART B : TO BE COMPLETED BY STUDENTS)

(Students must submit the soft copy as per the following segments within two hours of the
practical. The soft copy must be uploaded to Blackboard or, if Blackboard access is not available,
emailed to the concerned lab in-charge faculty at the end of the practical.)

Roll. No. BE-A Name: Nishad Sutar


Class: BE-Comps A Batch: A1
Date of Experiment: 11/08/2025 Date of Submission: 18/08/2025
Grade:

B.1 Software Code written by student:


import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Step 1: Load dataset (Iris dataset as example)
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target

print(X[:5])
print(y)

# Step 2: Standardize the data
scaler = StandardScaler()
X_std = scaler.fit_transform(X)

# Step 3: Apply PCA
pca = PCA(n_components=2)  # reduce to 2D
X_pca = pca.fit_transform(X_std)

# Step 4: Plot PCA result
plt.figure(figsize=(8, 6))
plt.scatter(X_pca[:, 0], X_pca[:, 1], c=y, cmap='viridis', edgecolor='k', s=80)
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.title("PCA on Iris Dataset (2D)")
plt.colorbar(label='Target Class')
plt.show()

# Explained variance
print("Explained variance ratio:", pca.explained_variance_ratio_)
B.2 Input and Output:
B.3 Observations and learning:
In this experiment, I implemented Principal Component Analysis (PCA), a widely used unsupervised
learning technique for dimensionality reduction. I observed that the primary goal of PCA is to simplify
high-dimensional data by transforming it into a lower-dimensional space while retaining as much of the
original variance as possible. This is achieved by identifying principal components, which are new,
uncorrelated variables that capture the most significant patterns in the data. The process involved several
key steps: standardizing the data, constructing a covariance matrix to understand inter-feature
relationships, and then calculating the eigenvectors and eigenvalues to determine the direction and
magnitude of the new components. I also noted its diverse applications, from visualizing complex
datasets to analyzing financial data and resizing images.
B.4 Conclusion:
In conclusion, the aim of implementing Principal Component Analysis was successfully achieved. This
experiment provided a practical understanding of how to reduce the complexity of a dataset, making it
more interpretable and easier to analyze without significant information loss. The process reinforces the
value of PCA as an essential tool in data preprocessing, enabling more efficient handling and
visualization of high-dimensional data. Its ability to find the most important features makes it a
fundamental technique for pattern recognition and data compression in various fields.
