
Report on PCA Face detection assignment

Name: Anirban Bhattacharjee
Roll: 211000010
Subject: Machine Learning Algorithms
Instructor: Dr. Abhishek Sharma

i) Abstract:
Face detection is a widely researched area in computer vision, and Principal Component
Analysis (PCA) is a popular technique for dimensionality reduction in this setting.
In this report, we used PCA to reduce the number of features from 4096 to 256 in a
dataset containing 40 classes of human faces. We trained an SVM classifier on the
reduced feature set and evaluated its performance using precision, recall, and F1 score
metrics. The classifier achieved a macro precision of 0.92 and a macro recall of 0.87,
with an F1 score of 0.87.

ii) Theory:
Principal Component Analysis is a widely used technique for dimensionality reduction
in machine learning. It projects the data onto a small number of orthogonal directions,
the principal components, along which the variance of the data is largest. This reduces
the dimensionality of the dataset while retaining most of its variance. In face
detection, PCA is used to compress each image into a few components that capture most
of the variation between faces.

In our study, we used PCA to reduce the number of features in a dataset containing 40
classes of human faces. Each sample in the dataset had 4096 features, and we extracted
256 features using PCA. We then trained an SVM classifier on the reduced feature set.
The SVM classifier is a popular supervised machine learning algorithm used for
classification problems. It works by finding the hyperplane that separates the
classes of data points with the largest margin.
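
The pipeline described above can be sketched as follows. This is a minimal
illustration, not the code used for the assignment: it assumes scikit-learn and NumPy,
and it replaces the face images with synthetic data of the same shape (40 classes,
4096 features). The 10 samples per class, the 75/25 split, and the linear kernel are
assumptions chosen only for the example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the face data (assumption): 40 classes,
# 10 samples per class, 4096 features per sample.
n_classes, per_class, n_features = 40, 10, 4096
y = np.repeat(np.arange(n_classes), per_class)
centers = rng.normal(size=(n_classes, n_features))
X = centers[y] + 0.5 * rng.normal(size=(y.size, n_features))

# A stratified split keeps every class in both the training and the test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# PCA reduces 4096 features to 256; the SVM is trained on the reduced features.
# PCA is fitted on the training data only, so the test data does not leak in.
model = make_pipeline(
    PCA(n_components=256, random_state=0),
    SVC(kernel="linear"),  # kernel choice is an assumption
)
model.fit(X_train, y_train)

reduced = model.named_steps["pca"].transform(X_test)
y_pred = model.predict(X_test)
macro_f1 = f1_score(y_test, y_pred, average="macro")
print(f"reduced test shape: {reduced.shape}, macro F1: {macro_f1:.2f}")
```

Note that PCA cannot return more components than the smaller of the number of training
samples and the number of features, so extracting 256 components requires at least 256
training samples.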

We evaluated the performance of the SVM classifier using precision, recall, and F1
score metrics. Precision is the fraction of true positives among the total predicted
positive samples, recall is the fraction of true positives among the total actual positive
samples, and F1 score is the harmonic mean of precision and recall. With 40 classes,
each metric is computed per class, and the macro value is the unweighted mean over the
classes.
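
To make these definitions concrete, here is a small sketch that computes per-class
precision, recall, and F1 score and then macro-averages them. The helper names
`per_class_scores` and `macro_average` and the toy labels are hypothetical and are
used only for illustration.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of (precision, recall, f1) over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Toy example with three classes (made-up labels).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

scores = per_class_scores(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(scores)
harmonic = 2 * macro_p * macro_r / (macro_p + macro_r)
print(f"macro precision={macro_p:.3f} recall={macro_r:.3f} F1={macro_f1:.3f}")
print(f"harmonic mean of macro precision and recall={harmonic:.3f}")
```

In this toy example the macro F1 score (about 0.66) differs from the harmonic mean of
the macro precision and macro recall (about 0.69), because the macro F1 score averages
the per-class F1 scores. A reported macro F1 score therefore need not equal the
harmonic mean of the reported macro precision and recall.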

iii) Results:
Our SVM classifier achieved an F1 score of 0.87, with macro values of precision and
recall of 0.92 and 0.87, respectively. A classification
report was generated, showing the precision, recall, F1 score, and support for each of
the 40 classes of human faces. The classification report indicated that the SVM
classifier performed well for most of the classes, with some variation in the precision
and recall values.

iv) Additional Results:
In addition to the evaluation metrics and classification report, we also plotted two
graphs to visualize the dataset and the results of PCA.

The first plot is a countplot of the target variable classes, which shows the distribution
of the 40 classes of human faces in the dataset. This plot helps to identify any class
imbalance and provides an overview of the dataset.
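
The counts that such a plot displays can also be checked directly. The sketch below
uses only the Python standard library. The label list is a made-up stand-in for the
target variable, and the figure of 10 samples per class is an assumption.

```python
from collections import Counter

# Stand-in for the target variable (assumption): 40 classes, 10 samples each.
labels = [cls for cls in range(40) for _ in range(10)]

# Number of samples per class, i.e. the bar heights of a countplot.
counts = Counter(labels)

# The dataset is balanced when every class has the same number of samples.
is_balanced = len(set(counts.values())) == 1
print(f"{len(counts)} classes, balanced: {is_balanced}")
```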

The second plot shows the cumulative explained variance against the number of PCA
components, that is, how much of the variance in the dataset is explained by the first
k extracted components. This plot helps to determine the number of components to use
for dimensionality reduction.
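
The curve in such a plot can be computed as sketched below. This is an illustration
under stated assumptions rather than the code used for the report: it uses NumPy only,
runs on synthetic data, and the helper names `cumulative_explained_variance` and
`components_for` are hypothetical.

```python
import numpy as np


def cumulative_explained_variance(X):
    """Cumulative fraction of total variance explained by the leading components."""
    X_centered = X - X.mean(axis=0)
    # The squared singular values of the centered data are proportional
    # to the variances along the principal components.
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    variances = singular_values ** 2
    return np.cumsum(variances) / variances.sum()


def components_for(cumulative, threshold):
    """Smallest number of components whose cumulative ratio reaches threshold."""
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))


rng = np.random.default_rng(0)
# Synthetic data (assumption): 5 strong latent directions plus weak noise.
latent = rng.normal(size=(200, 5))
X = latent @ rng.normal(size=(5, 50)) + 0.1 * rng.normal(size=(200, 50))

cumulative = cumulative_explained_variance(X)
k = components_for(cumulative, 0.95)
print(f"{k} components explain {cumulative[k - 1]:.1%} of the variance")
```

Plotting `cumulative` against the component index gives the cumulative variance curve,
and `components_for` reads off the number of components needed for a chosen variance
threshold.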

The countplot showed that the dataset had a balanced distribution of the 40 classes of
human faces, with each class having an equal number of samples. The cumulative plot
showed that the first 100 PCA components explained approximately 80% of the
variance in the dataset, while the first 200 components explained approximately 95%
of the variance. Based on this plot, we selected 256 components for dimensionality
reduction, which explained approximately 98% of the variance in the dataset.

v) Conclusion:
In conclusion, we used PCA to reduce the number of features in a dataset containing
40 classes of human faces. We extracted 256 features using PCA and trained an SVM
classifier on the reduced feature set. Our results showed that the SVM classifier
achieved high precision and recall, with an F1 score of 0.87. The classification report
indicated that the SVM classifier performed well for most of the classes, with some
variation in the precision and recall values. This study demonstrates the effectiveness
of PCA and SVM for face detection tasks and shows the potential for further
improvements in performance with additional optimization and refinement.
