
PRINCIPAL COMPONENT ANALYSIS

Presented by:
Zoha Ahmed (F20604017)
Faria Shoaib (F20604022)
Qundeel Saleem (F20604028)
Eesha Noor (F20604030)
Aliza Mushtaq (F20604039)
INTRODUCTION

 Principal component analysis, or PCA, is a dimensionality reduction method.
 It is often used to reduce the dimensionality of large data sets by transforming a large set of variables into a smaller one that still contains most of the information in the large set.
 An orthogonal transformation is used to transform correlated variables into linearly uncorrelated variables, called principal components.
 Principal components show the directions of the most variation in the data.
 The first, and most important, principal component captures the maximum variance in the data.
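In practice, this reduction is a few lines of code. Below is a minimal sketch using scikit-learn's PCA on a made-up synthetic data set (the data, seed, and two-component choice are illustrative, not from these slides):

import numpy as np
from sklearn.decomposition import PCA

# Illustrative synthetic data: 100 samples, 5 correlated features.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 2))           # two underlying factors
mixing = rng.normal(size=(2, 5))             # spread over five variables
X = latent @ mixing + 0.1 * rng.normal(size=(100, 5))

# Keep two principal components: uncorrelated linear combinations
# of the original variables, ordered by explained variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                 # (100, 2)
print(pca.explained_variance_ratio_)   # PC1 captures the largest share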

CONTD.
INTRODUCTION

 The second principal component describes the remaining variance in the data and is uncorrelated with the first principal component.
 Be aware that the PCA transformation depends on how the original variables are scaled relative to one another.
 Before applying PCA, the data columns must be normalized. Note that the new coordinates no longer correspond to the original, system-produced variables.
 Your data set becomes less interpretable after applying PCA.
 PCA is not the right transformation for your application if interpretability of the results is crucial to your analysis.
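Because PCA is scale-sensitive, standardizing the columns first changes the result. A small sketch of the effect, again on made-up data (the column scales are chosen only to exaggerate the point):

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Two features on very different scales (e.g. metres vs. milligrams).
rng = np.random.default_rng(1)
X = np.column_stack([rng.normal(0, 1, 200),       # small-scale column
                     rng.normal(0, 1000, 200)])   # large-scale column

# Without normalization, the large-scale column dominates PC1.
print(PCA(n_components=1).fit(X).explained_variance_ratio_)

# Standardizing gives every column unit variance before PCA.
X_std = StandardScaler().fit_transform(X)
print(PCA(n_components=1).fit(X_std).explained_variance_ratio_)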
WHY PCA?

Curse of Dimensionality:
The issues that arise when dealing with high-dimensional data. A problem set may have:
 a large number of features,
 making the model extremely slow,
 and even making it difficult to find a solution.

Hence, more data is good, but more detailed (higher-dimensional) data might not be.
CONTD.

SOLUTION: DIMENSIONALITY REDUCTION

 Data can be represented by fewer dimensions.
 Reduce dimensionality by feature elimination.

Example
DIMENSIONALITY REDUCTION TECHNIQUES

Principal Component Analysis (PCA):

 PCA transforms the original variables of a data set into a new set of variables called principal components.
 The first principal component (PC1) has the largest possible variance.
 Each succeeding component has the highest possible variance under the constraint that it is orthogonal to the preceding components.
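Both properties are easy to check numerically. A sketch with scikit-learn on illustrative random data (not from these slides):

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
X = rng.normal(size=(300, 4)) @ rng.normal(size=(4, 4))  # correlated features

pca = PCA().fit(X)

# Variances come out in decreasing order: PC1 has the largest.
print(pca.explained_variance_)

# The component directions are mutually orthogonal unit vectors.
print(np.round(pca.components_ @ pca.components_.T, 6))  # ~ identity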
PRINCIPAL COMPONENTS

Example
WORKING
 Plot some data on a graph and calculate the average measurement for each variable in the sample.
 With these averages, we can find the center of our data and shift the data so that its center lies on the origin of the graph (0, 0).
 Now we try to fit a line to this data: we start by drawing a random line that passes through the origin of the graph.
 Then we rotate the line until it fits the data as well as it can, with the condition that the line must still go through the origin.
 In the end, some line will fit best.
 PCA finds this line as the one that maximizes the sum of squared distances from the projected points to the origin (see the sketch below).
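A minimal NumPy sketch of this search, scanning candidate lines through the origin of centered 2-D data (the brute-force scan is purely illustrative; real implementations use the eigendecomposition described next):

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2)) @ np.array([[3.0, 1.0], [1.0, 1.0]])

X_centered = X - X.mean(axis=0)  # shift the center of the data to (0, 0)

# Try many candidate lines through the origin and score each one by the
# sum of squared distances of the projected points from the origin.
angles = np.linspace(0.0, np.pi, 1800, endpoint=False)
best_score, best_dir = -np.inf, None
for theta in angles:
    direction = np.array([np.cos(theta), np.sin(theta)])  # unit vector
    projections = X_centered @ direction  # signed distances along the line
    score = np.sum(projections ** 2)
    if score > best_score:
        best_score, best_dir = score, direction

print("best-fitting direction (PC1):", best_dir)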

CONTD.
WORKING

 To find the line that fits the data best, measure the distance from each data point to the candidate line.
 Then find the line that minimizes these distances. (Minimizing these perpendicular distances is equivalent to maximizing the distances of the projected points from the origin, since each point's squared distance from the origin is fixed.)
 We normalize the best-fitting vector so that its length is one unit.
 Such a unit vector is called a singular vector or eigenvector.
 Eigenvectors are anchored at the origin and only indicate a direction. The sum of squared distances (SSD) of the projected points from the origin is a scaling value for that direction; it is a representation of the eigenvalue.
 The square root of the SSD is the singular value.
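The same quantities fall out of a standard eigendecomposition. A sketch on the centered data from the previous snippet, showing how the eigenvalue and singular value relate to the SSD (the n − 1 divisor is the sample-covariance convention):

import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 2)) @ np.array([[3.0, 1.0], [1.0, 1.0]])
X_centered = X - X.mean(axis=0)

# Eigendecomposition of the covariance matrix yields unit eigenvectors
# (directions) and eigenvalues (variance along each direction).
cov = np.cov(X_centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)  # ascending order

pc1 = eigenvectors[:, -1]                  # direction of PC1
projections = X_centered @ pc1
ssd = np.sum(projections ** 2)             # sum of squared distances

n = X_centered.shape[0]
print(eigenvalues[-1], ssd / (n - 1))      # eigenvalue = SSD / (n - 1)
print(np.sqrt(ssd))                        # singular value = sqrt(SSD)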
ALGORITHM

1. Data acquisition
2. Standardization of the data set
3. Calculation of the covariance matrix
4. Calculation of eigenvalues and eigenvectors
5. Choosing components and forming a feature vector
6. Deriving a new data set
IMPLEMENTATION
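A minimal from-scratch NumPy sketch of the six steps above (the data set is synthetic and the choice of two components is illustrative):

import numpy as np

# Step 1: data acquisition (illustrative synthetic data set).
rng = np.random.default_rng(4)
X = rng.normal(size=(150, 3)) @ rng.normal(size=(3, 3))

# Step 2: standardization (zero mean, unit variance per column).
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Step 3: covariance matrix of the standardized data.
cov = np.cov(X_std, rowvar=False)

# Step 4: eigenvalues and eigenvectors of the covariance matrix.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# Step 5: choose the top-k components (largest eigenvalues) and stack
# their eigenvectors into a feature vector (projection matrix).
k = 2
order = np.argsort(eigenvalues)[::-1]
feature_vector = eigenvectors[:, order[:k]]

# Step 6: derive the new data set by projecting onto the components.
X_new = X_std @ feature_vector
print(X_new.shape)  # (150, 2)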
CONCLUSION

 Dimensionality reduction is, simply, the process of reducing the dimension of your feature set.
 Your feature set could be a data set with a hundred columns (i.e. features), or it could be an array of points that make up a large sphere in three-dimensional space.
 The principal components of a collection of points in a real p-dimensional space are a sequence of direction vectors, where the i-th vector is the direction of a line that best fits the data while being orthogonal to the first i − 1 vectors.
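In symbols (a standard formulation, not taken from these slides): with the data matrix X centered so that every column has zero mean, the direction vectors are

\begin{aligned}
\mathbf{w}_1 &= \operatorname*{arg\,max}_{\lVert \mathbf{w} \rVert = 1} \lVert X\mathbf{w} \rVert^2, \\
\mathbf{w}_i &= \operatorname*{arg\,max}_{\lVert \mathbf{w} \rVert = 1,\; \mathbf{w} \perp \mathbf{w}_1, \dots, \mathbf{w}_{i-1}} \lVert X\mathbf{w} \rVert^2 .
\end{aligned}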
