1. Consider the following 2D dataset:

[Figure: scatter plot of the 2D dataset; answer-choice figures not shown]

Which of the following figures correspond to possible values that PCA may return for u^(1) (the first eigenvector / first principal component)? Check all that apply (you may have to check more than one figure).
2. Which of the following is a reasonable way to select the number of principal components k?
(Recall that n is the dimensionality of the input data and m is the number of input examples.)
- Choose k to be the largest value so that at least 99% of the variance is retained.
- Use the elbow method.
- Choose k to be the smallest value so that at least 99% of the variance is retained.
- Choose k to be 99% of m (i.e., k = 0.99 * m, rounded to the nearest integer).
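One common rule is to choose the smallest k so that at least 99% of the variance is retained. A minimal numpy sketch of that rule (the dataset, seed, and variable names are made up for illustration):

```python
import numpy as np

# Hypothetical dataset: 100 examples, 5 features with very different variances.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) @ np.diag([5.0, 2.0, 1.0, 0.1, 0.05])
X = X - X.mean(axis=0)  # mean-normalize before PCA

# SVD of the covariance matrix Sigma = (1/m) X^T X; S holds its eigenvalues
# (the per-component variances) in decreasing order.
Sigma = (X.T @ X) / X.shape[0]
_, S, _ = np.linalg.svd(Sigma)

# Smallest k with sum(S[:k]) / sum(S) >= 0.99, i.e. at least 99% retained.
retained = np.cumsum(S) / np.sum(S)
k = int(np.searchsorted(retained, 0.99)) + 1
print("k =", k, "variance retained =", retained[k - 1])
```

Since `retained` is nondecreasing, `searchsorted` finds the first index whose cumulative ratio reaches the threshold.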
3. Suppose someone tells you that they ran PCA in such a way that "95% of the variance was
retained.” What is an equivalent statement to this?
- [(1/m) Σ_{i=1}^{m} ||x^(i) − x_approx^(i)||²] / [(1/m) Σ_{i=1}^{m} ||x^(i)||²] ≥ 0.05
- [(1/m) Σ_{i=1}^{m} ||x^(i) − x_approx^(i)||²] / [(1/m) Σ_{i=1}^{m} ||x^(i)||²] ≤ 0.05
- [(1/m) Σ_{i=1}^{m} ||x^(i) − x_approx^(i)||²] / [(1/m) Σ_{i=1}^{m} ||x^(i)||²] ≥ 0.95
- [(1/m) Σ_{i=1}^{m} ||x^(i) − x_approx^(i)||²] / [(1/m) Σ_{i=1}^{m} ||x^(i)||²] ≤ 0.95
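The ratio in these options (average squared projection error over average squared norm) is exactly 1 minus the fraction of variance retained, which can be checked numerically. The data below is made up, and x_approx is the reconstruction from the top-k principal components:

```python
import numpy as np

# Made-up centered data; X_approx holds the reconstructions x_approx^(i).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4)) @ np.diag([10.0, 3.0, 0.5, 0.1])
X = X - X.mean(axis=0)
m = X.shape[0]

Sigma = (X.T @ X) / m
U, S, _ = np.linalg.svd(Sigma)

k = 2
U_reduce = U[:, :k]
X_approx = X @ U_reduce @ U_reduce.T  # project onto top-k, then map back

# Average squared projection error over average squared norm:
error_ratio = np.sum((X - X_approx) ** 2) / np.sum(X ** 2)
variance_retained = np.sum(S[:k]) / np.sum(S)

# The two quantities sum to 1, so "95% of the variance retained" is the
# same statement as "error ratio <= 0.05".
print(error_ratio, 1.0 - variance_retained)
```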
4. Which of the following statements are true? Check all that apply.
- Feature scaling is not useful for PCA, since the eigenvector calculation (such as using Octave's svd(Sigma) routine) takes care of it automatically.
- Given an input x ∈ R^n, PCA compresses it to a lower-dimensional vector z ∈ R^k.
- PCA can be used only to reduce the dimensionality of data by 1 (such as 3D to 2D, or 2D to 1D).
- If the input features are on very different scales, it is a good idea to perform feature scaling before applying PCA.
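A sketch of the scaling-then-compression pipeline referred to above: scale the features, then project x ∈ R^n down to z ∈ R^k using the top-k eigenvectors. The data, scales, and seed are hypothetical:

```python
import numpy as np

# Hypothetical features on very different scales (e.g. square feet vs. rooms).
rng = np.random.default_rng(2)
X = np.column_stack([
    rng.normal(2000.0, 800.0, size=50),  # large-scale feature
    rng.normal(3.0, 1.0, size=50),       # small-scale feature
    rng.normal(20.0, 5.0, size=50),
])

# Feature scaling: mean normalization and division by the standard deviation.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# PCA: eigenvectors of the covariance matrix via SVD, keep the top k.
Sigma = (X_scaled.T @ X_scaled) / X_scaled.shape[0]
U, _, _ = np.linalg.svd(Sigma)

k = 2
U_reduce = U[:, :k]        # n x k matrix of the top-k eigenvectors
Z = X_scaled @ U_reduce    # row i is z^(i) = U_reduce^T x^(i), with z in R^k
print(Z.shape)
```

Without the scaling step, the large-scale feature would dominate the covariance matrix and the first principal component.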
5. Which of the following are recommended applications of PCA? Select all that apply.
- To get more features to feed into a learning algorithm.
- Preventing overfitting: Reduce the number of features (in a supervised learning problem) so that there are fewer parameters to learn.
- Data visualization: Reduce data to 2D (or 3D) so that it can be plotted.
- Data compression: Reduce the dimension of your data so that it takes up less memory/disk space.