
MLESA Week-10 Assignment

1. Suppose we use dimensionality reduction as a pre-processing technique, i.e., instead of using all the
features, we reduce the data to ‘k’ dimensions with PCA, and then use these PCA projections as our
features. Which of the following statements is correct?
a: A higher value of ‘k’ means more regularization.
b: A higher value of ‘k’ means less regularization.

(A) Only a
(B) Only b
(C) a and b both
(D) None of these

Solution
(B) only b
In PCA, reducing the number of dimensions (a lower value of k) introduces more regularization by dis-
carding features that contribute less to the variance in the data. This can help to improve the performance
of machine learning models by reducing overfitting.
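
The effect described above can be sketched with a NumPy-only PCA projection (the toy matrix X and the choice k = 2 are assumptions for illustration; the assignment does not prescribe a library):

```python
# Sketch: PCA projection to k dimensions with plain NumPy.
# A smaller k discards more low-variance directions, acting like
# stronger regularization on the downstream model's feature set.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))      # toy data: 100 samples, 5 features

k = 2                              # number of retained components
Xc = X - X.mean(axis=0)            # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_reduced = Xc @ Vt[:k].T          # project onto top-k principal directions

print(X_reduced.shape)             # (100, 2)
```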

2. Which of the following is correct mathematical notation for the Gaussian distribution (also known as the
Normal distribution)?
(A) $\mu(x) = \sqrt{2\pi\sigma^2}\,\exp\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$

(B) $\mathcal{N}(\mu, \sigma^2) = \dfrac{1}{\sqrt{2\pi\sigma^2}}\,\exp\left(-\dfrac{(x-\mu)^2}{2\sigma^2}\right)$

(C) $\sigma(x) = \dfrac{1}{\sqrt{2\pi\mu^2}}\,\exp\left(-\dfrac{(x-\mu)^2}{2}\right)$

(D) $\mathcal{N}(x, \mu) = \sqrt{2\pi\sigma^2}\,\exp\left(-\dfrac{(x-\mu)^2}{2}\right)$

Solution
(B)
The Gaussian distribution (Normal distribution) probability density function is given by:

$$f(x \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$
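
The density above can be evaluated directly (a minimal sketch; the helper name `gaussian_pdf` is ours for illustration):

```python
# Evaluate the Gaussian PDF from the solution at a point.
import math

def gaussian_pdf(x, mu, sigma2):
    """N(mu, sigma^2) density at x, matching the formula above."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

# At the mean, the density peaks at 1/sqrt(2*pi*sigma^2).
p = gaussian_pdf(0.0, 0.0, 1.0)
print(round(p, 4))   # 0.3989
```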

3. Principal Component Analysis (PCA) is a dimensionality reduction algorithm. Which of the following
statement(s) is/are TRUE about PCA?

(A) PCA is an unsupervised method.
(B) It searches for the directions that data have the largest variance.
(C) Maximum number of principal components ≤ number of features.
(D) All principal components are orthogonal to each other.

Solution
(A),(B),(C),(D)
Principal Component Analysis (PCA) is an unsupervised dimensionality reduction technique widely
used in data analysis and machine learning. It aims to transform high-dimensional data into a lower-
dimensional space while preserving as much of the original variance as possible. PCA achieves this by
identifying the directions in which the data exhibit the largest variance, known as principal components.
These principal components form a new orthogonal basis for the data, allowing for a more compact rep-
resentation. Importantly, PCA requires no knowledge of class labels or outcome variables, making it
applicable to a wide range of datasets. Additionally, the number of principal components is constrained
by the number of original features, ensuring that the transformed data remains interpretable. Moreover,
PCA ensures that the resulting principal components are orthogonal to each other, meaning they cap-
ture distinct and uncorrelated aspects of the data’s variation. Overall, PCA provides a powerful tool
for reducing the dimensionality of data while retaining essential information, facilitating visualization,
exploration, and analysis tasks.
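
Two of the properties above, orthogonality of the components and their count being bounded by the number of features, can be verified numerically (a NumPy sketch on assumed toy data; principal directions are taken as the right singular vectors of the centered data):

```python
# Check PCA properties: components are orthogonal, and there are
# at most as many components as original features.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 4))       # toy data: 50 samples, 4 features
Xc = X - X.mean(axis=0)

# Principal directions = right singular vectors of the centered data.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)

print(Vt.shape[0] <= X.shape[1])                    # True: count <= n_features
print(np.allclose(Vt @ Vt.T, np.eye(Vt.shape[0])))  # True: mutually orthogonal
```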

4. Given that
P (X|Y ) × P (Y )
P (Y |X) =
P (X)

Match the following:

(a) P(Y|X)    i. Evidence
(b) P(X|Y)    ii. Prior
(c) P(X)      iii. Posterior
(d) P(Y)      iv. Likelihood

(A) (a → iii, b → iv, c → i, d → ii)


(B) (a → i, b → iv, c → iii, d → ii)
(C) (a → iii, b → ii, c → i, d → iv)
(D) (a → i, b → iv, c → ii, d → iii)

Solution
(A) (a → iii, b → iv, c → i, d → ii)
P(Y|X) corresponds to iii, the posterior.
P(X|Y) corresponds to iv, the likelihood.
P(X) corresponds to i, the evidence.
P(Y) corresponds to ii, the prior.
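
A small numeric check of Bayes' rule with the four named terms (the probability values here are toy numbers chosen for illustration):

```python
# Bayes' rule: posterior = likelihood * prior / evidence.
prior = 0.3            # P(Y)
likelihood = 0.8       # P(X|Y)
evidence = 0.5         # P(X)

posterior = likelihood * prior / evidence   # P(Y|X)
print(posterior)   # 0.48
```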

5. You draw n i.i.d. random variables X1 , X2 , . . . , Xn from the distribution F, yielding the following sample:
[0, 0, 1, 1, 1, 1, 1, 1, 1, 1] (where n = 10). Suppose the distribution F = Ber(p) with unknown parameter p.
What is $p_{\text{MLE}}$, the Maximum Likelihood Estimator (MLE) of the parameter p?

(A) 1.0
(B) 0.5
(C) 0.8
(D) 0.2

Solution
(C) 0.8
$$p_{\text{MLE}} = \bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$$

Given the sample:

X = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]

The number of successes (1s) in the sample is 8, and the total number of observations is 10.
Therefore, the MLE of p is the proportion of successes:

$$\hat{p}_{\text{MLE}} = \frac{\text{Number of successes}}{\text{Total number of observations}} = \frac{8}{10} = 0.8$$
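
The calculation above is just the sample mean:

```python
# Bernoulli MLE: the estimate of p is the sample mean.
sample = [0, 0, 1, 1, 1, 1, 1, 1, 1, 1]
p_mle = sum(sample) / len(sample)
print(p_mle)   # 0.8
```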

6. There are two boxes. The first box contains 3 white and 2 red balls whereas the second contains 5 white
and 4 red balls. A ball is drawn at random from one of the two boxes and is found to be white. Find the
probability that the ball was drawn from the second box?

(a) 53/50
(b) 50/104
(c) 54/104
(d) 54/44

Solution
Let the first box be A and the second box be B. Then the probability of choosing one box from the two
is P(A) = 1/2 and P(B) = 1/2.
As given in the question, we have to find the probability that the white ball was drawn from the second
box, denoted as P(B|W).
Now, P(W|A) = 3/5 and P(W|B) = 5/9.
According to Bayes’ Theorem, we know that:

$$P(B|W) = \frac{P(W|B) \times P(B)}{P(W|B) \times P(B) + P(W|A) \times P(A)}$$

Substituting the given probabilities:

$$P(B|W) = \frac{\frac{5}{9} \times \frac{1}{2}}{\frac{5}{9} \times \frac{1}{2} + \frac{3}{5} \times \frac{1}{2}} = \frac{\frac{5}{18}}{\frac{5}{18} + \frac{3}{10}} = \frac{\frac{5}{18}}{\frac{50}{180} + \frac{54}{180}} = \frac{\frac{5}{18}}{\frac{104}{180}} = \frac{50}{104}$$

(b) 50/104
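
The same calculation can be reproduced exactly with rational arithmetic:

```python
# Reproduce the two-box Bayes calculation with exact fractions.
from fractions import Fraction

p_A = p_B = Fraction(1, 2)
p_W_given_A = Fraction(3, 5)   # box A: 3 white out of 5 balls
p_W_given_B = Fraction(5, 9)   # box B: 5 white out of 9 balls

posterior = (p_W_given_B * p_B) / (p_W_given_B * p_B + p_W_given_A * p_A)
print(posterior)   # 25/52, which equals 50/104
```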
7. To find the maximum likelihood estimator (MLE) of a parameter θ, we form the likelihood function and
maximize it with respect to θ. Which steps do we take to maximize the likelihood function and solve
for θ?

(a) We take the integral of the likelihood function with respect to θ, set it equal to zero, and solve for θ.
(b) We take the derivative of the likelihood function with respect to θ, set it equal to zero, and solve for
θ.
(c) We set the likelihood function equal to zero, and solve for θ.
(d) Take the natural log of the likelihood function, set it equal to zero, and solve for θ.

Solution
(B)
We take the derivative of the likelihood function with respect to θ, set it equal to zero, and solve for θ.
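
As a numeric illustration of this recipe (an assumption: we use Bernoulli data, where the log-likelihood ℓ(θ) = k·log θ + (n−k)·log(1−θ) has dℓ/dθ = 0 at θ = k/n; the grid search below approximates that stationary point):

```python
# The stationary point of the Bernoulli log-likelihood is theta = k/n.
# A dense grid search over theta approximates the derivative-based solution.
import math

k, n = 8, 10   # 8 successes in 10 trials

def loglik(t):
    return k * math.log(t) + (n - k) * math.log(1 - t)

grid = [i / 1000 for i in range(1, 1000)]
theta_hat = max(grid, key=loglik)
print(theta_hat)   # 0.8
```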

8. Given a full deck of cards, you attempt to draw from it with the goal of getting a queen, replacing the
card and reshuffling the deck after every draw. Which of the following statements is true?

(A) The experiments are independent and identically distributed.


(B) This is an example of Bernoulli trials.
(C) These are not Bernoulli trials due to the presence of multiple classes of cards in the outcome.
(D) The probability of getting a queen is different from the probability of getting a jack.

Solution
(A),(B)
A. The above experiment fits the definition of i.i.d.
B. There are 2 outcomes: draw a queen or not draw a queen, so each experiment is a Bernoulli trial.
C. As clarified in option B.
D. Both have the same probability: $P(Q) = P(J) = \frac{4}{52} = \frac{1}{13}$.
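
A quick check of option D with exact fractions:

```python
# Queen and jack each have 4 cards in a 52-card deck.
from fractions import Fraction

p_queen = Fraction(4, 52)
p_jack = Fraction(4, 52)
print(p_queen == p_jack, p_queen)   # True 1/13
```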

[Common Data for next two questions: Q9-Q10.]
You are given a set of data points by the array:
X = [5.970, 4.700, 6.307, 7.950, 4.445, 4.504, 8.108, 6.551, 4.016, 6.015]
You decide to model it with a Gaussian distribution and perform maximum likelihood estimation.

9. What is the mean?

(A) 6.493
(B) 5.778
(C) 5.857
(D) 4.835

Solution
$$\mu_{ML} = \frac{1}{N} \sum_{n=1}^{N} x_n$$

Given the data points:

X = [5.970, 4.700, 6.307, 7.950, 4.445, 4.504, 8.108, 6.551, 4.016, 6.015]

To find the mean:

$$\text{Mean} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Substituting the values:

$$\text{Mean} = \frac{5.970 + 4.700 + 6.307 + 7.950 + 4.445 + 4.504 + 8.108 + 6.551 + 4.016 + 6.015}{10} = \frac{58.566}{10} = 5.857$$

(C) 5.857
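
The arithmetic above checks out directly:

```python
# MLE mean: the average of the data points.
X = [5.970, 4.700, 6.307, 7.950, 4.445, 4.504, 8.108, 6.551, 4.016, 6.015]
mean = sum(X) / len(X)
print(round(mean, 3))   # 5.857
```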

10. What is the standard deviation?

(A) 1.368
(B) 2.124
(C) 0.988
(D) 1.562

Solution
Given the data points:

X = [5.970, 4.700, 6.307, 7.950, 4.445, 4.504, 8.108, 6.551, 4.016, 6.015]

1. Calculate the mean ($\bar{x}$):

$$\bar{x} = \frac{1}{10} \sum_{i=1}^{10} x_i = \frac{5.970 + 4.700 + 6.307 + 7.950 + 4.445 + 4.504 + 8.108 + 6.551 + 4.016 + 6.015}{10} = \frac{58.566}{10} = 5.857$$

2. Calculate the sum of the squares of the differences between each data point and the mean:

$$\sum_{i=1}^{10} (x_i - \bar{x})^2 = (0.113)^2 + (-1.157)^2 + (0.450)^2 + (2.093)^2 + (-1.412)^2 + (-1.353)^2 + (2.251)^2 + (0.694)^2 + (-1.841)^2 + (0.158)^2 \approx 18.722$$

3. Now, divide the sum obtained by N, where N = 10 (number of data points), to calculate the variance:

$$\text{Variance} = \frac{18.722}{10} = 1.8722$$

4. Finally, take the square root of the variance to find the standard deviation:

$$\text{Standard Deviation} = \sqrt{1.8722} \approx 1.368$$

(A) 1.368
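
The four steps above can be checked in a few lines (note the divide-by-N variance, which is the MLE, not the N−1 sample variance):

```python
# MLE standard deviation: sqrt of the divide-by-N variance.
X = [5.970, 4.700, 6.307, 7.950, 4.445, 4.504, 8.108, 6.551, 4.016, 6.015]
mean = sum(X) / len(X)
var = sum((x - mean) ** 2 for x in X) / len(X)   # divide by N (MLE variance)
std = var ** 0.5
print(round(std, 3))   # 1.368
```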

Question Number Key/Answer
1 B
2 B
3 A,B,C,D
4 A
5 C
6 B
7 B
8 A,B
9 C
10 A
