
1 Give the algorithm for the EM technique. Specify the formulas used clearly.


2 What is the curse of dimensionality?







3 Why is dimensionality reduction important? Support your answer with an example.




4 Cluster the given data using the E-M algorithm:

○ 12, 14, 9, 6, 4, 15
○ Assume learning constant = 1 and initial hypothesis (12, 14).

5 Specify the PCA algorithm with an explanation of each step.

(PCA notes page 11)
○ Preprocessing
○ Calculate sigma (covariance matrix)
○ Calculate eigenvectors with svd
○ Take k vectors from U (Ureduce = U(:, 1:k);)
○ Calculate z (z = Ureduce' * x;)
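The steps above can be sketched in Python with NumPy (a minimal illustration; the notes themselves use Octave, e.g. Ureduce = U(:,1:k), and the data here is made up):

import numpy as np

def pca(X, k):
    """Minimal PCA sketch following the steps above (X is an m x n data matrix)."""
    # Preprocessing: mean-normalize each feature
    X = X - X.mean(axis=0)
    m = X.shape[0]
    # Covariance matrix: Sigma = (1/m) * X^T X
    Sigma = (X.T @ X) / m
    # Eigenvectors via SVD; the columns of U are the principal directions
    U, S, Vt = np.linalg.svd(Sigma)
    # Take the first k vectors of U (Ureduce = U(:, 1:k) in the notes)
    U_reduce = U[:, :k]
    # Project the data: z = Ureduce' * x for each example
    Z = X @ U_reduce
    return Z, U_reduce

X = np.random.rand(100, 5)
Z, U_reduce = pca(X, k=2)   # reduce 5 features to 2
print(Z.shape)              # (100, 2)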










​ 6 What is SVD? Compare SVD with PCA.




7 Following is the output of the E-step in the EM algorithm (2 clusters).

A]. Identify the values of E[z12], E[z22], E[z32], E[z42], E[z52].
B]. Find the new hypothesis.





8 What is the idea behind Classification and Regression Trees (CART)? What are the advantages of using CART?

➢ Idea:

• Partition the input space into rectangles.

• Partition into rectangles by axis-parallel lines.

➢ Advantages:

• Can handle both numerical and categorical data. Can also handle multi-output problems.

• Nonlinear relationships between parameters do not affect tree performance.
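A minimal sketch of the rectangle-partition idea, assuming scikit-learn is available (the dataset and parameters are illustrative only). Each printed rule is a split of the form "feature <= threshold", i.e. an axis-parallel line that cuts the input space into rectangles:

from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Toy 2-D dataset; each internal node of the tree splits on one feature.
X, y = make_classification(n_samples=200, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# The printed rules show the axis-parallel partition of the space.
print(export_text(tree, feature_names=["x1", "x2"]))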
9 Explain how regions are formed in CART, with an example.


➢ If it is a regression problem:

○ Single real-valued output for each region.

○ No matter where the data point falls in the region.

➢ If it is a classification problem:

○ Same class label output for each region.

○ No matter where the data point falls in the region.
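A small illustrative sketch (assuming scikit-learn; the data is made up) showing that a regression tree outputs one constant per region, regardless of where the point falls inside that region:

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# 1-D regression example: the fitted tree splits the input range into
# regions and predicts one constant value per region.
X = np.array([[1], [2], [3], [10], [11], [12]])
y = np.array([1.0, 1.2, 0.9, 5.0, 5.2, 4.9])
reg = DecisionTreeRegressor(max_depth=1).fit(X, y)

# 1.5 and 2.8 fall in the same (left) region, so they get the same
# constant prediction, no matter where in the region they lie.
print(reg.predict([[1.5], [2.8], [11.5]]))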


10 Having the regions, how do we decide which value to output for a given region?



11 Explain pruning of trees with an example.




12 What are ensembles in ML? What is the reason for using ensembles?
• A diverse set of models gives better decisions in comparison to single models.
• This diversification in machine learning is achieved by a technique called ensemble learning.
• Ensemble: a machine learning technique that combines several base models in order to produce one optimal predictive model.
• Reasons for using ensembles:
➢ Weak learners which fail to converge.
➢ A model might perform well on some data and be less accurate on other data.
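One way to see the "combine several base models" idea in code is scikit-learn's VotingClassifier (an illustrative sketch; the dataset and the three base models are arbitrary choices):

from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

# Three diverse base models combined by majority vote into one predictor.
ensemble = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000)),
    ("dt", DecisionTreeClassifier()),
    ("nb", GaussianNB()),
])
print(cross_val_score(ensemble, X, y, cv=5).mean())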

13 What is bagging? Explain the working of the bagging technique.
• Bagging is also called bootstrap aggregation.
• Bootstrapping: create subsets of observations from the original dataset, with replacement.
• Implementation steps of bagging:
■ Step 1: Multiple subsets are created from the original dataset, each with an equal number of tuples, selecting observations with replacement.
■ Step 2: A base model is created on each of these subsets.
■ Step 3: Each model is learned in parallel on its own training set, independently of the others.
■ Step 4: The final predictions are determined by combining the predictions from all the models.
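A minimal sketch of these four steps using NumPy bootstrap sampling and decision trees (assuming scikit-learn; the dataset and the number of models are illustrative):

import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Steps 1-3: bootstrap subsets (sampled with replacement), one base model per subset.
rng = np.random.default_rng(0)
models = []
for _ in range(10):
    idx = rng.integers(0, len(X), size=len(X))    # sample with replacement
    models.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Step 4: combine the predictions (majority vote for binary class labels).
preds = np.array([m.predict(X) for m in models])  # shape: (n_models, n_samples)
final = (preds.mean(axis=0) >= 0.5).astype(int)
print("training accuracy of the bagged ensemble:", (final == y).mean())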

14 Explain the working of the random forest algorithm.
• A random forest classifier or regressor is a bagging technique.
• A forest is made up of trees; similarly, a random forest is a collection of different decision trees.
• Implementation of random forest:
■ Step 1: Select random samples from the given dataset.
■ Step 2: Construct a decision tree for each sample and get a prediction result from each decision tree.
■ Step 3: Select the prediction result with the most votes as the final prediction.
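A minimal usage sketch with scikit-learn's RandomForestClassifier (dataset and hyperparameters are illustrative): each tree is grown on a bootstrap sample, and the trees' predictions are combined by vote.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 100 trees, each trained on a random bootstrap sample of the training data;
# the final class is decided by combining the votes of all the trees.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(rf.score(X_test, y_test))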

15 List various algorithms used in bagging and boosting.

Algorithms:
• Bagging algorithms:
➢ Bagging meta-estimator
➢ Random forest
• Boosting algorithms:
➢ AdaBoost
➢ Gradient Boosting (GBM)
➢ XGBoost
➢ LightGBM
➢ CatBoost

16 What is boosting? Explain the working of the boosting technique.

• Boosting is an ensemble modeling technique that attempts to build a strong classifier from a number of weak classifiers.
• It is done by building a model using weak models in series.
• Boosting steps:
■ 1. A subset is created from the original dataset.
■ 2. Initially, all data points are given equal weights.
■ 3. A base model is created on this subset.
■ 4. This model is used to make predictions on the whole dataset.
■ 5. Errors are calculated using the actual values and predicted values.
■ 6. The observations which are incorrectly predicted are given higher weights.
■ 7. Another model is created and predictions are made on the dataset.
■ 8. Similarly, multiple models are created, each correcting the errors of the previous model.
■ 9. The final model (strong learner) is the weighted mean of all the models (weak learners).
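AdaBoost is one concrete boosting algorithm that follows this re-weighting scheme. A minimal sketch assuming scikit-learn (its default weak learner is a depth-1 decision stump; dataset and number of estimators are illustrative):

from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# 50 weak learners trained in sequence; misclassified observations get
# higher weights each round, and the final prediction is a weighted
# combination of all the weak learners.
boost = AdaBoostClassifier(n_estimators=50).fit(X_train, y_train)
print(boost.score(X_test, y_test))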

​ 17 Compare bagging and boosting techniques.



18 Explain K-fold cross-validation with an example.
K-fold cross-validation:
• Step 1: Randomly split the entire dataset into k folds/subsets.
• Step 2: In each iteration (or kth round), train the model using (k – 1) folds of the dataset and validate/test the model using the kth fold.
• Step 3: Calculate the accuracy for this iteration.
• Step 4: Repeat this process until each of the k folds has served as the validation/test set.
• Step 5: Take the average of all k such accuracies to get the final validation accuracy.
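A minimal example with scikit-learn (5 folds; the dataset and model are arbitrary choices):

from sklearn.datasets import load_iris
from sklearn.model_selection import KFold, cross_val_score
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)

# 5-fold CV: in each round, 4 folds are used for training and the
# remaining fold for validation; the 5 accuracies are then averaged.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=kf)
print(scores)          # one accuracy per fold
print(scores.mean())   # final validation accuracy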

19 Explain stratified K-fold cross-validation with an example.




​ 20 Define support vectors. What is the importance of support vectors?

21 Why are SVMs called maximum margin separators?
• To separate the two classes of data points, there are many possible hyperplanes that could be chosen.
• Our objective is to find the plane that has the maximum margin, i.e., the maximum distance between data points of both classes.
• Maximizing the margin distance provides some reinforcement so that future data points can be classified with more confidence.
• This is why SVMs are also called maximum margin separators.
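A small sketch (assuming scikit-learn; the blobs dataset and C value are illustrative) that fits a linear SVM and reads off the margin width and the support vectors:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated classes; a linear SVM picks the separating
# hyperplane with the largest margin between them.
X, y = make_blobs(n_samples=100, centers=2, random_state=0)
svm = SVC(kernel="linear", C=1e3).fit(X, y)

w, b = svm.coef_[0], svm.intercept_[0]       # hyperplane: w . x + b = 0
print("margin width:", 2 / np.linalg.norm(w))
print("support vectors:\n", svm.support_vectors_)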

22 What are hyperplanes?
- Hyperplanes are decision boundaries that help classify the data points.
- The dimension of the hyperplane depends upon the number of features: with two input features the hyperplane is a line, and with three features it is a two-dimensional plane.



23 Describe the quadratic programming problem in SVM.

24 What is the kernel trick? List various kernel functions used in SVM.


25 Numerical on SVM.






26 What are linearly separable and non-linearly separable data? Support your answer with a diagram.



27 Write a short note on XGBoost.
(Not in the PPT; summarized from GeeksforGeeks.)
XGBoost is an implementation of gradient-boosted decision trees. XGBoost models frequently dominate many Kaggle competitions.

In this algorithm, decision trees are created sequentially. Weights play an important role in XGBoost: weights are assigned to all the independent variables, which are then fed into the decision tree that predicts results.
The weights of variables predicted wrongly by the tree are increased, and these variables are then fed to the second decision tree.
These individual classifiers/predictors are then ensembled to give a strong and more precise model. XGBoost can work on regression, classification, ranking, and user-defined prediction problems.
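A minimal usage sketch, assuming the xgboost Python package and scikit-learn are installed (the dataset and hyperparameters are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier   # assumes the xgboost package is installed

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Gradient-boosted decision trees built sequentially, each new tree
# correcting the errors of the ones before it.
model = XGBClassifier(n_estimators=100, max_depth=3, learning_rate=0.1)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))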


28 Write a short note on support vector regressors.
