
💜

MACHINE LEARNING
1. Define machine learning. What are the different applications of ML? What
is the difference between traditional programming and ML?

💡 Machine learning is an application of AI that enables systems to learn
and improve from experience without being explicitly programmed.
Machine learning focuses on developing computer programs that can
access data and use it to learn for themselves.

applications of ML

https://www.javatpoint.com/applications-of-machine-learning

difference between traditional programming and ML

https://www.enjoyalgorithms.com/blog/introduction-to-machine-learning

2. Which are different methods of learning? Give one example of each
method.
https://www.javatpoint.com/machine-learning-techniques

3. Compare classification and regression.

https://www.javatpoint.com/regression-vs-classification-in-machine-learning

4. Define the terms variance and bias. Explain trade-off between variance
and bias?

bias

difference between the average prediction of our model and the correct
value which we are trying to predict.

high bias ⇒ oversimplified model (underfitting)


Variance

variability of model prediction for a given data point, i.e., a value which
tells us the spread of our predictions

high variance ⇒ model does not generalize to unseen data (overfitting)


mathematically

💡 formula
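
A standard way to write this decomposition is sketched below. It assumes squared-error loss; the symbols (true function f, fitted model \hat{f}, noise variance \sigma^2) are chosen here for illustration and are not taken from the notes.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat{f}(x) - \mathbb{E}[\hat{f}(x)]\big)^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{irreducible error}}
```

The first term matches the bias definition (average prediction minus correct value), the second matches the variance definition (spread of predictions around their average).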

Bias and variance using bulls-eye diagram

Bias Variance Tradeoff

high bias and low variance ⇒ simple model with few parameters (underfitting)
high variance and low bias ⇒ complex model with a large number of
parameters (overfitting)

good balance between the two cases ⇒ tradeoff


5. Describe linear regression and non-linear regression.

linear regression

https://www.javatpoint.com/linear-regression-in-machine-learning

non-linear regression

What is Nonlinear Regression

💡 a regression analysis where the regression model portrays a
nonlinear relationship between a dependent variable and
independent variables.

experimental data are mapped to a model

a mathematical function representing variables in a nonlinear relationship
is formed and optimized

flexible

no assumption of data linearity

accommodates diverse types of curves

parametric or non-parametric

accommodate multiple response variables

Example
nonlinear relationship between the gold price and both US CPI inflation and
currency depreciation in many countries

gold price ⇒ dependent variable


inflation ⇒ independent variable

result ⇒ inflation impacts the gold price

gold prices are affected the most by inflation

gold is in turn used as a hedge against inflation instability

Application
forestry research ⇒ power function relating tree volume or weight to its
diameter or height

chemistry ⇒ wide-range colorless gas


research & development ⇒ formulation of the problem and deriving
statistical solutions

insurance ⇒ computation of IBNR reserves


agriculture ⇒ crops and soil processes
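
As a rough illustration of the forestry case, the sketch below fits a power function y = a * x**b. It takes one simple route, linearizing the model with logarithms and running ordinary least squares on the logs; general nonlinear regression would instead optimize the parameters iteratively. The function name `fit_power_law` and the diameter/volume numbers are hypothetical.

```python
import math

def fit_power_law(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (log x, log y)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # Slope of the log-log line is the exponent b.
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    b = sxy / sxx
    # Intercept of the log-log line is log(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Hypothetical, noise-free data generated from volume = 0.5 * diameter**2.
diameters = [10.0, 20.0, 30.0, 40.0]
volumes = [0.5 * d ** 2 for d in diameters]
a, b = fit_power_law(diameters, volumes)
print(a, b)
```

Because the sample data are noise-free, the fit should recover the generating coefficient and exponent up to floating-point error.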

6. Explain multivariate regression. Write down steps of multivariate
regression. What are advantages and disadvantages of multivariate
regression?

multivariate regression

💡 a technique that estimates a single regression model with more than
one outcome variable. When there is more than one predictor variable
in a multivariate regression model, the model is a multivariate multiple
regression.

steps of multivariate regression


Step 1: Select the features

select the features most responsible for the change in your dependent
variable

Step 2: Normalize the features

scale them to a common range

Step 3: Select loss function and formulate a hypothesis

hypothesis ⇒ predicted value of the response variable

loss function ⇒ error computed when a single prediction is wrong

cost function ⇒ aggregate loss over all the wrong predictions

Step 4: Minimize the cost and loss function

cost and loss are dependent on each other

use minimization algorithms, e.g., gradient descent

Step 5: Test the hypothesis

test set is used to check the accuracy and correctness
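
Steps 2 to 5 can be sketched with NumPy roughly as follows. This is a minimal illustration, not a reference implementation: the data are synthetic, feature selection (step 1) is assumed to be already done, and the learning rate, iteration count and names such as `predict` and `cost` are arbitrary choices made here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 3 features, 2 outcome variables, exact linear relation.
X = rng.normal(size=(200, 3))
true_W = np.array([[1.0, -2.0], [0.5, 0.0], [0.0, 3.0]])
Y = X @ true_W + np.array([4.0, -1.0])

# Hold out the last 50 rows as a test set.
X_train, X_test = X[:150], X[150:]
Y_train, Y_test = Y[:150], Y[150:]

# Step 2: normalize features using training-set statistics only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train_n = (X_train - mu) / sigma
X_test_n = (X_test - mu) / sigma

def predict(X, W, b):
    # Step 3: the hypothesis is a linear map to all outcomes at once.
    return X @ W + b

def cost(X, Y, W, b):
    # Mean squared error averaged over samples and outcomes.
    return np.mean((predict(X, W, b) - Y) ** 2)

# Step 4: minimize the cost with plain gradient descent.
W = np.zeros((3, 2))
b = np.zeros(2)
lr = 0.1
for _ in range(2000):
    err = predict(X_train_n, W, b) - Y_train
    # Gradients of (1 / 2n) * sum of squared errors.
    W -= lr * (X_train_n.T @ err) / len(X_train_n)
    b -= lr * err.mean(axis=0)

# Step 5: check the fitted hypothesis on the held-out test set.
test_mse = cost(X_test_n, Y_test, W, b)
print(test_mse)
```

With noise-free synthetic data the test error should end up very close to zero; on real data it would level off at the noise floor instead.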

Advantages
helps you find a relationship between multiple variables

defines the correlation between variables.

Disadvantages
requires high-level mathematical calculations

complex

output is difficult to interpret

output carries loss and error terms, which complicates the analysis

works well only for large datasets; not suited to small ones

7. What are MSE and RMSE?


https://www.i2tutorials.com/differences-between-mse-and-rmse/
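
For quick reference, MSE is the average of the squared residuals and RMSE is its square root, which brings the error back to the units of the target. A minimal sketch with hypothetical values:

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of MSE, in the units of y."""
    return math.sqrt(mse(y_true, y_pred))

# Hypothetical targets and predictions.
y_true = [3.0, 5.0, 7.0]
y_pred = [2.0, 5.0, 9.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred))
```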

8. Compare linear regression and logistic regression.

https://www.geeksforgeeks.org/ml-linear-regression-vs-logistic-regression/

9. What is VIF? How do you calculate it?

https://www.investopedia.com/terms/v/variance-inflation-factor.asp
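
The VIF of predictor j is commonly computed as 1 / (1 - R_j²), where R_j² comes from regressing predictor j on all the other predictors. The NumPy sketch below follows that definition; the function name `vif` and the sample matrix are hypothetical, and it assumes no predictor is perfectly collinear with the others (that would make R_j² equal to 1).

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        y = X[:, j]
        # Regress column j on all other columns plus an intercept.
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, y, rcond=None)
        residual = y - others @ coef
        r2 = 1.0 - (residual @ residual) / np.sum((y - y.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2))
    return factors

# Hypothetical predictors that are strongly, but not perfectly, correlated.
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 5.0]])
print(vif(X))
```

A VIF near 1 indicates little collinearity; large values mean the predictor is mostly explained by the others.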

10. What is Gradient descent?

What is gradient descent?

💡 optimization algorithm commonly used to train machine
learning models and neural networks

How does gradient descent work?

based on a convex function

starting point = arbitrary point

find the slope (derivative) at the starting point

steepness measure ⇒ slope of the tangent line

slope ⇒ informs updates to the parameters (weights and bias)
goal ⇒ minimize the cost function

Learning rate

size of the steps that are taken to reach the minimum

high learning rate ⇒ larger steps, risk of overshooting the minimum

low learning rate ⇒ small step sizes, more precise but more iterations

cost function

measures the error between actual and predicted values

gives the model feedback so it can adjust its parameters to reduce the error

iterate until the cost function is close to or equal to 0

calculates avg error for the entire training set
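
The update loop and the effect of the learning rate can be sketched on a one-dimensional convex cost. Everything here is illustrative: the cost J(x) = (x - 3)**2, the starting point and both learning rates are assumptions made for the example.

```python
def gradient_descent(grad, start, learning_rate, n_steps):
    """Repeatedly step against the gradient, starting from an arbitrary point."""
    x = start
    for _ in range(n_steps):
        x = x - learning_rate * grad(x)
    return x

# Hypothetical convex cost J(x) = (x - 3)**2 with slope dJ/dx = 2 * (x - 3).
def slope(x):
    return 2.0 * (x - 3.0)

good = gradient_descent(slope, start=10.0, learning_rate=0.1, n_steps=100)
too_high = gradient_descent(slope, start=10.0, learning_rate=1.1, n_steps=100)
print(good)      # ends close to the minimum at x = 3
print(too_high)  # overshoots on every step and moves away from the minimum
```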

Types of Gradient Descent


Batch gradient descent

sums the error for each point in a training set, updating the model only
after all training examples have been evaluated

long processing time

a stable error gradient and convergence

Stochastic gradient descent

runs a training epoch for each example within the dataset and updates the
parameters after each training example, one at a time

frequent updates offer more detail and speed

loses computational efficiency compared to batch gradient descent

may result in noisy gradients, which can help escape local minima

Mini-batch gradient descent

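combines batch and stochastic gradient descent

splits the training dataset into small batch sizes and performs updates
on each of those batches.

balance between the computational efficiency of batch gradient descent and
the speed of stochastic gradient descent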
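
The three variants differ only in how many examples feed each update, which a single `batch_size` parameter can express. The sketch below fits the one-parameter model y = w * x; the function name `train`, the data and the hyperparameters are hypothetical.

```python
import random

def train(xs, ys, batch_size, learning_rate=0.05, epochs=200, seed=0):
    """Fit y = w * x by gradient descent on squared error.

    batch_size == len(xs) -> batch gradient descent
    batch_size == 1       -> stochastic gradient descent
    anything in between   -> mini-batch gradient descent
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(indices)
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Average gradient of (w*x - y)**2 over the current batch.
            grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
            w -= learning_rate * grad
    return w

# Hypothetical noise-free data from y = 2x.
xs = [0.5, 1.0, 1.5, 2.0]
ys = [2.0 * x for x in xs]
for size in (len(xs), 1, 2):
    print(size, train(xs, ys, batch_size=size))
```

On noise-free data all three settings should approach the same weight; with noisy data the stochastic and mini-batch runs would fluctuate around it.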

Challenges with gradient descent


1. Local minima and saddle points

a. nonconvex problems ⇒ struggle to find the global minimum


b. Local minima mimic the shape of a global minimum where the slope
of the cost function increases on either side of the current point.

c. with saddle points, the negative gradient only exists on one side of
the point, reaching a local maximum on one side and a local
minimum on the other.

2. Vanishing and Exploding Gradients

a. Vanishing gradients:

i. occurs when the gradient is too small

ii. gradient continues to become smaller

iii. results in slow learning

b. Exploding gradients:

i. gradient is too large

ii. model weights will grow too large

iii. unstable model

iv. solution ⇒ leverage a dimensionality reduction technique to reduce
model complexity


11. What are the disadvantages of linear regression?
https://www.geeksforgeeks.org/ml-advantages-and-disadvantages-of-linear-regression/

12. What is overfitting? What is the use of regularization?

💡 common link for both - https://www.geeksforgeeks.org/underfitting-and-overfitting-in-machine-learning/

💡 Overfitting
is a modeling error that occurs when a function or model fits the training
set too closely and therefore performs drastically worse on the test set.

If our model does much better on the training set than on the test set,
then we’re likely overfitting.

How to prevent Overfitting?

1. Training with more data

2. Data Augmentation

3. Cross-Validation

4. Feature Selection

5. Regularization

regularization

https://www.javatpoint.com/regularization-in-machine-learning

13. Explain SVM algorithm for classification.

https://www.javatpoint.com/machine-learning-support-vector-machine-algorithm

14. What is Linear discriminant analysis and PCA?

💡 youtube link https://youtu.be/azXCzI57Yfc

LDA

focuses on maximizing separability among known categories

e.g., gene analysis for cancer drug.

creation of new axis ⇒ maximize distance between the two means for 2
categories

minimize the variation (scatter) within each category

more than 2 dimensions ⇒ same procedure for creating the new axis

💡 https://www.geeksforgeeks.org/ml-linear-discriminant-analysis/

PCA [principal component analysis]

💡 youtube - https://youtu.be/83x5X66uWK0

helps resolve the overfitting problem

purpose ⇒ reducing dimensionality

find principal components ⇒ find new views (axes) of the data

no. of principal components <= no. of attributes

PC1 ⇒ highest priority (captures the most variance)

orthogonal property ⇒ PCs must be uncorrelated with each other

https://www.simplilearn.com/tutorials/machine-learning-tutorial/principal-component-analysis#:~:text=The Principal Component Analysis is,plotting in 2D and 3D.
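
The PCA points (components ordered by variance, at most as many components as attributes, mutually orthogonal) can be checked with a small NumPy sketch. It uses eigen-decomposition of the covariance matrix, which is one common way to compute PCA; the function name `pca` and the sample data are hypothetical.

```python
import numpy as np

def pca(X, n_components):
    """Return (components, explained_variance) via eigen-decomposition."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest variance first (PC1)
    components = eigvecs[:, order[:n_components]].T
    return components, eigvals[order[:n_components]]

# Hypothetical 2-attribute data lying mostly along the line y = x.
X = np.array([[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.8], [5.0, 5.1]])
components, variance = pca(X, n_components=2)
projected = (X - X.mean(axis=0)) @ components[:1].T   # keep PC1 only
print(components)
print(variance)
```

Keeping only the first row of `components` reduces the two attributes to a single dimension while retaining most of the variance.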

15. Why is LDA important?

💡 find the answer on youtube

16. Write names of different dimensionality reduction methods. Explain
any one method.

https://www.upgrad.com/blog/top-dimensionality-reduction-techniques-for-machine-learning/

17. Compare single layer perceptron and multi layer perceptron.

https://www.i2tutorials.com/what-is-single-layer-perceptron-and-difference-between-single-layer-vs-multilayer-perceptron/

18. How does Gradient descent help in minimizing the cost function?

https://towardsdatascience.com/minimizing-the-cost-function-gradient-descent-a5dd6b5350e1

19. Write Back propagation algorithm.

https://towardsdatascience.com/understanding-backpropagation-algorithm-7bb3aa2f95fd

20. Describe MLE and MAP.

MLE

https://analyticsindiamag.com/how-is-maximum-likelihood-estimation-used-in-machine-learning/#:~:text=By Sourabh Mehta-,Maximum Likelihood Estimation (MLE) is a probabilistic based approach to,panel data and discrete data.

MAP (maximum a posteriori estimation)

💡 https://youtu.be/TSMJ-QRnk54

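https://towardsdatascience.com/what-is-map-understanding-the-statistic-of-choice-for-comparing-object-detection-models-1ea4f67a9dbd
(this article covers mAP, mean average precision for object detection, which
is a different concept from maximum a posteriori estimation)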

21. Write down the applications of ANN.

https://www.geeksforgeeks.org/artificial-neural-networks-and-its-applications/

22. Define learning rate in neural network. How to choose learning rate for
optimization problem?

💡 The learning rate, denoted by the symbol α, is a hyper-parameter
used to govern the pace at which an algorithm updates or learns the
values of a parameter estimate.

https://towardsdatascience.com/learning-rate-a6e7b84f1658

23. Define the terms Training, Activation function, Weights and loss
function in ANN.

Training

💡 Training is the process in which a machine learning (ML) algorithm
is fed with sufficient training data to learn from.

Activation function

💡 https://www.geeksforgeeks.org/activation-functions-neural-networks/

Weights

💡 Weight is the parameter within a neural network that transforms input
data within the network's hidden layers. A neural network is a series of
nodes, or neurons. Each node has a set of inputs, weights, and a
bias value.
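
A single node can be sketched as a weighted sum of its inputs plus the bias, passed through an activation function. The example tries both sigmoid and tanh; the function names and all numbers are hypothetical.

```python
import math

def sigmoid(z):
    """Squashes z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    """Weighted sum of the inputs plus bias, passed through an activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical inputs, weights and bias.
inputs = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.0
print(neuron(inputs, weights, bias, sigmoid))    # sigmoid output lies in (0, 1)
print(neuron(inputs, weights, bias, math.tanh))  # tanh output lies in (-1, 1)
```

Training adjusts `weights` and `bias`; the activation function itself stays fixed.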

Loss function

💡 https://www.geeksforgeeks.org/ml-common-loss-functions/

24. Explain feed forward neural network.

https://www.turing.com/kb/mathematical-formulation-of-feed-forward-neural-network

25. What is activation function in ANN? Describe the sigmoid activation
function and Tanh activation function used in ANN.

💡 https://www.geeksforgeeks.org/activation-functions-neural-networks/

26. How does gradient descent help in minimizing the cost function?
https://towardsdatascience.com/machine-leaning-cost-function-and-gradient-descend-75821535b2ef

27. How does the decision tree algorithm work? Give one example.
https://www.geeksforgeeks.org/decision-tree-introduction-example/

28. Which are the attribute selection measures in decision tree? Explain.

https://www.kdnuggets.com/2020/01/decision-tree-algorithm-explained.html

29. What is meant by pruning? Which are different techniques used for
pruning?

https://www.kdnuggets.com/2022/09/decision-tree-pruning-hows-whys.html

https://analyticsindiamag.com/what-is-pruning-in-tree-based-ml-models-and-why-is-it-done/

30. Which are the advantages and disadvantages of decision tree?

https://www.jigsawacademy.com/blogs/data-science/decision-tree-in-machine-learning/

31. Define the terms overfitting, underfitting, regularization.

overfitting & underfitting:

https://www.javatpoint.com/overfitting-and-underfitting-in-machine-learning#:~:text=Overfitting occurs when our machine,and accuracy of the model.

regularization

https://www.javatpoint.com/regularization-in-machine-learning

32. Which are different cross validation methods? Explain two cross
validation methods.

https://www.geeksforgeeks.org/cross-validation-machine-learning/

33. What is Bootstrapping? Which steps are used in bootstrapping?
Explain parametric and non parametric bootstrapping with example.

https://analyticssteps.com/blogs/bootstrapping-method-types-working-and-applications

parametric bootstrap example -

model the uncertainty about the population mean using parametric
bootstrapping.

https://www.vosesoftware.com/riskwiki/TheparametricBootstrap.php

non parametric bootstrap example -

estimate the uncertainty about the population standard deviation
using non-parametric bootstrapping.

https://www.vosesoftware.com/riskwiki/ThenonparametricBootstrap.php

34. Explain different ensemble learning techniques.

https://www.analyticsvidhya.com/blog/2018/06/comprehensive-guide-for-ensemble-models/

35. What are the advantages and disadvantages of random forest learning
algorithm?

💡 read advantages and disadvantages only

https://www.mygreatlearning.com/blog/random-forest-algorithm/

36. Write an algorithm for partition clustering and hierarchical clustering.
Mention example of each method.

k means clustering

💡 K-means clustering can be used in almost every domain, ranging
from banking to recommendation engines, cyber security,
document clustering to image segmentation. It is typically
applied to data that has a smaller number of dimensions, is
numeric, and is continuous.

💡 YouTube - https://youtu.be/CLKW6uWJtTc

💡 for algorithm - https://www.tutorialspoint.com/what-are-the-types-of-the-partitional-algorithm

💡 for flowchart and info - https://www.geeksforgeeks.org/partitioning-method-k-mean-in-data-mining/
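
As a partition clustering example, k-means (Lloyd's algorithm) alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below is a minimal pure-Python version; seeding the centroids with the first k points is a simplification assumed here, and the sample points are hypothetical.

```python
def kmeans(points, k, n_iters=100):
    """Lloyd's algorithm on points given as equal-length tuples."""
    # Assumption: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                new_centroids.append(tuple(sum(dim) / len(cluster) for dim in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return centroids, clusters

# Hypothetical 2-D points forming two well-separated groups.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = kmeans(points, k=2)
print(centroids)
```

Real implementations usually pick better starting centroids (for example random restarts), since the result depends on initialization.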

hierarchical clustering

https://www.geeksforgeeks.org/hierarchical-clustering-in-data-mining/

💡 YouTube - https://youtu.be/7enWesSofhg

37. Write the following algorithms

Birch algorithm

https://www.javatpoint.com/birch-in-data-mining

HMM algorithm

https://www.jigsawacademy.com/blogs/data-science/hidden-markov-model

CURE algorithm

https://www.geeksforgeeks.org/basic-understanding-of-cure-algorithm/

38. Let’s say you are building a model that detects whether a person
has diabetes or not. After the train-test split, you got a test set of length 100,
out of which 70 data points are labelled positive (1), and 30 data points are
labelled negative (0). Draw confusion matrix based on the given data.
Calculate True positive rate, True negative rate, False positive rate and False
negative rate.

https://www.kdnuggets.com/2020/09/performance-machine-learning-model.html
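
The four rates follow directly from the confusion-matrix counts. The question fixes only the class totals (70 positive, 30 negative), so the split into correct and incorrect predictions in the sketch below is hypothetical and chosen purely to show the arithmetic.

```python
def rates(tp, fn, fp, tn):
    """Return TPR, TNR, FPR and FNR from confusion-matrix counts."""
    positives = tp + fn
    negatives = tn + fp
    return {
        "TPR": tp / positives,   # sensitivity / recall
        "TNR": tn / negatives,   # specificity
        "FPR": fp / negatives,   # equals 1 - TNR
        "FNR": fn / positives,   # equals 1 - TPR
    }

# 70 actual positives and 30 actual negatives come from the question.
# The split into correct/incorrect predictions below is HYPOTHETICAL.
tp, fn = 63, 7      # 63 + 7 = 70 actual positives
tn, fp = 24, 6      # 24 + 6 = 30 actual negatives
print(rates(tp, fn, fp, tn))
```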

39. Design a system for human activity recognition.

https://www.geeksforgeeks.org/human-activity-recognition-using-deep-learning-model/

40. What is reinforcement learning? Explain working of reinforcement
learning. Write an algorithm for reinforcement learning.

what is reinforcement learning

💡 Reinforcement learning is an area of Machine Learning. It is about
taking suitable action to maximize reward in a particular situation. It is
employed by various software and machines to find the best possible
behavior or path it should take in a specific situation.

working of reinforcement learning

https://www.synopsys.com/ai/what-is-reinforcement-learning.html#:~:text=How Does Reinforcement Learning Work,maximization of expected cumulative reward.

algorithm for reinforcement learning

https://www.guru99.com/reinforcement-learning-tutorial.html#reinforcement-learning-algorithms

💡 check this out on youtube

41. Write the working of the expectation maximization (EM) algorithm. What is
convergence in the EM algorithm? What are advantages and disadvantages
of EM?

algorithm - https://www.geeksforgeeks.org/ml-expectation-maximization-algorithm

convergence - https://arxiv.org/pdf/1611.00519.pdf

42. Write an algorithm for GMM.

https://towardsdatascience.com/gaussian-mixture-modelling-gmm-833c88587c7f

43. What are ensemble methods? Which are different types of ensemble
methods?

https://towardsdatascience.com/ensemble-methods-in-machine-learning-what-are-they-and-why-use-them-68ec3f9fef5f

44. Design a neural network to solve XOR problem.

https://towardsdatascience.com/how-neural-networks-solve-the-xor-problem-59763136bdd7
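
One possible design, sketched below, uses two hidden units and one output unit with a step activation and hand-picked weights: the hidden units compute OR and NAND, and the output unit ANDs them. The weights and thresholds are illustrative choices, not the only solution, and a trained network would learn its own values.

```python
def step(z):
    """Threshold activation: fires when the weighted sum is positive."""
    return 1 if z > 0 else 0

def xor_network(x1, x2):
    # Hidden unit 1 computes OR, hidden unit 2 computes NAND.
    h_or = step(1.0 * x1 + 1.0 * x2 - 0.5)
    h_nand = step(-1.0 * x1 - 1.0 * x2 + 1.5)
    # Output unit computes AND of the two hidden units.
    return step(1.0 * h_or + 1.0 * h_nand - 1.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_network(a, b))
```

A single-layer perceptron cannot represent XOR because the two classes are not linearly separable, which is why the hidden layer is needed.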
