ML 1 Lecture 1
Predict a number
• One of the columns is the label (also called the output or target): either
numerical (regression) or categorical (classification);
classification is typically binary
• Definition: The algorithm is trained on a labeled dataset; the goal
is to learn the mapping from inputs to outputs
• Some Examples: Linear regression, decision trees, support vector
machines, and neural networks
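As a minimal sketch of learning a mapping from inputs to outputs, the snippet below fits a one-feature linear regression in closed form. The dataset, the function names `fit_linear` and `predict`, and the single-feature setup are assumptions made for illustration, not part of the lecture.

```python
# Hypothetical labeled dataset: each row is (feature, label).
# The values are made up for illustration only.
data = [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)]


def fit_linear(rows):
    """Closed-form least squares for y = w * x + b with one feature."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var_x = sum((x - mean_x) ** 2 for x, _ in rows)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


def predict(model, x):
    """Apply the learned mapping to a new input."""
    w, b = model
    return w * x + b


model = fit_linear(data)      # training on labeled data
print(predict(model, 5.0))    # prediction for an unseen input
```

Because the label here is numerical, this is a regression task; a categorical label would call for a classifier instead.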
[Figure: clusters of peppers, eggplants, and onions]
• Objective:
• Learn a mapping from the features to the label (in case of SL)
• Learn a mapping from the features to the clusters (in case of USL)
• Learn a mapping from states to actions (in case of RL)
• There are many possible such mappings
• Specifically, each ML algorithm learns its own mapping
• A mapping is also called a hypothesis, because the data scientist
hypothesizes about a possible mapping for their data
• No one knows the best hypothesis in advance
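To illustrate that several hypotheses can be proposed for the same data, the sketch below compares two candidate mappings on held-out rows. The data values and the two hypothesis families (a constant and a straight line) are assumptions chosen for illustration; on other data a different hypothesis could come out ahead.

```python
# Hypothetical (feature, label) pairs, made up for illustration.
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 7.8)]
held_out = [(5.0, 10.1), (6.0, 11.9)]


def constant_hypothesis(rows):
    """h(x) = mean label, ignoring the feature."""
    mean_y = sum(y for _, y in rows) / len(rows)
    return lambda x: mean_y


def linear_hypothesis(rows):
    """h(x) = w * x + b fitted by least squares."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    w = sxy / sxx
    b = my - w * mx
    return lambda x: w * x + b


def mse(h, rows):
    """Mean squared error of hypothesis h on the given rows."""
    return sum((h(x) - y) ** 2 for x, y in rows) / len(rows)


# Score each candidate hypothesis on data it was not trained on.
errors = {
    "constant": mse(constant_hypothesis(train), held_out),
    "linear": mse(linear_hypothesis(train), held_out),
}
best = min(errors, key=errors.get)
print(errors, best)
```

The best hypothesis is only known after measuring it on data, which is why candidates are compared empirically rather than chosen up front.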
May 15, 2024 CSE602 - Machine Learning I - Dr. Tariq Mahmood 17
Here is the ML scene
• The trained algorithm is first tested on the data already acquired (in-house testing)
• Then it is deployed in real time for live testing (the MLOps process)
• Performance typically drops in this case, because the data patterns on
which the ML algorithm has been trained are not exactly the same as
those present in the live data
• This is natural and expected, so live performance deteriorates,
but the ML engineer should adapt by updating the model and needs to find
a good middle ground
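The performance drop described above can be sketched with synthetic data. Everything here is an assumption for illustration: the data generator `make_data`, the linear model, and the choice to simulate changed live patterns as a different slope.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def make_data(n, slope, intercept, noise=0.1):
    """Synthetic (x, y) pairs from y = slope * x + intercept + Gaussian noise."""
    rows = []
    for _ in range(n):
        x = random.uniform(0.0, 10.0)
        y = slope * x + intercept + random.gauss(0.0, noise)
        rows.append((x, y))
    return rows


def fit_linear(rows):
    """Closed-form least squares for y = w * x + b."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    w = sum((x - mx) * (y - my) for x, y in rows) / sum((x - mx) ** 2 for x, _ in rows)
    return w, my - w * mx


def mse(model, rows):
    """Mean squared error of the linear model on the given rows."""
    w, b = model
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)


train = make_data(200, slope=2.0, intercept=1.0)
in_house = make_data(50, slope=2.0, intercept=1.0)  # same patterns as training
live = make_data(50, slope=2.3, intercept=1.0)      # assumed shift in live patterns

model = fit_linear(train)
in_house_error = mse(model, in_house)
live_error = mse(model, live)
print(in_house_error, live_error)
```

The model itself is unchanged between the two evaluations; only the data patterns differ, which is what drives the higher live error.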
Live Testing
• Middle ground: keep updating the model with new data (which keeps
arriving), but according to a policy (not randomly)
• When to update? How to update? How much time should an update take?
• When reasonable performance starts to be obtained on live testing data, then
we say that the model has “generalized”
• That is, the model will now be able to give reasonable predictive
performance in real time for a long time
• It will withstand the small variations in patterns that occur in live testing
from time to time, and continue to give reasonable predictive performance
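One possible answer to "when to update?" is a threshold policy on recent live errors. The sketch below is only an assumption about what such a policy could look like: the window size, the threshold, the error stream, and the name `should_retrain` are all hypothetical.

```python
from collections import deque


def should_retrain(recent_errors, threshold):
    """Policy: retrain only when the window is full and its mean error exceeds the threshold."""
    if len(recent_errors) < recent_errors.maxlen:
        return False
    return sum(recent_errors) / len(recent_errors) > threshold


# Hypothetical stream of per-prediction live errors: stable at first, then drifting.
live_errors = [0.1, 0.12, 0.09, 0.11, 0.1, 0.4, 0.5, 0.6, 0.7, 0.8]

window = deque(maxlen=4)  # only the most recent 4 errors are considered
retrain_steps = []
for step, err in enumerate(live_errors):
    window.append(err)
    if should_retrain(window, threshold=0.3):
        retrain_steps.append(step)  # a real system would update the model here
        window.clear()              # start a fresh window after the simulated update
print(retrain_steps)
```

Small fluctuations in the early part of the stream do not trigger an update; only a sustained rise in the windowed error does, which is the sense in which updates follow a policy rather than happening randomly.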
What are Hyperparameters of ML Algos?
• Remember that 20% of the data may be controlling the other 80%
(Pareto rule)
• If you can get hold of this 20% in the training phase, the model may
generalize well in real time (live testing)
• But a very simple model may be able to get hold of this 20% or more of the
data; it is not necessary to use the state-of-the-art algorithm
• Cross-Validation
• Regularization
• Feature Engineering
• Occam's Razor
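Two of the items above, cross-validation and regularization, can be sketched together: k-fold cross-validation scores a one-feature model for several strengths of an L2 penalty. The data, the penalty values, and the names `fit_ridge_1d` and `k_fold_mse` are assumptions for illustration; the penalty strength `lam` plays the role of a hyperparameter.

```python
def fit_ridge_1d(rows, lam):
    """Least squares for y = w * x + b with an L2 penalty lam * w**2 on the slope."""
    n = len(rows)
    mx = sum(x for x, _ in rows) / n
    my = sum(y for _, y in rows) / n
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    sxx = sum((x - mx) ** 2 for x, _ in rows)
    w = sxy / (sxx + lam)  # lam = 0 gives ordinary least squares
    b = my - w * mx
    return w, b


def k_fold_mse(rows, k, lam):
    """Average held-out MSE over k folds (fold i holds out every k-th row starting at i)."""
    scores = []
    for i in range(k):
        test = rows[i::k]
        train = [r for j, r in enumerate(rows) if j % k != i]
        w, b = fit_ridge_1d(train, lam)
        scores.append(sum((w * x + b - y) ** 2 for x, y in test) / len(test))
    return sum(scores) / k


# Hypothetical data: roughly y = 2x + 1 with small made-up deviations.
noise = [0.2, -0.1, 0.0, 0.3, -0.2, 0.1, -0.3, 0.2, 0.0, -0.1]
rows = [(float(x), 2.0 * x + 1.0 + e) for x, e in zip(range(10), noise)]

# Try a few penalty strengths and keep the one with the lowest cross-validated error.
cv_scores = {lam: k_fold_mse(rows, k=5, lam=lam) for lam in (0.0, 1.0, 100.0)}
best_lam = min(cv_scores, key=cv_scores.get)
print(cv_scores, best_lam)
```

Every row is held out exactly once, so the score reflects performance on data the model did not see during fitting. A penalty that is too strong shrinks the slope so far that the model underfits.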