Professional Documents
Culture Documents
Swarna Chaudhary
BITS Assistant Professor, CSIS
swarna.chaudhary@pilani.bits-pilani.ac.in
Pilani
Pilani Campus
• The slides presented here are obtained from the authors of the books and from
various other contributors. I hereby acknowledge all the contributors for their
material and inputs.
• I have added and modified a few slides to suit the requirements of the course.
2
BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956
Session Content
• Objective of course
• Evaluation Plan
• What is Machine Learning?
• Why use Machine Learning?
• Application areas of Machine Learning
• Types of Learning
• Challenges of Machine Learning
4
BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956
Course
Details
LO1 High-level conceptual understanding of machine learning field and its
applicability and limitations
LO2 Ability to frame a machine learning problem and select candidate machine
learning models
LO3 Ability to design and implement an end-to-end machine learning based
solution using Python and appropriate libraries and do trouble shooting
LO4 Ability to evaluate and interpret the results from a machine learning
system
Books & References
Dataset Sources
https://archive.ics.uci.edu/ml/index.php
https://www.kaggle.com/datasets
https://data.gov.in/catalogsv2
Machine Learning
Saturday, September 3, 2
Learning Problems
Example
T: Recognizing hand-written words
P: Percentage of words correctly classified
E: Database of human-labeled images of
handwritten words
12
Where does ML fit in?
Solution
Write a detection algorithm for frequently appearing patterns in
spams Test and update the detection rules until it is good enough.
Challenge
Detection algorithm likely
to be a long list of complex
rules hard to maintain.
17
BITS Pilani, Pilani Campus
Why Machine Learning?
Machine Learning Approach to Spam Filtering
Automatically learns phrases that are good predictors of spam by detecting
unusually
frequent patterns of words in spams compared to “ham”s
The program is much shorter, easier to maintain, and most likely more
accurate.
18
BITS Pilani, Pilani Campus
Why Machine Learning?
Machine Learning Approach - Spam Filtering
Automatically learns phrases that are good predictors of spam by detecting
unusually
frequent patterns of words in spams compared to “ham”s
The program is much shorter, easier to maintain, and most likely more
accurate.
19
BITS Pilani, Pilani Campus
A Classic example of ML Task
20
Types of Learning
• Supervised (inductive) learning
– Given: training data, desired outputs (labels)
• Unsupervised learning
– Given: training data (without desired outputs)
• Semi-supervised learning
– Given: training data + a few desired outputs
• Reinforcement learning
– Given: rewards from sequence of actions
9
September Arctic Sea Ice Extent
8
7
(1,000,000 sq km)
6
5
4
3
2
1
0
1970 1980 1990 2000 2010 2020
Year
Data from G. Witt. Journal of Statistics Education, Volume 21, Number 1 (2013) Slide Credit: Eric Eaton
Regression
y = f(x)
output prediction features
function
1(Malignant)
0(Benign)
Tumor Size
Predict Benign
Predict Malignant
Tumor Size
Slide Credit: Eric Eaton
Supervised Learning Applications
• Face Detection.
• Spam detection.
• Weather forecasting.
• Predicting housing prices based on the prevailing market price.
• Stock price predictions,
25
Supervised Learning Techniques
• Linear Regression
• Logistic Regression
• Naïve Bayes Classifiers
• Support Vector Machines (SVMs)
• Decision Trees and Random Forests
• Neural networks
26
BITS Pilani, Pilani Campus
Unsupervised Learning
• Given x1, x2, ..., xn (without labels)
• Output hidden structure behind the x’s
– E.g., clustering
• Kernel PCA
28
BITS Pilani, Pilani Campus
Reinforcement Learning
• No predefined data
• Semi-supervised learning model in machine learning,
• Allow an agent to take actions and interact with an environment so as to
maximize the total rewards.
• Examples:
– Autonomous cars
– Game playing
– Robot in a maze
31
Types of Learning
32
Challenges of Machine Learning
• Training Data
– Insufficient
– Non representative
– Poor Quality
– Irrelevant attributes
• Model Selection
– Overfitting
– Underfitting
• Testing and Validation
– Hyperparameters
missing data
available data
Smaller slope
High-order
polynomial
PyTorch Linux, Mac OS, Python, C++ Autograd Module, Optim Module
Windows NN Module
TensorFlow Linux, Mac OS, Python, C++ Provides a library for dataflow
Windows programming.