Introduction to Machine Learning
UNIT - I
AIL302: Machine Learning,
T.Y. B.Tech CSE (AIML)
DKTE’s TEI
How do we learn…?
Visual: Learn by observation
Auditory: Learn by listening
Kinesthetic: Learn by doing
How do you learn to -
1. Buy good quality mangoes…?
2. Choose your best friend…?
3. Make a yummy cake…?
How to pass experience to machines…?
Machine Learning: Definition
“Machine Learning (ML) is a subset of artificial
intelligence (AI) that focuses on the development
of algorithms and statistical models that enable
computers to learn from and make predictions or
decisions based on data without explicit
programming. In other words, it is a method of
teaching computers to learn and improve from
experience automatically”
Types of ML
Machine Learning
● Supervised
  - Regression
  - Classification
● Unsupervised
  - Clustering
  - Association
● Reinforcement
Supervised Learning
● Machines are trained using well-"labelled" training data
● The training data works as a “supervisor”
● The aim of a supervised learning algorithm is to find a mapping function that maps the input
variable (x) to the output variable (y)
● We have an exact idea about the classes of objects, because the labels define them
● Applications: Risk Assessment, Image Classification, Fraud Detection, Spam Filtering, etc.
● But NOT suitable for very complex tasks or when the training input is unlabeled
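The mapping from input (x) to output (y) described above can be sketched in plain Python. This is only a minimal illustration using a 1-nearest-neighbour rule; the fruit data, the two features (weight in grams, colour score) and the function name `predict` are hypothetical and not part of the course material.

```python
import math

# Hypothetical labelled training data: (weight in grams, colour score) -> label
training_data = [
    ((150.0, 0.9), "apple"),
    ((160.0, 0.8), "apple"),
    ((120.0, 0.2), "lemon"),
    ((110.0, 0.3), "lemon"),
]

def predict(x, data=training_data):
    """Return the label of the training example closest to x (1-nearest neighbour)."""
    _, nearest_label = min(data, key=lambda pair: math.dist(pair[0], x))
    return nearest_label

print(predict((155.0, 0.85)))  # apple
print(predict((115.0, 0.25)))  # lemon
```

The labels play the role of the "supervisor": the prediction for a new input is copied from the most similar labelled example.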
How Supervised Learning Works?
Training: Labeled Data (WhatsApp, Facebook, Instagram) → Model
Prediction: Test Data → Model → Output “Facebook”
Regression
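Regression predicts a continuous output value. As a minimal sketch, assuming a single input feature and a straight-line model, the ordinary least squares fit below recovers slope and intercept from data. The hours/scores data and the function name `fit_line` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for the line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data generated from score = 5 * hours + 40
hours = [1, 2, 3, 4, 5]
scores = [45, 50, 55, 60, 65]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept   # prediction for 6 hours of study
print(slope, intercept, predicted)  # 5.0 40.0 70.0
```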
Classification
Unsupervised Learning
● Models are not supervised using a labeled training dataset
● The model itself finds the hidden patterns and insights in the given data
● The input data has no corresponding output data (Unlabeled Data)
● Goal is to find the underlying structure of the dataset, group the data according to similarities, and
represent the dataset in a compressed format
● Helpful for finding useful insights from the data
● Used for more complex tasks than supervised learning, since no labels are required
● Closer to real AI, because the model learns from the data on its own
How Unsupervised Learning Works?
[Figure: the model interprets and processes unlabeled input data and separates it into groups A and B]
Clustering
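Clustering groups similar data points without using labels. One common algorithm is k-means; the following is a minimal sketch for one-dimensional data, assuming two clusters and hand-picked starting centroids. The data and the function name `k_means_1d` are hypothetical.

```python
def k_means_1d(points, centroids, iterations=10):
    """Tiny k-means for 1-D data; returns the centroids and the last cluster assignment."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = k_means_1d(points, centroids=[1.0, 12.0])
print(centroids)  # [2.0, 11.0]
print(clusters)   # [[1.0, 2.0, 3.0], [10.0, 11.0, 12.0]]
```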
Association: e.g., shopping basket analysis
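Shopping basket analysis is usually described with two measures, support and confidence. The sketch below computes both for a handful of baskets; the basket contents and function names are hypothetical.

```python
# Hypothetical shopping baskets (each basket is a set of items)
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

def support(itemset):
    """Fraction of baskets that contain every item in itemset."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def confidence(antecedent, consequent):
    """Estimated probability that a basket with the antecedent also has the consequent."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"bread", "butter"}))       # 3 of the 5 baskets
print(confidence({"bread"}, {"butter"}))  # 3 of the 4 bread baskets, about 0.75
```

A rule such as "bread → butter" is considered interesting when both its support and its confidence are high.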
ML System Architecture
ML system architecture refers to the high-level design of the entire machine learning system, including data
pipelines, data storage, model training, model deployment, and inference.
It encompasses the end-to-end flow of data and how the machine learning components interact with other
parts of the system.
ML Model Architecture
ML model architecture refers to the design of the machine learning model itself.
Decision about the type of model:
- NN (ANN, CNN, RNN): number of layers, activation function, connections
- SVM
- Decision Tree
The model architecture is crucial as it determines the model's capacity to learn and its ability to
generalize to new data.
ML Model Architectures: Neural Networks
[Figure: feed-forward network with inputs X1, X2, X3, X4 in the Input Layer, a Hidden Layer, and a single output y in the Output Layer]
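The layered structure (input layer, hidden layer, output layer) can be sketched as a forward pass in plain Python. This is an illustration only: the hidden layer size of three units, the sigmoid activation, and all weight values are assumptions, since a real network would learn its weights during training.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, hidden_weights, hidden_biases, output_weights, output_bias):
    """Forward pass: inputs -> hidden layer (sigmoid) -> single output (sigmoid)."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)

# Hypothetical hand-picked weights: 4 inputs, 3 hidden units, 1 output
hidden_weights = [
    [0.5, -0.2, 0.1, 0.4],
    [-0.3, 0.8, -0.5, 0.2],
    [0.1, 0.1, 0.1, 0.1],
]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [1.0, -1.0, 0.5]
output_bias = 0.0

y = forward([1.0, 0.5, -1.0, 2.0], hidden_weights, hidden_biases, output_weights, output_bias)
print(y)  # a value strictly between 0 and 1
```

Changing the number of hidden units, the activation function, or the connections changes the model architecture without changing the forward-pass idea.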
ML Model Architectures: Support Vector Machines
ML Model Architectures: Decision Tree
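A decision tree makes a prediction by asking a sequence of questions about the input until it reaches a leaf. The sketch below stores a small hand-written tree as nested dictionaries and walks it; the mango features, branch values, and labels are hypothetical, and a real tree would be learned from data rather than written by hand.

```python
# Hypothetical hand-written tree for judging a mango
tree = {
    "feature": "colour",
    "branches": {
        "green": "not ripe",
        "yellow": {
            "feature": "firmness",
            "branches": {"soft": "good", "hard": "not ripe"},
        },
    },
}

def classify(sample, node=tree):
    """Follow the branch matching the sample's feature value until a leaf (a string) is reached."""
    while isinstance(node, dict):
        node = node["branches"][sample[node["feature"]]]
    return node

print(classify({"colour": "yellow", "firmness": "soft"}))  # good
print(classify({"colour": "green", "firmness": "soft"}))   # not ripe
```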
Architectural Patterns in Machine Learning
Refer to the common design principles used to structure and organize the components of the system.
Some common architectural patterns in machine learning systems include:
● Pipeline
● Microservices
● Model-View-Controller (MVC)
● Event-Driven
● Serving
Pipeline Architecture Pattern
Collect → Store → Enrich → Train → Visualize
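The pipeline stages can be sketched as functions where each stage consumes the output of the previous one. The stage names follow the slide; their bodies are hypothetical stand-ins (for example, "store" simply copies the data in memory and the "model" is just a mean).

```python
from functools import reduce

def collect():
    # Hypothetical raw readings; a real system would pull from logs, sensors, or APIs
    return [3.0, None, 5.0, 7.0]

def store(records):
    # Stand-in for persisting data: here we simply keep a copy in memory
    return list(records)

def enrich(records):
    # Drop missing values and add a derived feature (the square)
    return [(r, r * r) for r in records if r is not None]

def train(rows):
    # Placeholder "model": the mean of the first feature
    return sum(r[0] for r in rows) / len(rows)

def visualize(model):
    return f"model mean = {model:.1f}"

def run_pipeline(source, stages):
    """Feed the output of each stage into the next one."""
    return reduce(lambda data, stage: stage(data), stages, source())

report = run_pipeline(collect, [store, enrich, train, visualize])
print(report)  # model mean = 5.0
```

Because every stage has the same shape (data in, data out), stages can be replaced or reordered independently, which is the main appeal of the pattern.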
Microservices Architecture Pattern
MVC Architecture Pattern
[Figure: MVC diagram with the components DB, Model, Controller, View, and Presentation]
Event Driven Architecture Pattern
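In an event-driven design, components react to events instead of calling each other directly. The following is a minimal in-process sketch of that idea; the `EventBus` class, the event name `transaction_created`, and the threshold rule standing in for a trained fraud model are all hypothetical.

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process publish/subscribe bus."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self._handlers[event_type]:
            handler(payload)

bus = EventBus()
predictions = []

def score_transaction(event):
    # Placeholder rule standing in for a trained fraud-detection model
    predictions.append((event["id"], event["amount"] > 1000))

bus.subscribe("transaction_created", score_transaction)
bus.publish("transaction_created", {"id": 1, "amount": 250})
bus.publish("transaction_created", {"id": 2, "amount": 5000})
print(predictions)  # [(1, False), (2, True)]
```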
Serving Architecture
Machine Learning Process
1. Define the Problem
2. Collect the Data
3. Prepare the Data
4. Split Data - Training and Testing
5. Select Algorithm
6. Train the Algorithm
7. Evaluate on the Test Data
8. Parameter Tuning
9. Start using your Model
1. Define the Problem
a. What is the problem? (Business Problem)
Does this problem really need an ML approach?
Task (T): __________
Experience (E): __________
Performance (P): __________ (error rate)
b. Why does the problem need to be solved?
Motivation & benefits (more focused on the business side)
2. Collect the Data
● Different ways to collect the data - (Reviews, Ratings, Surveys, Scraping websites, use of
provided APIs)
● Sometimes labels are also collected with the data
● Big Data
● The right data is key to solving any ML problem
3. Prepare the Data (Feature Processing)
I. Selection / Filtering
II. Cleaning
III. Formatting
IV. Sampling
V. Decomposition
VI. Scaling
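Two of the preparation steps listed above, cleaning and scaling, can be sketched as follows. The example assumes missing values are represented as `None` and uses min-max scaling, which is one common choice among several; the data and function names are hypothetical.

```python
def clean(values):
    """Cleaning: drop missing entries (represented here as None)."""
    return [v for v in values if v is not None]

def min_max_scale(values):
    """Scaling: map values linearly onto the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values equal, nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]

raw_ages = [20, None, 30, 40, None, 60]
scaled = min_max_scale(clean(raw_ages))
print(scaled)  # [0.0, 0.25, 0.5, 1.0]
```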
4. Split Data: Training and Testing
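A minimal sketch of splitting data into training and testing sets, assuming an 80/20 split and a fixed random seed for reproducibility. The ratio, the seed, and the function name are illustrative choices, not values prescribed by the course.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

data = list(range(10))
train, test = train_test_split(data)
print(len(train), len(test))  # 8 2
```

Shuffling before splitting helps avoid a test set that contains only the last rows collected, which may not be representative of the whole dataset.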
5. Algorithm Selection based on Categories
1. Regression
2. Classification
3. Clustering
4. Dimensionality Reduction
5. Anomaly Detection
6. Recommendation System
7. NLP
8. Transfer Learning
6. Train and Tune the Model
Input → Model → Predicted Output
Compare Predicted Output with Actual Output → Cost Function → Adjust Weight → Model
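The loop of predicting, comparing with the actual output, computing a cost, and adjusting the weight can be sketched with gradient descent. This example assumes a one-weight model y = w * x and mean squared error as the cost function; the data, learning rate, and epoch count are hypothetical.

```python
def train(xs, ys, learning_rate=0.01, epochs=200):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(epochs):
        predicted = [w * x for x in xs]                       # model output
        errors = [p - y for p, y in zip(predicted, ys)]        # compare with actual output
        gradient = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        w -= learning_rate * gradient                          # adjust weight
    cost = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / n   # final cost (MSE)
    return w, cost

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]  # hypothetical data generated from y = 2x
w, cost = train(xs, ys)
print(round(w, 3))  # 2.0
```

Tuning corresponds to changing values such as `learning_rate` and `epochs`: too large a learning rate can make the weight diverge, too small a rate makes training slow.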
7. Use Trained Model
Input → Trained Model → Output
Performance…?
Performance Measures
1. Accuracy = (Number of Correct Predictions) / (Total Number of Predictions)
2. Precision = (True Positives) / (True Positives + False Positives)
3. Recall = (True Positives) / (True Positives + False Negatives)
4. F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
5. Specificity = (True Negatives) / (True Negatives + False Positives)
6. MAE = (1/n) * Σ|predicted - actual|
7. MSE = (1/n) * Σ(predicted - actual)^2
8. RMSE = √MSE
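The formulas above translate directly into code. The sketch below computes the classification measures from confusion-matrix counts and the regression measures from paired predictions; the counts and values are hypothetical, and the functions assume no denominator is zero.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Measures 1 to 5, computed from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "specificity": tn / (tn + fp),
    }

def regression_metrics(predicted, actual):
    """Measures 6 to 8, computed from paired predicted and actual values."""
    n = len(actual)
    mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / n
    mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / n
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse)}

# Hypothetical counts: 40 true positives, 10 false positives, 45 true negatives, 5 false negatives
clf = classification_metrics(tp=40, fp=10, tn=45, fn=5)
reg = regression_metrics(predicted=[2.0, 4.0, 6.0], actual=[3.0, 4.0, 4.0])
print(clf)
print(reg)
```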
Tools & Frameworks
● Scikit-learn
● TensorFlow
● Keras
● PyTorch
● Caffe
● MXNet
● Microsoft Cognitive Toolkit (CNTK)
● Microsoft Azure ML
● H2O.ai
● RapidMiner
The Kernel Trick
Low-dimensional (LD) space → Kernel → High-dimensional (HD) space
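The kernel trick relies on the fact that some kernels return the same value as a dot product in a higher-dimensional space, without ever constructing that space. The sketch below checks this for the degree-2 polynomial kernel on 2-D inputs; the choice of kernel, the explicit feature map, and the sample points are illustrative assumptions.

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel computed directly in 2-D: (x . z)^2."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2

def feature_map(x):
    """Explicit map from 2-D to 3-D whose dot product matches poly_kernel."""
    return (x[0] ** 2, math.sqrt(2) * x[0] * x[1], x[1] ** 2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

x, z = (1.0, 2.0), (3.0, 4.0)
low_dim = poly_kernel(x, z)                      # (1*3 + 2*4)^2 = 121.0
high_dim = dot(feature_map(x), feature_map(z))   # same value up to rounding
print(low_dim, high_dim)
```

The kernel needs only the 2-D inputs, yet it gives the similarity that a linear model would see in the 3-D space, which is why an SVM can separate data that is not linearly separable in the original space.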
Data Visualization
Data visualization is a graphical representation of data
Commonly used Techniques -
● Scatter Plots
● Line Plots
● Bar Charts
● Histograms
● Box Plots
● Heatmaps
● Pair Plots
● Violin Plots
● Area Charts
● 3D Plots
Useful Python Libraries -
● Matplotlib
● Seaborn
● Plotly
● Pandas