Machine Learning

Machine Learning Series
Machine Learning with Python 3
By Dr. Nickholas Anting
Learning Outcome
At the end of this section, attendees will be able to:
• Understand the fundamental concepts of machine learning algorithms in data science applications.
• Train ML algorithms, including regression, classification and clustering algorithms, to build predictive models.
• Evaluate the performance of each ML model.

Machine Learning Series
Part 1
Introduction to Machine Learning

Concepts of Machine Learning
An algorithm is trained on historical data to generate a machine learning model, which is then used to predict future outcomes.
[Diagram: Historical Data → Training (ML Algorithm) → Produce → Model]
Conventional Processing
• The programmer specifies a set of rules that are pre-defined from existing conditions.
Copyright by Dr. Nickholas 5

Processing with Machine Learning
• The algorithm learns from data to define the rules of the generated model.
• Example: the algorithm learns from purchase history data to understand the user's purchasing behavior.
[Diagram: Historical Transactional Data → Train ML Algorithm → GENERATE → ML Model]
[Diagram: NEW_INPUT → Apply ML Model → Fraud transaction? NO → Complete payment transaction process; YES → Cancel payment]

What Can Machine Learning Do?
• Forecasting (for example, Price against Size)
• Customer prediction: who will buy more?
• Fraud Detection
• Process Workflow and Engineering Design Optimization
• Recommendation System (Watched Movies → Machine Learning → Recommended to watch)
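Types of Machine Learning
SUPERVISED LEARNING
• Samples in the dataset are labelled.
• Example algorithms: Linear Regression, Support Vector Machine.

Car Price Dataset (Price is the label)
HP    MetColor    FuelType    Price
90    1           Diesel      13500
90    1           Diesel      13750
90    1           Diesel      13950
[Diagram: Car Price Dataset and New Inputs → Linear Regression / Support Vector Machine]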

UNSUPERVISED LEARNING
• The dataset comprises non-labelled samples.
• Example algorithms: K-Means, Apriori.
[Diagram: Customer Behaviour → Clustering Algorithm]

REINFORCEMENT LEARNING
• A reward system: instead of minimizing an error, the model maximizes an objective function.
• The model becomes smarter through rewards.
• Example models: Neural Network, Deep Neural Network.

Machine Learning Algorithm
Supervised Learning
Regression Model
• Predicts a numerical output/response (for example, a car's price).
• Has a response variable.
• Example: linear regression using the OLS algorithm.
Classification Model
• Predicts the output in the form of a class (Buy or Not Buy).
• Examples: Logistic Regression, Naive Bayes, ...
Unsupervised Learning
Clustering Model
• Predicts patterns.
• No exact response.
• Examples: K-Means, DBSCAN, ...
How Do I Start with Machine Learning?
[Workflow: Understanding the requirements → Prepare Data → Algorithm Selection → Train the Algorithm → Performance Evaluation → OK → Saved Model]
If the performance is NOT OK, go back and try:
• New parameters
• New algorithm

Building the Machine Learning Model – The Workflow
Exploration & Data Preparation
• Import dataset into repository
• Data Exploration & Observation
• Data Cleaning
Data Pre-Processing
• Feature Selection
• Handling Categorical Variables
• Assigned Input & Output Attributes
• Features/Inputs Scaling
Train & Apply Model
• Partitioning: Train – 80%, Test – 20%
• Training Algorithm
• ML Model
• Apply model
Model Evaluation
• Performance

Things Needed for Machine Learning
Data
Training algorithm
Programming Tools

APPLICATION OF MACHINE LEARNING
Machine learning models are applied for predictive analytics, which is in turn utilized in prescriptive modelling.
• COMPUTATIONAL FINANCE: credit scoring and algorithmic trading.
• IMAGE PROCESSING: face recognition, motion & object detection.
• COMPUTATIONAL BIOLOGY: drug discovery, DNA sequencing, etc.
• ENERGY PRODUCTION: price & load forecasting.
• MANUFACTURING & PRODUCTION: predictive maintenance.
• NATURAL LANGUAGE PROCESSING: voice recognition, text processing.

Python Packages for Machine Learning

Data Science Professional Certification
Part 4
Linear Regression Model
LINEAR REGRESSION MODEL
• A linear approximation of the relationship between two or more variables.
• It mathematically models the relationship between the dependent and independent variables.
• The output responses are NUMERIC values.
[Figure: Price against Size with a fitted line]
Simple Linear Regression
• A simple relationship between one input variable and one output variable.
• Only one input is used to predict the output.
• The y-axis represents the dependent variable, while the x-axis represents the independent variable.
• Each data point is an observed value plotted in the graph of y against x.
• The line is drawn based on the regression equation.
Ordinary Least Squares (OLS)
• The most common method for estimating the linear regression equation.
• "Least squares" refers to minimizing the squared error.
• A lower error results in better explanatory power of the regression model.
• The method aims to find the line that minimizes the sum of the squared errors.
• Many lines can be drawn to fit the data; OLS determines the one with the smallest error.
• This best-fitted line is the one that is closest to all the data points.
[Figure: data points of y against x with the best-fitted line, the one with the smallest error]
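The idea of picking the line with the smallest sum of squared errors can be sketched with the closed-form solution for a single input. This is a minimal illustration, not the course's own code: the function names and the five data points are hypothetical, and NumPy is assumed to be installed.

```python
import numpy as np


def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_centered = x - x.mean()
    # Closed-form OLS estimate for one input variable.
    slope = np.sum(x_centered * (y - y.mean())) / np.sum(x_centered ** 2)
    intercept = y.mean() - slope * x.mean()
    return float(intercept), float(slope)


def sum_squared_error(x, y, intercept, slope):
    """Sum of squared differences between observed y and the line's prediction."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    residuals = y - (intercept + slope * x)
    return float(np.sum(residuals ** 2))


# Hypothetical observations (not taken from real_estate.csv).
size = [1.0, 2.0, 3.0, 4.0, 5.0]
price = [2.1, 3.9, 6.2, 7.8, 10.1]

intercept, slope = fit_simple_ols(size, price)
best_sse = sum_squared_error(size, price, intercept, slope)
# Any other line, for example a slightly shifted one, has a larger error.
other_sse = sum_squared_error(size, price, intercept + 0.5, slope - 0.1)
```

Comparing `best_sse` with `other_sse` shows the OLS line beating one alternative line on these points; the same holds for any other choice of intercept and slope.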
Performance Metrics: R-Squared (R²)
• A measure widely used to describe how powerful/good the regression model is.
• R-Squared measures the goodness of fit of the regression model.
• The values of R-Squared range from 0 (0%) to 1 (100%).
• R² = 1 or 100% means that the model explains the entire variability of the data.
• R² = 0 or 0% means that the model explains none of the variability of the data.
• Observed values of R² usually range between 0.2 (20%) and 0.9 (90%).
[Figure: R² scale from 0 to 1 with an example of R² = 0.70, and three plots of y against x]
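As a rough sketch of how R-Squared is computed, the function below uses the usual definition of one minus the ratio of the residual sum of squares to the total sum of squares. The function name and the sample values are hypothetical.

```python
def r_squared(actual, predicted):
    """R-Squared = 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_residual = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_residual / ss_total


# Hypothetical observed values.
actual = [2.0, 4.0, 6.0, 8.0]

# A model that reproduces every observation explains all the variability.
perfect_fit = r_squared(actual, actual)
# A model that always predicts the mean explains none of it.
mean_only = r_squared(actual, [5.0, 5.0, 5.0, 5.0])
```

Note that with this definition the value can drop below 0 when a model is scored on data it was not fitted on; the 0 to 1 range applies to an OLS model with an intercept evaluated on its own training data.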
Hands – On 2
Create a machine learning model to predict the Price based on Size.
Input/Feature/Predictor – Size
Output/Response – Price
Dataset: real_estate.csv
[Figure: Price (y) against Size (x)]
Understanding the requirement & dataset
• Build an ML model to predict the house price (Price) based on Size.
• The response or output variable is price.
• price is a numerical output.

Machine Learning Workflow
GENERAL PROCESS OF MACHINE LEARNING MODEL
[Flowchart: DATA PREPARATION → DATA TRANSFORMATION → PARTITIONING into Train Set and Test Set. Train Set → TRAIN ALGORITHM → Generate ML Model. Test Set → APPLY MODEL → PERFORMANCE METRICS (MODEL EVALUATION). Performance OK? YES → SAVE MODEL; NO → loop back.]
DATA PREPARATION
A process of importing, overviewing and cleansing the raw dataset.
Import dataset → Data exploration → Data cleaning
Import Dataset
CSV, Excel or Text file → Import → Pandas data frame

Data Exploration
• Descriptive Statistics
• Missing Values
• Duplicated Rows
• Checking & Treating Outliers
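The exploration checks listed above might look like the following in pandas. This is a sketch under assumptions: the column names `size` and `price` follow the hands-on description, but the rows are invented for illustration and are not taken from real_estate.csv.

```python
import pandas as pd

# Hypothetical rows; a real session would load the file with pd.read_csv instead.
df = pd.DataFrame({
    "size": [640.0, 720.0, 720.0, 810.0, None, 905.0],
    "price": [230000.0, 251000.0, 251000.0, 289000.0, 301000.0, 330000.0],
})

# Descriptive statistics (count, mean, std, min, quartiles, max) per column.
summary = df.describe()

# Number of missing values in each column.
missing_per_column = df.isna().sum()

# Number of rows that exactly repeat an earlier row.
duplicate_rows = int(df.duplicated().sum())

# One possible cleaning choice: drop repeated rows, then rows with missing values.
cleaned = df.drop_duplicates().dropna().reset_index(drop=True)
```

Dropping rows is only one treatment; filling missing values or capping outliers are alternatives that depend on the dataset.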
DATA PRE-PROCESSING
A series of final steps to get the dataset ready for further processing.
Handling categorical variables (if any) → Assign independent and dependent variables → Features selection → Feature scaling → Dataset partitioning
Assigned Input & Output Attributes
• Assign the output (response) and input (features) variables.
INPUT, x OUTPUT, y
B. Partitioning Dataset
• The train set is used to train the algorithm.
• The test set is used to validate the model.
• Use random state = 0.
Overall Data: 100% (100) → Partitioning/Splitting → Train Data: 80% (80) and Test Data: 20% (20)
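An 80/20 split with a fixed random state could be sketched as follows. The slides do not say which function performs the split; scikit-learn's `train_test_split` is assumed here, and the 100 rows are synthetic.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a 100-row dataset: one input column and one output.
x = np.arange(100, dtype=float).reshape(-1, 1)
y = 3.0 * x.ravel() + 10.0

# test_size=0.2 keeps 20% of the rows for testing; random_state=0 makes the
# shuffle reproducible, so the same rows land in each set on every run.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.2, random_state=0
)
```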
Now the Data is Ready
TRAINING ALGORITHM
• Train the OLS method from the statsmodels package.
Use data from Train Set → Train OLS Algorithm for MLR → Generate MLR Model

MODEL PERFORMANCE VALIDATION
Method A
R-Squared
Method B
Correlation between Predicted Output and Actual Output using Test set
Use data from Test Set → Apply MLR Model → Predict → Predicted Output
MULTIPLE LINEAR REGRESSION
Two or more independent variables are used to predict the value of the dependent variable.
It can provide a better model and address higher complexity in the problem.
The more variables, the more factors the model considers.
It is no longer about fitting a line in 2D, so the model cannot be visualized with a simple graph.
IT IS ABOUT THE BEST FITTING MODEL
Example – House Pricing
• The price of a house could depend on more than one factor.
• Those factors (independent variables) include, for example, the area and the location.
• With more than one independent variable, this is considered Multiple Linear Regression.
[Figure: House Pricing against Area and Year (2008, 2010, ...)]
R² > Adjusted R²
R-Squared
• Measures how much of the total variability is explained by our model.
• Adding a variable never lowers R-Squared, so a multiple regression always scores at least as high as a simple regression on the same data, and the additional variable may increase explanatory power.
Adjusted R-Squared
• Penalizes excessive use of variables.
• Compares the explanatory power of regression models that contain different numbers of predictors.
Example: adding a new variable (Eqn. 1 → Eqn. 2)
                              Eqn. 1    Eqn. 2
R-Squared                     0.406     0.407
Adjusted R-Squared            0.399     0.392
p-value(s) of the variables   0.000     0.000, 0.762
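The penalty for extra variables can be sketched with the standard Adjusted R-Squared formula. The R-Squared values 0.406 and 0.407 come from the example; the sample size of 84 is an assumption made for this sketch (the slides do not state it), chosen because it approximately reproduces the reported adjusted values.

```python
def adjusted_r_squared(r_squared, n_samples, n_predictors):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r_squared) * (n_samples - 1) / (n_samples - n_predictors - 1)


# ASSUMPTION: the sample size is not given in the slides; 84 is illustrative.
n_samples = 84

adjusted_one_variable = adjusted_r_squared(0.406, n_samples, n_predictors=1)
adjusted_two_variables = adjusted_r_squared(0.407, n_samples, n_predictors=2)
```

Although R-Squared rises slightly from 0.406 to 0.407, the adjusted value falls, which matches the high p-value (0.762) of the added variable in the example.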
Hands – On 2
Create a machine learning model to predict the Price based on Size and Year.
Input/Feature/Predictor – Size, Year
Output/Response – Price
Dataset: real_estate.csv
FEATURES/INPUTS SCALING
Inputs: size, year → Scaling
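The slides do not state which scaling method is used. As one common option, the sketch below standardizes each input column to zero mean and unit standard deviation; the size and year values are hypothetical.

```python
import numpy as np

# Hypothetical rows: column 0 is size, column 1 is year.
features = np.array([
    [650.0, 2006.0],
    [800.0, 2009.0],
    [1200.0, 2015.0],
    [1500.0, 2018.0],
])

# Standardization (z-score): subtract the column mean, divide by the column std.
column_mean = features.mean(axis=0)
column_std = features.std(axis=0)
scaled = (features - column_mean) / column_std
```

After scaling, size and year are on comparable ranges, so neither dominates because of its units. In a train/test workflow the mean and standard deviation are normally computed on the train set only and then reused for the test set.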
TRAINING ALGORITHM
• Train the OLS method from the statsmodels package.
Use data from Train Set → Train OLS Algorithm for MLR → Generate MLR Model

APPLY MODEL USING TEST SET
Use data from Test Set → Apply MLR Model → Predict → Predicted Output
[Figure: model using size only compared with model using size & year]
• Adding year has improved the predictive power.
Hands – On 3: Data with Categorical Variable
Create a machine learning model to predict the GPA score based on SAT score and Attendance.
Input/Feature/Predictor – SAT, Attendance
Output/Response – GPA
Handling Categorical Variable
Transform into Dummy Variable
• A dummy variable is used to include categorical data in the model.
• It transforms non-numeric or categorical data into numeric form.
• NOMINAL to NUMERICAL – use a one-to-many mapping: one categorical column becomes one indicator column per category.
Attendance    Attendance_Yes    Attendance_No
Yes           1                 0
No            0                 1
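In pandas, the transformation into dummy variables could be sketched with `get_dummies`. The column names follow the hands-on description; the four rows are hypothetical.

```python
import pandas as pd

# Hypothetical rows with one numeric and one categorical input.
df = pd.DataFrame({
    "SAT": [1714, 1664, 1760, 1685],
    "Attendance": ["No", "Yes", "No", "Yes"],
})

# One indicator column per category; dtype=int gives 0/1 instead of booleans.
encoded = pd.get_dummies(df, columns=["Attendance"], dtype=int)
```

The result has the columns `SAT`, `Attendance_No` and `Attendance_Yes`. Because the two indicator columns always sum to 1, a regression with an intercept usually keeps only one of them, which `drop_first=True` does.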
Data Science Professional Certification
Logistic Regression Model

• Classification – the process of categorizing the output into categorical classes.
• Classifier – a classification algorithm that is trained on data to predict the output classes.
Examples of output classes: BUY or NOT BUY; STAY or CHURN; TYPE A, TYPE B or TYPE C
Classification vs Regression
• Regression Problem: continuous numerical output
• Classification Problem: categorical output

Examples of Classification Problem
• Churn Prediction: STAY or CHURN
• Fraud Detection: LEGIT or FRAUD
• Customer Decision: BUY or NOT BUY

Classification Algorithm
• Logistic Regression
• Naive Bayes
• K-Nearest Neighbors
• Support Vector Machine
• Decision Tree
• Random Forest
LOGISTIC REGRESSION MODEL
• Logistic Regression is a classification algorithm used to assign observations to a discrete set of classes.
• It is a supervised classification algorithm: it builds a regression model to predict the probability that an event occurs.
• It produces a binary result that is used to predict the outcome of a categorical dependent variable.
[Diagram: Inputs → Logistic Regression Model → Probability → p >= 0.5? Yes → Class 1; No → Class 2 → Output]
Linear vs Logistic Regression
Model
• Linear Regression: data is modelled using a straight line.
• Logistic Regression: the probability of an event is modelled by applying the sigmoid function to a linear combination of the predictor variables.
Output Type
• Linear Regression: continuous numeric variable.
• Logistic Regression: categorical variable.
Prediction
• Linear Regression: value of the variable.
• Logistic Regression: probability that the event occurred.
Accuracy & Goodness of Fit
• Linear Regression: R-Squared, Adjusted R-Squared.
• Logistic Regression: Accuracy, Precision, Recall.
Logistic Regression Curve
[Figure: sigmoid curve for inputs from -6 to 6, output from 0 to 1, with the threshold at 0.5]
• p ≥ 0.5 → class 1
• p < 0.5 → class 0
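The curve and its threshold can be sketched in a few lines. The function names are hypothetical; `z` stands for the linear combination of the inputs that the model computes.

```python
import math


def sigmoid(z):
    """Map any real number z to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(z, threshold=0.5):
    """Return class 1 when the probability reaches the threshold, else class 0."""
    return 1 if sigmoid(z) >= threshold else 0


probability_at_zero = sigmoid(0.0)   # the midpoint of the curve
low_class = classify(-6.0)           # far left of the curve
high_class = classify(6.0)           # far right of the curve
```

At z = 0 the probability is exactly 0.5, so that input falls into class 1 under the `>=` rule shown on the slide.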
Model Accuracy/Performance Assessment
The Confusion Matrix
A table used to describe the performance of a classification model, such as Logistic Regression.

            Predicted 1    Predicted 0
Actual 1    69             5
Actual 0    4              90

The model did its job well:
• For 69 observations, the model correctly predicted 1 and the actual value was 1.
• For 90 observations, the model correctly predicted 0 and the actual value was 0.
The model was confused:
• For 4 observations, the model predicted 1 but the actual value was 0.
• For 5 observations, the model predicted 0 but the actual value was 1.
Model Accuracy
A measure for evaluating the accuracy of the logistic regression model using the confusion matrix.

            Predicted 1    Predicted 0
Actual 1    TP = 69        FN = 5
Actual 0    FP = 4         TN = 90

• Overall, the logistic model predicts the output with about 94.6% accuracy: (69 + 90) correct out of 168 observations.
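The accuracy figure follows directly from the four cells of the confusion matrix, as this short sketch shows (the variable names are illustrative):

```python
# Cells of the confusion matrix from the slide.
true_positive, false_negative = 69, 5
false_positive, true_negative = 4, 90

total = true_positive + false_negative + false_positive + true_negative
# Accuracy = correct predictions / all predictions.
accuracy = (true_positive + true_negative) / total
```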
Precision
• A metric that measures the proportion of positive predictions that are correct.
• It tells how well the model correctly predicts the positive class.
• It answers questions such as "What is the chance that the outcome is actually positive when the model predicts a positive result?"

            Predicted 1    Predicted 0
Actual 1    TP = 69        FN = 5
Actual 0    FP = 4         TN = 90
Recall
• The ratio of correctly classified positive samples to all actual positive samples.
• It answers the question "What proportion of the actual positive class was identified correctly?"

            Predicted 1    Predicted 0
Actual 1    TP = 69        FN = 5
Actual 0    FP = 4         TN = 90
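Using the standard definitions, precision and recall for the confusion matrix above can be sketched as follows (the variable names are illustrative):

```python
# Cells of the confusion matrix from the slide.
true_positive, false_negative = 69, 5
false_positive, true_negative = 4, 90

# Precision: of everything predicted positive, how much was actually positive?
precision = true_positive / (true_positive + false_positive)

# Recall: of everything actually positive, how much did the model find?
recall = true_positive / (true_positive + false_negative)
```

Here precision is 69 / 73 and recall is 69 / 74, so both are a little below the overall accuracy.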
Hands – On 4
Build a machine learning model using the Logistic Regression algorithm to predict the admission status, Admitted (Yes or No), of a student application to a higher learning institution based on SAT score.
Data File: Admittance.csv
HANDS-ON 5
Build a machine learning model using the Logistic Regression algorithm to predict the Exited status (Churn or Stay) of the customers of Bank A.
Data File: bank_churn.csv
Machine Learning Process
[Flowchart: DATA PREPARATION & PRE-PROCESSING → Train Set and Test Set. Train Set → TRAIN ALGORITHM → Generate ML Model. Test Set → APPLY MODEL → MODEL ASSESSMENT. Performance OK? YES → SAVE MODEL; NO → loop back.]

Partitioning Dataset
• An imbalanced class distribution in the response variable will cause bias.
• Apply a stratified sampling strategy to distribute the rows into the train and test sets.
• Stratified sampling distributes the data points according to the proportion of the classes.

Total rows = 10,000
                   Stay    Churn
Overall            7963    2037
Train Set – 80%    6370    1630
Test Set – 20%     1593    407
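A stratified 80/20 split with the class counts from the slide could be sketched with scikit-learn's `train_test_split` and its `stratify` argument. This is an assumption about tooling (the slides name sklearn only for training), and the labels below are generated to match the stated counts rather than read from bank_churn.csv.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Labels matching the class counts on the slide: 7963 Stay and 2037 Churn.
labels = np.array(["Stay"] * 7963 + ["Churn"] * 2037)
row_ids = np.arange(labels.size)

# stratify=labels keeps the Stay/Churn proportion the same in both sets.
train_ids, test_ids, y_train, y_test = train_test_split(
    row_ids, labels, test_size=0.2, stratify=labels, random_state=0
)

train_churn = int(np.sum(y_train == "Churn"))
test_churn = int(np.sum(y_test == "Churn"))
```

The churn share stays at roughly 20% in both the train and the test set, in line with the table above.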
Training Algorithm
• Train Logistic Regression from the sklearn package.
Use data from Train Set → Train Logistic Regression Algorithm → Generate Classification Model
Performance & Accuracy
                    Predicted Churn    Predicted Stay    Class Recall    F-Score
Actual Churn        72                 335               18%             27%
Actual Stay         57                 1536              96%             89%
Class Precision     56%                82%
Overall Accuracy    80.4%
• The overall accuracy is 80.4%. However, this indicator is not sufficient on its own because the output classes are not balanced.
• 82% of the samples predicted as Stay are actually Stay. The model performs well for the Stay class.
• Only 56% of the samples predicted as Churn are actually Churn.
• Only 18% of the actual Churn samples are predicted correctly.
• 96% of the samples that are actually Stay are predicted correctly. This is high.
• The F-Score is the harmonic mean of precision and recall. Based on it, the model performs better at predicting the Stay class.
• Apply cross-validation and the SMOTE algorithm, and try to balance the data, to get a better model.
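The per-class figures in the performance table can be reproduced from the four confusion-matrix counts. The sketch below uses the standard precision, recall and F-Score definitions; the function name is hypothetical.

```python
def class_metrics(true_positive, false_positive, false_negative):
    """Return (precision, recall, f_score) for one class."""
    precision = true_positive / (true_positive + false_positive)
    recall = true_positive / (true_positive + false_negative)
    # F-Score: harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Counts from the table: rows are actual classes, columns are predicted classes.
churn_as_churn, churn_as_stay = 72, 335
stay_as_churn, stay_as_stay = 57, 1536

# For the Churn class, "Stay predicted as Churn" counts as a false positive.
churn_precision, churn_recall, churn_f = class_metrics(
    churn_as_churn, stay_as_churn, churn_as_stay
)
# For the Stay class the roles of the off-diagonal cells are swapped.
stay_precision, stay_recall, stay_f = class_metrics(
    stay_as_stay, churn_as_stay, stay_as_churn
)

total = churn_as_churn + churn_as_stay + stay_as_churn + stay_as_stay
overall_accuracy = (churn_as_churn + stay_as_stay) / total
```

Rounded to whole percentages, the results agree with the table: the high overall accuracy is driven by the Stay class, while most actual Churn samples are missed.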
