
Course Name : Applied Machine Learning

Course Code : MTDS 5143


Course Mode : Modular
Schedule : 25-26 Nov, 2019 (ML)
02-03 Dec, 2019 (ML, DL)
09-10 Dec, 2019 (DL)
16-17 Dec, 2019 (Assessment)
ULearn : MTDS 5143 Applied Machine Learning
(All work should be submitted through ULearn platform)

Instructor : Dr. Noor Fazilla Binti Abd Yusof (elle@utem.edu.my)
           : Assoc. Prof. Ts. Dr. Choo Yun Huoy (huoy@utem.edu.my)

Subject Information : MTDS5143 Applied Machine Learning

Session 2019 / 2020 Semester I


Suggested Reference

Main References: Lab/Practical:


By

Assoc. Prof. Dr. Choo Yun Huoy


Department of Intelligent Computing & Analytics
Faculty of Information and Communication Technology
Univ. Teknikal Malaysia Melaka (UTeM)
76109 Durian Tunggal, Melaka, Malaysia.
huoy@utem.edu.my

MTDS 5143 APPLIED MACHINE LEARNING


§ From AI To Machine Learning
§ What Is Machine Learning?
§ Machine Learning Task
§ Using Data to Make Decisions
§ The Machine Learning Workflow
§ Summary
From AI To Machine Learning

§ AI is basically the intelligence: how we make machines intelligent, while machine learning is the implementation of the compute methods that support it.
§ AI is the science and machine learning is the algorithms that make the machines smarter.
§ “So the enabler for AI is machine learning”

Source: http://www.wired.co.uk/article/machine-learning-ai-explained
From AI To Machine Learning
Source: http://iot.ghost.io/is-it-all-machine-learning/
Where is Machine Learning?
What is Concept Learning?

§ The keyword of Machine Learning is the learning process itself.
§ Learning means learning a concept.
§ A concept describes a set of objects or events with similar characteristics.

Examples of concepts (figure labels): owl, butterflies, love, happy, trees, sad.
Iris Flower Concept

The Iris Flower Data Set
Famous database; from Fisher, 1936
https://archive.ics.uci.edu/ml/datasets/Iris

§ To predict the species of an Iris flower, we must first learn the concept of an Iris flower.

How do we recognise an Iris flower?
What Is Machine Learning?

§ Using the right features to build the right models that achieve the right tasks.

Figure labels: abstraction of mapping / relations; problem to be solved; object descriptor; the method used.
Machine Learning Vocabulary

§ Target: predicted category or value of the data (column to predict)
Data

The Iris Flower Data Set
Famous database; from Fisher, 1936
https://archive.ics.uci.edu/ml/datasets/Iris

§ Predict the type of iris plant based on the sepal and petal size (width and length).
Machine Learning Vocabulary

The Iris Flower Data Set

sepal length  sepal width  petal length  petal width  species
6.7           3.0          5.2           2.3          virginica
6.4           2.8          5.6           2.1          virginica
4.6           3.4          1.4           0.3          setosa
6.9           3.1          4.9           1.5          versicolor
4.4           2.9          1.4           0.2          setosa
4.8           3.0          1.4           0.1          setosa
5.9           3.0          5.1           1.8          virginica
5.4           3.9          1.3           0.4          setosa
4.9           3.0          1.4           0.2          setosa
5.4           3.4          1.7           0.2          setosa

Target: the species column.
Machine Learning Vocabulary

§ Features: properties of the data used for prediction (non-target columns)

In the Iris table, the features are the four measurement columns: sepal length, sepal width, petal length and petal width.
Machine Learning Vocabulary

§ Example: a single data point within the data (one row)

In the Iris table, each row is one example, such as 4.8, 3.0, 1.4, 0.1, setosa.
Machine Learning Vocabulary

§ Label: the target value for a single data point

In the Iris table, the label of the example 4.8, 3.0, 1.4, 0.1 is setosa.
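The four vocabulary terms can be made concrete with the ten Iris rows shown on the slides. The following is a minimal sketch in plain Python; the variable names (`COLUMNS`, `ROWS`, `X`, `y`) are illustrative choices and do not come from the course material.

```python
# The ten rows of the Iris table shown on the slides.
COLUMNS = ["sepal length", "sepal width", "petal length", "petal width", "species"]
ROWS = [
    (6.7, 3.0, 5.2, 2.3, "virginica"),
    (6.4, 2.8, 5.6, 2.1, "virginica"),
    (4.6, 3.4, 1.4, 0.3, "setosa"),
    (6.9, 3.1, 4.9, 1.5, "versicolor"),
    (4.4, 2.9, 1.4, 0.2, "setosa"),
    (4.8, 3.0, 1.4, 0.1, "setosa"),
    (5.9, 3.0, 5.1, 1.8, "virginica"),
    (5.4, 3.9, 1.3, 0.4, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (5.4, 3.4, 1.7, 0.2, "setosa"),
]

target_name = COLUMNS[-1]       # target: the column to predict
feature_names = COLUMNS[:-1]    # features: the non-target columns

X = [row[:-1] for row in ROWS]  # feature values, one tuple per example
y = [row[-1] for row in ROWS]   # labels, one per example

example = ROWS[5]               # example: a single row of the table
label = y[5]                    # label: the target value of that example

print("target:", target_name)
print("features:", feature_names)
print("example:", example, "-> label:", label)
```

Splitting the table into `X` and `y` mirrors the usual convention in machine learning libraries, where features and labels are passed to a learning algorithm separately.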
Machine Learning Task

Learning types: supervised, unsupervised, reinforcement.
Machine Learning Task

              Unsupervised            Supervised
Continuous    Clustering              Regression
Categorical   Association Analysis    Classification
The Machine Learning Tree

Source: https://vas3k.com/blog/machine_learning/
Classical ML vs Deep Learning
Classical Machine Learning

Source: https://becominghuman.ai/deep-learning-made-easy-with-deep-cognition-403fbe445351
Artificial Neural Network vs
Deep Learning Neural Network

Source: https://www.pnas.org/content/116/4/1074
Source: https://vas3k.com/blog/machine_learning/
Source: https://vas3k.com/blog/machine_learning/
Tree Based Learning

Source: https://vas3k.com/blog/machine_learning/
Function Based Learning

Source: https://vas3k.com/blog/machine_learning/
Probability Based Learning

Source: https://vas3k.com/blog/machine_learning/
Source: https://vas3k.com/blog/machine_learning/
Value Prediction

Source: https://vas3k.com/blog/machine_learning/
Source: https://vas3k.com/blog/machine_learning/
Distance-based Clustering

Source: https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
Mean-Shift Clustering

Mean-shift Clustering for a Single Sliding Window
The Entire Process of Mean-shift Clustering

Source: https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
Density-Based Spatial Clustering of
Applications with Noise (DBSCAN)

DBSCAN Smiley Face Clustering

Source: https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
Expectation–Maximization (EM)
Clustering using Gaussian Mixture
Models (GMM)

Source: https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
Agglomerative Hierarchical
Clustering

Source: https://towardsdatascience.com/the-5-clustering-algorithms-data-scientists-need-to-know-a36d136ef68
Source: https://vas3k.com/blog/machine_learning/
Component Analysis

Finding Principal Components

Source: http://www.ait.edu.gr/ait_web_site/faculty/apne/Face_Recognition.html
Source: https://vas3k.com/blog/machine_learning/
Source: https://vas3k.com/blog/machine_learning/
Association Rule Mining

Source: https://www.stratlytics.com/blog.php?id=9
Association Rule Mining
Machine Learning Task

§ Semi-supervised learning uses a small labelled training set to build an initial model, which is then refined using the unlabelled data.
§ Semi-supervised learning is used when constructing a labelled training set is a painstaking process.
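The two steps described above (build an initial model from a few labelled examples, then refine it with unlabelled data) can be sketched as one round of self-training. The slides do not name a specific algorithm, so the nearest-centroid classifier and the single refinement round used here are assumptions made for illustration. The points are (petal length, petal width) pairs taken from the Iris rows on the slides.

```python
import math

def centroid(points):
    """Mean point of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))

def fit_centroids(labelled):
    """Build a nearest-centroid model: one mean point per class."""
    by_class = {}
    for x, label in labelled:
        by_class.setdefault(label, []).append(x)
    return {label: centroid(points) for label, points in by_class.items()}

def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda label: math.dist(x, model[label]))

# Small labelled set and a larger unlabelled set.
labelled = [((1.4, 0.3), "setosa"), ((5.2, 2.3), "virginica")]
unlabelled = [(1.4, 0.2), (1.3, 0.4), (1.7, 0.2), (5.6, 2.1), (5.1, 1.8)]

# Step 1: initial model from the labelled examples only.
model = fit_centroids(labelled)

# Step 2: pseudo-label the unlabelled points, then refit on everything.
pseudo_labelled = [(x, predict(model, x)) for x in unlabelled]
model = fit_centroids(labelled + pseudo_labelled)

print(model)
```

In practice the refinement is usually repeated, and only confidently predicted points are added in each round; both refinements are omitted here to keep the sketch short.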
MODEL
The Output of Machine Learning

Geometric
• Models use intuitions from geometry such as separating (hyper)planes, linear transformations and distance metrics.
• SVM, Nearest Neighbour, PCA

Probabilistic
• Models view learning as a process of reducing uncertainty, modelled by means of probability distributions.
• Bayes model, likelihood ratio, Naive Bayes (NB)

Logical
• Models are defined in terms of easily interpretable logical expressions.
• Decision trees
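As an illustration of a geometric model, the sketch below implements a 1-nearest-neighbour classifier that relies only on a distance metric. The training rows are the ten Iris examples from the slides; the function name `nearest_neighbour` and the two query points are hypothetical.

```python
import math

# (sepal length, sepal width, petal length, petal width) -> species
TRAIN = [
    ((6.7, 3.0, 5.2, 2.3), "virginica"),
    ((6.4, 2.8, 5.6, 2.1), "virginica"),
    ((4.6, 3.4, 1.4, 0.3), "setosa"),
    ((6.9, 3.1, 4.9, 1.5), "versicolor"),
    ((4.4, 2.9, 1.4, 0.2), "setosa"),
    ((4.8, 3.0, 1.4, 0.1), "setosa"),
    ((5.9, 3.0, 5.1, 1.8), "virginica"),
    ((5.4, 3.9, 1.3, 0.4), "setosa"),
    ((4.9, 3.0, 1.4, 0.2), "setosa"),
    ((5.4, 3.4, 1.7, 0.2), "setosa"),
]

def nearest_neighbour(train, x):
    """Predict the label of the training example closest to x (Euclidean distance)."""
    _, label = min(train, key=lambda pair: math.dist(pair[0], x))
    return label

# Two made-up flowers to classify.
print(nearest_neighbour(TRAIN, (5.0, 3.3, 1.5, 0.2)))
print(nearest_neighbour(TRAIN, (6.5, 3.0, 5.5, 2.0)))
```

The model here is simply the stored training set plus the distance function, which is why nearest-neighbour methods are grouped with the geometric models.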
FEATURES
The Workhorses of Machine Learning
§ Features and models are intimately connected.
§ Features define the model. A single feature can be turned into a univariate model.
FEATURES
The Workhorses of Machine Learning
§ Features may interact in various ways. These interactions can be exploited, ignored, or pose a challenge.
§ Examples:
  § Covariance
  § Correlation coefficient
  § Empirical estimate of sample mean
  § Expectation operator such as population variance

Data exploration on different features is important to build a good model!
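The statistics listed above can be computed directly during data exploration. The sketch below, offered as one possible illustration, measures how two features interact using the sample covariance and the Pearson correlation coefficient. The values are the petal length and petal width columns of the ten Iris rows on the slides; the helper names are hypothetical.

```python
import math

petal_length = [5.2, 5.6, 1.4, 4.9, 1.4, 1.4, 5.1, 1.3, 1.4, 1.7]
petal_width = [2.3, 2.1, 0.3, 1.5, 0.2, 0.1, 1.8, 0.4, 0.2, 0.2]

def mean(values):
    """Empirical estimate of the mean."""
    return sum(values) / len(values)

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation coefficient: covariance scaled to the range [-1, 1]."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))

cov = covariance(petal_length, petal_width)
r = correlation(petal_length, petal_width)
print(f"covariance = {cov:.3f}, correlation = {r:.3f}")
```

A correlation close to 1 suggests the two features carry largely overlapping information, which a model may exploit or which may make one of them redundant.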
FEATURES
The Workhorses of Machine Learning
§ Feature construction and transformation are important to create a good model.
§ The kernel trick is used to modify the way the decision boundary is calculated.
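The identity behind the kernel trick can be checked numerically. The sketch below assumes a quadratic kernel k(x, y) = (x · y)^2 on two-dimensional inputs, chosen here for illustration since the slide does not name a kernel. It shows that evaluating the kernel in the original space gives the same value as a dot product in an explicitly transformed feature space, so the transformation never has to be computed.

```python
import math

def dot(a, b):
    """Dot product of two equal-length tuples."""
    return sum(p * q for p, q in zip(a, b))

def phi(x):
    """Explicit feature map for the quadratic kernel on 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def quadratic_kernel(x, y):
    """Same quantity, computed without leaving the original 2-D space."""
    return dot(x, y) ** 2

x, y = (1.0, 2.0), (3.0, 0.5)
explicit = dot(phi(x), phi(y))      # transform first, then take the dot product
implicit = quadratic_kernel(x, y)   # kernel trick: no transformation needed
print(explicit, implicit)
```

A linear decision boundary in the transformed space corresponds to a curved boundary in the original space, which is how the kernel changes the boundary a model such as an SVM can express.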
Using Data to Make Decision
The contents of this slide have been modified from its original version.

§ We make decisions every now and then...
§ Machine learning models can be used to assist decision making.

Example applications: spam filtering, web search, postal mail routing, movie recommendations, vehicle driver assistance, fraud detection, web advertisements, speech recognition, social networks.
Using Data to Make Decision

1. Converting business problems into analytics solutions
   i. What is the business problem? What are the goals that the business wants to achieve?
   ii. How does the business currently work?
   iii. In what way could a predictive analytics model help to address the business problem?

Case Study : Motor Insurance Fraud


Problem to Solutions

Case Study : Motor Insurance Fraud

In spite of having a fraud investigation team that investigates up to 30% of all claims made, a motor insurance company is still losing too much money due to fraudulent claims. The following predictive analytics solutions could be proposed to help address this business problem:
§ Claim prediction: predict the likelihood that a claim is fraudulent.
§ Member prediction: predict the propensity of a member to commit fraud in the near future.
§ Application prediction: predict the likelihood that a policy application will ultimately result in a fraudulent claim.
§ Payment prediction: predict the amount that will be paid out after investigation.
Using Data to Make Decision

2. Assessing Feasibility
i. The key objects in the company’s data model and the
data available regarding them.
ii. The connections that exist between key objects in the
data model.
iii. The granularity of the data that the business has
available.
iv. The volume of data involved.
v. The time horizon for which data is available.

Case Study : Motor Insurance Fraud


Problem to Solutions

Case Study : Motor Insurance Fraud

[Claim prediction]
§ Data Requirements:
A large collection of historical claims marked as fraudulent and non-fraudulent; the details of each claim; the related policy; and the related claimant.
§ Capacity Requirements:
Given that the insurance company already has a claims investigation team, the main requirements would be that a mechanism could be put in place to inform claims investigators that some claims were prioritized above others. This would also require that information about claims become available in a suitably timely manner so that the claims investigation process would not be delayed by the model.
Problem to Solutions

Case Study : Motor Insurance Fraud


[Member prediction] to predict the propensity of member to
commit fraud in the near future

§ Data Requirements:
____________________________________________________________
____________________________________________________________
____________________________________________________________

§ Capacity Requirements:
_________________________________________________________
_________________________________________________________
_________________________________________________________
Problem to Solutions

Case Study : Motor Insurance Fraud

[Member prediction]
§ Data Requirements:
A large collection of claims labelled as either fraudulent or non-fraudulent, with all relevant details; all claims and policies must be connectable to an identifiable member; historical data on recorded changes to a policy.
§ Capacity Requirements:
Assume the prediction is run every quarter to analyse the behavior of each customer. The company needs the capacity to advise members without damaging the customer relationship so badly as to lose the customer. Finally, there are possibly legal restrictions associated with making this kind of contact.
Using Data to Make Decision

3. Designing the Analytics Base Table
   i. Elicit the domain concept, subdomain concepts, and the features involved
   ii. Prediction subject details
   iii. Demographics
   iv. Usage (frequency, recency, monetary value)
   v. Changes in usage, and special usage
   vi. Lifecycle phase
   vii. Network links

Case Study : Motor Insurance Fraud


Problem to Solutions

Case Study : Motor Insurance Fraud


[Claim prediction]
§ Concepts elicitation:
Problem to Solutions

Case Study : Motor Insurance Fraud


[Claim prediction]
§ Analytic Base Table:
Using Data to Make Decision

4. Designing & Implementing Features


i. Data Availability : Data Type
Using Data to Make Decision

4. Designing & Implementing Features
   i. Data Availability : Features
      • Raw features
      • Derived features
         • Aggregates
         • Flags
         • Ratios
         • Mappings
   ii. Data Availability Timing : propensity
      • Observation period
      • Outcome period
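The four kinds of derived feature listed above can be sketched on a small set of raw claim records. All field names, values and band thresholds below are hypothetical and only serve to illustrate each kind; they are not taken from the case study data.

```python
# Raw features: hypothetical past claims of one claimant and the current claim.
past_claims = [
    {"amount_claimed": 2000.0, "amount_paid": 1500.0},
    {"amount_claimed": 500.0, "amount_paid": 500.0},
    {"amount_claimed": 7500.0, "amount_paid": 0.0},
]
current_claim = {"amount_claimed": 4000.0, "injury_type": "soft tissue"}

# Aggregates: summarise a group of raw records into one value.
num_claims = len(past_claims)
total_claimed = sum(c["amount_claimed"] for c in past_claims)
total_paid = sum(c["amount_paid"] for c in past_claims)

# Flag: a binary indicator of the presence of some characteristic.
is_soft_tissue = current_claim["injury_type"] == "soft tissue"

# Ratio: the relationship between two raw or aggregated values.
paid_ratio = total_paid / total_claimed

# Mapping: convert a continuous value into a categorical one.
def amount_band(amount):
    """Assumed thresholds, for illustration only."""
    if amount < 1000:
        return "low"
    if amount < 5000:
        return "medium"
    return "high"

derived = {
    "num_claims": num_claims,
    "total_claimed": total_claimed,
    "is_soft_tissue": is_soft_tissue,
    "paid_ratio": paid_ratio,
    "amount_band": amount_band(current_claim["amount_claimed"]),
}
print(derived)
```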
Using Data to Make Decision

4. Designing & Implementing Features
   iii. Feature Longevity
   iv. Legal Issues
      • Collection limitation principle
      • Purpose specification principle
      • Use limitation principle
   v. Data Manipulation
      • Joining data sources
      • Filtering rows and fields in a data source
      • Combining or transforming features
      • Aggregating data sources
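The four data manipulation operations listed under item v can be sketched with plain Python data structures. The two data sources (`policies` and `claims`), their fields and their values are hypothetical; in practice these steps are more often done in SQL or with a data frame library.

```python
# Two hypothetical data sources.
policies = {
    "P1": {"member": "M1", "premium": 600.0},
    "P2": {"member": "M2", "premium": 400.0},
}
claims = [
    {"claim_id": "C1", "policy": "P1", "amount": 1200.0, "status": "closed"},
    {"claim_id": "C2", "policy": "P1", "amount": 300.0, "status": "closed"},
    {"claim_id": "C3", "policy": "P2", "amount": 5000.0, "status": "open"},
    {"claim_id": "C4", "policy": "P2", "amount": 800.0, "status": "closed"},
]

# Joining data sources: attach the policy details to each claim.
joined = [{**claim, **policies[claim["policy"]]} for claim in claims]

# Filtering rows and fields: keep closed claims and only the fields needed.
closed = [
    {"member": row["member"], "amount": row["amount"], "premium": row["premium"]}
    for row in joined
    if row["status"] == "closed"
]

# Combining or transforming features: derive a new field from two others.
for row in closed:
    row["amount_to_premium"] = row["amount"] / row["premium"]

# Aggregating data sources: total closed claim amount per member.
total_per_member = {}
for row in closed:
    total_per_member[row["member"]] = total_per_member.get(row["member"], 0.0) + row["amount"]

print(total_per_member)
```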
Problem to Solutions

Case Study : Motor Insurance Fraud

[Claim prediction]
What are the observation period and outcome period for the motor insurance claim prediction scenario?
§ The observation period and outcome period are measured over different dates for each insurance claim, defined relative to the specific date of that claim.
§ The observation period is the time prior to the claim event, over which the descriptive features capturing the claimant’s behavior are calculated.
§ The outcome period is the time immediately after the claim event, during which it will emerge whether the claim is fraudulent or genuine.
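Because both periods are defined relative to each claim's own date, they can be computed per claim. The sketch below assumes a 365-day observation period and a 90-day outcome period; these lengths, the function names and the example dates are hypothetical and are not given in the case study.

```python
from datetime import date, timedelta

OBSERVATION_DAYS = 365  # assumed length of the observation period
OUTCOME_DAYS = 90       # assumed length of the outcome period

def periods(claim_date):
    """Return (observation, outcome) periods as (start, end) date pairs."""
    observation = (claim_date - timedelta(days=OBSERVATION_DAYS), claim_date)
    outcome = (claim_date, claim_date + timedelta(days=OUTCOME_DAYS))
    return observation, outcome

def claims_in_observation_period(claim_date, earlier_claim_dates):
    """A descriptive feature: number of earlier claims inside the observation period."""
    (start, end), _ = periods(claim_date)
    return sum(1 for d in earlier_claim_dates if start <= d < end)

claim_date = date(2019, 6, 1)
earlier = [date(2018, 3, 1), date(2018, 9, 15), date(2019, 2, 1)]

observation, outcome = periods(claim_date)
print("observation period:", observation)
print("outcome period:", outcome)
print("earlier claims in observation period:",
      claims_in_observation_period(claim_date, earlier))
```

Descriptive features are computed only from data inside the observation period, while the target (fraudulent or genuine) is read from what becomes known during the outcome period.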
Problem to Solutions

Case Study : Motor Insurance Fraud [Claim prediction]


Problem to Solutions

Case Study : Motor Insurance Fraud [Claim prediction]

What are the features for the Claim Type subdomain?
Problem to Solutions

Case Study : Motor Insurance Fraud [Claim prediction]


The Analytics Base Table

§ The table contains more descriptive features.
§ The table shows the first four instances.
§ If we examine the table closely, we see a number of strange values (e.g., -9 999) and a number of missing values.
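A first check for strange and missing values can be automated. The sketch below counts both per feature. The rows, the feature names and the sentinel value -9999 (one possible reading of the "-9 999" on the slide) are assumptions for illustration; the actual analytics base table is not reproduced in this text.

```python
SENTINEL = -9999  # assumed placeholder value standing in for "unknown"

# Hypothetical analytics base table rows; None marks a missing value.
abt = [
    {"income": 0, "num_claimants": 2, "claim_amount": 1625},
    {"income": SENTINEL, "num_claimants": 2, "claim_amount": 15028},
    {"income": 54613, "num_claimants": None, "claim_amount": SENTINEL},
    {"income": None, "num_claimants": 1, "claim_amount": 5097},
]

def quality_report(rows):
    """Count missing and sentinel values for every feature."""
    report = {}
    for row in rows:
        for name, value in row.items():
            counts = report.setdefault(name, {"missing": 0, "sentinel": 0})
            if value is None:
                counts["missing"] += 1
            elif value == SENTINEL:
                counts["sentinel"] += 1
    return report

for feature, counts in quality_report(abt).items():
    print(feature, counts)
```

Whether a sentinel should be treated as missing, corrected, or kept depends on what it means in the source system, which is a question for the business rather than for the code.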
Machine Learning Workflow

[Figure: workflow diagram; recoverable label: ML Algorithm]
Machine Learning Experiment

Evaluate a particular model on one or more data sets, then use the measurements to answer the questions related to the formulated problem.
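A very small experiment of this kind is sketched below: a 1-nearest-neighbour model is evaluated with leave-one-out testing, and accuracy is the measurement. The choice of model, evaluation scheme and metric are assumptions made for illustration. The data are the petal length, petal width and species values of the ten Iris rows on the slides.

```python
import math

# (petal length, petal width, species)
DATA = [
    (5.2, 2.3, "virginica"), (5.6, 2.1, "virginica"), (1.4, 0.3, "setosa"),
    (4.9, 1.5, "versicolor"), (1.4, 0.2, "setosa"), (1.4, 0.1, "setosa"),
    (5.1, 1.8, "virginica"), (1.3, 0.4, "setosa"), (1.4, 0.2, "setosa"),
    (1.7, 0.2, "setosa"),
]

def predict_1nn(train, point):
    """Label of the training row whose two features are closest to point."""
    nearest = min(train, key=lambda row: math.dist(row[:2], point))
    return nearest[2]

def leave_one_out_accuracy(data):
    """Hold out each row in turn, train on the rest, and score the prediction."""
    correct = 0
    for i, (length, width, species) in enumerate(data):
        train = data[:i] + data[i + 1:]
        if predict_1nn(train, (length, width)) == species:
            correct += 1
    return correct / len(data)

accuracy = leave_one_out_accuracy(DATA)
print(f"leave-one-out accuracy: {accuracy:.2f}")
```

With so few rows the number itself means little; for instance, the only versicolor row cannot be predicted correctly when it is held out, because no other versicolor example remains in the training data. The point is the procedure: measure on data the model did not see, then relate the measurement back to the question being asked.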
Summary

§ Machine learning is to use the right features to build the right models to achieve the right tasks.
§ Tasks are addressed by models, whereas learning problems are solved by learning algorithms that produce models.
§ Predictive data analytics models built using machine learning techniques are tools that we can use to help make better decisions.
§ It is important to fully understand the business problem that a model is being constructed to address, and the goal behind it.
Primary Reference:
Kelleher, J. D., Mac Namee, B., and D'Arcy, A. (2015), Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies, MIT Press.
Flach, P. (2012), Machine Learning: The Art and Science of Algorithms that Make Sense of Data, Cambridge University Press.
