
Introduction to Machine Learning

zamriosman@ump.edu.my
AI EVOLUTION

1950  Alan Turing develops the Turing test
1968  The film 2001: A Space Odyssey is released
1997  IBM Deep Blue computer
2011  Apple introduces Siri
2014  Amazon’s Echo is released
2016  Microsoft Tay is released
2018  Cambridge Analytica scandal becomes public

COGNITIVE SOFTWARE

Machine Learning: A method of data analysis that automates analytical model building.

Speech Recognition: The ability of a machine or program to identify words spoken aloud and convert them to readable text.

Computer Vision: Enables computers and systems to derive meaningful information from images and videos.

Natural Language Processing: Analyses human language to understand the meaning of words and how they are understood in context.

Rule-based system: The simplest form of artificial intelligence; it uses prescribed knowledge-based rules.

HIERARCHY OF AI, ML AND DL

AI > Machine Learning > Deep Learning

Machine Learning is a fundamental part of Artificial Intelligence that provides computers with the ability to learn without being explicitly programmed.

Deep Learning is a class of machine learning algorithms that imitate the working of the human brain in processing data and creating patterns. It is effective in performing complicated analysis of unstructured data, such as speech and image recognition.

MACHINE LEARNING
Machine learning is the science (and art) of programming computers so they can learn from data.

Slightly more general definition:

[Machine Learning is the] field of study that gives computers the ability to learn without
being explicitly programmed.
-- Arthur Samuel, 1959

More engineering-oriented one:

A computer program is said to learn from experience E with respect to some task T and
some performance measure P, if its performance on T, as measured by P, improves with
experience E.
-- Tom Mitchell, 1997

MACHINE LEARNING
Why use Machine Learning?
Consider how you would write a spam filter using traditional programming techniques (Figure):

1. First you would consider what spam typically looks like: phrases such as “4U”, “credit card”, “free”, and “amazing”, and perhaps some patterns in the sender’s email address, the email’s body, or other parts.
2. You would write a detection algorithm for each of these patterns, and your program would flag an email if a number of these patterns were detected.
3. You would test your program and repeat steps 1 and 2 until it is good enough.

So, what will your program face later?
A long list of complex rules that is pretty hard to maintain.
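The three steps above can be sketched in a few lines of Python. This is only an illustration: the pattern list comes from the slide, while the threshold of two matches and the function names are assumptions made for the example.

```python
# Hand-written rules: every pattern here was chosen by a person, not learned.
SPAM_PATTERNS = ["4u", "credit card", "free", "amazing"]  # phrases from the slide
THRESHOLD = 2  # hypothetical: flag when at least this many patterns match


def count_matches(email_text):
    """Count how many hand-written patterns occur in the email."""
    text = email_text.lower()
    return sum(1 for pattern in SPAM_PATTERNS if pattern in text)


def is_spam(email_text):
    """Flag the email if enough patterns are detected."""
    return count_matches(email_text) >= THRESHOLD


print(is_spam("Amazing offer 4U: a free credit card"))  # all four patterns match
print(is_spam("Meeting moved to 3 pm"))                 # no pattern matches
print(is_spam("Amazing offer For U"))                   # only one pattern matches
```

Every new trick used by spammers means another entry in SPAM_PATTERNS, which is how the rule list grows long and hard to maintain.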

MACHINE LEARNING
The Machine Learning Approach
ML techniques automatically learn which words and phrases are good predictors of spam by detecting unusually frequent patterns of words in the spam examples. The resulting program is shorter, easier to maintain, and more accurate.

Spammers might start writing “For U” instead of “4U”.

With the traditional approach, you would need to update the filter by adding the flag “For U” to the filter list.

MACHINE LEARNING
The Machine Learning Approach

In contrast, ML automatically notices that “For U” has become unusually frequent in spam flagged by users, and it starts flagging such emails without your intervention.
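A minimal sketch of this idea, assuming a tiny hand-made set of user-flagged emails: the program counts how often each word appears in spam and in ham and keeps the words that are unusually frequent in spam. The training data, the ratio of 3.0 and the function names are hypothetical, and real spam filters use far more data and stronger models.

```python
from collections import Counter

# Hypothetical user-flagged training examples (label: "spam" or "ham").
TRAINING = [
    ("amazing deal for u", "spam"),
    ("free credit card for u", "spam"),
    ("prize waiting for u", "spam"),
    ("agenda for the meeting", "ham"),
    ("lunch at noon", "ham"),
    ("report for review", "ham"),
]


def learn_spam_words(examples, min_ratio=3.0):
    """Return words that are unusually frequent in spam compared with ham."""
    spam_counts, ham_counts = Counter(), Counter()
    for text, label in examples:
        words = set(text.lower().split())  # count each word once per email
        (spam_counts if label == "spam" else ham_counts).update(words)
    # Add-one smoothing avoids division by zero for words never seen in ham.
    return {
        word
        for word, count in spam_counts.items()
        if (count + 1) / (ham_counts[word] + 1) >= min_ratio
    }


def is_spam(text, spam_words, threshold=1):
    """Flag the email when enough learned spam words appear in it."""
    return len(set(text.lower().split()) & spam_words) >= threshold


spam_words = learn_spam_words(TRAINING)
print(sorted(spam_words))                          # words learned from the data
print(is_spam("special offer for u", spam_words))
print(is_spam("agenda for monday", spam_words))
```

Nobody typed “u” into a rule list here. The word is picked up because it is frequent in the flagged spam, while “for” is ignored because it is just as common in ham.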

MACHINE LEARNING
To summarize, ML is great for:

01
Problems for which existing solutions require a lot of fine-tuning or long lists of rules: one ML algorithm can often simplify code and perform better than the traditional approach.
02
Complex problems for which the traditional approach yields no good solution: the best ML techniques can perhaps find a solution.
03
Fluctuating environments: an ML system can adapt to new data.
04
Getting insights about complex problems and large amounts of data.

ML APPLICATIONS
We use machine learning in our daily lives, often without knowing it, for example in Google Maps, Google Assistant, and Alexa. Below are some of the most popular real-world applications of Machine Learning:

ML PLAYGROUND
Activities:
Visit https://teachablemachine.withgoogle.com/

TYPES OF ML
The Machine Learning Approach

There are so many different types of ML systems that it is useful to classify them in broad categories, based on the following criteria:

1. Whether or not they are trained with human supervision (supervised, unsupervised, semi-supervised, and reinforcement learning)
2. Whether or not they can learn incrementally on the fly (online versus batch learning)
3. Whether they work by simply comparing new data points to known data points, or instead by detecting patterns in the training data and building a predictive model, much like scientists do (instance-based versus model-based learning)

TYPES OF ML (SUPERVISED)
There are four major ML categories: Supervised, Unsupervised, Semi-supervised, and Reinforcement Learning.

Supervised Learning
In supervised learning, the training set you feed to the algorithm includes the desired solutions, called labels.

A typical supervised learning task is classification.

The spam filter is a good example: it is trained with many example emails along with their class (spam or ham), and it must learn how to classify new emails.
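To make the role of labels concrete, here is a small sketch of a classifier trained on labelled examples. It uses k-Nearest Neighbors as one possible technique; the two features (number of suspicious words and number of links), the data and the function name are all hypothetical.

```python
from collections import Counter
import math

# Hypothetical labelled training set: (suspicious_words, links) -> label.
TRAINING_SET = [
    ((5, 3), "spam"),
    ((4, 4), "spam"),
    ((6, 2), "spam"),
    ((0, 1), "ham"),
    ((1, 0), "ham"),
    ((0, 0), "ham"),
]


def classify(features, training_set, k=3):
    """Predict the majority label among the k closest training examples."""
    by_distance = sorted(
        training_set, key=lambda example: math.dist(features, example[0])
    )
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


print(classify((5, 2), TRAINING_SET))  # closest examples are all labelled spam
print(classify((1, 1), TRAINING_SET))  # closest examples are all labelled ham
```

The labels in TRAINING_SET are the “desired solutions”: without them the algorithm would have nothing to vote on.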


TYPES OF ML (SUPERVISED)

Another typical task is to predict a target numerical value, such as the price of a car, given a set of features (mileage, age, brand, etc.) called predictors. This sort of task is called regression.
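As a rough illustration of regression, the sketch below fits a straight line to a handful of made-up cars using ordinary least squares with a single predictor (mileage). The numbers and the names fit_line and predict_price are assumptions for the example only.

```python
# Hypothetical used-car data: mileage in thousands of km, and price.
mileage = [10, 20, 30, 40]
price = [29000, 28000, 27000, 26000]


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-deviations and sum of squared deviations of the predictor.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(mileage, price)


def predict_price(mileage_thousand_km):
    """Predict a numerical target (price) from the predictor (mileage)."""
    return intercept + slope * mileage_thousand_km


print(slope, intercept)
print(predict_price(25))  # price estimate for a car with 25 thousand km
```

Unlike classification, the output here is a number on a continuous scale rather than a class.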

Note that some regression algorithms can be used for classification as well, and vice versa. For example, Logistic Regression* is commonly used for classification, as it can output a value that corresponds to the probability of belonging to a given class.
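A short sketch of how Logistic Regression turns a numerical score into a class probability. The weights and bias below are made-up values, not the result of training, and the feature names are hypothetical.

```python
import math


def predict_probability(features, weights, bias):
    """Logistic regression output: estimated probability of the positive class."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid squashes the score into (0, 1)


def classify(features, weights, bias, threshold=0.5):
    """Turn the probability into a class decision."""
    return "spam" if predict_probability(features, weights, bias) >= threshold else "ham"


# Hypothetical, untrained parameters for two features:
# (number of suspicious words, number of links).
WEIGHTS = (1.2, 0.8)
BIAS = -3.0

print(predict_probability((4, 2), WEIGHTS, BIAS))
print(classify((4, 2), WEIGHTS, BIAS))
print(classify((0, 1), WEIGHTS, BIAS))
```

The model itself computes a continuous value, which is the regression part; thresholding that value is what makes it usable for classification.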

What other applications use regression techniques?

*More details in a later chapter.

TYPES OF ML (SUPERVISED)
Some supervised learning algorithms:

1. k-Nearest Neighbors
2. Linear Regression
3. Multilinear Regression
4. Logistic Regression
5. Support Vector Machines (SVMs)
6. Decision Tree and Random Forest
7. XGBoost
8. Neural Network

Search the internet: which applications use these supervised learning algorithms?

TYPES OF ML (UNSUPERVISED)
Unsupervised Learning
In unsupervised learning, the training data is
unlabelled. The system tries to learn without a
teacher.

What can you infer from this training set? The labels? The features?

TYPES OF ML (UNSUPERVISED)
Some unsupervised learning algorithms:

Clustering
1. K-Means
2. DBSCAN
3. Hierarchical Cluster Analysis (HCA)

Association rule learning
1. Apriori
2. Eclat

Anomaly detection and novelty detection
1. One-class SVM
2. Isolation Forest

Visualization and dimensionality reduction
1. Principal Component Analysis (PCA)
2. Kernel PCA
3. Locally Linear Embedding (LLE)
4. t-Distributed Stochastic Neighbor Embedding (t-SNE)
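As one illustration of the clustering family listed above, here is a bare-bones K-Means sketch. It takes unlabelled points and groups them on its own. Starting from the first k points is a simplification chosen to keep the example deterministic; library implementations use better initialisation. The data and function name are hypothetical.

```python
import math


def k_means(points, k, iterations=10):
    """Group unlabelled points into k clusters by alternating two steps."""
    centroids = list(points[:k])  # simplistic assumption: start from the first k points
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


# Hypothetical unlabelled data: two visually separate groups.
POINTS = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (2.0, 1.0), (8.0, 9.0), (9.0, 8.0)]

centroids, assignments = k_means(POINTS, k=2)
print(assignments)  # cluster index for each point, found without any labels
print(centroids)
```

No label appears anywhere in POINTS; the cluster indices are invented by the algorithm, which is what “learning without a teacher” means here.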

TYPES OF ML (UNSUPERVISED)
Visualization:

Visualization algorithms are also good examples of unsupervised learning algorithms:

You feed them a lot of complex and unlabeled data, and they output a 2D or 3D representation of the data that can be easily plotted.

These algorithms try to preserve as much structure as they can (e.g. trying to keep separate clusters in the input space from overlapping in the visualization) so that you can:

1. Understand how the data is organized, and
2. (perhaps) Identify unsuspected patterns.

TYPES OF ML (UNSUPERVISED)
Dimensionality Reduction:

The goal is to simplify the data without losing too much information. One way to do this is to merge several correlated features into one.

For example, a car’s mileage may be strongly correlated with its age, so the dimensionality reduction algorithm will merge them into one feature that represents the car’s wear and tear. This is called feature extraction.

[Figure: mileage: 23410 km + age: 3 years = the loss; both features are highly correlated]

TIP
It is often a good idea to try to reduce the dimension of your training data using a dimensionality reduction algorithm before you feed it to another ML algorithm. It will run much faster, require less disk and memory space, and sometimes perform better.
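The mileage and age example can be sketched as follows. The idea is borrowed from PCA: both features are standardized and projected onto their first principal component, which for two standardized features lies along the diagonal. The car data and the name merge_features are hypothetical, and a real project would normally use a library implementation of PCA.

```python
import math
import statistics

# Hypothetical cars: mileage in km and age in years (strongly correlated).
mileage_km = [12000, 23410, 41000, 58000, 76000]
age_years = [1, 3, 4, 6, 8]


def standardize(values):
    """Rescale to zero mean and unit variance so both features are comparable."""
    mean, std = statistics.mean(values), statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def merge_features(xs, ys):
    """Project two standardized features onto their first principal component."""
    zx, zy = standardize(xs), standardize(ys)
    r = sum(a * b for a, b in zip(zx, zy)) / len(zx)  # Pearson correlation
    # For two standardized features the first principal axis is (1, +/-1)/sqrt(2).
    sign = 1.0 if r >= 0 else -1.0
    merged = [(a + sign * b) / math.sqrt(2) for a, b in zip(zx, zy)]
    retained = (1 + abs(r)) / 2  # share of the total variance kept by the merged feature
    return merged, retained


wear_and_tear, retained = merge_features(mileage_km, age_years)
print(wear_and_tear)  # one value per car instead of two
print(retained)       # close to 1 when the two features are highly correlated
```

The stronger the correlation between the two features, the less information is lost by replacing them with the single merged value.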

TYPES OF ML (UNSUPERVISED)
Anomaly Detection:

Another unsupervised learning task is anomaly detection: for example, detecting unusual credit card transactions to prevent fraud, catching manufacturing defects, or automatically removing outliers from a dataset before feeding it to a learning algorithm.

The system is shown mostly normal instances during training, so it learns to recognize them; then, when it sees a new instance, it can tell whether it looks like a normal one or is likely an anomaly.

Novelty detection is similar to anomaly detection. The goal is to identify new instances that look different from all instances in the training set. This requires a “clean” training set.
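A very small sketch of the idea, assuming a single feature (the transaction amount) and a simple z-score rule instead of the One-class SVM or Isolation Forest algorithms named earlier in these slides. The amounts, the limit of 3 standard deviations and the function names are hypothetical.

```python
import statistics

# Hypothetical "normal" credit card transaction amounts seen during training.
NORMAL_AMOUNTS = [42.0, 55.5, 38.0, 61.0, 47.5, 52.0, 44.0, 58.0]


def fit_normal_profile(amounts):
    """Learn what normal looks like: the mean and spread of the training data."""
    return statistics.mean(amounts), statistics.stdev(amounts)


def is_anomaly(amount, profile, max_z=3.0):
    """Flag a new instance that lies too many standard deviations from the mean."""
    mean, std = profile
    return abs(amount - mean) / std > max_z


profile = fit_normal_profile(NORMAL_AMOUNTS)
print(is_anomaly(50.0, profile))   # looks like the training data
print(is_anomaly(950.0, profile))  # far outside the normal range
```

The detector never sees an example of fraud. It only learns the profile of normal instances and reports whatever does not fit it.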

TYPES OF ML (SEMI-SUPERVISED)
Semi-supervised Learning

Since labelling data is usually time-consuming and costly, you will often have plenty of unlabelled instances and few labelled instances. Some algorithms can deal with data that is only partially labelled.

Most semi-supervised algorithms are combinations of unsupervised and supervised algorithms. One example is deep belief networks (DBNs).

TYPES OF ML (REINFORCEMENT)
Reinforcement Learning

Reinforcement learning is a very different beast. The learning system, called an agent in this context, can observe the environment, select and perform actions, and get rewards in return (or penalties in the form of negative rewards).

Where is it applied?
Robots implement RL algorithms to learn how to walk.
DeepMind’s AlphaGo beat the world champion Ke Jie at the game of Go. It learned from millions of possibilities how to win the game (reward and penalty).
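The agent, action, reward and penalty vocabulary can be made concrete with a toy example. The sketch below uses tabular Q-learning, one common RL algorithm, on a made-up corridor of five positions where the agent is rewarded for reaching the right end. The environment, the reward values and all names are assumptions for illustration; this is not how AlphaGo or walking robots are trained.

```python
import random

N_STATES = 5          # corridor positions 0..4; position 4 is the goal
ACTIONS = (-1, +1)    # step left or step right
GOAL_REWARD = 1.0
STEP_PENALTY = -0.01  # negative reward for every move that does not reach the goal


def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    if next_state == N_STATES - 1:
        return next_state, GOAL_REWARD, True
    return next_state, STEP_PENALTY, False


def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning; the agent explores with uniformly random actions."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.choice(ACTIONS)
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted best future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q


q_table = train()
policy = [max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(N_STATES - 1)]
print(policy)  # learned action for positions 0..3
```

Nobody tells the agent that moving right is correct. It discovers this from the rewards and penalties it collects while acting in the environment.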

ML IN DATA SCIENCE LANDSCAPE

Data Science is a multidisciplinary field that studies algorithms and methods to solve analytically complex problems. It combines mathematics, statistics, programming, and computer science to uncover insights from data. ML is one of the most powerful techniques that data science employs.

DATA SCIENCE PROJECT STRUCTURE
