Introduction to Machine Learning

Agenda
• Introduction
• Basics
• Classification
• Clustering
• Regression
• Use-Cases
Quick Questionnaire

How many people have heard about Machine Learning?

How many people know about Machine Learning?

How many people are using Machine Learning?
About
• A subfield of Artificial Intelligence (AI).
• The name derives from the idea that it deals with the
“construction and study of systems that can learn from data”.
• Can be seen as a set of building blocks for making computers
learn to behave more intelligently.
• It is a theoretical concept, realised through various
techniques, each with various implementations.

• http://en.wikipedia.org/wiki/Machine_learning
In other words…

“A computer program is said to learn from experience (E) with
some class of tasks (T) and a performance measure (P) if its
performance at tasks in T as measured by P improves with E”
Terminology
• Features
– The distinct traits that can be used to describe each item in a
quantitative manner.
• Samples
– A sample is an item to process (e.g. classify). It can be a
document, a picture, a sound, a video, a row in a database or CSV
file, or anything else you can describe with a fixed set of
quantitative traits.
• Feature vector
– An n-dimensional vector of numerical features that represents
some object.
• Feature extraction
– The preparation of a feature vector.
– Transforms data in a high-dimensional space into a space of
fewer dimensions.
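
To make the terminology concrete, here is a minimal sketch of feature extraction, assuming a text sample and a small hypothetical vocabulary (the `VOCABULARY` list and the `extract_features` name are illustrative choices, not part of the slides). Each vocabulary word is one feature, and the word counts form the feature vector.

```python
from collections import Counter

# Hypothetical fixed vocabulary: each word is one feature (one vector dimension).
VOCABULARY = ["machine", "learning", "data", "fruit"]

def extract_features(text):
    """Turn a raw text sample into a fixed-length numeric feature vector."""
    words = text.lower().split()
    counts = Counter(words)
    # One count per vocabulary word, always in the same order.
    return [counts[word] for word in VOCABULARY]

sample = "Machine learning systems learn from data and more data"
vector = extract_features(sample)
print(vector)  # [1, 1, 2, 0]
```

However long the sample is, the result always has four dimensions, which is the reduction to a fixed, smaller space described above.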
Let’s dig deep into it…
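What do you mean by Learning (Training)?

[Figure: three example objects, one labelled “Apple”, each described by a list of features]

Features (object 1):
1. Color: Reddish/Red
2. Type: Fruit
3. Shape
etc…

Features (object 2):
1. Sky Blue
2. Logo
3. Shape
etc…

Features (object 3):
1. Yellow
2. Fruit
3. Shape
etc…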
Workflow
Categories

• Supervised Learning

• Unsupervised Learning

• Semi-Supervised Learning

• Reinforcement Learning
Supervised Learning
• Supervised learning is where you have input variables (X) and an
output variable (Y), and you use an algorithm to learn the mapping
function from the input to the output.

• Y = f(X)

• Supervised learning problems can be further grouped into
regression and classification problems.

• Classification: A classification problem is when the output
variable is a category, such as “red” or “blue”, or “disease” and
“no disease”.

• Regression: A regression problem is when the output variable
is a real (continuous) value.
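
As an illustration of learning a mapping Y = f(X) for classification, the following is a minimal sketch using a 1-nearest-neighbor rule. The training samples, the two features (redness and roundness on a 0 to 1 scale) and the labels are all hypothetical, chosen only for this example; nearest-neighbor is one of many possible algorithms.

```python
import math

# Hypothetical labeled training set: (feature vector X, label Y).
# Features are [redness, roundness] on a 0..1 scale, chosen for illustration.
TRAINING_DATA = [
    ([0.9, 0.8], "apple"),
    ([0.8, 0.9], "apple"),
    ([0.2, 0.3], "banana"),
    ([0.1, 0.2], "banana"),
]

def predict(x):
    """Approximate Y = f(X): return the label of the closest training sample."""
    _, nearest_label = min(
        TRAINING_DATA, key=lambda pair: math.dist(pair[0], x)
    )
    return nearest_label

print(predict([0.85, 0.75]))  # apple
print(predict([0.15, 0.35]))  # banana
```

The output here is a category, so this is a classification problem; the labels in `TRAINING_DATA` are what makes it supervised.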


Unsupervised Learning
• Unsupervised learning is where you only have input data (X) and
no corresponding output variables.

• The goal of unsupervised learning is to model the underlying
structure or distribution in the data in order to learn more about
the data.

• Unsupervised learning problems can be further grouped into
clustering and association problems.

• Clustering: A clustering problem is where you want to
discover the inherent groupings in the data, such as grouping
customers by purchasing behavior.

• Association: An association rule learning problem is where
you want to discover rules that describe large portions of the
data, such as customers who buy X also tending to buy Y.
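
The clustering idea can be sketched with a small k-means loop on one-dimensional data. This is a simplified illustration under stated assumptions: the spend figures, the starting centroids and the `k_means_1d` helper are hypothetical, and k-means is only one of several clustering techniques.

```python
def k_means_1d(values, centroids, iterations=10):
    """Group 1-D values around the given starting centroids (simple k-means)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(
                range(len(centroids)), key=lambda i: abs(value - centroids[i])
            )
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = sum(cluster) / len(cluster)
    return centroids, clusters

# Hypothetical monthly spend per customer; no labels are provided.
monthly_spend = [12, 15, 14, 90, 95, 88]
centroids, clusters = k_means_1d(monthly_spend, centroids=[0, 100])
print(clusters)  # [[12, 15, 14], [90, 95, 88]]
```

No output variable is supplied; the two customer groups emerge from the structure of the data alone.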


Semi-Supervised Learning
• Problems where you have a large amount of input data (X) and
only some of the data is labeled (Y) are called semi-supervised
learning problems.
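
One common way to use partly labeled data is self-training, where the model's own confident predictions become new labels. The sketch below is a toy version on one-dimensional points; the data, the labels "low" and "high", and the `self_train` helper are assumptions made for illustration only.

```python
def self_train(labeled, unlabeled):
    """Pseudo-label unlabeled 1-D points, closest-to-known-data first."""
    labeled = list(labeled)        # (value, label) pairs with known Y
    remaining = list(unlabeled)    # values without a label
    while remaining:
        # Pick the unlabeled value closest to any labeled value (most confident).
        value, (_, label) = min(
            ((u, l) for u in remaining for l in labeled),
            key=lambda pair: abs(pair[0] - pair[1][0]),
        )
        labeled.append((value, label))
        remaining.remove(value)
    return labeled

# Two labeled samples and six unlabeled ones (all hypothetical).
labeled = [(1.0, "low"), (10.0, "high")]
unlabeled = [2.0, 3.0, 4.0, 9.0, 8.0, 7.0]
result = dict(self_train(labeled, unlabeled))
print(result[4.0], result[7.0])  # low high
```

Each newly labeled point is then treated as known data, so labels spread outward from the few samples that were labeled at the start.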
Reinforcement Learning
• Allows a machine or software agent to learn its behavior
based on feedback from the environment.
• This behavior can be learned once and for all, or keep adapting
as time goes by.
• Application: energy management based on consumption
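
Learning from environment feedback can be sketched with an epsilon-greedy agent that chooses between a few actions and keeps a running estimate of each action's reward. Everything here is a toy assumption: the two actions, their fixed rewards (loosely standing in for energy savings) and the `run_agent` helper are hypothetical, and real environments usually return noisy rewards.

```python
import random

def run_agent(rewards, steps=500, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: learn action values from environment feedback."""
    rng = random.Random(seed)
    values = [0.0] * len(rewards)   # estimated value of each action
    counts = [0] * len(rewards)     # how often each action was tried
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: pick the action with the best current estimate.
            action = max(range(len(rewards)), key=lambda a: values[a])
        reward = rewards[action]    # feedback from the (toy) environment
        counts[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return values

# Hypothetical fixed rewards for two control actions.
values = run_agent(rewards=[0.2, 1.0])
best_action = max(range(len(values)), key=lambda a: values[a])
print(best_action)
```

The agent is never told which action is correct; it only observes rewards, and occasional exploration lets it discover the better action and keep adapting.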
Techniques
• classification: predict class from observations

• clustering: group observations into “meaningful” groups

• regression (prediction): predict value from observations
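
For regression, a minimal sketch is an ordinary least-squares line fit with a single input variable. The observations and the `fit_line` helper below are hypothetical, and a straight line is only the simplest possible model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input variable: y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations that follow y = 2x + 1 exactly.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0

# Predict a value for an unseen observation.
print(slope * 5 + intercept)  # 11.0
```

Unlike classification, the prediction is a number rather than a category.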
Dataset
• Visual dataset
• Text dataset
• Audio dataset
Visual Dataset
• Object Classification
• Object Detection
• Scene Recognition
• Activity Detection
• Video Captioning
• Video Summarization
Text Dataset
• Text Classification
• Text Summarization
• Question Answering System
Audio Dataset
• Speech Recognition
• Text to Speech
• Sound Classification
