
Introduction

Welcome
Machine Learning
- Grew out of work in AI
- New capability for computers
Examples:
- Database mining
Large datasets from growth of automation/web.
E.g., Web click data, medical records, biology, engineering
- Applications can’t program by hand.
E.g., Autonomous helicopter, handwriting recognition, most of
Natural Language Processing (NLP), Computer Vision.
- Self-customizing programs
E.g., Amazon, Netflix product recommendations
- Understanding human learning (brain, real AI).
Andrew Ng
Introduction
What is machine learning?
Machine Learning definition
• Arthur Samuel (1959). Machine Learning: Field of
study that gives computers the ability to learn
without being explicitly programmed.
• Tom Mitchell (1998). Well-posed Learning
Problem: A computer program is said to learn
from experience E with respect to some task T
and some performance measure P, if its
performance on T, as measured by P, improves
with experience E.
“A computer program is said to learn from experience E with respect to
some task T and some performance measure P, if its performance on T,
as measured by P, improves with experience E.”
Suppose your email program watches which emails you do or do
not mark as spam, and based on that learns how to better filter
spam. What is the task T in this setting?

- Classifying emails as spam or not spam.
- Watching you label emails as spam or not spam.
- The number (or fraction) of emails correctly classified as spam/not spam.
- None of the above: this is not a machine learning problem.

Machine learning algorithms:
- Supervised learning
- Unsupervised learning
Others: Reinforcement learning, recommender
systems.

Introduction
Supervised Learning
Housing price prediction.

[Figure: scatter plot of Price ($) in 1000's vs. Size in feet², sizes 0–2500]

Supervised Learning: “right answers” given.
Regression: predict continuous-valued output (price).
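As a rough sketch of this regression setting, a straight line can be fit to illustrative (size, price) pairs with ordinary least squares; the numbers below are made up for the example, not the course's data:

```python
import numpy as np

# Illustrative (size in feet^2, price in $1000's) pairs, roughly like the plot
sizes  = np.array([500, 1000, 1500, 2000, 2500])
prices = np.array([100, 150, 200, 280, 330])

# Fit a straight line price ~ slope * size + intercept (least squares)
slope, intercept = np.polyfit(sizes, prices, deg=1)

# Predict a continuous-valued output for a new house
print(slope * 1250 + intercept)  # ≈ 182.5
```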
Breast cancer (malignant, benign)

[Figure: Malignant? (1 = Y, 0 = N) plotted against Tumor Size]

Classification: discrete-valued output (0 or 1)
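To make the discrete-valued output concrete, here is a toy 1-D classifier (an illustration, not the course's method): it learns a single size threshold that best separates malignant from benign examples:

```python
def fit_threshold(sizes, labels):
    """Pick the midpoint threshold that best separates the two classes
    (a toy 1-D classifier with discrete-valued output 0 or 1)."""
    candidates = sorted(sizes)
    best_t, best_acc = None, -1.0
    for i in range(len(candidates) - 1):
        t = (candidates[i] + candidates[i + 1]) / 2
        preds = [1 if s > t else 0 for s in sizes]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

# Illustrative tumor sizes with labels (1 = malignant, 0 = benign)
sizes  = [1, 2, 3, 6, 7, 8]
labels = [0, 0, 0, 1, 1, 1]
t = fit_threshold(sizes, labels)
print(t)                    # 4.5: the best separating threshold
print(1 if 5 > t else 0)    # a size-5 tumor is classified as 1 (malignant)
```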
Other features:
- Clump Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape

[Figure: Age vs. Tumor Size, with malignant and benign examples separated]
You’re running a company, and you want to develop learning algorithms to address each
of two problems.

Problem 1: You have a large inventory of identical items. You want to predict how many
of these items will sell over the next 3 months.
Problem 2: You’d like software to examine individual customer accounts, and for each
account decide if it has been hacked/compromised.

Should you treat these as classification or as regression problems?

- Treat both as classification problems.
- Treat problem 1 as a classification problem, problem 2 as a regression problem.
- Treat problem 1 as a regression problem, problem 2 as a classification problem.
- Treat both as regression problems.

Introduction
Unsupervised Learning
Supervised Learning

[Figure: labeled examples plotted in the (x1, x2) plane, two classes marked]
Unsupervised Learning

[Figure: unlabeled examples in the (x1, x2) plane, forming two clusters]
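A minimal sketch of what an unsupervised algorithm does with unlabeled data: a tiny 1-D k-means with two clusters (the data points and starting centroids below are illustrative):

```python
def kmeans_1d(points, c1, c2, iterations=10):
    """Tiny 1-D k-means with k=2: alternate assigning each point to the
    nearest centroid and moving each centroid to its cluster's mean."""
    for _ in range(iterations):
        cluster1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        cluster2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        if cluster1:
            c1 = sum(cluster1) / len(cluster1)
        if cluster2:
            c2 = sum(cluster2) / len(cluster2)
    return c1, c2

# Unlabeled data with two obvious groups; no "right answers" are given
points = [1.0, 1.2, 0.8, 9.0, 9.2, 8.8]
c1, c2 = kmeans_1d(points, c1=0.0, c2=10.0)
print(c1, c2)  # centroids settle near 1.0 and 9.0
```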
Genes

Individuals

[Source: Daphne Koller] Andrew Ng


Linear regression with one variable
Model representation
Housing Prices (Portland, OR)

[Figure: Price (in 1000s of dollars) vs. Size (feet²), sizes 0–3000, with a straight-line fit]

Supervised Learning: given the “right answer” for each example in the data.
Regression Problem: predict real-valued output.
Training set of housing prices (Portland, OR):

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Notation:
m = number of training examples
x's = “input” variable / features
y's = “output” variable / “target” variable
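Using this notation, the training set above can be held in plain arrays; a minimal sketch in Python:

```python
import numpy as np

# Training set from the table above (Portland, OR housing prices)
x = np.array([2104, 1416, 1534, 852])   # "input" variable: size in feet^2
y = np.array([460, 232, 315, 178])      # "target" variable: price in $1000's

m = len(x)  # m = number of training examples
print(m)    # 4
```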
Training Set → Learning Algorithm → h
Size of house → h → Estimated price

How do we represent h?
h_θ(x) = θ₀ + θ₁x

Linear regression with one variable.
Univariate linear regression.
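The hypothesis h can be sketched directly as a function of the two parameters; the parameter values in the example call are illustrative:

```python
def h(x, theta0, theta1):
    """Univariate linear regression hypothesis: h_theta(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

# Estimated price for a 1000 ft^2 house with illustrative parameters
print(h(1000, 50.0, 0.1))  # ≈ 150.0
```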
Linear regression with one variable
Cost function
Training Set:

Size in feet² (x)    Price ($) in 1000's (y)
2104                 460
1416                 232
1534                 315
852                  178
…                    …

Hypothesis: h_θ(x) = θ₀ + θ₁x
θ₀, θ₁: parameters
How to choose the θ's?
[Figure: three example hypotheses h_θ(x) over the (x, y) plane for different choices of θ₀ and θ₁]
Idea: choose θ₀, θ₁ so that h_θ(x) is close to y for our training examples (x, y).
Linear regression with one variable
Cost function intuition I
Simplified
Hypothesis: h_θ(x) = θ₁x   (θ₀ = 0)

Parameters: θ₁

Cost Function: J(θ₁) = (1/2m) Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²

Goal: minimize J(θ₁) over θ₁
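The squared-error cost can be computed in a few lines; a sketch for the general two-parameter case, on illustrative toy data:

```python
import numpy as np

def cost(theta0, theta1, x, y):
    """Squared-error cost J(theta0, theta1) = (1/2m) * sum((h(x_i) - y_i)^2)."""
    m = len(x)
    predictions = theta0 + theta1 * x   # h_theta(x_i) for all i
    return np.sum((predictions - y) ** 2) / (2 * m)

# Toy training set lying exactly on y = x
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.0, 3.0])
print(cost(0.0, 1.0, x, y))  # perfect fit: J = 0.0
print(cost(0.0, 0.5, x, y))  # ≈ 0.583: errors -0.5, -1.0, -1.5 squared, / (2*3)
```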
(for fixed θ₁, h_θ(x) is a function of x)   (J(θ₁) is a function of the parameter θ₁)

[Figure: left, the line h_θ(x) = θ₁x through the training points on the (x, y) plane; right, the curve J(θ₁) over θ₁ with the corresponding point marked]
Linear regression with one variable
Gradient descent
Have some function J(θ₀, θ₁)
Want min over θ₀, θ₁ of J(θ₀, θ₁)

Outline:
• Start with some θ₀, θ₁
• Keep changing θ₀, θ₁ to reduce J(θ₀, θ₁)
until we hopefully end up at a minimum
[Figure: 3-D surface plot of J(θ₀, θ₁) over the (θ₀, θ₁) plane]
Gradient descent algorithm

repeat until convergence {
  θⱼ := θⱼ − α ∂/∂θⱼ J(θ₀, θ₁)   (simultaneously for j = 0 and j = 1)
}

Correct (simultaneous update):
temp0 := θ₀ − α ∂/∂θ₀ J(θ₀, θ₁)
temp1 := θ₁ − α ∂/∂θ₁ J(θ₀, θ₁)
θ₀ := temp0
θ₁ := temp1

Incorrect (sequential update):
temp0 := θ₀ − α ∂/∂θ₀ J(θ₀, θ₁)
θ₀ := temp0
temp1 := θ₁ − α ∂/∂θ₁ J(θ₀, θ₁)   ← uses the already-updated θ₀
θ₁ := temp1
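The simultaneous update can be sketched as follows; the partial derivatives are passed in as functions, and the example J is a stand-in, not the linear-regression cost:

```python
def gradient_step(theta0, theta1, alpha, dJ_dtheta0, dJ_dtheta1):
    """One simultaneous gradient descent update: both partials are
    evaluated at the *old* (theta0, theta1) before either is changed."""
    temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
    temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
    return temp0, temp1  # assigned together -> simultaneous update

# Example with J(t0, t1) = t0^2 + t1^2, whose partials are 2*t0 and 2*t1
t0, t1 = gradient_step(1.0, 2.0, 0.1,
                       lambda a, b: 2 * a,
                       lambda a, b: 2 * b)
print(t0, t1)  # both parameters shrink toward the minimum at (0, 0)
```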
Linear regression with one variable
Gradient descent intuition
Gradient descent algorithm (one parameter, θ₁):

repeat until convergence {
  θ₁ := θ₁ − α (d/dθ₁) J(θ₁)
}
If α is too small, gradient descent can be slow.

If α is too large, gradient descent can overshoot the minimum. It may fail to converge, or even diverge.
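A quick illustration of both behaviors on the toy function J(θ) = θ², whose derivative is 2θ (the α values are illustrative):

```python
def run_gd(theta, alpha, steps=20):
    """Run gradient descent on J(theta) = theta^2 (derivative: 2*theta)."""
    for _ in range(steps):
        theta = theta - alpha * 2 * theta
    return theta

print(abs(run_gd(1.0, 0.1)))   # small alpha: |theta| shrinks toward 0
print(abs(run_gd(1.0, 1.1)))   # too-large alpha: |theta| grows -> diverges
```

Each step multiplies θ by (1 − 2α), so |1 − 2α| < 1 converges and |1 − 2α| > 1 diverges.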
At a local optimum the derivative is zero, so the update θ₁ := θ₁ − α · 0 leaves the current value of θ₁ unchanged.
Gradient descent can converge to a local minimum, even with the learning rate α fixed.

As we approach a local minimum, gradient descent automatically takes smaller steps, because the derivative shrinks. So there is no need to decrease α over time.
Linear regression with one variable
Gradient descent for linear regression
Gradient descent algorithm:
repeat until convergence {
  θⱼ := θⱼ − α ∂/∂θⱼ J(θ₀, θ₁)   (for j = 0 and j = 1)
}

Linear regression model:
h_θ(x) = θ₀ + θ₁x
J(θ₀, θ₁) = (1/2m) Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)²
Gradient descent algorithm (for linear regression):

repeat until convergence {
  θ₀ := θ₀ − α (1/m) Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)
  θ₁ := θ₁ − α (1/m) Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾) · x⁽ⁱ⁾
}
update θ₀ and θ₁ simultaneously
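Putting the two update rules together gives batch gradient descent for univariate linear regression; a sketch on toy data that lies exactly on y = 1 + 2x (the α and iteration count are illustrative):

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iterations=1000):
    """Batch gradient descent for univariate linear regression.
    Each step uses all m training examples and updates
    theta0 and theta1 simultaneously."""
    m = len(x)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = theta0 + theta1 * x - y          # h_theta(x_i) - y_i
        grad0 = errors.sum() / m                  # dJ/dtheta0
        grad1 = (errors * x).sum() / m            # dJ/dtheta1
        theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
    return theta0, theta1

# Toy data that lies exactly on y = 1 + 2x
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
t0, t1 = gradient_descent(x, y)
print(round(t0, 3), round(t1, 3))  # converges close to 1.0 and 2.0
```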
[Figure: surface plot of J(θ₀, θ₁); gradient descent traces a downhill path of steps toward the minimum]
(for fixed θ₀, θ₁, h_θ(x) is a function of x)   (J(θ₀, θ₁) is a function of the parameters θ₀, θ₁)

[Figure sequence: left, the current hypothesis line h_θ(x) over the housing data; right, the contour plot of J(θ₀, θ₁). Each gradient descent step moves the parameter point closer to the minimum and the fitted line closer to the data.]
“Batch” Gradient Descent

“Batch”: each step of gradient descent uses all the training examples.
