Welcome
Machine Learning
Machine learning is one of the most exciting recent technologies.
Andrew Ng
Machine Learning
- Grew out of work in AI
- New capability for computers
Examples:
- Database mining
Large datasets from growth of automation/web.
E.g., Web click data, medical records, biology, engineering
- Applications we can't program by hand.
E.g., autonomous helicopter, handwriting recognition, most of
Natural Language Processing (NLP), computer vision.
- Self-customizing programs
E.g., Amazon, Netflix product recommendations
- Understanding human learning (brain, real AI).
Introduction
What is machine learning?
Machine Learning definition
Even among machine learning practitioners, there isn't a well-accepted definition of what is and what isn't machine learning.
Machine Learning
• Arthur Lee Samuel was an American pioneer in the field of computer gaming and artificial intelligence.
• He wrote a checkers-playing program, and the amazing thing was that Arthur Samuel himself turned out not to be a very good checkers player.
Quiz: Suppose your email program watches which emails you do or do not mark
as spam, and based on that learns how to better filter spam. What is the task T in
this setting?
Let's say you plot the data set and it looks like this.
The horizontal axis shows the size of different houses in square feet,
and the vertical axis shows the price of different houses in thousands of dollars.
Given this data, let's say you have a friend who owns a house that is 750 square feet.
They are hoping to sell the house and want to know how much they can get for it.
So, how can the learning algorithm help you?
Housing price prediction.
One option is to fit a straight line through the data.
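As a rough sketch of the straight-line idea, the snippet below fits a line by ordinary least squares and reads off a price for the 750-square-foot house. The data points and the names `fit_line`, `sizes`, and `prices` are made up for illustration and do not come from the slides.

```python
# Hypothetical (size in square feet, price in thousands of dollars) pairs.
# These numbers are invented for illustration; they are not from the slides.
sizes = [500.0, 800.0, 1000.0, 1500.0, 2000.0]
prices = [110.0, 150.0, 210.0, 290.0, 400.0]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)

# Predicted price (in thousands of dollars) for the friend's 750 sq ft house.
predicted_price = slope * 750.0 + intercept
print(f"Estimated price: about {predicted_price:.0f} thousand dollars")
```

A different model (for example, a curve instead of a line) would give a different estimate for the same house, which is why the choice of model matters.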
Let's look at a collected data set. Suppose that in your dataset the horizontal axis shows the
size of the tumor, and the vertical axis shows one or zero, yes or no, for whether the tumors
we've seen before are malignant (one) or benign (zero).
Breast cancer (malignant, benign)
Classification: discrete-valued output (0 or 1)
[Figure: Malignant? (1 = yes, 0 = no) on the vertical axis against Tumor Size on the horizontal axis.]
In classification problems, there is another way to plot this data. In this example we have one feature: tumor size.
The machine learning question is: can you estimate the probability that a tumor
is malignant versus benign?
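One common way to turn a single feature into a probability is logistic regression; the slides do not say which method is used at this point, so treat the following as a minimal sketch under that assumption. The tumor sizes, labels, learning rate, and function names are all hypothetical.

```python
import math

# Hypothetical (tumor size, label) data; 1 = malignant, 0 = benign.
# Invented for illustration; not taken from the slides.
sizes = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5]
labels = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]


def sigmoid(z):
    """Map any real number to a value strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, learning_rate=0.1, steps=5000):
    """Fit weight w and bias b by batch gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w, grad_b = 0.0, 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


def malignancy_probability(size, w, b):
    """Estimated probability that a tumor of this size is malignant."""
    return sigmoid(w * size + b)


w, b = fit_logistic(sizes, labels)
for size in (1.0, 3.0, 5.5):
    print(size, round(malignancy_probability(size, w, b), 2))
```

Thresholding the estimated probability at 0.5 would turn it back into the discrete 0 or 1 output of a classifier.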
- Age
- Clump Thickness
- Uniformity of Cell Size
- Uniformity of Cell Shape
…
- Problem 1: You have a large inventory of identical items. You want to predict how
many of these items will sell over the next 3 months.
- Problem 2: You'd like software to examine individual customer accounts, and for each
account decide if it has been hacked/compromised.
• One has to understand the simpler methods first, in order to grasp the more
sophisticated ones.
Supervised Learning
[Figure: scatter plot with axes x1 and x2.]
Unsupervised Learning
Data that doesn't have any labels.
Can you find some structure in the data?
An unsupervised learning algorithm might decide that the data lives in two different clusters.
This is called a clustering algorithm.
[Figure: scatter plot of unlabeled data with axes x1 and x2.]
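To make the two-cluster idea concrete, here is a minimal sketch using k-means, one widely used clustering algorithm; the slides do not name a specific algorithm here, so this choice is an assumption. The points, the starting centroids, and the function names are hypothetical.

```python
# Hypothetical unlabeled 2-D points (x1, x2); invented for illustration.
points = [(1.0, 1.2), (0.8, 1.0), (1.2, 0.9),
          (5.0, 5.1), (5.2, 4.8), (4.9, 5.3)]


def squared_distance(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def k_means(data, centroids, iterations=10):
    """Alternate between assigning points to centroids and moving centroids."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in data
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, a in zip(data, assignments) if a == k]
            if members:
                centroids[k] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return assignments, centroids


# Start from two of the data points as initial centroids (an arbitrary choice).
assignments, centroids = k_means(points, centroids=[points[0], points[3]])
print(assignments)
```

No labels are passed in anywhere: the grouping comes only from how close the points are to each other.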
Unsupervised Learning
• Another important class of problems involves situations in which we only observe
input variables, with no corresponding output.
• This is different from supervised learning, but it can be useful as a pre-processing step for
supervised learning.
Statistical Learning, Trevor Hastie and Robert Tibshirani. Stanford Online
• In a marketing setting, we might have demographic information for a number of current or potential customers.
What Google News does is, every day, look at tens of thousands or hundreds of thousands of new stories on the web and automatically cluster them together.
[Figure: DNA microarray data, with axes labeled Genes and Individuals.]
Here's an example of DNA microarray data. The idea is to take a group of different individuals and,
for each of them, measure how much they do or do not have a certain gene. Technically, you
measure how much certain genes are expressed. So these colors, red, green, gray and so on,
show the degree to which different individuals do or do not have a specific gene.
[Source: Daphne Koller]
Here's a bunch of data in which the different types of people are unknown. An unsupervised learning
algorithm automatically finds structure in the data, clustering the individuals into
types that we don't know in advance.
• At the root of the cocktail party problem is the fact that the human voices present in a noisy social setting often overlap in frequency and in time.
The “Cocktail Party Problem”: What Is It? How Can It Be Solved? And Why Should Animal Behaviorists Study It? NCBI
Cocktail party problem
[Figure: Speaker #1, Speaker #2, Microphone #1, Microphone #2.]
Each microphone records a different, overlapping combination of the two speakers' voices.
What the cocktail party algorithm will do is separate out these two audio
sources that were added, or summed, together to form the
recordings.
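The "summed together" description can be written as a small linear mixing model. The sketch below only illustrates that model: it mixes two made-up signals with made-up weights, then undoes the mix using those known weights. A real cocktail party algorithm has to estimate the unmixing from the recordings alone, which this sketch does not attempt. All signals, weights, and names are hypothetical.

```python
# Hypothetical source signals (a few samples each); invented for illustration.
speaker_1 = [0.0, 1.0, 0.0, -1.0]
speaker_2 = [1.0, 1.0, -1.0, -1.0]

# Hypothetical mixing weights: row i gives how loudly microphone i
# picks up speaker 1 and speaker 2.
mixing = [[0.7, 0.3],
          [0.4, 0.6]]


def mix(weights, s1, s2):
    """One microphone's recording: a weighted sum of both speakers."""
    return [weights[0] * a + weights[1] * b for a, b in zip(s1, s2)]


microphone_1 = mix(mixing[0], speaker_1, speaker_2)
microphone_2 = mix(mixing[1], speaker_1, speaker_2)


def unmix_with_known_weights(m, x1, x2):
    """Invert the 2x2 mixing matrix to recover both sources.

    This assumes the mixing weights are known, which is NOT the case in the
    actual cocktail party problem; it only shows that the recordings contain
    enough information to separate the speakers.
    """
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    out_1 = [(m[1][1] * a - m[0][1] * b) / det for a, b in zip(x1, x2)]
    out_2 = [(-m[1][0] * a + m[0][0] * b) / det for a, b in zip(x1, x2)]
    return out_1, out_2


output_1, output_2 = unmix_with_known_weights(mixing, microphone_1, microphone_2)
print([round(v, 6) for v in output_1])
print([round(v, 6) for v in output_2])
```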
- Given a set of news articles found on the web, group them into
sets of articles about the same story.
- Given a database of customer data, automatically discover market
segments and group customers into different market segments.
- Given a dataset of patients diagnosed as either having diabetes or
not, learn to classify new patients as having diabetes or not.
A Brief History of Statistical Learning
• Though the term statistical learning is fairly new, many of the concepts that underlie the
field were developed long ago.
• At the beginning of the nineteenth century, Legendre and Gauss published papers on the
method of least squares, which implemented the earliest form of what is now known as
linear regression.
• The approach was first successfully applied to problems in astronomy. Linear regression is
used for predicting quantitative values, such as an individual's salary. In order to predict
qualitative values, such as whether a patient survives or dies, or whether the stock market
increases or decreases, Fisher proposed linear discriminant analysis in 1936.
• In the 1940s, various authors put forth an alternative approach, logistic regression.
• In the early 1970s, Nelder and Wedderburn coined the term generalized linear models for
an entire class of statistical learning methods that include both linear and logistic
regression as special cases.
• By the end of the 1970s, many more techniques for learning from data were
available. However, they were almost exclusively linear methods, because
fitting non-linear relationships was computationally infeasible at the time.
An Introduction to Statistical Learning. With applications in R. Springer