Lecture 1, Machine Learning Course
Stanford CS229
Andrew Ng
Autumn 2018
Preface
We hope to provide tools, built with machine learning, for creating things that make the
world a better place. It is worth finding a deeper meaning for our endeavors.
What is Machine Learning
Arthur Samuel (1959): Machine learning is the field of study that gives computers the
ability to learn without being explicitly programmed.
For a specific, well-chosen task, a machine can become better than a human.
Samuel's checkers-playing program, which learned to play better than Samuel himself, is an
example.
Tom Mitchell (1998): A computer program is said to learn from experience E with respect to
some task T and some performance measure P, if its performance on T, as measured by P,
improves with experience E.
Supervised Learning
One of the most widely used tools in ML.
Given a dataset of (x, y) pairs, your goal is to learn a mapping from X → Y.
When the value y we are trying to predict is continuous, the problem is called regression.
You can fit different types of models to the data (linear, quadratic, etc.).
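A minimal sketch of regression, assuming a hypothetical toy dataset of house sizes and prices (the numbers are made up for illustration). It fits a straight line by least squares; a quadratic or other model could be fit in the same spirit.

```python
# Hypothetical toy data: house size (square feet) -> price (thousands of dollars).
sizes = [1000.0, 1500.0, 2000.0, 2500.0, 3000.0]
prices = [200.0, 290.0, 410.0, 500.0, 600.0]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(sizes, prices)


def predict(size):
    """Predict a continuous value y for a new input x."""
    return slope * size + intercept


print(predict(1750.0))
```

The output of predict is a real number, which is what makes this a regression problem.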
As an example, a model that determines whether a breast tumor is malignant based on the
tumor size solves a classification problem.
Classification means that Y takes on a discrete number of values.
If the output of the model is discrete, it is a classification problem.
In a regression problem, Y is a real number.
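A minimal sketch of classification, assuming hypothetical tumor sizes and labels. The "learning" here is just picking the size threshold with the fewest training errors; this is a deliberately simple stand-in, not the algorithm taught in the course.

```python
# Hypothetical toy data: tumor size (cm) and label (1 = malignant, 0 = benign).
tumor_sizes = [1.0, 1.5, 2.0, 2.5, 3.5, 4.0, 4.5, 5.0]
labels = [0, 0, 0, 0, 1, 1, 1, 1]


def classify(size, threshold):
    """Discrete output: 1 (malignant) or 0 (benign)."""
    return 1 if size >= threshold else 0


def fit_threshold(xs, ys):
    """Pick the candidate threshold with the fewest training errors."""
    points = sorted(xs)
    # Candidate thresholds are the midpoints between neighboring sizes.
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]

    def errors(t):
        return sum(classify(x, t) != y for x, y in zip(xs, ys))

    return min(candidates, key=errors)


threshold = fit_threshold(tumor_sizes, labels)
print(threshold, classify(4.2, threshold), classify(1.2, threshold))
```

Unlike regression, the model output takes only two values.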
Often X is multi-dimensional (a vector of several features), which is the case for
most machine learning problems.
Figure 1. A sample problem where, given the age of the patient and the tumor size, the
model predicts whether the tumor is malignant or not.
In the example, the model needs both values to predict whether the tumor is malignant. In
the next lectures, we will see how to fit a straight line that separates the positive examples
from the negative ones (logistic regression is a standard algorithm for this).
With many more features, the data can no longer be plotted, because it lives in a
high-dimensional space.
For the breast cancer example, the following features were considered:
Clump Thickness
Uniformity of Cell Size
Uniformity of Cell Shape
Adhesion, and more
Support Vector Machine
It can use an infinite number of input features.
That is, the input can be represented as an infinite-dimensional vector.
Kernels is the technical term for the method that makes this possible.
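A small sketch of the kernel idea, assuming 2-D inputs and two standard kernels chosen for illustration. The polynomial kernel gives the same inner product as an explicit feature map without ever building the features. The Gaussian (RBF) kernel corresponds to an infinite-dimensional feature space, yet it is still a one-line computation.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit degree-2 feature map for a 2-D input (3 features)."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2]


def poly_kernel(x, z):
    """Same inner product as dot(phi(x), phi(z)), without building phi."""
    return dot(x, z) ** 2


def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel; its implicit feature space is infinite-dimensional."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


x, z = [1.0, 2.0], [3.0, 0.5]
print(dot(phi(x), phi(z)), poly_kernel(x, z))  # the two values agree up to rounding
print(rbf_kernel(x, z))
```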
Machine Learning Strategy (Learning Theory)
Two different teams applying the same algorithms can get completely different results.
Our skill in deciding what to do next in a project matters a great deal in machine learning.
The course aims to make these decisions systematic, so that building learning systems
becomes a systematic engineering process.
How we debug our learning systems also determines how quickly we make progress.
An analogy from optimizing code: instead of jumping right into the code and optimizing it,
a better way is to run a profiler, see where the bottlenecks are in the system, and then
optimize those.
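The profiler analogy can be sketched with Python's built-in cProfile module. The functions slow_step, fast_step, and pipeline are hypothetical placeholders for real project code.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Deliberately wasteful: a quadratic-time double loop.
    total = 0
    for i in range(n):
        for j in range(n):
            total += i * j
    return total


def fast_step(n):
    return sum(range(n))


def pipeline():
    return slow_step(300) + fast_step(300)


# Measure first, instead of guessing where the time goes.
profiler = cProfile.Profile()
profiler.enable()
pipeline()
profiler.disable()

# Collect the report as text, sorted by cumulative time per function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer).sort_stats("cumulative")
stats.print_stats(10)
report = buffer.getvalue()
print(report)
```

The report lists time per function, so the bottleneck is identified from measurements before any optimization work starts.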
Deep Learning
Very hot right now.
The third topic that the course will cover.
Training neural networks falls under this topic.
Unsupervised Learning
Given a dataset of x’s with no y’s, find something interesting in the data, such as
patterns or groups.
K-means clustering falls under here.
A clustering algorithm figures out which data points belong to the same group, without
being given any labels.
Social network analysis is one application: clustering can pick out groups of related people.
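A minimal sketch of k-means on hypothetical 2-D points. The starting centroids are passed in by hand to keep the run deterministic; real implementations usually pick them randomly.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def k_means(points, centroids, iterations=10):
    """Alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its points.
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = mean_point(members)
    return centroids, assignments


# Hypothetical unlabeled data: two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, assignments = k_means(points, centroids=[points[0], points[3]])
print(assignments)
print(centroids)
```

No labels are given; the grouping comes only from the distances between points.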
The cocktail party problem falls under here as well.
If you have a noisy room and you want to separate individual people’s voices from the
recordings, you can use unsupervised learning.
Independent Component Analysis (ICA) can be used here.
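A sketch of the setup behind the cocktail party problem, assuming the standard linear mixing model x = A s with two speakers and two microphones. The signals and the mixing matrix A are made up. The code does not implement ICA. It only shows that an unmixing matrix W recovers the sources; ICA's job is to estimate W from the recordings alone.

```python
# Two hypothetical source signals (for example, two speakers), 5 samples each.
speaker_1 = [0.0, 1.0, 0.0, -1.0, 0.0]
speaker_2 = [1.0, 1.0, -1.0, -1.0, 1.0]

# Hypothetical mixing matrix A: row i says how loudly microphone i hears each speaker.
A = [[1.0, 0.5],
     [0.3, 1.0]]


def apply_2x2(matrix, first, second):
    """Apply a 2x2 matrix to every pair of samples."""
    out_1 = [matrix[0][0] * a + matrix[0][1] * b for a, b in zip(first, second)]
    out_2 = [matrix[1][0] * a + matrix[1][1] * b for a, b in zip(first, second)]
    return out_1, out_2


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


# What the microphones record: x = A s.
mic_1, mic_2 = apply_2x2(A, speaker_1, speaker_2)

# With the true A in hand, W = A^{-1} undoes the mixing: s = W x.
# ICA has to estimate W from mic_1 and mic_2 alone, without knowing A.
W = inverse_2x2(A)
recovered_1, recovered_2 = apply_2x2(W, mic_1, mic_2)
print(recovered_1)
print(recovered_2)
```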
The internet has lots of unlabeled data.
Analogies can be learned with unsupervised learning from unlabeled data on the internet.
Supervised learning has been used more extensively compared to unsupervised learning.
Reinforcement Learning
No one knows how to optimally fly a helicopter.
Let the helicopter fly on its own, reward it when it does well, and penalize it when it does badly.
Works very well for optimizing logistics systems and for robotics.
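A toy sketch of learning from rewards, assuming a hypothetical two-action problem with a fixed reward per action. This is an epsilon-greedy bandit, far simpler than flying a helicopter, and the action names and reward values are invented for illustration.

```python
import random

# Hypothetical rewards: the environment scores each action.
# The agent never reads this table directly; it only sees the reward it receives.
REWARDS = {"tilt_hard": -1.0, "hold_level": 1.0}
ACTIONS = list(REWARDS)


def train(steps=200, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}  # running average reward per action
    count = {a: 0 for a in ACTIONS}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore: try something at random
        else:
            action = max(ACTIONS, key=value.get)  # exploit the best estimate so far
        reward = REWARDS[action]
        count[action] += 1
        # Incremental mean: good outcomes raise the estimate, bad ones lower it.
        value[action] += (reward - value[action]) / count[action]
    return value


value = train()
best_action = max(ACTIONS, key=value.get)
print(best_action, value)
```

The agent is never told which action is correct; it only receives a reward signal and shifts toward the actions that scored well.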