
What is Supervised/Inductive Learning?

From the perspective of inductive learning, we are given input samples (x) and output samples
(f(x)) and the problem is to estimate the function (f). Specifically, the problem is to generalize
from the samples and the mapping so that it is useful for estimating the output for new samples
in the future.

In practice it is almost always too hard to estimate the function exactly, so we look for very good
approximations of the function.
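
As a rough illustration, the sketch below fits a straight-line approximation to a few hypothetical (x, f(x)) samples and uses it to estimate the output for a new input. The data and the choice of a linear hypothesis are assumptions made purely for illustration.

```python
import numpy as np

# Hypothetical samples: inputs x and observed outputs f(x).
# The true target function is unknown; we only see these pairs.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])  # noisy observations of f(x)

# Choose a simple hypothesis: a straight line y ~ a*x + b, fit by least squares.
a, b = np.polyfit(x, y, deg=1)

# Use the learned approximation to estimate f for a new, unseen input.
x_new = 5.0
print(a * x_new + b)  # our estimate of f(5.0)
```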

Some practical examples of induction are:

Credit risk assessment.
o The x is the properties of the customer.
o The f(x) is credit approved or not.
Disease diagnosis.
o The x is the properties of the patient.
o The f(x) is the disease they suffer from.
Face recognition.
o The x is bitmaps of people's faces.
o The f(x) is the name to assign to the face.
Automatic steering.
o The x is bitmap images from a camera in front of the car.
o The f(x) is the degree the steering wheel should be turned.

When Should You Use Inductive Learning?

There are problems where inductive learning is not a good idea. It is important to know when to
use and when not to use supervised machine learning.

4 problems where inductive learning might be a good idea:

Problems where there is no human expert. If people do not know the answer, they
cannot write a program to solve it. These are areas of true discovery.
Humans can perform the task but no one can describe how to do it. There are
problems where humans can do things that computers cannot do or cannot do well. Examples
include riding a bike or driving a car.
Problems where the desired function changes frequently. Humans could describe it
and could write a program to do it, but the problem changes too often, so it is not cost
effective. An example is the stock market.
Problems where each user needs a custom function. It is not cost effective to write a
custom program for each user. Examples include recommendations of movies or books on
Netflix or Amazon.

The Essence of Inductive Learning

We can write a program that works perfectly for the data that we have. This function will be
maximally overfit. But we have no idea how well it will work on new data; it will likely do very
badly, because we may never see the same examples again.
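
As a minimal sketch of such a maximally overfit program, the lookup table below memorizes a few hypothetical training pairs exactly: it is perfect on the data it has seen and has nothing to say about anything else.

```python
# A maximally overfit "learner": memorize every training example exactly.
training_data = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}  # hypothetical samples of f

def memorizing_classifier(x):
    # Perfect on the training data, but no basis for predicting
    # anything about inputs it has never seen.
    if x in training_data:
        return training_data[x]
    return None  # no generalization at all

print(memorizing_classifier((0, 1)))  # 1 (seen before)
print(memorizing_classifier((2, 3)))  # None (never seen, no prediction)
```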

The data alone is not enough; with no assumptions you could predict anything you like for new
inputs. That would be the naive position: assume nothing about the problem.

In practice we are not naive. There is an underlying problem and we are interested in an accurate
approximation of the function. The number of possible classifiers is double exponential in the
number of inputs: with n boolean inputs there are 2^n possible input states and 2^(2^n) distinct
boolean functions over them. Finding a good approximation of the function is very difficult.

There are classes of hypotheses that we can try. That is, the form that the solution may take, or the
representation. We cannot know which is most suitable for our problem beforehand. We have to
use experimentation to discover what works on the problem.

Two perspectives on inductive learning:

Learning is the removal of uncertainty. Having data removes some uncertainty.
Selecting a class of hypotheses removes more uncertainty.
Learning is guessing a good and small hypothesis class. It requires guessing. We don't
know the solution, so we must use a trial-and-error process. If you knew the domain with
certainty, you would not need learning. But we are not guessing in the dark.

You could be wrong:

Our prior knowledge could be wrong.
Our guess of the hypothesis class could be wrong.

In practice we start with a small hypothesis class and slowly grow the hypothesis class until we
get a good result.
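
Here is a minimal sketch of that strategy, assuming polynomial fits of increasing degree as the growing hypothesis class and a held-out split to judge when the class is rich enough; the data and degree range are made up for illustration.

```python
import numpy as np

# Hypothetical data: noisy samples of an unknown target function.
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, size=x.shape)

# Hold out some data to judge how well each hypothesis class generalizes.
x_train, y_train = x[::2], y[::2]
x_val, y_val = x[1::2], y[1::2]

best_degree, best_error = None, float("inf")
for degree in range(1, 10):          # grow the class: higher-degree polynomials
    coeffs = np.polyfit(x_train, y_train, degree)
    val_error = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    if val_error < best_error:
        best_degree, best_error = degree, val_error

print(best_degree, best_error)       # smallest class that gives a good result
```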

A Framework For Studying Inductive Learning

Terminology used in machine learning:

Training example: a sample from x together with its output from the target function.
Target function: the mapping function f from x to f(x).
Hypothesis: an approximation of f, a candidate function.
Concept: a boolean target function, with positive examples and negative examples for the 1/0
class values.
Classifier: the output of the learning program, a function that can be used to classify new samples.
Learner: the process that creates the classifier.
Hypothesis space: the set of possible approximations of f that the algorithm can create.
Version space: the subset of the hypothesis space that is consistent with the observed data (see
the sketch below).
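
A minimal sketch of a version space for a tiny boolean concept: enumerate a small hypothesis space (all boolean functions of two boolean inputs) and keep only the hypotheses consistent with the observed examples. The hypothesis space and observations here are hypothetical.

```python
from itertools import product

# Hypothetical observed data: (x1, x2) -> concept value (1 = positive, 0 = negative).
observations = {(0, 0): 0, (1, 0): 1, (1, 1): 1}

# A tiny hypothesis space: every boolean function of two boolean inputs (16 in total).
inputs = list(product([0, 1], repeat=2))
hypothesis_space = [dict(zip(inputs, outputs)) for outputs in product([0, 1], repeat=4)]

# The version space is the subset consistent with every observation.
version_space = [h for h in hypothesis_space
                 if all(h[x] == y for x, y in observations.items())]

print(len(hypothesis_space), len(version_space))  # 16 hypotheses, 2 remain consistent
```
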
Key issues in machine learning:

What is a good hypothesis space?
What algorithms work with that space?
What can I do to optimize accuracy on unseen data?
How do we have confidence in the model?
Are there learning problems that are computationally intractable?
How can we formulate application problems as machine learning problems?

There are 3 concerns when choosing a hypothesis space:

Size: the number of hypotheses to choose from.
Randomness: stochastic or deterministic.
Parameters: the number and type of parameters.

There are 3 properties by which you could choose an algorithm:

Search procedure
o Direct computation: No search, just calculate what is needed.
o Local: Search through the hypothesis space to refine the hypothesis.
o Constructive: Build the hypothesis piece by piece.
Timing
o Eager: Learning performed up front. Most algorithms are eager.
o Lazy: Learning performed at the time that it is needed.
Online vs Batch (contrasted in the sketch below)
o Online: Learning based on each pattern as it is observed.
o Batch: Learning over groups of patterns. Most algorithms are batch.
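
To make the online vs batch distinction concrete, here is a minimal sketch that estimates a simple average both ways; the data stream is hypothetical, and a real learner would update model parameters rather than a plain mean.

```python
# Hypothetical stream of observed values.
stream = [2.0, 4.0, 6.0, 8.0]

# Batch: learn once over the whole group of patterns.
batch_mean = sum(stream) / len(stream)

# Online: update the estimate as each pattern is observed.
online_mean, n = 0.0, 0
for value in stream:
    n += 1
    online_mean += (value - online_mean) / n  # incremental update

print(batch_mean, online_mean)  # both converge to 5.0
```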
