UNIT-1
Machine Learning
Related Fields
Machine learning draws on and overlaps with many related fields: data mining, control theory, statistics, decision theory, information theory, cognitive science, databases, psychological models, evolutionary models, and neuroscience.
Machine learning is primarily concerned with the accuracy
and effectiveness of the computer system.
Some more examples of tasks that are best solved by using a learning algorithm:
Recognizing patterns: facial identities or facial expressions; medical images.
Generating patterns: generating images or motion sequences.
Recognizing anomalies: unusual sequences of credit card transactions.
Recommendation systems: lots of noisy data (e.g. the Netflix Prize's million-dollar challenge).
Information retrieval: finding documents or images with similar content.
Data visualization: displaying a huge database in a revealing way.
Types of Learning Tasks
Supervised learning
Learn to predict an output when given an input vector.
Reinforcement learning
Learn actions that maximize payoff; there is not much information in a payoff signal.
Unsupervised learning
Create an internal representation of the input, e.g. form clusters or extract features.
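Supervised learning, in the sense above, can be sketched with a toy example: predict an output for a new input vector from labeled examples. A 1-nearest-neighbour rule is used here purely for illustration (the data and function names are ours, not from the notes):

```python
# A minimal sketch of supervised learning: predict an output for a new
# input vector from labeled (input, output) training pairs, using the
# 1-nearest-neighbour rule for illustration.

def predict(train, x):
    """Return the label of the training example whose input is closest to x."""
    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    _, label = min(train, key=lambda pair: dist2(pair[0], x))
    return label

# Labeled (input vector, output) pairs.
train = [((0.0, 0.0), "negative"), ((1.0, 1.0), "positive")]
print(predict(train, (0.9, 0.8)))  # → positive
```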
DESIGNING A LEARNING SYSTEM
Steps for Designing a Learning System
Step 1: Choosing the Training Experience
For example, in learning to play checkers, the system might train on games it plays against itself, or on recorded games with moves labeled by a teacher.
Step 2: Choosing the Target Function
Step 3: Choosing a Representation for the Target Function
Choosing the Target Function
Let us therefore define the target value V(b) for an arbitrary board state b in B as follows:
1. If b is a final board state that is won, then V(b) = 100.
2. If b is a final board state that is lost, then V(b) = -100.
3. If b is a final board state that is drawn, then V(b) = 0.
4. If b is not a final state, then V(b) = V(b'), where b' is the best final board state that can be achieved starting from b and playing optimally until the end of the game.
Thus, our learning program will represent V̂(b) as a linear function of the form
V̂(b) = w0 + w1*x1 + w2*x2 + w3*x3 + w4*x4 + w5*x5 + w6*x6
where x1 through x6 are numerical features of the board state b and w0 through w6 are weights to be learned.
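Such a linear evaluation function is simple to compute. The sketch below evaluates it for one board, assuming six hypothetical board features x1..x6 and made-up weights w0..w6 (the values are illustrative only, not learned):

```python
# A minimal sketch of the linear evaluation function V_hat(b) for a board
# described by six features x1..x6, with weights w0..w6. The feature and
# weight values below are made up for illustration.

def v_hat(features, weights):
    """Evaluate V_hat(b) = w0 + w1*x1 + ... + w6*x6."""
    w0, ws = weights[0], weights[1:]
    return w0 + sum(w * x for w, x in zip(ws, features))

features = [12, 12, 0, 0, 1, 2]                     # x1..x6 (hypothetical)
weights = [0.5, 1.0, -1.0, 3.0, -3.0, -0.4, 0.4]    # w0..w6 (hypothetical)
print(v_hat(features, weights))
```

In Mitchell's checkers design, a learning procedure (such as the LMS weight-update rule) would adjust the weights from training examples rather than fixing them by hand.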
PERSPECTIVES AND ISSUES IN MACHINE LEARNING
Decision Tree Learning
A decision tree is a flowchart-like structure in which each internal node represents a test on a feature (e.g. whether a coin flip comes up heads or tails). Each leaf node represents a class label (the decision taken after computing all features), and branches represent conjunctions of features that lead to those class labels. The paths from root to leaf represent classification rules. A basic decision tree for deciding whether it will rain would end in the leaf labels Rain (Yes) and No Rain (No).
A decision tree
A decision tree for the concept buys computer,
indicating whether a customer at AllElectronics is
likely to purchase a computer.
Each internal (nonleaf) node represents a test on an
attribute.
Each leaf node represents a class (either buys
computer = yes or buys computer = no).
Decision Tree
Decision trees classify instances by sorting them down
the tree from the root to some leaf node, which
provides the classification of the instance.
Each node in the tree specifies a test of some attribute
of the instance.
Each branch descending from a node corresponds to
one of the possible values for the attribute.
Decision Tree
Each leaf node assigns a classification.
The instance (Outlook=Sunny, Temperature=Hot,
Humidity=High, Wind=Strong) is classified as a
negative instance.
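The sorting-down process can be sketched in code. The nested structure below encodes the usual PlayTennis tree (an assumption on our part, since the tree diagram itself does not appear in the text): internal nodes test an attribute, branches carry attribute values, and leaves carry class labels.

```python
# A minimal sketch of classifying an instance by sorting it down a decision
# tree. Internal nodes are (attribute, branches) pairs; leaves are labels.

tree = ("Outlook", {
    "Sunny":    ("Humidity", {"High": "No", "Normal": "Yes"}),
    "Overcast": "Yes",
    "Rain":     ("Wind", {"Strong": "No", "Weak": "Yes"}),
})

def classify(node, instance):
    """Follow attribute tests from the root until a leaf label is reached."""
    while isinstance(node, tuple):      # internal node: (attribute, branches)
        attribute, branches = node
        node = branches[instance[attribute]]
    return node                         # leaf: the class label

instance = {"Outlook": "Sunny", "Temperature": "Hot",
            "Humidity": "High", "Wind": "Strong"}
print(classify(tree, instance))  # → No
```

Note that Temperature is never tested on this path: the tree reaches the leaf after examining only Outlook and Humidity.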
When to Consider Decision Trees
Instances are represented by attribute-value pairs.
Fixed set of attributes, and the attributes take a small
number of disjoint possible values.
The target function has discrete output values.
Decision tree learning is appropriate for a boolean
classification, but it easily extends to learning
functions with more than two possible output values.
When to Consider Decision Trees
Disjunctive descriptions may be required; decision trees naturally represent disjunctive expressions.
The training data may contain errors: decision tree learning methods are robust to errors, both in the classifications of the training examples and in the attribute values that describe these examples.
When to Consider Decision Trees
The training data may contain missing attribute values: decision tree methods can be used even when some training examples have unknown values.
Decision tree learning has been applied to problems such as learning to classify:
medical patients by their disease
equipment malfunctions by their cause
loan applicants by their likelihood of defaulting on payment
Which Attribute is "best"?
We would like to select the attribute that is most useful
for classifying examples.
Information gain measures how well a given attribute
separates the training examples according to their
target classification.
ID3 uses this information gain measure to select among
the candidate attributes at each step while growing the
tree.
Which Attribute is ”best”?
In order to define information gain precisely, we use a measure commonly used in information theory, called entropy.
Entropy characterizes the (im)purity of an arbitrary collection of examples.
Appropriate Problems for Decision Tree Learning
THE BASIC DECISION TREE LEARNING ALGORITHM
ENTROPY MEASURES HOMOGENEITY OF EXAMPLES
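For a boolean classification, the standard information-theoretic definition makes this precise. With p(+) and p(-) the proportions of positive and negative examples in S:

Entropy(S) = -p(+) log2 p(+) - p(-) log2 p(-)

where 0 * log2(0) is defined to be 0. More generally, for a target attribute taking c different values with proportions p1, ..., pc:

Entropy(S) = sum over i of -pi log2 pi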
Defining Information Gain
We want to determine which attribute in a given set of
training feature vectors is most useful for
discriminating between the classes to be learned.
Information gain tells us how important a given
attribute of the feature vectors is.
We will use it to decide the ordering of attributes in the nodes of a decision tree.
Entropy
To illustrate, suppose S is a collection of 14 examples of some boolean concept, including 9 positive and 5 negative examples (we adopt the notation [9+, 5-] to summarize such a sample). Then the entropy of S is
Entropy([9+, 5-]) = -(9/14) log2(9/14) - (5/14) log2(5/14) = 0.940
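This computation is easy to check with a short script (a minimal sketch; the helper name `entropy` is ours):

```python
import math

def entropy(pos, neg):
    """Entropy of a boolean-labeled sample with pos positive, neg negative examples."""
    total = pos + neg
    result = 0.0
    for count in (pos, neg):
        p = count / total
        if p > 0:  # treat 0 * log2(0) as 0
            result -= p * math.log2(p)
    return result

print(round(entropy(9, 5), 3))  # ≈ 0.940
```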
Notice that the entropy is 0 if all members of S belong to the same class.
For example, if all members are positive (p(+) = 1), then p(-) is 0, and Entropy(S) = -1 * log2(1) - 0 * log2(0) = 0.
Note the entropy is 1 when the collection contains an equal number of positive and negative examples.
Training Examples for Play Tennis
Day  Outlook   Temp  Humidity  Wind    Play Tennis
D1   Sunny     Hot   High      Weak    No
D2   Sunny     Hot   High      Strong  No
D3   Overcast  Hot   High      Weak    Yes
D4   Rain      Mild  High      Weak    Yes
D5   Rain      Cool  Normal    Weak    Yes
D6   Rain      Cool  Normal    Strong  No
D7   Overcast  Cool  Normal    Weak    Yes
D8   Sunny     Mild  High      Weak    No
D9   Sunny     Cool  Normal    Weak    Yes
D10  Rain      Mild  Normal    Strong  Yes
D11  Sunny     Mild  Normal    Strong  Yes
D12  Overcast  Mild  High      Strong  Yes
D13  Overcast  Hot   Normal    Weak    Yes
D14  Rain      Mild  High      Strong  No
ID3: selecting the root attribute
Starting from S = [9+, 5-] with Entropy(S) = 0.940, ID3 computes the information gain of each candidate attribute (the slide compared Humidity, Wind, and Outlook). Splitting on Outlook partitions S into:
Sunny: [2+, 3-], Entropy = 0.971
Overcast: [4+, 0-], Entropy = 0.0
Rain: [3+, 2-], Entropy = 0.971
Gain(S, Outlook) = 0.940 - (5/14)*0.971 - (4/14)*0.0 - (5/14)*0.971 = 0.247
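The gain computation above can be reproduced in code (a minimal sketch; the helper names are ours, and the data is the Outlook column and PlayTennis labels from the training table):

```python
import math

def entropy(labels):
    """Entropy of a list of class labels."""
    total = len(labels)
    result = 0.0
    for c in set(labels):
        p = labels.count(c) / total
        result -= p * math.log2(p)
    return result

def info_gain(values, labels):
    """Information gain of splitting labels by the parallel attribute values."""
    total = len(labels)
    remainder = 0.0
    for v in set(values):
        subset = [l for x, l in zip(values, labels) if x == v]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder

# Outlook column and PlayTennis labels for days D1..D14.
outlook = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast",
           "Sunny", "Sunny", "Rain", "Sunny", "Overcast", "Overcast", "Rain"]
play = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes",
        "No", "Yes", "Yes", "Yes", "Yes", "Yes", "No"]
print(round(info_gain(outlook, play), 3))  # ≈ 0.247
```

Running the same computation for the other attributes shows why ID3 places Outlook at the root: its gain exceeds that of Humidity, Wind, and Temperature on this data.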
66
Result for ID3
Note that every example for which Outlook = Overcast is
also a positive example of PlayTennis.