
Machine Learning

LECTURE – 05
DECISION TREES (DT)
Decision Tree
The decision tree is one of the most important
machine learning algorithms. It is used for
both classification and regression problems.

A decision tree is like a flowchart that helps you make decisions step by step. Each decision (like "Is it raining?") leads to more decisions until you reach a conclusion (like "Wear a raincoat").
Decision Tree
A decision tree is a classification and prediction tool with a tree-like structure,
where each internal node denotes a test on an attribute, each branch represents
an outcome of the test, and each leaf node (terminal node) holds a class label.
It is used to make decisions based on input data by branching through different
possibilities until a decision is reached.
Entropy
In machine learning, entropy is a measure of the randomness or uncertainty in
the information being processed. The higher the entropy, the harder it is to draw
any conclusions from that information.
Entropy is a logarithmic measure of impurity: it is zero when every example in a
node belongs to the same class, and highest when the classes are evenly mixed.
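For a set S in which class i occurs with proportion p_i, entropy is

H(S) = -\sum_{i} p_i \log_2 p_i

A minimal Python sketch of this formula (plain standard-library code; the list-of-labels input is an assumption for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(S) = -sum(p_i * log2(p_i)) over class proportions."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

# A 50/50 mix is maximally uncertain (1 bit); a pure set has entropy 0.
print(entropy(["Yes", "No", "Yes", "No"]))  # 1.0
```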
Information Gain
Information gain is the amount of information gained about a random variable by
observing another random variable. In a decision tree, it is the difference between
the entropy of the parent node and the weighted average entropy of the child nodes:
the lower the entropy after a split, the more information the split provides, so the
split with the highest information gain is the best one. Decision trees and random
forests use information gain to decide the best split. In other words, it measures how
much a particular question reduces the uncertainty (entropy) in the decision-making
process.
IG(S, A) = H(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} H(S_v)

where IG(S, A) is the information gain from splitting dataset S on attribute A,
H(S) is the entropy of the original dataset S, and H(S_v) is the entropy of the
subset S_v of examples for which attribute A takes the value v, weighted by its
relative size |S_v| / |S|.
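A minimal sketch of this formula in the same plain-Python style (the one-list-per-attribute representation is an assumption for illustration):

```python
import math
from collections import Counter

def entropy(labels):
    """H(S): Shannon entropy of a list of class labels."""
    total = len(labels)
    return -sum((count / total) * math.log2(count / total)
                for count in Counter(labels).values())

def information_gain(values, labels):
    """IG(S, A): parent entropy minus the size-weighted entropy of the
    subsets S_v obtained by grouping the examples by attribute value."""
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    weighted = sum((len(subset) / len(labels)) * entropy(subset)
                   for subset in subsets.values())
    return entropy(labels) - weighted
```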
Decision tree algorithms
There are many algorithms for building a decision tree. Two common ones are:

1. CART (Classification and Regression Trees): uses Gini impurity as its split metric.
2. ID3: uses entropy and information gain as its split metric.
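Both metrics measure node impurity: each is zero for a pure node and peaks when the classes are evenly mixed, so in practice they often choose similar splits. A quick comparison in the same plain-Python style:

```python
import math

def gini(probs):
    """Gini impurity, used by CART: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in probs)

def entropy(probs):
    """Shannon entropy, used by ID3: -sum(p_i * log2(p_i))."""
    return sum(-p * math.log2(p) for p in probs if p > 0)

print(gini([0.5, 0.5]), entropy([0.5, 0.5]))  # 0.5 1.0 (evenly mixed)
print(gini([1.0, 0.0]), entropy([1.0, 0.0]))  # 0.0 0.0 (pure node)
```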
Example of ID3
Classification using the ID3 algorithm
Consider a dataset based on which we will determine whether to play
golf or not.
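The worked table itself is not reproduced here, so as a stand-in, the sketch below runs ID3's first step on a small invented weather dataset (reusing entropy() and information_gain() from the sketch above): compute the information gain of every attribute and split on the winner.

```python
# Invented mini-dataset for illustration: one list per attribute.
outlook  = ["Sunny", "Sunny", "Overcast", "Rain", "Rain", "Rain", "Overcast", "Sunny"]
humidity = ["High", "High", "High", "High", "Normal", "Normal", "Normal", "Normal"]
wind     = ["Weak", "Strong", "Weak", "Weak", "Weak", "Strong", "Strong", "Weak"]
play     = ["No", "No", "Yes", "Yes", "Yes", "No", "Yes", "Yes"]

# ID3's root split: the attribute with the highest information gain.
gains = {
    "Outlook": information_gain(outlook, play),
    "Humidity": information_gain(humidity, play),
    "Wind": information_gain(wind, play),
}
print(max(gains, key=gains.get))  # "Outlook" wins on this toy data
```

ID3 then recurses: for each value of the chosen attribute it repeats the same computation on the matching subset, stopping when a subset is pure (entropy 0) and turning it into a leaf.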
Advantages
•Simple to understand and to interpret.
•Requires little data preparation.
•The cost of using the tree (i.e., predicting data) is logarithmic in the number of
data points used to train the tree.
•Able to handle both numerical and categorical data.
•Able to handle multi-output problems.
Disadvantages
•Prone to Overfitting.
•Unstable to Changes in the Data: a slightly different training set can produce a very different tree.
•Unstable to Noise.
•Non-Continuous: predictions are piecewise constant, so smooth relationships are modeled poorly.
•Unbalanced Classes: splits become biased toward the majority class.
•Greedy Algorithm: makes the locally optimal choice at each step without considering whether that choice leads to the globally best tree.
•Computationally Expensive on Large Datasets: training requires many complex calculations.
