
1. What is Decision Tree Model
2. Why Decision Tree Model Over Linear Model
3. Building a Decision Tree
4. Advantages and Disadvantages of Decision Tree
5. Conclusion
What is Decision Tree Model?
A Decision Tree Model is a type of machine learning algorithm that is used for both regression
and classification problems. It is a tree-structured model that splits the data into smaller
subsets based on the most significant input feature, recursively dividing the data until a pure
subset is achieved. Each split of the tree results in a decision node, and each leaf of the tree
represents a prediction. The final prediction is made by following the path from the root of the
tree to a leaf and taking the average of the target values for the instances that end up at that
leaf.
The algorithm is based on the idea of the “decision tree,” where the data is divided into
smaller and smaller subsets based on the value of the input features, until the target values for
each instance can be predicted with a high degree of accuracy. Decision Trees are commonly
used for a wide range of applications, including classification, regression, and feature
selection. They are also easy to understand and interpret, making them a popular choice for
data scientists and business analysts.
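
As a small illustration of leaf averaging, the sketch below fits a one-split regression tree with scikit-learn; the library and the tiny dataset are assumptions added here for self-containment, not part of the original text.

from sklearn.tree import DecisionTreeRegressor

# Hypothetical toy data: one input feature and a numeric target.
X = [[5], [6], [7], [20], [21], [22]]
y = [1.0, 1.2, 1.1, 9.8, 10.0, 10.2]

# max_depth=1 allows exactly one split, producing two leaves.
tree = DecisionTreeRegressor(max_depth=1)
tree.fit(X, y)

# Each prediction is the mean target of the training rows that reach the same
# leaf: roughly 1.1 for small inputs and 10.0 for large ones.
print(tree.predict([[6.5], [21.5]]))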

Why Decision Tree Model Over Linear Model?


Decision trees offer some advantages over linear models, which are mentioned below:
• Decision tree predictions are straightforward to interpret.
• A decision tree is flexible in design.
• It makes no particular assumptions regarding the type of attributes in a data set.
• It can handle any type of data, including numeric, categorical, textual, and Boolean data, without a hitch.
• A decision tree is not affected by the scale of the features.
• It handles multicollinearity well and does not require normalization, since it only needs to compare values within a single attribute.
• Decision trees are effective and fast algorithms that can capture complex relationships, and they work well in situations where a single linear relationship between the target and the feature variables cannot be fit (a short sketch follows this list).
• They frequently give us an idea of the relative importance of the explanatory attributes that are used for prediction.
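
To make these points concrete, here is a hedged sketch, assuming scikit-learn and NumPy are available, that compares a linear model and a shallow decision tree on a step-shaped relationship which no single straight line fits well.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

# Synthetic step-shaped data: the target jumps from 2 to 8 at x = 5.
rng = np.random.default_rng(0)
X = np.linspace(0, 10, 200).reshape(-1, 1)
y = np.where(X.ravel() < 5, 2.0, 8.0) + rng.normal(0, 0.1, 200)

linear = LinearRegression().fit(X, y)
tree = DecisionTreeRegressor(max_depth=2).fit(X, y)

# The tree recovers the step almost exactly; the straight line cannot.
print("linear R^2:", round(linear.score(X, y), 3))
print("tree   R^2:", round(tree.score(X, y), 3))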
Building A Decision Tree
Constructing a decision tree involves the following steps:
1. Collect and preprocess the data: This involves gathering and cleaning the data to make
it suitable for analysis. This may include tasks such as removing missing values, handling
outliers, and encoding categorical variables.
2. Split the data into training and testing sets: This involves splitting the data into two
separate sets: one for training the model and the other for testing its performance.
3. Define the target variable and features: The target variable is the variable we want to
predict, while the features are the variables that we will use to make the predictions.
4. Build the decision tree model: This involves using an algorithm (such as ID3, C4.5, or
CART) to create a decision tree based on the training data. The algorithm will determine the
best splits at each node to maximize the information gain or minimize the impurity.
5. Evaluate the model: This involves using the testing data to evaluate the performance of
the model. Common metrics used to evaluate a decision tree model include accuracy,
precision, recall, F1-score, and AUC-ROC.
6. Optimize the model: This involves tuning the model hyperparameters (such as the
maximum depth of the tree, the minimum number of samples required to split a node, and the
impurity measure) to improve its performance on the testing data.
7. Deploy the model: Once the model has been optimized, it can be deployed in
production to make predictions on new data.
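
The sketch below walks through steps 2 to 6 with scikit-learn's CART implementation; the iris dataset and the specific hyperparameter grid are illustrative assumptions rather than prescriptions from the steps above.

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import classification_report

# Step 3: features and target variable (iris used as a stand-in dataset).
X, y = load_iris(return_X_y=True)

# Step 2: split into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Steps 4 and 6: build the tree and tune key hyperparameters by cross-validation.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid={"max_depth": [2, 3, 4, None],
                "min_samples_split": [2, 5, 10],
                "criterion": ["gini", "entropy"]},
    cv=5)
search.fit(X_train, y_train)

# Step 5: evaluate the tuned model on the held-out test set.
print(search.best_params_)
print(classification_report(y_test, search.predict(X_test)))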


The process of building a decision tree is top-down. The term “top-down strategy” describes
the method of starting with the entire collection of data and subsequently breaking it down
into smaller and smaller subgroups.
We refer to the process as greedy because it disregards what will occur two or three steps
later: the method only seeks the immediate improvement obtained when the data at a single
node is split according to a specific rule on one attribute, so it is not holistic in nature. A
consequence of this is that small modifications in the input data can lead to large changes in
the tree’s complete structure, since they alter how the data is divided and every choice made
further down the tree.
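
A rough sketch of this instability, assuming scikit-learn and NumPy and using the iris dataset purely as a stand-in: refitting an unrestricted tree after dropping a handful of training rows typically produces a tree with a different shape.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

for trial in range(3):
    # Drop 10 random rows and refit an unconstrained tree each time.
    keep = rng.choice(len(X), size=len(X) - 10, replace=False)
    tree = DecisionTreeClassifier(random_state=0).fit(X[keep], y[keep])
    print(f"trial {trial}: depth={tree.get_depth()}, leaves={tree.get_n_leaves()}")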

Advantages & Disadvantages Of Decision Tree


The advantages of decision trees can be summarized as follows:
1. Decision tree predictions are simple to understand.
2. Discrete and continuous variables can both be used with decision trees.
3. They are capable of handling data that are linearly separable and non-separable.
4. They make no specific assumptions about the characteristics of the attributes in a data set.
All types of data, including numeric, categorical, textual, and Boolean data, can be handled by
them with ease.
5. Since they only need to compare values for splitting within an attribute, they do not require
the data to be normalised. Therefore, there is minimal to no need for data preprocessing.
6. Decision trees frequently help you determine the relative importance of the explanatory
factors that are used to make predictions. Intuitively, the closer to the root node a variable is
used for splitting, the more significant it is.
7. The principles are simple to understand since decision trees use a strategy similar to what
people typically use when making judgements.
8. Complex models can be made simpler using decision tree representations that resemble
trees, and even a layperson can comprehend the reasoning behind the decisions/predictions.
The disadvantages of decision trees can be summarized as follows:
1. Decision trees tend to overfit the data. If allowed to grow with no check on its complexity, a
decision tree will keep splitting until it has correctly classified all the data points in the
training data set.
2. Decision trees tend to be extremely unstable, which is an implication of overfitting. A few
changes in the data can change a tree considerably.
3. The mathematical calculations of entropy and information gain (IG) across all the features
require a lot of time and memory, as the split must be evaluated for every feature at every
candidate splitting point (a short sketch of this computation follows the list).
4. Greedy algorithms do not always give a globally optimal model. Decision trees are called
greedy because you have to optimize at every split and not overall. The process is not holistic
in nature, as it only aims to gain an immediate result that is derived after splitting the data at a
particular node based on a certain rule of the attribute. This can be handled well using random
forests.
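
The entropy and information gain computation mentioned in point 3 can be sketched in a few lines; NumPy and the toy labels below are assumptions added for illustration. The cost noted above comes from re-evaluating this gain for every feature at every candidate split point.

import numpy as np

def entropy(labels):
    # Shannon entropy of a vector of class labels, in bits.
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def information_gain(labels, left_mask):
    # Parent entropy minus the size-weighted entropy of the two child subsets.
    n = len(labels)
    left, right = labels[left_mask], labels[~left_mask]
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted

# Hypothetical toy split: thresholding x at 2.5 separates the classes perfectly,
# so the gain equals the full parent entropy of 1 bit.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array(["no", "no", "yes", "yes"])
print(information_gain(y, x <= 2.5))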

Conclusion
In conclusion, a predictive decision tree model is a powerful and widely used machine
learning technique for predictive modelling and decision-making. It works by recursively
partitioning the data into subsets based on the most informative features to create a tree-like
structure that can be used to make predictions. Decision trees are easy to understand and
interpret, making them a popular choice for both beginners and experts in the field of machine
learning. However, the effectiveness of the model depends on the quality of the data and the
choice of hyperparameters. Overall, predictive decision tree models have proven to be a
valuable tool in many fields, including healthcare, finance, marketing, and more.
Prediction with Decision Trees

Model building is the main task of any data science project once the data has been understood,
some attributes have been processed, and the attributes’ correlations and individual prediction
power have been analysed, as described in the previous chapters. There are many ways to build a
prediction model. In this chapter, we will demonstrate how to build a prediction model with one
of the simplest algorithms: the decision tree.

A decision tree is a commonly used classification model, which is a flowchart-like tree
structure. In a decision tree, each internal node (non-leaf node) denotes a test on an
attribute, each branch represents an outcome of the test, and each leaf node (or terminal
node) holds a class label. The topmost node in a tree is the root node. A typical decision
tree is shown in Figure 8.1.
Figure 8.1: An example of a decision tree

It represents the concept buys_computer; that is, it predicts whether a customer is likely to
buy a computer or not, where ‘yes’ means likely to buy and ‘no’ means unlikely to buy. Internal
nodes are denoted by rectangles; they are the test conditions. Leaf nodes are denoted by ovals,
which hold the final predictions. Some decision trees produce binary trees, where each internal
node branches to exactly two other nodes. Others can produce non-binary trees; for example, the
age? node in the tree above has three branches.
A decision tree is built by a process called tree induction, which is the learning or
construction of a decision tree from a class-labelled training dataset. Once a decision tree
has been constructed, it can be used to classify a test dataset; this process is called
deduction.

The deduction process starts from the root node of a decision tree: we apply the test
condition to a record or data sample and follow the appropriate branch based on the
outcome of the test. This leads us either to another internal node, for which a new test
condition is applied, or to a leaf node. The class label associated with the leaf node is then
assigned to the record or the data sample. For example, to predict a new data input
with 'age=senior' and 'credit_rating=excellent', the traversal starts from the root, proceeds
along the rightmost branches of the decision tree, and reaches a leaf labelled yes, as indicated
by the dotted line in Figure 8.1.
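
To illustrate the traversal, here is a rough sketch in which the tree is encoded as a nested dictionary. Only the path walked above (age=senior and credit_rating=excellent leading to yes) is taken from the text; the remaining branch labels and outcomes are assumptions made so the example is self-contained and need not match Figure 8.1 exactly.

# Internal nodes map an attribute name to its branches; plain strings are leaves.
# Branches other than the senior/excellent path are assumed for illustration.
tree = {
    "age": {
        "youth":       {"student": {"no": "no", "yes": "yes"}},
        "middle_aged": "yes",
        "senior":      {"credit_rating": {"fair": "no", "excellent": "yes"}},
    }
}

def classify(node, record):
    # Follow branches until a leaf (a plain class label) is reached.
    while isinstance(node, dict):
        attribute = next(iter(node))          # the test condition at this node
        node = node[attribute][record[attribute]]
    return node

# The example from the text: age=senior, credit_rating=excellent -> 'yes'.
print(classify(tree, {"age": "senior", "credit_rating": "excellent"}))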
Building a decision tree classifier requires two decisions:

1. Which attributes to use for the test conditions?
2. In what order should they be applied?

Answering these two questions differently leads to different decision tree algorithms.
Different decision trees can have different prediction accuracy on the test dataset, and some
decision trees are more accurate and cheaper to run than others. Finding the optimal tree is
computationally expensive and sometimes impossible because of the exponential size of the
search space. In practice, we therefore seek efficient algorithms that are reasonably accurate
and run in a reasonable amount of time. Hunt’s algorithm, ID3, C4.5, and CART are all
classification algorithms of this kind. The common feature of these algorithms is that they all
employ a greedy strategy, as demonstrated in Hunt’s algorithm.
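
As a brief illustration of how answering these questions differently changes the result, the sketch below fits the same data with two splitting criteria using scikit-learn, which implements an optimised version of CART; the breast-cancer dataset is only a stand-in assumption.

from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# The same greedy induction with different split criteria often yields trees of
# different depth and leaf count, i.e. different answers to "which attribute,
# in what order".
for criterion in ["gini", "entropy"]:
    tree = DecisionTreeClassifier(criterion=criterion, random_state=0).fit(X, y)
    print(criterion, "-> depth", tree.get_depth(), ", leaves", tree.get_n_leaves())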
