
CA- PROJECT

ARYAN
DEVESH
PUJA
SHABNAS
MUDIT
What is a Decision Tree Algorithm?
● Decision tree learning is one of the predictive modelling
approaches used in statistics, data mining and machine
learning.
● It uses a decision tree to go from observations about an item
to conclusions about the item's target value.
● Tree models where the target variable can take a discrete set
of values are called classification trees.
● Decision trees where the target variable can take continuous
values are called regression trees.
● Decision trees are among the most popular machine learning
algorithms because they are intelligible and simple to use.

Types of decision tree -

● Categorical Variable Decision Tree: a decision tree with a
categorical target variable is called a categorical variable
decision tree.

● Continuous Variable Decision Tree: a decision tree with a
continuous target variable is called a continuous variable
decision tree.
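A minimal sketch of the two types, assuming scikit-learn is available; the tiny dataset below is invented purely for illustration:

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[5.1, 3.5], [6.2, 2.9], [4.7, 3.2], [6.9, 3.1]]

# Categorical target -> categorical variable (classification) tree
y_class = ["setosa", "versicolor", "setosa", "versicolor"]
clf = DecisionTreeClassifier(random_state=0).fit(X, y_class)
print(clf.predict([[5.0, 3.4]]))  # predicts a class label

# Continuous target -> continuous variable (regression) tree
y_reg = [1.4, 4.3, 1.3, 4.9]
reg = DecisionTreeRegressor(random_state=0).fit(X, y_reg)
print(reg.predict([[5.0, 3.4]]))  # predicts a numeric value
```

The only difference on the user's side is the estimator class and the kind of target variable; the tree-building machinery is the same.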
Different uses of Decision Tree

1. Using demographic data to find prospective clients
Decision trees can be applied to demographic data to find
prospective clients. They can help in streamlining a marketing
budget and in making informed decisions about the target
market the business is focused on.

2. Assessing prospective growth opportunities
One of the applications of decision trees involves evaluating
prospective growth opportunities for businesses based on
historical data. Historical data on sales can be used in
decision trees that may lead to radical changes in the
strategy of a business to help aid expansion and growth.

3. Serving as a support tool in several fields
Lenders also use decision trees to predict the probability of a
customer defaulting on a loan, by building a predictive model
from the client's past data. A decision tree support tool can
help lenders evaluate the creditworthiness of a customer and
prevent losses.
Decision Tree Algorithm

Advantages:
● Easy to explain and perfect for visual representation.
● Can be used for both continuous and categorical data.
● Requires little data preprocessing.
● The data ends up in distinct groups that are often easier to
understand and infer.

Disadvantages:
● Generally gives lower prediction accuracy compared to other
algorithms.
● High probability of overfitting.
● Information gain with categorical variables gives a biased
response for attributes with a greater number of categories.
● Trees fail to capture the linear relationship between input
and output.
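The overfitting risk listed among the disadvantages is usually mitigated by constraining tree growth. A small sketch, assuming scikit-learn and its bundled iris dataset (our choice of example, not the slides'), comparing an unconstrained tree with a depth-limited one:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# An unconstrained tree can memorise the training set (overfitting);
# capping max_depth trades training accuracy for generalisation.
full = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
pruned = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_tr, y_tr)

print("full tree  - train:", full.score(X_tr, y_tr), "test:", full.score(X_te, y_te))
print("depth <= 3 - train:", pruned.score(X_tr, y_tr), "test:", pruned.score(X_te, y_te))
```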
What is a Random Forest?
● Random forest is a supervised learning algorithm.
The "forest" it builds is an ensemble of decision
trees, usually trained with the "bagging" method.
● Random forest, as its name implies, consists of a
large number of individual decision trees that
operate as an ensemble. Each individual tree in the
random forest spits out a class prediction, and the
class with the most votes becomes the model's
prediction.

How Random forest works -


● Step 1 − First, start with the selection of random
samples from a given dataset.
● Step 2 − Next, this algorithm will construct a decision
tree for every sample. Then it will get the prediction
result from every decision tree.
● Step 3 − In this step, voting will be performed for
every predicted result.
● Step 4 − At last, select the most voted prediction
result as the final prediction result.
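The four steps above can be sketched as a toy bagging loop (assuming scikit-learn and NumPy; in practice scikit-learn's RandomForestClassifier does this internally, with extra per-split feature subsampling):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
rng = np.random.default_rng(0)

# Steps 1 and 2: draw random (bootstrap) samples and build one tree per sample
trees = []
for _ in range(25):
    idx = rng.integers(0, len(X), size=len(X))  # sampling with replacement
    trees.append(DecisionTreeClassifier().fit(X[idx], y[idx]))

# Step 3: collect every tree's prediction for the query points
query = X[:5]
votes = np.array([t.predict(query) for t in trees])  # shape (25, 5)

# Step 4: the most voted class is the final prediction
forest_pred = np.array([np.bincount(col).argmax() for col in votes.T])
print(forest_pred)
```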
What are the uses of Random forest?
1. Random forest algorithm can be used for both classification
and regression tasks.
2. It has the power to handle large data sets with higher
dimensionality.
3. It provides higher accuracy through cross-validation.
4. Random forest classifier will handle missing values and
maintain the accuracy of a large proportion of data.
5. If there are more trees, it won't allow over-fitting trees
in the model.
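The cross-validation point above can be illustrated with a short sketch (scikit-learn assumed; the dataset choice is ours, for illustration only):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# 5-fold cross-validated accuracy for a single tree vs. a forest
tree_acc = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5).mean()
forest_acc = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean()
print(f"single tree: {tree_acc:.3f}, random forest: {forest_acc:.3f}")
```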


Random Forest

Advantages:
● Works well for both classification and regression problems.
● Power to handle data sets with higher dimensions, making it
suitable for complicated tasks.
● Has an effective method for estimating missing data and
maintains accuracy.
● The presence of a large number of trees enhances the final
prediction.

Disadvantages:
● Does a good job at classification, but not at regression with
the same effectiveness.
● A large number of trees on large data sets makes the
algorithm too slow for processing.
● Difficult to interpret; acts like a black box.
● Computationally expensive.
Methodology and Analysis

01 Raw Data

02 Data Cleansing
⮚ Surface analysis of the data.
⮚ Finding the relevant and irrelevant variables intuitively.
⮚ The variables which contain more than 15% of their entries as
"NA" are not selected.
⮚ Replaced "NA" entries in the other variables with the average,
median or mode of that variable.
⮚ Only one variable is considered out of two or three variables
with the same data.

03 Final Data
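The cleansing steps above can be sketched with pandas (the column names and data are hypothetical; the 15% threshold is from the slide):

```python
import numpy as np
import pandas as pd

# Toy raw data with missing entries (hypothetical columns)
df = pd.DataFrame({
    "age":    [25, 30, np.nan, 40, 35, 28, 33],              # 1/7 ~ 14% NA -> kept
    "income": [50, np.nan, np.nan, np.nan, 70, np.nan, 60],  # ~57% NA -> dropped
    "city":   ["A", "B", "B", "A", "B", "A", "B"],
    "region": ["A", "B", "B", "A", "B", "A", "B"],           # duplicates "city"
})

# Variables with more than 15% "NA" entries are not selected
df = df.loc[:, df.isna().mean() <= 0.15]

# Replace remaining "NA" entries, e.g. with the median of that variable
df["age"] = df["age"].fillna(df["age"].median())

# Keep only one variable out of columns carrying the same data
df = df.T.drop_duplicates().T
print(df.columns.tolist())  # ['age', 'city']
```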

Analysis

Algorithms
1. Decision tree
2. Random Forest

Results
Results are obtained by coding with the decision tree algorithm
as well as the random forest algorithm.
