
Decision Tree Classification Algorithm

Decision Tree is a supervised learning technique that can be used for both Classification and Regression problems, but mostly it is preferred for solving Classification problems. It is a tree-structured classifier, where internal nodes represent the features of a dataset, branches represent the decision rules, and each leaf node represents the outcome.

In a decision tree, there are two kinds of node: the Decision Node and the Leaf Node. Decision nodes are used to make a decision and have multiple branches, whereas leaf nodes are the outputs of those decisions and do not contain any further branches.

The decisions or tests are performed on the basis of features of the given dataset.

A decision tree is a graphical representation for getting all the possible solutions to a problem/decision based on given conditions.

It is called a decision tree because, similar to a tree, it starts with the root node, which expands into further branches and constructs a tree-like structure.

To build a tree, we use the CART algorithm, which stands for Classification And Regression Tree algorithm.

A decision tree simply asks a question and, based on the answer (Yes/No), further splits the tree into subtrees. [Figure: the general structure of a decision tree]

Linear Regression in Machine Learning

The linear regression algorithm shows a linear relationship between a dependent (y) variable and one or more independent (x) variables; hence it is called linear regression. Since linear regression shows a linear relationship, it finds how the value of the dependent variable changes according to the value of the independent variable.

The linear regression model provides a sloped straight line representing the relationship between the variables.

Mathematically, we can represent a linear regression as:

y = a0 + a1x + ε

Here,
y = dependent variable (target variable)
x = independent variable (predictor variable)
a0 = intercept of the line (gives an additional degree of freedom)
a1 = linear regression coefficient (scale factor applied to each input value)
ε = random error

Types of Linear Regression
Linear regression can be further divided into two types of algorithm:

Simple Linear Regression: If a single independent variable is used to predict the value of a numerical dependent variable, then such a linear regression algorithm is called Simple Linear Regression.
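The equation y = a0 + a1x + ε can be fitted by ordinary least squares. A minimal pure-Python sketch (the names a0 and a1 follow the equation above; the sample data is illustrative):

```python
def fit_simple_linear_regression(xs, ys):
    """Closed-form ordinary least squares for y = a0 + a1*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # a1 = covariance(x, y) / variance(x); a0 shifts the line to the means
    a1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    a0 = mean_y - a1 * mean_x
    return a0, a1

# Noise-free points on the line y = 2 + 3x, so the fit recovers a0=2, a1=3.
a0, a1 = fit_simple_linear_regression([1, 2, 3, 4], [5, 8, 11, 14])
print(a0, a1)  # → 2.0 3.0
```

With noisy data the same formulas give the best-fitting line rather than an exact recovery; the residuals play the role of the random error ε.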


Multiple Linear Regression: If more than one independent variable is used to predict the value of a numerical dependent variable, then such a linear regression algorithm is called Multiple Linear Regression.

Linear Regression Line
A straight line showing the relationship between the dependent and independent variables is called a regression line. A regression line can show two types of relationship: a positive or a negative linear relationship.

Random Forest Algorithm
Random Forest is a popular machine learning algorithm that belongs to the supervised learning technique. It can be used for both Classification and Regression problems in ML. It is based on the concept of ensemble learning, which is a process of combining multiple classifiers to solve a complex problem and to improve the performance of the model.
As the name suggests, "Random Forest is a classifier that contains a number of decision trees on various subsets of the given dataset and takes the average to improve the predictive accuracy of that dataset." Instead of relying on one decision tree, the random forest takes the prediction from each tree and, based on the majority votes of predictions, predicts the final output.

A greater number of trees in the forest leads to higher accuracy and prevents the problem of overfitting.

Advantages of Random Forest
Random Forest is capable of performing both Classification and Regression tasks.
It is capable of handling large datasets with high dimensionality.
It enhances the accuracy of the model and prevents the overfitting issue.

Disadvantages of Random Forest
Although random forest can be used for both classification and regression tasks, it is not well suited to Regression tasks.

Bagging vs Boosting

Bagging: Various training data subsets are randomly drawn with replacement from the whole training dataset.
Boosting: Each new subset contains the components that were misclassified by previous models.

Bagging: Attempts to tackle the over-fitting issue.
Boosting: Tries to reduce bias.

Bagging: If the classifier is unstable (high variance), then we apply bagging.
Boosting: If the classifier is steady and straightforward (high bias), then we apply boosting.

Bagging: Every model receives an equal weight.
Boosting: Models are weighted by their performance.

Bagging: The objective is to decrease variance, not bias.
Boosting: The objective is to decrease bias, not variance.

Bagging: It is the easiest way of combining predictions that belong to the same type.
Boosting: It is a way of combining predictions that belong to different types.
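The bagging procedure described above (bootstrap sampling plus majority vote) can be sketched in pure Python. The DecisionStump class here is a hypothetical one-feature threshold classifier standing in for a full decision tree; the data and names are illustrative assumptions, not part of the original text:

```python
import random
from collections import Counter

class DecisionStump:
    """A one-feature threshold classifier, used as a hypothetical
    stand-in for a full decision tree: predict 1 if x >= threshold."""
    def fit(self, xs, ys):
        ones = [x for x, y in zip(xs, ys) if y == 1]
        zeros = [x for x, y in zip(xs, ys) if y == 0]
        if ones and zeros:
            # put the threshold halfway between the two class means
            self.threshold = (sum(ones) / len(ones) + sum(zeros) / len(zeros)) / 2
        elif ones:                      # bootstrap sample had only class 1
            self.threshold = min(ones)
        else:                           # bootstrap sample had only class 0
            self.threshold = float("inf")
        return self

    def predict(self, x):
        return 1 if x >= self.threshold else 0

def bagging_predict(xs, ys, query, n_models=25, seed=0):
    """Bagging: train each model on a bootstrap sample (drawn with
    replacement from the whole training set), then majority-vote."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        idx = [rng.randrange(len(xs)) for _ in xs]   # bootstrap indices
        xb = [xs[i] for i in idx]
        yb = [ys[i] for i in idx]
        votes.append(DecisionStump().fit(xb, yb).predict(query))
    return Counter(votes).most_common(1)[0][0]       # majority vote

xs = [1, 2, 3, 10, 11, 12]   # one feature
ys = [0, 0, 0, 1, 1, 1]      # two classes
print(bagging_predict(xs, ys, query=11))
```

Because every model votes with equal weight and each is trained on its own resample, individual unstable models average out; this is exactly the variance-reduction property the comparison above attributes to bagging.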
Designing a Learning System in Machine Learning:
According to Tom Mitchell, "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."

Example: In spam e-mail detection,
Task, T: To classify mails into Spam or Not Spam.
Performance measure, P: Total percent of mails being correctly classified as "Spam" or "Not Spam".
Experience, E: A set of mails with their labels ("Spam" / "Not Spam").

Steps for Designing a Learning System are:

Step 1 - Choosing the Training Experience: The first and very important task is to choose the training data or training experience which will be fed to the machine learning algorithm. It is important to note that the data or experience we feed to the algorithm has a significant impact on the success or failure of the model, so the training data or experience should be chosen wisely.

Step 2 - Choosing the Target Function: The next important step is choosing the target function. It means that, according to the knowledge fed to the algorithm, the machine learning system will choose a NextMove function which will describe what type of legal moves should be taken. For example, while playing chess with an opponent, when the opponent plays, the machine learning algorithm decides what the possible legal moves are in order to succeed.

Step 3 - Choosing a Representation for the Target Function: When the machine algorithm knows all the possible legal moves, the next step is to choose the optimized move using some representation, e.g. linear equations, a hierarchical graph representation, tabular form, etc. The NextMove function will pick, out of these moves, the one that provides the greatest success rate. For example, while playing chess, if the machine has four possible moves, it will choose the optimized move that provides success to it.

Step 4 - Choosing a Function Approximation Algorithm: An optimized move cannot be chosen just with the training data. The training data has to go through a set of examples, and through these examples the system approximates which steps should be chosen; after that, the machine provides feedback on it. For example, when training data for playing chess is fed to the algorithm, it is not known at that point whether the machine will fail or succeed, and from that failure or success it measures, for the next move, which step should be chosen and what its success rate is.

Step 5 - Final Design: The final design is created at last, when the system has gone through a number of examples, failures and successes, correct and incorrect decisions, and so on. Example: Deep Blue is an intelligent, ML-based computer that won a chess game against the chess expert Garry Kasparov, and it became the first computer to have beaten a human.
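The spam example's performance measure P (percent of mails correctly classified) takes only a few lines to compute. This is a minimal sketch; the keyword rule and the sample mails are illustrative assumptions standing in for a learned target function, not part of the original text:

```python
# Task T: classify mails as "Spam" / "Not Spam".
# Experience E: a set of mails with known labels.
# Performance P: percent of mails correctly classified.

def classify(mail):
    """A toy stand-in for a learned target function: flag a mail
    as Spam if it contains an obvious spam keyword."""
    keywords = ("win money", "free prize")
    return "Spam" if any(k in mail.lower() for k in keywords) else "Not Spam"

def performance_P(mails_with_labels):
    """P: total percent of mails correctly classified."""
    correct = sum(classify(mail) == label for mail, label in mails_with_labels)
    return 100.0 * correct / len(mails_with_labels)

experience_E = [
    ("Win money now!!!", "Spam"),
    ("Meeting moved to 3pm", "Not Spam"),
    ("Claim your free prize", "Spam"),
    ("Quarterly report attached", "Not Spam"),
]
print(performance_P(experience_E))  # → 100.0
```

In a real learning system, classify would be the function the algorithm improves with experience E, and P measured on held-out mails would track that improvement.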