
Random Forest Classifiers

What is a Random Forest Classifier?


Random forest is a supervised learning algorithm that can be used for both
classification and regression. A random forest classifier consists of many
decision trees. It uses bagging and feature randomness when building each
individual tree to try to create an uncorrelated forest of trees whose
prediction by committee is more accurate than that of any individual tree.
Random forests have a variety of applications, such as recommendation engines,
image classification and feature selection. They lie at the base of the Boruta
algorithm, which selects important features in a dataset. Random forests use
Gini importance, also known as mean decrease in impurity (MDI), to calculate
the importance of each feature.
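
To make the feature importance idea concrete, here is a minimal sketch that reads impurity-based importances from a fitted forest. It assumes scikit-learn is installed; the Iris dataset and the parameter values (n_estimators=100, random_state=0) are illustrative choices, not part of the original text.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Illustrative dataset (an assumption): Iris ships with scikit-learn.
data = load_iris()

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(data.data, data.target)

# feature_importances_ holds the impurity-based (MDI) score of each feature,
# normalized so that the scores sum to 1.
importances = forest.feature_importances_

# Print the features from most to least important.
ranked = sorted(zip(data.feature_names, importances),
                key=lambda pair: pair[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

A higher score means the feature contributed more to reducing impurity across the trees of the forest.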

How does the algorithm work?

It works in four steps:

1. Select random samples from a given dataset.

2. Construct a decision tree for each sample and get a prediction result from
each decision tree.

3. Perform a vote for each predicted result.

4. Select the prediction result with the most votes as the final prediction.
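
The four steps above can be sketched by hand as follows. This is only an illustration under stated assumptions: the helper names fit_forest and predict_forest are hypothetical, the Iris dataset and the choice of 25 trees are arbitrary, and class labels are assumed to be non-negative integers.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier


def fit_forest(X, y, n_trees=25, seed=0):
    """Hypothetical helper: train one decision tree per bootstrap sample."""
    rng = np.random.default_rng(seed)
    trees = []
    for _ in range(n_trees):
        # Step 1: draw a random sample of rows, with replacement (bagging).
        idx = rng.integers(0, len(X), size=len(X))
        # Step 2: build a tree on that sample; max_features="sqrt" makes each
        # split consider only a random subset of features.
        tree = DecisionTreeClassifier(
            max_features="sqrt",
            random_state=int(rng.integers(1_000_000)),
        )
        tree.fit(X[idx], y[idx])
        trees.append(tree)
    return trees


def predict_forest(trees, X):
    """Hypothetical helper: majority vote over the trees' predictions."""
    # Step 2 (continued): one row of predictions per tree.
    votes = np.stack([tree.predict(X) for tree in trees]).astype(int)
    n_classes = int(votes.max()) + 1
    # Steps 3 and 4: count the votes for each sample and keep the winner.
    return np.apply_along_axis(
        lambda column: np.bincount(column, minlength=n_classes).argmax(),
        0,
        votes,
    )


X, y = load_iris(return_X_y=True)
trees = fit_forest(X, y)
preds = predict_forest(trees, X)
print("Training accuracy:", (preds == y).mean())
```

Note that scikit-learn's own RandomForestClassifier combines trees by averaging their predicted class probabilities rather than counting hard votes, so its output can differ slightly from this sketch.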

Python libraries and functions:

To build a random forest classifier we will use:

1. Pandas library (for data manipulation and analysis).

2. Scikit-learn (sklearn) library (for statistical modeling, including
classification).

3. RandomForestClassifier from the sklearn.ensemble module.
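
A minimal sketch of how these libraries might fit together is shown below. The dataset (Iris, loaded as a pandas DataFrame), the 70/30 split and the parameter values are assumptions chosen for illustration only.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load an example dataset as a pandas DataFrame (illustrative choice).
iris = load_iris(as_frame=True)
df = iris.frame  # feature columns plus a "target" column

# Use pandas to separate the features from the label.
X = df.drop(columns="target")
y = df["target"]

# Hold out 30% of the rows to evaluate the model.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Train the random forest classifier from sklearn.ensemble.
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)

# Predict on the held-out rows and measure accuracy.
y_pred = clf.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"Test accuracy: {accuracy:.2f}")
```

Here n_estimators sets the number of trees in the forest, and random_state is fixed only so that the run is repeatable.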
