

Students Placement Prediction Using Machine Learning Algorithms

Conference Paper · March 2022


1 author:

Kajal Rai Saraswat


G L Bajaj


All content following this page was uploaded by Kajal Rai Saraswat on 14 September 2022.



ISSN:2395-1079 Available online at https://journals.edwin.co.in/index.php/esajms/issue/view/619

South Asia Journal of Multidisciplinary Studies SAJMS June 2022, Vol. 8, No.-5

Research Paper

STUDENTS PLACEMENT PREDICTION USING MACHINE LEARNING ALGORITHMS

Dr. Kajal Rai


Assistant Professor Chandigarh University

ABSTRACT
Placements are of great importance to both students and academic institutions. A placement
helps a student build a strong foundation for a professional career, and a good placement
record gives a college or university a competitive edge in the education market. Machine
learning is a method of data analysis that automates analytical model building. This paper
focuses on a system that predicts whether a student will be placed or not based on the
student's qualifications, historical data, and experience. The predictor uses three machine
learning algorithms, namely Decision Tree, Naïve Bayes, and Random Forest, to predict
student placement, and the algorithms are then compared on the basis of the accuracy
achieved.
KEYWORDS: Data Analysis, Decision Tree, Machine Learning, Naïve Bayes,
Placements, Prediction Models, Random Forest.

INTRODUCTION
In the present day, campus placement holds great importance for students and academic
institutions. While it helps a student build a strong foundation for a professional career
without going through the real-world job hunt, peer competition, or family pressure, a
first-rate placement record gives an institute or college a competitive edge in the
education market.
Campus placements offer students a foot-in-the-door opportunity, allowing them to start
their careers right after they have completed their course curriculum. Furthermore, they
get to engage and interact with industry professionals during the placement drives, which
further helps to lay a foundation for their future careers as they become acquainted with
potential contacts from their chosen professional field. Placements have gradually become
a vital part of an institute's offerings, which was not the case earlier. Nowadays,
students pay special attention to placement records when choosing a university or college
for admission.
Machine learning is a growing technology that enables computers to learn automatically
from past data.

Machine learning uses various algorithms to build mathematical models and make predictions
from historical data. It is currently used for tasks such as image recognition, speech
recognition, email filtering, Facebook auto-tagging, recommendation systems, and many more.
The dataset considered for this work is the data of MBA students at Jain University,
Bangalore, for the year 2020 [1]. It includes attributes such as secondary and higher
education percentage and board, specialization, work experience, and others. In this
paper, machine learning algorithms are used to predict, from these attributes, which
students are placed in a company during campus placements.
LITERATURE REVIEW
In [2] the authors used decision tree and random forest classifiers to classify a dataset
of placed and non-placed students. The accuracy obtained was 84% for the decision tree and
86% for the random forest.
The paper in [3] presents a recommendation framework that predicts which of five placement
statuses a student will have, namely Dream Company, Core Company, Mass Recruiters, Not
Eligible, and Not Interested in Placements. This model helps the placement cell of an
institute to identify potential students and to pay attention to and improve their
technical as well as social skills.
In [4], the authors proposed supervised machine learning classifiers that can be used to
predict the placement of a student in the IT industry based on academic performance in
class tenth, class twelfth, and graduation, and on backlogs to date in graduation. The
measures used to compare and analyze the results of the classifiers are accuracy score,
percentage accuracy score, confusion matrix, heat map, and classification report. The
classification report consists of the parameters precision, recall, f1-score, and support.
The classification algorithms Support Vector Machine, Gaussian Naive Bayes, K-Nearest
Neighbor, Random Forest, Decision Tree, Stochastic Gradient Descent, Logistic Regression,
and Neural Network are used to develop the classifiers.
The authors of [5] used Knowledge Discovery and Data Mining (KDD) to predict the student
placement class using classification techniques. In the first experiment, 84.2% of cases
were correctly classified, and the second experiment, using the same data and attributes,
gave the best accuracy of 92.1%. The best results were obtained using Naive Bayes and SMO.
METHODOLOGY
The research methodology used in this paper is depicted in the following figure.

[Figure: flow diagram: Data Collection -> Pre-Processing -> Model Generation ->
Classification -> Result Analysis]

Fig. 1: Methodology Used for Research

DATA COLLECTION
The sample data was collected from MBA students of a college in Bangalore. The dataset
consists of 215 student instances.
PRE-PROCESSING
Data pre-processing is a method used to convert raw data into clean data that can be used
for model construction. This research work includes pre-processing tasks such as attribute
selection, cleaning missing values, and splitting the dataset into training and testing
sets. Some attributes, such as serial no. and salary, are removed as they do not contribute
to classification.
MODEL GENERATION
Three machine learning classifier models are generated using the given dataset: decision
tree, naïve Bayes, and random forest. These models are used to classify the dataset based
on the selected attributes.
CLASSIFICATION
Classification is done to predict which students are placed in a company on campus and
which are not. Three well-known machine learning classifiers are used in this paper to
classify the students.
RESULT ANALYSIS
The results obtained from the different classifiers are analyzed and compared based on
accuracy.
EXPERIMENTATION
Weka is open-source software that implements a large collection of machine learning
algorithms and is widely used in data mining applications. For the experiments, the dataset
was downloaded from Kaggle as a CSV file and loaded into the Weka Explorer. The Classify
panel enables the user to apply classification and regression algorithms to the resulting
dataset, to estimate the accuracy of the resulting predictive model, and to visualize
erroneous predictions or the model itself. The algorithms used for classification are Naive
Bayes, Decision Tree, and Random Forest. Under "Test options", 10-fold cross-validation is
selected as the evaluation approach. Since there is no separate evaluation dataset, this is
necessary to get a reasonable estimate of the accuracy of the generated model. This
predictive model provides a way to predict whether a new student will be placed in an
organization or not.
UPLOAD FILE IN WEKA
The following figure shows a sample of the dataset uploaded into Weka for experimentation.
The file contains attributes such as gender, work experience, specialization, senior
secondary percentage, board of senior secondary, etc.

Fig. 2: Data set Uploaded in Weka
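
The pre-processing tasks described in the methodology (removing the serial number and salary attributes and cleaning missing values) can be sketched outside Weka in a few lines of Python. This is only an illustration under stated assumptions: the column names (`sl_no`, `ssc_p`, `workex`, `status`, `salary`), the sample rows, and the mean-imputation rule are hypothetical and are not taken from the paper, which does not state how missing values were cleaned.

```python
import csv
import io

# Hypothetical sample rows; the column names are assumptions, not taken from the paper.
RAW_CSV = """sl_no,gender,ssc_p,workex,status,salary
1,M,67.0,No,Placed,270000
2,F,,Yes,Placed,200000
3,M,55.5,No,Not Placed,
"""

# Attributes the paper removes because they do not contribute to classification.
DROPPED = {"sl_no", "salary"}


def preprocess(text, numeric=("ssc_p",)):
    """Drop unused attributes and fill missing numeric values with the column mean."""
    rows = list(csv.DictReader(io.StringIO(text)))
    # Attribute selection: keep everything except the dropped columns.
    rows = [{k: v for k, v in r.items() if k not in DROPPED} for r in rows]
    # Cleaning missing values: mean imputation is an assumed rule for this sketch.
    for col in numeric:
        present = [float(r[col]) for r in rows if r[col] != ""]
        mean = sum(present) / len(present)
        for r in rows:
            r[col] = float(r[col]) if r[col] != "" else mean
    return rows


cleaned = preprocess(RAW_CSV)
```

Here `cleaned` holds one dictionary per student with the serial number and salary removed and the missing percentage replaced by the mean of the observed values.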

MODEL GENERATION
Models are generated using the given dataset of campus placements. The following machine
learning models are generated and tested using 10-fold cross-validation.
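
To make the evaluation protocol concrete, the sketch below shows how 10-fold cross-validation partitions the 215 instances so that every instance is used for testing exactly once. It is a plain, unstratified illustration and not Weka's implementation; the function name `k_fold_indices` and the fixed seed are assumptions made for this example.

```python
import random


def k_fold_indices(n_instances, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    # Slice the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 215 instances, as in the dataset used in this paper.
splits = list(k_fold_indices(215, k=10))
```

Each model is trained on the `train` indices of a split and scored on its `test` indices; the reported accuracy is then the share of correct predictions over all ten test folds.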

DECISION TREE CLASSIFIER
C4.5 is currently perhaps the most widely used decision tree algorithm. Professor Ross
Quinlan developed C4.5 in 1993 [6]. It is the outcome of research that traces back to the
ID3 algorithm, also proposed by Ross Quinlan, in 1986. C4.5 has additional features such as
handling of missing values, handling of continuous attributes, pruning of decision trees,
rule derivation, and others. The J48 algorithm is an implementation of the C4.5 decision
tree algorithm in the Weka software tool.

The generated decision tree model is shown in Fig 3.

Fig. 3: Decision Tree

The confusion matrix generated by testing data using the decision tree algorithm is given
in Fig 4. The accuracy achieved by this model is 82.79%, with 178 of the 215 instances
correctly classified.

Fig. 4: Decision Tree Confusion Matrix
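
C4.5 chooses the attribute to split on by gain ratio, that is, information gain divided by the split information of the attribute. The sketch below computes this criterion for a nominal attribute. It is a minimal illustration only: it omits the pruning, continuous-attribute, and missing-value handling that C4.5 and J48 provide, and the toy `workex` and `status` values are hypothetical rather than taken from the dataset.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of nominal values."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values, labels):
    """Information gain of a nominal attribute divided by its split information."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted entropy of the class labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = entropy(values)
    return gain / split_info if split_info > 0 else 0.0


# Toy data (hypothetical): work experience versus placement status.
workex = ["Yes", "Yes", "No", "No"]
status = ["Placed", "Placed", "Not Placed", "Placed"]
ratio = gain_ratio(workex, status)
```

At each node the attribute with the highest gain ratio is selected, and the procedure is repeated on the resulting subsets.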

NAÏVE BAYES
The Naïve Bayes model identifies the features associated with students who are not placed.
It gives the probability of each input feature for the predicted class. A Naive Bayesian
classifier is a simple probabilistic classifier based on applying Bayes' theorem with
strong independence assumptions. Naive Bayes is chosen here because it is simple and can be
trained on the whole training data (the claim that boosting "never over-fits" could not be
relied upon), and because the resulting classifier can be estimated reliably from a limited
amount of data [8]. The accuracy achieved by classifying the dataset using the naïve Bayes
classifier is 84.65%, with 182 out of 215 instances correctly classified. The confusion
matrix of the naïve Bayes classifier is shown in Fig 5.

Fig. 5: Confusion Matrix of Naive Bayes
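
The independence assumption behind Naive Bayes means that the score of a class is the class prior multiplied by the per-feature conditional probabilities. The sketch below implements this for nominal features with add-one (Laplace) smoothing. It is an illustration under assumptions, not the Weka classifier used in the experiments: the function names, the smoothing choice, and the toy feature values are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            value_counts[(i, y)][v] += 1
    # Number of distinct values per feature, needed for add-one smoothing.
    n_values = [len({row[i] for row in rows}) for i in range(len(rows[0]))]
    return class_counts, value_counts, n_values


def predict_nb(model, row):
    """Return the class with the highest log posterior score."""
    class_counts, value_counts, n_values = model
    total = sum(class_counts.values())
    scores = {}
    for y, n_y in class_counts.items():
        # log P(y) + sum over features of log P(x_i | y), smoothed.
        score = math.log(n_y / total)
        for i, v in enumerate(row):
            score += math.log((value_counts[(i, y)][v] + 1) / (n_y + n_values[i]))
        scores[y] = score
    return max(scores, key=scores.get)


# Toy data (hypothetical): (work experience, specialization) -> status.
rows = [("Yes", "Mkt&Fin"), ("Yes", "Mkt&HR"), ("No", "Mkt&HR"), ("No", "Mkt&HR")]
labels = ["Placed", "Placed", "Not Placed", "Not Placed"]
model = train_nb(rows, labels)
```

Calling `predict_nb(model, ("Yes", "Mkt&Fin"))` scores both classes and returns the more probable one; working in log space avoids numerical underflow when many features are multiplied.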

RANDOM FOREST
The random forest algorithm is an ensemble method in machine learning. The input to a
random forest algorithm is a dataset consisting of records with attributes. Random subsets
of the input are created, and a decision tree is constructed on each of the random subsets.
The final class of a test record is decided by majority vote over the trees. The random
forest algorithm makes use of the out-of-bag error technique [9]. The accuracy obtained by
using the random forest classifier is 86.04%, with 185 of the 215 instances correctly
classified.

The confusion matrix generated by the Weka tool using random forest is given in Fig. 6.

Fig. 6: Confusion Matrix of Random Forest
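
Two ingredients of the random forest description, random subsets of the records and the majority vote, can be sketched as follows. The subsets are drawn here by bootstrap sampling (with replacement), as in Breiman's formulation; records that are never drawn for a given tree are its out-of-bag records. The function names and the example votes are hypothetical, and the tree construction itself is left out of this sketch.

```python
import random
from collections import Counter


def bootstrap_sample(n_records, rng):
    """Draw record indices with replacement; undrawn indices are out of bag."""
    in_bag = [rng.randrange(n_records) for _ in range(n_records)]
    out_of_bag = sorted(set(range(n_records)) - set(in_bag))
    return in_bag, out_of_bag


def majority_vote(predictions):
    """Final class is the label predicted by the largest number of trees."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(42)
in_bag, out_of_bag = bootstrap_sample(215, rng)

# Hypothetical votes from three trees for one test record.
final = majority_vote(["Placed", "Not Placed", "Placed"])
```

The out-of-bag records of each tree act as a built-in test set for that tree, which is what allows an error estimate without a separate validation split.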

RESULTS ANALYSIS
To better understand the importance of the input variables, it is customary to analyze the
impact of the input variables on students' placement success, that is, the impact of
particular input variables of the model on the output variable. Different algorithms
provide very different results, i.e. each of them accounts for the relevance of variables
in a different way. We carried out experiments to evaluate the performance and usefulness
of different classification algorithms for predicting students' placement. The results of
the experiments are shown in Table 1.
Table 1: Performance of Classifiers

Evaluation Criteria                Decision Tree    Naïve Bayes      Random Forest
Time to build model                0.01 seconds     0.001 seconds    0.1 seconds
Correctly Classified Instances     178              182              185
Incorrectly Classified Instances   37               33               30
Accuracy (%)                       82.79            84.65            86.04
Recall                             0.885            0.892            0.946
Precision                          0.868            0.886            0.864

The efficiency of the three approaches is compared in terms of accuracy, recall, and
precision. The accuracy of the prediction model is defined as the percentage of instances
that are correctly predicted/classified. Accuracy is given by the following formula:

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100$$

where TP, TN, FP, and FN represent the number of true positives, true negatives, false
positives, and false negatives, respectively. Precision, also called positive predictive
value, is the fraction of relevant instances among the retrieved instances, while recall,
also known as sensitivity, is the fraction of relevant instances that were retrieved. Both
precision and recall are therefore based on relevance.
We carried out a comparative analysis of all three models, namely Decision Tree, Random
Forest, and Naïve Bayes. Fig. 7 shows the comparison of the models under consideration.
These models were selected for our research work as they are well suited to classification
problems.

[Figure: bar chart "Accuracy (%) of Classifiers"; x-axis: Machine Learning Classifiers;
y-axis: Accuracy in Percentage (%); Decision Tree 82.79, Naïve Bayes 84.65, Random Forest
86.04]

Fig. 7: Classifiers Accuracy
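
The accuracy, precision, and recall definitions used in the results analysis can be checked with a few lines of Python. The correct and incorrect counts below are the ones reported in Table 1; the confusion-matrix counts passed to `metrics` are hypothetical, because the per-class counts appear only in the figures. The function names are assumptions made for this sketch.

```python
def accuracy_pct(correct, incorrect):
    """Accuracy as a percentage of all classified instances."""
    return 100.0 * correct / (correct + incorrect)


def metrics(tp, fp, tn, fn):
    """Accuracy (%), precision, and recall from confusion-matrix counts."""
    accuracy = 100.0 * (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)  # share of predicted positives that are correct
    recall = tp / (tp + fn)     # share of actual positives that are found
    return accuracy, precision, recall


# (correctly classified, incorrectly classified) as reported in Table 1.
table1 = {
    "Decision Tree": (178, 37),
    "Naive Bayes": (182, 33),
    "Random Forest": (185, 30),
}
accuracies = {name: accuracy_pct(*counts) for name, counts in table1.items()}

# Hypothetical confusion matrix, used only to exercise the formulas.
example = metrics(tp=8, fp=2, tn=7, fn=3)
```

Recomputing from the counts gives about 82.79 and 84.65 for the first two classifiers; for Random Forest, 185 out of 215 evaluates to about 86.05 when rounded, which the table reports as 86.04.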

CONCLUSION
Campus placement is of great importance from the organization's point of view as well as
the student's point of view. With a view to improving students' performance, placement has
been studied and predicted using the classification algorithms Decision Tree, Naïve Bayes,
and Random Forest to validate the methodologies. The algorithms are applied to the dataset,
and features are selected to build the model. The accuracy obtained is 82.79% for the
Decision Tree, 84.65% for Naïve Bayes, and 86.04% for the Random Forest. These results
suggest that, among the machine learning algorithms tested, the Random Forest classifier
has the potential to significantly improve on conventional classification methods for
placement prediction.
applied to the data set and features are

REFERENCES

[1] https://www.kaggle.com/benroshan/factors-affecting-campus-placement
[2] Manvitha, Pothuganti, and Neelam Swaroopa. "Campus placement prediction using
supervised machine learning techniques." Int J Appl Eng Res 14.9 (2019): 2188-2191.
[3] S. K. Thangavel, P. D. Bkaratki and A. Sankar, "Student placement analyzer: A
recommendation system using machine learning," 2017 4th International Conference
on Advanced Computing and Communication Systems (ICACCS), 2017, pp. 1-5,
doi: 10.1109/ICACCS.2017.8014632.
[4] Laxmi Shanker Maurya, Md Shadab Hussain, and Sarita Singh, “Developing
Classifiers through Machine Learning Algorithms for Student Placement Prediction
Based on Academic Performance”, Applied Artificial Intelligence, vol. 35, no. 6,
pp. 403-420, 2021, doi: 10.1080/08839514.2021.1901032.
[5] Pratiwi, Oktariani Nurul. "Predicting student placement class using data mining."
Proceedings of 2013 IEEE International Conference on Teaching, Assessment and
Learning for Engineering (TALE). IEEE, 2013.
[6] Quinlan, J.R., C4.5: Programs for machine learning, Morgan Kaufmann, San
Francisco, 1993.
[7] Kajal Rai, M. Syamala Devi, and Ajay Guleria, “Decision Tree based Algorithm for
Intrusion Detection”, International Journal of Advanced Networking Applications
(IJANA), ISSN: 0975-0282, vol. 7, no. 4, pp. 2828-2834, 2016.
[8] Russell, Stuart, and Norvig, Peter, “Artificial Intelligence: A Modern Approach”,
Prentice Hall, 1995, ISBN: 978-0137903955.
[9] Livingston, Frederick. "Implementation of Breiman’s random forest machine
learning algorithm." ECE591Q Machine Learning Journal Paper (2005): 1-13.
