All content following this page was uploaded by Mukesh Kumar on 12 April 2019.
Published By: Blue Eyes Intelligence Engineering & Sciences Publication
Retrieval Number: D24210484C19\19©BEIESP
Educational Data Mining: Student Performance Prediction in Academic
Table 1: Description of the student attributes used to build the dataset
S.No Attribute Description Variable Possible Values
International Journal of Engineering and Advanced Technology (IJEAT)
ISSN: 2249-8958, Volume-8 Issue-4C, April 2019
Initially, the target output class ranges from 0 to 20, which gives 21 clusters. This is an unreasonable setting for the classification task, because it makes classification extremely difficult given that we have only 649 instances. As a result, we have mapped groups of clusters onto a few clusters [A, B, C, D, and F], as indicated in Table 2.

Table 2: Change in the number of target class clusters

Range of initial class    New cluster number
16-20                     A Class
14-15                     B Class
12-13                     C Class
10-11                     D Class
<=9                       F Class

... Search Method from the student's performance dataset, which mostly affect the prediction accuracy of the classification algorithm.

We tried three different attribute evaluator methods, namely CorrelationAttributeEval, GainRatioAttributeEval and InfoGainAttributeEval, with the Ranker Search Method. After obtaining the results of the attribute evaluator algorithms, we selected the ten attributes which most affect our prediction result.

Table 4 shows the results of all the attribute evaluator algorithms used on the student dataset. The ten selected attributes include G2, G1, failures, higher, Medu, school, studytime and Fedu, which most of the attribute evaluator algorithms found most effective for prediction.
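
As a rough illustration of how an information-gain evaluator combined with a ranker orders attributes, the following minimal Python sketch computes the information gain of each nominal attribute and sorts the attributes in decreasing order of gain. It is not WEKA's InfoGainAttributeEval implementation: the function names (`entropy`, `info_gain`, `rank_attributes`) and the toy records are hypothetical, and numeric attributes are assumed to have been discretized beforehand.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def info_gain(values, labels):
    """Information gain of one nominal attribute with respect to the class."""
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        groups[value].append(label)
    # Expected entropy of the class after splitting on the attribute.
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels):
    """Return (attribute, gain) pairs sorted by decreasing gain."""
    gains = {name: info_gain([row[name] for row in rows], labels)
             for name in rows[0]}
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


# Toy records with hypothetical values; they are not taken from the dataset.
rows = [
    {"higher": "yes", "school": "GP", "sex": "F"},
    {"higher": "yes", "school": "GP", "sex": "M"},
    {"higher": "no", "school": "MS", "sex": "F"},
    {"higher": "no", "school": "GP", "sex": "M"},
]
labels = ["A", "A", "F", "F"]

ranking = rank_attributes(rows, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.4f}")
```

On this toy data, `higher` separates the two classes perfectly and therefore ranks first, while `sex` carries no information about the class and ranks last.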
IV. CLASSIFICATION MODELS FOR STUDENT PERFORMANCE DATASET

To find the classifiers which generalize the data most accurately, we implement different classification algorithms in WEKA on the student performance dataset, choosing the parameters of each algorithm which most efficiently analyse the dataset and increase its generalized accuracy. Please note that all the implemented classification algorithms are tested using 10-fold cross validation to calculate their accuracy.

Standard Classifier's Model Building using WEKA:
Numerous classification algorithms were selected and applied to the student's performance dataset. The classification algorithms implemented in WEKA are NaiveBayes, Decision Tree (J48), RandomForest, RandomTree, REPTree, JRip, OneR, SimpleLogistic and ZeroR. All of them are run with their default parameters and tested using 10-fold cross validation to calculate their accuracy. To establish the baseline accuracy for the student's performance dataset, we run the ZeroR classification algorithm, which gives an overall accuracy of 30.97 %. In Table 3, the classification algorithm with the best accuracy is underlined.

Table 3: Classification algorithms with their accuracy on default parameters

Classification algorithm    Accuracy with all attributes
NaiveBayes                  68.2589 %
Decision Tree (J48)         67.7966 %
RandomTree                  53.4669 %
REPTree                     75.1926 %
JRip                        70.7242 %
OneR                        76.7334 %
SimpleLogistic              71.3405 %
ZeroR                       30.9707 % (Baseline accuracy)

Table 4: Attribute selection with the help of the Ranker Search Method

Attribute evaluator, followed by the attributes in decreasing order of their rank:

CorrelationAttributeEval
32, 31, 15, 21, 7, 1, 14, 8, 27, 13, 4, 16, 28, 30, 22, 2, 25, 19, 3, 11, 9, 29, 18, 26, 20, 24, 10, 5, 23, 12, 17, 6

GainRatioAttributeEval
32, 31, 15, 21, 1, 7, 16, 14, 18, 8, 9, 4, 22, 11, 10, 12, 2, 19, 20, 23, 5, 17, 6, 25, 3, 24, 13, 26, 28, 29, 30, 27

InfoGainAttributeEval
32, 31, 15, 21, 1, 9, 7, 14, 11, 10, 8, 16, 4, 22, 12, 2, 19, 18, 20, 23, 17, 5, 6, 29, 13, 3, 25, 26, 28, 30, 24, 27

Superlative Classifier's Model Building using WEKA:
To find the classifiers which generalize the data most accurately, we implement different classification algorithms in WEKA on the student's performance dataset, choosing the parameters of each algorithm which most efficiently analyse the dataset and increase its generalized accuracy.

The classification algorithms implemented in WEKA are NaiveBayes, Decision Tree (J48), RandomForest, RandomTree, REPTree, JRip, OneR, SimpleLogistic and ZeroR. Please note that all the implemented classification algorithms are tested using 10-fold cross validation to calculate their accuracy under the different parameter selections.
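
To make the role of the ZeroR baseline concrete, here is a minimal Python sketch, under stated assumptions, that maps final grades to the five classes of Table 2 and then estimates the accuracy of a majority-class predictor with k-fold cross validation. The helper names (`to_class`, `zero_r_cv_accuracy`) and the grade list are hypothetical, and the folds are assigned round-robin without the shuffling or stratification that WEKA may apply, so the result will not reproduce the 30.97 % reported above.

```python
from collections import Counter


def to_class(grade):
    """Map a final grade (0-20) to one of the five classes of Table 2."""
    if grade >= 16:
        return "A"
    if grade >= 14:
        return "B"
    if grade >= 12:
        return "C"
    if grade >= 10:
        return "D"
    return "F"


def zero_r_cv_accuracy(labels, k=10):
    """Accuracy of a majority-class predictor under k-fold cross validation."""
    correct = 0
    for fold in range(k):
        # Instance i is held out in fold i % k (round-robin, no shuffling).
        train = [y for i, y in enumerate(labels) if i % k != fold]
        test = [y for i, y in enumerate(labels) if i % k == fold]
        # ZeroR ignores all attributes and predicts the most frequent class.
        majority = Counter(train).most_common(1)[0][0]
        correct += sum(1 for y in test if y == majority)
    return correct / len(labels)


# Hypothetical grades for illustration only; the real dataset has 649 instances.
grades = [17, 15, 13, 13, 12, 11, 10, 9, 8, 12,
          14, 12, 6, 13, 16, 12, 10, 13, 12, 5]
labels = [to_class(g) for g in grades]

baseline = zero_r_cv_accuracy(labels, k=10)
print(f"ZeroR baseline accuracy: {baseline:.2%}")
```

The baseline simply reflects the share of the most frequent class, so any classifier worth keeping should score clearly above it.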
VI. EVALUATION AND CONCLUSION

In the end, we conclude that the student's performance dataset is valuable for predicting the attributes that affect students' academic routine. Furthermore, this process is important for improving educational quality, which is vital to attract students to stay in school. Generally, researchers in educational data mining use CGPA and internal performance marks to predict academic performance. In our implementation results, however, we found that school as well as studytime also affect the student's final grade. We found that classification algorithms such as OneR, REPTree and Decision Tree (J48) achieve more than 76.00 % accuracy in predicting student results and perform equally well. As the education and evaluation system is very widespread, we believe that further remarkable insight can still be mined from this student's performance dataset, which is freely available in the UCI repository. We leave this as future work.