
Classification of Online Judge Programmers based on Rule Extraction from Self Organizing Feature Map

Chowdhury Md Intisar, Yutaka Watanobe
Graduate Department of Information Systems,
University of Aizu, Fukushima, 965-8580, Japan
Email: {m5212106, yutaka}@u-aizu.ac.jp

Abstract—Computer programming is one of the most important and vital skills in the current generation. In order to encourage and enable programmers to practice and sharpen their skills, there exist many online judge programming platforms. Estimating these programmers' strength and progress has been an important research topic in educational data mining, in order to provide adaptive educational content and early prediction of 'at risk' learners. In this paper, we trained a Kohonen self-organizing feature map (KSOFM) neural network on programmers' performance log data from the Aizu Online Judge (AOJ) database. Propositional rules and knowledge were extracted from the U-matrix diagram of the trained network, which partitioned AOJ programmers into three distinct clusters, i.e. 'expert', 'intermediate' and 'at risk'. The propositional rules performed classification with an accuracy of 94% on a testing set. For validation and comparison, three more predictive models were trained on the same dataset. Among them, the feedforward multilayer neural network and the decision tree scored accuracies of 97% and 96% respectively. In contrast, the precision score for the support vector machine was about 88%, but it scored the highest recall of 99% in terms of identifying 'at risk' students.

Index Terms—Online judge system, novice programmers, clustering, early prediction, self organizing feature map.

I. INTRODUCTION

An online judge system is an educational web service originally designed for programming contests like the ACM-ICPC (ACM International Collegiate Programming Contest). Such online judge platforms host a huge number of programming problems which can be solved in both online and offline mode. Most users of online judge systems are from computer science and mathematics backgrounds. These users use the system to enhance their problem-solving skills and to compete against each other online. With the increasing number of online judge platforms, the amount of accumulated data is also increasing. These accumulated data give us an opportunity to discover important knowledge about programmers' (novice, expert, intermediate, etc.) behavior and progress. Programming is an interdisciplinary subject. Moreover, competitive programming can be relatively difficult and intimidating to a novice user due to problem difficulty, diverse categories and its competitive nature. It is observed from our studies on Aizu Online Judge [1–3] submission log data that a significant number of programmers are struggling to find the correct solution and appropriate problems to solve. It is important to correctly identify these at-risk programmers at an early stage in order to help them overcome their difficulties. However, early forecasting of 'at risk' programmers is challenging, since the strength of these programmers depends on the various features and characteristics depicted in Table I. Each of these features is correlated with the others. Thus, applying clustering and statistical analysis to these multidimensional data could give insight into patterns among these programmers. Studies [4–16] used different machine learning and neural network models for early forecasting of weak students. In our research, we have trained a self-organizing feature map on AOJ log data in order to obtain a lower-dimensional mapping. Further analysis of the trained network provided us with propositional rules to partition these programmers based on strength.

The rest of the paper is organized as follows. Section II summarizes the literature review. Section III presents the methodology of our research. Sections IV, V and VI present the experiment setup, parameter settings and result analysis respectively. Finally, Section VII concludes the paper with limitations and possible future work.

II. BACKGROUND AND RELATED WORKS

A significant amount of research has been conducted on the identification of 'at risk' and novice learners in both e-learning and offline learning systems. Among these studies, the key features that have been taken into account are progress in introductory programming courses, prior programming experience, gender, dislike of or negative attitude toward programming, mathematics background, formal training in programming, a student's understanding of the difficulties in learning materials, and the ability of a student or programmer to find a way of solving problems [4–7]. Numerous studies have implemented and trained statistical learning models and neural networks for the purpose of classification. Recent work [4] proposed a back-propagation neural network which can estimate student performance according to students' prior knowledge. The contribution also constructs a Student Attribute Matrix (SAM), with indicators and predictors which can tell how much a specific factor would affect student performance. Research work [5] provided a classification of programmers based on submission log data such as compilation profiles, error

978-1-5386-5826-0/18/$31.00 ©2018 IEEE 313


profiles, compilation frequency and error quotient profiles during an introductory programming course. The study also depicted the correlation between the submission log data and the midterm scores of the programmers. Research study [7] focused on the learning behavior and personality traits of programmers to determine their strength and motivation. The research proposed a measuring index known as the DiCS-Index, based on a few questionnaires. The study contrasted programmers who learn through abstract conceptualization with those who gather concrete experience. Study [8] categorized programmers' strength based on the timeliness of assignment submission, i.e. the punctuality of the student. Features also include grade point average and past years' progress. Contribution [9] proposed a support vector machine (SVM) based classifier for student performance prediction. The proposed SVM model was shown to provide better accuracy than a previously implemented KNN model. An intelligent system based on a Fuzzy ART neural network [16] was proposed to predict dropout and 'at risk' students. The proposed network was based on academic progress and demographic data. The network reached an accuracy of 85–94% in early prediction of dropout students.

It is evident from the literature review that a significant number of contributions exist related to the prediction of 'at risk' students in both online and offline tutoring systems. Application of data mining and knowledge discovery in the online judge programming environment is still sparse, but research in this field has been growing dramatically in recent years, and different support systems and recommender systems have been proposed for the online judge programming environment [10–15, 17]. Support systems include problem difficulty estimation [18], [19], problem recommendation [14], [15] and other services. Most of the proposed support systems are based on the collaborative filtering method [11], [14], [15] and limited to classroom data. To our knowledge, there is no explicit research work on the categorization of online judge programmers based on their submission log data, ratings and other key features.

TABLE I
INPUT FEATURES FROM SUBMISSION LOG

Input Feature   Description
Submissions     Total attempts so far.
Solved          Total number of problems solved.
AC              Number of correct (accepted) solutions.
WA              Total incorrect outputs by source code (wrong answer).
CE              Total error verdicts while compiling.
RTE             Errors during run time.
TLE             Time limit exceeded due to inefficiency.
Easy            Total solved from the easy category.
Hard            Total solved from the hard category.
Medium          Total solved from the intermediate category.

TABLE II
FEATURES DEFINING PROBLEM DIFFICULTY

Input Feature   Description
Submissions     Total submissions for the problem.
Success Rate    Ratio of accepted to rejected submissions.
Category        Category of the problem, e.g. graph, math, string, etc.

III. METHODOLOGY

A. Construction of Dataset

The study was conducted on data extracted from the submission log database of the Aizu Online Judge system [2, 3]. These data are available through our open API [3] for researchers and developers. Two matrices were constructed for the experiment. The first matrix is the 'Programmer's progress matrix', of dimension (25000, 8). The matrix represents 8 submission features of 25,000 users. Registered programmers with zero submission records were dropped from the dataset, since these data would not contribute to model training. The second matrix is the 'Problem detail matrix', of dimension (2040, 10), where the number of programming problems is 2040 and the number of features is 10. In order to reduce the effect of noise and skewed values, all the data were normalized to zero mean. Table I and Table II depict the descriptions of the user and problem feature matrices respectively.

B. Programmer's Strength-Indicating Features

Analyzing a programmer's progress is critical. Each programmer is different in terms of specific subjects and problem-solving style. Thus, in order to categorize the programmers, it is important to find the features that best describe a programmer's level, performance and behavior. It is evident from Table I and Table II that the features are multi-dimensional. Fig. 1 and Fig. 2 give an overall overview of the correlation among the features of the AOJ log data. We can observe that features such as 'Submissions', 'Solved', 'AC', 'WA' and 'TLE' are strongly correlated. This indicates that programmers with a higher number of attempts tend to obtain 'accepted' verdicts proportionally. At the same time, the chance of obtaining 'WA' and 'TLE' verdicts is also proportionate to the number of submissions. Errors such as 'CE', 'RTE' and 'MLE' showed weak correlation with submissions and solved problems. Thus, the contrast between error features and success features can provide an overview of the distinctions among programmers. Other key features that contribute to the prediction of strength are the difficulty level of solved problems and the frequency of participation in solving. Thus, it is necessary to obtain an approximate estimate of the problem-solving frequency (Fig. 3) and the difficulty level and category of each problem solved (Fig. 4). The submission log data of these programmers do not explicitly provide training labels. Hence, it is necessary to adopt a semi-supervised learning strategy in order to group these programmers based on the extracted features. Since the feature vector is multidimensional, the Kohonen self-organizing feature map (KSOFM) [20] can provide clustering and a projection of these data into a lower dimension.

C. The Kohonen Self-Organizing Feature Map for Clustering

The Kohonen self-organizing feature map (KSOFM) [20, 21] is one of the best-known topology-preserving unsupervised



Fig. 1. Heat map depicting correlation among programmer's features.

Fig. 2. Scatter matrix depicting correlation among programmer's features. The figure depicts the correlation between each pair of features (submissions, solved, AC, WA, CE, RTE, MLE, TLE).

Fig. 3. Frequency of problem solving by an individual programmer.
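The correlation analysis summarized in Fig. 1 and Fig. 2 can be reproduced from the progress matrix in a few lines of NumPy. This is a minimal sketch, assuming column-wise zero-mean normalization (as described in Section III-A) and Pearson correlation; the function names are illustrative, not from the AOJ codebase:

```python
import numpy as np

def zero_mean(X):
    """Column-wise zero-mean normalization, as applied to the feature matrices."""
    return X - X.mean(axis=0)

def feature_correlations(X):
    """Pairwise Pearson correlations among feature columns (the data behind Fig. 1)."""
    return np.corrcoef(zero_mean(X), rowvar=False)
```

Plotting the resulting matrix as a heat map yields a figure of the same kind as Fig. 1.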

neural network (Fig. 5). It is a fully connected two-layered neural network with an input layer and a Kohonen layer [20–24]. The Kohonen layer is the layer where the mapping is formed, which enables the observation of clusters in the dataset [25]. The KSOFM is often chosen because of its suitability for visualizing otherwise difficult-to-interpret high-dimensional data [22]. After supervised fine-tuning of the weight vectors of the network, the KSOFM has been successful in various classification and pattern recognition problems [23].

Fig. 4. Problems solved from each category.

For the clustering of our AOJ programmers, we construct a two-dimensional grid of neurons. A finite number of codebook vectors were chosen based on the probability distribution of the input data. Let the j-th neuron's weights be denoted as follows:

w_j = (w_j1, w_j2, ..., w_jn),  j = 1, 2, ..., n    (1)

The expression n in equation (1) represents the dimension of the input vector; the input vector and each neuron share the same dimension. The output of the j-th neuron for input vector x = [x_1, x_2, ..., x_n]^T is calculated as follows:

y_j = sum_{i=1}^{n} w_ji x_i    (2)

The modification of a neuron's weight vector for each training sample pushes it into close proximity to the input training vectors. For each input in the training process, the neurons in the KSOFM are in competition with each other [25]. For each training example, the weight vector of the winner neuron is updated using the delta rule as follows:

w_i(t + 1) = w_i(t) + α(t)[x(t) − w_i(t)]    (3)

In equation (3), the expression α(t) controls the magnitude of the weight updates. At each time stamp t, the learning constant α is reduced gradually. After a certain number of iterations, we obtain a close representation of the different categories of AOJ programmers based on their submission features. The two-dimensional mapping of the AOJ programmers based on success attributes (such as solved, AC, submissions, the frequency of solves, etc.) is depicted in the unified distance matrix (U-matrix) of Fig. 6. The second network maps the AOJ programmers based on the error attributes (such as TLE, CE, RTE, MLE), as shown in Fig. 7. After a fixed number of iterations, we obtain three distinct clusters with fine-tuned weight vectors. Thus, we have three trained prototype neurons at our disposal which represent programmers from three distinct clusters. Analysis of this trained neuron space and the clusters gives us insight into programmers' strength and features, which can help us to classify weak programmers at an early stage.
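The training procedure of equations (1)–(3) can be sketched in a few lines of NumPy. This is a minimal illustration under assumed settings, not the exact implementation used in the experiments: the iteration count and linear decay of α are assumptions, the winner is selected by smallest Euclidean distance (a common choice alongside the dot-product response of equation (2)), and the usual neighborhood update around the winner is omitted for brevity.

```python
import numpy as np

def train_som(X, grid=(16, 16), iters=5000, alpha0=0.5, seed=0):
    """Train a minimal SOM by the winner-take-all delta rule of equation (3)."""
    rng = np.random.default_rng(seed)
    # One codebook (weight) vector per neuron on the 2-D grid, as in equation (1).
    W = rng.random((grid[0] * grid[1], X.shape[1]))
    for t in range(iters):
        x = X[rng.integers(len(X))]                        # draw a training sample
        alpha = alpha0 * (1.0 - t / iters)                 # decaying learning rate
        winner = np.argmin(np.linalg.norm(W - x, axis=1))  # best-matching unit
        W[winner] += alpha * (x - W[winner])               # delta rule, equation (3)
    return W
```

The default 16×16 grid matches the Kohonen layer size mentioned in the caption of Fig. 6.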



Fig. 5. The self-organizing network with input and output layers. Source: [26].

Fig. 6. Two-dimensional feature mapping of total submissions, solved and accepted answers. Here the Kohonen neuron layer (16, 16) depicts the clustering of three distinct user groups based on the mentioned features. The red circle denotes the cluster of 'at risk' users. Blue and green indicate intermediate and strong users respectively.

Fig. 7. Two-dimensional feature mapping of errors: wrong answer, time limit exceeded, compilation error and run-time error. The red circle denotes the cluster of 'at risk' users. Blue and green indicate intermediate and strong users respectively.

TABLE III
PROPOSITIONAL RULES EXTRACTED FROM TRAINED SOM

if solved in range [600, 1300]
   and AC in range [2000, 3000]
   and WA and CE and TLE < 800
then Expert

if solved in range [150, 600]
   and AC in range [1500, 2500]
   and WA and CE and TLE in range [800, 1500]
then Intermediate

if solved < 150
   and WA and CE and TLE > 800
then At Risk or Novice

D. Rule Extraction from the Self-Organizing Map (SOM)

Analysis of a Kohonen self-organizing map can provide knowledge discovery and exploratory data analysis [22]. Research work by James Malone et al. [22] proposed an algorithm to extract propositional if-then rules from the U-matrix of a trained SOM network. Implementing the proposed method on our trained network can thus give us the key properties of the discovered clusters. Initially, the boundaries of the important components are identified from the trained SOM's U-matrix. The boundary is identified with the help of neighboring units: two neighboring units with the highest relative difference are selected as candidate boundary units [22]. Table III depicts some of the key rules for classification.

IV. EXPERIMENT SETUP

The dataset contains the submission logs of 25,000 users. This dataset was divided into two sets: 60% of the samples were drawn at random from the dataset for training, and the remaining 40% were kept for testing. Explicit labels for the strength of users were not available in the training set; instead, the range of each user's rating was mapped to an integer value for a label. The labels of the test set were determined in the same manner and validated with the help of expert programmers. All the predictive models were trained and tested on the same training and testing sets respectively.

V. PARAMETER SETTINGS

The parameter values of each model were selected by grid search. For the SVM, we set the regularization parameter C to 0.5. We used the Gaussian RBF (radial basis function) kernel, with the free parameter gamma of the RBF set to 0.7. For the multilayer perceptron network, we set the number of hidden layers to 1 (with 50 hidden neurons). The output layer consists of 3 neurons, one for each class. We used the ReLU (rectified linear unit) function for the non-linearity in both the hidden and output layers. We used the Gini score to measure split quality in the decision tree implementation.

VI. RESULTS AND EXTRACTED RULE VALIDATION

The performance of the rules extracted from the self-organizing map (SOM) was tested together with three different learning



models (a decision tree, a multilayer perceptron with back-propagation and a support vector machine with an RBF kernel). The multilayer perceptron and the decision tree performed with accuracies of 97% and 96% respectively. The SVM's (RBF kernel) performance was satisfactory in terms of precision but poor in terms of recall. The rules extracted from the trained self-organizing map scored 94% in accuracy. The propositional rules succeeded in predicting the 'at risk' programmers with a precision and recall of 94%. It is very important to maintain the right precision-recall balance. In this research context, a high recall score is the first priority, since we do not want to misclassify any 'at risk' programmer as an 'expert' or 'intermediate' programmer. Fig. 8 represents the confusion matrix for the rules extracted from the SOFM; it depicts a clear picture of their performance.

TABLE IV
COMPARISON OF MODEL ACCURACY

Model                 Class         Precision  Recall  f1 score
SVM (RBF)             Intermediate  1.00       0.48    0.65
                      At Risk       0.62       1.00    0.76
                      Expert        1.00       0.50    0.67
MLP                   Intermediate  0.96       0.99    0.97
                      At Risk       0.99       0.96    0.98
                      Expert        0.96       0.97    0.96
Decision Tree         Intermediate  0.96       0.95    0.95
                      At Risk       0.96       0.96    0.96
                      Expert        0.94       0.96    0.95
Self Organizing Map   Intermediate  0.92       0.93    0.92
                      At Risk       0.94       0.94    0.94
                      Expert        0.95       0.94    0.95

Fig. 8. Confusion matrix for the propositional rules.

VII. CONCLUSION

In this paper, we presented a study on the early prediction of 'at risk' programmers in an online judge system. The classification of programmers was based on knowledge discovered from a trained KSOFM neural network. The methodology for rule extraction from the trained KSOFM was borrowed from the research work [22]. The rules indicate that the ratio of total success verdicts to error verdicts determines the key distinction among programmers. 'At risk' programmers are more likely to have high CE and WA verdicts, despite having a high submission rate. The validity of these extracted rules was confirmed by high precision and recall scores. Meanwhile, three different learning algorithms were also trained on the same dataset, among which the multilayer feedforward neural network scored highest in terms of accuracy. Although we obtained relatively high accuracy scores, there are two drawbacks which need to be addressed. First, the research work was based only on the submission log data of the AOJ system, while in a real-world scenario programmers and learners are not limited to a single online judge platform. It might be the case that a programmer has solved a significant number of programming problems on other online judge systems while performing poorly or solving irregularly on AOJ. Thus, for a reliable prediction, it is important to consider the performance log data of other online judge systems. Second, the derived rules are in crisp form, which fails to provide computational granularity. Integrating a fuzzy inference system could thus provide more flexibility.

VIII. ACKNOWLEDGMENTS

This work was supported by JSPS KAKENHI Grant Number 16K16174.

REFERENCES

[1] Aizu online judge: Programming challenge. [Online]. Available: http://judge.u-aizu.ac.jp/onlinejudge/ [Accessed: 23-Apr-2018]
[2] Aizu online judge (new site). [Online]. Available: https://onlinejudge.u-aizu.ac.jp/home [Accessed: 23-Apr-2018]
[3] AOJ developers site (API). [Online]. Available: http://developers.u-aizu.ac.jp/index [Accessed: 23-Apr-2018]
[4] F. Yang and F. W. Li, "Study on student performance estimation, student progress analysis, and student potential prediction based on data mining," Computers & Education, vol. 123, pp. 97–108, 2018. [Online]. Available: https://doi.org/10.1016/j.compedu.2018.04.006
[5] E. S. Tabanao, M. M. T. Rodrigo, and M. C. Jadud, "Predicting at-risk novice Java programmers through the analysis of online protocols," in Proceedings of the Seventh International Workshop on Computing Education Research, ser. ICER '11. New York, NY, USA: ACM, 2011, pp. 85–92. [Online]. Available: http://doi.acm.org/10.1145/2016911.2016930
[6] S. Bergin and R. Reilly, "Programming: Factors that influence success," in Proceedings of the 36th SIGCSE Technical Symposium on Computer Science Education, ser. SIGCSE '05. New York, NY, USA: ACM, 2005, pp. 411–415.



[7] D. Capovilla, P. Hubwieser, and P. Shah, "DiCS-Index: Predicting student performance in computer science by analyzing learning behaviors," in 2016 International Conference on Learning and Teaching in Computing and Engineering (LaTICE), March 2016, pp. 136–140.
[8] N. J. Falkner and K. E. Falkner, "A fast measure for identifying at-risk students in computer science," in Proceedings of the Ninth Annual International Conference on International Computing Education Research, ser. ICER '12. New York, NY, USA: ACM, 2012, pp. 55–62. [Online]. Available: http://doi.acm.org/10.1145/2361276.2361288
[9] H. Al-Shehri, A. Al-Qarni, L. Al-Saati, A. Batoaq, H. Badukhen, S. Alrashed, J. Alhiyafi, and S. O. Olatunji, "Student performance prediction using support vector machine and k-nearest neighbor," in 2017 IEEE 30th Canadian Conference on Electrical and Computer Engineering (CCECE), April 2017, pp. 1–4.
[10] R. Elias Francisco and A. Ambrosio, "Mining an online judge system to support introductory computer programming teaching," 06 2015.
[11] X. Yu and W. Chen, "Research on three-layer collaborative filtering recommendation for online judge," in 2016 Seventh International Green and Sustainable Computing Conference (IGSC), Nov 2016, pp. 1–4.
[12] C. Fernandez-Medina, J. R. Pérez-Pérez, V. M. Álvarez García, and M. d. P. Paule-Ruiz, "Assistance in computer programming learning using educational data mining and learning analytics," in Proceedings of the 18th ACM Conference on Innovation and Technology in Computer Science Education, ser. ITiCSE '13. New York, NY, USA: ACM, 2013, pp. 237–242. [Online]. Available: http://doi.acm.org/10.1145/2462476.2462496
[13] A. Alvarez and T. A. Scott, "Using student surveys in determining the difficulty of programming assignments," J. Comput. Sci. Coll., vol. 26, no. 2, pp. 157–163, Dec. 2010. [Online]. Available: http://dl.acm.org/citation.cfm?id=1858583.1858605
[14] R. Y. Toledo and Y. C. Mota, "An e-learning collaborative filtering approach to suggest problems to solve in programming online judges," Int. J. Distance Educ. Technol., vol. 12, no. 2, pp. 51–65, Apr. 2014. [Online]. Available: http://dx.doi.org/10.4018/ijdet.2014040103
[15] R. Yera and L. Martínez, "A recommendation approach for programming online judges supported by data preprocessing techniques," Applied Intelligence, vol. 47, no. 2, pp. 277–290, Sep. 2017. [Online]. Available: https://doi.org/10.1007/s10489-016-0892-x
[16] V. R. D. C. Martinho, C. Nunes, and C. R. Minussi, "An intelligent system for prediction of school dropout risk group in higher education classroom based on artificial neural networks," in 2013 IEEE 25th International Conference on Tools with Artificial Intelligence, Nov 2013, pp. 159–166.
[17] C. Lei, X. Yu, X. Meng, and W. Xu, "The design and research of on-line judge system based on multi-core," in 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems, June 2012, pp. 1691–1694.
[18] E. Verdú, M. J. Verdú, L. M. Regueras, J. P. de Castro, and R. García, "A genetic fuzzy expert system for automatic question classification in a competitive learning environment," Expert Systems with Applications, vol. 39, no. 8, pp. 7471–7478, 2012. [Online]. Available: https://doi.org/10.1016/j.eswa.2012.01.115
[19] W. X. Zhao, W. Zhang, Y. He, X. Xie, and J.-R. Wen, "Automatically learning topics and difficulty levels of problems in online judge systems," ACM Trans. Inf. Syst., vol. 36, no. 3, pp. 27:1–27:33, Mar. 2018. [Online]. Available: http://doi.acm.org/10.1145/3158670
[20] T. Kohonen, E. Oja, O. Simula, A. Visa, and J. Kangas, "Engineering applications of the self-organizing map," Proceedings of the IEEE, vol. 84, no. 10, pp. 1358–1384, Oct 1996.
[21] R. Rojas, Neural Networks, 1st ed. Springer-Verlag Berlin Heidelberg, 1996.
[22] J. Malone, K. McGarry, S. Wermter, and C. Bowerman, "Data mining using rule extraction from Kohonen self-organising maps," Neural Comput. Appl., vol. 15, no. 1, pp. 9–17, Mar. 2006. [Online]. Available: http://dx.doi.org/10.1007/s00521-005-0002-1
[23] T. Kohonen, "The self-organizing map," Proceedings of the IEEE, vol. 78, no. 9, pp. 1464–1480, Sep 1990.
[24] A. Ultsch, "U*-matrix: a tool to visualize clusters in high dimensional data," 01 2003.
[25] N. Yorek, I. Ugulu, and H. Aydin, "Using self-organizing neural network map combined with Ward's clustering algorithm for visualization of students' cognitive structural models about aliveness concept," Intell. Neuroscience, vol. 2016, pp. 6:6–6:6, Jan. 2016. [Online]. Available: https://doi.org/10.1155/2016/2476256
[26] Kohonen self organizing maps, mnemstudio.org, 2018. [Online]. Available: http://mnemstudio.org/neural-networks-kohonen-self-organizing-maps.htm [Accessed: 20-Apr-2018]

