Comparison of Activation Function
on Extreme Learning Machine (ELM) Performance
for Classifying the Active Compound
Dian Eka Ratnawati1*, Marjono2, Widodo3, Syaiful Anam4
1Ph.D. Student in Mathematics, Faculty of Mathematics and Natural Sciences, Brawijaya University, Malang 65145, Indonesia
2,4Department of Mathematics, Faculty of Mathematics and Natural Sciences, Brawijaya University, Malang 65145, Indonesia
3Department of Biology, Faculty of Mathematics and Natural Sciences, Brawijaya University, Malang 65145, Indonesia
*Corresponding author: dian_ilkom@ub.ac.id
Abstract. Active compounds can interact with other molecules and lead to a variety of positive and negative effects on a living
system. Therefore, classifying a compound is very important for understanding its character and its functions in human medicine.
Manual classification of compound functions by laboratory testing consumes both time and cost. Based on previous research, the
functions of a compound can be predicted from its molecular structure written in the Simplified Molecular Input Line Entry
System (SMILES) format. We therefore employed the Extreme Learning Machine (ELM) to classify active compounds according
to their SMILES structure. The results of this study suggest that ELM can classify active compounds very quickly while
maintaining good performance. The accuracy and computational time of the classification model depend on the activation
function. This experiment uses eleven activation functions, i.e., the Binary Step Function, Sigmoid, Swish, Exponential Linear
Squashing (ELiSH), Hyperbolic Tangent (TanH), Hard Hyperbolic Function (HardTanH), Rectified Linear Unit (ReLU), TanhRe,
Exponential Linear Units (ELUs), SoftPlus, and Leaky ReLU (LReLU). The experimental results show that ELU and TanhRe
have the best performance based on average and maximal accuracy. The accuracy of the system depends on the patterns in each
class and on the activation function used. Based on the experimental results, the average accuracy reaches 80.56% with the ELUs
activation function and the maximum accuracy reaches 88.73% with TanhRe.
INTRODUCTION
Exploring the function of active compounds in the laboratory consumes both time and cost; function elucidation
can be accelerated by applying a computational approach [1,2]. Compounds with similar structures may have similar
functions [3], so active compounds can be classified according to the similarity of their structures. The structure of a
molecule is written in the Simplified Molecular Input Line Entry System (SMILES), which is unique [4,5] and
suitable as input data for machine learning [4,6,7]. Currently, the Extreme Learning Machine (ELM) is a popular
machine learning method that is extremely fast and achieves good performance [8,9,10]. We therefore employed
ELM to classify active compounds according to their SMILES structure.
The accuracy and computational time of an ELM classification model depend on the activation functions (AFs)
used [11]. Therefore, the selection of AFs in the ELM is the main point of this research. Eleven activation
functions, i.e., the Binary Step Function, Sigmoid, hyperbolic tangent (TanH), Rectified Linear Unit (ReLU),
TanhRe, ELiSH, Swish, LReLU, HardTanH, SoftPlus, and ELUs, are compared in this study.
Some previous studies also used SMILES for classifying active compounds, applying Fuzzy KNN [12],
Learning Vector Quantization (LVQ) [13], C4.5 [14], K-Means [15], momentum backpropagation [16], and
backpropagation [17]. However, those studies were limited to classifying the compounds into only two classes.
This study therefore employs ELM to classify compounds into several classes. This choice is supported by
previous research showing that ELM is superior to other methods, i.e., C4.5, backpropagation, SVM, and RBF [8,9].
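The speed the paper attributes to ELM comes from its training procedure: hidden-layer weights are random and fixed, and only the output weights are solved, in closed form. The following is a minimal sketch of that idea, not the authors' code; the function names, parameters, and defaults here are our own illustrative choices.

```python
import numpy as np

def train_elm(X, y, n_hidden=100, activation=np.tanh, seed=0):
    """Fit a single-hidden-layer ELM classifier.

    Input weights W and biases b are drawn randomly and never updated;
    the output weights beta are obtained in one step via the
    Moore-Penrose pseudo-inverse, which is why ELM training is fast
    compared to iterative backpropagation.
    """
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = activation(X @ W + b)                    # hidden-layer output matrix
    T = np.eye(int(y.max()) + 1)[y]              # one-hot class targets
    beta = np.linalg.pinv(H) @ T                 # closed-form output weights
    return W, b, beta

def predict_elm(X, W, b, beta, activation=np.tanh):
    """Predict class labels with a trained ELM."""
    return np.argmax(activation(X @ W + b) @ beta, axis=1)
```

The `activation` parameter is the point of this paper's comparison: swapping in a different activation function changes both the accuracy and the computation time of the resulting model.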
140001-1
METHODS
Activation Functions
Activation functions (AFs) are the functions in a neural network (here, the ELM) that transform each node's
weighted inputs plus bias into its output [20]. The AF is a main component in the training and optimization of a
neural network because it shapes how the network learns the patterns in a dataset. The activation functions
compared in this study are:
1. Binary Step Function.
The Binary Step Function has a threshold value and is suitable for binary classification. It is defined as Eq. (7):
f(x) = \begin{cases} 1, & \text{if } x \ge 0 \\ 0, & \text{if } x < 0 \end{cases} \quad (7)
2. Sigmoid Function.
The sigmoid function has been applied successfully in binary classification problems. It is defined as Eq. (8):
f(x) = \frac{1}{1 + e^{-x}} \quad (8)
3. Swish
The Swish activation function is a combination of the sigmoid activation function and the input x. It is
defined as Eq. (9):
f(x) = x \cdot \frac{1}{1 + e^{-x}} = \frac{x}{1 + e^{-x}} \quad (9)
4. Exponential Linear Squashing (ELiSH)
The ELiSH function is a combination of the ELU and sigmoid functions. It is defined as Eq. (10):
f(x) = \begin{cases} \dfrac{x}{1 + e^{-x}}, & x \ge 0 \\[4pt] \dfrac{e^{x} - 1}{1 + e^{-x}}, & x < 0 \end{cases} \quad (10)
8. TanhRe
The TanhRe function is the identity for positive inputs and the hyperbolic tangent otherwise. It is defined as Eq. (14):
f(x) = \begin{cases} x, & x > 0 \\ \tanh(x), & x \le 0 \end{cases} \quad (14)
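The piecewise definitions above translate directly into vectorized code. The following is a sketch of the compared activation functions in NumPy; the function names are our own, and the equation numbers refer to the definitions above.

```python
import numpy as np

def binary_step(x):            # Eq. (7): 1 for x >= 0, else 0
    return np.where(x >= 0, 1.0, 0.0)

def sigmoid(x):                # Eq. (8): 1 / (1 + e^-x)
    return 1.0 / (1.0 + np.exp(-x))

def swish(x):                  # Eq. (9): x * sigmoid(x)
    return x * sigmoid(x)

def elish(x):                  # Eq. (10): ELU blended with sigmoid
    return np.where(x >= 0, x * sigmoid(x), (np.exp(x) - 1.0) * sigmoid(x))

def tanhre(x):                 # Eq. (14): identity for x > 0, tanh otherwise
    return np.where(x > 0, x, np.tanh(x))

def relu(x):                   # identity for x > 0, else 0
    return np.maximum(x, 0.0)

def leaky_relu(x, a=0.01):     # small slope a for negative inputs
    return np.where(x > 0, x, a * x)

def elu(x, a=1.0):             # identity for x > 0, a*(e^x - 1) otherwise
    return np.where(x > 0, x, a * (np.exp(x) - 1.0))

def softplus(x):               # smooth approximation of ReLU: ln(1 + e^x)
    return np.log1p(np.exp(x))

def hardtanh(x):               # tanh clipped to the range [-1, 1]
    return np.clip(x, -1.0, 1.0)
```

Because each function is vectorized, any of them can be passed directly as the hidden-layer activation of an ELM implementation.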
Preprocessing
Before classification, every SMILES code must be extracted into 29 features. These features are B,
C, N, O, P, S, F, Cl, Br, I, OH, =, #, @, (), [], +, -, charge, ionic (.), aromatic (:), NO, epoxy (COC), C=C, N+,
C=O, [O-], total valence, and total cyclic.
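The token-counting part of this extraction can be sketched as below. This is our own illustrative implementation, not the authors' preprocessing code; derived features such as charge, total valence, and total cyclic require chemistry-aware parsing (e.g. with a toolkit such as RDKit) and are left out here.

```python
def count_smiles_tokens(smiles):
    """Count occurrences of substructure tokens in a raw SMILES string.

    Multi-character tokens (Cl, Br, OH, COC, ...) are counted first and
    then masked out, so that their letters are not double-counted as the
    single-character features B, C, O, etc.
    """
    multi = ["COC", "C=C", "C=O", "[O-]", "Cl", "Br", "OH", "N+", "NO"]
    single = ["B", "C", "N", "O", "P", "S", "F", "I",
              "=", "#", "@", "(", "[", "+", "-", ".", ":"]
    feats = {}
    s = smiles
    for tok in multi:
        feats[tok] = s.count(tok)
        s = s.replace(tok, " " * len(tok))   # mask so parts are not recounted
    for tok in single:
        feats[tok] = s.count(tok)
    return feats
```

For example, for ethanol ("CCO") this counter reports two C atoms and one O atom, while for "CCl" the chlorine is counted once and its "C" letter is not also counted as a carbon.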
a) Average Accuracy
Figures 1(a), 1(b), and 1(c) show the average accuracies for class combinations 1-2-3, 1-6-7, and 1-3-4,
respectively. The average accuracy of each AF fluctuates depending on the experiment group or class combination.
This indicates that the ability of the AFs to classify active compounds still depends on the dataset used. The
5 best AFs by average accuracy are shown in Table 1.
Table 1 shows the 5 best AFs in each group based on average accuracy. The TanhRe and ELU activation functions
perform well in all class combinations, i.e., across the various datasets. TanhRe and ELU both act as the identity
function for positive values and remain nonzero for negative values [22]. These results correspond to
Maimaitiyiming et al. [11], who found that TanhRe performs best compared to TanH and ReLU, and that ELU
outperforms ReLU and LReLU [22,25,26].
b) Maximal Accuracy
Figures 2(a), 2(b), and 2(c) show the maximal prediction accuracies for class combinations 1-2-3, 1-6-7, and
1-3-4. The maximal accuracy of each AF fluctuates depending on the experiment group or class combination.
This indicates that the ability of the AFs to classify active compounds still depends on the dataset used.
The 5 best AFs by maximal accuracy are shown in Table 2.
Table 2 shows the 5 best activation functions in each class combination based on maximal accuracy. The table
shows that the TanhRe and ELU activation functions appear in all class combinations (1-2-3, 1-6-7, and 1-3-4),
which means TanhRe and ELU can recognize various data patterns well. In addition, Softplus is the best in class
combination 1-6-7, because Softplus is an improvement on ReLU and has a smoothed, nonzero gradient.
TABLE 3. The 5 best activation functions based on standard deviation.
Ranking | Standard Deviation, Class 1-2-3 | Standard Deviation, Class 1-6-7 | Standard Deviation, Class 1-3-4
The TanhRe and ELUs activation functions have the best average accuracy, and they are among the 5 best standard
deviations of accuracy in class combinations 1-2-3 and 1-3-4. This means that TanhRe and ELU tend toward
consistently high prediction accuracy.
The experimental results in Table 4 indicate that Softplus, Swish, and LReLU appear in all class combinations
(1-2-3, 1-6-7, and 1-3-4). The Softplus function is always ranked on top, except that it has the same speed as
LReLU in class combination 1-3-4. This result agrees with the study of Zheng et al., in which Softplus converged
faster than the ReLU and Sigmoid activation functions [28]. Faster convergence means less computation time is
needed; likewise, in this study the Sigmoid and ReLU functions need more processing time than the Softplus
activation function.
Based on the average-accuracy and maximum-accuracy scenarios, ELU and TanhRe are the best activation
functions for SMILES code classification. However, TanhRe requires longer processing time than ELU and the
other activation functions. If the performance of an activation function is instead judged by processing time, then
the Softplus activation function is the right choice, because it has the fastest processing time. In addition, Softplus
achieves quite good average and maximal accuracy.
CONCLUSIONS
The accuracy of the ELM depends on the patterns in each class and on the activation function used. The
activation functions in the ELM with the best performance based on average and maximal accuracy are ELU
and TanhRe. The experimental results show that the average accuracy reaches 80.56% with the ELUs function and
the maximum accuracy reaches 88.73% with the TanhRe function. Besides having high average and maximal
accuracy, the ELUs and TanhRe activation functions also have a small standard deviation, which means that an
ELM with the ELUs or TanhRe activation function can classify the function of an active compound well from its
SMILES code.
REFERENCES
[24] Z. Wang and Y. Parth, “Extreme Learning Machine for Multi-class Sentiment Classification of Tweets,” in
Proceedings in Adaptation, Learning and Optimization, Vol. 6 (Springer, 2015), pp. 1–11.
[25] Y. Zhang, Q. Hua, D. Xu, H. Li, Y. Bu, and P. Zhao, “A Complex-Valued CNN for Different Activation
Functions in Polarsar Image Classification,” in IGARSS 2019 - 2019 IEEE International Geoscience and
Remote Sensing Symposium (IEEE, 2019), pp. 10023–10026.
[26] M. M. Lau and K. H. Lim, “Review of Adaptive Activation Function in Deep Neural Network,” in 2018
IEEE-EMBS Conference on Biomedical Engineering and Sciences (IECBES) (IEEE, 2018), pp. 686–690.
[27] B. Gagana, H. A. U. Athri, and S. Natarajan, “Activation Function Optimizations for Capsule Networks,” in
2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI), 3
(IEEE, 2018), pp. 1172–1178.
[28] H. Zheng, Z. Yang, W. Liu, J. Liang, and Y. Li, “Improving Deep Neural Networks Using Softplus Units,”
in 2015 International Joint Conference on Neural Networks (IJCNN) (IEEE, 2015), pp. 1–4.