


Extreme Learning Machine as Maintainability Prediction Model for Object-Oriented Software Systems
S. O. Olatunji*1, Z. Rasheed*2, K.A. Sattar*3, A. M. Al-Mana*4, M. Alshayeb*5, E.A. El-Sebakhy#6
Information and Computer Science Department, King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
#Senior Scientist of Artificial Intelligence & Data Mining in Business and Health Science, MEDai Inc. an Elsevier Company, Millenia Park One,
4901 Vineland Road, Suite 450, Orlando, Florida 32811, USA

Abstract— As the number of object-oriented software systems increases, it becomes more important for organizations to maintain those systems effectively. However, only a small number of maintainability prediction models are currently available for object-oriented systems. In this paper, we develop an extreme learning machine (ELM) maintainability prediction model for object-oriented software systems. The model is based on the extreme learning machine algorithm for single-hidden layer feed-forward neural networks (SLFNs), which randomly chooses hidden nodes and analytically determines the output weights of SLFNs. The model is constructed using popular object-oriented metric datasets collected from different object-oriented systems. The prediction accuracy of the model is evaluated and compared with commonly used regression-based models and with a Bayesian network based model that was earlier developed using the same datasets. Empirical results from the simulation show that our ELM based model produces promising prediction accuracy, better than most of the other models previously applied to the same datasets.

Index Terms— Software maintainability, Extreme Learning Machines, Bayesian Network, Regression
——————————  ——————————
I. INTRODUCTION

Software maintainability is defined as the ease of finding and correcting errors in the software [1]. It is analogous to the hardware quality Mean-Time-To-Repair (MTTR). While there is as yet no way to directly measure or predict software maintainability, there is a significant body of knowledge about software attributes that make software easier to maintain. These include modularity, self (internal) documentation, code readability, and structured coding techniques. These attributes also improve sustainability, the ability to make improvements to the software [1].

It is arguable that many object-oriented (OO) software systems are currently in use. It is also arguable that the growing popularity of OO programming languages, such as Java, as well as the increasing number of software development tools supporting the Unified Modeling Language (UML), encourages more OO systems to be developed at present and in the future. Hence it is important that those systems are maintained effectively and efficiently. A software maintainability prediction model enables organizations to predict the maintainability of a software system and assists them in managing and planning their maintenance resources.

In addition, if an accurate maintainability prediction model is available for a software system, a defensive design can be adopted, which would reduce the future maintenance effort of the system. Maintainability of a software system can be measured in different ways: as the number of changes made to the code during a maintenance period, or as the effort required to make those changes [1][5]. The predictive model is called a maintenance effort prediction model if maintainability is measured as maintenance effort.

In this research we developed a maintainability prediction model for an object-oriented software system based on the recently introduced learning algorithm called extreme learning machine (ELM) for single-hidden layer feed-forward neural networks (SLFNs), which randomly chooses hidden nodes and analytically determines the output weights of SLFNs. In theory, this algorithm tends to provide good generalization performance at extremely fast learning speed. The experimental results found in the literature, based on a few artificial and real benchmark function approximation and classification problems, including very large complex applications, and particularly the empirical results from this study, demonstrate that ELM can produce good generalization performance in most cases and can learn thousands of times faster than conventional popular learning algorithms for feed-forward neural networks.

Despite the importance of software maintenance, little work has been done on developing predictive models for software maintainability, particularly for object-oriented software systems; this is evident in the small number of software maintainability prediction models currently available in the literature.

We have developed a maintainability prediction model for an object-oriented software system based on the recently introduced extreme learning machine (ELM) algorithm. Implementation was carried out on representative datasets related to the target systems. Furthermore, we performed a comparative analysis between our model and the models presented by Koten and Gray [5], which include regression-based and Bayesian network based models, in terms of their performance measure values, as recommended in the literature.

Furthermore, the usefulness of Extreme Learning Machines in the area of software engineering, and in particular in maintainability prediction for object-oriented software systems, has been made clearer by describing both the steps and the use of Extreme Learning Machines as an artificial intelligence modeling approach for predicting the maintainability of object-oriented software systems.

The rest of this paper is organized as follows. Section II reviews related earlier work. Section III discusses background information, including software maintainability, some prediction modeling techniques, and the main modeling technique used: the extreme learning machine (ELM). Section IV presents the OO software data sets used in our study and their description, together with data analysis based on skewness and the types of skewness. Section V presents the research approach. Section VI contains the model evaluation, prediction accuracy measures, empirical results, comparison with other models, and discussion. Section VII concludes the paper and outlines directions for future work.

II. RELATED WORK

Many object-oriented software maintainability prediction models were developed in the last decade; however, most of them suffer from low prediction accuracy [9]. In fact, most of the research work found in the literature on maintainability prediction of object-oriented software products makes use of either statistical models or artificial neural networks, though some recent work has used other artificial intelligence techniques such as Bayesian Belief Networks (BBN) [5] and multivariate adaptive regression splines (MARS) [9].

Regression techniques have been thoroughly utilized to predict the maintainability of object-oriented software systems [7][17].

In addition, several types of artificial neural networks have also been employed to predict the maintenance effort of object-oriented software systems. The Ward neural network and the General Regression Neural Network (GRNN) were used to predict maintenance effort for object-oriented software systems using object-oriented metrics [13]. On the other hand, the Back-Propagation Multi-Layer Perceptron (BP-MLP) has been used to predict faulty classes in object-oriented software [11]. In the same work, radial basis function networks (RBF) were used to predict the type of fault a faulty class has.

Bayesian Belief Networks (BBN) were suggested as a novel approach for software quality prediction by Fenton et al. [12] and Pearson [20]. They built their conjecture on Bayesian Belief Networks' ability to handle uncertainty, incorporate expert knowledge, and model the complex relationships among variables. However, a number of researchers [2][10] have identified several limitations of Bayesian Belief Networks when they are applied as a model for object-oriented software quality and maintainability prediction. Recently, Koten and Gray used a special type of Bayesian Belief Network, the Naive-Bayes classifier [5], to implement a Bayesian-network-based software maintainability prediction model. Although their results showed that their model gives better results than regression-based techniques for some datasets, the model is still inferior to regression-based techniques for some other datasets.

Zhou and Leung used Multivariate Adaptive Regression Splines (MARS) to predict object-oriented software maintainability [9]. In their work, they compared MARS with multivariate linear regression (MLR), artificial neural networks (ANN), regression trees (RT), and support vector machines (SVM). However, their MARS-based technique does not outperform the compared techniques on all the datasets they used.

III. BACKGROUND

This section discusses software maintainability and its types, models previously used to predict software maintainability, and the extreme learning machine approach used in this work.

A. Software Maintainability

Software maintenance is the process of modifying a software product after delivery to correct faults, to improve performance or other attributes, or to adapt the product to a changed environment [6]. Maintaining and enhancing the reliability of software during maintenance requires that software engineers understand how the various components of a design interact. People usually think of software maintenance as beginning when the product is delivered to the client. While this is formally true, in fact decisions that affect the maintenance of the product are made from the earliest stages of design.

Software maintenance is classified into four types: corrective, adaptive, perfective and preventive [1]. Corrective maintenance refers to fixing a program. Adaptive maintenance refers to modifications that adapt to changes in the data environment, such as new product codes or new file organizations, or to changes in the hardware or software environments. Perfective maintenance refers to enhancements: making the product better, faster, smaller, better documented, cleaner structured, with more functions or reports. Preventive maintenance is the work done to try to prevent malfunctions or to improve maintainability.

When a software system is not designed for maintenance, it exhibits a lack of stability under change. A modification in one part of the system has side effects that ripple throughout the system. Thus, the main challenges in software maintenance are to understand existing software and to make changes without introducing new bugs.

B. Regression Based Models

Regression models are used to predict one variable from one or more other variables. They provide the scientist with a powerful tool, allowing predictions about past, present, or future events to be made from information about past or present events.

1) Multiple Linear Regression Model

Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data.

Every value of the independent variable $x$ is associated with a value of the dependent variable $y$. The regression line for $p$ explanatory variables $x_1, x_2, \ldots, x_p$ is defined to be

$$\mu_y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p.$$

This line describes how the mean response $\mu_y$ changes with the explanatory variables. The observed values of $y$ vary about their means $\mu_y$ and are assumed to have the same standard deviation $\sigma$. Formally, the model for multiple linear regression given $n$ observations is

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$

where $\varepsilon_i$ denotes the model deviation.

One approach to simplifying multiple regression equations is the stepwise procedures: forward selection, backward elimination, and stepwise regression. They add or remove variables one at a time until some stopping rule is satisfied.

2) Forward selection

Forward selection starts with an empty model. The variable that has the smallest P value when it is the only predictor in the regression equation is placed in the model first. Each subsequent step adds the variable that has the smallest P value in the presence of the predictors already in the equation. Variables are added one at a time as long as their P values are small enough, typically less than 0.05 or 0.10.

3) Backward elimination

Backward elimination starts with all of the predictors in the model. The least significant variable, that is, the one with the largest P value, is removed and the model is refitted. Each subsequent step removes the least significant variable in the model until all remaining variables have individual P values smaller than some threshold, such as 0.05 or 0.10.

4) Stepwise regression

This approach is similar to forward selection, except that variables are removed from the model if they become non-significant as other predictors are added.

Backward elimination has an advantage over forward selection and stepwise regression: it is possible for a set of variables to have considerable joint predictive capability even though no individual subset of them does. Forward selection and stepwise regression will fail to identify such a set, because variables sometimes do not predict well individually; backward elimination starts with everything in the model, so the joint predictive capability will be seen.
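As an illustration of the stepwise procedures described above, the following is a minimal sketch of forward selection in Python with NumPy (our own illustration; the paper does not specify an implementation). It ranks candidate variables by the t statistic of the newly added coefficient, which orders them the same way as ranking by smallest P value:

```python
import numpy as np

def t_stats(X, y):
    """OLS fit with an intercept; return the t statistics of the slope coefficients."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = resid @ resid / (n - p - 1)           # residual variance estimate
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(A.T @ A)))
    return beta[1:] / se[1:]                       # skip the intercept

def forward_select(X, y, t_enter=2.0):
    """Greedy forward selection: repeatedly add the predictor whose coefficient
    has the largest |t| (equivalently, the smallest P value) until none passes
    the entry threshold."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        best, best_t = None, t_enter
        for j in remaining:
            cand = selected + [j]
            t = abs(t_stats(X[:, cand], y)[-1])    # t stat of the newly added variable
            if t > best_t:
                best, best_t = j, t
        if best is None:
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

Backward elimination and stepwise regression can be sketched analogously, by removing (or removing and re-adding) variables against the same kind of threshold.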
C. Bayesian Networks

A Bayesian network consists of nodes interconnected by directed links, forming a directed acyclic graph. In this graph, nodes represent random variables (RVs) and links correspond to direct probabilistic influences. The RVs correspond to important attributes of the modeled system that exemplify the system's behavior. A directed connection between two nodes indicates a causal effect between the RVs associated with those nodes.

The structure of the directed acyclic graph states that each node is independent of all its non-descendants conditioned on its parent nodes. In other words, the Bayesian network represents the conditional probability distribution P(Y | X1, ..., Xn), which quantifies the influence of the variables Xi on the variable Y. The nodes Xi are called the parents of Y, and Y is called a child of each Xi. It should be noted that the outcomes of the events for the variables Xi influence the outcome of the event Y.

D. Extreme Learning Machine

In general, the learning speed of feed-forward neural networks (FFNN) is slower than required, and this has become a bottleneck in their applications, limiting their scalability. According to [4], there are two main reasons for this behavior: the slow gradient-based learning algorithms used to train the networks, and the iterative tuning of all the network parameters by these algorithms. To overcome these problems, [2][4] propose a learning algorithm called extreme learning machine (ELM) for single-hidden layer feed-forward neural networks (SLFNs), which randomly selects the input weights and analytically determines the output weights of SLFNs. It is stated that "In theory, this algorithm tends to provide the best generalization performance at extremely fast learning speed" [4].

This is remarkable because, in the past, there seemed to exist an unbreakable virtual speed barrier that classic learning algorithms could not break through, so that feed-forward networks implementing them took a very long time to train, whether the application was simple or complex. ELM also tends to reach the minimum training error while considering the magnitude of the weights, in contrast to the classic gradient-based learning algorithms, which only aim at the minimum training error and do not consider the magnitude of the weights. Furthermore, unlike the classic gradient-based learning algorithms, which only work for differentiable activation functions, the ELM learning algorithm can be used to train SLFNs with non-differentiable activation functions. According to [4], "Unlike the traditional classic gradient-based learning algorithms facing several issues like local minimum, improper learning rate and over-fitting, etc, the ELM tends to reach the solutions straightforward without such trivial issues".

The ELM has several interesting and significant features that distinguish it from traditional popular gradient-based learning algorithms for feed-forward neural networks. These include:
The learning speed of ELM is extremely fast. In simulations reported by Huang et al. [2], the learning phase of ELM can be completed in seconds, or in less than a second, for many applications. Previously, it seemed that there exists a virtual speed barrier which most (if not all) classic learning algorithms cannot break through, and it is not unusual for training a feed-forward network with classic learning algorithms to take a very long time, even for simple applications.

The ELM has better generalization performance than gradient-based learning, such as back-propagation, in most cases.

The traditional classic gradient-based learning algorithms, and some other learning algorithms, may face several issues such as local minima, improper learning rate and over-fitting. To avoid these issues, methods such as weight decay and early stopping often need to be used with these classical learning algorithms. The ELM tends to reach the solutions straightforwardly, without such issues. The ELM learning algorithm also looks much simpler than most learning algorithms for feed-forward neural networks.

Unlike the traditional classic gradient-based learning algorithms, which only work for differentiable activation functions, the ELM learning algorithm can be used to train SLFNs with many non-differentiable activation functions [3].

1) How the Extreme Learning Machine Algorithm Works

Let us first define the standard SLFN (single-hidden layer feed-forward neural network). If we have $N$ samples $(x_i, t_i)$, where $x_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^T \in R^n$ and $t_i = [t_{i1}, t_{i2}, \ldots, t_{im}]^T \in R^m$, then the standard SLFN with $\tilde{N}$ hidden neurons and activation function $g(x)$ is defined as:

$$\sum_{i=1}^{\tilde{N}} \beta_i \, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, \ldots, N,$$

where $w_i = [w_{i1}, w_{i2}, \ldots, w_{in}]^T$ is the weight vector that connects the $i$th hidden neuron and the input neurons, $\beta_i = [\beta_{i1}, \beta_{i2}, \ldots, \beta_{im}]^T$ is the weight vector that connects the $i$th hidden neuron and the output neurons, and $b_i$ is the threshold of the $i$th hidden neuron. $w_i \cdot x_j$ denotes the inner product of $w_i$ and $x_j$.

The SLFN aims to minimize the difference between $o_j$ and $t_j$. This can be expressed mathematically as:

$$\sum_{i=1}^{\tilde{N}} \beta_i \, g(w_i \cdot x_j + b_i) = t_j, \quad j = 1, \ldots, N,$$

or, more compactly, as $H\beta = T$, where

$$H(w_1, \ldots, w_{\tilde{N}}, b_1, \ldots, b_{\tilde{N}}, x_1, \ldots, x_N) = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_1 + b_{\tilde{N}}) \\ \vdots & & \vdots \\ g(w_1 \cdot x_N + b_1) & \cdots & g(w_{\tilde{N}} \cdot x_N + b_{\tilde{N}}) \end{bmatrix}_{N \times \tilde{N}},$$

$$\beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_{\tilde{N}}^T \end{bmatrix}_{\tilde{N} \times m} \quad \text{and} \quad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}.$$

As proposed by Huang and Babri (1998), $H$ is called the hidden layer output matrix of the neural network.

According to Huang et al. (2004), the ELM algorithm works as follows. Given a training set $\aleph = \{(x_i, t_i) \mid x_i \in R^n, t_i \in R^m, i = 1, \ldots, N\}$, an activation function $g(x)$, and a hidden neuron number $\tilde{N}$, do the following:

1. Assign random values to the input weights $w_i$ and the biases $b_i$, $i = 1, \ldots, \tilde{N}$.
2. Find the hidden layer output matrix $H$.
3. Find the output weights $\beta = H^{\dagger} T$, where $H^{\dagger}$ is the Moore-Penrose generalized inverse of $H$, and $\beta$, $H$ and $T$ are defined as in the SLFN specification above.
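The three steps above can be sketched directly in NumPy. This is a minimal illustration under our own assumptions (a sigmoid activation and a fixed random seed), not the exact implementation used in this study:

```python
import numpy as np

def elm_train(X, T, n_hidden, seed=0):
    """ELM training: random input weights and biases, analytic output weights."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], n_hidden))  # step 1: random input weights w_i
    b = rng.normal(size=n_hidden)                # step 1: random biases b_i
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))       # step 2: hidden layer output matrix H
    beta = np.linalg.pinv(H) @ T                 # step 3: beta = H† T
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass of the trained SLFN."""
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta
```

Here `np.linalg.pinv` computes the Moore-Penrose generalized inverse $H^{\dagger}$, so no iterative tuning of the hidden layer is needed; this is the source of ELM's speed.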


IV. OO SOFTWARE DATA SETS

In this work, we made use of the OO software datasets published by Li and Henry [7]. The datasets consist of five C&K metrics (DIT, NOC, RFC, LCOM and WMC) and four L&H metrics (MPC, DAC, NOM and SIZE2), as well as SIZE1, a traditional lines-of-code size metric. The metric data were collected from a total of 110 classes in two OO software systems: the User Interface Management System (UIMS) and the Quality Evaluation System (QUES). The code was written in Classical-Ada(TM). The UIMS and QUES datasets contain 39 and 71 classes, respectively. Li and Henry measure maintainability with the CHANGE metric, which counts the number of lines in the code that were changed during a three-year maintenance period. Neither the UIMS nor the QUES dataset contains actual maintenance effort data. The same datasets are also used by other researchers [5][6][7]. The description of each metric is given in Table I.

TABLE I. Description of the metrics

Name  | Description
DIT   | Depth of the inheritance tree (= inheritance level number of the class, 0 for the root class)
NOC   | Number of children (= number of direct sub-classes that the class has)
MPC   | Message-passing coupling (= number of send statements defined in the class)
RFC   | Response for a class (= total of the number of local methods and the number of methods called by local methods in the class)
LCOM  | Lack of cohesion of methods (= number of disjoint sets of local methods, i.e. number of sets of local methods that do not interact with each other, in the class)
DAC   | Data abstraction coupling (= number of abstract data types defined in the class)
WMC   | Weighted method per class (= sum of McCabe's cyclomatic complexity of all local methods in the class)
NOM   | Number of methods (= number of local methods in the class)
SIZE1 | Lines of code (= number of semicolons in the class)
SIZE2 | Number of properties (= total of the number of attributes and the number of local methods in the class)
A. Data Analysis Based on Skewness

Skewness is a measure of symmetry, or more precisely, the lack of symmetry. A distribution, or data set, is symmetric if it looks the same to the left and right of the center point. Skewness tells us about the direction of variation of the data set.

Mathematical expression: the skewness of a random variable $X$ is defined as:

$$\text{Skewness} = \frac{1}{N} \sum_{i=1}^{N} \frac{(X_i - \mu)^3}{\sigma^3},$$

where $X_i$ are the observations of the random variable, $\mu$ is the mean, $\sigma$ is the standard deviation and $N$ is the total number of observations.

B. Types of Skewness

Usually three types of skewness are distinguished: right, normal and left skewness [21]. A distribution is said to be right-skewed if the right tail is longer and the mass of the distribution is concentrated on the left. Similarly, a distribution is said to be left-skewed if the left tail is longer and the mass of the distribution is concentrated on the right. Finally, if the skewness is nearly equal to zero, the data are normally distributed throughout the range. The acceptable range for normality is a skewness between -1 and 1. Figure 1 shows the three types of skewness (right: skew > 0, normal: skew ~ 0, left: skew < 0).

In our experiment, we generated the skew graphs of some of our dataset fields. Two of the observations are shown in Figure 2 and Figure 3.
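The skewness expression above can be computed directly; a small sketch (the sample values below are made up for illustration):

```python
import numpy as np

def skewness(x):
    """Population skewness: mean of the cubed standardized deviations."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()            # population standard deviation (ddof=0)
    return np.mean(((x - mu) / sigma) ** 3)
```

A symmetric sample gives a value near zero, a long right tail gives a positive value, and a long left tail gives a negative value, matching the three types described above.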
V. RESEARCH APPROACH

For each data set, the available data were divided into two parts. One part was used as a training set for constructing a maintainability prediction model; the other part was used for testing, to determine the prediction ability of the developed model. Although there are many ways to split a given dataset, we chose the stratified sampling approach because it breaks the data randomly while producing a balanced division based on the supplied percentage. The division could be, for instance, 70% for the training set and 30% for the testing set. In this work, we selected 70% of the data for building the model (internal validation) and 30% for testing (external validation, or the cross-validation criterion). We repeated both the internal and external validation processes 1000 times to obtain fair partitions throughout the entire process.

It was ensured that the same percentage division was used for each of the two datasets, in order to maintain the comparability of the prediction accuracy of the two datasets by using the same proportion of the sample cases for learning and testing.

We also evaluated and compared our developed model, quantitatively, with the other OO software maintainability prediction models cited earlier, using the prediction accuracy measures recommended in the literature: the absolute residual (Ab.Res.), the magnitude of relative error (MRE) and the proportion of the predicted values that have an MRE less than or equal to a specified value (pred measures). Details of all these performance measures are provided in the next section.

VI. MODEL EVALUATION
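The repeated 70/30 partitioning described above can be sketched as follows. This is a simplified illustration using a plain random permutation, not the exact stratified sampling used in the study:

```python
import numpy as np

def repeated_splits(n_samples, train_frac=0.7, n_repeats=1000, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated random train/test partitions."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_frac * n_samples))
    for _ in range(n_repeats):
        perm = rng.permutation(n_samples)      # a fresh random ordering each repeat
        yield perm[:n_train], perm[n_train:]   # 70% for training, 30% for testing
```

Each repeat yields a disjoint training/testing partition of the class indices; a model would be fitted on the first part and scored on the second, and the accuracy measures averaged over the repeats.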
A. Prediction Accuracy Measures

We compared the software maintainability prediction models using the following prediction accuracy measures: the absolute residual (Ab.Res.), the magnitude of relative error (MRE) and the pred measures.

The Ab.Res. is the absolute value of the residual:

Ab.Res. = abs(actual value - predicted value)

We used the sum of the absolute residuals (Sum Ab.Res.), the median of the absolute residuals (Med.Ab.Res.) and the standard deviation of the absolute residuals (SD Ab.Res.). The Sum Ab.Res. measures the total residual over the dataset. The Med.Ab.Res. measures the central tendency of the residual distribution; the median is chosen as the measure of central tendency because the residual distribution is usually skewed in software datasets. The SD Ab.Res. measures the dispersion of the residual distribution.

MRE is a normalized measure of the discrepancy between actual and predicted values:

MRE = abs(actual value - predicted value) / actual value

The Max.MRE measures the maximum relative discrepancy, which is equivalent to the maximum error relative to the actual effort in the prediction. The mean of the MRE values over $n$ cases is the mean magnitude of relative error (MMRE):

$$\text{MMRE} = \frac{1}{n} \sum_{i=1}^{n} \text{MRE}_i$$

According to [14], Pred is a measure of the proportion of the predicted values that have an MRE less than or equal to a specified value, given by:

Pred(q) = k / n

where q is the specified value, k is the number of cases whose MRE is less than or equal to q, and n is the total number of cases in the dataset.

According to [15] and [8], for an effort prediction model to be considered accurate, MMRE < 0.25 and/or either pred(0.25) > 0.75 or pred(0.30) > 0.70. These are the criteria suggested in the literature as far as effort prediction is concerned.

B. Empirical Results, Comparison and Discussion

As stated earlier, in order to conduct a valid comparison, our ELM model was trained on exactly the same training sets and evaluated on exactly the same testing set samples as used in the previous works cited earlier, particularly [5]. The tables and figures below show the results of our newly developed model in comparison with the earlier models applied to the same data sets.

1) Results from the QUES dataset

Table II shows the values of the prediction accuracy measures achieved by each of the maintainability prediction models for the QUES dataset. Recall, as quoted earlier, that for an effort prediction model to be considered accurate, MMRE < 0.25 and/or either pred(0.25) > 0.75 or pred(0.30) > 0.70 [15][8]; hence the closer a model's values are to these baselines, the better. Table II shows that the extreme learning machine model achieved an MMRE value of 0.3502, a pred(0.25) value of 0.368 and a pred(0.30) value of 0.380. Thus the ELM is the model that comes closest to the required MMRE value; it is the best in terms of MMRE and also in terms of the sum of the absolute residuals. It is very close to satisfying the MMRE criterion of an accurate prediction, and in terms of the other measures it competes favorably with the other models.

In comparison with the UIMS dataset, the MMRE value of 0.3502 is better, while the pred(0.25) and pred(0.30) values are poorer. This indicates that the performance of extreme learning machine models may vary depending on the characteristics of the dataset and/or on which prediction accuracy measure is used.
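The accuracy measures defined in this section can be computed in a few lines; a minimal sketch (the example values are made up for illustration):

```python
import numpy as np

def accuracy_measures(actual, predicted):
    """Compute the prediction accuracy measures used in this study."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ab_res = np.abs(actual - predicted)        # absolute residuals
    mre = ab_res / actual                      # magnitude of relative error
    pred = lambda q: np.mean(mre <= q)         # Pred(q) = k / n
    return {
        "Sum Ab.Res.": ab_res.sum(),
        "Med.Ab.Res.": np.median(ab_res),
        "SD Ab.Res.": ab_res.std(ddof=1),
        "Max.MRE": mre.max(),
        "MMRE": mre.mean(),
        "Pred(0.25)": pred(0.25),
        "Pred(0.30)": pred(0.30),
    }
```

These are exactly the columns reported for each model in Tables II and III.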
TABLE II. Prediction accuracy measures for the QUES dataset

Model                    | Max.MRE | MMRE   | Pred(0.25) | Pred(0.30) | Sum Ab.Res. | Med.Ab.Res. | SD Ab.Res.
Bayesian Network [5]     | 1.592   | 0.452  | 0.391      | 0.430      | 686.610     | 17.560      | 31.506
Regression Tree [5]      | 2.104   | 0.493  | 0.352      | 0.383      | 615.543     | 19.809      | 25.400
Backward Elimination [5] | 1.418   | 0.403  | 0.396      | 0.461      | 507.984     | 17.396      | 19.696
Stepwise Selection [5]   | 1.471   | 0.392  | 0.422      | 0.500      | 498.675     | 16.726      | 20.267
Extreme Learning Machine | 1.803   | 0.3502 | 0.368      | 0.380      | 56.122      | 28.06       | 22.405

TABLE III. Prediction accuracy measures for the UIMS dataset

Model                    | Max.MRE | MMRE   | Pred(0.25) | Pred(0.30) | Sum Ab.Res. | Med.Ab.Res. | SD Ab.Res.
Bayesian Network [5]     | 7.039   | 0.972  | 0.446      | 0.469      | 362.300     | 10.550      | 46.652
Regression Tree [5]      | 9.056   | 1.538  | 0.200      | 0.208      | 532.191     | 10.988      | 63.472
Backward Elimination [5] | 11.890  | 2.586  | 0.215      | 0.223      | 538.702     | 20.867      | 53.298
Stepwise Selection [5]   | 12.631  | 2.473  | 0.177      | 0.215      | 500.762     | 15.749      | 54.114
Extreme Learning Machine | 4.918   | 0.968  | 0.392      | 0.450      | 39.625      | 18.768      | 16.066
Going by the sum of the absolute residuals, we can see strong evidence that the extreme learning machine model's value is significantly lower, and thus better, than those of the other models.

2) Results from the UIMS dataset

Table III shows the values of the prediction accuracy measures achieved by each of the maintainability prediction models for the UIMS dataset. Table III shows that the extreme learning machine model achieved an MMRE value of 0.968, a pred(0.25) value of 0.392 and a pred(0.30) value of 0.450. These values are among the best of the five models presented in Table III. Specifically, in terms of MMRE it is the best among all the models, and there is strong evidence that the extreme learning machine model's value is significantly lower, and thus better, than those of the other models. In terms of pred(0.25) and pred(0.30), it is the second best model after the Bayesian network. In addition, it is also the best on the absolute residual measures. The values of the absolute residuals again confirm strong evidence that the differences between the extreme learning machine model and the other models are significant. Thus, it is concluded that the extreme learning machine model is able to predict maintainability of the UIMS dataset better than the other models presented.

3) Discussion

With the exception of the extreme learning machine, which has values closer to satisfying one of the criteria, none of the maintainability prediction models presented comes close to satisfying any of the criteria of an accurate prediction model cited earlier. However, it is reported that the prediction accuracy of software maintenance effort prediction models is often low, and thus it is very difficult to satisfy the criteria [16].

Thus, we conclude that the extreme learning machine model presented in this paper can predict maintainability of OO software systems reasonably well, to an acceptable degree. This work shows that only the extreme learning machine model has been able to consistently perform better, by having values closer to satisfying one of the criteria laid down in the literature (MMRE) for both data sets. For both the QUES and UIMS datasets, whenever the extreme learning machine model's prediction accuracy has not been as good as that of the other models, it has been reasonably close. In terms of absolute residuals, ELM is better than the other models for both datasets.

VII. CONCLUSION

An extreme learning machine OO software maintainability prediction model has been constructed using the OO software metric data used by Li and Henry [7]. The prediction accuracy of the model is evaluated and compared with

the Bayesian network model, the regression tree model and the multiple linear regression models, using the prediction accuracy measures: the absolute residuals, MRE and pred measures. The results indicate that the extreme learning machine model can predict maintainability of OO software systems. For both datasets, the extreme learning machine model achieved significantly better prediction accuracy, in terms of MMRE, than the other models, as it was closer to satisfying one of the criteria, a fit which none of the other models has been able to achieve. Also, for both the QUES and UIMS datasets, whenever the extreme learning machine model's prediction accuracy has not been as good as the best among the models, it has been reasonably competitive against the best models.

Therefore, we conclude that the prediction accuracy of the extreme learning machine model is better than, or at least, is

[8] S.G. MacDonell, "Establishing relationships between specification size and software process effort in CASE environment," Information and Software Technology, vol. 39, pp. 35-45, 1997.
[9] Y. Zhou and H. Leung, "Predicting object-oriented software maintainability using multivariate adaptive regression splines," The Journal of Systems and Software, vol. 80, pp. 1349-1361, 2007.
[10] K. Kaur, A. Kaur, and R. Malhotra, "Alternative Methods to Rank the Impact of Object Oriented Metrics in Fault Prediction Modeling using Neural Networks," Proceedings of World Academy of Science, Engineering and Technology, vol. 13, pp. 207-212, 2006.
[11] S. Kanmani, V. Rhymend Uthariaraj, V. Sankaranarayanan, and P. Thambidurai, "Object Oriented Software Quality Prediction Using General Regression Neural Networks," ACM SIGSOFT Software Engineering Notes, vol. 29, pp. 1-5, 2004.
[12] N.E. Fenton, P. Krause, and M. Neil, "Software Measurement: Uncertainty and Causal Modeling," IEEE Software, vol. 10, no. 4, pp. 116-122, 2002.
[13] M. M. T. Thwin and T.-S. Quah, "Application of Neural Networks for predicting Software Development faults using Object Oriented Design Metrics," Proceedings of the 9th International Conference on
competitive against the Bayesian network model and the Neural Information Processing, November 2002, pp. 2312 – 2316.
[14] N.E. Fenton and S.L. Pfleeger. Software Metrics:A Rigorous & Prac-
regression based models. These outcomes have confirmed tical Approach. PWS Publishing Company, second edition, 1997.
that extreme learning machine is indeed a useful modeling [15] S.D. Conte, H.E. Dunsmore, and V.Y. Shen. Software Engineering
technique for software maintainability prediction, although Metrics and Models. Benjamin/Cummings Publishing Company,
further studies are required to realize its full potentials as [16] A. De Lucia, E. Pompella, and S. Stefanucci. Assessing effort estima-
well as reducing its shortcomings if not totally eradicated. tion models for corrective maintenance through empirical studies. In-
The results in this paper also suggest that the prediction formation and Software Technology, 47:3–15, 2005.
accuracy of the extreme learning machine model may vary [17] F. Fioravanti and P. Nesi, "Estimation and Prediction Metrics for
Adaptive Maintenance Effort of Object-Oriented Systems'', IEEE
depending on the characteristics of dataset and/or the predic- Transactions on Software Engineering, vol. 27, no. 12, 2001, pp.
tion accuracy measure used. This provides an interesting 1062–1084.
direction for future studies. Another interesting direction [18] Qin-Yu Zhu, A.K. Qin, P.N. Suganthan, Guang-Bin Huang, Evolutio-
nary extreme learning machine, Elsevier, Pattern Recognition 38
would be using the other variants of extreme learning ma- (2005) 1759 – 1763
chine such as Evolutionary extreme learning machine [18] [19] Ming-Bin Li, Guang-Bin Huang_, P. Saratchandran, N. Sundararajan,
and Fully complex extreme learning machine [19] for soft- Fully complex extreme learning machine, Elsevier, Neurocomputing
68 (2005) 306–314
ware effort prediction. [20] N.E. Fenton and M. Neil, “A Critique of Software Defect Prediction
Research,” IEEE Transactions on Software Engineering, vol. 25, no.
5, 1999, pp. 675–689.
[21] PEARSON K., Contributions to the Mathematical Theory of Evolu-
Acknowledgment tion,-II. Skew Variation in Homogeneous Material. Phil. Trans. Roy.
The authors would like to thank the anonymous review- Soc. London (A.) 1895,186,343-414.
ers for their constructive comments. The authors also ac-
knowledge the support of King Fahd University of Petro-
leum and Minerals in the development of this work.

[1] Liguo Yu, S.R. Schach, Kai Chen, “Measuring the maintainability of
Open Source Software” IEEE 2005
[2] Guang-Bin Huang_, Qin-Yu Zhu, Chee-Kheong Siew, “Extreme
learning machine: Theory and applications”, Neurocomputing 70
(2006) 489–501, ELSEVIER.
[3] G.-B. Huang, Q.-Y. Zhu, K.Z. Mao, C.-K. Siew, P. Saratchandran, N.
Sundararajan, “Can threshold networks be trained directly?”, IEEE
Trans. Circuits Syst. II 53 (3) (2006) 187–191.
[4] G.-B. Huang, Q.-Y. Zhu, C.-K. Siew, “Extreme learning machine: a
new learning scheme of feedforward neural networks”, in: Proceed-
ings of International Joint Conference on Neural Networks
(IJCNN2004), 25–29 July, 2004, Budapest, Hungary.
[5] Chikako Van Koten, Andrew Gray,” An Application of Bayesian
Network for Predicting Object-Oriented Software Maintainability”,
The Information Science, Discussion Paper Series, 2006.
[6] Rikard Land, "Measurements of Software Maintainability IEEE
Standard Glossary of Software Engineering Terminology, report IEEE
Std 610.12-1990, IEEE, 1990.
[7] W. Li and S. Henry. Object-oriented metrics that predict maintainabil-
ity, Journal of Systems and Software, 23:111–122, 1993.
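For completeness, the accuracy measures used in the comparisons above (MRE and its mean MMRE, pred(q), and the absolute residuals; cf. [15]) can be computed as in the following sketch. The actual and predicted values shown are hypothetical and serve only to illustrate the definitions, not to reproduce the paper's results.

```python
import numpy as np

def mre(actual, predicted):
    """Magnitude of relative error per observation: |a - p| / |a|."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(actual - predicted) / np.abs(actual)

def mmre(actual, predicted):
    """Mean magnitude of relative error over the dataset."""
    return float(np.mean(mre(actual, predicted)))

def pred_q(actual, predicted, q):
    """Fraction of predictions whose MRE is at most q (e.g. 0.25 or 0.30)."""
    return float(np.mean(mre(actual, predicted) <= q))

def abs_residuals(actual, predicted):
    """Absolute residuals |a - p| per observation."""
    return np.abs(np.asarray(actual, dtype=float) -
                  np.asarray(predicted, dtype=float))

# Hypothetical maintenance-effort values, for illustration only.
actual = [100.0, 80.0, 60.0, 40.0, 20.0]
predicted = [110.0, 60.0, 63.0, 30.0, 21.0]
print(mmre(actual, predicted))          # mean relative error
print(pred_q(actual, predicted, 0.25))  # share of predictions within 25%
```

Note that a lower MMRE is better, while higher pred(q) values are better, which is why the two measures can rank the same set of models differently.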