net/publication/313417662
All content following this page was uploaded by Dilip Kumar Choubey on 02 August 2017.
Sanchita Paul received her PhD degree and ME degree in Computer Science &
Engineering from Birla Institute of Technology, Mesra, Ranchi, India, and
received BE degree in Computer Science & Engineering from Burdwan
University, West Bengal, India. She has approximately 10 years of teaching
1 Introduction
Diabetes is a chronic disease and a major public health challenge worldwide. It occurs
when the body is unable to produce insulin or to respond properly to it; insulin is
needed to regulate the level of glucose in the blood. Diabetes can be controlled with
insulin injections, oral medications, a controlled diet (changed eating habits) and
exercise programmes, but no complete cure is available. The three main symptoms of
diabetes are an increased need to urinate (polyuria), increased hunger (polyphagia) and
increased thirst (polydipsia). There are two main types of diabetes: Type 1 (juvenile,
insulin-dependent, brittle or sugar) diabetes and Type 2 (adult-onset or
non-insulin-dependent) diabetes. Type 1 diabetes mostly occurs in children and young
adults but can develop at any age; 5–10% of people with diabetes have Type 1. In this
type, the beta cells are destroyed, and people with the condition require regular insulin
injections to survive. Type 2 diabetes is the most common type, accounting for at least
90% of all diabetes cases. It mostly occurs in people over 40 years old but can also be
found in younger age groups. In this type, the body becomes resistant to insulin and does
not use the insulin it produces effectively. It can be controlled with lifestyle
modification and oral medications. In some extreme cases insulin injections may also be
required, but, as with Type 1, no complete cure is available.
Until a few years ago, physicians diagnosed the disease on the basis of their experience
and the patient's clinical data, that is, laboratory test reports. This kind of diagnosis
is time consuming, mainly because it depends entirely on the availability and experience
of physicians, who have to deal with imprecise and uncertain clinical data. To improve
decision making with clinical data and to reduce the time taken, a good (intelligent)
diagnosis system is needed. Here we analyse the input data (patient data, i.e., PIDD, the
Pima Indians Diabetes Dataset) and develop an accurate description or model for each
class using the attributes in the dataset, on the basis of which a diagnosis system may
later be developed. Researchers have reported that even experienced physicians are not
always able to detect the disease quickly and accurately. The dataset used here is
precise and free of noise, and no missing values were found. In this paper, the proposed
method for the diagnosis of diabetes uses GA for attribute selection and RBF NN for
classification. Using GA, 4 of the 8 attributes were selected. The main aim of attribute
selection is to reduce the number of features used in classification while maintaining
acceptable ROC and classification accuracy. Reducing the number of attributes is critical
in statistical learning: attribute selection saves storage, computation time (shorter
training and test times) and computation cost, and it improves the classification rate
and comprehensibility. RBF NN is a supervised learning method; in this study it has been
used for the classification (diagnosis) of diabetes.
GA_RBF NN 73
The rest of the paper is organised as follows: a brief description of GA and RBF NN is
given in Section 2, related work is presented in Section 3, the proposed methodology is
discussed in Section 4, results and discussion are presented in Section 5, and discussion
and future directions are covered in Section 6.
2.1 GA
John Holland introduced GA in the 1970s at the University of Michigan (USA). GAs are
adaptive heuristic search algorithms based on the evolutionary ideas of natural selection
and genetics. GA is an adaptive, population-based optimisation technique inspired by
Darwin's theory of survival of the fittest (Darwin, 1859). It mimics the natural
evolution process: the next population is evolved by simulating the operators of
selection, crossover and mutation. Holland, known as the father of the original GA, first
introduced these operators in Holland (1975); Goldberg (1989) and Michalewicz (1996)
later improved them.
The advantages of GA (Choubey et al., 2015; Choubey and Paul, 2015) are that its concepts
are easy to understand, it solves problems with multiple solutions, it is a global and
blind search method, it is modular (separate from the application), it supports
multi-objective optimisation, it works well in 'noisy' environments, and it is inherently
parallel and easily distributed, so it can readily be used on parallel machines. Its
limitations are that it is unsuited to certain optimisation problems, it gives no
absolute assurance of finding a global optimum, it cannot assure constant optimisation
response times, and it cannot find the exact solution. GA can be applied in artificial
creativity, bioinformatics, chemical kinetics, gene expression profiling, control
engineering, software engineering, the travelling salesman problem, mutation testing,
quality control and business. At each step, the GA applies three rules, namely selection,
crossover and mutation, to build the next generation from the current population. GA and
its selection, crossover and mutation operators are also illustrated briefly by Choubey
and Paul (2016).
2.1.1 Selection
Selection, also called the reproduction phase, aims to promote good solutions and
eliminate bad solutions in the current population while keeping the population size
constant. This is done by identifying good solutions (in terms of fitness) in the current
population and making duplicate copies of them. To keep the population size constant,
some bad solutions are then eliminated so that the multiple copies of good solutions can
be placed in the population. In other words, the selection phase chooses the parents from
the current population that together will generate the next population. Various selection
methods are available, such as roulette-wheel selection, Boltzmann selection, tournament
selection, rank selection and steady-state selection; the most commonly used is
roulette-wheel selection. The fitness value of an individual plays an important role in
all of these procedures.
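As an illustration of roulette-wheel selection, the following minimal Python sketch picks parents with probability proportional to their fitness. It is not the implementation used in this work (which was written in Java); the function name `roulette_wheel_select`, its parameters and the assumption of non-negative fitness values are introduced here only for illustration.

```python
import random


def roulette_wheel_select(population, fitness, k, rng=None):
    """Pick k parents, each with probability proportional to its fitness.

    Assumes all fitness values are non-negative (an assumption of this sketch).
    """
    rng = rng or random.Random()
    total = sum(fitness)
    if total <= 0:
        raise ValueError("fitness values must sum to a positive number")
    selected = []
    for _ in range(k):
        spin = rng.uniform(0, total)      # where the wheel stops
        cumulative = 0.0
        for individual, fit in zip(population, fitness):
            cumulative += fit             # each slice is as wide as the fitness
            if spin < cumulative:
                selected.append(individual)
                break
        else:
            # Only reached through floating-point round-off at the upper edge.
            selected.append(population[-1])
    return selected
```

Because the same individual can be drawn several times, fit solutions are duplicated and unfit ones tend to drop out, while the number of selected parents `k` keeps the population size constant.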
74 D.K. Choubey and S. Paul
2.1.2 Crossover
Note that the selection operator only makes multiple copies of the better solutions; it
does not generate any new solution. New solutions are generated in the crossover phase.
First, two solutions are selected from the new population, either randomly or by applying
a stochastic rule, and brought to the mating pool in order to create two offspring. The
offspring are not guaranteed to be fitter than their parents, but the chance is better
than random because they are created from individuals that survived the selection phase,
so good bit-string combinations in the parents are likely to be carried over to the
offspring. Even if the new offspring are not better in terms of fitness, this is not a
concern, because they will be eliminated in the next selection phase. Various crossover
methods are available, such as single-point crossover, two-point crossover, multi-point
(n-point) crossover, uniform crossover and matrix (two-dimensional) crossover.
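Single-point crossover, the simplest of the methods listed above, can be sketched as follows. This is an illustrative Python example under the assumption that individuals are equal-length lists of bits; the name `single_point_crossover` is hypothetical and does not come from the original implementation.

```python
import random


def single_point_crossover(parent_a, parent_b, rng=None):
    """Swap the tails of two equal-length bit lists at a random cut point."""
    rng = rng or random.Random()
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have the same length of at least 2")
    # The cut point lies strictly inside the string, so both parents contribute.
    point = rng.randint(1, len(parent_a) - 1)
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b
```

Each offspring inherits a prefix from one parent and a suffix from the other, so bit combinations that were good in the parents can be carried over intact.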
2.1.3 Mutation
Mutation of an individual takes place with a very low probability. If a bit of an
individual is selected for mutation, it is replaced by an alternative value for that bit.
In a binary string representation, for example, 0 is flipped to 1 and 1 is flipped to 0.
Mutation is applied after crossover to maintain diversity in the population. Again,
mutation does not always produce better offspring, but it is done to search for a few
solutions in the neighbourhood of the original solutions.
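A minimal Python sketch of bit-flip mutation is given below. The function name `bit_flip_mutation` and the default rate of 0.01 are assumptions made for illustration; the mutation probability used in the experiments is not stated in this section.

```python
import random


def bit_flip_mutation(chromosome, rate=0.01, rng=None):
    """Flip each bit independently with probability `rate` (assumed value)."""
    rng = rng or random.Random()
    # 1 - bit turns 0 into 1 and 1 into 0; other bits are copied unchanged.
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]
```

For attribute selection, one possible encoding (again an assumption) is an 8-bit chromosome in which each bit marks whether the corresponding PIDD attribute is kept, so a single flipped bit adds or removes one attribute from the candidate subset.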
2.2 RBF NN
RBF NN is one of the models in the family of artificial neural networks (ANN or NN). The
advantages of NN (Choubey et al., 2015; Choubey and Paul, 2015) are mapping capability
(pattern association), generalisation, robustness, fault tolerance, parallel and
high-speed information processing, and good pattern recognition; no mathematical process
model or rule-base knowledge is required, and different learning algorithms are
available. The limitations are that an NN needs training to operate, large networks
require high processing time, an NN is a black box that cannot explain how it reaches its
decisions, rules cannot be extracted, prior knowledge cannot be used, there is no
guarantee that learning converges, and heuristic parameters must be determined.
NN can be applied in pattern recognition, image processing, optimisation, constraint
satisfaction, forecasting, risk assessment and control systems. The RBF NN is a
supervised feed-forward network with one hidden layer of units called radial basis
functions (RBFs). Because RBF networks are supervised, they require a desired response in
order to be trained. They learn how to transform the input data into the desired
response, which makes them widely used in pattern classification. In the present study,
certain parameters of these networks are trained by an algorithm that uses the gradient
descent rule. RBF networks are popular for time series prediction, function
approximation, curve fitting, control and classification problems, and they have adaptive
and self-learning ability.
The RBF NN differs from other NNs in several distinctive features. Because of their
universal approximation, more compact topology and faster learning speed, RBF NNs have
attracted much attention and have been widely applied in many science and engineering
fields. The structure of the RBF NN is shown in Figure 1. It consists of three layers:
one input layer, one hidden layer and one output layer. A layer is a vector of units.
Each layer consists of one or more nodes (neurons), represented by small circles, and the
lines between nodes indicate the flow of information from one node to another. The input
layer is a set of distribution nodes, and its number of neurons equals the number of
input dimensions. It receives the input and has no function except buffering the input
signal (Selvakuberan et al., 2011); the links from the input layer to the hidden layer
are not weighted. Any layer between the input and output layers is called a hidden layer.
The hidden layer is a set of nodes, each characterised by a Gaussian radial basis
function; it calculates the values of the RBFs for the signal received from the input
layer and transmits the results to the output layer through weighted links. The output
layer is a set of nodes, each of which gives one output; it calculates the linear sum of
the hidden-neuron outputs and produces the final result (the classification) of the
network.
The Gaussian function is widely used as the activation function, so it has been used as
the RBF here.
Let $\phi_j(x)$ be the $j$th radial basis function. $\phi_j(x)$ is represented as:

$$\phi_j(x) = \exp\left(-\frac{\lVert x - c_j \rVert^2}{2\sigma_j^2}\right) \tag{1}$$

Here, $x = (x_1, x_2, \ldots, x_d)^T$ is the input vector, and
$c_j = (c_{1j}, c_{2j}, \ldots, c_{dj})^T$ and $\sigma_j^2$ are the $j$th centre vector and
the width parameter, respectively. The output of the RBF network $Y$, which is the linear
sum of the radial basis functions, is given as follows:

$$Y = \sum_{j=1}^{p} W_j \phi_j(x) \tag{2}$$
where $Y$ is the output of the RBF network, $p$ is the number of hidden-layer neurons,
and $W_j$ is the weight from the $j$th neuron to the output layer. To construct an RBF
network, the number of hidden-layer neurons $p$ must be set, and the centres $c_j$, the
widths $\sigma_j$ and the weights $W_j$ must be estimated. Learning in RBF NNs may proceed
in two phases:
Phase I: Compute the centres of the RBF kernels and fix their widths.
Phase II: Use the delta rule to adjust the weights until convergence.
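The Gaussian basis function, the network output and a Phase II weight update can be sketched in a few lines of Python. This is a minimal illustration and not the Weka-based implementation used in the experiments; the function names, the single-output network and the learning rate `lr` are assumptions, and the centres and widths are taken as already fixed in Phase I.

```python
import math


def gaussian_rbf(x, centre, sigma):
    """Eq. (1): exp(-||x - c_j||^2 / (2 * sigma_j^2))."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbf_output(x, centres, sigmas, weights):
    """Eq. (2): weighted linear sum of the p hidden-unit activations."""
    return sum(w * gaussian_rbf(x, c, s)
               for w, c, s in zip(weights, centres, sigmas))


def delta_rule_epoch(samples, targets, centres, sigmas, weights, lr=0.1):
    """One pass of the delta rule; centres and widths stay fixed (Phase I)."""
    for x, desired in zip(samples, targets):
        phi = [gaussian_rbf(x, c, s) for c, s in zip(centres, sigmas)]
        actual = sum(w * p for w, p in zip(weights, phi))
        error = desired - actual          # desired output minus actual output
        # Move each weight in proportion to the error and its unit's activation.
        weights = [w + lr * error * p for w, p in zip(weights, phi)]
    return weights
```

Calling `delta_rule_epoch` repeatedly until the error stops decreasing corresponds to adjusting the weights until convergence; only the output weights change, which is why this phase is fast compared with training all layers of a multilayer network.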
In static RBF networks, the centres can be selected in one of the following ways:
A set of random input patterns
A set of grid points in the input space
A set of optimal locations using clustering algorithms.
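The first two options in the list above can be sketched as follows. This is an illustrative Python example; the helper names `centres_from_random_patterns` and `centres_from_grid` are hypothetical, and the clustering-based option is not shown.

```python
import itertools
import random


def centres_from_random_patterns(patterns, p, rng=None):
    """Use p distinct input patterns, drawn at random, as the RBF centres."""
    rng = rng or random.Random()
    return [list(x) for x in rng.sample(patterns, p)]


def centres_from_grid(lower, upper, points_per_dim):
    """Place centres on a regular grid spanning [lower, upper] per dimension.

    Requires points_per_dim >= 2; yields points_per_dim ** d centres.
    """
    axes = []
    for lo, hi in zip(lower, upper):
        step = (hi - lo) / (points_per_dim - 1)
        axes.append([lo + i * step for i in range(points_per_dim)])
    return [list(c) for c in itertools.product(*axes)]
```

The grid option grows exponentially with the input dimension, which is one reason random patterns or cluster centres are often preferred for inputs with many attributes.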
In typical RBF learning, the network structure is determined based on prior knowledge or
the experience of experts.
3 Related work
Polat and Gunes (2007) combined Principal Component Analysis (PCA) and an Adaptive
Neuro-Fuzzy Inference System (ANFIS) to improve the diagnostic accuracy of diabetes
disease. PCA is used to reduce the dimensionality of the diabetes dataset features, and
ANFIS performs the diagnosis by classifying the reduced features. Seera and Lim (2014)
introduced a hybrid intelligent system based on the Fuzzy Min-Max neural network,
Classification and Regression Tree (CART) and Random Forest (RF) for the classification
of medical data. The methodology was applied to various datasets, including Breast Cancer
Wisconsin, PIDD and Liver Disorders, and performed better than other existing techniques.
Temurtas et al. (2009) used the Levenberg-Marquardt (LM) algorithm to train a multilayer
neural network, together with a probabilistic neural network (PNN), and used tenfold
cross-validation to estimate the results. They report that LM and PNN with tenfold
cross-validation give better correct classification of patterns than the conventional
validation method. Dogantekin et al. (2010) used Linear Discriminant Analysis (LDA) and
ANFIS for the diagnosis of diabetes. LDA separates the feature variables of healthy and
patient (diabetes) data, and ANFIS classifies the result produced by LDA. The techniques
provide better accuracy than previously reported results, so physicians can make very
accurate decisions using such a tool.
Barakat et al. (2010) worked on the classification of diabetes using a machine learning
approach, the Support Vector Machine (SVM), an efficient supervised learning algorithm.
The paper implements a new and efficient SVM-based technique for the diagnosis of
diabetes mellitus, in which a sequential covering approach is used for rule generation.
It also discusses an eclectic rule extraction technique that extracts rule sets from the
dataset so that the selected attributes can be used for classification in the diagnosis
of diabetes mellitus. Orkcu and Bal (2011) compared a backpropagation neural network, a
binary-coded genetic algorithm and a real-coded genetic algorithm for the classification
of medical datasets. Aslam et al. (2013) implemented an expert system for the
classification of diabetes data using Genetic Programming (GP). The technique consists of
three stages: the first stage performs feature selection using the t-test, the
Kolmogorov-Smirnov test and the Kullback-Leibler divergence test; the second stage uses
GP to build non-linear combinations of the attributes selected in the first stage; and in
the final stage the features generated by GP are compared using K-nearest neighbour (KNN)
and SVM classifiers. The classification is done on PIDD, which consists of 768 instances
with 8 attributes and one output (class) variable that takes the value '1' or '0'. The
generated features are then used to classify diabetes patients with high accuracy.
Lukka and Pasi (2011) used a fuzzy entropy measure and a similarity classifier for better
classification of diabetes. Fuzzy entropy is used for feature selection, and the
similarity classifier is applied to the selected features. By removing insignificant
features from the dataset, the approach gives much lower computation time and cost,
improves classification accuracy by reducing noise, and yields a more transparent and
comprehensible model. Polat et al. (2008) proposed a hybrid combination of Generalised
Discriminant Analysis (GDA) and least square support vector machine (LS-SVM) for the
classification of diabetes. The methodology is implemented in two stages: in the first
stage, the data are pre-processed using GDA to discriminate between healthy and patient
data; in the second stage, LS-SVM is applied to classify the diabetes patients. With
10-fold cross-validation, LS-SVM alone gives an accuracy of about 78.21%, and the
combined GDA and LS-SVM system gives about 82.05%.
Selvakuberan et al. (2011) used the Ranker search method for feature selection and
K star, REP tree, Naive Bayes, logistic, dagging and multiclass classifiers for
classification. The techniques provide a reduced feature set with higher classification
accuracy.
Qasem and Shamsuddin (2011) introduced Time Variant Multi-Objective Particle Swarm
Optimisation (TVMOPSO) of radial basis function (RBF) networks for medical disease
diagnosis. The study examines whether RBF networks can be trained using TVMOPSO, and the
performance is validated in terms of accuracy and complexity.
Kala et al. (2011) proposed a new methodology for the diagnosis of breast cancer using
neural networks. In this methodology, a mixture of expert models is used, and the
decisions of the individual experts are combined to give the final output. The proposed
architecture addresses breast cancer diagnosis by evolving the individual neural networks
with a Genetic Algorithm (GA). The experimental results show that the methodology is
highly scalable and gives efficient results with respect to attributes and data items.
Sarfaraz et al. (2014) analysed and generated reports for the evaluation of the
bio-artificial liver reactor. Here in this paper Fuzzy Analytic Hierarchy Process (FAHP) is
4 Proposed methodology
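$$Y = \sum_{j=1}^{p} W_j \exp\left(-\frac{\lVert x - c_j \rVert^2}{2\sigma_j^2}\right) \tag{4}$$

3. Calculate the error:

$$e = D - Y \tag{5}$$

where $D$ is the desired output and $Y$ is the actual output.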
5. Move Centres
The general methodology involves dividing the database into training and testing
datasets. The training dataset is used to train the system, and the testing dataset is
used to measure its performance.
The working of the RBF NN is summarised in the following steps:
Phase I: Training the RBF NN
Step 1: Collect a dataset.
Step 2: Divide the dataset into training and test sets.
Step 3: Set the training parameters (such as learning rate, momentum, etc.).
Step 4: Train the RBF NN structure.
Step 5: Obtain the accuracy and the weights between the layers.
Phase II: Testing process
Step 1: Obtain the test dataset.
Step 2: Apply the test dataset to the trained RBF NN classifier.
Step 3: Obtain the classification results.
The work was implemented on an i3 processor with 2.30 GHz speed, 2 GB RAM and 320 GB
external storage. The software used was JDK 8 (Java Development Kit) and the NetBeans 8.0
IDE, and the coding was done in Java. The Weka library was used for the computation of
the RBF NN and the various parameters. In the experimental studies, the dataset was
partitioned 70–30% (538–230 instances) for training and testing of the RBF NN and
GA_RBF NN methods for the diagnosis of diabetes. The experimental studies were performed
on the PIDD described in Section 4.1.1.
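The 70–30% partition can be illustrated with a short Python sketch. It only reproduces the split sizes; the function name `split_dataset`, the shuffling step and the seed are assumptions, since the way the instances were assigned to the two sets is not described here.

```python
import random


def split_dataset(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows (assumed step) and split them into training and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


# With the 768 instances of PIDD this gives 538 training and 230 test rows.
train_rows, test_rows = split_dataset(range(768))
```

The training rows are then used to fit the classifier and the test rows are held back to measure its performance.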
The results of the proposed system, i.e., GA_RBF NN and RBF NN, were compared with the
results reported for earlier methods (Ganji and Abadeh, 2011; Seera and Lim, 2014). As
per Table 2, applying the GA approach selects 4 of the 8 attributes. This means that the
cost has been reduced from 1 to s(x) = 4/8 = 0.5, an improvement in training and
classification by a factor of 2.
Confusion Matrix: A confusion matrix (Polat and Gunes, 2007; Polat et al., 2008)
contains information regarding actual and predicted classifications done by a
classification system.
Kappa statistic: It is a performance measure of the true classification accuracy of the
algorithm, corrected for the agreement expected by chance.

$$K = \frac{T_O - T_C}{1 - T_C} \tag{18}$$

where $T_O$ is the total agreement probability and $T_C$ is the agreement probability due
to chance.
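As a worked illustration, the Kappa statistic can be computed directly from a confusion matrix. The Python sketch below assumes that the chance agreement is estimated from the row and column totals (Cohen's kappa); the function name `kappa_statistic` is introduced here for illustration only.

```python
def kappa_statistic(matrix):
    """Kappa for a square confusion matrix (rows = actual, columns = predicted)."""
    n = sum(sum(row) for row in matrix)
    # Total agreement probability: fraction of instances on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement probability, estimated from the marginal totals.
    chance = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - chance) / (1 - chance)


# Testing-set confusion matrix reported for GA_RBF NN in this paper.
kappa_test = kappa_statistic([[133, 18], [34, 45]])
```

For this matrix the total agreement is 178/230 and the chance agreement is 30194/52900, which gives a Kappa of approximately 0.473 under the stated assumption.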
Mean Absolute Error (MAE): MAE is the average of the absolute errors. It measures how
close forecasts or predictions are to the eventual outcomes and is a common measure of
forecast error. MAE can be compared between models whose errors are measured in the same
units. It is usually similar in magnitude to RMSE, but slightly smaller.
It is defined as:

$$\mathrm{MAE} = \frac{|t_1 - q_1| + \cdots + |t_n - q_n|}{n} \tag{19}$$
Root Mean-Squared Error (RMSE): RMSE is the square root of the mean of the squared
errors. It is used to assess how well a system learns a given model. RMSE can be compared
between models whose errors are measured in the same units.
It is defined as:

$$\mathrm{RMSE} = \sqrt{\frac{(t_1 - q_1)^2 + \cdots + (t_n - q_n)^2}{n}} \tag{20}$$
Relative Absolute Error (RAE): Like RSE, RAE can be compared between models whose errors
are measured in different units.
It is defined as:

$$\mathrm{RAE} = \frac{|t_1 - q_1| + \cdots + |t_n - q_n|}{|\bar{q} - q_1| + \cdots + |\bar{q} - q_n|} \tag{21}$$
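Relative Squared Error (RSE): Unlike RMSE, RSE can be compared between models whose
errors are measured in different units.
It is defined as:

$$\mathrm{RSE} = \frac{(t_1 - q_1)^2 + \cdots + (t_n - q_n)^2}{(\bar{q} - q_1)^2 + \cdots + (\bar{q} - q_n)^2} \tag{22}$$

where $q_1, q_2, \ldots, q_n$ are the actual target values, $t_1, t_2, \ldots, t_n$ are the
predicted target values, and $\bar{q}$ is the mean of the actual target values.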
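The four error measures defined above can be computed together, as in the following Python sketch. The function name `error_metrics` and the returned dictionary are illustrative choices; the sketch follows the definitions in this section, with the actual values as the reference for the mean.

```python
import math


def error_metrics(actual, predicted):
    """Return MAE, RMSE, RAE and RSE for paired actual/predicted values."""
    n = len(actual)
    mean_actual = sum(actual) / n
    abs_err = sum(abs(t - q) for t, q in zip(predicted, actual))
    sq_err = sum((t - q) ** 2 for t, q in zip(predicted, actual))
    # Baseline errors of always predicting the mean of the actual values.
    # These are zero when all actual values are equal, in which case RAE
    # and RSE are undefined.
    abs_dev = sum(abs(mean_actual - q) for q in actual)
    sq_dev = sum((mean_actual - q) ** 2 for q in actual)
    return {
        "MAE": abs_err / n,
        "RMSE": math.sqrt(sq_err / n),
        "RAE": abs_err / abs_dev,
        "RSE": sq_err / sq_dev,
    }
```

Because RAE and RSE divide by the error of a mean-only baseline, a value of 1 means the model is no better than always predicting the mean, and values below 1 indicate an improvement over that baseline.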
For the RBF NN method, the time taken to build the model was 0.52 seconds for the
training set evaluation and 0.34 seconds for the testing set evaluation. Table 1 shows
the results of both the training set and testing set evaluation using the RBF NN method
for PIDD, based on the parameters noted below:
Figure 4 is the ROC graph for the tested_positive class using the RBF NN method on PIDD.
It may be seen that the method generates a low error rate.
Figure 4 ROC graph for tested_positive class by using RBF NN method on PIDD
For the GA_RBF NN methodology, the time taken to build the model was 0.21 seconds for the
training set evaluation and 0.11 seconds for the testing set evaluation. Table 3 shows
the results of both the training set and testing set evaluation using the RBF NN method
for PIDD on the attributes selected by GA, based on the parameters noted below:
Confusion Matrix for Training set
   a   b   <-- classified as
 307  42 | a = tested_negative
  84 105 | b = tested_positive
Confusion Matrix for Testing set
   a   b   <-- classified as
 133  18 | a = tested_negative
  34  45 | b = tested_positive
Table 3 Results of GA_RBF NN for PIDD
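The accuracy, precision, recall and F-Measure follow directly from a confusion matrix of this form. The Python sketch below is an illustration only; the function name `classification_summary` is hypothetical, and tested_positive (index 1) is assumed to be the class of interest.

```python
def classification_summary(matrix, positive=1):
    """Summarise a confusion matrix; matrix[i][j] counts class i classified as j."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[positive][positive]                    # true positives
    fn = sum(matrix[positive]) - tp                    # positives missed
    fp = sum(row[positive] for row in matrix) - tp     # negatives flagged
    accuracy = sum(matrix[i][i] for i in range(len(matrix))) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Confusion matrices reported above for GA_RBF NN.
train_summary = classification_summary([[307, 42], [84, 105]])
test_summary = classification_summary([[133, 18], [34, 45]])
```

For example, the testing-set matrix gives an accuracy of 178/230 (about 77.4%), since 133 + 45 of the 230 test instances lie on the diagonal.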
Figure 5 is the ROC graph for the tested_positive class using the GA_RBF NN methodology
on PIDD. Figure 5 indicates that GA_RBF NN generates a lower error rate than RBF NN in
Figure 4.
Figure 5 ROC graph for tested_positive class by using GA_RBF NN methodology on PIDD
Table 4 compares the results with and without GA on RBF NN for PIDD on several measures,
along with the other methods noted in the table.
Table 4 Evaluation of RBF NN and GA_RBF NN along with the performance of several existing
methods for PIDD
In Table 4, it may be seen that with GA every measure improves in the case of RBF NN. For
the other methods in the table, i.e., J48graft DT, GA_J48graft DT, MLP NN and GA_MLP NN,
implemented by Dilip Kumar Choubey et al., the earlier publications report Precision,
Recall, F-Measure, Accuracy and ROC values but not the Kappa statistic, MAE, RMSE, RAE
and RRSE. These methods were therefore implemented again to obtain the missing values,
i.e., the Kappa statistic, MAE, RMSE, RAE and RRSE.
Figure 6 compares the results with and without GA for several methods, i.e., J48graft DT,
MLP NN and RBF NN, for PIDD.
Figure 6 Evaluation of J48graft DT, GA_J48graft DT, MLP NN, GA_MLP NN, RBF NN and
GA_RBF NN Performance for PIDD
Figure 6 represents the measures of Table 5 in chart (histogram) form and shows the
differences between the methods listed in the table more clearly.
Table 5 Results and comparison of accuracy with other existing methods for the PIDD
Several existing methods have already been implemented on PIDD. Table 5 compares the
results in terms of accuracy on PIDD for the diagnosis of diabetes.
Table 6 compares the results in terms of ROC on PIDD for the diagnosis of diabetes. It
may be seen in Table 6 that the proposed method provides a better ROC than almost all
other existing methods.
Table 6 Results and comparison of ROC with other existing methods for the PIDD
Diabetes means that blood sugar is above the desired level on a sustained basis. It is
one of the world's most widespread diseases and is now very common. According to the
'Diabetes Atlas 2013' released by the International Diabetes Federation, there are 382
million people in the world with diabetes, and this is projected to increase to 592
million by the year 2035. After China (98.4 million), India has the largest number of
individuals with diabetes in the world (65.1 million). Diabetes contributes to blindness,
high blood pressure, heart disease, kidney disease, nerve damage, etc., which are
hazardous to health. In this paper, classification was first performed on PIDD using
RBF NN; GA was then used for attribute selection, and classification was performed on the
selected attributes. The proposed method reduces the computation cost and computation
time and gives higher ROC and classification accuracy than most of the other existing
methods, as may be seen in Tables 5 and 6. From Table 4 it is clear that with the
attribute selection method (GA), every measure improves. The proposed method will help
physicians to make accurate decisions quickly.
For future research, we suggest developing an expert system for diabetes that provides
good ROC, classification accuracy, precision, recall, F-Measure, Kappa statistic, MAE,
RMSE, RAE and RRSE. This can be achieved only by using different attribute selection and
classification methods, and it could significantly decrease healthcare costs via early
prediction and diagnosis of diabetes. The proposed method can also be used for other
kinds of diseases, but it is not certain that for every disease the results will equal or
exceed those reported in this paper. More interesting results may also emerge from
further exploration of the dataset.
References
Aslam, M.W., Zhu, Z. and Nandi, A.K. (2013) ‘Feature generation using genetic programming
with comparative partner selection for diabetes classification’, Elsevier: Expert Systems with
Applications, Vol. 40, pp.5402–5412.
Barakat, N.H., Bradley, A.P. and Barakat, M.N.H. (2010) ‘Intelligible support vector machines for
diagnosis of diabetes mellitus’, IEEE Transactions on Information Technology in
Biomedicine, Vol. 14, No. 4, pp.1114–1120.
Choubey, D.K. and Paul, S. (2015) ‘Classification techniques for diagnosis of diabetes disease: a
review’, International Journal of Biomedical Engineering and Technology, ISSN: 1752-6418
(Print), ISSN: 1752-6426.
Choubey, D.K. and Paul, S. (2015) ‘GA_J48graft DT: a hybrid intelligent system for diabetes
disease diagnosis’, SERSC: International Journal of Bio-Science and Bio-Technology, ISSN:
2233-7849, Vol. 7, No. 5, pp.135–150.
Choubey, D.K. and Paul, S. (2016) ‘GA_MLP NN: a hybrid intelligent system for diabetes disease
diagnosis’, MECS: International Journal of Intelligent Systems and Applications, Vol. 8,
No. 1, pp.49–59.
Choubey, D.K., Paul, S. and Bhattacharjee, J. (2014) ‘Soft computing approaches for diabetes
disease diagnosis: a survey’, International Journal of Applied Engineering Research, Vol. 9,
pp.11715–11726.
Das, S., Ghosh, P.K. and Kar, S. (2013) ‘Hypertension diagnosis: a comparative study using fuzzy
expert system and neuro fuzzy system’, IEEE.
Darwin, C. (1859) On the Origin of Species by Means of Natural Selection, Murray, London, UK.
Dogantekin, E., Dogantekin, A., Avci, D. and Avci, L. (2010) ‘An intelligent diagnosis system for
diabetes on linear discriminant analysis and adaptive network based fuzzy inference system:
LDA – ANFIS’, Digital Signal Processing, Vol. 20, No. 4, pp.1248–1255.
Ephzibah, E.P. (2011) ‘Cost effective approach on feature selection using genetic algorithms and
fuzzy logic for diabetes diagnosis,’ International Journal on Soft Computing, Vol. 2, No. 1.
Ganji, M.F. and Abadeh, M.S. (2011) ‘A fuzzy classification system based on ant colony
optimization for diabetes disease diagnosis,’ Elsevier: Expert Systems with Applications,
Vol. 38, pp.14650–14659.
Ganji, M.F. and Abadeh, M.S. (2010) ‘Using fuzzy ant colony optimization for diagnosis of
diabetes disease’, IEEE: Proceedings of ICEE, 11–13 May.
Goldberg, D.E. (1989) Genetic Algorithms in Search, Optimization, and Machine Learning,
Addison-Wesley, Reading, MA.
Goncalves, L.B., Bernardes, M.M. and Vellasco, R. (2006) ‘Inverted hierarchical neuro-fuzzy BSP
system: a novel neuro-fuzzy model for pattern classification and rule extraction in databases’,
IEEE Transactions on Systems, Man, and Cybernetics—Part C: Applications and Reviews,
Vol. 36, No. 2.
Holland, J.H. (1975) Adaptation in Natural and Artificial Systems, The University of Michigan
Press, Ann Arbor, MI.
Jayalakshmi, T. and Santhakumaran, A. (2010) ‘A novel classification method for diagnosis of
diabetes mellitus using artificial neural networks’, International Conference on Data Storage
and Data Engineering (DSDE), Bangalore, India, pp.159–163.
Kahramanli, H. and Allahverdi, N. (2008) ‘Design of a hybrid system for the diabetes and heart
diseases’, Elsevier: Expert Systems with Applications, Vol. 35, pp.82–89.
Kala, R., Janghel, R.R., Tiwari, R. and Shukla, A. (2011) ‘Diagnosis of breast cancer by modular
evolutionary neural networks’, Inderscience: International Journal of Biomedical Engineering
and Technology, Vol. 7, No. 2, pp.194–211.
Kalaiselvi, C. and Nasira, G.M. (2014) ‘A new approach for diagnosis of diabetes and prediction
of cancer using ANFIS’, IEEE: World Congress on Computing and Communication
Technologies.
Karatsiolis, S. and Schizas, C.N. (2012) ‘Region based support vector machine algorithm for
medical diagnosis on pima indian diabetes dataset’, Proceedings of the IEEE 12th
International Conference on Bioinformatics & Bioengineering (BIBE), Larnaca, Cyprus,
pp.11–13.
Karegowda, A.G., Manjunath, A.S. and Jayaram, M.A. (2011) ‘Application of genetic algorithm
optimized neural network connection weights for medical diagnosis of pima indians diabetes’,
International Journal on Soft Computing, Vol. 2, No. 2.
Kayaer, K. and Yildirim, T. (2003) ‘Medical diagnosis on pima indian diabetes using general
regression neural networks’, IEEE.
Lee, C.-S. (2011) ‘A fuzzy expert system for diabetes decision support application’, IEEE:
Transactions on Systems, Man, and Cybernetics—Part B: Cybernetics, Vol. 41, No. 1,
pp.139–153.
Lukka, P. (2011) ‘Feature selection using fuzzy entropy measures with similarity classifier’,
Elsevier: Expert Systems with Applications, Vol. 38, pp.4600–4607.
Michalewicz, Z. (1996) Genetic Algorithms + Data Structures = Evolution Programs, Springer.
Miller, T. and Leroy, G. (2008) ‘Dynamic generation of a health topics overview from consumer
health information documents’, International Journal of Biomedical Engineering and
Technology, Vol. 1, No. 4, pp.395–414.
Orkcu, H.H. and Bal, H. (2011) ‘Comparing performances of backpropagation and genetic
algorithms in the data classification’, Elsevier: Expert Systems with Applications, Vol. 38,
pp.3703–3709.
Polat, K. and Gunes, S. (2007) ‘An expert system approach based on principal component analysis
and adaptive neuro-fuzzy inference system to diagnosis of diabetes disease’, Elsevier: Digital
Signal Processing, Vol. 17, pp.702–710.
Polat, K., Gunes, S. and Arslan, A. (2008) ‘A cascade learning system for classification of diabetes
disease: generalized discriminant analysis and least square support vector machine’, Elsevier:
Expert Systems with Applications, Vol. 34, pp.482–487.
Qasem, S.N. and Shamsuddin, S.M. (2011) ‘Radial basis function network based on time variant
multi objective particle swarm optimization for medical diseases diagnosis’, Elsevier: Applied
Soft Computing, Vol. 11, pp.1427–1438.
Sarfaraz, A., Bonk, R. and Jenab, K. (2014) ‘A bio-artificial liver reactor evaluation method’,
Inderscience: International Journal of Biomedical Engineering and Technology, Vol. 14,
No. 1, pp.1–12.
Seera, M. and Lim, C.P. (2014) ‘A hybrid intelligent system for medical data classification’,
Elsevier: Expert Systems with Applications, Vol. 41, pp.2239–2249.
Selvakuberan, K., Kayathiri, D., Harini, B. and Devi, M.I. (2011) ‘An efficient feature selection
method for classification in health care systems using machine learning techniques’, IEEE.
Temurtas, H., Yumusak, N. and Temurtas, F. (2009) ‘A comparative study on diabetes disease
diagnosis using neural networks’, Elsevier: Expert Systems With Applications, Vol. 36,
pp.8610–8615.
UCI Repository of Bioinformatics Databases [online] Available online at: http://www.ics.uci.edu./
~mlearn/MLRepository.html