
Diabetes mellitus is a group of metabolic diseases characterized by high blood sugar (glucose) levels that
result from defects in insulin secretion, insulin action, or both. Glucose is a simple sugar found in food. It is an
essential nutrient that provides energy for the proper functioning of the body cells. Carbohydrates are broken
down in the small intestine and the glucose in digested food is then absorbed by the intestinal cells into the
bloodstream, and is carried by the bloodstream to all the cells in the body where it is utilized. However, glucose
cannot enter the cells alone and needs insulin to aid in its transport into the cells. Without insulin, the cells
become starved of glucose energy despite the presence of abundant glucose in the bloodstream. In certain types
of diabetes, the cells' inability to utilize glucose gives rise to the ironic situation of "starvation in the midst of
plenty". The abundant, unutilized glucose is wastefully excreted in the urine. Insulin is a hormone that is
produced by specialized cells (beta cells) of the pancreas. The pancreas is a deep-seated organ in the abdomen
located behind the stomach. In addition to helping glucose enter the cells, insulin is also important in tightly
regulating the level of glucose in the blood. After a meal, the blood glucose level rises. In response to the
increased glucose level, the pancreas normally releases more insulin into the blood stream to help glucose enter
the cells and lower blood glucose levels after a meal. When the blood glucose levels are lowered, the insulin
release from the pancreas is turned down. It is important to note that even in the fasting state there is a low
steady release of insulin that fluctuates a bit and helps to maintain a steady blood sugar level during fasting. In
normal individuals, such a regulatory system helps to keep blood glucose levels in a tightly controlled range. In
patients with diabetes, the absence or insufficient production of insulin causes hyperglycemia. Diabetes is a
chronic medical condition, meaning that although it can be controlled, it lasts a lifetime. No complete cure for
this disease is available, but it can be controlled through regulated eating habits and exercise. Diabetes can
be caused by genetics, obesity or lack of exercise. There are two main types of diabetes: Type I diabetes, or
juvenile diabetes, usually diagnosed in children and young adults, and Type II diabetes, or adult-onset diabetes
[2]. Adult-onset diabetes is associated with obesity. Over time, diabetes can lead to blindness, kidney
failure, and nerve damage. These types of damage are the result of damage to small vessels, referred to as
microvascular disease. Diabetes is also an important factor in accelerating the hardening and narrowing of the
arteries (atherosclerosis), leading to strokes, coronary heart disease, and other large blood vessel diseases.
Diabetes is the third leading cause of death after heart disease and cancer.

The prevalence of diabetes has reached epidemic proportions. The World Health Organization (WHO) predicts
that developing countries will bear the brunt of this epidemic in the 21st century. Currently more than 70% of
people with diabetes live in low- and middle-income countries. India has the world’s largest diabetes population,
followed by China with 43.2 million. The largest age group currently affected by diabetes is 40-59
years. By 2030 this “record” is expected to move to the 60-79 age group with some 196 million cases. In
developing countries, less than half of people with diabetes are diagnosed. Without timely diagnosis and
adequate treatment, complications and morbidity from diabetes rise exponentially. Type 2 diabetes can remain
undetected for many years and the diagnosis is often made from associated complications or incidentally
through an abnormal blood or urine glucose test.
The financial burden borne by people with diabetes and their families as a result of their disease depends on
their economic status and the social insurance policies of their countries. In the poorest countries, people with
diabetes and their families bear almost the whole cost of the medical care they can afford. Expressed in
International Dollars (ID), estimated global expenditures on diabetes will be at least ID 418 billion in 2010, and
at least ID 561 billion in 2030. An estimated average of ID 878 per person will be spent on diabetes in 2010
globally. Besides excess healthcare expenditure, diabetes also imposes large economic burdens in the form of
lost productivity and foregone economic growth. The largest economic burden is the monetary value associated
with disability and loss of life as a result of the disease itself and its related complications. The World Health
Organization (WHO) predicted net losses in national income from diabetes and cardiovascular disease of ID
336.6 billion in India between 2005 and 2015. Unless addressed, the mortality and disease burden from diabetes
will continue to increase.

Diabetes Diagnosis using Neural Networks

Diagnosing diabetes through proper interpretation of the diabetes data is an important medical classification
problem. A physician’s diagnosis of diabetes is difficult because many factors have to be considered. Further, the decision
may be influenced by human factors like fatigue or inexperience. The use of artificial neural
networks (ANN) for accurate and fast classification is therefore an active area of research.
ANNs can perform computations at a very high rate because of their massive parallelism and highly connected
structure. Although an ANN takes time to train, it supplies instant results in its implementation phase.
The strengths of ANNs are the ability to solve data-intensive problems, adaptation, learning and generalization,
parallel computing and non-linearity. An ANN consists of many nodes called neurons with weighted
interconnections (links) between them. The incoming signals (xi) are multiplied by the corresponding weights
(wi) of the links and a bias term (b) is added to form the net input at the neuron, which is subjected to a nonlinear
function such as the sigmoid, as depicted in Fig. 1.3.

Fig. 1.3 A single neuron as a classifier
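
The single-neuron computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names (sigmoid, neuron_output, classify) and the 0.5 decision threshold are assumptions introduced here.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid: maps a real-valued net input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(x: list[float], w: list[float], b: float) -> float:
    """Multiply each input xi by its weight wi, add the bias b, apply the sigmoid."""
    net = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(net)


def classify(x: list[float], w: list[float], b: float, threshold: float = 0.5) -> int:
    """Turn the neuron output into a class label (threshold is an assumed choice)."""
    return 1 if neuron_output(x, w, b) >= threshold else 0
```

With all weights and the bias set to zero the net input is zero, so the neuron output sits exactly on the decision boundary at 0.5.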


A number of nodes are connected in parallel to form a network. The emphasis is on tuning weights
automatically to minimize an error function. This is done by learning from the past set of data fed to the
network so that the knowledge can be applied to future decision making (known as generalization).
The multilayer structure of a feed forward Multilayer Perceptron (MLP) network, shown in Fig. 1.4, is composed
of an input layer, an output layer and one or more hidden layers. The structure of a system applying an MLP
network is simple. The node outputs from each layer are used directly as the inputs to the nodes of the successive
layer. The number of nodes as well as the transfer functions are allowed to differ from layer to layer.
Through the multilayer structure one can attain pattern classification. Generally the back propagation
(BP) algorithm [24]-[25] trains the MLP networks. One of the advantages of the BP algorithm is that its
hardware circuit can be easily realized.
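
The layer-by-layer structure described above can be sketched as a forward pass in which each layer's outputs become the next layer's inputs. The layout below (each layer stored as a pair of weight rows and biases, a sigmoid in every layer, and a hypothetical 8-3-2 network) is an assumption made for illustration, not a configuration taken from the paper.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid used here as the transfer function of every node."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(inputs: list[float], weights: list[list[float]], biases: list[float]) -> list[float]:
    """Compute one layer; weights[j] holds the incoming weights of node j."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs: list[float], layers: list[tuple[list[list[float]], list[float]]]) -> list[float]:
    """Feed the outputs of each layer directly into the next layer."""
    activation = inputs
    for weights, biases in layers:
        activation = layer_forward(activation, weights, biases)
    return activation


# Hypothetical network: 8 inputs, 3 hidden nodes, 2 outputs, all weights zero.
hidden = ([[0.0] * 8 for _ in range(3)], [0.0] * 3)
output = ([[0.0] * 3 for _ in range(2)], [0.0] * 2)
result = mlp_forward([1.0] * 8, [hidden, output])
```

Because the layers are held in a plain list, the number of nodes per layer can differ freely, which mirrors the flexibility noted in the text.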

The back propagation is a supervised learning algorithm, where the network is trained to minimise the error
between the desired output (yd) and the actual output (y), for a given input signal. The mean square error is
computed as

(1.8)

where, is the number of input patterns.
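
For reference, the mean square error between desired and actual outputs is commonly written in the form below. The symbols N (number of input patterns) and p (pattern index) are assumptions chosen here for illustration, and some texts include an extra factor of 1/2; the notation used in Eq. (1.8) may differ.

```latex
E = \frac{1}{N} \sum_{p=1}^{N} \left( y_d^{(p)} - y^{(p)} \right)^2
```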


The weight update equation is given by

(1.9)

where, is the learning rate. The bias is updated as


(1.10)

(1.11)

(1.12)
The weight updating is stopped when the error reaches an acceptably small value.
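
The training procedure described above can be illustrated with a gradient-descent sketch for a single sigmoid neuron. It uses the standard textbook rule of moving each weight and the bias against the gradient of the mean square error, scaled by a learning rate; it is not a reconstruction of the paper's own equations. The names train_neuron, eta, tol and max_epochs, and the toy OR data, are assumptions introduced here.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict(x: list[float], w: list[float], b: float) -> float:
    """Output y of a single sigmoid neuron."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def train_neuron(samples, targets, eta=0.5, max_epochs=5000, tol=0.01):
    """Batch gradient descent on the mean square error between yd and y."""
    n = len(samples)
    w = [0.0] * len(samples[0])
    b = 0.0
    mse = float("inf")
    for _ in range(max_epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        sq_err = 0.0
        for x, yd in zip(samples, targets):
            y = predict(x, w, b)
            err = yd - y
            sq_err += err * err
            # Derivative of (yd - y)^2 with respect to the net input of the sigmoid.
            delta = -2.0 * err * y * (1.0 - y)
            for i, xi in enumerate(x):
                grad_w[i] += delta * xi
            grad_b += delta
        mse = sq_err / n
        if mse <= tol:  # stop once the error is acceptably small
            break
        w = [wi - eta * g / n for wi, g in zip(w, grad_w)]
        b -= eta * grad_b / n
    return w, b, mse


# Toy data (logical OR), used only to exercise the update rule.
samples = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
targets = [0.0, 1.0, 1.0, 1.0]
w, b, mse = train_neuron(samples, targets)
```

A full MLP applies the same idea layer by layer, propagating the error terms backwards from the output layer to the hidden layers.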

Diabetes Disease Data

The neural network model for the diagnosis of diabetes is developed using the Pima Indian Diabetes dataset,
which can be acquired from the UCI Machine Learning Repository. It contains 768 samples and poses a
two-class problem: to diagnose whether a patient would test positive or negative for diabetes. In
this database, all the patients are Pima Indian women at least 21 years old living near Phoenix, Arizona, USA.
This is the most commonly used data set for studying and comparing approaches to the diabetes diagnosis problem. There are eight
features or attributes for each sample.

S.No. Attribute
01. Number of times pregnant
02. Plasma glucose concentration at 2 h in an oral glucose tolerance test
03. Diastolic blood pressure (mm Hg)
04. Triceps skin fold thickness (mm)
05. 2-h serum insulin (μU/ml)
06. Body mass index (weight in kg/(height in m)^2)
07. Diabetes pedigree function
08. Age (years)
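
As a rough illustration of how one sample with these eight attributes might be held in code, the sketch below parses a single comma-separated row into a feature vector and a class label. The file layout (eight values followed by a 0/1 label), the identifier names and the example row are all assumptions made here; the row contains invented values, not data from the dataset.

```python
FEATURE_NAMES = [
    "times_pregnant",
    "plasma_glucose_2h",
    "diastolic_blood_pressure_mm_hg",
    "triceps_skin_fold_mm",
    "serum_insulin_2h",
    "body_mass_index",
    "diabetes_pedigree_function",
    "age_years",
]


def parse_record(line: str) -> tuple[list[float], int]:
    """Split one row into 8 feature values and a class label (assumed 0 or 1)."""
    fields = [field.strip() for field in line.split(",")]
    if len(fields) != len(FEATURE_NAMES) + 1:
        raise ValueError(f"expected {len(FEATURE_NAMES) + 1} fields, got {len(fields)}")
    features = [float(value) for value in fields[:-1]]
    label = int(fields[-1])
    return features, label


# Invented example row, for illustration only.
features, label = parse_record("2,120,70,30,100,28.5,0.45,35,0")
record = dict(zip(FEATURE_NAMES, features))
```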

The above features form the 8 inputs to the neural network model, and the output layer can have 2 outputs, one for
normal and one for diabetes detection. The 768 samples are classified as:
Class 1: normal (500 samples)
Class 2: diabetes (268 samples)
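
With two output nodes, one common way to represent the class is a one-of-two target vector, with the predicted class read off as the node with the larger output. The sketch below assumes the label coding 0 for normal and 1 for diabetes; both the coding and the function names are illustrative choices, not taken from the paper.

```python
def encode_target(label: int) -> list[float]:
    """Map a class label to a 2-element target: [1, 0] normal, [0, 1] diabetes."""
    if label not in (0, 1):
        raise ValueError("label must be 0 (normal) or 1 (diabetes)")
    return [1.0, 0.0] if label == 0 else [0.0, 1.0]


def decode_output(outputs: list[float]) -> int:
    """Pick the index of the output node with the largest activation."""
    return max(range(len(outputs)), key=lambda i: outputs[i])
```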

NN Algorithm Implementation

Neural network training can be made more efficient by performing certain pre-processing steps on the network
inputs and targets. Missing data poses difficulty in the analysis and decision-making processes,
so missing values are replaced before the data is applied to the NN model. Without this pre-processing, training the neural
networks would have been very slow. Pre-processing can also be used to scale the data to the same range of values for each input
feature in order to minimize bias within the neural network towards one feature over another. It can
also shorten training time by starting the training process for each feature within the same scale. This is especially
useful for modelling applications where the inputs are generally on widely different scales. Principal component analysis (PCA) is a very popular
pre-processing method. Polat and Gunes have used a neuro-fuzzy inference system and principal component
analysis for the diabetes diagnosis problem. A variety of related algorithms have been introduced to address that
problem. The Levenberg–Marquardt (LM) algorithm generally provides faster convergence and better estimation
results than other training algorithms (Gulbag, 2006; Gulbag & Temurtas, 2006). Although the LM algorithm
converges very fast, it can cause a memorization effect when overtraining occurs. If a neural network
starts to memorize the training set, its generalization starts to decrease and its performance may not improve
on unseen test sets. Temurtas et al. have used a probabilistic neural network structure with the LM algorithm to
solve the problem of overtraining.
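
The two pre-processing steps mentioned above, replacing missing values and scaling every feature to a common range, can be sketched as follows. The paper does not say how missing values are replaced or which scaling is used, so the choices here (missing entries marked as None and replaced by the column mean, then min-max scaling to [0, 1]) are assumptions for illustration.

```python
from typing import Optional


def impute_with_mean(column: list[Optional[float]]) -> list[float]:
    """Replace missing entries (None) with the mean of the known values."""
    known = [v for v in column if v is not None]
    if not known:
        raise ValueError("column has no known values to average")
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in column]


def min_max_scale(column: list[float]) -> list[float]:
    """Rescale a feature column linearly so its values lie in [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0] * len(column)  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]
```

Applying both functions column by column puts all eight inputs on the same scale before they reach the network.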
Block diagram of neural network based diabetes diagnosis

Performance Evaluation through Classification Accuracy

The classification accuracy for Pima diabetes data set using various neural network algorithms is
tabulated below.

Method                                                                  Classification Accuracy (in %)
Evolving self organising maps                                           78.4
Principal Component Analysis – Adaptive Neuro Fuzzy Inference System    89.47
Least Square Support Vector Machine                                     78.21
Multi Layer Neural Network – LM                                         79.62
General Regression Neural Network                                       80.21
Conclusion
In this paper a comparative study of neural network methods is presented for the diagnosis of diabetes in the
Pima Indian data set. The data set can be pre-processed using techniques like principal component analysis to
improve classification accuracy. Soft computing techniques like fuzzy systems can be combined with neural
networks to give better results.

Abstract

Diabetes occurs when the body is unable to produce or properly utilise insulin. Without timely
diagnosis and adequate treatment, complications and morbidity from diabetes rise exponentially.
Type 2 diabetes can remain undetected for many years. Neural network techniques have been
successfully applied to the diagnosis of many medical problems. In this study we compare various
neural network techniques for the diagnosis of diabetes. The Pima Indian data set is used to study the
classification accuracy of the neural network algorithms.
