
Neural Networks

journal homepage: www.elsevier.com/locate/neunet

Generalized classifier neural network

Buse Melis Ozyildirim a,*, Mutlu Avci b

a Department of Computer Engineering, University of Adana Science and Technology, Adana, Turkey

* Corresponding author. E-mail addresses: melis.ozyildirim@gmail.com (B.M. Ozyildirim), mavci@cu.edu.tr (M. Avci).

doi:10.1016/j.neunet.2012.12.001

Article info

Article history:
Received 14 April 2012
Received in revised form 7 November 2012
Accepted 3 December 2012

Keywords:

GCNN

GRNN

PNN

Classification neural networks

Gradient descent learning

Abstract

In this work a new radial basis function based classification neural network, named generalized classifier neural network, is proposed. The proposed generalized classifier neural network has five layers, unlike other radial basis function based neural networks such as the generalized regression neural network and the probabilistic neural network. They are the input, pattern, summation, normalization and output layers. In addition to the topological difference, the proposed neural network has a gradient descent based optimization approach for the smoothing parameter and a calculation improvement through an added diverge effect term. The diverge effect term is an improvement on the summation layer calculation to supply additional separation ability and flexibility. Performance of the generalized classifier neural network is compared with that of the probabilistic neural network, the multilayer perceptron algorithm and the radial basis function neural network on 9 different data sets, and with that of the generalized regression neural network on 3 different data sets that include only two classes, in the MATLAB environment. Better classification performance, by up to 89%, is observed. The improved classification performances prove the effectiveness of the proposed neural network.

© 2012 Elsevier Ltd. All rights reserved.

1. Introduction

Pattern classification problems are important application areas

of neural networks used as learning systems (Al-Daoud, 2009;

Bartlett, 1998; Specht, 1990). Multilayer perceptrons (MLP), radial

basis functions (RBF), probabilistic neural networks (PNN), self-organizing maps (SOM), cellular neural networks (CNN), recurrent neural networks and conic section function neural networks (CSFNN) are some of these neural networks. In addition to

classification problems, function approximation problems are

also solved with neural networks. Generalized regression neural

network (GRNN) is one of the most popular neural networks used for function approximation. GRNN and PNN are kinds of radial basis function neural networks (RBFNN) with one pass learning (Al-Daoud, 2009). Although they are similar, PNN is used for classification whereas GRNN is used for continuous function approximation (Mosier & Jurs, 2002).

PNN, introduced by Donald F. Specht in 1990 (Specht, 1990), has been used for various classification problems ever since (Adeli & Panakkat, 2009; Hajmeer & Basheer, 2002; Kailun, Huijun, & Maohua, 2010; Zhu & Hao, 2009). Since the performance of PNN is related to the smoothing parameter and the size of the network, studies have focused on the optimization of smoothing parameters and the topology of the neural network (Berthold & Diamond, 1998; Mao, Tan, & Set, 2000; Montana, 1992; Rutkowski, 2004). Genetic algorithm is one

of the optimization methods used for smoothing parameter

identification (Mao et al., 2000). Automatic topology construction

is a solution to determine the appropriate size of the neural network. In Berthold and Diamond (1998), new hidden units are added to the PNN when necessary; thus large datasets are classified with

minimum PNN topology. In addition to topology construction,

smoothing parameter optimization is provided with dynamic

decay adjustment algorithm (Berthold & Diamond, 1998). Pattern

layer neurons also affect PNN performance. In Mao et al. (2000),

orthogonal algorithm is used to select the most representative

pattern layer neurons from training data. Affine transformations

of feature space cause problems for PNN performance. To deal

with these problems, anisotropic Gaussian is implemented in

Montana (1992). The anisotropic Gaussian form includes covariance in the exponential part, and training of this method is based on genetic

coding. Studies mentioned so far are related to static probability distributions; however, some probability distributions vary over time. In Rutkowski (2004), time-varying probabilistic

distribution problems are considered as prediction problems and

solved with adaptive PNN structure.

GRNN, also introduced by Donald F. Specht, in 1991 (Specht, 1991), is based on the Nadaraya–Watson kernel (Kiyan & Yildirim,

2004). GRNN is used for many applications such as prediction,

3D modeling (Amrouche & Rouvaen, 2006; Asad, Zhijiang, Lining,

Reza, & Fereidoun, 2007; Firat & Gungor, 2009; Kayaer & Yildirim,

2003; Kiyan & Yildirim, 2004; Popescu, Kanatas, Constantinou,

& Nafornita, 2002; Ren, Yang, Ji, & Tian, 2010; Wang & Sheng,

2010; Yildirim & Cigizoglu, 2002). Studies show that GRNN has

better function approximation performance than feedforward

networks and other statistical neural networks on some datasets

(Amrouche & Rouvaen, 2006; Firat & Gungor, 2009; Kayaer &

Yildirim, 2003; Kiyan & Yildirim, 2004; Ren et al., 2010; Wang

& Sheng, 2010; Yildirim & Cigizoglu, 2002). Although GRNN is

proposed for function approximation, some binary classification

applications exist (Kayaer & Yildirim, 2003; Kiyan & Yildirim,

2004). Large datasets cause complex and huge neural networks

and decrease the efficiency of GRNN. In addition to huge network

size, the smoothing parameter directly affects GRNN performance.

Determining the optimal smoothing parameter value (Ren et al., 2010) and decreasing the pattern layer size are the major problems of

GRNN. Clustering methods such as K-means, fuzzy means and fuzzy adaptive resonance theory, which reduce the number of neurons at the pattern layer by grouping data into clusters and using the centroids of these clusters, are utilized for GRNN (Lee, Lim, Yuen, & Lo, 2004; Specht, 1991; Zhao, Zhang, Li, & Song, 2007).

Feature extraction methods are also utilized for improving the

performance of GRNN (Erkmen & Yildirim, 2008). In Hoya and

Chambers (2001), growing and pruning processes are used for

finding the optimal number of neurons at the pattern layer. At the growing step, all misclassified data are added to the pattern layer iteratively until all data are correctly classified. At the pruning step, repetitive data are removed. The smoothing parameter is also updated at both

growing and pruning processes in accordance with the maximum

distance between input and patterns, number of pattern layer

neurons and number of output layer neurons (Hoya & Chambers,

2001). Gradient descent, Quasi-Newton optimization methods

and genetic algorithm are used for optimization of smoothing

parameter (Lee et al., 2004; Masters & Land, 1997). In Tomandl

and Schober (2001), GRNN is modified to be used for any form of

data. Modified GRNN (MGRNN) uses the relative distances between samples instead of the raw data. Training of MGRNN is provided with a

specific error function (Tomandl & Schober, 2001). In Yoo, Sikder,

Zhou, and Zomaya (2007), GRNN is improved for high dimensional

data by a linear dimensionality reduction method.

In this work, a new RBFNN based classification neural

network, named Generalized Classifier Neural Network (GCNN), is proposed. GCNN has five layers, named input, pattern, summation, normalization and output. A smoothing parameter is assigned to each pattern layer neuron. Smoothing parameters are updated so that the squared error of the winner neuron converges to the global minimum. GCNN uses target values for each pattern layer neuron and provides regression based effective classification. Increasing the distance among different classes provides better classification performance. For this purpose, a new term amplifying the target value effects by increasing the distance among classes is defined as the diverge effect term. It is included in the summation layer calculation. The summation layer contains two different types of neurons. One neuron of the first type is assigned to each class, and only one neuron of the second type is assigned for the denominator calculation. Neurons of the first type compute the sum of the products of pattern layer outputs and diverge effect terms. The normalization layer has N neurons, where N denotes the number of classes. In this layer, each neuron divides the corresponding first type neuron output by the second type neuron output of the summation layer. The output layer performs a competition among the normalization layer neurons. Smoothing parameters are optimized according to the squared error between the winner neuron's estimated value and the target value.

The proposed GCNN is tested with 9 data sets in the MATLAB environment. These are glass identification, Haberman's survival, two spiral problem, lenses, balance-scale, iris, breast-cancer-wisconsin (Bennett & Mangasarian, 1992; Mangasarian, Setiono, & Wolberg, 1990; Mangasarian & Wolberg, 1990; Wolberg & Mangasarian, 1990), E.coli and yeast data sets (Frank & Asuncion, 2010). Descriptions of the data sets are given in Table 1.

Table 1
Description of data sets.

Data set                  Attributes  Classes  Data
Glass                     10          7        214
Haberman's survival       3           2        306
Two spiral problem        2           2        328
Lenses                    4           3        24
Balance-scale             4           3        625
Iris                      4           3        150
Breast cancer wisconsin   10          2        699
E.coli                    8           8        336
Yeast                     8           10       1484


Table 2
10-fold cross validation classification performances (%). Columns report, in order: GCNN (with the average optimized sigma per fold), GRNN (sigma = 0.3), PNN (sigma = 0.3), GRNN (optimized sigma), PNN (optimized sigma), GRNN (sigma = 1), PNN (sigma = 0.1), MLP and RBF, for the nine data sets of Table 1; GRNN columns apply only to the binary data sets. (The individual cell values are scrambled beyond reliable recovery in this copy.)

Table 3
Training and test times (s, reported as training time/test time).

Data set                  GCNN               MLP              RBF
Glass                     10.9110/0.0873     2.0599/0.0607    10.1004/0.1005
Haberman's survival       11.8723/0.1334     1.2340/0.0620    14.3833/0.1011
Two spiral problem        14.3450/0.1955     1.2129/0.0606    11.0586/0.0854
Lenses                    0.1094/0.0014      0.8263/0.0521    0.9545/0.0802
Balance-scale             61.6890/0.4979     2.7872/0.0598    21.8738/0.0954
Iris                      3.3015/0.0382      1.3461/0.0563    0.6813/0.0983
Breast cancer wisconsin   60.8600/0.5141     1.7337/0.0552    37.7400/0.0883
E.coli                    32.0276/0.3169     3.1709/0.0637    1.9160/0.0949
Yeast                     680.5347/4.6442    14.1513/0.0801   174.1730/0.1344

The GRNN and PNN variants (sigma = 0.3, optimized sigma, sigma = 1 and sigma = 0.1) involve no training step; their times lie between 0.17 and 0.43 s for all data sets. (The per-column alignment of these values is not recoverable from this copy.)

Performance of GCNN is compared with that of the PNN and GRNN of the MATLAB Toolbox with varying and constant smoothing parameters, and with the Multi Layer Perceptron (MLP) and Radial Basis Function Neural Network (RBFNN) of the MATLAB Toolbox. According to these test results, GCNN is proposed as a new and effective classifier neural network.

2. Fundamental approaches for GCNN

Since GCNN, GRNN and PNN are all based on the radial basis function neural network, GCNN can be considered a close relative of GRNN and PNN. When they are compared according to their purposes, PNN and GCNN are used for classification, whereas GRNN is used for regression. Unlike PNN, GCNN is based on

regression methodology for effective classification. GCNN is

different from the others in its topology, diverge effect term and

training method. Gradient descent method is used as training

method in GCNN. GRNN, PNN and gradient descent training

method are briefly explained in the following subsections.

2.1. Generalized regression neural networks

GRNN is proposed for function approximation purposes (Amrouche & Rouvaen, 2006); however, in some works it is applied to classification problems (Amrouche & Rouvaen, 2006; Kayaer & Yildirim, 2003; Kiyan & Yildirim, 2004). Its advantages are fast learning, consistency and optimal regression with a large number of samples (Ren et al., 2010). GRNN has four layers: input, pattern, summation and output, as shown in Fig. 1 (Specht, 1991).

The input layer provides transmission of the input vector x to the pattern layer. The pattern layer consists of one neuron for each training datum or cluster centroid, and the distance is calculated according to (1). Any new input applied to the network is first subtracted from the pattern layer neuron values; then, according to the distance function, either the squares or the absolute values of the differences are summed and applied to the activation function. Generally, an exponential function is used as the activation function. The results are transferred to the summation layer. Neurons in the summation layer add dot products of pattern layer outputs and weights. In Fig. 1 the weights are shown by A and B; their values are determined by the y values of the training data stored at the pattern layer. $f(x)K$ denotes the weighted outputs of the pattern layer, where K is a constant associated with the Parzen window, and $Yf(x)K$ denotes the multiplication of the pattern layer outputs and the training data output values Y. At the output layer, $Yf(x)K$ is divided by $f(x)K$ to estimate the desired Y, as given in (2) and (3) (Al-Daoud, 2009; Erkmen & Yildirim, 2008; Specht, 1991):

$D_j = (x - t_j)^\top (x - t_j)$ (1)

$Y(x) = \frac{\int Y f(x, Y)\, dY}{\int f(x, Y)\, dY}$ (2)

$Y(x) = \frac{\sum_{j=1}^{p} y_j\, e^{-D_j/(2\sigma^2)}}{\sum_{j=1}^{p} e^{-D_j/(2\sigma^2)}}$ (3)

where the joint pdf $f(x, Y)$ in (2) is estimated with Parzen probability density functions as in (3), and p is the number of pattern layer neurons. In the GRNN structure, only the smoothing parameter ($\sigma$), also known as the bandwidth, is updated (Kiyan & Yildirim, 2004; Tomandl & Schober, 2001). The $\sigma$ value is important; a smaller $\sigma$ limits the number of effective samples, while a larger one extends the radius of effective neighbors (Amrouche & Rouvaen, 2006; Ren et al., 2010).
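For concreteness, the estimator of (1) and (3) can be sketched in a few lines of NumPy. This is an illustrative sketch only; the function and variable names are ours and it is not the original MATLAB implementation:

import numpy as np

def grnn_predict(x, train_X, train_y, sigma=0.3):
    # Squared distances D_j between the input and each stored pattern t_j, as in (1).
    D = np.sum((train_X - x) ** 2, axis=1)
    # Pattern layer activations exp(-D_j / (2 sigma^2)).
    w = np.exp(-D / (2.0 * sigma ** 2))
    # Summation and output layers: kernel-weighted average of stored targets, as in (3).
    return np.dot(w, train_y) / np.sum(w)

# Toy usage: approximate y = x^2 from 21 noiseless samples.
train_X = np.linspace(-1.0, 1.0, 21).reshape(-1, 1)
train_y = train_X.ravel() ** 2
print(grnn_predict(np.array([0.5]), train_X, train_y, sigma=0.1))  # close to 0.25

Shrinking sigma in this sketch reproduces the bandwidth behavior described above: fewer training samples receive appreciable weight, so the estimate follows the nearest stored targets more closely.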

2.2. Probabilistic neural network

Specht introduced an RBFNN based classification neural network named PNN (Specht, 1990). The PNN structure is shown in Fig. 2. According to the figure, the input layer holds the applied input values to be processed in the pattern layer. Each pattern unit consists of a weight vector t and the input vector x. In this layer, for each pattern neuron, first the dot product of t and x is performed and then a nonlinear activation function is applied to this product, as given in (4). Generally, an exponential function is used as the activation function. Results are summed at the summation layer for each pattern unit, and $f_{A1}(x)$ and $f_{B1}(x)$, which represent Gaussian activation functions, are calculated. The output layer is known as the decision layer (Specht, 1990).

$\theta(x) = e^{-\frac{(t - x)^\top (t - x)}{2\sigma^2}}$ (4)

The Parzen window approach covers the univariate case of the probability density function (pdf); Cacoullos extended the pdf to the multivariate case as in (5):

$g(x_1, x_2, \ldots, x_m) = \frac{1}{p\,\sigma_1 \sigma_2 \cdots \sigma_m} \sum_{j=1}^{p} W\!\left(\frac{x_1 - t_{1,j}}{\sigma_1}, \ldots, \frac{x_m - t_{m,j}}{\sigma_m}\right)$ (5)

where $\sigma_1, \sigma_2, \ldots, \sigma_m$ are smoothing parameters, $x_1, x_2, \ldots, x_m$ are input variables, W is the weighting function and p is the number of samples. In the case of equal smoothing parameters and a bell-shaped Gaussian activation function, (6) is obtained. This is the most popular case of PNN (Hajmeer & Basheer, 2002).

$g(x) = \frac{1}{(2\pi)^{m/2}\, \sigma^m\, p} \sum_{j=1}^{p} e^{-\frac{\lVert x - t_j \rVert^2}{2\sigma^2}}$ (6)
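As an illustration of (6), a minimal PNN classifier can be sketched as follows: the density of each class is estimated with an equal smoothing parameter and the input is assigned to the class with the largest estimate. The names and toy data are hypothetical, for illustration only:

import numpy as np

def pnn_classify(x, train_X, labels, sigma=0.3):
    m = train_X.shape[1]
    norm = (2.0 * np.pi) ** (m / 2.0) * sigma ** m  # normalization constant of (6)
    best_class, best_g = None, -1.0
    for c in np.unique(labels):
        Xc = train_X[labels == c]                    # stored patterns of class c
        D = np.sum((Xc - x) ** 2, axis=1)            # squared distances to x
        g = np.exp(-D / (2.0 * sigma ** 2)).mean() / norm  # class density g(x), as in (6)
        if g > best_g:
            best_class, best_g = c, g
    return best_class

train_X = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
labels = np.array([0, 0, 1, 1])
print(pnn_classify(np.array([0.2, 0.1]), train_X, labels))  # decision layer picks class 0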

2.3. Gradient descent training method

Gradient descent is an iterative first order optimization algorithm. The purpose of this method is to find the minimum of a differentiable function by following the opposite direction of the gradient with a defined step size (Madsen, Nielsen, & Tingleff, 2004). When the aim is minimizing the squared error of a system, gradient descent is the popular method. Firstly, the squared error is calculated according to (7):

$e = (y - f)^2$ (7)

where e is the squared error, y is the target and f is the estimated output. The derivative of e is calculated as in (8), where w is the parameter converging e to its minimum:

$\frac{\partial e}{\partial w} = -2\,(y - f)\,\frac{\partial f}{\partial w}$ (8)

Finally, w is updated in the opposite direction of $\frac{\partial e}{\partial w}$ with step size $\eta$, as given in (9):

$w_{t+1} = w_t - \eta\,\frac{\partial e}{\partial w}$ (9)
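A worked one-parameter example of the update rule (7)–(9), with f = w·x fitted to a single sample (x, y) = (2, 1); this is an illustrative sketch of ours, not taken from the paper:

# e = (y - w*x)^2, so de/dw = -2*(y - w*x)*x as in (8).
x, y = 2.0, 1.0
w, eta = 0.0, 0.1
for _ in range(50):
    grad = -2.0 * (y - w * x) * x   # derivative (8)
    w = w - eta * grad              # update step (9)
print(w)  # converges toward y / x = 0.5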

3. Generalized classifier neural network

GCNN is a new classification neural network with gradient descent learning on the smoothing parameter and can be identified as a new kind of RBFNN. It has five layers: input, pattern, summation, normalization and output. The structure of GCNN is shown in Fig. 3.

The input layer transmits the applied input vector x to the pattern layer. The pattern layer contains one neuron for each training datum. Neurons at the pattern layer calculate the squared Euclidean distance between the input vector x and the training data vector t, as given in (10), where p denotes the total number of training data. The output of the pattern layer is determined by the RBF kernel activation function given in (11). As the GCNN classification methodology is based on regression, it builds on a one-vs.-all discriminative structure; therefore each training datum has N target values determined by whether or not it belongs to each class, as in (12): if a training datum belongs to the ith class, then its ith value is 0.9 and the others are 0.1. The reason for choosing the values 0.9 and 0.1 is to prevent the stuck neuron problem of the learning process.

$\mathrm{dist}(j) = \lVert x - t_j \rVert^2, \quad 1 \le j \le p$ (10)

$r(j) = e^{-\frac{\mathrm{dist}(j)}{2\sigma_j^2}}, \quad 1 \le j \le p$ (11)

$y(j, i) = \begin{cases} 0.9 & \text{if } t_j \text{ belongs to class } i \\ 0.1 & \text{else} \end{cases} \quad 1 \le j \le p,\ 1 \le i \le N$ (12)

The summation layer contains N + 1 neurons, where N is the number of classes and 1 is for one neuron to obtain the denominator. At the summation layer, GCNN uses the diverge effect term in the N class neurons for better classification performance. The diverge effect term uses the exponential form of y(j, i) − ymax, (13), to increase the effect of y(j, i). The aim of using the exponential function is to provide convergence to minimal error between limits. The diverge effect term provides two important advantages to GCNN. By increasing the effect of y(j, i), data belonging to different classes are separated from each other. By taking advantage of the exponential function, the overfitting problem, which the gradient descent approach generally suffers from, is suppressed.

$d(j, i) = e^{(y(j, i) - y_{max})}\, y(j, i), \quad 1 \le j \le p,\ 1 \le i \le N$ (13)

where d(j, i) denotes the diverge effect term of the jth training datum and the ith class. ymax is initialized with 0.9, which denotes the maximum value of y(j, i), and is updated with the maximum value of the output layer at each iteration.

At this layer, the N class neurons calculate the sum of the products of the diverge effect terms and the pattern layer outputs, as given in (14), while the other neuron calculates the denominator, the same as in GRNN, given in (15):

$u_i = \sum_{j=1}^{p} d(j, i)\, r(j), \quad 1 \le i \le N$ (14)

$D = \sum_{j=1}^{p} r(j)$ (15)

The normalization layer has one neuron for each class, and the outputs of these neurons are calculated according to (16):

$c_i = \frac{u_i}{D}, \quad 1 \le i \le N$ (16)

The output layer given in (17) selects the maximum of the normalization layer outputs:

$[o, id] = \max(c)$ (17)

where o denotes the winner neuron value and id denotes the winner class.
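The five-layer forward pass (10)–(17) can be summarized in NumPy as below. This is our illustrative reading of the equations, not the authors' MATLAB code; train_X holds the stored patterns, Y the 0.9/0.1 targets of (12), and sigma the per-pattern smoothing parameters:

import numpy as np

def gcnn_forward(x, train_X, Y, sigma, ymax=0.9):
    dist = np.sum((train_X - x) ** 2, axis=1)   # (10) squared Euclidean distances
    r = np.exp(-dist / (2.0 * sigma ** 2))      # (11) pattern layer outputs
    d = np.exp(Y - ymax) * Y                    # (13) diverge effect terms
    u = d.T @ r                                 # (14) class summation neurons
    D = r.sum()                                 # (15) denominator neuron
    c = u / D                                   # (16) normalization layer
    return c, int(np.argmax(c))                 # (17) output layer competition

# Toy two-class usage with the 0.9/0.1 coding of (12).
train_X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0], [1.1, 0.9]])
Y = np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
sigma = np.full(len(train_X), 0.3)
print(gcnn_forward(np.array([0.05, 0.0]), train_X, Y, sigma))  # winner class 0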

Since the smoothing parameter has an important effect on classification performance, a gradient descent based training approach is adapted to GCNN. During the training step, each training datum at the pattern layer is sequentially applied to the neural network. Firstly, for each input, the squared error e is calculated as given in (18):

$e = (y(z, id) - c_{id})^2$ (18)

where y(z, id) represents the value of the zth training input datum for the idth class and $c_{id}$ is the value of the winner class. Secondly, the first derivative of the error e is calculated according to (19)–(22) (Masters & Land, 1997):

$\frac{\partial e}{\partial c_{id}} = -2\,(y(z, id) - c_{id})$ (19)

$\frac{\partial c_{id}}{\partial \sigma} = \frac{b(id) - l(id)\, c_{id}}{D}$ (20)

$b(id) = \sum_{j=1}^{p} d(j, id)\, r(j)\, \frac{\mathrm{dist}(j)}{\sigma^3}$ (21)

$l(id) = \sum_{j=1}^{p} r(j)\, \frac{\mathrm{dist}(j)}{\sigma^3}$ (22)

Finally, $\sigma$ is updated in the opposite direction of the gradient with learning step $\eta$, as given in (23):

$\sigma_{new} = \sigma_{old} - \eta\, \frac{\partial e}{\partial \sigma}$ (23)

The training procedure is given in Algorithm 1, where epoch denotes the number of iterations that the training algorithm takes place, amse denotes the acceptable mean squared error and lr stands for the learning rate of the gradient descent method. When one of the stopping criteria is satisfied, optimum smoothing parameters for each training datum are obtained.

Algorithm 1 Training of GCNN
inputs: epoch, lr, training_input_data, amse
outputs: smoothing parameters
initialize smoothing parameter and ymax
while iteration <= epoch
    update sigma with de/dsigma
    for each training datum; t_j
        find Euclidean distance between input and training data, dist(j)
        perform RBF activation function, r(j)
        for each class; i
            calculate diverge effect term, d(j,i) = e^(y(j,i) - ymax) * y(j,i)
            compute u_i = sum_{j=1}^{p} d(j,i) r(j) and D = sum_{j=1}^{p} r(j)
        end-for
        find winner neuron and its value; [o, id] = max(c)
        to update the diverge effect term, winner neuron values are stored; cmax(iteration) = c_id
        calculate squared error e = (y(z, id) - c_id)^2, where z denotes the zth input
    end-for
    ymax = max(cmax)
    increment iteration
    if |e| <= amse
        stop training
end-while
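In the same spirit, Algorithm 1 can be sketched with a numerical gradient standing in for the closed-form derivatives (19)–(22); for brevity a single shared sigma is adapted here, whereas the paper assigns one smoothing parameter per training datum. This illustrative sketch reuses gcnn_forward and the toy data from the sketch above:

import numpy as np

def gcnn_train(train_X, Y, epochs=10, lr=0.3, sigma0=0.3, amse=1e-4):
    sigma = sigma0
    for _ in range(epochs):
        errors = []
        for z in range(len(train_X)):
            def err(s):  # squared error (18) of the winner neuron for datum z
                c, winner = gcnn_forward(train_X[z], train_X, Y, np.full(len(train_X), s))
                return (Y[z, winner] - c[winner]) ** 2
            h = 1e-5
            grad = (err(sigma + h) - err(sigma - h)) / (2.0 * h)  # numerical de/dsigma
            sigma -= lr * grad                                    # update in the spirit of (23)
            errors.append(err(sigma))
        if np.mean(errors) <= amse:  # acceptable mean squared error stopping criterion
            break
    return sigma

sigma_opt = gcnn_train(train_X, Y)
print(sigma_opt)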


The inputs of the test algorithm are the test data and the optimum smoothing parameters obtained from the training algorithm. The outputs are the estimated classes of the test data. Algorithm 2 shows the GCNN test step.

Algorithm 2 Test of GCNN
inputs: smoothing_parameter, test_data
outputs: class
for each training datum; t_j
    find Euclidean distance between test and training data, dist(j)
    perform RBF activation function, r(j)
    for each class; i
        calculate diverge effect term, d(j,i) = e^(y(j,i) - ymax) * y(j,i)
        compute u_i = sum_{j=1}^{p} d(j,i) r(j), D = sum_{j=1}^{p} r(j) and c_i = u_i / D
    end-for
    find winner neuron and its value; [o, id] = max(c)
end-for
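Using the sketches above, the test step of Algorithm 2 reduces to a single forward pass with the optimized smoothing parameter (an illustrative toy continuation of the earlier code):

test_x = np.array([0.95, 1.05])
c, predicted = gcnn_forward(test_x, train_X, Y, np.full(len(train_X), sigma_opt))
print(predicted)  # expected winner class 1 for this toy point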

4. Test results

GCNN is compared with PNN, MLP and RBFNN for 9 different data sets by 10-fold cross validation tests. Although GRNN is proposed for function approximation, some binary classification applications exist. Therefore, in this paper GCNN is also compared with the classification usage of GRNN. Performance of GCNN is compared with the MATLAB 9 Standard Neural Network Toolbox PNN, GRNN, MLP and RBFNN implementations. The data sets are glass identification, Haberman's survival, two spiral problem, lenses, balance-scale, iris, breast-cancer-wisconsin, E.coli and yeast. Numbers of attributes and classes are given in Table 1.

Table 2 shows the 10-fold cross validation test performances of GCNN with average smoothing parameter values (obtained from 10-fold cross validation and represented as sigma in the table), GRNN/PNN with GCNN's initial smoothing parameter value, GRNN/PNN with average optimized smoothing parameters (obtained from 10-fold cross validation and represented as sigma in the table), GRNN/PNN with the default smoothing parameter values of the MATLAB Toolbox, the standard MLP algorithm and the RBFNN of the MATLAB Toolbox, respectively.
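For reference, a 10-fold cross validation loop of the kind used to produce Table 2 can be sketched as follows, where classify is any of the compared classifiers wrapped as a callable. This is an illustrative sketch with hypothetical helper names, not the authors' test harness:

import numpy as np

def cross_validate(X, labels, classify, k=10, seed=0):
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)   # k disjoint index folds
    accuracies = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        hits = sum(classify(X[t], X[train], labels[train]) == labels[t] for t in test)
        accuracies.append(hits / len(test))
    return 100.0 * float(np.mean(accuracies))  # mean accuracy in percent

# Example: cross_validate(X, labels, lambda x, TX, TL: pnn_classify(x, TX, TL, sigma=0.3))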

According to Table 2, for the 9 data sets GCNN provides classification performance better than PNN in the range of 1%–89%. GRNN's function approximation nature allows its binary classification usage. However, since GRNN uses a weighted arithmetic mean in its output layer, the classification performance of GRNN decreases as the number of classes increases. In this paper, GCNN is therefore compared with GRNN only for binary classification problems. According to the results, GCNN provides classification performance better than GRNN in the range of 0%–71%. The optimized smoothing parameter causes both increases and decreases in the performance of standard GRNN and PNN. However, GCNN provides 1%–68% better classification performance than GRNN and PNN


with the optimized smoothing parameter. GCNN has better classification performance than both standard and optimized PNN and GRNN. In addition to the radial basis function based neural networks, GCNN provides better classification performance than both MLP and RBFNN, except on the Haberman's survival data set.

In GCNN the smoothing parameter is optimized according to the training data for each fold. Fig. 4 shows the smoothing parameter values for each data set and fold.

The number of epochs and the learning step for the GCNN, optimized GRNN and PNN models are chosen as 10 and 0.3, respectively. In Table 3 the average training and test times are given for each method compared in Table 2.

Since GCNN has a training step, it requires more computational time than the other methods, which do not include a training step. On the other hand, MLP also has a training step but requires less training time than GCNN. This is because GCNN has one neuron for each training datum in its hidden layer, whereas MLP has a smaller number of neurons. The computational times of RBFNN and GCNN are close to each other. The difference between the RBFNN of the MATLAB Toolbox and GCNN is that, while GCNN includes one neuron for each training datum, the RBFNN adds neurons incrementally until it reaches the performance it requires.

The smoothing parameter determines the radius of effective neighbors of a datum. If some of the data that belong to the same class can be grouped according to their classes, using a smoothing parameter that encapsulates these data improves the classification performance. In Fig. 5 the relationships between the smoothing parameters and the classification rates of GCNN and PNN are shown. For binary classification problems, GRNN performance results are also added to the figure.

According to Fig. 5, GCNN generally has equal or better classification performance than GRNN and PNN under different smoothing parameter values for most of the data sets.

5. Conclusion

Through this work a new neural network for classification purposes is proposed, and a training method for it is introduced. At the training step, smoothing parameters are updated with the gradient descent method to reach optimal smoothing parameters for the corresponding data set. It includes an additional layer and a calculation term with respect to the existing RBFNN based GRNN and PNN. GCNN has been tested with 9 different popular data sets. Classification results show that GCNN has better performance than both standard and optimized GRNN, PNN, MLP and RBFNN. This improvement is provided by the diverge effect term addition, the smoothing parameter optimization approach and the competition ability of the output layer of GCNN. Another reason for the better classification results than PNN is GCNN's function approximation based structure. Unlike the fixed smoothing parameters of GRNN and PNN, the training algorithm adjusts the smoothing parameters for each training datum. The diverge effect term provides classification performance improvement by increasing the distance among data belonging to different classes. The output layer with competition ability decides the effective update for the smoothing parameter.

The memory problem of RBF based neural networks such as GRNN and PNN, arising from the assignment of one neuron for each training datum, still exists in GCNN. Future works are required to overcome the large memory assignment problem. In order to use GCNN effectively, initial values of the smoothing parameters should be selected in accordance with the data sets. GCNN's training time requirement can be seen as a problem; however, better classification performances are achieved by optimizing the smoothing parameter.

Finally, GCNN has three important advantages: its regression based training method, dynamic smoothing parameters obtained with the training method, and more efficient classification performance with the diverge effect term. Test data performances prove the effectiveness of GCNN.

References

Adeli, H., & Panakkat, A. (2009). A probabilistic neural network for earthquake magnitude prediction. Neural Networks, 22, 1018–1024.
Al-Daoud, E. (2009). A comparison between three neural network models for classification problems. Journal of Artificial Intelligence, 2, 56–64.
Amrouche, A., & Rouvaen, J. M. (2006). Efficient system for speech recognition using general regression neural network. International Journal of Computer Systems Science and Engineering, 183–189.
Asad, B., Zhijiang, D., Lining, S., Reza, K., & Fereidoun, M. A. (2007). Fast 3D reconstruction of ultrasonic images based on generalized regression neural network. In World congress on medical physics and biomedical engineering.
Bartlett, P. L. (1998). The sample complexity of pattern classification with neural networks: the size of the weights is more important than the size of the network. IEEE Transactions on Information Theory, 44(2), 525–536.
Bennett, K. P., & Mangasarian, O. L. (1992). Robust linear programming discrimination of two linearly inseparable sets. Optimization Methods and Software, 23–34.
Berthold, M. R., & Diamond, J. (1998). Constructive training of probabilistic neural networks. Neurocomputing, 19, 167–183.
Erkmen, B., & Yildirim, T. (2008). Improving classification performance of sonar targets by applying general regression neural network with PCA. Expert Systems with Applications, 35, 472–475.
Firat, M., & Gungor, M. (2009). Generalized regression neural networks and feed forward neural networks for prediction of scour depth around bridge piers. Advances in Engineering Software, 40, 731–737.
Frank, A., & Asuncion, A. (2010). UCI machine learning repository. http://archive.ics.uci.edu/ml.
Hajmeer, M., & Basheer, I. (2002). A probabilistic neural network approach for modeling and classification of bacterial growth/no-growth data. Journal of Microbiological Methods, 51, 217–226.
Hoya, T., & Chambers, J. A. (2001). Heuristic pattern correction scheme using adaptively trained generalized regression neural networks. IEEE Transactions on Neural Networks, 12(1).
Kailun, H., Huijun, X., & Maohua, X. (2010). The application of probabilistic neural network model in the green supply chain performance evaluation for pig industry. In International conference on e-business and e-government.
Kayaer, K., & Yildirim, T. (2003). Medical diagnosis on Pima Indian diabetes using general regression neural networks. In Artificial neural networks and neural information processing.
Kiyan, T., & Yildirim, T. (2004). Breast cancer diagnosis using statistical neural networks. Journal of Electrical & Electronics Engineering, 4(2), 1149–1153.
Lee, E. W. M., Lim, C. P., Yuen, R. K. K., & Lo, S. M. (2004). A hybrid neural network model for noisy data regression. IEEE Transactions on Systems, Man, and Cybernetics, 34(2), 951–960.
Madsen, K., Nielsen, H. B., & Tingleff, O. (2004). Methods for non-linear least squares problems. Informatics and Mathematical Modelling, Technical University of Denmark.
Mangasarian, O. L., Setiono, R., & Wolberg, W. H. (1990). Pattern recognition via linear programming: theory and application to medical diagnosis. In Large-scale numerical optimization (pp. 22–30). SIAM Publications.
Mangasarian, O. L., & Wolberg, W. H. (1990). Cancer diagnosis via linear programming. SIAM News, 23(5), 1, 18.
Mao, K., Tan, K., & Set, W. (2000). Probabilistic neural-network structure determination for pattern classification. IEEE Transactions on Neural Networks, 11(4), 1009–1016.
Masters, T., & Land, W. (1997). A new training algorithm for the general regression neural network. In IEEE international conference on systems, man and cybernetics, computational cybernetics and simulation. Vol. 3 (pp. 1990–1994).
Montana, D. (1992). A weighted probabilistic neural network. Advances in Neural Information Processing Systems, 4, 1110–1117.
Mosier, P. D., & Jurs, P. C. (2002). QSAR/QSPR studies using probabilistic neural networks and generalized regression neural networks. Journal of Chemical Information and Computer Sciences, 42, 1460–1470.
Popescu, I., Kanatas, A., Constantinou, P., & Nafornita, I. (2002). Application of general regression neural networks for path loss prediction. In Proceedings of international workshop trends and recent achievements in information technology.
Ren, S., Yang, D., Ji, F., & Tian, X. (2010). Application of generalized regression neural network in prediction of cement properties. In 2010 International conference on computer design and applications.
Rutkowski, L. (2004). Adaptive probabilistic neural networks for pattern classification in time-varying environment. IEEE Transactions on Neural Networks, 15(4), 811–827.
Specht, D. F. (1990). Probabilistic neural networks. Neural Networks, 3, 109–118.
Specht, D. F. (1991). A general regression neural network. IEEE Transactions on Neural Networks, 2(6), 568–576.
Tomandl, D., & Schober, A. (2001). A modified general regression neural network with new efficient training algorithms as a robust "black box" tool for data analysis. Neural Networks, 14, 1023–1034.
Wang, Z., & Sheng, H. (2010). Rainfall prediction using generalized regression neural network: case study Zhengzhou. In 2010 International conference on computational and information sciences.
Wolberg, W. H., & Mangasarian, O. (1990). Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proceedings of the National Academy of Sciences, 87, 9193–9196.
Yildirim, T., & Cigizoglu, H. K. (2002). Comparison of generalized regression neural network and MLP performances on hydrologic data forecasting. In International conference on neural information processing.
Yoo, P. D., Sikder, A. R., Zhou, B. B., & Zomaya, A. Y. (2007). Improved general regression network for protein domain boundary prediction. In Sixth international conference on bioinformatics.
Zhao, S., Zhang, J., Li, X., & Song, W. (2007). A generalized regression neural network based on fuzzy means clustering and its application in system identification. In International symposium on information technology convergence.
Zhu, C., & Hao, Z. (2009). Application of probabilistic neural network model in evaluation of water quality. In International conference on environmental science and information application technology.
