Developing Hybrid Intelligence Systems for Data Imputation Based on Statistical and Machine Learning Techniques.


Imputation

Chandan Gautam

(12MCMB03)

Under the guidance of

Prof. V. Ravi

Outline


Problem Statement

Missing Data and their causes

Data Imputation

Literature Survey

Proposed Method

Results

Conclusions

References

2

Problem Statement


Developing Hybrid Intelligence Systems for Data Imputation

Based on Statistical and Machine Learning Techniques.

In real-world scenarios, missing data is an inevitable and common problem across disciplines. It limits the ability of researchers to draw conclusions: even if results are obtained by simply deleting the incomplete records, those results may be biased and inappropriate. So, the missing values have to be imputed.

Example dataset with missing values (??):

Age | Salary | Incentive
 25 |  4000  |    ??
 ?? |   500  |     0
 27 |    ??  |    50
 82 |  2000  |   150
 42 |  6500  |  1000

Literature Survey*

- N. Ankaiah, V. Ravi, A novel soft computing hybrid for data imputation, In Proceedings of the 7th International Conference on Data Mining (DMIN), Las Vegas, USA, 2011.
- Mistry, J., Nelwamondo, F. V., & Marwala, T. (2009). Data estimation using principal component analysis and auto associative neural networks, Journal of Systemics, Cybernetics and Informatics, Volume 7, pp. 72-79.
- I. B. Aydilek, A. Arslan, A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm, Information Sciences, vol. 233, pp. 25-35, 2013.
- Shichao Zhang, Nearest neighbor selection for iteratively kNN imputation, The Journal of Systems and Software (2012), vol. 85(11), pp. 2541-2552.

Mean Imputation

Actual (complete) data:

Age | Salary | Incentive
 25 |  4000  |   200
 34 |   500  |     0
 27 |  1000  |    50
 82 |  2000  |   150
 42 |  6500  |  1000

Data with missing values:

Age | Salary | Incentive
 25 |  4000  |    ??
 ?? |   500  |     0
 27 |    ??  |    50
 82 |  2000  |   150
 42 |  6500  |  1000

Mean imputation fills each ?? with the mean of the observed column values: Age = 44, Salary = 3250, Incentive = 300.

7
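The column-mean fill above can be sketched in a few lines of pandas (a minimal illustration, not the code used in the talk; the toy values mirror the slide's table):

```python
import pandas as pd

# Toy table from the slide; NaN marks the missing cells.
df = pd.DataFrame({
    "Age":       [25, None, 27, 82, 42],
    "Salary":    [4000, 500, None, 2000, 6500],
    "Incentive": [None, 0, 50, 150, 1000],
})

# Mean imputation: replace each NaN with the mean of the observed column values.
imputed = df.fillna(df.mean())
print(imputed)  # Age -> 44.0, Salary -> 3250.0, Incentive -> 300.0
```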

Mean Imputation

Compute the Mean Absolute Percentage Error (MAPE) value:

MAPE = (100 / n) × Σ_{i=1}^{n} |x_i − x̂_i| / x_i

Where,
n - number of missing values in a given dataset.
x̂_i - value predicted by the mean imputation for the missing value.
x_i - actual value.

8
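The metric can be written in a few lines (a hedged sketch; the function name is ours):

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error over the n imputed cells."""
    n = len(actual)
    return 100.0 / n * sum(abs(a - p) / a for a, p in zip(actual, predicted))

# Actual values 100 and 200 imputed as 90 and 220: both are 10% off.
print(mape([100, 200], [90, 220]))  # -> 10.0
```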

Mean Imputation

Result of Mean Imputation (MAPE):

Dataset        | Mean Imputation
Auto mpg       | 59.7
Body fat       | 11.61
Boston Housing | 37.77
Forest fires   | 24.728
Iris           | 23.57
Prima Indian   | 24.022
Spanish        | 55.53
Spectf         | 14.85
Turkish        | 66.007
UK bankruptcy  | 37.07
UK Credit      | 28.43
Wine           | 29.99

Mean imputation yields a high MAPE for most of the datasets, so we need some other methods.

Proposed Methods

Module I:
- PCA-AAELM Imputation
- ECM-Imputation
- ECM-AAELM Imputation

Module II:
- PSO-ECM-Imputation
- PSO-ECM + ECM-AAELM

Module III:
- CPAANN Imputation
- Gray+PCA-AAELM
- Gray+CPAANN

10

Overview of ELM

11

Architecture of ELM

Training:
- Output of hidden nodes: g(a_i · x + b_i), collected as H = g(a · x).
- a_i: the weight vector of the connection between the i-th hidden node and the input nodes.
- b_i: the threshold of the i-th hidden node.
- Output of SLFNs: f(x) = Σ_{i=1}^{m} β_i g(a_i · x + b_i).
- β_i: the weight vector of the connection between the i-th hidden node and the output nodes.
- Solve H · β = O for the output weight: β = H†O, where H† is the Moore-Penrose inverse of H.

Testing:
- H_T = g(y · a)
- Output = H_T · β

12
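The training and testing equations above can be condensed into a small NumPy sketch (random input weights, sigmoid hidden layer, output weights via the Moore-Penrose pseudo-inverse; an illustration, not the thesis code):

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_fit(X, O, m=50):
    """Train a single-hidden-layer ELM: random (a, b), beta = pinv(H) @ O."""
    a = rng.normal(size=(X.shape[1], m))      # input weights a_i
    b = rng.normal(size=m)                    # hidden thresholds b_i
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))    # hidden output g(a.x + b)
    beta = np.linalg.pinv(H) @ O              # output weights via H†
    return a, b, beta

def elm_predict(X, a, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ a + b)))
    return H @ beta

# Fit y = sin(x) on [0, pi] as a toy regression problem.
X = np.linspace(0, np.pi, 200).reshape(-1, 1)
O = np.sin(X)
a, b, beta = elm_fit(X, O)
pred = elm_predict(X, a, b, beta)
print(float(np.max(np.abs(pred - O))))  # small training error
```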

13

Proposed Method

Architecture of AAELM

AAELM is an auto-associative ELM: a feed-forward neural network trained to recall the input space, i.e., to reproduce its inputs at its outputs.

14

Ensembled-AAELM

Ensembling of AAELM

- Run AAELM 10 times independently on the same dataset to generate the ensemble.
- Use three different probability distribution functions (Uniform, Normal and Logistic) to generate the weights, and two different activation functions (Sigmoid and Gaussian) at the hidden layer.
- Ensemble AAELM over all six combinations of probability distribution and activation function.

15

Ensembled-AAELM

Result of Ensembled AAELM

16

Drawbacks of AAELM

Results are not reproducible because each run of ELM yields different results; the result can sometimes fluctuate wildly. Two methods are proposed to remove the randomness of AAELM:

- PCA-AAELM
- ECM-AAELM

17

PCA-AAELM

Proposed Method 1:

PCA-AAELM

18

PCA-AAELM

Architecture of PCA-AAELM


Traditional ELM

19

PCA-AAELM

Results

20

ECM-Imputation

Proposed Method 2:

Evolving Clustering Method (ECM) based Imputation

21

ECM-Imputation

Flow of ECM-Imputation:
1. Split the dataset with missing values into complete and incomplete records.
2. Apply ECM clustering to the complete records to obtain the cluster centers.
3. For each incomplete record, find the nearest cluster center.
4. Replace the missing values with the corresponding features of the nearest cluster center, yielding a dataset without missing values.

ECM-Imputation

Example: squared Euclidean distances from an incomplete record with known features (2, 3) to the cluster centers:

(2 − 0)² + (3 − 2)² = 5
(2 − 3)² + (3 − 1)² = 5
(2 − 1)² + (3 − 2)² = 2   ← nearest center
(2 − 5)² + (3 − 3)² = 9
(2 − 1)² + (3 − 9)² = 37

23
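The nearest-centre lookup in this example can be sketched as follows (the centre coordinates are read off the distance expressions above; the record (2, 3) is assumed):

```python
import numpy as np

centers = np.array([[0, 2], [3, 1], [1, 2], [5, 3], [1, 9]])
record = np.array([2, 3])                   # known features of the incomplete record

d2 = ((centers - record) ** 2).sum(axis=1)  # squared Euclidean distances
print(d2)                                   # -> [ 5  5  2  9 37]
nearest = centers[np.argmin(d2)]            # centre (1, 2) is the nearest
```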

ECM-Imputation

Results (MAPE):

Dataset        | Mean   | K-Means+MLP [Ankaiah & Ravi] | ECM Imputation
Auto mpg       | 59.7   | 23.75 | 18.03
Body fat       | 11.61  | 7.83  | 6.31
Boston Housing | 37.77  | 21.01 | 17.84
Forest fires   | 24.728 | 26.61 | 22.29
Iris           | 23.57  | 9.41  | 5.27
Prima Indian   | 24.022 | 29.7  | 27.16
Spanish        | 55.53  | 39.91 | 31.98
Spectf         | 14.85  | 12.14 | 10.21
Turkish        | 66.007 | 33.01 | 27.90
UK bankruptcy  | 37.07  | 30.96 | 46.14
UK Credit      | 28.43  | 32.17 | 27.40
Wine           | 29.99  | 21.58 | 15.61

24

ECM-AAELM

Proposed Method 3:

ECM-AAELM

25

ECM-AAELM

Architecture of ECM-AAELM

Architecture of ECM-AAELM *

Traditional ELM

26

ECM-AAELM

Results

27

PCA/ECM-AAELM

Behavior of PCA/ECM-AAELM on different activation functions

[Figure: MAPE of PCA-AAELM (top) and ECM-AAELM (bottom) on each dataset (Auto mpg, Body fat, Housing, Iris, Prima Indian, Spanish, Spectf, Turkish, UK bankruptcy, UK Credit, Wine) for different hidden-layer activation functions: Sigmoid, Sinh, Cloglogm, Bsigmoid, Sine, Hardlim, Tribas, Radbas, Softplus, Gaussian, Rectifier.]

28

ECM-AAELM

Influence of Dthr value on MAPE results: ECM-AAELM

[Figure: MAPE of ECM-AAELM on each dataset as the distance threshold Dthr varies from 0.035 to 0.98.]

29

ECM-AAELM

Module II:

Proposed Method 4:

PSO-ECM

30

Flow of PSO-ECM:
1. Split the dataset containing incomplete records into complete and incomplete records.
2. Initialize the PSO parameters and apply ECM to the complete records with the initialized Dthr value.
3. Perform ECM imputation of the incomplete records based on the nearest cluster center.
4. Compute the covariance matrix of the complete records (Ccov) and of the total records after imputation (Tcov), and the error between Det(Ccov) and Det(Tcov).
5. PSO checks whether the error is minimal; if not, it updates Dthr and the loop repeats.
6. Once the parameter is optimized, perform the final ECM imputation of the incomplete records with the optimized Dthr value.

31
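The PSO loop can be sketched generically; below, a bare-bones global-best PSO minimises a 1-D toy objective standing in for the error between Det(Ccov) and Det(Tcov) as a function of Dthr (illustrative only, not the thesis code):

```python
import numpy as np

rng = np.random.default_rng(1)

def pso_minimize(f, lo, hi, n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5):
    """Minimise f over [lo, hi] with a basic global-best PSO."""
    x = rng.uniform(lo, hi, n_particles)          # particle positions (Dthr candidates)
    v = np.zeros(n_particles)                     # velocities
    pbest, pbest_val = x.copy(), np.array([f(xi) for xi in x])
    g = pbest[np.argmin(pbest_val)]               # global best position
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        vals = np.array([f(xi) for xi in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        g = pbest[np.argmin(pbest_val)]
    return g

# Toy objective with a known minimum at Dthr = 0.3.
best = pso_minimize(lambda d: (d - 0.3) ** 2, 0.001, 0.999)
print(best)
```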

PSO-ECM

Results of the proposed techniques (MAPE):

Dataset        | Mean   | K-Means+MLP [Ankaiah & Ravi] | ECM-Imputation | PSO-ECM
Auto mpg       | 59.7   | 23.75 | 18.03 | 15.34844
Body fat       | 11.61  | 7.83  | 6.31  | 4.96008
Boston Housing | 37.77  | 21.01 | 17.84 | 14.49978
Forest fires   | 24.728 | 26.61 | 22.29 | 18.33909
Iris           | 23.57  | 9.41  | 5.27  | 4.82263
Prima Indian   | 24.022 | 29.7  | 27.16 | 24.57587
Spanish        | 55.53  | 39.91 | 31.98 | 20.73123
Spectf         | 14.85  | 12.14 | 10.21 | 9.85382
Turkish        | 66.007 | 33.01 | 27.90 | 19.28137
UK bankruptcy  | 37.07  | 30.96 | 46.14 | 30.97627
UK Credit      | 28.43  | 32.17 | 27.40 | 24.61695
Wine           | 29.99  | 21.58 | 15.61 | 12.75819

32

Proposed Method 5:

PSO-ECM + ECM-AAELM

33

PSO-ECM + ECM-AAELM

Proposed Model

34

PSO-COV-ECM + ECM-AAELM

Results

35

PSO-ECM + ECM-AAELM

Comparison

Comparison of the results before and after selection of the optimal Dthr value (MAPE):

Dataset        | Mean   | K-Means+MLP [Ankaiah & Ravi] | ECM-AAELM (before) | PSO-ECM + ECM-AAELM (after)
Auto mpg       | 59.7   | 23.75 | 17.38 | 14.69
Body fat       | 11.61  | 7.83  | 5.33  | 4.64
Boston Housing | 37.77  | 21.01 | 16.48 | 14.44
Forest fires   | 24.728 | 26.61 | 21.54 | 18.17
Iris           | 23.57  | 9.41  | 5.10  | 4.83
Prima Indian   | 24.022 | 29.7  | 23.95 | 23.96
Spanish        | 55.53  | 39.91 | 22.09 | 18.53
Spectf         | 14.85  | 12.14 | 8.05  | 8.18
Turkish        | 66.007 | 33.01 | 21.49 | 18.97
UK bankruptcy  | 37.07  | 30.96 | 40.06 | 28.66
UK Credit      | 28.43  | 32.17 | 26.85 | 24.79
Wine           | 29.99  | 21.58 | 14.88 | 12.60

36

CPAANN

Module III:

Proposed Method 6:

CPAANN

37

CPAANN

Introduction of CPNN

CPNN uses semi-supervised learning: an unsupervised Kohonen SOM layer trained by competitive learning, followed by a supervised Grossberg Outstar layer. Making the network auto-associative created the Counter Propagation Auto-associative Neural Network (CPAANN).

38

Comparison

Results (MAPE):

Dataset        | Mean   | K-Means+MLP [Ankaiah & Ravi] | CPAANN Imputation
Auto mpg       | 59.7   | 23.75 | 18.32
Body fat       | 11.61  | 7.83  | 5.25
Boston Housing | 37.77  | 21.01 | 14.86
Forest fires   | 24.728 | 26.61 | 16.97
Iris           | 23.57  | 9.41  | 6.51
Prima Indian   | 24.022 | 29.7  | 18.21
Spanish        | 55.53  | 39.91 | 17.13
Spectf         | 14.85  | 12.14 | 8.61
Turkish        | 66.007 | 33.01 | 16.07
UK bankruptcy  | 37.07  | 30.96 | 21.96
UK Credit      | 28.43  | 32.17 | 22.88
Wine           | 29.99  | 21.58 | 11.56

39

Gray+PCA-AAELM

Proposed Method 7:

Gray + PCA-AAELM

40

Proposed Method*:

Stage I: Gray distance based nearest neighbor imputation.
Stage II: PCA-AAELM based imputation.

41

Gray+PCA-AAELM

Comparison

Results of PCA-AAELM with Mean Imputation and with Gray Distance based Imputation (MAPE):

Dataset        | Mean   | K-Means+MLP [Ankaiah & Ravi] | PCA-AAELM with Mean Imputation | PCA-AAELM with Gray Distance based Imputation
Auto mpg       | 59.7   | 23.75 | 28.63 | 16.92
Body fat       | 11.61  | 7.83  | 6.01  | 5.41
Boston Housing | 37.77  | 21.01 | 20.9  | 17.46
Forest fires   | 24.728 | 26.61 | 19.41 | 20.89
Iris           | 23.57  | 9.41  | 10.23 | 5.79
Prima Indian   | 24.022 | 29.7  | 22.06 | 22.03
Spanish        | 55.53  | 39.91 | 30.09 | 28.06
Spectf         | 14.85  | 12.14 | 9.11  | 8.38
Turkish        | 66.007 | 33.01 | 30.18 | 27.38
UK bankruptcy  | 37.07  | 30.96 | 37.7  | 37.95
UK Credit      | 28.43  | 32.17 | 25.27 | 27.79
Wine           | 29.99  | 21.58 | 16.6  | 14.78

42

Gray+CPAANN

Proposed Method 8:

Gray + CPAANN

43

Gray+CPAANN

Proposed Method*:

Stage I: Gray distance based nearest neighbour imputation.
Stage II: CPAANN based imputation.

44

Gray+CPAANN

Comparison

Results of CPAANN with Mean Imputation and with Gray Distance based Imputation (MAPE):

Dataset        | Mean   | K-Means+MLP [Ankaiah & Ravi] | CPAANN with Mean Imputation | CPAANN with Gray Distance based Imputation
Auto mpg       | 59.7   | 23.75 | 18.32 | 15.31
Body fat       | 11.61  | 7.83  | 5.25  | 4.71
Boston Housing | 37.77  | 21.01 | 14.86 | 15.01
Forest fires   | 24.728 | 26.61 | 16.97 | 17.91
Iris           | 23.57  | 9.41  | 6.51  | 4.03
Prima Indian   | 24.022 | 29.7  | 18.21 | 19.34
Spanish        | 55.53  | 39.91 | 17.13 | 14.21
Spectf         | 14.85  | 12.14 | 8.61  | 8.53
Turkish        | 66.007 | 33.01 | 16.07 | 17.37
UK bankruptcy  | 37.07  | 30.96 | 21.96 | 20.58
UK Credit      | 28.43  | 32.17 | 22.88 | 13.70
Wine           | 29.99  | 21.58 | 11.56 | 11.72

45

Comparison

Comparison between all proposed methods based on average MAPE value over 10 folds:

Dataset        | PCA-AAELM | ECM-Imputation | ECM-AAELM | PSO-ECM | PSO-ECM + ECM-AAELM | CPAANN | Gray+PCA-AAELM | Gray+CPAANN
Auto mpg       | 28.63 | 18.03 | 17.38 | 15.35 | 14.39 | 18.32 | 16.92 | 15.31
Body fat       | 6.01  | 6.31  | 5.33  | 4.96  | 4.61  | 5.25  | 5.41  | 4.71
Boston Housing | 20.90 | 17.84 | 16.48 | 14.50 | 14.18 | 14.86 | 17.46 | 15.01
Forest fires   | 19.41 | 22.29 | 21.54 | 18.34 | 17.66 | 16.97 | 20.89 | 17.91
Iris           | 10.23 | 5.27  | 5.10  | 4.82  | 4.75  | 6.51  | 5.79  | 4.03
Prima Indian   | 22.06 | 27.16 | 23.95 | 24.58 | 23.38 | 18.21 | 22.03 | 19.34
Spanish        | 30.09 | 31.98 | 22.09 | 20.73 | 16.99 | 17.13 | 28.06 | 14.21
Spectf         | 9.11  | 10.21 | 8.05  | 9.85  | 8.18  | 8.61  | 8.38  | 8.53
Turkish        | 30.18 | 27.90 | 21.49 | 19.28 | 16.49 | 16.07 | 27.38 | 17.37
UK bankruptcy  | 37.70 | 46.14 | 40.06 | 30.98 | 26.89 | 21.96 | 37.95 | 20.58
UK Credit      | 25.27 | 27.40 | 26.85 | 24.62 | 23.66 | 22.88 | 27.79 | 13.70
Wine           | 16.60 | 15.61 | 14.88 | 12.76 | 12.21 | 11.56 | 14.78 | 11.72

46

Conclusions

47

- The results indicate that all the proposed methods give significantly better results than K-Means+MLP.
- ECM-Imputation alone outperformed K-Means+MLP, which shows the powerful local learning capability of ECM.
- ECM-AAELM yields higher accuracy than PCA-AAELM.
- The output of ECM-AAELM primarily depends on the threshold value of ECM; its output does not fluctuate wildly across activation functions.
- In our experiments, selection of the optimal Dthr value always yielded better imputation.
- For PCA-AAELM, the Softplus activation function is recommended because it performed better than the other activation functions.
- Gray Distance based imputation performed better than Mean imputation as a preprocessing step for most of the datasets.

48

Papers

- C. Gautam, V. Ravi, Evolving Clustering Based Data Imputation, 3rd IEEE Conference, ICCPCT, Kanyakumari, Mar 21-22, 2014.
- C. Gautam, V. Ravi, Data Imputation via Evolutionary Computation, Clustering and a Neural Network, to be communicated to IEEE Computational Intelligence Magazine (CIM).
- A Hybrid Data Imputation method based on Gray System Theory and Counterpropagation Auto-associative Neural Network, to be communicated to Neurocomputing.
- Imputation of Missing Data Using PCA, Extreme Learning Machine and Gray System Theory, to be communicated to The 5th Joint International Conference on Swarm, Evolutionary and Memetic Computing (SEMCCO 2014).

49

References

Data Imputation

- Abdella, M., & Marwala, T. (2005). The use of genetic algorithms and neural networks to approximate missing data in database, IEEE 3rd International Conference on Computational Cybernetics, Mauritius, pp. 207-212.
- Mistry, J., Nelwamondo, F. V., & Marwala, T. (2009). Data estimation using principal component analysis and auto associative neural networks, Journal of Systemics, Cybernetics and Informatics, Volume 7, pp. 72-79.
- Ankaiah, N., & Ravi, V. (2011). A novel soft computing hybrid for data imputation, International Conference on Data Mining, Las Vegas, USA.
- Vriens, M., & Melton, E. (2002). Managing missing data. Marketing Research, Volume 14, Issue 3, pp. 12-17.
- Naveen, N., Ravi, V., & Rao, C. R. (2010). Differential evolution trained radial basis function network: application to bankruptcy prediction in banks, International Journal of Bio-Inspired Computation (IJBIC), Volume 2, Issue 3, pp. 222-232.

50

References

Data Imputation (Cont.)

- Nelwamondo, F. V., Golding, D., & Marwala, T. (2013). A dynamic programming approach to missing data estimation using neural networks, Information Sciences, Elsevier, Volume 237, pp. 49-58.
- Nishanth, K. J., Ankaiah, N., Ravi, V., & Bose, I. (2012). Soft computing based imputation and hybrid data and text mining: The case of predicting the severity of phishing alerts, Expert Systems with Applications, Volume 39, Issue 12, pp. 10583-10589.
- K. J. Nishanth, V. Ravi, A Computational Intelligence Based Online Data Imputation Method: An Application For Banking, Journal of Information Processing Systems, vol. 9 (4), pp. 633-650, 2013.
- M. Krishna, V. Ravi, Particle swarm optimization and covariance matrix based data imputation, IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), Enathi, 2013.
- V. Ravi, M. Krishna, A new online data imputation method based on general regression auto associative neural network, Neurocomputing, vol. 138, pp. 207-212, 2014.

51

References

Extreme Learning Machine (ELM)

- Huang, G. B., Zhu, Q., & Siew, C. (2006). Extreme Learning Machine: Theory and Applications, Neurocomputing, Elsevier, 7th Brazilian Symposium on Neural Networks, Volume 70, pp. 489-501.
- Rajesh, R., & Siva, J. (2011). Extreme Learning Machine - A Review and State of Art, International Journal Of Wisdom Based Computing, Volume 1, pp. 35-49.
- Huang, G., Wang, D., & Lan, Y. (2011). Extreme Learning Machine: A Survey, International Journal of Machine Learning and Cybernetics, Volume 2, Issue 2, pp. 107-122.
- Bartlett, P. (1997). For Valid Generalization, The Size of the Weights is more important than the Size of the Network, Advances in Neural Information Processing Systems, Volume 9, pp. 134-140.
- Huang, G., Chen, L., & Siew, C. (2006). Universal Approximation Using Incremental Constructive Feedforward Networks with Random Hidden Nodes, IEEE Transactions on Neural Networks, Volume 17, Issue 4, pp. 879-892.

52

References

Extreme Learning Machine (ELM) (Cont.)

- Zhu, Q., Qin, A. K., Suganthan, P. N., & Huang, G. (2005). Evolutionary Extreme Learning Machine, Pattern Recognition, Elsevier, Volume 38, Issue 10, pp. 1759-1763.
- Castaño, A., Fernández-Navarro, F., & Hervás-Martínez, C. (2013). PCA-ELM - A Robust and Pruned ELM Approach Based on PCA, Neural Processing Letters, Springer, Volume 37, Issue 3, pp. 377-392.
- Huang, G. B., Zhou, H., Ding, X., & Zhang, R. (2012). Extreme Learning Machine for Regression and Multiclass Classification, IEEE Transactions on Systems, Man and Cybernetics, Volume 42, Issue 2, pp. 513-529.

53

References

Extreme Learning Machine (In Finance)

- Duan, G., Huang, Z., & Wang, J. (2009). Extreme Learning Machine for Bank Clients Classification, International Conference on Information Management, Innovation Management and Industrial Engineering, Xian, China, Volume 2, pp. 496-499.
- Duan, G., Huang, Z., & Wang, J. (2010). Extreme Learning Machine for Financial Distress Prediction for Listed Company, International Conference on Logistics Systems and Intelligent Management, Harbin, China, Volume 3, pp. 1961-1965.
- Zhou, H., Lan, Y., Soh, Y. C., Huang, G. B., & Zhang, R. (2012). Credit Risk Evaluation with Extreme Learning Machine, IEEE International Conference on Systems, Man and Cybernetics, Seoul, Korea, pp. 1064-1069.
- Teresa, M., Carmen, M., David, B., & José, F. (2012). Extreme Learning Machine to Analyze the Level of Default in Spanish Deposit Institutions, Journal of Methods for the Quantitative Economy and Enterprise, Volume 13, Issue 1, pp. 3-23.

54

References

Activation Function

- Sibi, P., Jones, S., & Siddarth, P. (2013). Analysis of Different Activation Functions Using Back Propagation Neural Networks, Journal of Theoretical and Applied Information Technology, Volume 47, Issue 3, pp. 1264-1268.
- Peng, J., Li, L., & Tang (2013). Combination of Activation Functions in Extreme Learning Machines for Multivariate Calibration, Chemometrics and Intelligent Laboratory Systems, Elsevier, Volume 120, pp. 53-58.
- Gomes, G. S. S., Ludermir, T. B., & Lima, L. M. M. R. (2011). Comparison of new activation functions in neural network for forecasting financial time series, Neural Computing and Applications, Springer, Volume 20, Issue 3, pp. 417-439.
- Asaduzzaman, Md., Shahjahan, M., & Murase, K. (2009). Faster Training Using Fusion of Activation Functions for Feed Forward Neural Networks, International Journal of Neural Systems, Volume 19, Issue 06, pp. 437-448.
- Karlik, B., & Olgac, A. V. (2010). Performance Analysis of Various Activation Functions in Generalized MLP Architectures of Neural Networks, International Journal of Artificial Intelligence and Expert Systems, Volume 1, Issue 4, pp. 111-122.

55

References

Activation Function (Cont.), ECM, Cross Validation & PCA

- Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep Sparse Rectifier Neural Networks, International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, Volume 15, pp. 315-323.
- Song, Q., & Kasabov, N. (2001). ECM - A Novel On-line, Evolving Clustering Method and Its Applications, Proceedings of the Fifth Biannual Conference on Artificial Neural Networks and Expert Systems, Berlin, pp. 87-92.
- Refaeilzadeh, P., Tang, L., & Liu, H. (2009). "Cross Validation", in Encyclopedia of Database Systems (EDBS), Springer, Volume 1, pp. 532-538.
- Smith, L. (2002). A tutorial on Principal Components Analysis.

56

References

Counter Propagation Neural Network (CPAANN)

- Kuzmanovski, I., & Novič, M. (2008). Counter-propagation neural networks in Matlab, Chemometrics and Intelligent Laboratory Systems, pp. 84-91.
- Taner, M. (1997). Kohonen's self organizing networks with CONSCIENCE.
- Ballabio, D., & Vasighi, M. (2012). A MATLAB toolbox for Self Organizing Maps and supervised neural network learning strategies, Chemometrics and Intelligent Laboratory Systems, pp. 24-32.
- Ballabio, D., Consonni, V., & Todeschini, R. (2009). The Kohonen and CP-ANN toolbox: A collection of MATLAB modules for Self Organizing Maps and Counterpropagation Artificial Neural Networks, Chemometrics and Intelligent Laboratory Systems, pp. 115-122.
- Sivanandam, S. N., Sumathi, S., & Deepa, S. N. Introduction to neural networks Using MATLAB 6.0.
- Mehrotra, K., Mohan, C. K., & Ranka, S. Elements of Artificial Neural Networks.

57

Thank You


58

Activation Function


- Sibi, P., Jones, S., & Siddarth, P. (2013). Analysis of Different Activation Functions Using Back Propagation Neural Networks, Journal of Theoretical and Applied Information Technology, Volume 47, Issue 3, pp. 1264-1268.
- Gomes, G. S. S., Ludermir, T. B., & Lima, L. M. M. R. (2011). Comparison of new activation functions in neural network for forecasting financial time series, Neural Computing and Applications, Springer, Volume 20, Issue 3, pp. 417-439.

59

Activation Function

- Karlik, B., & Olgac, A. V. (2010). Performance Analysis of Various Activation Functions in Generalized MLP Architectures of Neural Networks, International Journal of Artificial Intelligence and Expert Systems, Volume 1, Issue 4, pp. 111-122.
- Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep Sparse Rectifier Neural Networks, International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, Volume 15, pp. 315-323.

60

Experimental Design

- 10-fold cross validation has been used in our experiments.
- Both methods have a user-defined parameter: PCA has the variance to retain (i.e., eigenvalues) and ECM has the threshold value.
- We fixed the activation function and varied the variance from 1 to 99 in PCA-AAELM and the threshold from 0.001 to 0.999 in ECM-AAELM, for each activation function on the whole dataset.
- We used 11 activation functions and compared their performances.

61
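The protocol (a parameter swept under 10-fold cross validation) can be sketched as below; `evaluate` is a placeholder standing in for one imputation run, not the actual MAPE computation:

```python
import numpy as np

rng = np.random.default_rng(2)

def ten_fold_indices(n, k=10):
    """Shuffle indices and split them into k folds."""
    idx = rng.permutation(n)
    return np.array_split(idx, k)

def sweep(data, thresholds, evaluate):
    """Average a score over 10 folds for each candidate threshold."""
    folds = ten_fold_indices(len(data))
    scores = {}
    for t in thresholds:
        fold_scores = [evaluate(data[test], t) for test in folds]
        scores[t] = float(np.mean(fold_scores))
    return min(scores, key=scores.get)   # threshold with the lowest mean score

data = rng.random(100)
# Placeholder objective: pretend the error is minimised at threshold 0.5.
best = sweep(data, np.linspace(0.001, 0.999, 21), lambda d, t: (t - 0.5) ** 2)
print(best)
```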

PCA-AAELM

Steps of PCA-AAELM

1. Take the training dataset.
2. Select the optimal number of hidden nodes and use the principal component values as the input weights.
3. Feed PC × Training Data into the neural network model.
4. Compute the output weight by performing the Moore-Penrose generalized inverse transformation.

62

ECM-AAELM

Evolving Clustering Method

[Figure: evolution of ECM clusters as samples x1, ..., x9 arrive. Each new cluster starts with zero radius (C1⁰ with R1⁰ = 0, C2⁰ with R2⁰ = 0, C3⁰ with R3⁰ = 0), and the cluster centres and radii are updated (C1¹, C1², C1³, C2¹, ...) as further samples arrive.]

63

ECM-AAELM

Steps of ECM-AAELM

1) First, perform ECM on the given dataset and find how many clusters are generated.
2) Extract the centre of each cluster and treat each cluster as a hidden node; hence the number of hidden nodes equals the number of generated clusters.
3) Calculate the normalized Euclidean distance of each record from the centre of each cluster, i.e., from each hidden node.
4) Perform a non-linear transformation of this distance with the activation function to get the hidden node output.

64

ECM-AAELM

Steps of ECM-AAELM (Cont.)

5) The normalized Euclidean distance between records x and y over q features is d(x, y) = sqrt( Σ_{i=1}^{q} (x_i − y_i)² / q ).
6) After this, perform the Moore-Penrose generalized inverse on the output of the previous step and multiply by the dataset to calculate the output weight.
7) Finally, multiply the output weight by the hidden node output to get the final output.

65
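Steps 2-7 amount to an RBF-style network whose hidden nodes are the cluster centres; a compact sketch (the cluster centres are assumed already found by ECM, and Gaussian activation is one possible choice):

```python
import numpy as np

def ecm_aaelm_fit(X, centers):
    """Hidden nodes = cluster centres; auto-associative output via pseudo-inverse."""
    q = X.shape[1]
    # Step 3: normalized Euclidean distance of every record to every centre.
    D = np.sqrt(((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) / q)
    H = np.exp(-D ** 2)                 # step 4: non-linear transformation (Gaussian)
    beta = np.linalg.pinv(H) @ X        # step 6: output weight via Moore-Penrose inverse
    return H @ beta                     # step 7: reconstructed output

# Toy data falling into two clusters, with assumed centres.
X = np.array([[0.1, 0.2], [0.9, 0.8], [0.2, 0.1], [0.8, 0.9]])
centers = np.array([[0.15, 0.15], [0.85, 0.85]])
recon = ecm_aaelm_fit(X, centers)
```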

ELM

Why Moore-Penrose Inverse

The Moore-Penrose inverse provides the solution of a linear system Ax = y such that both the error ||Ax − y|| and the norm ||x|| are minimized simultaneously, giving the unique solution:

x = H†y

Formula: H† = (HᵀH)⁻¹Hᵀ

66
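NumPy exposes this inverse directly as `np.linalg.pinv`; a quick illustrative check that x = H†y matches the normal-equations formula from the slide:

```python
import numpy as np

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # overdetermined system Hx = y
y = np.array([1.0, 2.0, 3.0])

H_dag = np.linalg.pinv(H)          # Moore-Penrose inverse; here equals (H^T H)^-1 H^T
x = H_dag @ y                      # minimum-norm least-squares solution

# Same answer as the normal-equations formula from the slide:
x2 = np.linalg.inv(H.T @ H) @ H.T @ y
print(x, x2)
```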

Training loop of the counter propagation network:
1. Initialize the network.
2. For N epochs, repeat for all inputs:
   - Get the input.
   - Find the winner node.
   - Update the winner and its neighbourhoods.
   - Update the nodes at the Grossberg Outstar layer.

67

Network layout: input layer (x1, ..., xm) → hidden layer (h1, ..., hn), whose weights are trained by simple competitive learning → output layer (y1, ..., yp), whose weights are trained by the Outstar rule.

68

[Figure: hidden nodes and 10 input samples shown in blue.]

69

Gray Relational Coefficient:

GRC(x_kp^mis, x_ip) = ( min_i min_p |x_kp^mis − x_ip| + ρ · max_i max_p |x_kp^mis − x_ip| ) / ( |x_kp^mis − x_ip| + ρ · max_i max_p |x_kp^mis − x_ip| )

p = 1, 2, 3, ..., m;  k = 1, 2, 3, ..., n;  i = 1, 2, 3, ..., o.

ρ ∈ (0, 1] controls the level of differences with respect to the relational coefficient.

Gray Relational Grade:

GRG(x_k^mis, x_i) = (1/m) Σ_{p=1}^{m} GRC(x_kp^mis, x_ip),  i = 1, 2, ..., o;  k = 1, 2, ..., n.

70

Example *

Dataset (R1 has attr2 missing):

    | attr1 | attr2 | attr3 | attr4 | attr5
R1  | 0.2   | ??    | 0.9   | 0.6   | 0.5
R2  | 0.1   | 0.3   | 0.9   | 0.4   | 0.6
R3  | 0.1   | 0.4   | 0.8   | 0.5   | 0.6
R4  | 0.8   | 0.2   | 0.5   | 0.3   | 0.2
R5  | 0.5   | 0.8   | 0.3   | 0.9   | 0.7

Absolute differences |R1 − Ri| over the known attributes:

    | attr1 | attr3 | attr4 | attr5
R2  | 0.1   | 0     | 0.2   | 0.1
R3  | 0.1   | 0.1   | 0.1   | 0.1
R4  | 0.6   | 0.4   | 0.3   | 0.3
R5  | 0.3   | 0.6   | 0.3   | 0.2

Min = 0, Max = 0.6.

GRC per known attribute (with ρ = 0.5) and GRG:

    | GRC1     | GRC2     | GRC3 | GRC4 | GRG
R2  | 0.75     | 1.0      | 0.6  | 0.75 | 0.775
R3  | 0.75     | 0.75     | 0.75 | 0.75 | 0.75
R4  | 0.333333 | 0.428571 | 0.5  | 0.5  | 0.440476
R5  | 0.5      | 0.333333 | 0.5  | 0.6  | 0.483333

R2 has the largest GRG, so imputation by Gray distance takes R2's attr2 value: 0.3.

71
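The worked example can be reproduced directly; a sketch with ρ = 0.5 (the value implied by the slide's numbers):

```python
import numpy as np

def grc_grg(missing_row, complete_rows, rho=0.5):
    """Gray relational coefficients per attribute and grade per complete record."""
    known = ~np.isnan(missing_row)
    diff = np.abs(complete_rows[:, known] - missing_row[known])  # |x_kp - x_ip|
    d_min, d_max = diff.min(), diff.max()                        # global min / max
    grc = (d_min + rho * d_max) / (diff + rho * d_max)
    return grc, grc.mean(axis=1)                                 # GRG = mean of GRCs

r1 = np.array([0.2, np.nan, 0.9, 0.6, 0.5])
others = np.array([[0.1, 0.3, 0.9, 0.4, 0.6],
                   [0.1, 0.4, 0.8, 0.5, 0.6],
                   [0.8, 0.2, 0.5, 0.3, 0.2],
                   [0.5, 0.8, 0.3, 0.9, 0.7]])

grc, grg = grc_grg(r1, others)
donor = others[np.argmax(grg)]        # R2 has the largest grade (0.775)
imputed = donor[np.isnan(r1)][0]      # -> 0.3
print(grg, imputed)
```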

Results (MAPE):

Dataset        | Mean   | K-Means+MLP | Imputation
Auto mpg       | 59.7   | 23.75 | 16.73
Body fat       | 11.61  | 7.83  | 7.65
Boston Housing | 37.77  | 21.01 | 19.28
Forest fires   | 24.728 | 26.61 | 22.89
Iris           | 23.57  | 9.41  | 5.34
Prima Indian   | 24.022 | 29.7  | 28.06
Spanish        | 55.53  | 39.91 | 36.29
Spectf         | 14.85  | 12.14 | 11.60
Turkish        | 66.007 | 33.01 | 36.63
UK bankruptcy  | 37.07  | 30.96 | 39.75
UK Credit      | 28.43  | 32.17 | 28.90
Wine           | 29.99  | 21.58 | 17.58

72

Literature Survey

73

Data Imputation

- Vriens, M., & Melton, E. (2002). Managing missing data. Marketing Research, Volume 14, Issue 3, pp. 12-17.
- Mistry, J., Nelwamondo, F. V., & Marwala, T. (2009). Data estimation using principal component analysis and auto associative neural networks, Journal of Systemics, Cybernetics and Informatics, Volume 7, pp. 72-79.
- Ankaiah, N., & Ravi, V. (2011). A novel soft computing hybrid for data imputation, International Conference on Data Mining, Las Vegas, USA.
- Nishanth, K. J., Ankaiah, N., Ravi, V., & Bose, I. (2012). Soft computing based imputation and hybrid data and text mining: The case of predicting the severity of phishing alerts, Expert Systems with Applications, Volume 39, Issue 12, pp. 10583-10589.
- M. Krishna, V. Ravi, Particle swarm optimization and covariance matrix based data imputation, IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), Enathi, 2013.

74

- Huang, G. B., Zhu, Q., & Siew, C. (2006). Extreme Learning Machine: Theory and Applications, Neurocomputing, Elsevier, 7th Brazilian Symposium on Neural Networks, Volume 70, pp. 489-501.
- Huang, G., Wang, D., & Lan, Y. (2011). Extreme Learning Machine: A Survey, International Journal of Machine Learning and Cybernetics, Volume 2, Issue 2, pp. 107-122.
- Rajesh, R., & Siva, J. (2011). Extreme Learning Machine - A Review and State of Art, International Journal Of Wisdom Based Computing, Volume 1, pp. 35-49.
- Huang, G. B., Zhou, H., Ding, X., & Zhang, R. (2012). Extreme Learning Machine for Regression and Multiclass Classification, IEEE Transactions on Systems, Man and Cybernetics, Volume 42, Issue 2, pp. 513-529.

75

- Song, Q., & Kasabov, N. (2001). ECM - A Novel On-line, Evolving Clustering Method and Its Applications, Proceedings of the Fifth Biannual Conference on Artificial Neural Networks and Expert Systems, Berlin, pp. 87-92.
- Kuzmanovski, I., & Novič, M. (2008). Counter-propagation neural networks in Matlab, Chemometrics and Intelligent Laboratory Systems, pp. 84-91.
- Ballabio, D., Consonni, V., & Todeschini, R. (2009). The Kohonen and CP-ANN toolbox: A collection of MATLAB modules for Self Organizing Maps and Counterpropagation Artificial Neural Networks, Chemometrics and Intelligent Laboratory Systems, pp. 115-122.
- Sivanandam, S. N., & Deepa, S. N. Introduction to neural networks Using MATLAB 6.0.

76

Dataset Description

Dataset        | Records | Features
Auto mpg       | 392     | 7
Body fat       | 252     | 14
Boston Housing | 506     | 13
Forest fires   | 516     | 10
Iris           | 150     | 4
Prima Indian   | 768     | 8
Spanish        | 66      | 9
Spectf         | 267     | 44
Turkish        | 40      | 12
UK bankruptcy  | 60      | 10
UK Credit      | 1225    | 12
Wine           | 178     | 13

77
