
Appl Intell (2006) 25:243–251
DOI 10.1007/s10489-006-0105-0

An instance-based learning approach based on grey relational structure

Chi-Chun Huang · Hahn-Ming Lee

© Springer Science + Business Media, LLC 2006

Abstract In instance-based learning, the 'nearness' between two instances—used for pattern classification—is generally determined by some similarity functions, such as the Euclidean metric or the Value Difference Metric (VDM). However, Euclidean-like similarity functions are normally only suitable for domains with numeric attributes. The VDM metrics are mainly applicable to domains with symbolic attributes, and their complexity increases with the number of classes in a specific application domain. This paper proposes an instance-based learning approach to alleviate these shortcomings. Grey relational analysis is used to precisely describe the entire relational structure of all instances in a specific domain. By using the grey relational structure, new instances can be classified with high accuracy. Moreover, the total number of classes in a specific domain does not affect the complexity of the proposed approach. Forty classification problems are used for performance comparison. Experimental results show that the proposed approach yields higher performance over other methods that adopt one of the above similarity functions or both. Meanwhile, the proposed method can yield higher performance, compared to some other classification algorithms.

Keywords Instance-based learning · Grey relational analysis · Grey relational structure · Pattern classification

C.-C. Huang · H.-M. Lee
Department of Information Management, National Kaohsiung Marine University, Kaohsiung, Taiwan 811, R.O.C.
e-mail: cchuang@mail.nkmu.edu.tw

H.-M. Lee
Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei 106, Taiwan
e-mail: hmlee@mail.ntust.edu.tw

1 Introduction

In recent years, different learning algorithms, such as instance-based learning (IBL) [1, 2, 17, 23, 33, 34], rule induction [3, 14, 28, 36], decision trees [11, 13, 30], decision tables [16, 27] and neural networks [6, 24], have been investigated to solve classification problems. As a comparable alternative, IBL is quite straightforward to understand and can yield excellent performance [1, 2], i.e., high classification accuracy. In IBL, a training set of labeled instances is first collected by the learning system. A new, unseen instance is then classified according to its 'nearest' training instance or instances [7, 12]. Based on the nearest-neighbor rule [7, 12], instance-based learning methods include nearest-neighbor classifiers [4, 7, 15], similarity-based induction (learning), lazy learners, memory-based learning and case-based reasoning systems [34, 38].

In general, the 'nearness' between two instances in IBL is determined by a similarity function. In the machine learning literature, two metrics, the Euclidean metric [1] and the Value Difference Metric (VDM) [34], are widely used for IBL. Euclidean-like similarity functions are normally suitable for domains with numeric attributes. The VDM is mainly applicable to domains with symbolic attributes; however, the number of training instances with different values for each attribute in a specific application domain should be determined prior to learning, such that the complexity of using the VDM increases with the number of classes [39]. In this paper, an instance-based learning approach based on grey relational structure (GRS) is proposed to alleviate these shortcomings. Here, grey relational analysis (GRA) [8–10, 29] is used to precisely describe the entire relational structure for all instances in a specific application domain. Some properties of GRA, including wholeness, asymmetry and normality, are helpful for the learning tasks (as stated in Section 3).



By using the so-called grey relational structure, new instances in a specific domain can be classified with high accuracy. Moreover, the total number of classes in a specific domain does not affect the complexity of the proposed approach. Forty classification problems are used for performance comparison. Experimental results have shown that the proposed approach yields higher performance over other methods that adopt one of the above-mentioned two similarity functions or both, i.e., the Euclidean metric and the Value Difference Metric (VDM). Moreover, the proposed method can yield higher performance, compared to some other classification algorithms.

The rest of this paper is structured as follows. Some similarity functions used for IBL are introduced in Section 2. The concept of grey relational analysis is reviewed in Section 3. In Section 4, an instance-based learning approach based on grey relational structure is presented. In Section 5, experiments performed on forty datasets are reported. Finally, Section 6 gives our conclusions.

2 Similarity functions in IBL

This section reviews some similarity functions used for IBL. In most IBL systems, the Euclidean similarity function has been adopted. Let x and y be two instances with n attributes, denoted as x = (x(1), x(2), . . . , x(n)) and y = (y(1), y(2), . . . , y(n)). The Euclidean metric is defined as follows.

    Eu(x, y) = sqrt( Σ_{i=1}^{n} [x(i) − y(i)]^2 ).    (1)

An alternative similarity function, the Manhattan distance function, is defined as follows.

    Ma(x, y) = Σ_{i=1}^{n} |x(i) − y(i)|,    (2)

where x and y are two instances with n attributes, denoted as x = (x(1), x(2), . . . , x(n)) and y = (y(1), y(2), . . . , y(n)).

Obviously, Euclidean-like distance functions are normally applicable to domains with numeric attributes. In general, prior to learning, normalization should be done for each numeric attribute, i.e., each distance for numeric attribute i is divided by the maximal difference or by the standard deviation of attribute i. In this manner, the problem of one attribute with a relatively larger range of values than other attributes dominating the distance can be avoided. Accordingly, the distance of each attribute can be bounded between zero and one. To deal with each symbolic attribute i, the distance between two values a and b of attribute i will be set to zero if a and b are the same (i.e., the distance is minimal); otherwise the distance will be set to one (i.e., the distance is maximal). Meanwhile, the distance will also be set to one if a or b is unknown (i.e., missing). This leads to a high dependence on symbolic attributes, or to unusually high biases against missing attributes. A heterogeneous similarity function is thus defined for domains with both numeric and symbolic attributes. This distance function, also known as HOEM [39], was adopted in IB1, IB2 and IB3 [1].

In [34], another well-known distance function, namely the Value Difference Metric (VDM), was proposed to determine the 'nearness' between instances. For each symbolic attribute s of instances, the VDM between two values a and b is defined as follows.

    vdm(a, b) = Σ_{i=1}^{C} | N_{a,i}/N_a − N_{b,i}/N_b |^k,    (3)

where N_a is the number of training instances with value a for attribute s, and N_{a,i} is the number of training instances with value a for attribute s and output class i; C is the number of output classes in a specific domain, and k is usually set to 1 or 2. In Eq. (3), the two ratios are estimates of the probabilities of attribute values given the class (i.e., of the class-conditional probabilities).

Generally, the VDM is mainly suitable for domains with symbolic attributes. Some discretization methods were thus incorporated with the VDM for dealing with numeric attributes, i.e., numeric attributes were discretized into symbolic attributes for the learning tasks.

In [39], three variants of the VDM, namely the Heterogeneous Value Difference Metric (HVDM), the Interpolated Value Difference Metric (IVDM) and the Windowed Value Difference Metric (WVDM), were proposed. The HVDM uses the VDM and the above-mentioned normalized distance (i.e., each distance for numeric attribute i is divided by 4 standard deviations of attribute i) to handle symbolic and numeric attributes, respectively. As for the IVDM and the WVDM, different discretization methods were incorporated with the original version of the VDM. These similarity functions are useful for applications with both numeric and symbolic input attributes. Further details regarding these three similarity functions are mentioned in [39]. Similar to the VDM, however, their complexity increases with the number of classes in the learning tasks. In addition, the PEBLS learning system [32] introduces a variant of the VDM, called the modified VDM (MVDM). MVDM incorporates the VDM with a scheme that adjusts the weight of each attribute for learning. A review of various similarity functions can be found in [39].

3 Grey relational analysis

Since its inception in 1984 [8], grey relational analysis [8–10] has been applied to a wide variety of application
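A sketch of these three functions may make the comparison concrete. This is an illustrative reimplementation, not code from the paper; the dictionary-based count arguments of `vdm` (`value_class_counts`, `value_counts`) are a representation assumed here, with counts taken over the training set for one symbolic attribute s.

```python
import math

def euclidean(x, y):
    # Eq. (1): root of the summed squared attribute differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def manhattan(x, y):
    # Eq. (2): summed absolute attribute differences.
    return sum(abs(a - b) for a, b in zip(x, y))

def vdm(a, b, value_class_counts, value_counts, classes, k=2):
    # Eq. (3): sum over output classes of |N_ai/N_a - N_bi/N_b|^k.
    # The two ratios estimate the class-conditional probabilities of
    # the symbolic values a and b for a single attribute s.
    total = 0.0
    for c in classes:
        p_a = value_class_counts[(a, c)] / value_counts[a]
        p_b = value_class_counts[(b, c)] / value_counts[b]
        total += abs(p_a - p_b) ** k
    return total
```

Note how the cost of `vdm` grows linearly with the number of classes C, which is exactly the dependence the proposed approach avoids.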



domains. As a measurement method, grey relational analysis is used to determine the relationships among a referential observation and the compared observations, based on calculating the grey relational coefficient (GRC) and the grey relational grade (GRG). Consider a set of m + 1 observations {x0, x1, x2, . . . , xm}, where x0 is the referential observation and x1, x2, . . . , xm are the compared observations. Each observation xe includes n attributes and is represented as xe = (xe(1), xe(2), . . . , xe(n)). The grey relational coefficient can be calculated as

    GRC(x0(p), xi(p)) = (Δmin + ζ·Δmax) / (|x0(p) − xi(p)| + ζ·Δmax),    (4)

where Δmin = min_{∀j} min_{∀k} |x0(k) − xj(k)|, Δmax = max_{∀j} max_{∀k} |x0(k) − xj(k)|, ζ ∈ [0, 1] (usually ζ = 0.5), i, j = 1, 2, . . . , m, and k, p = 1, 2, . . . , n.

Here, GRC(x0(p), xi(p)) is considered as the similarity between x0(p) and xi(p). If GRC(x0(p), x1(p)) exceeds GRC(x0(p), x2(p)), then the similarity between x0(p) and x1(p) is larger than that between x0(p) and x2(p); otherwise the former is smaller than the latter. Moreover, if x0 and xi have the same value for attribute p, GRC(x0(p), xi(p)) (i.e., the similarity between x0(p) and xi(p)) will be one. By contrast, if x0 and xi differ to a great extent for attribute p, GRC(x0(p), xi(p)) will be close to zero. Similar methods for dealing with symbolic attributes will be detailed in Section 4. Notably, changing the value of ζ does not affect the GRO or the performance of the proposed learning approach. Restated, the original version of the GRC (as stated in [8]) is used to determine the similarity between two instances in the proposed learning approach.

Accordingly, the grey relational grade between instances x0 and xi is expressed as

    GRG(x0, xi) = (1/n) Σ_{k=1}^{n} GRC(x0(k), xi(k)).    (5)

The primary characteristic of grey relational analysis is as follows. If GRG(x0, x1) is larger than GRG(x0, x2), then the difference between x0 and x1 is smaller than that between x0 and x2; otherwise the former is larger than the latter.

According to the degree of GRG, the grey relational order (GRO) of observation x0 can be stated as follows.

    GRO(x0) = (y01, y02, . . . , y0m),    (6)

where GRG(x0, y01) ≥ GRG(x0, y02) ≥ · · · ≥ GRG(x0, y0m), y0r ∈ {x1, x2, x3, . . . , xm}, r = 1, 2, . . . , m, and y0a ≠ y0b if a ≠ b.

The grey relational orders of observations x1, x2, x3, . . . , xm can be similarly obtained as follows.

    GRO(xq) = (yq1, yq2, . . . , yqm),    (7)

where q = 1, 2, . . . , m, GRG(xq, yq1) ≥ GRG(xq, yq2) ≥ · · · ≥ GRG(xq, yqm), yqr ∈ {x0, x1, x2, . . . , xm}, yqr ≠ xq, r = 1, 2, . . . , m, and yqa ≠ yqb if a ≠ b.

In Eq. (7), the GRG between observations xq and yq1 exceeds those between xq and the other observations (yq2, yq3, . . . , yqm). That is, the difference between xq and yq1 is the smallest.

Grey relational analysis satisfies four principal axioms: (a) normality, (b) dual symmetry, (c) wholeness and (d) approachability [8–10, 29].

(a) Normality—GRG(x0, xi) takes a value between zero and one.
(b) Dual symmetry—If only two observations (x0 and x1) are made in the relational space, then GRG(x0, x1) = GRG(x1, x0).
(c) Wholeness—If three or more observations are made in the relational space, then GRG(x0, xi) often does not equal GRG(xi, x0), ∀i.
(d) Approachability—GRG(x0, xi) decreases as the difference between x0(p) and xi(p) increases (the other values in Eqs. (4) and (5) being held constant).

Based on these axioms, grey relational analysis offers some advantages. For example, it gives a normalized measuring function (normality)—a method for measuring the similarities or differences among observations—to analyze the relational structure. Also, grey relational analysis yields whole relational orders (wholeness) over the entire relational space. As stated in the following section, these properties are useful for instance-based learning.

4 An instance-based learning approach based on grey relational structure

To build a successful instance-based learning model, the relationships among all instances in a specific application domain should be determined. Here, grey relational analysis is used to describe the relational structure of all instances, and then new, unseen instances can be identified.

As mentioned above, if a set of m + 1 instances {x0, x1, x2, . . . , xm} is given, the grey relational order of each instance xq (q = 0, 1, . . . , m) can be expressed as follows.

    GRO(xq) = (yq1, yq2, . . . , yqm),    (8)
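Eqs. (4) and (5) can be sketched as follows — a minimal illustration, not the authors' code, assuming purely numeric attributes (the symbolic case is handled in Section 4) and at least one non-zero attribute difference in the relational space so that Δmax > 0.

```python
def grey_relational_coefficients(x0, xi, compared, zeta=0.5):
    # Eq. (4): per-attribute coefficients. d_min and d_max range over
    # every compared observation x_j and every attribute k.
    n = len(x0)
    diffs = [abs(x0[k] - xj[k]) for xj in compared for k in range(n)]
    d_min, d_max = min(diffs), max(diffs)
    return [(d_min + zeta * d_max) / (abs(x0[p] - xi[p]) + zeta * d_max)
            for p in range(n)]

def grey_relational_grade(x0, xi, compared, zeta=0.5):
    # Eq. (5): the grade is the mean of the n coefficients.
    coeffs = grey_relational_coefficients(x0, xi, compared, zeta)
    return sum(coeffs) / len(coeffs)
```

An identical pair of observations gets grade 1 and the grade stays within [0, 1], matching the normality axiom above; sorting the compared instances by grade then yields the grey relational order of Eqs. (6)–(8).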



where GRG(xq, yq1) ≥ GRG(xq, yq2) ≥ · · · ≥ GRG(xq, yqm), yqr ∈ {x0, x1, x2, . . . , xm}, yqr ≠ xq, r = 1, 2, . . . , m, and yqa ≠ yqb if a ≠ b.

Here, a graphical structure, called the k-level grey relational structure (k ≤ m), is defined as follows to describe the relationships among a referential instance xq and all other instances, where the total number of 'nearest' instances of the referential instance xq (q = 0, 1, . . . , m) is restricted to k.

    GRO*(xq, k) = (yq1, yq2, . . . , yqk),    (9)

where GRG(xq, yq1) ≥ GRG(xq, yq2) ≥ · · · ≥ GRG(xq, yqk), yqr ∈ {x0, x1, x2, . . . , xm}, yqr ≠ xq, r = 1, 2, . . . , k, and yqa ≠ yqb if a ≠ b.

That is, a directed graph, shown in Fig. 1, can be used to express the relational space, where each instance xq (q = 0, 1, . . . , m) as well as its k nearest instances (i.e., yqr, r = 1, 2, . . . , k) are represented by vertices, and each expression GRO*(xq, k) is represented by k directed edges (i.e., xq to yq1, xq to yq2, . . . , xq to yqk).

[Fig. 1 k-level grey relational structure — each instance xq (q = 0, 1, . . . , m) is drawn with k directed edges to its nearest instances yq1, yq2, . . . , yqk]

Here, the characteristics of the proposed k-level grey relational structure are described in detail. First, for each instance xq, k instances (vertices) are connected by the inward edges from instance xq. That is, these instances are the nearest neighbors (with the smallest differences) of instance xq, implying that they evidence the class label of instance xq according to the nearest-neighbor rule [7, 12]. Also, in the one-level grey relational structure, instance yq1, with the largest similarity, is the nearest neighbor of instance xq. Thus, a new, unseen instance can be classified according to its nearest instance in the one-level grey relational structure or its nearest instances in the k-level grey relational structure. Obviously, for classifying a new, unseen instance i, only the k inward edges connected with instance i in the above k-level grey relational structure are needed. In other words, the k nearest neighbors of each unseen instance are considered for the learning tasks (i.e., pattern classification).

Next, an instance-based learning algorithm for pattern classification based on the k-level grey relational structure is detailed. Assume that we have a training set T of m labeled instances, denoted by T = {x1, x2, . . . , xm}, where each instance xe has n attributes and is denoted as xe = (xe(1), xe(2), . . . , xe(n)). For classifying a new, unseen instance x0, the proposed learning procedure is performed as follows.

Step 1. Calculate the grey relational coefficient (GRC) and the grey relational grade (GRG) between x0 and xi, for i = 1, 2, . . . , m.
    If attribute p of each instance xe is numeric, the value of GRC(x0(p), xi(p)) is calculated by Eq. (4).
    If attribute p of each instance xe is symbolic, the value of GRC(x0(p), xi(p)) is calculated as
        GRC(x0(p), xi(p)) = 1, if x0(p) and xi(p) are the same;
        GRC(x0(p), xi(p)) = 0, if x0(p) and xi(p) are different.
    Accordingly, calculate the grey relational grade (GRG) between x0 and xi, for i = 1, 2, . . . , m, by Eq. (5).
Step 2. Calculate the grey relational order (GRO) of x0 based on the degree of GRG(x0, xi), where i = 1, 2, . . . , m.
Step 3. Construct the k-level grey relational structure according to the above grey relational order (GRO) of x0, where k ≤ m. Here, only the k inward edges connected with instance x0 are needed.
Step 4. Classify the new instance x0 by considering the class labels of instances y1, y2, . . . , yk with the majority voting method [7], where y1, y2, . . . , yk are the vertices connected by k inward edges from x0 in the k-level grey relational structure (i.e., instances y1, y2, . . . , yk are the nearest neighbors of instance x0). Notably, the best choice of k used for pattern classification can be determined by cross-validation [35].

As stated in Section 3, GRC(x0(p), xi(p)) can be treated as the similarity between x0(p) and xi(p). If x0 and xi have the same value for symbolic attribute p, GRC(x0(p), xi(p)) (i.e., the similarity between x0(p) and xi(p)) will be set to one. By contrast, if x0 and xi are different for symbolic attribute p, GRC(x0(p), xi(p)) will be set to zero. These settings are
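Steps 1–4 above can be condensed into a small classifier. The sketch below is an illustrative reimplementation of the procedure, not the authors' code: tuple-shaped instances and the `numeric` index set are representation choices made here, and at least one numeric attribute is assumed so that the extrema in Eq. (4) are defined.

```python
from collections import Counter

def grc_mixed(x0, xi, training, numeric, zeta=0.5):
    # Step 1: per-attribute grey relational coefficients; Eq. (4) for
    # numeric attributes, exact match (1 or 0) for symbolic ones.
    diffs = [abs(x0[p] - x[p]) for x in training for p in numeric]
    d_min, d_max = min(diffs), max(diffs)
    coeffs = []
    for p in range(len(x0)):
        if p in numeric:
            coeffs.append((d_min + zeta * d_max) /
                          (abs(x0[p] - xi[p]) + zeta * d_max))
        else:
            coeffs.append(1.0 if x0[p] == xi[p] else 0.0)
    return coeffs

def classify_grs(x0, training, labels, k, numeric, zeta=0.5):
    # Steps 1-2: GRG (Eq. (5)) against every training instance, then
    # the grey relational order by sorting on GRG, largest first.
    grades = []
    for idx, xi in enumerate(training):
        coeffs = grc_mixed(x0, xi, training, numeric, zeta)
        grades.append((sum(coeffs) / len(coeffs), idx))
    grades.sort(reverse=True)
    # Steps 3-4: keep only the k inward edges of x0 and take a
    # majority vote over the labels of those k nearest neighbors.
    votes = Counter(labels[idx] for _, idx in grades[:k])
    return votes.most_common(1)[0][0]
```

Because the extrema in `grc_mixed` range over all training instances, every grade depends on the whole training set — the wholeness property that distinguishes GRA from per-pair Euclidean distances.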



similar to those used in [1]. As mentioned earlier, the similarity function presented here offers some advantages, including normality and wholeness (i.e., asymmetry). That is, this similarity function is appropriate for measuring the similarities or differences among observations and yields whole relational orders (wholeness) over the entire relational space, in which all instances (or patterns) in a specific domain are treated as various vectors.

In some application domains, instances may contain missing attribute values (for example, some datasets in the experiments in Section 5 contain missing attribute values). In this paper, to handle domains that contain missing attribute values, a method presented in [20] for missing attribute value prediction is applied prior to learning (that is, domains with missing attribute values in the experiments in Section 5 are handled by using the prediction method first presented in [20]). In this missing attribute value prediction method, the nearest neighbors of an instance with missing attribute values can be determined. Accordingly, the valid attribute values derived from these nearest neighbors are used to predict those missing values. After predicting (estimating) missing attribute values with high accuracy, an imperfect dataset can be handled as a complete dataset in classification tasks. Finally, the proposed learning approach is applied for classification. Notably, any method used for dealing with missing attribute values probably biases the data.

Assume that we have a training set T of m labeled instances, denoted by T = {x1, x2, . . . , xm}, where each instance xe has n attributes and is denoted as xe = (xe(1), xe(2), . . . , xe(n)). For classifying a new, unseen instance x0 in typical instance-based learning methods (in which the Euclidean distance is used as the similarity function), the Euclidean distance between x0 and xa (1 ≤ a ≤ m) is calculated without considering other training instances (i.e., without considering all xi, i ≠ a, 1 ≤ i ≤ m). By contrast, in the proposed learning approach, all training instances with n attributes will be considered (calculated) to determine the similarity (i.e., GRC and GRG) between x0 and xa (1 ≤ a ≤ m), i.e., the wholeness axiom [8] of GRA in Section 3. This consideration is the main difference between Euclidean-based similarity functions and GRA. In other words, in the proposed learning approach, a whole relational order (i.e., GRO) considering all training instances will be derived for classifying a new, unseen instance.

In addition, let m denote the number of compared instances and n the number of attributes. The time for classifying a new, unseen instance (including the time for calculating the grey relational order) is O(mn + m log m). The time for discovering the best value of k in the proposed learning approach should also be included; these two parts form the overall time complexity of the proposed learning approach. As mentioned earlier, the complexity of using the VDM increases with the number of classes in a specific application domain, i.e., O(mnC), where C is the number of classes. This problem does not appear in the proposed learning approach.

In addition to classification tasks, the above k-level grey relational structure can be used for instance pruning or partial memory learning [21, 26, 40]. For example, an instance may not be connected by any inward edges from

Table 1 Average accuracy (%) of classification for the proposed approach and other methods with HOEM (with k-nn), HVDM (with k-nn) and IVDM (with k-nn) [39], respectively. (k) indicates the best value for k using cross-validation on each classification problem

Dataset              HOEM    HVDM    IVDM    Proposed approach (k)
Allbp                94.89   95.05   95.29   95.48 (13)
Allhyper             97.09   97.00   97.20   97.35 (11)
Allhypo              90.31   90.16   96.11   92.74 (7)
Allrep               96.14   96.31   98.25   97.18 (7)
Australian           81.30   81.72   80.52   81.87 (13)
Autos                74.90   79.79   80.19   76.45 (1)
Breast cancer        70.90   66.73   66.73   70.90 (7)
Breast-w             95.54   95.24   95.74   96.78 (5)
Cpu                  68.04   68.51   65.39   70.09 (3)
Crx                  80.12   81.06   80.27   80.82 (11)
Dis                  98.20   98.42   98.24   97.77 (3)
Echoi                81.09   80.32   79.36   80.94 (7)
Glass                69.45   72.63   70.69   74.13 (3)
Hepatitis            79.40   80.45   81.47   80.84 (7)
Hypothyroid          93.58   93.60   98.06   98.24 (1)
Ionosphere           87.22   86.40   91.14   91.37 (1)
Iris                 94.67   94.67   94.67   95.33 (7)
Letter               96.25   96.01   96.10   95.30 (1)
Liver disorders      61.86   62.89   62.43   63.30 (19)
Mushroom             100.00  100.00  100.00  100.00 (1)
Pageblocks           96.17   96.24   96.34   96.43 (1)
Pimadiabetes         70.49   70.25   68.53   69.40 (17)
Satelliteimage       90.23   90.26   90.14   90.19 (5)
Satelliteimagetest   88.35   88.32   88.41   88.81 (1)
Segment              96.80   97.07   97.33   97.81 (1)
Shuttle              99.05   99.86   99.85   99.94 (1)
Shuttletest          98.88   98.76   98.87   98.99 (1)
Sick                 87.01   86.86   96.84   92.75 (5)
Sickeuthyroid        68.30   68.41   95.08   88.60 (7)
Sonar                85.88   87.20   84.24   86.01 (1)
Soybean              90.51   90.98   92.03   89.31 (1)
Soybeansmall         100.00  100.00  100.00  100.00 (1)
Sponge               84.29   84.38   84.28   85.37 (1)
Tae                  63.71   60.99   60.33   63.51 (1)
Vehicle              70.01   70.90   69.53   70.86 (5)
Voting               93.57   95.17   95.17   93.57 (5)
Vowel                98.52   98.67   98.52   98.95 (1)
Wine                 94.89   95.46   97.47   97.30 (13)
Yeast                53.37   53.72   53.21   53.44 (19)
Zoo                  95.45   95.34   96.43   96.04 (1)
Average              85.91   86.15   87.26   87.35
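The missing-value handling described above can be illustrated with a small sketch. This is a hypothetical stand-in, not the actual prediction method of [20]: here a missing numeric value (marked `None`) is simply replaced by the mean of that attribute over the k nearest donor instances, with nearness measured on the attributes the two instances share.

```python
def impute_missing(data, k=3):
    # Assumes every missing attribute has at least one donor instance
    # holding a valid value for it.
    completed = [list(row) for row in data]
    for i, row in enumerate(data):
        for p, v in enumerate(row):
            if v is None:
                def dist(other):
                    # Compare only on attributes both instances share.
                    shared = [(a, b) for a, b in zip(row, other)
                              if a is not None and b is not None]
                    return sum((a - b) ** 2 for a, b in shared)
                donors = sorted((r for j, r in enumerate(data)
                                 if j != i and r[p] is not None),
                                key=dist)[:k]
                completed[i][p] = sum(r[p] for r in donors) / len(donors)
    return completed
```

As the text notes, any such scheme can bias the data, so the imputed dataset should be treated as an approximation of a complete one.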



Table 2 The statistical analysis for the proposed approach and the other learning methods with HOEM, HVDM and IVDM [39], respectively. A better or worse test result of X/Y under the method S column indicates that the proposed learning approach performs better than method S in X cases

                                 HOEM   HVDM   IVDM   Proposed approach
Better or worse test (B/W test)  29/7   27/11  26/12  −
Wilcoxon test                    99.50  99.20  88.70  −

Table 3 Average accuracy (%) of classification for the proposed approach and other classification algorithms

Dataset  Baseline  DecisionStump  HyperPipes  VFI  1R  NaiveBayes  DecisionTable  C4.5  Proposed approach

Allbp 95.25 94.82 35.04 44.04 95.93 93.89 96.89 97.29 95.48

Allhyper 97.25 97.21 97.93 87.14 97.57 95.68 98.61 98.64 97.35

Allhypo 92.14 95.75 98.21 91.68 96.68 95.00 99.11 99.43 92.74

Allrep 96.89 96.89 96.89 93.07 96.82 93.96 99.21 99.25 97.18

Australian 55.51 85.51 44.93 86.81 85.51 76.67 85.07 85.51 81.87

Autos 32.74 44.93 62.95 58.93 62.90 58.02 80.05 82.52 76.45

Breast cancer 70.30 70.69 69.94 67.13 65.39 74.42 74.14 75.18 70.90

Breast-w 65.52 91.70 88.42 96.00 91.84 96.00 94.42 95.28 96.78

Cpu 57.90 65.10 61.24 55.57 69.86 68.98 67.93 66.05 70.09

Crx 55.51 85.51 60.43 84.78 85.51 77.68 85.51 85.94 80.82

Dis 98.39 98.39 48.61 55.21 98.25 95.14 98.75 99.14 97.77

Echoi 42.58 64.63 70.91 74.14 63.71 86.34 79.19 84.25 80.94

Glass 35.50 44.87 50.97 57.90 58.35 50.39 69.18 66.71 74.13

Hepatitis 79.38 81.29 65.16 83.88 80.00 82.58 83.87 83.23 80.84

Hypothyroid 95.23 97.38 95.51 48.61 97.91 97.85 99.08 99.24 98.24

Ionosphere 64.10 82.61 92.60 94.02 82.04 82.36 89.75 90.90 91.37

Iris 33.33 66.67 93.33 96.67 93.33 96.00 93.33 95.33 95.33

Letter 4.07 7.09 22.25 61.23 17.25 64.11 71.24 88.23 95.30

Liver disorders 57.98 61.17 44.62 59.76 57.40 55.96 56.23 66.38 63.30

Mushroom 51.80 88.68 99.77 99.88 98.52 95.75 100.00 100.00 100.00

Pageblocks 89.77 93.13 91.39 87.01 93.55 90.08 95.85 96.00 96.43

Pimadiabetes 65.11 72.01 35.03 64.84 72.28 75.78 74.48 74.09 69.40

Satelliteimage 24.17 44.74 48.00 71.52 59.82 79.55 82.57 86.11 90.19

Satelliteimagetest 23.50 41.70 58.30 72.70 58.05 79.05 80.80 83.25 88.81

Segment 14.29 28.53 75.50 77.45 63.98 80.30 91.69 97.10 97.81

Shuttle 78.41 86.94 84.15 78.27 94.69 91.52 99.75 99.96 99.94

Shuttletest 79.16 86.77 87.61 82.68 94.65 92.88 99.70 99.90 98.99

Sick 93.89 96.75 93.86 61.93 96.54 92.57 97.54 98.82 92.75

Sickeuthyroid 90.74 94.44 90.74 46.38 94.91 84.00 97.28 97.79 88.60

Sonar 53.38 71.60 59.57 55.76 63.93 65.93 72.55 74.07 86.01

Soybean 13.03 26.06 89.90 82.44 39.41 90.55 83.39 88.93 89.31

Soybeansmall 35.50 57.00 100.00 97.50 83.50 97.50 100.00 98.00 100.00

Sponge 16.07 41.25 80.36 75.89 44.82 84.46 73.04 66.07 85.37

Tae 34.42 35.75 47.04 52.96 42.42 53.63 50.96 53.04 63.51

Vehicle 25.77 39.59 38.55 53.55 52.70 44.45 67.96 73.28 70.86

Voting 61.38 95.62 38.62 90.34 95.62 90.57 94.49 97.00 93.57

Vowel 9.09 17.58 36.67 60.20 34.34 67.78 67.07 80.81 98.95

Wine 39.93 59.51 91.08 95.46 76.93 97.22 91.08 94.41 97.30

Yeast 31.20 40.70 35.85 50.27 40.30 58.09 56.88 54.39 53.44

Zoo 40.64 60.45 94.09 94.09 73.36 95.18 89.18 92.09 96.04

Average 55.02 67.78 69.40 73.69 74.26 81.20 84.70 86.59 87.35



Table 4 The statistical analysis for the proposed approach and the other classification algorithms. A better or worse test result of X/Y under the method S column indicates that the proposed learning approach performs better than method S in X cases

                                 Baseline  DecisionStump  HyperPipes  VFI    1R     NaiveBayes  DecisionTable  C4.5   Proposed approach
Average accuracy                 55.02     67.78          69.40       73.69  74.26  81.20       84.70          86.59  87.35
Better or worse test (B/W test)  37/3      31/9           33/6        35/5   30/10  32/8        21/17          17/21  −
Wilcoxon test                    99.50     99.50          99.50       99.50  99.50  99.50       94.05          54.35  −

other instances in the k-level grey relational structure. In other ods and the proposed approach. Here, “Baseline” means that

words, this instance is rarely used in determining the class the majority class is simply chosen for classification. Simi-

labels of other instances, implying that, it is probably a good larly, Table 4 gives the statistical analysis, including better or

choice for instance pruning in a learning system. worse test (B/W test; for example, a better or worse test re-

sult of 32/8 under NaiveBayes column indicates that the pro-

posed learning approach performs better than NaiveBayes in

5 Experimental results 32 cases) and Wilcoxon Signed Ranks test [37] (i.e., the pro-

posed approach is compared with others), for comparing the

In this section, experiments performed on forty data sets above learning methods. As a result, the proposal presented

(from [5]) are reported to demonstrate the performance of here can yield higher performance, compared to some other

the proposed learning approach. In the experiments, ten-fold classification algorithms.

cross validation [35] was used and applied ten times for each

application domain. That is, the entire data set of each appli-

cation domain was equally divided into ten parts in each trial; 6 Conclusions

each part was used once for testing and the remaining parts

were used for training. Accordingly, the average accuracy of In this paper, an instance-based learning approach based on

classification was obtained. grey relational structure is proposed. Grey relational analy-

Table 1 gives the performances (average classification ac- sis is used to precisely describe the entire relational structure

curacy) of the proposed approach (i.e., h-nn with GRG), the of all instances in a specific application domain. By using

HOEM (with k-nn), HVDM (with k-nn) and IVDM (with k-nn) [39]. The distance functions used for this comparison are described in Section 2. As shown in Table 2, a statistical analysis was carried out for the above methods with the various distance functions, comprising a better-or-worse count (B/W test; for example, a B/W result of 29/7 in the HOEM column means that the proposed learning approach performs better than the HOEM in 29 domains and worse in 7) and the Wilcoxon Signed Ranks test [37], in which the proposed approach is compared with each of the other methods. The Wilcoxon Signed Ranks test was used to test the null hypothesis that the differences in classification accuracy between two methods are distributed symmetrically around zero, i.e., to determine whether one method is significantly more accurate than the other.

Over these forty application domains, the proposed approach is superior to the HOEM (with k-nn) and the HVDM (with k-nn), while its classification accuracy is comparable to that of the IVDM (with k-nn).

Through the above-mentioned grey relational structure, new instances can be identified with high accuracy. Experiments on forty application domains are reported to demonstrate the performance of the proposed approach. The proposed approach clearly outperforms other methods that adopt the Euclidean metric, the Value Difference Metric (VDM), or both, and it also yields higher performance than several other classification algorithms. For some domains with purely symbolic attributes, instance-based learning approaches using the VDM perform better than the proposed learning approach, which is based on the grey relational structure; as pointed out earlier, the VDM is mainly applicable to domains with symbolic attributes. In further work, the VDM will therefore be incorporated into the proposed metric to improve the performance of the corresponding instance-based learning approach.

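The better/worse counts and the Wilcoxon Signed Ranks test used in the comparison above can be sketched in code. This is a minimal illustration, not the authors' implementation: the paired accuracies in the usage example are hypothetical, and the two-sided p-value uses the large-sample normal approximation rather than the exact signed-rank distribution cited from [37].

```python
import math

def compare_classifiers(acc_a, acc_b):
    """Compare paired per-domain accuracies of two classifiers.

    Returns the better/worse counts (B/W test) and the Wilcoxon
    signed-rank statistics with a two-sided, normal-approximation
    p-value. Ties (equal accuracies) are dropped, as in the usual
    formulation of the test.
    """
    diffs = [a - b for a, b in zip(acc_a, acc_b) if a != b]
    if not diffs:                       # no non-tied domains
        return 0, 0, 0.0, 0.0, 1.0
    better = sum(1 for d in diffs if d > 0)
    worse = len(diffs) - better

    # Rank |d| in ascending order, averaging ranks over exact ties.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1           # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)

    # Normal approximation to the null distribution of min(W+, W-).
    n = len(diffs)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd
    p = math.erfc(-z / math.sqrt(2))    # = 2 * Phi(z), two-sided
    return better, worse, w_plus, w_minus, p

# Hypothetical per-domain accuracies (percent) for two methods:
better, worse, w_plus, w_minus, p = compare_classifiers(
    [90, 85, 80, 95, 70], [88, 80, 82, 90, 65])
```

A quoted B/W result such as 29/7 corresponds to `better = 29` and `worse = 7` over the non-tied domains; the published comparison relies on the test as given in [37], so the approximation above is only a sketch of the procedure.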
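As a companion to the further-work remark above, the Value Difference Metric [34, 39] measures the distance between two symbolic attribute values by comparing their class-conditional frequencies, which is why its cost grows with the number of classes. The following is a minimal sketch under stated assumptions: the toy data layout is hypothetical, and the exponent q (and any smoothing) used in the paper's experiments is not specified here, so it is left as a parameter.

```python
from collections import Counter, defaultdict

def vdm(training, attr_index, q=2):
    """Build a VDM distance d(x, y) for one symbolic attribute.

    `training` is a list of (instance, class_label) pairs; the
    conditional probabilities P(c | attribute = x) are estimated
    from frequency counts, so evaluating d costs one term per class.
    """
    value_count = Counter()                    # N(attribute = x)
    value_class_count = defaultdict(Counter)   # N(attribute = x, class = c)
    classes = set()
    for instance, label in training:
        v = instance[attr_index]
        value_count[v] += 1
        value_class_count[v][label] += 1
        classes.add(label)

    def d(x, y):
        total = 0.0
        for c in classes:
            p_x = value_class_count[x][c] / value_count[x]
            p_y = value_class_count[y][c] / value_count[y]
            total += abs(p_x - p_y) ** q
        return total

    return d

# Hypothetical single-attribute dataset: (instance, class_label) pairs.
training = [(("red",), "yes"), (("red",), "yes"),
            (("blue",), "no"), (("blue",), "yes"), (("green",), "no")]
d = vdm(training, 0, q=1)
```

In the HVDM and IVDM of [39], a per-attribute distance of this kind for symbolic attributes is combined with a normalized difference for numeric attributes, which is the style of hybrid the planned VDM-plus-grey-metric combination would resemble.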
Furthermore, various classification algorithms were used for performance comparison, including DecisionStump [41], DecisionTable [27], HyperPipes [41], C4.5 [31], NaiveBayes [25], 1R [18] and VFI [41]; all of these methods are available in [41]. Table 3 gives the performance (average classification accuracy) of the above classification methods.

Acknowledgments  This work was supported in part by the National Digital Archive Program-Research & Development of Technology Division (NDAP-R&DTD), the National Science Council of Taiwan, under grant NSC 94-2422-H-001-006, and by the Taiwan Information Security Center (TWISC), the National Science Council, under grant NSC 94-3114-P-001-001-Y. In addition, the authors would like to thank the National Science Council of Taiwan for financially supporting this research under grant NSC 94-2213-E-022-006.

Springer

250 Appl Intell (2006) 25:243–251

References

1. Aha DW, Kibler D, Albert MK (1991) Instance-based learning algorithms. Mach Learn 6:37–66
2. Aha DW (1992) Tolerating noisy, irrelevant and novel attributes in instance-based learning algorithms. Int J Man-Mach Stud 36(2):267–287
3. An A (2003) Learning classification rules from data. Comp Math Appl 45:737–748
4. Bay SD (1999) Nearest neighbor classification from multiple feature subsets. Intell Data Anal 3:191–209
5. Blake CL, Merz CJ (1998) UCI repository of machine learning databases [http://www.ics.uci.edu/∼mlearn/MLRepository.html]. Department of Information and Computer Science, University of California, Irvine, CA
6. Brouwer RK (1997) Automatic growing of a Hopfield style network during training for classification. Neur Netw 10:529–537
7. Cover TM, Hart PE (1967) Nearest neighbor pattern classification. IEEE Trans Inform Theory 13(1):21–27
8. Deng J (1984) The theory and method of socioeconomic grey systems. Soc Sci China 6:47–60 (in Chinese)
9. Deng J (1989) Introduction to grey system theory. J Grey Syst 1:1–24
10. Deng J (1989) Grey information space. J Grey Syst 1:103–117
11. Elouedi Z, Mellouli K, Smets P (2001) Belief decision trees: theoretical foundations. Int J Approx Reason 28:91–124
12. Fix E, Hodges JL (1951) Discriminatory analysis: nonparametric discrimination: consistency properties. Technical Report, Project 21-49-004, Report Number 4, USAF School of Aviation Medicine, Randolph Field, Texas
13. Freund Y, Mason L (1999) The alternating decision tree learning algorithm. In: Proc. of the 16th international conference on machine learning, Bled, Slovenia, pp 124–133
14. Friedman JH (1977) A recursive partitioning decision rule for nonparametric classification. IEEE Trans Comp, pp 404–408
15. Hattori K, Takahashi M (2000) A new edited k-nearest neighbor rule in the pattern classification problem. Pattern Recog 33:521–528
16. Hewett R, Leuchner J (2003) Restructuring decision tables for elucidation of knowledge. Data Knowl Engin 46:271–290
17. Hickey RJ, Martin RG (2001) An instance-based approach to pattern association learning with application to the English past tense verb domain. Knowl-Based Syst 14:131–136
18. Holte RC (1993) Very simple classification rules perform well on most commonly used datasets. Mach Learn 11:63–91
19. Hu YC, Chen RS, Hsu YT, Tzeng GW (2002) Grey self-organizing feature maps. Neurocomputing 48:863–877
20. Huang CC, Lee HM (2001) A grey-based nearest neighbor approach for predicting missing attribute values. In: Proc. of 2001 national computer symposium, Taiwan, pp B153–159
21. Huang CC, Lee HM (2003) A partial-memory learning system based on grey relational structure. In: Berthold MR et al (eds) Lecture Notes in Computer Science 2810, Springer-Verlag, pp 68–75
22. Huang YP, Huang CH (1997) Real-valued genetic algorithms for fuzzy grey prediction system. Fuzzy Sets Syst 87:265–276
23. Hullermeier E (2003) Possibilistic instance-based learning. Art Intell 148:335–383
24. Ignizio JP, Soltys JR (1996) Simultaneous design and training of ontogenic neural network classifiers. Comp Oper Res 23:535–546
25. John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Proc. of the eleventh conference on uncertainty in artificial intelligence, pp 338–345
26. Kibler D, Aha DW (1987) Learning representative exemplars of concepts: an initial case study. In: Proc. of the fourth international workshop on machine learning, Irvine, CA. Morgan Kaufmann, pp 24–30
27. Kohavi R (1995) The power of decision tables. In: European conference on machine learning, pp 174–189
28. Langley P, Simon HA (1995) Applications of machine learning and rule induction. Commun ACM 38(11):55–64
29. Lin CT, Yang SY (1999) Selection of home mortgage loans using grey relational analysis. J Grey Syst 4:359–368
30. Quinlan JR (1986) Induction of decision trees. Mach Learn 1:81–106
31. Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo, CA
32. Rachlin J, Kasif S, Salzberg S, Aha DW (1994) Towards a better understanding of memory-based and Bayesian classifiers. In: Proc. of the eleventh international machine learning conference, New Brunswick, NJ. Morgan Kaufmann, pp 242–250
33. Salzberg S (1988) Exemplar-based learning: theory and implementation. Technical Report TR-10-88, Center for Research in Computing Technology, Harvard University
34. Stanfill C, Waltz D (1986) Towards memory-based reasoning. Commun ACM 29(12):1213–1228
35. Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J Royal Stat Soc B 36:111–147
36. Tsumoto S (2003) Automated extraction of hierarchical decision rules from clinical databases using rough set model. Expert Syst Appl 24:189–197
37. Watson CJ, Billingsley P, Croft DJ, Huntsberger DV (1993) Statistics for management and economics, 5th edn. Allyn and Bacon, Boston
38. Watson I (1999) Case-based reasoning is a methodology not a technology. Knowl-Based Syst 12:303–308
39. Wilson DR, Martinez TR (1997) Improved heterogeneous distance functions. J Art Intell Res 6:1–34
40. Wilson DR, Martinez TR (2000) Reduction techniques for exemplar-based learning algorithms. Mach Learn 38(3):257–268
41. Witten I, Frank E (2000) Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann, San Francisco, CA


Chi-Chun Huang is currently Assistant Professor in the Department of Information Management at National Kaohsiung Marine University, Kaohsiung, Taiwan. He received the Ph.D. degree from the Department of Electronic Engineering at National Taiwan University of Science and Technology in 2003. His research interests include intelligent Internet systems, grey theory, machine learning, neural networks and pattern recognition.

Hahn-Ming Lee is currently Professor in the Department of Computer Science and Information Engineering at National Taiwan University of Science and Technology, Taipei, Taiwan. He received the B.S. degree and Ph.D. degree from the Department of Computer Science and Information Engineering at National Taiwan University in 1984 and 1991, respectively. His research interests include intelligent Internet systems, fuzzy computing, neural networks and machine learning. He is a member of IEEE, TAAI, CFSA and IICM.

