www.elsevier.com/locate/aei
Abstract
Case-based reasoning (CBR), one of the artificial intelligence (AI) learning approaches, is drawing the attention of many researchers in
Civil Engineering. However, due to vagueness and uncertainties in knowledge representation, attribute description, and similarity measures
in CBR—especially when dealing with similarity assessment—it is difficult to find the cases from a case base which exactly match the query
case. Therefore, fuzzy theories have been incorporated into CBR allowing for more robust, flexible, and accurate models. In this study, two
fuzzy membership functions (trapezoidal and step-wise) and fuzzy numbers are used to measure the similarity between attribute values. They
are integrated into CBR to develop a model used to monitor highway bridge health. This model’s learning capabilities have been validated
using five different error-metrics, based on the cross-validation method. The code is implemented using the programming language C++,
and all the cases used for both training and testing are extracted from the electronic bridge database of the Kansas Department of
Transportation. It is shown from the experimental results that it is feasible to apply fuzzy case-based reasoning to monitor bridge health.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Fuzzy case-based reasoning; Fuzzy membership functions; Bridge health monitoring
retrieval, similarity measure, and reference of cases in CBR. Therefore, fuzzy theory concepts are incorporated into CBR to overcome its drawbacks.

Melhem et al. [15] proposed the general concept of using fuzzy case-based reasoning in bridge management systems, which addresses two objectives: (a) monitoring the health of bridge decks and predicting their future deterioration, and (b) identifying the necessary actions for maintenance, rehabilitation, and replacement (MR&R). Portions of the proposed concept have been implemented (a few months ago) and will be presented by Cheng and Melhem [9]. This paper discusses the detailed implementation and results validation of the model developed for the first objective, i.e., monitoring/predicting the deterioration of bridge decks. Additional details on this task, and complete details and validation of the second task (evaluation of MR&R actions) are presented in full elsewhere [7].

2. Fuzzy set concept

Given a universe $U$ of objects, a conventional crisp subset $S$ of $U$ is commonly defined by specifying the objects of the universe that are members of $S$. An equivalent way of defining $S$ is to specify the characteristic function of $S$, $\mu_S: U \to \{0, 1\}$, where for all $x \in U$:

$$\mu_S(x) = 1, \quad x \in S \tag{1}$$

$$\mu_S(x) = 0, \quad x \notin S \tag{2}$$

For a fuzzy set, the idea of vagueness is introduced by assigning an indicator function that may take on values in the range 0–1. This is called the membership level of $x$ in $S$ and indicates the strength of the belief that $x$ lies within $S$. If $S$ represents a fuzzy set, then the membership level of any element, $x \in S$, is denoted as $\mu_S(x)$, which satisfies:

$$0 \le \mu_S(x) \le 1 \tag{3}$$

A particular value of the membership function can be expressed as:

$$\mu_S: x \to [0, 1] \tag{4}$$

An example of a fuzzy set is the set of real numbers much larger than zero, which can be defined with a membership function as follows:

$$\mu_S(x) = \frac{x^2}{x^2 + 1}, \quad x \ge 0 \tag{5}$$

$$\mu_S(x) = 0, \quad x < 0 \tag{6}$$

According to the above equations, the larger the $x$, the greater $\mu_S(x)$ is, and the closer it is to 1. For example, for $x = 0.5$, $\mu_S(0.5) = 0.2$, and for $x = 20$, $\mu_S(20) = 0.998$. The strength of the belief or the membership level of 20 being 'much larger than zero' is greater than that of 0.5. This fuzzy membership function indicates how much larger than zero these real numbers are. Thus, the impetus behind the introduction of fuzzy set theories was to provide a means of defining categories that are inherently imprecise [4].

3. Concept of case-based reasoning

In CBR, knowledge is represented as cases. A case is a conceptualized piece of knowledge representing an experience [11]. A case usually consists of a problem description, and its corresponding outcome/solution. Whether the case includes all these parts or not depends on the specific problem to be solved. A representative set of cases comprises a case library for a problem domain.

CBR involves the following four cyclical processes: (1) retrieving the most similar case(s), (2) reusing the solutions of the retrieved case(s), (3) revising the proposed solution if necessary, and (4) retaining the new solution as part of a new case [1]. The CBR system retrieves one or more similar cases from the case library when a new problem is encountered. A solution proposed by the most similar case(s) is reused or adapted to solve the new problem. The problem may be retained as a new case in the case library to update the knowledge of the CBR system.

Among the tasks, retrieving the most similar case(s) is the first and most crucial step in CBR since the subsequent steps would not take place without it. Retrieving the most similar case(s) involves evaluating the degrees of similarity between any two cases being compared. The approach commonly used to assess similarity is the distance function. However, imprecision and uncertainties are inherent in this distance function [6]. Therefore, the fuzzy membership functions are here integrated into CBR to deal with imprecision in the similarity measure. When fuzzy theories are incorporated into CBR, the algorithm is usually called fuzzy case-based reasoning (FCBR).

4. Data sets

Data sets have a great effect on how well a machine learning technique performs, so the extraction of data from the database is a very important task. The data set used here was extracted from the electronic bridge database of the Kansas Department of Transportation (KDOT). This database is used by PONTIS [19]. Such computerized data processing is crucial for managing any large inventory of infrastructure assets. PONTIS uses the concept of Commonly-Recognized (CoRe) bridge elements and measures, among other things, the deterioration of a bridge using the health index of the entire bridge. The health index of the bridge is derived from the health indices of the individual CoRe elements, which are a function of the CoRe elements' condition states and corresponding coefficients. These are explained in more detail in the following paragraphs.
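As a quick numerical check of the example membership function of Section 2 (Eqs. (5) and (6), the fuzzy set of real numbers 'much larger than zero'), the following minimal Python sketch evaluates it at the two points quoted in the text. The function name `much_larger_than_zero` is only an illustrative choice and is not taken from the original implementation.

```python
def much_larger_than_zero(x: float) -> float:
    """Membership level of x in the fuzzy set 'much larger than zero'.

    Follows Eqs. (5)-(6): x^2 / (x^2 + 1) for x >= 0, and 0 for x < 0.
    """
    if x < 0:
        return 0.0
    return (x * x) / (x * x + 1.0)


if __name__ == "__main__":
    for value in (-1.0, 0.5, 20.0):
        print(value, round(much_larger_than_zero(value), 3))
```

The value at 0.5 is 0.25/1.25 = 0.2 and the value at 20 is 400/401, which rounds to 0.998, in agreement with the figures quoted in the text.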
Y. Cheng, H.G. Melhem / Advanced Engineering Informatics 19 (2005) 299–315 301
4.1. Commonly-recognized (CoRe) elements

Most states have successfully used the CoRe bridge elements as the basis for data collection, performance measurement, resource allocation, and management decision support [23]. The commonality aspect of CoRe elements depends on definitions that are widely understood by transportation agencies and that are stable over time. In this study, the CoRe element 'Bridge Deck' is used as the basis for the investigation.

4.2. Condition states

The health index for CoRe elements is based on their condition states. The condition states, whose definitions are from the element-level inspection, reflect the levels of deterioration and the influence of deterioration on serviceability. The levels of deterioration of each CoRe element are generally described as follows [23]:

1. Protected. The element's protective materials or systems are sound and functioning as intended to prevent deterioration of the element.
2. Exposed. The element's protective materials or systems have partially or completely failed, leaving the element vulnerable to deterioration.
3. Attacked. The element is experiencing active attack by physical or chemical processes, but is not yet damaged.
4. Damaged. The element has lost important amounts of material, such that its serviceability is suspect.
5. Failed. The element no longer serves its intended function.

the condition of an element at a given time as a point along a continuous timeline from 100% in the best state to 0% in the worst state, even though element condition states are categorical. The element condition state distribution (among states 1 through 5) at any point in time can be obtained from field inspection. The health index of any specific element is the ratio of the sum of the current quantities in each condition state multiplied by a corresponding coefficient, over the initial total quantity of the element. It can be given by the following formula [19, Section 4.2]:

$$H_e = \frac{\sum_s k_s q_s}{\sum_s q_s} \times 100\% \tag{7}$$

where $H_e$ is the health index of an individual element, $s$ denotes the index of the condition state, $q_s$ represents the element quantity in the $s$th condition state, and $k_s$ is the health index coefficient corresponding to the $s$th condition state. The health index coefficients of the condition states $\{k_s\}$ are fractional values calculated as follows:

$$k_s = \frac{n - s}{n - 1} \tag{8}$$

in which $s = 1, 2, \ldots, n$, and $n$ represents the number of applicable condition states, which can be 3, 4, or 5 depending upon the type of CoRe element. The health index coefficients are given in Table 1 [19] according to Eq. (8). The health index of the entire bridge can be evaluated as follows:

$$H = \frac{\sum_e H_e Q_e W_e}{\sum_e Q_e W_e} \tag{9}$$
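The element health index of Eqs. (7) and (8) can be illustrated with a short Python sketch. It is a minimal illustration rather than the authors' code: the function names are hypothetical, and the deck quantities in the usage example are invented for demonstration only.

```python
def health_coefficients(n: int) -> list[float]:
    """Coefficients k_s = (n - s) / (n - 1) for s = 1..n, as in Eq. (8)."""
    if n < 2:
        raise ValueError("at least two condition states are required")
    return [(n - s) / (n - 1) for s in range(1, n + 1)]


def element_health_index(quantities: list[float]) -> float:
    """Element health index in percent, as in Eq. (7).

    quantities[s - 1] is the element quantity (e.g. deck area) currently
    in condition state s; the number of states is len(quantities).
    """
    total = sum(quantities)
    if total <= 0:
        raise ValueError("total quantity must be positive")
    coefficients = health_coefficients(len(quantities))
    weighted = sum(k * q for k, q in zip(coefficients, quantities))
    return 100.0 * weighted / total


if __name__ == "__main__":
    # Hypothetical deck: 50, 30 and 20 square meters in states 1, 2 and 3.
    print(health_coefficients(5))                     # [1.0, 0.75, 0.5, 0.25, 0.0]
    print(element_health_index([50, 30, 20, 0, 0]))   # 82.5
```

With five condition states the coefficients run from 1 for state 1 (Protected) down to 0 for state 5 (Failed), so an element entirely in state 1 scores 100% and one entirely in state 5 scores 0%.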
the health index due to repairs or deck reconstruction were not considered.

From the inventory database, only the parameters (fields) that seem to mostly influence the deterioration of bridge decks were considered. Moreover, environmental factors were not considered as a separate attribute because the bridges selected (located in Districts One and Two) all had the same value in the database (3 = Moderate, i.e., 'Typical level of environmental influence on deterioration') for the parameter Environment. A total of 17 attributes were selected. Some of them were continuous (with real values), including (Re-)Construction Year, Design Load, Main Spans, Max Span, Length, Deck Area, Skew, Operating Rating Load, Inventory Rating Load, Average Daily Traffic (ADT), Average Daily Truck Traffic (ADTT) to ADT (ADTT/ADT), and Health Index; others were symbolic (with discrete values), including Deck Structure Type, Main Span Materials, Main Span Design, Deck Surface Type, and Kind of Highway.

Since the improvement due to repair action or deck reconstruction was not considered, the database fields 'Construction Year' and 'Reconstruction Year' were combined into one attribute '(Re-)Construction Year' with the most recent date selected as the corresponding attribute value.

The attribute values for 'Design Load' were originally listed in the database as follows: M9 (H10), M13.5 (H15), MS13.5 (HS15), M18 (H20), MS18 (HS20), and MS18 (HS20)+Mod. An equivalency system was used to convert such data into numeric values with a range going from 135 to 360 kN (30 kips to 80 kips). The '+Mod' extension refers to the addition of the Alternate Military Loading, which results in a 20–25% increase in load effect (bending moment) depending on the span length. This equivalency was used in consultation with a KDOT bridge design engineer [24].

The attribute 'Operating Rating Load' corresponds to the maximum level of stresses. Thus load ratings describe the maximum permissible live load to which the structure may be subjected. The attribute 'Inventory Rating Load' generally corresponds to the customary design level of stresses, but reflects the existing bridge and material conditions with regard to deterioration and loss of section. Inventory level load ratings allow comparisons with the capacity of new structures, and therefore, result in a live load that can safely utilize an existing structure for an indefinite period of time.

The two attributes ADT and ADTT/ADT are different than the others. Usually, an attribute takes one value in a case; however, both ADT and ADTT/ADT have multiple values for the same bridge. In this research, for each bridge case, ten records (corresponding to the years 1993 through 2002) were extracted from the electronic database of KDOT with their respective values of ADT and ADTT/ADT. Typically, ADT and ADTT traffic values are recorded every year for all bridges while the inspection frequency varies from 1, 2, and sometimes 3 years, depending on the bridge type and condition. Therefore, for each case used in CBR, only the traffic values corresponding to the applicable years of inspection are used.

The health index is a special attribute, since it has the same property as ADT and ADTT/ADT, i.e. taking multiple values in a case, and participating in matching of attributes. However, the health index differs from ADT and ADTT/ADT and the other attributes because the former is also considered as a target/class. In some cases, there are five records for the health index for a bridge, whereas, in others, three or four records are available. Each of these records corresponds to a different inspection date, which starts from 1994 through 2002, or 1993 through 2003, for every 2–4 years.

The attribute Deck Structure Type consists of nine values including: Concrete Deck Coated Bar, Bare Concrete Deck, Protected Concrete Deck AC, Unprotected Concrete Deck AC, Protected Concrete Deck Rigid, Concrete Slab Coated Bar, Bar Concrete Slab, Unprotected Concrete Slab AC, and Protected Concrete Slab Rigid. The attribute Main Span Materials has 4 values: Concrete, Concrete Continuous, Steel Continuous, and Prestressed Concrete Continuous. The attribute Main Span Design includes 5 values: Slab, Stringer/Girder, Girder Floor Beam, Tee Beam, and Multiple Box Beam. The attribute Deck Surface Type takes 5 values: Monolithic Concrete, Integral Concrete, Low Slump Concrete, Latex Concrete, and Bituminous. The attribute Kind of Highway also includes 5 values: Interstate Highway, State Highway, US Numbered Highway, County Highway, and City Street.

For each data instance, the health index of the bridge deck is computed using Eq. (7). The quantities in each of the five condition states are the areas (in square meters) extracted from the Kansas PONTIS inspection data. The weights are those given in the last line of Table 1. The health index is used as a class/target (decision). After the health index is computed, the quantities in the condition states are not used anymore and are 'hidden' from the machine learning process.

5. Evaluation of similarities

The section discusses how to evaluate the degree of similarity for attributes and cases. The ways of measuring similarity between two attribute values depend on the type of the attributes. In this study, fuzzy membership functions are used to measure the similarities between two numeric values for an attribute, while similarity matrices or trees are for enumerated or hierarchical attributes. Two types of fuzzy membership functions for numeric attributes are investigated here: step-wise triangular and trapezoidal. Similarity matrices and trees for enumerated and hierarchical symbolic attributes are determined according to experts, and they may vary within certain ranges. Optimal values for
similarity matrices and trees can be achieved based on a trial-error method.

5.1. Fuzzy membership functions for numeric-valued attributes

Two types of fuzzy membership functions are presented to measure the degree of similarity between any two numeric attribute values: trapezoidal and step-wise triangular.

5.1.1. Trapezoidal fuzzy functions

The trapezoidal membership function is given by Eqs. (10)–(13), and is represented by Fig. 3.

$$\mu(x) = 1 - \frac{|x|}{(1 - \alpha)\Delta}, \quad \text{where } \alpha\Delta \le |x| \le \Delta \tag{10}$$

$$\mu(x) = 1, \quad \text{where } |x| \le \alpha\Delta \tag{11}$$

$$\Delta = a_{\max} - a_{\min} \tag{12}$$

The $\Delta$ is considered a reference value, i.e., if $a_n$ is exactly the maximum or minimum, and $a_o$ is the minimum or maximum, the difference $x$ between the two attribute values reaches the highest value, then their similarity is considered 0. As seen from the above equations, the smaller the difference between two attribute values, the larger the value of $\mu(x)$ is, that means the bigger the degree of similarity between the two attributes, and vice versa.

5.2. Step-wise triangular fuzzy functions

A step-wise triangular membership function may be given by the following Eqs. (14)–(19), and is described by Fig. 4.

$$\mu(x) = 1 - \frac{|x|\,(1 - \mu(t_v))}{t_v}, \quad \text{where } 0 \le |x| \le t_v \tag{14}$$

$$\mu(x) = \frac{\mu(t_n)\,(|x| - t_v) + \mu(t_v)\,(t_n - |x|)}{t_n - t_v}, \quad \text{where } t_v \le |x| \le t_n \tag{15}$$

[Fig. 3. Trapezoidal membership function: $\mu(x)$ plotted against $x$, with axis marks at 0, $\alpha\Delta$, and $\Delta$, and a maximum value of 1.]

[Fig. 4. Step-wise triangular membership function: $\mu(x)$ plotted against $x$, with axis marks at 0, $t_v$, $t_n$, $t_s$, and $\Delta$, levels 1, $\mu(t_v)$, $\mu(t_n)$, and $\mu(t_s)$, and regions labelled Extremely near, Very near, Near, and Slightly near.]
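The two membership functions of Section 5.1 can be sketched in Python as piecewise-linear similarity functions of the attribute difference x. This is a hedged illustration under stated assumptions, not the authors' implementation. For the trapezoidal function, the sketch assumes the sloped branch falls linearly from 1 at |x| = αΔ to 0 at |x| = Δ, which is what the description of Δ (similarity 0 at the largest possible difference) and the axis marks of Fig. 3 suggest; the printed form of Eq. (10) should be checked against the original paper. For the step-wise function, the first two segments follow Eqs. (14) and (15), and the remaining segments are assumed to continue the same linear interpolation through (t_s, μ(t_s)) down to 0 at Δ. All function and parameter names are hypothetical.

```python
def trapezoidal_similarity(x: float, delta: float, alpha: float) -> float:
    """Similarity of two numeric values whose difference is x.

    Assumption: 1 on the plateau |x| <= alpha * delta, then a linear
    descent reaching 0 at |x| = delta. Requires 0 <= alpha < 1.
    """
    d = abs(x)
    if d <= alpha * delta:
        return 1.0
    if d >= delta:
        return 0.0
    return 1.0 - (d - alpha * delta) / ((1.0 - alpha) * delta)


def stepwise_similarity(x: float, delta: float,
                        breakpoints: list[tuple[float, float]]) -> float:
    """Piecewise-linear similarity through (0, 1), the breakpoints, (delta, 0).

    breakpoints is [(tv, mu_tv), (tn, mu_tn), (ts, mu_ts)] with
    0 < tv < tn < ts < delta, in the same units as x.
    """
    d = abs(x)
    if d >= delta:
        return 0.0
    points = [(0.0, 1.0)] + list(breakpoints) + [(delta, 0.0)]
    for (t0, m0), (t1, m1) in zip(points, points[1:]):
        if d <= t1:
            # Same interpolation form as Eq. (15), applied to each segment.
            return (m1 * (d - t0) + m0 * (t1 - d)) / (t1 - t0)
    return 0.0


if __name__ == "__main__":
    bp = [(0.15, 0.85), (0.25, 0.6), (0.5, 0.3)]
    print(trapezoidal_similarity(0.525, delta=1.0, alpha=0.05))
    print(stepwise_similarity(0.2, delta=1.0, breakpoints=bp))
```

In the usage example the differences are taken as already normalized so that Δ = 1, and the breakpoint values are the ones reported later in the paper as locally optimal; whether the paper's thresholds are expressed as fractions of Δ is an assumption of this sketch.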
Table 2
The similarity matrix for the attribute Main Span Design (symmetric; only the lower triangle is listed)

Columns: (1) Slab, (2) Stringer girder, (3) Girder floor beam, (4) Tee beam, (5) Multiple box beam, (6) Frame, (7) Truss deck, (8) Truss thru

                     (1)   (2)   (3)   (4)   (5)   (6)   (7)   (8)
Slab                 1
Stringer girder      0.4   1
Girder floor beam    0.6   0.5   1
Tee beam             0.4   0.8   0.4   1
Multiple box beam    0.3   0.4   0.3   0.8   1
Frame                0.2   0.2   0.2   0.2   0.2   1
Truss deck           0.5   0.2   0.2   0.2   0.2   0.3   1
Truss thru           0.2   0.2   0.2   0.2   0.2   0.3   0.6   1
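Because the matrix in Table 2 is symmetric, only its lower triangle needs to be stored, and a lookup can swap the indices when needed. The Python sketch below shows one way to do this with the values of Table 2; the names `DESIGNS`, `LOWER_TRIANGLE`, and `design_similarity` are illustrative and not taken from the original code.

```python
# Attribute values of Main Span Design, in the row/column order of Table 2.
DESIGNS = [
    "Slab", "Stringer girder", "Girder floor beam", "Tee beam",
    "Multiple box beam", "Frame", "Truss deck", "Truss thru",
]

# Lower triangle of the similarity matrix, one row per attribute value.
LOWER_TRIANGLE = [
    [1.0],
    [0.4, 1.0],
    [0.6, 0.5, 1.0],
    [0.4, 0.8, 0.4, 1.0],
    [0.3, 0.4, 0.3, 0.8, 1.0],
    [0.2, 0.2, 0.2, 0.2, 0.2, 1.0],
    [0.5, 0.2, 0.2, 0.2, 0.2, 0.3, 1.0],
    [0.2, 0.2, 0.2, 0.2, 0.2, 0.3, 0.6, 1.0],
]


def design_similarity(a: str, b: str) -> float:
    """Degree of similarity between two Main Span Design values."""
    i, j = DESIGNS.index(a), DESIGNS.index(b)
    if i < j:
        i, j = j, i  # mirror about the diagonal
    return LOWER_TRIANGLE[i][j]


if __name__ == "__main__":
    print(design_similarity("Stringer girder", "Tee beam"))  # 0.8
    print(design_similarity("Tee beam", "Stringer girder"))  # 0.8
```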
the FCBR system can be optimized. The other symbols are the same as the ones in the trapezoidal fuzzy function.

5.3. Fuzzy numbers (similarity matrices/trees) for symbolic-valued attributes

Since the values of symbolic attributes cannot be represented on the basis of a unified scale, the difference between these values cannot be computed. In addition, the values of a symbolic attribute can be represented in many formats, which may cause problems in calculations of similarity [11]. In order to avoid these problems, similarity has to be addressed in a different way, and for this purpose symbolic attributes are divided into two categories: enumerated attribute and hierarchical attribute. Their similarities are determined using similarity matrices and similarity trees, respectively.

5.3.1. Enumerated symbolic-valued attribute

An enumerated attribute is the first type of symbolic attributes. A similarity matrix is most commonly used to define the similarity between the values of the enumerated attribute. In similarity matrices, the numbers (also called fuzzy numbers) in the rows and columns denote the degree of similarity between each pair of values of an enumerated attribute and are determined by domain experts. Since the similarity matrix is symmetrical, the degree of similarity can be mirrored about the matrix diagonal. Consequently, only the degree of similarity for the upper or lower triangle of the similarity matrix needs to be determined.

As discussed earlier, there are five symbolic attributes in our problem domain. The four attributes: Kind of Highway, Main Span Design, Deck Surface Type, and Structural Type, belong to the enumerated type. The fifth attribute, Main Span Materials, is of the hierarchical type and will be discussed later. Table 2 shows a possible similarity matrix of the attribute Main Span Design. It may be noted again that all similarity matrices are symmetrical.

5.3.2. Hierarchical symbolic-valued attribute

A hierarchical attribute is the other type of symbolic attributes with values defined as nodes of a taxonomy tree. A taxonomy tree is a tree in which the root is a hierarchical attribute and the nodes are all-possible values of that attribute [11]. Taxonomy tree technique differs from similarity matrix technique in terms of the knowledge it expresses [16]. This approach expresses the relationship between attribute values through their position in the taxonomy tree [3]. By going from the root to the leaves of the tree, attribute values become more specialized. Fig. 5 shows an example of a taxonomy tree for the symbolic attribute Main Span Materials.

In Fig. 5, the degree of similarity between Concrete/Continuous/Reinforced and Concrete/Simply Supported/Reinforced is evaluated as 1 × 0.6 = 0.6, which stands for the difference in support conditions only. This degree of similarity differs from the one between Concrete/Continuous/Reinforced and Concrete/Simply Supported/Pre-stressed, which is computed as 0.8 × 0.6 = 0.48. This value is more reasonable since it accounts for the difference in support conditions as well as reinforcement.

5.4. Overall similarities between two cases

The global similarity between any two cases can be calculated using Eqs. (20)–(22) [11]:

$$\mathrm{GSim}(x_n, x_o) = \frac{\mathrm{TSim}}{\mathrm{TWt}} \tag{20}$$

$$\mathrm{TSim} = \sum_{i=1}^{k} \mathrm{LSim}_i \times \mathrm{Wt}_i \tag{21}$$

$$\mathrm{TWt} = \sum_{i=1}^{k} \mathrm{Wt}_i \tag{22}$$

where $\mathrm{GSim}(x_n, x_o)$ stands for the global similarity between a new/query case and an old/past case, TSim denotes the total attribute similarity, $\mathrm{LSim}_i$ represents the local similarity between new and old case values for attribute $i$ evaluated using the appropriate similarity measures mentioned in previous sections (Eqs. (10)–(19), and similarity matrices/trees), TWt is the total attribute weight, and $\mathrm{Wt}_i$ is the weight of attribute $i$.
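The global similarity of Eqs. (20)–(22) is a weighted average of the local attribute similarities. A minimal Python sketch follows; the function name and the sample similarities and weights are hypothetical and serve only to illustrate the arithmetic.

```python
def global_similarity(local_sims: list[float], weights: list[float]) -> float:
    """Weighted average of local similarities, as in Eqs. (20)-(22).

    local_sims[i] is the local similarity for attribute i (from a fuzzy
    membership function, a similarity matrix, or a taxonomy tree) and
    weights[i] is the weight of that attribute.
    """
    if len(local_sims) != len(weights):
        raise ValueError("one weight is needed per attribute")
    total_weight = sum(weights)                                  # TWt, Eq. (22)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    total_sim = sum(s * w for s, w in zip(local_sims, weights))  # TSim, Eq. (21)
    return total_sim / total_weight                              # GSim, Eq. (20)


if __name__ == "__main__":
    # Three attributes with invented local similarities and weights.
    print(global_similarity([1.0, 0.5, 0.8], [2.0, 1.0, 1.0]))  # 0.825
```

When all local similarities lie in [0, 1] and the weights are non-negative, the result also lies in [0, 1], so it can be compared directly with a similarity threshold.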
[Fig. 5. Taxonomy tree for the symbolic attribute Main Span Materials. Node labels: Steel, Concrete, Support, Simply Supported, Continuous, Reinforcement. Similarity values shown in the tree: 0.0, 0.6, 0.8.]
where k is the number of nearest neighbors, denotes the class value of the ith (i = 1, 2, …, k) neighbor, and represents the prediction of the new/test case.

6.2. Distance-weighted KNN fuzzy predictor

The algorithm of the distance-weighted fuzzy predictor weights the contribution of each of the k neighbors in terms of their distance to a new/test case, giving greater weight to

7. Experiments

Error measures and the correlation coefficient (cc) are used to evaluate the performance of the FCBR model. The following four metrics of error measures are explored here: root mean-squared error (rmse), mean absolute error (mae), root relative-squared error (rrse), and relative absolute error (rae). These are discussed in the next section.

Ten-fold cross-validation is normally considered the standard approach for prediction quality of a learning
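The two fuzzy predictors of Section 6 can be illustrated with the following Python sketch. It is a sketch under stated assumptions: the mean-valued predictor is taken to average the class values (health indices) of the k most similar cases, and the distance-weighted predictor is approximated by weighting each neighbor by its global similarity to the query case. The exact weighting formula of the paper is not reproduced in this text, so the similarity weighting is a stand-in, and all names are hypothetical.

```python
Neighbor = tuple[float, float]  # (global similarity to the query, class value)


def _top_k(neighbors: list[Neighbor], k: int) -> list[Neighbor]:
    """The k retrieved cases with the highest similarity to the query."""
    if k < 1 or not neighbors:
        raise ValueError("need k >= 1 and at least one retrieved case")
    return sorted(neighbors, key=lambda n: n[0], reverse=True)[:k]


def mean_valued_prediction(neighbors: list[Neighbor], k: int) -> float:
    """Plain average of the class values of the k most similar cases."""
    top = _top_k(neighbors, k)
    return sum(value for _, value in top) / len(top)


def similarity_weighted_prediction(neighbors: list[Neighbor], k: int) -> float:
    """Average of the k class values, weighted by similarity (assumed scheme)."""
    top = _top_k(neighbors, k)
    total = sum(sim for sim, _ in top)
    if total == 0:
        return mean_valued_prediction(neighbors, k)
    return sum(sim * value for sim, value in top) / total


if __name__ == "__main__":
    # Invented retrieved cases: (similarity, health index in percent).
    cases = [(0.6, 70.0), (0.9, 80.0), (0.3, 40.0)]
    print(mean_valued_prediction(cases, k=2))          # 75.0
    print(similarity_weighted_prediction(cases, k=2))  # close to 76.0
```

With k = 1 both functions return the class value of the single most similar case, so the two predictors coincide in that setting.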
10 folds is that a data set D is randomly divided into 10 parts, each of which is approximately 10% of D. Nine parts are used as training set for prediction, and the remaining part is used as a test set for evaluating the prediction based on the training set. Each of the 10

not simply 'hit' or 'miss'. Let $e_1, e_2, \ldots, e_n$ stand for the

and is given by

$$\mathrm{rmse} = \sqrt{\frac{(e_1 - a_1)^2 + \cdots + (e_n - a_n)^2}{n}} \tag{27}$$

calculated as

Table 3
The effect of the parameter $\alpha$ of the trapezoidal membership function on mean absolute error
[Table body not recovered.]

Table 4
The effect of $t_v$, $t_n$, $t_s$, $\mu(t_v)$, $\mu(t_n)$, and $\mu(t_s)$ of the step-wise membership function on mean absolute error
[Only fragments of the table body survive. Rows are labelled by fuzzy predictor (M.V. kNN and D.W. kNN); the column labels 10, 3, 2, and 1 appear near the fragments, but their assignment to the columns below is uncertain.]
Average: 0.0396, 0.0474, 0.0358, 0.0436, 0.0367, 0.0446, 0.0419, 0.0472, 0.0345, 0.0421, 0.0358, 0.0442
Column fragment: 0.0317, 0.0329, 0.0267, 0.0322, 0.0267, 0.0311, 0.0350, 0.0366, 0.0233, 0.0282, 0.0233, 0.0268
Column fragment: 0.0139, 0.0216, 0.0139, 0.0204, 0.0208, 0.0323, 0.0226, 0.0245, 0.0208, 0.0324, 0.0208, 0.0334
Table 5
The optimal values of the parameters for the trapezoidal and the step-wise

                                  Mean-valued   Distance-weighted
Trapezoidal   α                   0.05          0.05
Step-wise     tv                  0.15          0.15
              ts                  0.50          0.50
              μ(tv), μ(tn), μ(ts)
0.025, 0.25, 0.80
0.1, 0.2, 0.6
The mean-squared error tends to exaggerate the influence of the instances whose prediction error is larger than the others, but the absolute error does not have this effect, since all sizes of error are dealt with equally according to their magnitude. These two methods measure the absolute errors, but sometimes the averages of absolute error will be meaningless. Therefore, methods measuring relative errors become of importance. The measures
Table 6
The effect of the number k of nearest neighbors on the fuzzy predictors
k No. of run Fuzzy membership functions
Mean-valued fuzzy predictor Distance-weighted fuzzy predictor
Trapezoidal Step-wise Trapezoidal Step-wise
1 1 0.0590 0.0729 0.0590 0.0729
2 0.0208 0.0313 0.0208 0.0313
3 0.0486 0.0729 0.0486 0.0729
4 0.0625 0.1042 0.0625 0.1042
5 0.0417 0.0417 0.0417 0.0417
6 0.0313 0.0417 0.0313 0.0417
7 0.0417 0.0417 0.0417 0.0417
8 0.0521 0.0313 0.0521 0.0313
9 0.0521 0.0417 0.0521 0.0417
10 0.04 0.03 0.04 0.03
Average 0.0450 0.0509 0.0450 0.0509
2 1 0.0503 0.066 0.056 0.0701
2 0.0156 0.0156 0.0173 0.0228
3 0.0347 0.0451 0.0393 0.0525
4 0.0365 0.0521 0.0455 0.0668
5 0.0313 0.026 0.0366 0.0327
6 0.0521 0.0365 0.0562 0.0368
7 0.0417 0.0156 0.0441 0.0136
8 0.0625 0.0469 0.06 0.049
9 0.0399 0.0313 0.0409 0.0313
10 0.0275 0.03 0.0297 0.03
Average 0.0392 0.0365 0.0426 0.0405
3 1 0.0527 0.0613 0.0569 0.066
2 0.0139 0.0208 0.015 0.0262
3 0.0405 0.0370 0.0428 0.0384
4 0.0382 0.0521 0.0462 0.0633
5 0.0295 0.0208 0.0327 0.0272
6 0.0417 0.0347 0.048 0.0363
7 0.026 0.0139 0.0365 0.0168
8 0.0475 0.037 0.0472 0.0365
9 0.0521 0.0452 0.0591 0.0482
10 0.0317 0.0233 0.0321 0.0268
Average 0.0374 0.0346 0.0416 0.0386
4 1 0.0447 0.0564 0.0514 0.0617
2 0.013 0.0247 0.0143 0.0308
3 0.0512 0.033 0.0461 0.0344
4 0.0313 0.0469 0.0437 0.0564
5 0.0352 0.0313 0.0346 0.0343
6 0.0547 0.0339 0.0579 0.0368
7 0.0247 0.0135 0.0312 0.0161
8 0.0538 0.0391 0.0516 0.0382
9 0.0482 0.0365 0.0527 0.0382
10 0.0288 0.0275 0.0307 0.0335
Average 0.0386 0.0343 0.0414 0.0380
5 1 0.0483 0.0576 0.0554 0.0617
2 0.0229 0.0323 0.0261 0.0365
3 0.0514 0.0347 0.0462 0.0355
4 0.0458 0.0479 0.0563 0.0574
5 0.0344 0.0281 0.0352 0.032
6 0.0549 0.0403 0.0574 0.0426
7 0.0323 0.0198 0.0351 0.0232
8 0.0493 0.0375 0.0488 0.0396
9 0.051 0.0354 0.0546 0.0377
10 0.021 0.024 0.0245 0.0305
Average 0.0411 0.0358 0.044 0.0397
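The choice of k from the per-k averages in Table 6 amounts to picking the k with the smallest average mean absolute error. The short Python sketch below does this for the trapezoidal membership function, using the averages copied from Table 6; the variable and function names are illustrative only.

```python
# Average mean absolute error over 10 runs, by k, from Table 6
# (trapezoidal membership function).
MEAN_VALUED_TRAPEZOIDAL = {1: 0.0450, 2: 0.0392, 3: 0.0374, 4: 0.0386, 5: 0.0411}
DISTANCE_WEIGHTED_TRAPEZOIDAL = {1: 0.0450, 2: 0.0426, 3: 0.0416, 4: 0.0414, 5: 0.0440}


def best_k(average_mae_by_k: dict[int, float]) -> int:
    """Return the k with the smallest average mean absolute error."""
    return min(average_mae_by_k, key=average_mae_by_k.get)


if __name__ == "__main__":
    print(best_k(MEAN_VALUED_TRAPEZOIDAL))        # 3
    print(best_k(DISTANCE_WEIGHTED_TRAPEZOIDAL))  # 4
```

The averages themselves can be checked the same way: the ten mean-valued trapezoidal runs at k = 3 in Table 6 sum to 0.3738, which gives the tabulated average of 0.0374.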
Table 7
The performance of the distance-weighted fuzzy predictor at k = 3
Metrics No. of run Fuzzy functions Metrics No. of run Fuzzy functions
Trapezoidal Step-wise Trapezoidal Step-wise
mae 1 0.0569 0.066 rae 1 0.2756 0.3114
2 0.015 0.0262 2 0.0692 0.1159
3 0.0428 0.0384 3 0.1977 0.1769
4 0.0462 0.0633 4 0.2531 0.372
5 0.0327 0.0272 5 0.1762 0.141
6 0.048 0.0362 6 0.2354 0.1785
7 0.0365 0.0168 7 0.1417 0.0623
8 0.0472 0.0365 8 0.2518 0.1846
9 0.0591 0.0482 9 0.237 0.2021
10 0.0321 0.0267 10 0.1544 0.1284
Average 0.0416 0.0386 Average 0.1992 0.1873
rmse 1 0.1546 0.1617 cc 1 0.8373 0.818
2 0.0369 0.0773 2 0.9901 0.958
3 0.0946 0.085 3 0.9329 0.9488
4 0.1146 0.1269 4 0.9348 0.9113
5 0.0767 0.0717 5 0.9638 0.971
6 0.0905 0.0823 6 0.9358 0.943
7 0.0818 0.0483 7 0.9652 0.9872
8 0.1029 0.1155 8 0.899 0.8809
9 0.1259 0.1039 9 0.8946 0.9266
10 0.0963 0.0975 10 0.9276 0.9257
Average 0.0975 0.097 Average 0.9281 0.9271
rrse 1 0.5726 0.599
2 0.1447 0.3031
3 0.4013 0.3606
4 0.4146 0.4591
5 0.2893 0.2706
6 0.3903 0.355
7 0.284 0.1679
8 0.4508 0.5063
9 0.5042 0.4159
10 0.385 0.3898
Average 0.3837 0.3827
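The five measures reported in Table 7 (mae, rmse, rae, rrse, and cc) can be sketched in Python as follows. Only the rmse formula (Eq. (27)) appears in this text; for the other four the sketch uses the usual textbook definitions, which are assumed, not confirmed, to match the ones in the paper. The relative measures compare the errors with those of a predictor that always returns the mean of the actual values, and cc is the Pearson correlation coefficient. The sample data are invented.

```python
import math


def error_measures(predicted: list[float], actual: list[float]) -> dict[str, float]:
    """mae, rmse, rae, rrse and cc for paired predicted and actual values."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    abs_err = sum(abs(p - a) for p, a in zip(predicted, actual))
    sq_err = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    abs_dev = sum(abs(a - mean_a) for a in actual)   # errors of the mean predictor
    sq_dev_a = sum((a - mean_a) ** 2 for a in actual)
    sq_dev_p = sum((p - mean_p) ** 2 for p in predicted)
    covariance = sum((p - mean_p) * (a - mean_a) for p, a in zip(predicted, actual))
    return {
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),               # Eq. (27)
        "rae": abs_err / abs_dev,
        "rrse": math.sqrt(sq_err / sq_dev_a),
        "cc": covariance / math.sqrt(sq_dev_p * sq_dev_a),
    }


if __name__ == "__main__":
    result = error_measures([2.0, 4.0, 6.0, 8.0], [1.0, 5.0, 5.0, 9.0])
    for name, value in result.items():
        print(name, round(value, 4))
```

The sketch does not guard against constant actual or predicted values, for which the relative measures and the correlation coefficient are undefined.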
7.2. Optimization of parameters

7.2.1. Optimization of the trapezoidal and the step-wise triangular fuzzy functions

In order to obtain the best performance of the FCBR model, the parameters for the trapezoidal and the step-wise membership functions first had to be investigated since these fuzzy membership functions are used to evaluate the degree of similarity. As shown in Fig. 3, the trapezoidal membership function has only one parameter, $\alpha\Delta$, to be determined (where $\Delta$ is known), while Fig. 4 shows that there are 6 parameters in the step-wise membership function: $t_v$, $t_n$, $t_s$, $\mu(t_v)$, $\mu(t_n)$, and $\mu(t_s)$. These parameters have been explained in Section 5.1. The purpose of this section is to find the best values of these parameters such that this FCBR model can achieve the optimal performance. Thus, the influence of these parameters on the FCBR model is mainly explored in this section. Mean Absolute Error is selected to find the local optimal values of these parameters based on 10-fold cross-validation. Table 3 shows the effect of different parameters of the trapezoidal membership function on the mean absolute errors that are obtained by setting k = 3 and threshold of similarity degree = 0.5. The different combinations of the six parameters of the step-wise triangular membership function yield Table 4, which results from the same settings as for the former.

As shown in Table 3, when $\alpha$ equals 0.05, the Mean-Valued kNN results in the average mean absolute error of 0.0377, and the Distance-Weighted kNN yields the average mean absolute error of 0.0461. Both these values are the smallest among the mean absolute errors produced by Mean-Valued kNN and Distance-Weighted kNN, respectively. Therefore, 0.05 is the local optimal value of $\alpha$. From Table 4, when $t_v$, $t_n$, and $t_s$ take the values of 0.15, 0.25, 0.5, and $\mu(t_v)$, $\mu(t_n)$, and $\mu(t_s)$ are equal to 0.85, 0.6, and 0.3 correspondingly, both the Mean-Valued k-NN and the Distance-Weighted k-NN achieve the smallest mean-absolute error of 0.0345, and 0.0421, respectively. So this combination of the six parameters is locally the best. These values are shown in bold in the two tables. Moreover, endless values for all parameters make it impossible to try each of them, so the best discussed here is called the local, rather than global,
Table 8
The performance of the distance-weighted fuzzy predictor at k = 4
Metrics No. of run Fuzzy functions Metrics No. of run Fuzzy functions
Trapezoidal Step-wise Trapezoidal Step-wise
mae 1 0.0514 0.0617 rae 1 0.2561 0.2938
2 0.0143 0.0308 2 0.0654 0.1346
3 0.0461 0.0344 3 0.2158 0.1652
4 0.0437 0.0564 4 0.2399 0.321
5 0.0346 0.0343 5 0.1827 0.1807
6 0.0579 0.0368 6 0.2883 0.1872
7 0.0312 0.0161 7 0.1205 0.0591
8 0.0516 0.0382 8 0.2669 0.1908
9 0.0527 0.0382 9 0.2235 0.1676
10 0.0307 0.0335 10 0.1515 0.1692
Average 0.0414 0.0381 Average 0.2011 0.1869
rmse 1 0.1371 0.1591 cc 1 0.8724 0.8245
2 0.036 0.0727 2 0.9909 0.9653
3 0.036 0.0727 3 0.9491 0.9605
4 0.1113 0.1212 4 0.9414 0.9178
5 0.0757 0.0771 5 0.9616 0.9623
6 0.0934 0.069 6 0.9298 0.9594
7 0.0714 0.0473 7 0.9723 0.9879
8 0.1133 0.1157 8 0.8795 0.8817
9 0.1074 0.0903 9 0.9167 0.9427
10 0.0795 0.0951 10 0.9523 0.9287
Average 0.0861 0.092 Average 0.9366 0.9331
rrse 1 0.5077 0.5891
2 0.141 0.2853
3 0.3607 0.3074
4 0.4028 0.4386
5 0.2858 0.2911
6 0.4032 0.2979
7 0.2478 0.1644
8 0.4964 0.5071
9 0.4298 0.3613
10 0.3179 0.3802
Average 0.3593 0.3622
optimal value. The optimal values of these parameters are summarized in Table 5.

7.2.2. The optimal values of k for the fuzzy predictors

This section explores the influence of the number of nearest neighbors, k, on the fuzzy predictors. Ten-fold cross-validations result in Table 6 by setting the threshold of similarity degree equal to 0.5 and the parameters for the fuzzy membership functions to be the optimal values obtained previously. The best averages over 10 runs and the corresponding k values are shown in bold. As shown in Table 6, the mean-valued fuzzy predictor achieved the smallest average of mean absolute errors at k = 3 no matter which fuzzy function was used (the trapezoidal 0.0374 and the step-wise 0.0346). Therefore, for the mean-valued fuzzy predictor, the optimal value of k is 3. For the distance-weighted predictor, the trapezoidal and step-wise fuzzy functions led to the smallest values of 0.0414 and 0.0380, respectively, when k equals 4. Thus, the best value of k is 4 for the fuzzy distance-weighted predictor.

7.2.3. Evaluation of fuzzy predictors

The two fuzzy k-NN predictors discussed previously were investigated, and the five methods of measuring the errors are applied to compare the learning performance of the two predictors. All the following experiments are based on 10-fold cross-validation, threshold value of 0.5, and the local optimal values of the parameters for the trapezoidal and the step-wise fuzzy membership functions identified previously. As found earlier, for the distance-weighted fuzzy predictor, the optimal values of k corresponding to the trapezoidal and the step-wise triangular functions are equal, i.e. 4, obtained using Mean-Absolute Error. The optimal value of k for mae is not necessarily the best for other error metrics. Therefore, for verification of the optimal value obtained by mae, all errors and correlation coefficients due to the above two fuzzy membership functions are shown based on k = 3, 4, and 5. Tables 7–9 are the results obtained by setting k to 3, 4, and 5, respectively. rmse due to the trapezoidal fuzzy function and rae due to the step-wise fuzzy function are the best at k = 4, which completely matches the one
Table 9
The performance of the distance-weighted fuzzy predictor at k = 5 (Trap.: trapezoidal fuzzy function; Step: step-wise fuzzy function)
Run      mae             rmse            rrse            rae             cc
         Trap.   Step    Trap.   Step    Trap.   Step    Trap.   Step    Trap.   Step
1        0.0554  0.0617  0.1406  0.1601  0.5206  0.5928  0.2713  0.2941  0.8628  0.8228
2        0.0261  0.0365  0.0513  0.0709  0.2014  0.2782  0.1179  0.1620  0.9827  0.9678
3        0.0462  0.0355  0.0789  0.0723  0.3346  0.3069  0.2171  0.1702  0.9570  0.9599
4        0.0563  0.0574  0.1185  0.1201  0.4287  0.4345  0.3195  0.3280  0.9253  0.9228
5        0.0352  0.0320  0.0706  0.0735  0.2664  0.2775  0.1814  0.1671  0.9655  0.9645
6        0.0574  0.0426  0.0919  0.0750  0.3967  0.3236  0.2870  0.2102  0.9312  0.9556
7        0.0337  0.0242  0.0688  0.0588  0.2391  0.2041  0.1279  0.0914  0.9741  0.9803
8        0.0488  0.0396  0.1108  0.0994  0.4855  0.4355  0.2552  0.2110  0.8843  0.9059
9        0.0546  0.0377  0.1066  0.0827  0.4267  0.3309  0.2295  0.1717  0.9185  0.9500
10       0.0245  0.0305  0.0683  0.0874  0.2729  0.3493  0.1219  0.1549  0.9656  0.9393
Average  0.0438  0.0398  0.0906  0.0900  0.3573  0.3533  0.2129  0.1961  0.9367  0.9369
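The five metrics reported in these tables can be computed from the actual and predicted values of one test fold. The following is a minimal sketch, assuming the standard definitions used in the data-mining literature (rae and rrse are taken relative to a baseline that always predicts the mean of the actual values, and cc is the Pearson correlation coefficient). The function name `error_metrics` and the sample values are hypothetical and are not taken from the paper or the KDOT data.

```python
import math


def error_metrics(actual, predicted):
    """Return mae, rmse, rrse, rae and cc for two equal-length sequences.

    Assumed (standard) definitions, not confirmed by the source:
      mae  = mean(|p - a|)
      rmse = sqrt(mean((p - a)^2))
      rae  = sum(|p - a|) / sum(|a - mean(a)|)
      rrse = sqrt(sum((p - a)^2) / sum((a - mean(a))^2))
      cc   = Pearson correlation between actual and predicted values
    """
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")

    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n

    abs_err = sum(abs(p - a) for a, p in zip(actual, predicted))
    sq_err = sum((p - a) ** 2 for a, p in zip(actual, predicted))

    # Errors of the baseline that always predicts the mean actual value.
    base_abs = sum(abs(a - mean_a) for a in actual)
    base_sq = sum((a - mean_a) ** 2 for a in actual)

    # Terms of the Pearson correlation coefficient.
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_p = sum((p - mean_p) ** 2 for p in predicted)

    if base_sq == 0 or var_p == 0:
        raise ValueError("relative errors and cc are undefined for constant data")

    return {
        "mae": abs_err / n,
        "rmse": math.sqrt(sq_err / n),
        "rrse": math.sqrt(sq_err / base_sq),
        "rae": abs_err / base_abs,
        "cc": cov / math.sqrt(base_sq * var_p),
    }


# Hypothetical fold: three test cases with values scaled to [0, 1].
metrics = error_metrics([0.0, 0.5, 1.0], [0.1, 0.5, 0.9])
```

Under these definitions, rae and rrse are ratios to the baseline error, which is why they can be read as percentages, while mae and rmse are in the units of the predicted attribute.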
Y. Cheng, H.G. Melhem / Advanced Engineering Informatics 19 (2005) 299–315 313
Table 10
The performance of the mean-valued fuzzy predictor at k = 3 (Trap.: trapezoidal fuzzy function; Step: step-wise fuzzy function)
Run      mae             rmse            rrse            rae             cc
         Trap.   Step    Trap.   Step    Trap.   Step    Trap.   Step    Trap.   Step
1        0.0527  0.0613  0.1519  0.1591  0.5624  0.5891  0.2521  0.2961  0.8438  0.8244
2        0.0139  0.0208  0.0340  0.0589  0.1334  0.2311  0.0640  0.0930  0.9917  0.9756
3        0.0405  0.0370  0.0858  0.0769  0.3640  0.3263  0.1892  0.1730  0.9456  0.9587
4        0.0382  0.0521  0.0884  0.1006  0.3198  0.3641  0.1964  0.2830  0.9632  0.9483
5        0.0295  0.0208  0.0706  0.0589  0.2666  0.2224  0.1581  0.1062  0.9671  0.9798
6        0.0417  0.0347  0.0798  0.0798  0.3443  0.3443  0.2025  0.1724  0.9510  0.9455
7        0.0260  0.0139  0.0619  0.0417  0.2151  0.1447  0.1014  0.0523  0.9808  0.9903
8        0.0475  0.0370  0.0984  0.1159  0.4310  0.5080  0.2455  0.1882  0.9092  0.8794
9        0.0521  0.0451  0.1185  0.1000  0.4745  0.4003  0.2151  0.1919  0.9042  0.9324
10       0.0317  0.0233  0.0870  0.0833  0.3477  0.3331  0.1533  0.1139  0.9407  0.9450
Average  0.0374  0.0346  0.0876  0.0875  0.3459  0.3463  0.1778  0.1670  0.9397  0.9379
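The two predictors compared in these tables differ in how they combine the target values of the retrieved cases, and the value of k is chosen as the one with the smallest average mae over the cross-validation runs. The sketch below illustrates that idea only. It assumes each candidate case is a (similarity, value) pair, that cases below the similarity threshold of 0.5 are discarded, that the mean-valued predictor takes a plain average, and that the distance-weighted predictor weights each neighbor by its similarity degree. The exact weighting used in the paper is defined in an earlier section and may differ; all function names and numbers here are hypothetical.

```python
def select_neighbors(cases, k, threshold=0.5):
    """Return up to k (similarity, value) pairs, most similar first.

    Cases whose similarity degree is below the threshold are discarded
    (assumed retrieval rule).
    """
    eligible = [case for case in cases if case[0] >= threshold]
    return sorted(eligible, key=lambda case: case[0], reverse=True)[:k]


def mean_valued(neighbors):
    """Predict the plain average of the neighbors' target values."""
    if not neighbors:
        raise ValueError("no neighbor passed the similarity threshold")
    return sum(value for _, value in neighbors) / len(neighbors)


def similarity_weighted(neighbors):
    """Predict a weighted average, using similarity as the weight.

    This stands in for the distance-weighted predictor; the weighting
    scheme is an assumption, not the paper's confirmed formula.
    """
    total = sum(sim for sim, _ in neighbors)
    if total == 0:
        raise ValueError("neighbors carry no weight")
    return sum(sim * value for sim, value in neighbors) / total


def best_k(mae_by_k):
    """Return the k whose per-run mae values have the smallest average."""
    return min(mae_by_k, key=lambda k: sum(mae_by_k[k]) / len(mae_by_k[k]))


# Hypothetical retrieved cases: (similarity degree, target value).
cases = [(0.9, 0.8), (0.6, 0.4), (0.75, 0.6), (0.4, 0.0)]
neighbors = select_neighbors(cases, k=3)
prediction_mean = mean_valued(neighbors)
prediction_weighted = similarity_weighted(neighbors)

# Hypothetical per-run mae values for three candidate values of k.
chosen_k = best_k({3: [0.05, 0.03], 4: [0.04, 0.02], 5: [0.06, 0.04]})
```

Choosing k by mae alone is the shortcut the text examines: the other metrics may favor a different k, so the selection can be repeated with rmse, rrse, rae or cc as the criterion.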
found before, but the optimal (smallest) values of all the other error metrics produced by both the fuzzy functions appear at k = 5, rather than 4. However, for a given fuzzy function, there is no significant difference between the values of a given metric at the optimal values of k obtained in this section and in the previous section. For example, for the step-wise fuzzy function, the smallest value of rmse is 0.090 at k = 5, and is 0.092 at k = 4. This means that the optimal values of k obtained before are accurate enough. Therefore, the performance of the mean-valued fuzzy k-NN predictor is measured on the basis of the optimal value of k = 3 achieved earlier, and the results are listed in Table 10.
For the distance-weighted fuzzy predictor, from Tables 7–9, it can be found that:
1. All the optimal values of rrse and cc corresponding to the two fuzzy functions appear at k = 5.
2. Generally, this predictor performs slightly worse at k = 3 than at k = 4 and 5. However, it is slightly better at k = 5 than at k = 4, since only one optimal value results from k = 3 and four optimal values correspond to k = 4, while seven optimal values are obtained when k equals 5. All the optimal values are highlighted in bold in Tables 7–9, respectively.
Table 11
The optimal values of the five metrics produced by the fuzzy predictors
Fuzzy predictor              Metric  Trapezoidal  Step-wise
Distance-weighted predictor  mae     0.0414       0.0381
                             rmse    0.0861       0.0900
                             rrse    0.3573       0.3533
                             rae     0.1992       0.1869
                             cc      0.9367       0.9369
Mean-valued predictor        mae     0.0374       0.0346
                             rmse    0.0876       0.0875
                             rrse    0.3459       0.3463
                             rae     0.1778       0.1670
                             cc      0.9397       0.9379
For easy comparison of the two fuzzy functions, the optimal values of the four error measures and the correlation coefficient produced by the fuzzy predictors in conjunction
with these fuzzy functions are summarized in Table 11 according to Tables 7–10. Note that the optimal value of an error metric is its smallest value, whereas the optimal value of the correlation coefficient is its largest value.
From Table 11, the following points can be seen:
1. The optimal values of the five metrics show good performance of the fuzzy predictors, because both the distance-weighted and the mean-valued fuzzy predictors lead to very small errors and very high correlation coefficients. For the distance-weighted fuzzy predictor, even the largest errors produced by the two fuzzy functions are very small. For example, the largest values of mae, rmse, rrse, and rae are 0.0414, 0.0906, 0.3573, and 0.201, respectively. Because rrse and rae are relative errors, they can also be expressed as percentages: 35.73% and 20.1%. The largest value of cc is 0.9381 (cc varies from 0 to 1 in this study; usually, the larger the value of cc, the better the predictor). The same is true for the mean-valued fuzzy predictor.
2. For both fuzzy predictors, the mae and rae due to the step-wise fuzzy function are smaller than those due to the trapezoidal fuzzy function. Therefore, the step-wise fuzzy function is best for both the mean absolute error and the relative absolute error. The rmse yielded by the mean-valued fuzzy predictor in conjunction with the trapezoidal function (0.0876) is almost equal to that in conjunction with the step-wise function (0.0875). Which fuzzy function is best suited for the rrse depends on the fuzzy predictor.
8. Conclusions
The results presented in this paper are part of a more comprehensive study described in more detail in a PhD dissertation [7]. The paper mainly investigates the effectiveness of monitoring bridge health using the FCBR model. The two fuzzy predictors (mean-valued and distance-weighted) and the two fuzzy membership functions (trapezoidal and step-wise) have been explored here. Other types of membership functions (triangular and Gaussian) have been investigated but are not reported in this paper. Meanwhile, the threshold parameters for the trapezoidal and the step-wise fuzzy functions and the number k of nearest neighbors are locally optimized such that this model can achieve the best performance. All algorithms are implemented in the object-oriented programming language C++. The performance of the developed model is measured using five different error metrics. The model is validated using actual data sets extracted from the electronic database of bridge deck inspections of KDOT, based on the 10-fold cross-validation method. According to the experiments, the following conclusions can be made:
1. It is feasible to apply fuzzy case-based reasoning algorithms to monitor highway bridge health. The predictive errors of the fuzzy k-NN predictors are very small (no matter what metrics are used) and the correlation coefficients exceed 0.9.
2. Based on the actual data used in this study, the mean-valued fuzzy k-NN predictor is more efficient than the distance-weighted fuzzy k-NN predictor. Regardless of which metrics and fuzzy membership functions are used, almost all the optimal results of the former are better than the corresponding ones of the latter. The only exception is the rmse due to the trapezoidal fuzzy membership function at k = 4, which is worse for the mean-valued predictor than for the distance-weighted predictor.
3. Generally speaking, the step-wise fuzzy function is slightly better than the trapezoidal fuzzy function.
4. Based on the same data sets (training/test), usually the smaller the error is, the larger the correlation coefficient is.
5. The threshold values for the trapezoidal and step-wise fuzzy functions can be adjusted so that the fuzzy system achieves the best performance.
6. The optimal value of k depends on factors such as the fuzzy membership functions and the algorithms of the fuzzy predictors.
7. The data sets (including training cases and test cases) heavily influence the learning performance of the fuzzy system. For example, among the 10 runs, the data sets (training/test) for the first run always produce larger errors than those for the other runs.
References
[1] Aamodt A, Plaza E. Case-based reasoning: foundational issues, methodological variations, and system approaches. AI Commun 1994;7:39–59.
[2] American Association of State Highway and Transportation Officials (AASHTO). Guidelines for bridge management systems. Washington, DC; 1993.
[3] Bergmann R. On the use of taxonomies for representing case features and local similarity measures. Proceedings of the 6th German Workshop on Case-Based Reasoning (GWCBR’98), Berlin, Germany; 1998. p. 23–32.
[4] Bezdek JC. Pattern recognition with fuzzy objective function algorithms. New York: Plenum Press; 1981.
[5] Boussabaine AH. The use of artificial neural networks in construction management: a review. Construction Management and Economics, E&FN Spon 1996;14:427–36.
[6] Burkhard HD, Richter MM. On the notion of similarity in case based reasoning and fuzzy theory. In: Pal Sankar K, Dilon Tharam S, Yeung Daniel S, editors. Soft computing in case-based reasoning, Chapter 2. London: Springer; 2001.
[7] Cheng Y. Development of bridge management systems using fuzzy case-based reasoning. PhD Dissertation. Kansas State University, Manhattan, KS, USA; 2005.
[8] Cheng Y, Melhem H. Numeric prediction algorithms for bridge corrosion. Proceedings of the 10th International Conference on Computing in Civil and Building Engineering (ICCCBE), Weimar, Germany; 2004.
[9] Cheng Y, Melhem H. Application of fuzzy case-based reasoning to bridge management system. Proceedings of International Conference on Computing in Civil Engineering, ASCE, Cancun, Mexico; 2005.
[10] Freyermuth CL, Klieger P, Stark DC, Wenke HN. Durability of concrete bridge decks-a review of cooperative studies. Transport Res Record, TRB 1970;328:50–60.
[11] Kolodner JL. Case-based reasoning. San Francisco, USA: Morgan Kaufmann Publisher; 1993.
[12] Maher ML, Balachandran B. Multimedia approach to case-based structural design. J Comput Civil Eng 1994;8(3):359–76.
[13] Melhem HG, Cheng Y. Prediction of remaining service life of bridge decks using machine learning. J Comput Civil Eng, ASCE 2003;17(1):1–9.
[14] Melhem HG, Cheng Y, Kossler D, Scherschligt D. Wrapper methods for inductive learning: example application to bridge decks. J Comput Civil Eng, ASCE 2003;17(1):46–57.
[15] Melhem H, Cheng Y, Kossler D. Fuzzy case-based reasoning for bridge management. Proceedings of the 11th International Workshop for Intelligent EG-ICE, Weimar, Germany; 2004. p. 76–85.
[16] Morcous G. Case-based reasoning for modeling bridge deterioration. PhD Dissertation. Concordia University, Canada; 2000.
[17] Morcous G, Rivard H, Hanna AM. Case-based reasoning system for modeling infrastructure deterioration. J Comput Civil Eng, ASCE 2002a;16(2):104–14.
[18] Morcous G, Rivard H, Hanna AM. Modeling bridge deterioration using case-based reasoning. J Infrastruct Syst, ASCE 2002b;8(3):86–95.
[19] PONTIS. Pontis Bridge Management, Release 4, Technical Manual. American Association of State Highway and Transportation Officials, Washington, DC; 2001.
[20] Reich Y, Fenves SJ. A system that learns to design cable-stayed bridges. J Struct Eng 1995;121(7):1090–100.
[21] Roddis KWM, Bocox J. Case-based approach for steel bridge fabrication errors. J Comput Civil Eng 1997;11(2):84–91.
[22] Shepard RW, Johnson MB. A diagnostic tool to maximize bridge longevity, investment. Transport Res News 2001;215:6–11.
[23] Thompson PD, Shepard RW. AASHTO Commonly-recognized bridge elements - successful applications and lessons learned. National Workshop on Commonly Recognized Measures for Maintenance; 2000.
[24] Washburn L. Personal contact. Bridge evaluation engineer, KDOT, Topeka, Kansas; 2002.
[25] Watson I. Applying case-based reasoning: techniques for enterprise systems. San Mateo, CA, USA: Morgan Kaufmann Publisher; 1997.
[26] Witten IH, Frank E. Data mining. San Francisco, CA, USA: Morgan Kaufmann Publisher; 2000.