
As already stated, the synaptic weights are initialized to the normalized input vector matrix

[W] = [Ī]   (22)

When a new mode shape with noise, as shown in Fig. 4, is presented as

{X} = (0.1, 0.5, 0.35, 1.0)   (23)

the corresponding normalized input is

{Ī} = (0.08, 0.42, 0.29, 0.85)   (24)

The output vector is calculated as

{Ī}[W] = (0.92, -0.142, 0.285, -0.175)   (25)

which shows that mode 1 is the winner and represents the first mode shape; similarly, other classes can be determined. Hence the procedure involves no training. The unsupervised learning network presented in this discussion does not involve training and is capable of learning immediately. Also, the network is highly noise tolerant. The discussers have already applied this kind of training for counterpropagation networks to various problems (Rajasekaran 1997; Rajasekaran and Pai 1996, 1997; Rajasekaran and Prameela, in press, 1998) and found it to be suitable for complicated problems.

APPENDIX. REFERENCES

Rajasekaran, S. (1997). "Training free counterpropagation neural network to pattern recognition in fabric defects." Textile Res. J., 67(6).
Rajasekaran, S., and Pai, G. A. V. (1996). "Training free counterpropagation networks as fuzzy rule processors." Proc. Int. Conf. on Neural Information Processing, Hong Kong, China.
Rajasekaran, S., and Pai, G. A. V. (1997). "Training free counterpropagation networks as static heteroassociative memories." Indian J. Engrg. and Mat. Sci., New Delhi, India, 4, 245-253.
Rajasekaran, S., and Prameela, K. S. (1998). "Computing with words with fuzzy backpropagation and training free counterpropagation networks." J. Struct. Engrg., in press.

Closure by Abhijit Mukherjee⁴

⁴Assoc. Prof., Dept. of Civ. Engrg., Indian Inst. of Technol., Bombay 400 076, India.

The writer thanks the discussers for their interest in the paper. The discussers argue that the network training can be eliminated when the input and the synaptic weights are normalized. The writer does not agree with this view. Through training, the synaptic weights are updated continuously to preserve in them a statistical mean of the input patterns. Therefore, the synaptic weights continuously adapt to the input. This is not possible when the training process is eliminated. The illustrations presented by the discussers simply demonstrate that the network has already trained (i.e., the synaptic weights represent the classes properly). The training of the illustrations presented is easy to the extent that the synaptic weights can be guessed manually. This is due to a good separation between the classes present. However, when the input needs to be classified with a finer resolution, training is mandatory. The writer has demonstrated this in later research works (Mukherjee and Kamat, in press, 1997).

The writer would like to indicate a few points in the discussers' note that may lead to confusion of the readers. In illustration 1 the output vectors are presented as Y1, Y2, and Y3. It is also indicated that the network yields those output vectors. The output vectors are not necessary for the self-organizing network presented here. Neither does the network yield those output vectors. The present network only classifies the input data into a number of classes. Therefore, the output vectors presented as Y1, Y2, and Y3 are not required by the network. Moreover, in illustration 2, in the sentence "It may be calculated that the two patterns that are distorted versions of input vectors X1 and X3 have their Hamming distances 'closer' to that of X1 and X3," X1 and X3 should be replaced by X2 and X3.

APPENDIX. REFERENCE

Mukherjee, A., and Kamat, R. (1997). "A self-organising neural network in damage assessment of multi-storeyed building frames." J. Sound and Vibration, in press.

Downloaded from ascelibrary.org by Universidad Nacional De Ingenieria on 10/23/18. Copyright ASCE. For personal use only; all rights reserved.

CONSTRUCTABILITY ANALYSIS: MACHINE LEARNING APPROACH³

Discussion by Yoram Reich⁴

³January 1997, Vol. 11, No. 1, by Miroslaw Skibniewski, Tomasz Arciszewski, and Kamolwan Lueprasert (Paper 11420).
⁴Sr. Lect., Dept. of Solid Mech., Mat., and Struct., Fac. of Engrg., Tel Aviv Univ., Ramat Aviv 69978, Israel. E-mail: yoram@eng.tau.ac.il.

The first 1997 issue of the Journal of Computing in Civil Engineering includes two papers on the use of machine learning (ML) for two different purposes: improving understanding (the paper under discussion) and making predictions (Wang and Gero 1997). When creating knowledge by ML or in any other way, it is critical to assess its quality (Reich 1995). This assessment depends on the way one defines knowledge or on the purpose of learning.

Artificial intelligence has developed two definitions of knowledge: structural and functional (Reich 1995). According to the structural definition, knowledge is whatever is represented in a system (e.g., facts, axioms, rules, causal relations). Knowledge created by ML in comprehensible forms, e.g., rules, may be assessed qualitatively, for example, by experts' interpretation of learned knowledge or by comparison with expert-generated rules. This assessment can serve as the evaluation of the ML program that created this knowledge. This is the evaluation approach described by the authors.

According to the functional definition, knowledge is what a system has that allows it to perform. Thus, knowledge cannot be assessed directly, but rather, indirectly through observing the performance of the system manipulating it. Knowledge created by ML could be evaluated by assessing how it is used to perform tasks. For example, learned knowledge could be evaluated by comparing its predictive accuracy with some baseline performance. This baseline may be another ML technique, other computational tools such as expert systems, default rules, or expert judgment.

The best assessment of a ML program is through assessing the practical benefit of the deployed system. However, this state has not yet been reached in the application of ML in civil engineering (Reich 1997). Obviously, a combination of all these assessments is recommended. Thus, at the level of evaluation, the two aforementioned papers complement each other; together, they give an opportunity to elaborate on the evaluation of ML programs and correct several details found in these papers. (A procedural reason requires that the discussion

184/ JOURNAL OF COMPUTING IN CIVIL ENGINEERING / JULY 1998

J. Comput. Civ. Eng., 1998, 12(3): 164-166


of these papers be split into two separate discussions; nevertheless, contentwise they should have been discussed together.)

The authors describe the use of ML to extract knowledge about constructability. They demonstrate this by applying the learning program INLEN to a database of 31 examples. They create two rule sets using two modes, generalization and specialization, of the program. They report that the "rules were found by practicing engineers as acceptable in the context of examples used to produce them." This knowledge assessment can be improved by following a careful procedure including careful evaluation by experts augmented by quantitative analysis of the rules, as discussed later.

Qualitative expert evaluation is clearly subjective and has to be performed with care (Reich 1997). It is critical to perform it with several experts to get an intersubjective opinion. If an evaluation involves comparing learned rules with other knowledge, a particular setup should be followed in which one group of experts evaluates the learned rules with respect to a baseline and another group, serving as the control group, evaluates another arbitrary set of rules compared to the same baseline. The experts must not know to which of the groups they belong. It is outside the scope of this discussion to explain statistical experimental design in depth; nevertheless, without careful attention to such design, results would be unreliable.

The discusser ran the learning program CN2 (Clark and Niblett 1989) on the 31 examples to compare its performance with INLEN. (I thank Peter Clark and Robin Boswell for the access to CN2 and NewID.) CN2 employs the covering algorithm used by INLEN (or AQ15). In addition, it uses entropy measurement to assess the quality of rules, and thus can handle noise well. This ability could lead CN2 to produce rules that are more general than those of INLEN. The rules extracted by CN2 are given in Table 5. The presentation follows the self-explanatory format of Table 3. In addition, the last column states how many examples of the same decision a rule covers. No rule covered examples of other decisions. In this sense, the rules are an accurate reflection of the data.

Rules 1 and 4 are more general than their corresponding rules in Table 3, whereas rule 5 is the same for both programs. From the table, and as predicted, CN2 rules tend to be more general than INLEN rules and still correct. A rule that covers more examples is more "reliable" and interesting from an understanding viewpoint. Given the small data set, the table shows that there are two or at most three interesting rules: 1, 5, and 10. The rest cover too few examples to warrant assessment.

When using ML for creating models (e.g., rules) that improve understanding, it is fruitful to use several tools because each provides a different perspective of the data (Reich et al. 1996). The utility of each model depends on its evaluation by experts and by other assessments. One such assessment deals with how well the rules reflect the data. This can be answered quantitatively by executing a performance test. In order to explain it, the discusser executed these tests with CN2. The basic test involves checking how the rules reconstruct the decisions of the examples. This is a coverage or a resubstitution test that produces an optimistic upper-bound estimation of accuracy. The rules created by CN2 achieve 100% accuracy in this test.

Another common test used on ML programs, the hold-out, uses about 70-80% of the examples for training and the rest for testing. It can be pessimistic and is inaccurate for small data sets (i.e., those having less than about 1,000 examples in the testing set). The better tests are the k-fold cross-validation (CV) or leave-one-out. These tests can be improved by using the bootstrap method. See Reich (1997) for an elaboration on this subject.

The rule set in Table 5 was created using parameters whose combination, except for the type of rule error estimation, led to the best performance in a leave-one-out experiment. The optimal accuracy was 74.2%, compared to the default rule, whose accuracy is 45.2%, and compared to the pessimistic 64.0% (with 0.15% standard deviation) in a hold-out test with 100 iterations, each using 22 training examples and nine for testing. Note that when ML is used for the purpose of understanding, one can use the optimal set of parameters or even use inferior parameters. [One cannot simply use the optimal parameters for the performance purpose as done by Wang and Gero (1997).] The guiding principle is improved understanding and not blind model quality in a performance test. To illustrate, another rule set created by using a different type of rule error estimation achieved an accuracy of 87.1% in the leave-one-out experiment. Nevertheless, the rules created were very specific and thus are not too interesting from the understanding purpose viewpoint.

To further illustrate the usefulness of using multiple models for better understanding, the discusser ran the decision tree induction program NewID (Boswell 1990) on the same data. The performance of NewID in a leave-one-out test was 67.7% accuracy, suggesting that its models are "less accurate" than CN2's on this data. The decision tree created (after simple "postprocessing") was

NoSla
  One: BeCha1
    SliChgeReinf: ReRa
      Low: CoBeRa1
        Low: EXCELLENT 1
        High: GOOD 1
      Average: GOOD 1
    AllChgeReinf: GOOD 2
    None: EXCELLENT 2
  SameTwo: BeCha1
    <> SliChgeReinf: POOR 3
    SliChgeReinf: GOOD 4
  DiffTwo: BeCha1
    <> None: POOR 8
    None: GOOD 2
  None: ReRa
    Low: EXCELLENT 3
    Average: GOOD 4

The number after the decision is the number of examples this leaf covers. Interesting rules can be created by traversing the tree from its root to some leaves. For example, the rule

IF (NoSla = DiffTwo) & (BeCha1 <> None)
THEN ConEva = POOR

covers eight examples, thus is fairly general. This rule does not appear in Table 5 or Table 3. It might be interesting to ask experts whether it is interesting. In summary, although we would avoid using models that are poor in performance tests, a "slightly worse" model from a performance viewpoint can still give a different and complementary perspective, thus improving understanding.

Compared to CN2 or NewID, it might be that INLEN's performance on a leave-one-out test is better, but this figure is not provided. Therefore, it is hard to assess the suitability of using INLEN to extract knowledge from the database. It is also hard to assess the quality of the data. Arciszewski et al. (1995) discussed the complete process of using ML programs for knowledge extraction including the process of collecting


TABLE 5. Rules Created by CN2

Attributes: ConEva = construction evaluation; ReRa = reinforcement ratio; CoBeRa1, CoBeRa2 = column-to-beam reinforcement ratios 1 and 2; NoSla = number of slabs attached to beam; NoWall = number of walls attached to beam; BeCha1, BeCha2 = changes in reinforcement and size of left and right beam.

Rule   Decision (ConEva)   Condition values             Examples covered (same decision)
 1     Poor                High; DiffTwo                 5
 2     Poor                High; One                     3
 3     Poor                DiffTwo                       3
 4     Poor                High                          3
 5     Good                Average; Low                  6
 6     Good                High; None; SliChgeReinf      3
 7     Good                Average; One                  2
 8     Good                Low; SameTwo                  3
 9     Good                Low; High                     2
10     Excellent           Low; One                      4
11     Excellent           Low; None                     2

Note: DiffTwo = different slabs or walls on each side; SameTwo = same slabs or walls on both sides; SliChgeReinf = slight changes in reinforcement.

examples, the evaluation of results, and the assessment of the "quality" of the data. It is recommended that when using ML programs, a similar process to the one discussed by Arciszewski et al. (1995) or by Reich (1997) should be used.

One last minor comment: The authors confused COBWEB with CLUSTER, which is also an unsupervised learning program, in their discussion of BRIDGER (page 10).

APPENDIX. REFERENCES

Arciszewski, T., Michalski, R. S., and Dybala, T. (1995). "STAR methodology-based learning about construction accidents and their prevention." Automation in Constr., 4(1), 75-85.
Boswell, R. (1990). "Manual for NewID version 4.1." Tech. Rep. TI/P2154/RAB/4/1.2.3, Turing Institute, Glasgow, U.K.
Clark, P., and Niblett, T. (1989). "The CN2 induction algorithm." Machine Learning, 3(4), 261-283.
Reich, Y. (1995). "Measuring the value of knowledge." Int. J. Human-Comp. Studies, 42(1), 3-30.
Reich, Y. (1997). "Machine learning techniques for civil engineering problems." Microcomputers in Civ. Engrg., 12(4), 295-310.
Reich, Y., Medina, M., Shieh, T.-Y., and Jacobs, T. (1996). "Modeling and debugging engineering decision procedures with machine learning." J. Computing in Civ. Engrg., ASCE, 10(2), 157-166.
Wang, W., and Gero, J. S. (1997). "Sequence-based prediction in conceptual design of bridges." J. Computing in Civ. Engrg., ASCE, 11(1), 37-43.

Closure by Tomasz Arciszewski⁵

⁵Assoc. Prof. of Urban Sys. Engrg., George Mason Univ., Fairfax, VA 22030-4444.

The writers appreciate the comments of the discusser. He correctly pointed out that it is critical to assess the quality of the automatically acquired knowledge. However, the objective of the paper was to "describe a novel approach to constructability knowledge acquisition based on the use of machine learning and to demonstrate its feasibility," not to produce any specific domain-related knowledge and to verify its quality. For all these reasons, the writers used only a very small number of examples. Thus, the use of statistical performance evaluation methods for such a small set of examples would simply be questionable, at best. Therefore, the issue of knowledge quality was only marginally addressed in the paper. The writers are aware, however, of the importance of this issue, and it is their pleasure to provide a response to the discussion. This response is difficult, however, because the discusser mostly refers his comments to a paper that at the time of writing is still in print.

The writers hope that the discusser's and their own comments about the evaluation of quality of engineering decision rules will initiate a discussion about this critical subject for the future of machine learning in engineering. The second writer became involved in work on automated knowledge acquisition in the mid-1980s. He soon realized that a formal and objective evaluation of quality of acquired decision rules is the key to the acceptance of learning systems by the engineering community. For this reason, he developed in 1989 an initial outline of a method for evaluation of a collection of examples/decision rules in the context of case-based optimization (Arciszewski and Ziarko 1991). In 1992, he conducted studies of performance-based evaluation of learning systems and of collections of decision rules. This work resulted in a formal evaluation method presented in Arciszewski et al. (1992). The method has been used extensively in various machine learning experiments conducted at George Mason University dealing with symbolic selective and constructive induction-based learning systems during the period 1993-1996. The results of this work, which clearly demonstrated advantages and limitations of performance-based evaluation of quality of decision rules, are presented in Szczepanik et al. (1996). Therefore, he recently proposed a semantic evaluation of decision rules in the context of the background knowledge understood as a hierarchical system of concepts and their relationships in a given domain. This method is presented in Arciszewski (1998).

In the context of the writers' experience in the area of quality evaluation of decision rules, they entirely agree with the discusser's comments, which complement the paper and the writers' own discussion of the acquired decision rules. However, one should be very careful in reaching definite conclusions while working with only 31 examples. In this case, any formal performance-based analysis of decision rules can obviously be conducted only in the context of those examples. Moreover, in the case of such a small number of examples, the use of statistical methods becomes quite controversial. Therefore, if the objective is to learn about a given domain, one should use a much larger number of examples of a balanced nature within this domain. That would give one a chance to acquire a larger, or better, collection of decision rules that could be correctly evaluated using statistical methods.

The comment about using several tools to learn about a given domain addresses a very complex issue of knowledge acquisition. The writers agree with the discusser that using various tools may produce complementary results in some cases, particularly when significantly different tools are used, for example, tools producing decision rules and decision trees, utilizing selective and constructed attributes, etc. However, the recent experiments of Arciszewski et al. (1997) clearly demonstrated that this may not be always the case. When two


advanced symbolic selective learning systems are used, and both are comparable in terms of their sophistication and performance measured by various empirical error rates, the decision rules produced can be surprisingly similar, if not identical, even if the learning systems are based on entirely different learning algorithms. In the case of the experiments referred to, Arciszewski et al. (1997) used a system based on the theory of rough sets (Pawlak 1982, 1991) and a system based on STAR methodology (Michalski 1983; Michalski et al. 1986). The results of learning conducted on various collections of examples were nearly identical. It is admitted, however, that in this particular case both systems were used with their default settings. As is known, more advanced learning systems allow their tuning to a given collection of examples in the context of learning objectives. Therefore, both systems used in the experiments could produce significantly different results, if desired. In any case, the "discovery" of the similarity of results when using different learning systems may be significant in engineering for several reasons. First, it may indicate that learning systems have already reached such a level of maturity that their use in engineering is justified. Second, it may suggest that the selection of a learning system for a given task may not be as significant as previously thought. As is known, many developers of learning systems claim unique features and superior performance of their systems with respect to the other systems available. These claims confuse potential users of learning systems, who are simply unable to conduct extensive comparison tests before performing knowledge acquisition. Consequently, engineers are reluctant to use learning systems, and the progress in this area is unnecessarily delayed. In any case, much more research has to be conducted on comparison of various learning systems before a full understanding is reached of the effect of the underlying learning algorithms and of tuning on the performance of a learning system.

Again, the writers express their sincere appreciation for the discusser's stimulating discussion and hope that it will bring about an increase in the attention of the research community as well as of practicing engineers to the potential of machine learning in solving civil engineering problems.

The writers appreciate the correction made by the discusser regarding his research on machine learning in design knowledge acquisition, conducted at Carnegie-Mellon University. BRIDGER is based on the learning program COBWEB, and not on CLUSTER as stated in the paper.

APPENDIX. REFERENCES

Arciszewski, T. (1998). "Engineering semantic evaluation of decision rules." J. Intelligent and Fuzzy Sys., 5(3).
Arciszewski, T., Ardayfio, D., and Doulamis, J. (1997). "Automated knowledge acquisition in proactive design." Proc. ASME Engrg. Des. Conf., American Society of Mechanical Engineers, New York, N.Y., 1-11.
Arciszewski, T., Dybala, T., and Wnek, J. (1992). "A method for evaluation of learning systems." J. Knowledge Engrg. "Heuristics," 2(4), 22-31.
Arciszewski, T., and Ziarko, W. (1991). "Structural optimization: A case-based approach." J. Computing in Civ. Engrg., ASCE, 5(2), 159-174.
Michalski, R. S., Hong, J., and Mozetic, I. (1986). "AQ15: Incremental learning of attribute-based descriptions from examples, the method and user's guide." Dept. of Comp. Sci. Rep., University of Illinois, Urbana, Ill.
Pawlak, Z. (1991). Rough sets: Theoretical aspects of reasoning about data. Kluwer Academic Publishers Group, Boston, Mass.
Szczepanik, W., Arciszewski, T., and Wnek, J. (1996). "Empirical comparison of selective and constructive induction." Engrg. Applications of Artificial Intelligence, 9(6), 627-639.
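The testing protocol that recurs through this exchange — an optimistic resubstitution test, a leave-one-out experiment, and a comparison against the default (majority-class) rule — can be sketched in a few lines. Everything below is illustrative: the six-example data set, its attribute values (loosely echoing NoSla and ReRa), and the 1-nearest-neighbor stand-in learner are hypothetical, since neither the 31-example constructability database nor INLEN, CN2, or NewID is reproduced here; only the evaluation procedure follows the text.

```python
def majority_class(labels):
    # Default rule: always predict the most frequent class in the training labels.
    return max(set(labels), key=labels.count)

def hamming(a, b):
    # Number of nominal attributes on which two examples disagree.
    return sum(x != y for x, y in zip(a, b))

# A trivial stand-in learner: 1-nearest neighbor over nominal attributes.
def fit_1nn(examples, labels):
    return list(zip(examples, labels))

def predict_1nn(model, x):
    return min(model, key=lambda pair: hamming(pair[0], x))[1]

# The default rule expressed as a "learner" with the same interface.
def fit_default(examples, labels):
    return majority_class(labels)

def predict_default(model, x):
    return model

def resubstitution(examples, labels, fit, predict):
    # Coverage test: evaluate on the training data itself.
    # This is the optimistic upper bound on accuracy.
    model = fit(examples, labels)
    return sum(predict(model, x) == y
               for x, y in zip(examples, labels)) / len(labels)

def leave_one_out(examples, labels, fit, predict):
    # Train on n-1 examples, test on the held-out one; repeat for every example.
    hits = 0
    for i in range(len(examples)):
        model = fit(examples[:i] + examples[i + 1:],
                    labels[:i] + labels[i + 1:])
        hits += predict(model, examples[i]) == labels[i]
    return hits / len(examples)

# Hypothetical miniature data set (attribute values invented for illustration).
X = [("One", "Low"), ("One", "High"), ("None", "Low"),
     ("None", "High"), ("DiffTwo", "Low"), ("DiffTwo", "High")]
y = ["GOOD", "GOOD", "GOOD", "GOOD", "POOR", "POOR"]

baseline = leave_one_out(X, y, fit_default, predict_default)  # default-rule accuracy
learner = leave_one_out(X, y, fit_1nn, predict_1nn)           # learned-model accuracy
```

With these stand-ins, the learner scores a perfect 1.0 on resubstitution while its leave-one-out accuracy is lower — exactly the optimism of the coverage test that the discusser warns about when comparing a learner against its default-rule baseline.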

