AUGMENT-NEURON NETWORKS
By I-Cheng Yeh¹
ABSTRACT: In this paper, a novel neural network architecture, the augment-neuron network, is proposed and examined for its efficiency and accuracy in modeling concrete strength with seven factors (water/cement ratio, water, cement, fine aggregate, coarse aggregate, maximum grain size, and age of testing). The architecture of the augment-neuron network is that of a standard back-propagation neural network, but augment neurons, i.e., logarithm neurons and exponent neurons, are added to the input layer and the output layer of the network. Two hundred examples were collected from actual experimental data from 15 sources. The system was trained on 100 training examples chosen randomly from the example set, and then tested using the remaining 100 examples. The results showed that the logarithm neurons and exponent neurons in the network provide an enhanced network architecture that significantly improves the performance of these networks for modeling concrete strength. A neural network-based concrete mix optimization methodology is proposed and is verified to be a promising tool for mix optimization.
INTRODUCTION

Most research in material modeling aims to construct mathematical models to describe the relationship between components and material behavior. These models consist of mathematical rules and expressions that capture these varied and complex behaviors. Concrete is a highly nonlinear material, so modeling its behavior is a difficult task.

Artificial neural networks are a family of massively parallel architectures that solve difficult problems via the cooperation of highly interconnected but simple computing elements (or artificial neurons). Basically, the processing elements of a neural network are similar to neurons in the brain, which consist of many simple computational elements arranged in layers. Interest in neural networks has expanded rapidly in recent years. Much of the success of neural networks is due to such characteristics as nonlinear processing and parallel processing. In the past decade, considerable attention has been focused on the problem of applying neural networks in diverse fields, such as system modeling, fault diagnosis, and control. This is because neural networks offer the advantage of performance improvement through learning by using parallel processing. The neural network's performance can be measured by the speed of learning (efficiency) and the generalization capability (accuracy) of the network. The speed of learning can be expressed either as CPU time or as the number of epochs required for convergence of the network and thus can form the basis for comparison. There is at present no formal definition of what it means to generalize correctly, but the generalization capability of a network may be assessed based on how well it performs on the test data set.

The back-propagation algorithm is now recognized as a powerful tool in many neural-network applications. Most applications of neural networks are based on the back-propagation paradigm, which uses the gradient-descent method to minimize the error function (Rumelhart et al. 1986; Yeh et al. 1992, 1993). In the area of material modeling, Ghaboussi et al. (1991) modeled the behavior of concrete in the state of plane stress under monotonic biaxial loading and compressive uniaxial cyclic loading with a back-propagation neural network. Their results look very promising. Brown et al. (1991) demonstrated the applicability of neural networks to composite material characterization. In their approach, a back-propagation neural network had been trained to accurately predict composite thermal and mechanical properties when provided with basic information concerning the environment, constituent materials, and component ratios used in the creation of the composite. Kasperkiewicz et al. (1995) demonstrated that the fuzzy-ARTMAP neural network can model strength properties of high-performance concrete mixes and optimize the concrete mixes.

A back-propagation neural network consists of a number of interconnected processing elements (artificial neurons). The elements are logically arranged into two or more layers and interact with each other via weighted connections. These scalar weights determine the nature and strength of the influence between the interconnected elements. Each element is connected to all the neurons in the next layer. There is an input layer, where data are presented to the neural network, and an output layer, which holds the response of the network to the input. It is the intermediate layers (hidden layers) that enable these networks to represent the interaction between inputs as well as the nonlinear relation between inputs and outputs. Traditionally, the learning process is used to determine proper interconnection weights, and the network is trained to make proper associations between the inputs and their corresponding outputs. Once trained, the network provides rapid mapping of a given input into the desired output quantities. A thorough treatment of neural network methodology is beyond the scope of this paper. The basic architecture of neural networks has been covered widely (Rumelhart et al. 1986; Yeh et al. 1992, 1993).

The basic strategy for developing a neural-based model of material behavior is to train a neural network on the results of a series of experiments on a material. If the experimental results contain the relevant information about the material behavior, then the trained neural network would contain sufficient information about the material behavior to qualify as a material model. Such a trained neural network not only would be able to reproduce the experimental results it was trained on, but through its generalization capability should be able to approximate the results of other experiments (Ghaboussi et al. 1991).

Although a back-propagation neural network can be better suited than an analytical approach for problems involving predicting the output response of a complex and nonlinear physical system to its inputs, the network's speed of learning is often unacceptably slow, or its generalization capability is often unsatisfactorily low, for solving highly nonlinear function mapping problems. To improve the efficiency and accuracy of

¹Assoc. Prof., Dept. of Civ. Engrg., Chung-Hua Univ., 30 Tung Shiang, Hsin Chu, Taiwan, 30067.
Note. Discussion open until April 1, 1999. To extend the closing date one month, a written request must be filed with the ASCE Manager of Journals. The manuscript for this paper was submitted for review and possible publication on September 20, 1996. This paper is part of the Journal of Materials in Civil Engineering, Vol. 10, No. 4, November, 1998. ©ASCE, ISSN 0899-1561/98/0004-0263-0268/$8.00 + $.50 per page. Paper No. 14196.
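The layered, fully connected back-propagation network described above can be sketched in a few lines. This is an illustrative sketch only: the layer sizes, sigmoid activation, learning rate, and random initialization are assumptions, not values from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Illustrative sizes: 7 inputs (e.g., the seven mix factors), 8 hidden units, 1 output.
W1 = rng.normal(scale=0.1, size=(7, 8))   # input -> hidden weights
W2 = rng.normal(scale=0.1, size=(8, 1))   # hidden -> output weights

def forward(x):
    h = sigmoid(x @ W1)   # hidden-layer activations
    y = sigmoid(h @ W2)   # network response at the output layer
    return h, y

def train_step(x, t, lr=0.5):
    """One gradient-descent (generalized delta rule) update for a pair (x, t)."""
    global W1, W2
    h, y = forward(x)
    delta_out = (y - t) * y * (1.0 - y)             # output-layer error term
    delta_hid = (delta_out @ W2.T) * h * (1.0 - h)  # error back-propagated to the hidden layer
    W2 -= lr * np.outer(h, delta_out)
    W1 -= lr * np.outer(x, delta_hid)

x = rng.random(7)
t = np.array([0.8])
for _ in range(500):
    train_step(x, t)
```

Repeated application of `train_step` drives the network output toward the target, which is the association-learning behavior described in the text.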
JOURNAL OF MATERIALS IN CIVIL ENGINEERING / NOVEMBER 1998/283
Normalization of Data

Before the neural nets are trained, the input and output data must be normalized. In this paper, the normalization formula of the output data is as follows:

Y_new = (Y_old - Y_min)/(Y_max - Y_min) · (Y^d_max - Y^d_min) + Y^d_min    (2)

where Y_new = normalized data of the output variable; Y_old = original data of the output variable; Y_min = minimum value of the output variable; Y_max = maximum value of the output variable; Y^d_min = minimum desired value of the output variable; and Y^d_max = maximum desired value of the output variable.

This formula can transfer the minimum value of the output variable into the minimum desired value, as well as transfer the maximum value of the output variable into the maximum desired value. In this approach, the minimum and maximum desired values of the output variable are set to 0.2 and 0.8, respectively.

Network Architecture

The architecture of the augment-neuron network is that of a standard back-propagation neural network, but logarithm neurons and exponent neurons are added to the input layer and the output layer of the network (Fig. 1).

FIG. 1. Architecture of Augment-Neuron Network

The logarithm neuron in the input layer will receive the natural logarithm transformation of the corresponding input value of the training data with the following formula:

A_i = ln(1.175 X_i + 1.543)    (3)

The logarithm neuron in the output layer will receive the natural logarithm transformation of the corresponding output value of the training data with the following formula:

C_j = ln(T_j + 1)/0.6931    (5)

where T_j = jth output value of the training data; and C_j = output of the jth logarithm neuron in the output layer.

The exponent neuron in the output layer will receive the natural exponent transformation of the corresponding output value of the training data with the following formula:

D_j = exp(0.6931 T_j) - 1    (6)

where D_j = output of the jth exponent neuron in the output layer.

FIG. 3. Transfer Function of Augment Neurons on Output Layer

Under (5) and (6), 0 will be transferred into 0 identically, and 1 will be transferred into 1 identically (Fig. 3).

Two hundred examples were collected from actual experimental data from 15 sources (Lay 1993). Their ranges are listed in Table 1. From these examples, 100 are sampled randomly as training examples, and the remaining 100 are regarded as testing examples.
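The normalization of (2) and the augment-neuron transforms can be sketched as below. The output-layer logarithm transform is written here as the inverse of (6), an assumption consistent with the stated 0-to-0 and 1-to-1 mapping; function names are illustrative.

```python
import numpy as np

def normalize(y, y_min, y_max, d_min=0.2, d_max=0.8):
    """Eq. (2): linear rescaling from the data range to the desired range [0.2, 0.8]."""
    return (y - y_min) / (y_max - y_min) * (d_max - d_min) + d_min

def log_neuron_input(x):
    """Eq. (3): natural-log transform fed to a logarithm neuron in the input layer."""
    return np.log(1.175 * x + 1.543)

def exp_neuron_output(t):
    """Eq. (6): exponent-neuron target; maps 0 -> 0 and 1 -> 1 (0.6931 = ln 2)."""
    return np.exp(0.6931 * t) - 1.0

def log_neuron_output(t):
    """Assumed form of eq. (5), the inverse of eq. (6); also maps 0 -> 0 and 1 -> 1."""
    return np.log(t + 1.0) / 0.6931
```

For example, normalize(10.0, 10.0, 50.0) returns 0.2, the minimum desired value.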
In addition to the network architecture, the learning rule is the same as in the standard back-propagation neural network; that is, the general delta rule (Rumelhart et al. 1986) will be employed to modify the connection weights of the network. In this architecture, the output neurons that receive the original output values will provide the reasoning output values.

Learning Rate and Momentum Factor

In this approach, the learning rate and the momentum factor of the general delta rule decay under the following formulas:

η_{t+1} = r_η · η_t ≥ η_min    (7)

α_{t+1} = r_α · α_t ≥ α_min    (8)

where η = learning rate; r_η = reduced factor of the learning rate; η_min = minimum bound of the learning rate; α = momentum factor; r_α = reduced factor of the momentum factor; and α_min = minimum bound of the momentum factor.

Network Parameters

The values of the network parameters considered in this approach are as follows:

• Number of hidden layers = 1.
• Number of hidden units:
  1. Back-propagation network: 4 · (number of input variables + number of output variables).
  2. Augment-neuron network: 2 · (number of input variables + number of output variables).
• Learning rate: initial value = 5.0; reduced factor = 0.95; minimum value = 0.1.
• Momentum factor: initial value = 0.5; reduced factor = 0.95; minimum value = 0.1.
• Learning cycles = 3,000.
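The decay of (7) and (8), with the parameter values listed in this section, can be sketched as follows. Applying the reduction once per learning cycle is an assumption; the paper's excerpt does not state the update interval.

```python
def decayed(initial, factor, floor, cycles):
    """Eqs. (7)-(8): multiply by the reduced factor each cycle, never dropping below the floor."""
    value = initial
    history = []
    for _ in range(cycles):
        history.append(value)
        value = max(value * factor, floor)
    return history

# Parameter values used in this approach:
lr = decayed(5.0, 0.95, 0.1, 3000)        # learning rate schedule
momentum = decayed(0.5, 0.95, 0.1, 3000)  # momentum factor schedule
```

With a 0.95 factor, both schedules reach their 0.1 floor within roughly the first hundred cycles and stay there for the remainder of training.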
MODELING OF CONCRETE STRENGTH

System Models

Compressive strength of concrete is a function of the following seven input features:

1. Water/cement ratio (WC)
2. Cement (C) (kg/m³)
3. Water (W) (kg/m³)
4. Fine aggregate (FA) (kg/m³)
5. Coarse aggregate (CA) (kg/m³)
6. Maximum grain size (MG) (mm)
7. Age of testing (T) (days)

Four models are considered in this approach:

• Model 1. Seven-inputs model

  f'c = f_MODEL1(WC, C, W, FA, CA, MG, T)    (9)

• Model 2. Six-inputs model

  f'c = f_MODEL2(C, W, FA, CA, MG, T)    (10)

• Model 3. Five-inputs model

  f'c = f_MODEL3(WC, FA, CA, MG, T)    (11)

• Model 4. Two-inputs model

  f'c = f_MODEL4(WC, T)    (12)

Training Results

The neural network's performance can be measured by accuracy and efficiency. To evaluate the network's accuracy, the root-mean-square (RMS) error is adopted. Table 2 shows the results of these four models with the standard back-propagation neural network and the augment-neuron network, and verifies that the augment-neuron network is superior to the standard back-propagation neural network in generalization capability in all four models; model 1 is the best model. The predicted values of model 1 with the two networks, compared with the values actually observed in tests, are shown in Figs. 4 and 5.

The efficiency varies with the number of training iterations the network must perform to reach a specific error limit. Fig. 6 shows the convergence histories of model 1 with the two networks, and demonstrates that the augment-neuron network is superior to the standard back-propagation neural network in speed of learning, because the augment-neuron network takes fewer iterations (about 870) to reach the error limit, while the standard back-propagation neural network takes 3,000 iterations to reach it.

TABLE 2. Training Results of BPN and Augment-Neuron Networks

                    Back-Propagation Network        Augment-Neuron Network
Strength model      RMS of train set  RMS of test set  RMS of train set  RMS of test set
                    (MPa)             (MPa)            (MPa)             (MPa)
(1)                 (2)               (3)              (4)               (5)
Model 1             3.80              4.13             2.95              3.69
Model 2             5.44              5.78             3.96              4.11
Model 3             5.41              5.78             3.82              4.48
Model 4             4.57              5.29             3.64              4.15

Comparison with Other Neural Networks

To compare with other neural networks, the aforementioned model 1 (the seven-inputs model) is remodeled with the following famous neural network paradigms:

TABLE 4. Training Results Based on Statistical Techniques

Statistical techniques    RMS of train set (MPa)    RMS of test set (MPa)
(1)                       (2)                       (3)

TABLE 5. Data of Analysis

Variable                    Case 1    Case 2    Case 3    Case 4    Case 5
(1)                         (2)       (3)       (4)       (5)       (6)
Water/cement ratio          0.5       0.6       0.7       0.8       0.9
Water (kg/m³)               180       180       180       180       180
Cement (kg/m³)              360       300       257       225       200
Fine aggregate (kg/m³)      700       750       800       850       900
Coarse aggregate (kg/m³)    1,170     1,171     1,157     1,134     1,105
Note: Maximum aggregate size = 19.05 mm.
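The RMS accuracy measure and the random 100/100 train/test split described above can be sketched as below; the function names and the seed are illustrative.

```python
import numpy as np

def rms_error(predicted, observed):
    """Root-mean-square error, the accuracy measure compared in Table 2."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

def random_split(examples, n_train=100, seed=0):
    """Sample n_train examples at random for training; the rest form the test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(examples))
    return ([examples[i] for i in idx[:n_train]],
            [examples[i] for i in idx[n_train:]])
```

Applied to the 200 collected examples, `random_split` yields the 100 training and 100 testing examples on which the RMS values of Table 2 would be computed.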
in Table 5. Fig. 7 shows the water/cement ratio-strength curves under various ages (14, 28, 90, and 365 days). Fig. 8 shows the age-strength curves under various water/cement ratios (0.5, 0.6, 0.7, 0.8, and 0.9).

FIG. 7. Water/Cement Ratio and Strength of Concrete Curves

OPTIMIZATION OF CONCRETE MIX

coarse aggregate (CA), and water/cement ratio (WC); and ρw, ρc, ρFA, and ρCA = specific gravity of water, cement, fine aggregate, and coarse aggregate.

In this approach, it is assumed that Cw = 0.01; Cc = 2.8; CFA = 0.264; CCA = 0.264; Wmin = 150; Wmax = 325; Cmin = 200; Cmax = 425; FAmin = 600; FAmax = 950; CAmin = 850; CAmax = 1,150; WCmin = 0.35; WCmax = 0.9; ρw = 1.0; ρc = 3.15; ρFA = 2.65; ρCA = 2.65; T (age of testing) = 28 days; and MG (maximum grain size) = 19.05 mm (3/4 in.).

Because this problem has only three independent design variables (W, C, and CA), a simple random search can get a near-optimum solution. Therefore, 20,000 random combinations of these three design variables (water, cement, and coarse aggregate) between the minimum and maximum bounds [(18), (19), and (21)] were generated, and the fine aggregate was calculated with (22). These combinations were tested by (20) and (23) to generate feasible mix designs. Then their strengths were predicted with the augment-neuron network based on the seven-inputs model, and the results are shown in Fig. 9. The optimum mix design can be deduced with the filter [to satisfy (17)] and sort [to optimize (16)] functions of a commercial spreadsheet. Table 6 shows the optimum mix designs for various desired concrete strengths.
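The random-search procedure above can be sketched as follows. Equations (16)-(23) are not reproduced in this excerpt, so the absolute-volume closure used here for the fine aggregate and the placeholder `predict_strength` argument (standing in for the trained seven-inputs augment-neuron network) are assumptions; the bounds and unit prices are the values listed above.

```python
import random

# Bounds and unit prices as assumed in the paper.
RHO = {"w": 1.0, "c": 3.15, "fa": 2.65, "ca": 2.65}      # specific gravities
PRICE = {"w": 0.01, "c": 2.8, "fa": 0.264, "ca": 0.264}  # unit prices
BOUNDS = {"w": (150, 325), "c": (200, 425), "ca": (850, 1150)}
FA_BOUNDS = (600, 950)
WC_BOUNDS = (0.35, 0.9)

def fine_aggregate(w, c, ca):
    # Assumed absolute-volume closure for eq. (22): fine aggregate fills the
    # volume (liters per cubic meter) left by water, cement, and coarse aggregate.
    return RHO["fa"] * (1000.0 - w / RHO["w"] - c / RHO["c"] - ca / RHO["ca"])

def cost(w, c, fa, ca):
    return PRICE["w"] * w + PRICE["c"] * c + PRICE["fa"] * fa + PRICE["ca"] * ca

def random_search(predict_strength, desired, n=20000, seed=0):
    """Random search over (W, C, CA); returns the cheapest feasible mix found."""
    rng = random.Random(seed)
    best = None
    for _ in range(n):
        w = rng.uniform(*BOUNDS["w"])
        c = rng.uniform(*BOUNDS["c"])
        ca = rng.uniform(*BOUNDS["ca"])
        fa = fine_aggregate(w, c, ca)
        if not (FA_BOUNDS[0] <= fa <= FA_BOUNDS[1]):
            continue
        if not (WC_BOUNDS[0] <= w / c <= WC_BOUNDS[1]):
            continue
        if predict_strength(w, c, fa, ca) < desired:
            continue
        mix = (cost(w, c, fa, ca), w, c, fa, ca)
        if best is None or mix < best:
            best = mix
    return best
```

The filter-then-sort step performed in the spreadsheet corresponds to the feasibility checks and the running minimum kept in `best`.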
FIG. 8. Age and Strength of Concrete Curves

CONCLUSIONS

Concrete is a highly nonlinear material, so modeling its behavior is a difficult task. An artificial neural network is a good tool to model nonlinear systems. The major disadvantage of
parameters in the seven-input model, W/C, is made up of two other parameters, W and C. In other words, even in the seven-input model, there are only six independent physical parameters. Why do the two models give different results? The reason is that the six-input model must discover the dominant factor of concrete strength, W/C, itself, so it needs more effort than the seven-input model. Although an artificial neural network can discover complex relations in data, previous experience and knowledge in a specific domain are very useful to build a more accurate model efficiently.

Like other data-fitting techniques, the neural network only possesses predictive capability within the range of data employed for model fitting. The range of applicability of the present work is limited to the range of the various parameters of the experimental data listed in Table 1.

Future efforts will be directed toward other components, like superplasticizers, fly ash, and blast furnace slag. In addition, other factors that can significantly affect the strength of concrete, like its casting and curing temperature, will be considered.

APPENDIX I. REFERENCES

Brown, D. A., Murthy, P. L. N., and Berke, L. (1991). "Computational simulation of composite ply micromechanics using artificial neural networks." Microcomputers in Civ. Engrg., 6, 87-97.
Cho, S.-B., and Kim, J. H. (1991). "A fast back-propagation learning method using Aitken's process." Int. J. Neural Networks, 2(1), 37-47.
Ghaboussi, J., Garrett, J. H., and Wu, X. (1991). "Knowledge-based modeling of material behavior with neural networks." J. Engrg. Mech., ASCE, 117(1), 129-134.
Gunaratnam, D. J., and Gero, J. S. (1994). "Effect of representation on the performance of neural networks in structural engineering applications." Microcomputers in Civ. Engrg., 9, 97-108.
Rumelhart, D. E., Hinton, G. E., and Williams, R. J. (1986). "Learning internal representation by error propagation." Parallel Distributed Processing, Vol. I, D. E. Rumelhart and J. L. McClelland, eds., MIT Press, Cambridge, Mass., 318-362.
Schultz, A. (1991). "Differentiating similar patterns using a weight-decay term." J. Neural Network Computing, Winter, 5-14.
Specht, D. F. (1991). "A general regression neural network." IEEE Trans. on Neural Networks, 2(6), 568-576.
Yeh, I.-C., Kuo, Y.-H., and Hsu, D.-S. (1992). "Building an expert system for debugging FEM input data with artificial neural networks." Expert Sys. with Applications, 5, 59-70.
Yeh, I.-C., Kuo, Y.-H., and Hsu, D.-S. (1993). "Building KBES for diagnosing PC pile with artificial neural network." J. Computing in Civ. Engrg., ASCE, 7(1), 71-93.

APPENDIX II. NOTATION

The following symbols are used in this paper:

C = amount of cement in mix;
CA = amount of coarse aggregate in mix;
Cw, Cc, CFA, CCA = unit price of water, cement, fine aggregate, and coarse aggregate;
FA = amount of fine aggregate in mix;
f'c = concrete strength;
= desired concrete strength;
r_α = reduced factor of momentum factor;
r_η = reduced factor of learning rate;
W = amount of water in mix;
α = momentum factor;
α_min = minimum bound of momentum factor;
η = learning rate;
η_min = minimum bound of learning rate; and
ρw, ρc, ρFA, ρCA = specific gravity of water, cement, fine aggregate, and coarse aggregate.