
Construction and Building Materials 73 (2014) 771–780


Machine learning in concrete strength simulations: Multi-nation data analytics

Jui-Sheng Chou a,*, Chih-Fong Tsai b, Anh-Duc Pham a,c, Yu-Hsin Lu d

a Department of Civil and Construction Engineering, National Taiwan University of Science and Technology, Taiwan
b Department of Information Management, National Central University, Taiwan
c Faculty of Project Management, The University of Danang, University of Science and Technology, Vietnam
d Department of Accounting, Feng Chia University, Taiwan

Highlights

• This comprehensive study used advanced machine learning techniques to predict concrete compressive strength.
• Model performance is evaluated through multi-nation data simulation experiments.
• The prediction accuracy of the ensemble technique is superior to that of single learning models.
• This study developed advanced learning approaches for solving civil engineering problems.
• The approach also has potential applications in material sciences.

Article info

Article history:
Received 30 May 2014
Received in revised form 5 September 2014
Accepted 24 September 2014
Available online 1 November 2014

Keywords:
High performance concrete
Compressive strength
Multi-nation data analysis
Machine learning
Ensemble classifiers
Prediction

Abstract

Machine learning (ML) techniques are increasingly used to simulate the behavior of concrete materials and have become an important research area. The compressive strength of high performance concrete (HPC) is a major civil engineering problem. However, the validity of reported relationships between concrete ingredients and mechanical strength is questionable. This paper provides a comprehensive study using advanced ML techniques to predict the compressive strength of HPC. Specifically, individual and ensemble learning classifiers are constructed from four different base learners, including multilayer perceptron (MLP) neural network, support vector machine (SVM), classification and regression tree (CART), and linear regression (LR). For ensemble models that integrate multiple classifiers, the voting, bagging, and stacking combination methods are considered. The behavior simulation capabilities of these techniques are investigated using concrete data from several countries. The comparison results show that ensemble learning techniques are better than learning techniques used individually to predict HPC compressive strength. Although the two single best learning models are SVM and MLP, the stacking-based ensemble model composed of MLP/CART, SVM, and LR in the first level and SVM in the second level often achieves the best performance measures. This study validates the applicability of ML, voting, bagging, and stacking techniques for simple and efficient simulations of concrete compressive strength.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

An important research problem in materials science is predicting the mechanical properties of construction materials [1]. For many years, the use of high performance concrete (HPC) in various structural applications has markedly increased [2]. Cement materials such as fly ash, blast furnace slag, metakaolin, and silica fume are often used to increase the compressive strength and durability of HPC [3–5]. In terms of concrete mix design and quality control, compressive strength is generally considered the most important quality of HPC. Developing accurate and reliable compressive strength prediction models can save time and costs by providing designers and structural engineers with vital data. Thus, accurate and early prediction of concrete strength is a critical issue in concrete construction.

Concrete compressive strength (CCS) is usually predicted using linear or non-linear regression methods [3,6–8]. The general form of the regression method is

* Corresponding author. Tel.: +886 2 2737 6321; fax: +886 2 2737 6606. E-mail addresses: jschou@mail.ntust.edu.tw (J.-S. Chou), cftsai@mgt.ncu.edu.tw (C.-F. Tsai), paduc@dut.udn.vn (A.-D. Pham), luyh@fcu.edu.tw (Y.-H. Lu).

http://dx.doi.org/10.1016/j.conbuildmat.2014.09.054
y = f(b_i · x_i)    (1)

where y, f, b_i and x_i are the CCS, a linear or nonlinear function, the regression coefficients and the concrete attributes, respectively.

However, obtaining an accurate regression equation when using these empirical-based models is difficult. Moreover, several factors that affect the compressive strength of HPC differ from those that affect the compressive strength of conventional concrete. Therefore, regression analysis may be unsuitable for predicting CCS [9].

To compensate for the drawbacks of conventional models, machine learning algorithms (i.e., neural networks, classification and regression tree, linear regression, or support vector machine (SVM)) have been applied as baseline models in evolutionary or hybrid approaches to developing accurate and effective models for predicting CCS [10]. Machine learning, a branch of artificial intelligence (AI), can be used not only in knowledge-generation tools, but also in general information-modeling tools for conventional statistical techniques [11]. ML models approximate the relationships between inputs and outputs based on a measured set of data.

Recently, the use of ML-based applications has increased in many areas of civil engineering, ranging from engineering design to project planning [11–15]. Other material science problems that have been solved by ML include mixture design, predicting mechanical properties, and fault diagnosis [10,14,16–19]. In particular, ML-based solutions using learning mechanisms readily available in WEKA software¹ often provide a good alternative approach to solving prediction problems.

¹ http://www.cs.waikato.ac.nz/ml/weka/

For example, Chou et al. proposed several supervised learning models for predicting CCS; their analytical results indicated that multiple additive regression trees achieve the highest predictive accuracy [20]. Moreover, Yan and Shi reported that SVM was better than other models for predicting the elastic modulus of normal and high strength concrete [21]. Notably, artificial neural networks (ANNs) are used to construct mapping functions for predicting CCS [19,22]. Thus, ML techniques such as SVM and ANN are frequently used in prediction models. However, no single model has consistently proven superior.

Conversely, ensemble approaches that combine multiple learning classifiers (or prediction models) have been proposed to improve the performance of single learning techniques [23,24]. The three methods of combining multiple prediction models into a single model are voting, bagging, and stacking [25–27]. However, a literature review shows no studies that have compared individual models and ensemble learning techniques for predicting the compressive strength of HPC. In particular, related works develop their prediction models based only on single statistical or machine learning techniques; whether prediction models based on ensemble learning techniques can outperform single ones in HPC compressive strength prediction is unknown.

Therefore, individual and ensemble ML techniques were compared in this study to identify the best model for predicting the mechanical properties of HPC. Specifically, four well-known individual ML techniques are compared: multilayer perceptron neural network, SVM, classification and regression tree, and linear regression. Additionally, the voting, bagging, and stacking methods of combining individual models are examined in terms of the mean absolute error (MAE), root mean squared error (RMSE) and mean absolute percentage error (MAPE) via a synthesis index. Meanwhile, the cross-validation method [28] is used to avoid bias in the experimental datasets.

To sum up, the contribution of this paper is two-fold. The first is a comparison of the performance of single and ensemble techniques, which has not been thoroughly investigated in HPC compressive strength prediction. The second is the identified best prediction model, which provides the lowest error rate and can be used not only for practical purposes but also in future research as a baseline prediction model against which other advanced models can be compared.

The rest of this paper is organized as follows. The study context is introduced by a brief literature review, including studies of CCS prediction and some well-known ML applications. The methodology section then describes individual and ensemble ML schemes and evaluation methods. The modeling experiments section discusses the experimental settings and compares the prediction results among individual and ensemble ML models of HPC compressive strength. Finally, the conclusion summarizes the findings.

2. Literature review

The use of computer-aided modeling for predicting the mechanical properties of construction materials is growing [12]. The many prediction techniques proposed so far include empirical models, statistical techniques and artificial intelligence algorithms [3,8,10,13,21,29]. Some linear or non-linear regression analyses have achieved good prediction accuracy. Zain and Abd, for instance, used multivariate power equations to predict the strength of high performance concrete [8]. Similarly, Aciti developed a regression model for estimating concrete strength through non-destructive testing methods and then performed statistical tests to verify the model [7].

However, predicting the behavior of HPC is relatively more difficult than predicting the behavior of conventional concrete. Since the relationship between components and concrete properties is highly non-linear, mathematically modeling the compressive strength of HPC based on available data is difficult [30,31]. Therefore, conventional methods are often unsuitable for predicting concrete compressive strength [9]. Since conventional materials models are inadequate for simulating complex non-linear behaviors and uncertainties, researchers have proposed various AI techniques for enhancing prediction accuracy [9,13,15,20,21]. Boukhatem et al. showed that simulation models, decision support systems, and AI techniques are useful and powerful tools for solving complex problems in concrete technology [12].

Notably, researchers have applied or evaluated the capability of ANNs to predict strength and other concrete behaviors [29,30,32–34]. Ni and Wang, for instance, used multi-layer feed-forward neural networks to predict 28-day CCS based on various factors [32]. Altun et al. further showed that the ANN was superior to the regression method in estimating CCS [29]. Yeh also successfully used an ANN to predict the slump of concrete with fly ash and blast furnace slag [30]. The use of an adaptive probabilistic neural network for improving accuracy in predicting CCS was studied by Lee et al. [35].

Meanwhile, the SVM has excellent generalization capability when solving non-linear problems and can also overcome the problem of small sample size. An SVM analysis was used to estimate the temperatures at which concrete structures are damaged by fire [14]. Moreover, Gupta investigated the potential use of SVM for predicting CCS [18] by combining a radial basis function with SVM. Yan and Shi used SVM to predict the elastic modulus of normal and high strength concrete; the analytical results showed that the SVM outperformed other models [21].

Similarly, evolutionary algorithm-based methodologies have been used for knowledge discovery. Cheng et al. proposed an advanced hybrid AI model that fused fuzzy logic, weighted SVM
and fast messy genetic algorithms to predict compressive strength in HPC [16]. Comparisons showed that their model was better than SVM and a back-propagation neural network. Additionally, Mousavi et al. applied gene expression programming (GEP), a subset of GP, to approximate the compressive strength of various HPC mixes [36]. The prediction performance of the optimal GEP model was superior to that of regression-based models. Yeh and Lien combined an operation tree and a genetic algorithm into a proposed Genetic Operation Tree (GOT) for automatically obtaining self-organized formulas that accurately predict HPC compressive strength [9]. Their comparisons showed that GOT was more accurate than nonlinear regression formulas. Again, however, GOT was less accurate than neural network models.

As the need for prediction accuracy increases, complexity approaches that of combined ML techniques. The unique advantages of ensemble learning methods are apparent when solving problems involving a small sample size, high dimensionality, and complex database structures [24,27,37–39]. Chou and Tsai, for example, improved accuracy in predicting HPC compressive strength by using a novel hierarchical approach that combined classification and regression techniques [23].

Generally, various ensemble approaches may prove to be efficient tools for solving problems that are difficult or impossible to solve by individual ML techniques or by conventional regression methods. However, very few studies have compared various single and ensemble learning techniques for predicting HPC compressive strength. Therefore, the objective of this study was to evaluate the usefulness of, and identify in terms of error rate, the best ML technique for predicting HPC compressive strength.

3. Methodology

Machine learning technologies are now used in many fields to simulate materials behavior. For predicting HPC compressive strength, ML-based methodologies include artificial neural networks, classification and regression trees (CART), linear regression (LR), and SVM. These techniques were chosen because they are the most popular and most widely applied techniques in related works (cf. Sections 1 and 2), and some of them are also recognized as top data mining algorithms [40].

3.1. Machine learning techniques

3.1.1. Artificial neural network

The ANN is a powerful tool for solving very complex problems. Essentially, the processing elements of a neural network resemble neurons in the human brain, and the network consists of many simple computational elements arranged in layers. The use of ANNs to predict CCS has been studied intensively [5,19,30,32,34]. Multilayer perceptron (MLP) neural networks are standard neural network models with an input layer containing a set of sensory input nodes representing concrete components, one or more hidden layers containing computation nodes, and an output layer containing one computation node representing CCS. Like any intelligence model, ANNs have learning capability.

The most widely used and effective learning algorithm for training an MLP neural network is the back-propagation (BP) algorithm, which adjusts connection weights and bias values during training. An activated neuron in a hidden or output layer can be expressed as

net_j = Σ_i w_ji x_i  and  y_j = f(net_j)    (2)

where net_j is the activation of the jth neuron, i ranges over the neurons in the preceding layer, w_ji is the weight of the connection between neuron j and neuron i, x_i is the output of neuron i, and y_j is given by the sigmoid (logistic) transfer function

f(net_j) = 1 / (1 + e^(−k·net_j))    (3)

where k controls the function gradient.

The formula for training and updating the weights w_ji in each cycle h is

w_ji(h) = w_ji(h − 1) + Δw_ji(h)    (4)

The change Δw_ji(h) is

Δw_ji(h) = η δ_pi x_pi + α Δw_ji(h − 1)    (5)

where η is the learning rate parameter, δ_pi is the propagated error, x_pi is the output of neuron i for record p, α is the momentum parameter, and Δw_ji(h − 1) is the change in w_ji in the previous cycle.

3.1.2. Support vector machine

The SVM, which was developed by Vapnik in 1995 [41], is widely used for classification, forecasting and regression. The high learning capability of SVMs has been confirmed in the civil engineering field [14,15,18,21]. In this supervised learning method, SVMs are generated from the input–output mapping functions of a labeled training dataset. An SVM target can be either a classification target, which has only two values (i.e., 0 and 1), or a regression target, which has a continuous real value. The regression model of SVMs is typically used to construct the input–output model because it effectively solves nonlinear regression problems [42].

The input for SVM regression is first mapped to an n-dimensional feature space using a fixed mapping procedure. Nonlinear kernel functions then fit the high-dimensional feature space, in which the input data become more separable than in the original input space. The linear model in this space is f(x, ω), which can be expressed by the following equation:

f(x, ω) = Σ_{j=1..n} ω_j g_j(x) + b    (6)

where g_j(x) is a set of nonlinear transformations from the input space, b is a bias term, and ω denotes the weight vector estimated by minimizing the regularized risk function that includes the empirical risk.

Estimation quality is also measured by a loss function L_ε, where

L_ε = L_ε(y, f(x, ω)) = { 0 if |y − f(x, ω)| ≤ ε;  |y − f(x, ω)| otherwise }    (7)

The novel feature of SVM regression is its ε-insensitive loss function for computing a linear regression function in the new higher-dimensional feature space while simultaneously decreasing model complexity by minimizing ||ω||². This function is introduced by including the nonnegative slack variables ξ_i and ξ_i*, where i = 1, ..., n, which identify training samples outside the ε-insensitive zone. SVM regression can thus be formulated by minimizing the following function:

min  (1/2)||ω||² + C Σ_{i=1..n} (ξ_i + ξ_i*)    (8)

subject to
y_i − f(x_i, ω) ≤ ε + ξ_i*
f(x_i, ω) − y_i ≤ ε + ξ_i
ξ_i, ξ_i* ≥ 0, i = 1, ..., n

This optimization problem can be transformed into a dual problem, which is solved by

f(x) = Σ_{i=1..n_SV} (α_i* − α_i) K(x, x_i)  subject to 0 ≤ α_i* ≤ C, 0 ≤ α_i ≤ C    (9)

where n_SV is the number of support vectors. The kernel function is

K(x, x_i) = Σ_{i=1..m} g_i(x) g_i(x_i)    (10)

During training, selected SVM kernel functions (i.e., linear, radial basis, polynomial, or sigmoid functions) are used to identify support vectors along the function surface. The default kernel parameters depend on the kernel type and on the implemented software.

3.1.3. Classification and regression tree

The CART is a machine-learning method for constructing prediction models from data. This decision tree method constructs classification trees or regression trees depending on the variable type, which may be categorical or numerical [43,44]. Breiman et al. showed that a learning tree can be optimized by using a learning data set to prune the saturated tree and select among the obtained sequence of nested trees [43]. This process helps to retain a simple tree, which ensures robustness.

Depending on the target field, several impurity measures can be used to locate splits for CART models. For instance, Gini is usually applied to symbolic target fields, while the least-squared deviation method is used for automatically selecting continuous targets without explaining the selections. For node t in a CART, the Gini index g(t) is defined as

g(t) = Σ_{j≠i} p(j|t) p(i|t)    (11)

where i and j are target field categories.
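As a quick check of Eq. (11), note that Σ_{j≠i} p(j|t) p(i|t) equals 1 − Σ_j p(j|t)². The snippet below is a minimal generic sketch of that computation in Python, not CART's internal implementation; the sample labels are invented.

```python
from collections import Counter

def gini(labels):
    """Gini index of a node, Eq. (11): sum over j != i of p(j|t)*p(i|t),
    which simplifies to 1 - sum_j p(j|t)**2."""
    n = len(labels)
    proportions = [count / n for count in Counter(labels).values()]
    return 1.0 - sum(p * p for p in proportions)

print(gini(["a", "a", "a", "a"]))  # pure node -> 0.0
print(gini(["a", "a", "b", "b"]))  # even two-class split -> 0.5
```

A pure node scores 0 (no impurity), and an even two-class split reaches the two-class maximum of 0.5, which is why lower Gini values mark better splits.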


The probabilities in Eq. (11) are given by

p(j|t) = p(j, t) / p(t),  p(j, t) = p(j) N_j(t) / N_j,  and  p(t) = Σ_j p(j, t)    (12)

where p(j) is the prior probability value for category j, N_j(t) is the number of records in category j of node t, and N_j is the number of records of category j in the root node. Notably, when the improvement after a split during tree growth is determined using the Gini index, only records in node t and only root nodes with valid split-predictors are used to compute N_j(t) and N_j, respectively.

Many studies have investigated the use of CART in various fields, including road safety analysis, traffic engineering, and motor vehicle emissions [45]. A novel feature of CART is its automatic search for the best predictors and the best threshold values of all predictors to classify the target variable.

3.1.4. Linear regression

The multiple linear regression (LR) model, an extension of the simple regression model, determines the relationship between a numerical response variable and two or more explanatory variables [46,47]. This model specifies that an appropriate function of the fitted probability of the event is a linear function of the observed values of the available explanatory variables.

In the literature, LR is commonly used for modeling the mechanical properties of construction materials [6,7,20,23,48]. The computational problem addressed by LR is fitting a hyperplane to an n-dimensional space, where n is the number of independent variables. For a system with n inputs (independent variables), the X's, and one output (dependent variable), Y, the general least squares problem is to determine the unknown parameters of the linear regression model. Because of its simplicity, this study investigated the applicability of LR. The general formula for LR models is shown in Eq. (13):

Y = b_0 + b_1 x_1 + b_2 x_2 + ... + b_n x_n + e    (13)

In the proposed model, Y is the concrete compressive strength, b_i is a regression coefficient (i = 1, 2, ..., n), e is an error term, and the X values represent concrete attributes. Regression analysis estimates the unbiased values of the regression coefficients b_i against the training data set. The LR model applies four regression methods using ordinary least squares estimation: enter, stepwise, forward, and backward [49]. Equation (14) is a concise vector–matrix form:

Y = bx + e    (14)

Fig. 1. Ten-fold cross-validation method.

3.2. Ensemble models and cross-validation

The various supervised learning techniques, such as SVM, classification and regression tree, linear regression, and multilayer perceptron neural network [40], are typically used individually to construct single classifiers as the benchmark models. In ML, ensemble classifiers, or combinations of (different) classifiers, have proven superior to many individual classifiers [50,51].

Specifically, a combination of classifiers can compensate for errors made by the individual classifiers on different parts of the input space. Therefore, the strategy used in ensemble systems is to create many classifiers and combine their outputs such that the combination improves upon the performance of single classifiers in isolation [52].
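For numerical prediction, the simplest such combination is to average the outputs of the individual models. The sketch below illustrates the idea in plain NumPy; the three "models" are invented stand-in functions, not the paper's trained MLP, CART, SVM, or LR classifiers.

```python
import numpy as np

# Stand-in base predictors: each maps a feature matrix to strength estimates.
# In the study these would be trained MLP, CART, SVM, or LR models.
def model_a(X):
    return 30 + 40 * X[:, 0]

def model_b(X):
    return 28 + 44 * X[:, 0]

def model_c(X):
    return 32 + 38 * X[:, 0]

def ensemble_predict(models, X):
    """Combine individual outputs by averaging, the numerical analogue
    of pooling votes over probability estimates."""
    return np.mean([m(X) for m in models], axis=0)

X = np.array([[0.0], [0.5], [1.0]])
y_hat = ensemble_predict([model_a, model_b, model_c], X)
```

Because each base model errs on different parts of the input space, the averaged prediction tends to be closer to the truth than any single model's output.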

Fig. 2. Materials used in regular concrete and HPC. (Schematic: conventional concrete comprises air, water, cement, and fine and coarse aggregate; HPC replaces cement with a powder that includes fines.)

Table 1
Sources of datasets in the literature.

Dataset     Data source                Supplementary cementing materials                Laboratory    Sample size
Dataset 1   Yeh [31]                   Blast-furnace slag; fly ash; super-plasticizer   Taiwan        1030
            Yeh [54]                                                                                  103
Dataset 2   Videla et al. [55]         Blast-furnace slag; silica fume                  Chile         194
Dataset 3   Lam et al. [56]            Fly ash; silica fume                             Hong Kong     144
Dataset 4   Lim et al. [57]            Fly ash; silica fume; super-plasticizer          South Korea   104
Dataset 5   Safarzadegan et al. [58]   Metakaolin                                       Iran          100
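The ten-fold scheme of Fig. 1, applied to datasets of the sizes in Table 1, can be sketched in a few lines of plain NumPy. This is an illustration of the fold rotation only, not the study's WEKA procedure; the per-round score is a placeholder so the loop structure stands alone.

```python
import numpy as np

def ten_fold_splits(n_samples, seed=0):
    """Partition sample indices into 10 disjoint folds (cf. Fig. 1)."""
    indices = np.random.default_rng(seed).permutation(n_samples)
    return np.array_split(indices, 10)

folds = ten_fold_splits(1030)  # e.g., the size of Dataset 1 in Table 1
round_scores = []
for k in range(10):
    test_idx = folds[k]
    train_idx = np.concatenate([folds[j] for j in range(10) if j != k])
    # A real run would train a model on train_idx and score it on test_idx;
    # here the 'score' is a placeholder (fraction of data used for training).
    round_scores.append(len(train_idx) / 1030)

# Reported performance is the average over the ten validation rounds.
mean_score = float(np.mean(round_scores))
```

Each sample appears in exactly one test fold, so every record is used for validation once and for training nine times.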
Next, the three methods combined into ensemble classifiers in this study are described.

3.2.1. Voting method

The simplest method of combining multiple classifiers is voting. In the case of prediction, the outputs of the individual classifiers are pooled, and the class which receives the largest number of votes is selected as the final classification decision. In general, a numerical output can be determined by different combinations of probability estimates.

By combining two to four different individual classifiers, this study obtained eleven different ensemble classifiers. Ensembles of two different classifiers included MLP+CART, MLP+SVM, MLP+LR, CART+SVM, CART+LR, and SVM+LR; ensembles of three different classifiers included MLP+CART+SVM, MLP+CART+LR, CART+SVM+LR, and MLP+SVM+LR. One ensemble of four different classifiers was used: MLP+CART+SVM+LR.

3.2.2. Bagging method

The bagging method uses the bootstrap method to train several classifiers independently on different training sets [26]. Bootstrapping builds k replicate training datasets to construct k independent classifiers by randomly re-sampling the original training dataset with replacement. That is, each training example may appear repeatedly, or not at all, in any particular one of the k replicated training datasets. The k classifiers are then aggregated through an appropriate combination method, e.g., the average of probabilities. This study combined four individual learning techniques into an MLP ensemble, an SVM ensemble, a CART ensemble, and an LR ensemble.

3.2.3. Stacking method

Stacking, or stacked generalization [25], is a method of constructing multi-level classifiers hierarchically. The first level consists of several single classifiers, and the outputs of the first-level classifiers are used to train the 'stacked' classifier, i.e., the second-level classifier. Therefore, the final decision depends on the output of the stacked classifier. Unlike combination methods such as voting, which are performed by a 'static' combiner, the stacked classifier is a 'trainable' combiner; that is, it estimates the classifier errors for a particular learning dataset and then corrects them.

Since the SVM performs better than many supervised learning techniques in many pattern recognition problems, the stacking-based ensemble classifiers in this study were based on a two-level scheme with three different individual classifiers in the first level and an SVM regression in the second level [53].

3.2.4. Cross-validation method

The k-fold cross-validation algorithm is often used to minimize the bias associated with random sampling of training and hold-out data samples. Since Kohavi reported that ten-fold validation testing yields the optimal computational time and reliable variance [28], this work applied a stratified ten-fold cross-validation approach to assess model performance. This method categorizes a fixed number of data samples into ten subsets. In each of ten rounds of model building and validation, it chooses a different data subset for testing and trains the model with the remaining nine data

Table 2
The HPC attributes in the datasets.

Parameter Unit Min. Ave. Max. Direction


Dataset 1 – Taiwan
Cement kg/m3 102.0 276.50 540.0 Input
Blast-furnace slag kg/m3 0.0 74.27 359.4
Fly ash kg/m3 0.0 62.81 260.0
Water kg/m3 121.8 182.98 247.0
Super-plasticizer kg/m3 0.0 6.42 32.2
Coarse aggregate kg/m3 708.0 964.83 1145.0
Fine aggregate kg/m3 594.0 770.49 992.6
Age of testing Days 1.0 44.06 365.0
Concrete compressive strength MPa 2.3 35.84 82.6 Output
Dataset 2 – Chile
Coarse aggregate kg/m3 1105.0 1135.73 1173.0 Input
Fine aggregate kg/m3 488.0 602.71 700.0
Cement kg/m3 408.0 518.31 659.0
Silica fume kg/m3 0.0 24.57 59.0
Water kg/m3 160.0 164.74 168.0
Plasticizer kg/m3 2.2 2.73 3.3
High-range water-reducing kg/m3 6.7 9.30 14.5
Entrapped air content % 1.3 1.82 2.5
Age of testing Days 1.0 18.81 56.0
Concrete compressive strength MPa 21.2 67.11 113.7 Output
Dataset 3 – Hong Kong
Fly ash replacement ratio % 0.0 25.00 55.0 Input
Silica fume replacement ratio % 0.0 1.88 5.0
Total cementitious material kg/m3 400.0 436.67 500.0
Fine aggregate kg/m3 536.0 639.38 724.0
Coarse aggregate kg/m3 1086.0 1125.00 1157.0
Water content lit/m3 150.0 171.98 205.0
High rate water reducing agent lit/m3 0.0 4.87 13.0
Age of samples Days 3.0 60.67 180.0
Concrete compressive strength MPa 7.8 56.63 107.8 Output
Dataset 4 – South Korea
Water to binder ratio % 30.0 37.60 45.0 Input
Water content kg/m3 160.0 170.00 180.0
Fine aggregate % 37.0 46.00 53.0
Fly ash % 0.0 10.10 20.0
Air entraining ratio kg/m3 0.04 0.05 0.08
Super-plasticizer kg/m3 1.89 4.48 8.5
Concrete compressive strength MPa 38.0 52.68 74.0 Output
Dataset 5 – Iran
Cement kg/m3 320.0 357.4 400.0 Input
Coarse aggregate kg/m3 765.0 881.3 954.0
Fine aggregate kg/m3 796.0 884.7 1017.5
Water kg/m3 140.0 173.0 200.0
Metakaolin kg/m3 0.0 42.0 80.0
Age of testing Days 7.0 76.3 180.0
Concrete compressive strength MPa 19.0 49.3 82.5 Output
subsets. The test subset is used to validate model accuracy (Fig. 1). Algorithm accuracy is then expressed as the average accuracy acquired by the ten models in the ten validation rounds.

3.3. Performance evaluation methods

The following performance measures were used to evaluate the accuracy of the proposed machine learning models.

• Mean absolute error:

MAE = (1/n) Σ_{i=1..n} |y − y′|    (15)

where y′ is the predicted value, y is the actual value, and n is the number of data samples.

• Root mean squared error:

RMSE = √[(1/n) Σ_{i=1..n} (y′ − y)²]    (16)

• Mean absolute percentage error:

MAPE = (1/n) Σ_{i=1..n} |(y′ − y)/y|    (17)

To obtain a comprehensive performance measure, a synthesis index (SI) based on the three statistical measures MAE, RMSE, and MAPE was derived as

SI = (1/n) Σ_{i=1..n} (P_i − P_min,i) / (P_max,i − P_min,i)    (18)

where n is the number of performance measures and P_i is the ith performance measure. The SI ranges from 0 to 1; an SI value close to 0 indicates a highly accurate predictive model.

4. Modeling experiments

4.1. Data description and preparation

Data collected from reliable laboratory tests and published studies were used as experimental data in evaluations of the forecasting performance of the proposed ML models. Supplementary cement materials such as fly ash, blast furnace slag, metakaolin, and silica fume are often added to HPC [2] to improve its material properties (Fig. 2). Table 1 lists the five multi-nation datasets with various additional cementing materials.

Dataset 1 – Taiwan: In this experimental dataset originally collected by Yeh [31,54], the final set of 1133 samples of ordinary Portland cement containing various additives and cured under normal conditions was evaluated from numerous university research labs. All tests were performed on 15-cm cylindrical specimens of concrete prepared using standard procedures.

Dataset 2 – Chile [55]: Rapid hardening Portland blast-furnace slag cement, silica fume, coarse and fine crushed siliceous aggregates, and plasticizer and high-range water-reducing chemical admixtures were used. Concrete trial mixtures were proportioned using the coarse aggregate dosages recommended by ACI 211.4R-93. All samples were stored in a standard curing room (T = 20 ± 3 °C and relative humidity [RH] > 90%) until testing at varying ages. Compressive strength tests were performed on two 150 × 300 mm cylindrical samples according to ASTM C 39 on days 1, 3, 7, 28, and 56.

Dataset 3 – Hong Kong [56]: Concrete mixes were prepared at different ratios of water to cementitious materials, with low and high volumes of fly ash, and with or without the addition of small amounts of silica fume. The 144 different samples consisted of 24 different mixes cured for 3, 7, 28, 56, or 180 days. In each mix series, the percentage of cement replacement by fly ash (on a direct weight-to-weight basis) varied from 0% to 55%. Some mixes contained a further 5% silica fume replacement. The cementitious materials were Portland cement equivalent to ASTM Type I, low calcium fly ash equivalent to ASTM Class F, and a condensed silica fume commercially available in Hong Kong.

Dataset 4 – South Korea [57]: All materials used in this experiment were produced in South Korea. Compressive strength was 40–80 MPa. The W/B varies between 0.30 and 0.45, the amount of fly ash used varies from 0% to 20% of the total binder, and the contents of super-plasticizer and air-entraining agent are 0–2% and 0.010–0.013%, respectively, when expressed as a

Table 3
Default model parameter settings.

Model      Parameter                     Setting
MLP        Hidden layer                  1
           Learning rate                 0.3
           Momentum                      0.2
           Training/time                 500
           Validation threshold          20
CART       Initial count                 0.0
           Max depth                     1
           MinNum                        2.0
           minVarianceProp               0.001
           noPruning                     false
           numFolds                      3
           seed                          1
SVM        C                             1.0
           Kernel                        RBF
LR         EliminateColinearAttributes   True
           Minimal                       False
           Ridge                         1.0E−8
Voting     classifiers                   2–4 weka.classifiers.Classifier
           combinationRule               Average of Probabilities
Bagging    bagSizePercent                100
           numIterations                 10
Stacking   classifiers                   3 weka.classifiers.Classifier
           metaClassifier                SMOreg
           numFolds                      10

Table 4
Prediction performances of individual models.

Dataset    ML technique    MAE (MPa)    RMSE (MPa)    MAPE (%)    SI
Dataset 1 – Taiwan
           MLP             6.19         7.95          20.84       0.54
           CART            5.86         7.84          20.66       0.50
           SVM             3.75         5.59          12.03       0.00
           LR              7.87         10.11         29.89       1.00
Dataset 2 – Chile
           MLP             4.00         5.40          6.81        0.00
           CART            4.29         5.72          7.35        0.04
           SVM             8.02         10.32         16.95       0.61
           LR              11.38        13.49         21.87       1.00
Dataset 3 – Hong Kong
           MLP             4.28         5.81          10.51       0.00
           CART            7.29         9.28          16.31       0.56
           SVM             5.34         6.62          13.26       0.19
           LR              9.56         11.04         23.91       1.00
Dataset 4 – South Korea
           MLP             1.43         1.90          11.34       0.47
           CART            1.86         2.58          3.22        0.68
           SVM             1.31         1.73          2.95        0.00
           LR              1.72         2.20          3.41        0.45
Dataset 5 – Iran
           MLP             6.08         7.87          14.16       0.27
           CART            5.42         7.11          12.55       0.00
           SVM             7.73         10.10         18.83       1.00
           LR              6.47         7.86          15.68       0.40

Highlighted in bold denotes the best model and performance measure.

Table 5
Prediction performances of ensemble models.

Ensemble method   Model   MAE (MPa)   RMSE (MPa)   MAPE (%)   SI
Dataset 1 – Taiwan
Voting
MLP+CART 5.13 6.67 17.85 0.05
MLP+SVM 8.35 10.65 30.20 0.17
MLP+LR 6.13 7.87 22.29 0.09
CART+SVM 3.97 5.51 14.24 0.02
CART+LR 5.70 7.40 21.50 0.08
SVM+LR 5.27 7.05 19.88 0.06
MLP+CART+SVM 12.08 15.45 35.44 0.27
MLP+CART+LR 36.32 38.41 111.04 1.00
CART+SVM+LR 4.66 6.18 17.54 0.04
MLP+SVM+LR 6.51 8.47 23.90 0.10
MLP+CART+SVM+LR 28.84 31.71 80.89 0.76
Bagging
MLP 15.66 19.70 53.40 0.41
CART 4.26 5.71 15.02 0.02
SVM 4.01 5.75 14.09 0.02
LR 7.88 10.13 29.96 0.16
Stacking
MLP+CART+SVM 7.25 9.48 25.92 0.13
MLP+CART+LR 7.03 9.02 24.59 0.12
CART+SVM+LR 3.52 5.08 11.97 0.00
MLP+SVM+LR 5.92 7.66 20.64 0.08
Dataset 2 – Chile
Voting
MLP+CART 18.83 22.06 25.30 0.26
MLP+SVM 10.74 12.79 19.71 0.14
MLP+LR 54.97 70.13 89.40 1.00
CART+SVM 4.91 6.39 9.80 0.03
CART+LR 5.90 7.52 11.47 0.05
SVM+LR 8.51 10.87 17.98 0.11
MLP+CART+SVM 17.54 20.79 25.67 0.25
MLP+CART+LR 48.26 58.00 74.39 0.83
CART+SVM+LR 5.98 7.76 12.50 0.05
MLP+SVM+LR 7.84 9.65 15.27 0.08
MLP+CART+SVM+LR 37.98 45.95 56.16 0.63
Bagging
MLP 20.91 24.31 38.68 0.34
CART 3.82 5.13 6.50 0.00
SVM 9.66 12.06 19.82 0.13
LR 11.49 13.52 21.79 0.15
Stacking
MLP+CART+SVM 5.97 7.49 9.94 0.04
MLP+CART+LR 5.86 7.30 9.71 0.04
CART+SVM+LR 4.86 6.30 8.30 0.02
MLP+SVM+LR 5.66 7.15 9.37 0.03
Dataset 3 – Hong Kong
Voting
MLP+CART 19.22 22.46 31.77 0.08
MLP+SVM 10.75 12.87 28.26 0.04
MLP+LR 161.77 166.78 403.48 1.00
CART+SVM 4.89 6.26 11.35 0.00
CART+LR 5.78 7.12 14.74 0.01
SVM+LR 6.14 7.51 16.19 0.01
MLP+CART+SVM 18.04 21.53 34.79 0.08
MLP+CART+LR 95.62 99.98 251.60 0.59
CART+SVM+LR 4.81 6.09 12.54 0.00
MLP+SVM+LR 7.10 8.61 17.86 0.02
MLP+CART+SVM+LR 70.32 74.84 193.99 0.44
Bagging
MLP 21.33 26.03 67.16 0.12
CART 5.63 6.98 14.97 0.01
SVM 6.13 7.72 15.93 0.01
LR 9.64 11.10 24.14 0.03
Stacking
MLP+CART+SVM 6.40 8.09 15.74 0.01
MLP+CART+LR 5.57 7.38 14.22 0.01
CART+SVM+LR 5.22 7.01 12.18 0.00
MLP+SVM+LR 4.70 6.02 11.75 0.00

Dataset 4 – South Korea
Voting
MLP+CART 4.78 6.18 8.42 0.09
MLP+SVM 4.44 4.98 8.61 0.08
MLP+LR 45.53 49.72 79.73 1.00
CART+SVM 1.29 1.79 2.50 0.01
CART+LR 1.50 2.05 2.89 0.01
SVM+LR 1.26 1.65 2.44 0.01
MLP+CART+SVM 5.46 6.66 9.93 0.10
MLP+CART+LR 26.15 30.72 49.58 0.59
CART+SVM+LR 1.24 1.70 2.39 0.01
MLP+SVM+LR 1.35 1.78 2.64 0.01
MLP+CART+SVM+LR 17.79 21.51 33.72 0.40
Bagging
MLP 5.74 7.34 11.34 0.11
CART 1.68 2.37 3.22 0.02
SVM 1.52 2.03 2.95 0.01
LR 1.73 2.17 3.41 0.01
Stacking
MLP+CART+SVM 2.43 3.63 4.61 0.04
MLP+CART+LR 1.64 2.34 3.28 0.01
CART+SVM+LR 1.09 1.51 2.11 0.00
MLP+SVM+LR 1.13 1.59 2.20 0.01

Dataset 5 – Iran
Voting
MLP+CART 4.71 6.22 11.17 0.14
MLP+SVM 5.72 7.39 13.61 0.40
MLP+LR 4.97 6.39 11.85 0.20
CART+SVM 5.12 7.06 12.61 0.29
CART+LR 4.26 5.90 10.56 0.06
SVM+LR 6.57 8.15 16.21 0.62
MLP+CART+SVM 4.74 6.31 11.38 0.16
MLP+CART+LR 4.06 5.61 9.87 0.00
CART+SVM+LR 4.83 6.58 12.13 0.21
MLP+SVM+LR 5.15 6.60 12.59 0.26
MLP+CART+SVM+LR 4.47 6.07 10.99 0.11
Bagging
MLP 4.38 5.69 10.29 0.05
CART 4.25 5.78 10.17 0.04
SVM 7.93 10.47 19.13 1.00
LR 6.22 7.75 15.00 0.52
Stacking
MLP+CART+SVM 5.47 6.99 12.42 0.31
MLP+CART+LR 5.21 6.53 12.21 0.25
CART+SVM+LR 5.54 7.17 12.64 0.33
MLP+SVM+LR 6.21 7.83 14.62 0.51

Highlighted in bold denotes the best model and performance measure.
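The voting and bagging schemes compared in Table 5 are conceptually simple. The sketch below is a plain-Python illustration under two assumptions: voting for regression averages the base models' numeric predictions (the numeric analogue of WEKA's "Average of Probabilities" rule), and bagging aggregates base learners fitted to bootstrap resamples of the training set (bagSizePercent = 100, numIterations = 10, as in Table 3). All names are illustrative; this is not the WEKA implementation itself.

```python
import random
import statistics

def voting_predict(models, x):
    # Voting ensemble for regression: average the base models' predictions.
    return statistics.mean(m(x) for m in models)

def bagging_fit(train, fit_base, num_iterations=10, seed=1):
    # Fit one copy of the base learner per bootstrap resample of the
    # training data (each resample has the same size as the original set).
    rng = random.Random(seed)
    ensemble = [fit_base([rng.choice(train) for _ in train])
                for _ in range(num_iterations)]
    return lambda x: statistics.mean(m(x) for m in ensemble)

# Toy base "learner": predicts the mean strength of its training sample.
def fit_mean_predictor(data):
    avg = statistics.mean(y for _, y in data)
    return lambda x: avg
```

Any callable that maps a training set to a predictor can be plugged in as `fit_base`, which is how heterogeneous base learners (MLP, CART, SVM, LR) would be combined in practice.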

[Figure: horizontal bar chart (x-axis: average mean absolute error, 0.00-7.00 MPa) comparing, for each concrete dataset, the best stacking, bagging, voting, and single models: Dataset 1 – stacking CART+SVM+LR, bagging SVM, voting CART+SVM, SVM; Dataset 2 – stacking CART+SVM+LR, bagging CART, voting CART+SVM, MLP; Dataset 3 – stacking MLP+SVM+LR, bagging CART, voting CART+SVM+LR, MLP; Dataset 4 – stacking CART+SVM+LR, bagging SVM, voting CART+SVM+LR, SVM; Dataset 5 – stacking MLP+CART+LR, bagging CART, voting MLP+CART+LR, CART.]

Fig. 3. Average MAEs of prediction models.
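Several of the strongest models in Fig. 3 are stacking ensembles: first-level base learners generate predictions that become the input features of a second-level meta-model. The sketch below substitutes a simple gradient-descent least-squares combiner for the SMOreg meta-model used in the paper; every name here is illustrative, not taken from the study's code.

```python
def stack_features(base_models, X):
    # Level 1: each sample becomes a vector of base-model predictions.
    return [[m(x) for m in base_models] for x in X]

def fit_meta_weights(Z, y, lr=0.01, epochs=2000):
    # Level 2: weights over base predictions, fitted by stochastic
    # gradient descent on squared error (a stand-in for an SVM regressor).
    w = [0.0] * len(Z[0])
    for _ in range(epochs):
        for z, t in zip(Z, y):
            err = sum(wi * zi for wi, zi in zip(w, z)) - t
            w = [wi - lr * err * zi for wi, zi in zip(w, z)]
    return w

def stacked_predict(base_models, w, x):
    return sum(wi * m(x) for wi, m in zip(w, base_models))

# Toy usage: two "base models" and a linear target y = 2x + 3.
base = [lambda x: x, lambda x: 1.0]
X, y = [0.0, 1.0, 2.0, 3.0], [3.0, 5.0, 7.0, 9.0]
w = fit_meta_weights(stack_features(base, X), y)
```

In a faithful stacking setup the level-1 features are produced out-of-fold (cf. Wolpert [25]) rather than on the same data the base models were trained on; the toy above omits that step for brevity.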



Table 6
Comparison of best individual and ensemble models.

Dataset                   Predictive technique      MAE (MPa)   RMSE (MPa)   MAPE (%)
Dataset 1 (Taiwan)        SVM                       3.75        5.59         12.03
                          CART+SVM+LR (stacking)    3.52        5.08         11.97
Dataset 2 (Chile)         MLP                       4.00        5.40         6.81
                          CART (bagging)            3.82        5.13         6.50
Dataset 3 (Hong Kong)     MLP                       4.28        5.81         10.51
                          MLP+SVM+LR (stacking)     4.70        6.02         11.75
Dataset 4 (South Korea)   SVM                       1.31        1.73         2.95
                          CART+SVM+LR (stacking)    1.09        1.51         2.11
Dataset 5 (Iran)          CART                      5.42        7.11         12.55
                          MLP+CART+LR (voting)      4.06        5.61         9.87
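The k-fold cross-validation procedure underlying these model comparisons (cf. Kohavi [28]; 10 folds in the stacking setup of Table 3) can be sketched as follows. The `fit` argument is assumed to be a callable that trains on (x, y) pairs and returns a predictor; names are illustrative.

```python
def kfold_indices(n, k):
    # Partition indices 0..n-1 into k contiguous folds of near-equal size.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for s in sizes:
        folds.append(list(range(start, start + s)))
        start += s
    return folds

def cross_val_mae(data, fit, k=10):
    # Average out-of-fold mean absolute error over the k folds: each fold
    # is held out once while the learner is fitted on the remaining data.
    scores = []
    for fold in kfold_indices(len(data), k):
        held_out = set(fold)
        train = [d for i, d in enumerate(data) if i not in held_out]
        test = [data[i] for i in fold]
        predict = fit(train)
        scores.append(sum(abs(predict(x) - y) for x, y in test) / len(test))
    return sum(scores) / len(scores)
```

Shuffling the data before splitting is usually advisable so that folds are not biased by the original ordering; this sketch keeps the original order for simplicity.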

Dataset 4 – South Korea [57]: … percentage of dry solids to the binder content. Portland cement prepared according to the definition for ASTM type I was used. The coarse aggregate used was crushed granite (specific gravity, 2.7; fineness modulus, 7.2; maximum particle size, 19 mm). The fine aggregate was quartz sand (specific gravity, 2.61; fineness modulus, 2.94).

Dataset 5 – Iran [58]: Metakaolin (MK) is a very active natural pozzolan that can produce concrete with high early strength. Recently, the highly pozzolanic properties of MK have been studied intensively. Incorporating MK in concrete significantly increases resistance to chloride penetration. ASTM C150 type-I Portland cement (PC) was used for all of the concrete mixtures. Coarse and fine aggregates were crushed calcareous stone (maximum size, 19 mm) and natural sand, respectively. Potable water was used for casting and curing of all concrete specimens. MK was used as the supplementary cementitious material. The percentages of MK used to replace PC in this experiment were 0%, 5%, 10%, 12.5%, 15%, and 20% by mass of concrete added to the clinker. Data for 100 samples of concrete containing MK were obtained.

The multi-nation datasets used in this study have been confirmed in part or in whole in many studies of predictive models (Table 1). Based on four predictive techniques used as baseline models, this study used five experimental datasets to investigate the performance of the ensemble models. Table 2 summarizes the statistical parameters of each database, which were obtained from various university research laboratories, including descriptive data for various additives and various curing times under normal conditions. The response/target was CCS, and the predictor variables were the remaining attributes.

4.2. Model construction

The four ML techniques used as single and combined classifiers in the prediction models were the multilayer perceptron (MLP) neural network, classification and regression tree (CART), SVM, and linear regression (LR). Table 3 lists the parameters used to develop these individual and ensemble models. The WEKA software was used to implement the models.

5. Results and discussion

Tables 4 and 5 show the prediction performances of the single and ensemble learning based models, respectively, over the multi-nation datasets. Table 4 shows the cross-fold modeling performance. Among the single learning based models, SVM performs best over two datasets (Datasets 1-Taiwan and 4-South Korea), and MLP has the lowest error rates over two other datasets (Datasets 2-Chile and 3-Hong Kong). Notably, the best single ML model for Dataset 5-Iran, in terms of the synthesis index, is CART.

Overall, the ensemble models achieved good outcomes in terms of the overall performance measures. For example, among the models based on ensemble learning by the voting approach, CART+SVM performed best in Datasets 1-4 (SI values of 0.02, 0.03, 0.00 and 0.01, respectively). For the bagging approach, the SVM or CART ensemble had the lowest error rates over Datasets 1-5 (SI values of 0.02, 0.00, 0.01, 0.01 and 0.04, respectively). For the stacking approach, the first-level combination based on CART+SVM+LR often had the best performance over the multi-nation datasets.

Specifically, the ensemble learning based models that performed best, in terms of SI values, over the five datasets were stacking-based CART+SVM+LR for Datasets 1 and 4, bagging-based CART for Dataset 2, stacking-based MLP+SVM+LR for Dataset 3, and voting-based MLP+CART+LR for Dataset 5. Fig. 3 shows the average MAEs for the best prediction models based on single, voting, bagging, and stacking learning. For example, the best learning models were voting-based MLP+CART+LR (4.06 MPa in Dataset 5), stacking-based CART+SVM+LR (3.52 and 1.09 MPa in Datasets 1 and 4, respectively), single MLP (4.28 MPa in Dataset 3), and bagging-based CART (3.82 MPa in Dataset 2).

Table 6 further compares the best single and ensemble learning based models over the five datasets based on the k-fold cross-validation algorithm. The comparative results showed that the best ensemble learning based models outperformed the single learning based models over four datasets.

The comparison results indicated that SVM and MLP can be considered the two best individual learning based models and that the ensemble techniques, if chosen carefully, generally perform better than the best individual models for predicting HPC compressive strength. Specifically, voting-based MLP+CART+LR, bagging-based CART, and stacking-based MLP/CART+SVM+LR were the best configurations for constructing ensemble learning based models.

Moreover, the analytical results indicated that most ML models achieved good performance and obtained lower error values than those of previous works, such as the multi-gene genetic programming approach (MAE = 5.480 MPa) in Dataset 1 [59], the combination of hyperbolic and exponential equations (MAE = 5.000 MPa) in Dataset 2 [55], and the weighted genetic programming approach (RMSE = 2.180 MPa) in Dataset 4 [60].

6. Conclusions

The objective of this study was to perform a comprehensive comparison of various learning techniques, used individually and in combination, for simulating concrete compressive strength based on multi-nation datasets with diverse additive materials. Four individual ML techniques (MLP, SVM, CART, and LR) were used to construct the prediction models as benchmark baselines. The ensemble learning based models used voting, bagging, and stacking approaches to combine multiple single learning based models.

The experimental results showed that the best individual learning techniques were SVM and MLP, except for Dataset 5.

On average, the stacking-based ensemble model composed of MLP/CART, SVM, and LR as the first-level models and SVM as the second-level model performed best in terms of MAE, RMSE, and MAPE (Datasets 1, 3, and 4).

Generally, ensemble learning techniques outperformed individual learning techniques in predicting HPC compressive strength. However, individual ML based models should be selected carefully to obtain the best ensemble model. Specifically, the best individual ML model (i.e., MLP) sometimes provided lower error rates than the ensemble learning based models (Dataset 3).

The contribution of this paper to the domain knowledge is to propose and validate the machine learning, voting, bagging, and stacking techniques for simulating concrete compressive strength. To maximize ease of use and modeling efficiency, this work used the default settings of the individual and ensemble models in WEKA. Therefore, further studies are needed to explore how the parameters of these models can be optimized automatically.

References

[1] Sobhani J, Najimi M, Pourkhorshidi AR, Parhizkar T. Prediction of the compressive strength of no-slump concrete: a comparative study of regression, neural network and ANFIS models. Constr Build Mater 2010;24(5):709–18.
[2] Kosmatka SH, Wilson ML. Design and control of concrete mixtures, EB001. 15th ed. Skokie (IL, USA): Portland Cement Association; 2011.
[3] Bharatkumar BH, Narayanan R, Raghuprasad BK, Ramachandramurthy DS. Mix proportioning of high performance concrete. Cement Concr Compos 2001;23(1):71–80.
[4] Papadakis VG, Tsimas S. Supplementary cementing materials in concrete: Part I: Efficiency and design. Cem Concr Res 2002;32(10):1525–32.
[5] Prasad BKR, Eskandari H, Reddy BVV. Prediction of compressive strength of SCC and HPC with high volume fly ash using ANN. Constr Build Mater 2009;23(1):117–28.
[6] Bhanja S, Sengupta B. Investigations on the compressive strength of silica fume concrete using statistical methods. Cem Concr Res 2002;32(9):1391–4.
[7] Atici U. Prediction of the strength of mineral-addition concrete using regression analysis. In: Concrete Research. Thomas Telford Ltd.; 2010. p. 585–92.
[8] Zain MFM, Abd SM. Multiple regression model for compressive strength prediction of high performance concrete. J Appl Sci 2009;9(1):155–60.
[9] Yeh IC, Lien L-C. Knowledge discovery of concrete material using Genetic Operation Trees. Expert Syst Appl 2009;36(3, Part 2):5807–12.
[10] Topçu İB, Sarıdemir M. Prediction of compressive strength of concrete containing fly ash using artificial neural networks and fuzzy logic. Comput Mater Sci 2008;41(3):305–11.
[11] Reich Y. Machine learning techniques for civil engineering problems. Comput Aided Civil Infrastruct Eng 1997;12(4):295–310.
[12] Boukhatem B, Kenai S, Tagnit-Hamou A, Ghrici M. Application of new information technology on concrete: an overview. J Civil Eng Manage 2011;17(2):248–58.
[13] Tiryaki S, Aydın A. An artificial neural network model for predicting compression strength of heat treated woods and comparison with a multiple linear regression model. Constr Build Mater 2014;62(0):102–8.
[14] Chen B-T, Chang T-P, Shih J-Y, Wang J-J. Estimation of exposed temperature for fire-damaged concrete using support vector machine. Comput Mater Sci 2009;44(3):913–20.
[15] Majid A, Khan A, Javed G, Mirza AM. Lattice constant prediction of cubic and monoclinic perovskites using neural networks and support vector regression. Comput Mater Sci 2010;50(2):363–72.
[16] Cheng M-Y, Chou J-S, Roy AFV, Wu Y-W. High-performance concrete compressive strength prediction using time-weighted evolutionary fuzzy support vector machines inference model. Automat Constr 2012;28(0):106–15.
[17] Peng C-H, Yeh IC, Lien L-C. Building strength models for high-performance concrete at different ages using genetic operation trees, nonlinear regression, and neural networks. Eng Comput 2010;26(1):61–73.
[18] Gupta S. Support vector machines based modelling of concrete strength. In: Proceedings of world academy of science: engineering & technology, vol. 36; 2007.
[19] Dantas ATA, Batista Leite M, de Jesus Nagahama K. Prediction of compressive strength of concrete containing construction and demolition waste using artificial neural networks. Constr Build Mater 2013;38:717–22.
[20] Chou J-S, Chiu C, Farfoura M, Al-Taharwa I. Optimizing the prediction accuracy of concrete compressive strength based on a comparison of data-mining techniques. J Comput Civil Eng 2011;25(3):242–53.
[21] Yan K, Shi C. Prediction of elastic modulus of normal and high strength concrete by support vector machine. Constr Build Mater 2010;24(8):1479–85.
[22] Uysal M, Tanyildizi H. Predicting the core compressive strength of self-compacting concrete (SCC) mixtures with mineral additives using artificial neural network. Constr Build Mater 2011;25(11):4105–11.
[23] Chou J-S, Tsai C-F. Concrete compressive strength analysis using a combined classification and regression technique. Automat Constr 2012;24(0):52–60.
[24] Singh KP, Gupta S, Rai P. Identifying pollution sources and predicting urban air quality using ensemble learning methods. Atmos Environ 2013;80:426–37.
[25] Wolpert DH. Stacked generalization. Neural Networks 1992;5(2):241–59.
[26] Breiman L. Bagging predictors. Mach Learn 1996;24(2):123–40.
[27] Dietterich T. Ensemble methods in machine learning. In: Multiple classifier systems. Berlin (Heidelberg): Springer; 2000. p. 1–15.
[28] Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. In: International joint conference on artificial intelligence. Morgan Kaufmann; 1995. p. 1137–43.
[29] Altun F, Kişi Ö, Aydin K. Predicting the compressive strength of steel fiber added lightweight concrete using neural network. Comput Mater Sci 2008;42(2):259–65.
[30] Yeh IC. Analysis of strength of concrete using design of experiments and neural networks. J Mater Civ Eng 2006;18(4):597–604.
[31] Yeh IC. Modeling of strength of high-performance concrete using artificial neural networks. Cem Concr Res 1998;28(12):1797–808.
[32] Ni H-G, Wang J-Z. Prediction of compressive strength of concrete by neural networks. Cem Concr Res 2000;30(8):1245–50.
[33] Parichatprecha R, Nimityongskul P. Analysis of durability of high performance concrete using artificial neural networks. Constr Build Mater 2009;23(2):910–7.
[34] Topçu İB, Sarıdemir M. Prediction of properties of waste AAC aggregate concrete using artificial neural network. Comput Mater Sci 2007;41(1):117–25.
[35] Lee JJ, Kim D, Chang SK, Nocete CFM. An improved application technique of the adaptive probabilistic neural network for predicting concrete strength. Comput Mater Sci 2009;44(3):988–98.
[36] Mousavi SM, Aminian P, Gandomi AH, Alavi AH, Bolandi H. A new predictive model for compressive strength of HPC using gene expression programming. Adv Eng Softw 2012;45(1):105–14.
[37] Adeodato PJL, Arnaud AL, Vasconcelos GC, Cunha RCLV, Monteiro DSMP. MLP ensembles improve long term prediction accuracy over single networks. Int J Forecast 2011;27(3):661–71.
[38] Erdal HI, Karakurt O, Namli E. High performance concrete compressive strength forecasting using ensemble models based on discrete wavelet transform. Eng Appl Artif Intell 2013(0).
[39] Erdal HI, Karakurt O. Advancing monthly streamflow prediction accuracy of CART models using ensemble learning paradigms. J Hydrol 2013;477:119–28.
[40] Wu X, Kumar V, Ross Quinlan J, Ghosh J, Yang Q, Motoda H, et al. Top 10 algorithms in data mining. Knowl Inf Syst 2008;14(1):1–37.
[41] Vapnik VN. The nature of statistical learning theory. New York: Springer-Verlag; 1995.
[42] Smola A, Schölkopf B. A tutorial on support vector regression. Stat Comput 2004;14(3):199–222.
[43] Breiman L, Friedman J, Olshen R, Stone C. Classification and regression trees. New York: Chapman & Hall/CRC; 1984.
[44] Loh W-Y. Classification and regression trees. Wiley Interdiscipl Rev Data Mining Knowl Discov 2011;1(1):14–23.
[45] de Oña J, de Oña R, Calvo FJ. A classification tree approach to identify key factors of transit service quality. Expert Syst Appl 2012;39(12):11164–71.
[46] Neter J, Kutner MH, Nachtsheim CJ, Wasserman W. Applied linear statistical models. 4th ed. McGraw-Hill/Irwin; 1996.
[47] Liang H, Song W. Improved estimation in multiple linear regression models with measurement error and general constraint. J Multivar Anal 2009;100(4):726–41.
[48] Chen L. A multiple linear regression prediction of concrete compressive strength based on physical properties of electric arc furnace oxidizing slag. Int J Appl Sci Eng 2010;7(2):153–8.
[49] Yan X, Su XG. Linear regression analysis: theory and computing. Singapore: World Scientific Publishing Co. Pte. Ltd.; 2009.
[50] Frosyniotis D, Stafylopatis A, Likas A. A divide-and-conquer method for multi-net classifiers. Pattern Anal Appl 2003;6(1):32–40.
[51] Ghosh J. Multiclassifier systems: back to the future. In: Roli F, Kittler J, editors. Multiple classifier systems. Berlin (Heidelberg): Springer; 2002. p. 1–15.
[52] Polikar R. Ensemble based systems in decision making. IEEE Circuits Syst Mag 2006;6(3):21–45.
[53] Lee S-W, Byun H. A survey on pattern recognition applications of support vector machines. Int J Pattern Recognit Artif Intell 2003;17(03):459–86.
[54] Yeh IC. Modeling slump of concrete with fly ash and superplasticizer. Comput Concr 2008;5(6):559–72.
[55] Videla C, Gaedicke C. Modeling Portland blast-furnace slag cement high-performance concrete. ACI Mater J 2004;101(5):365–75.
[56] Lam L, Wong YL, Poon CS. Effect of fly ash and silica fume on compressive and fracture behaviors of concrete. Cem Concr Res 1998;28(2):271–83.
[57] Lim C-H, Yoon Y-S, Kim J-H. Genetic algorithm in mix proportioning of high-performance concrete. Cem Concr Res 2004;34(3):409–20.
[58] Safarzadegan GS, Bahrami Jovein H, Ramezanianpour AA. Hybrid support vector regression – particle swarm optimization for prediction of compressive strength and RCPT of concretes containing metakaolin. Constr Build Mater 2012;34(0):321–9.
[59] Gandomi A, Alavi A. A new multi-gene genetic programming approach to nonlinear system modeling. Part I: Materials and structural engineering problems. Neural Comput Appl 2012;21(1):171–87.
[60] Tsai H-C, Lin Y-H. Predicting high-strength concrete parameters using weighted genetic programming. Eng Comput 2011;27(4):347–55.
