You are on page 1of 19

buildings

Article
Predicting Compressive Strength of Blast Furnace Slag and
Fly Ash Based Sustainable Concrete Using Machine
Learning Techniques: An Application of Advanced
Decision-Making Approaches
Syyed Adnan Raheel Shah 1, *, Marc Azab 2 , Hany M. Seif ElDin 2 , Osama Barakat 2 ,
Muhammad Kashif Anwar 1 and Yasir Bashir 3

1 Department of Civil Engineering, Pakistan Institute of Engineering and Technology, Multan 66000, Pakistan;
kashifanwar723@gmail.com
2 College of Engineering and Technology, American University of the Middle East, Egaila 54200, Kuwait;
marc.azab@aum.edu.kw (M.A.); hany.seifeldin@aum.edu.kw (H.M.S.E.); osama.barakat@aum.edu.kw (O.B.)
3 North Region, Punjab Aab-e-Pak Authority, Lahore 54660, Pakistan; yasirbashir.8@gmail.com
* Correspondence: syyed.adnanraheelshah@uhasselt.be; Tel.: +92-300-79-14-248

Abstract: The utilization of waste industrial materials such as Blast Furnace Slag (BFS) and Fly Ash
(F. Ash) will provide an effective alternative strategy for producing eco-friendly and sustainable
concrete production. However, testing is a time-consuming process, and the use of soft machine
learning (ML) techniques to predict concrete strength can help speed up the procedure. In this study,
Citation: Shah, S.A.R.; Azab, M.; Seif
artificial neural networks (ANNs) and decision trees (DTs) were used for predicting the compressive
ElDin, H.M.; Barakat, O.; Anwar,
strength of the concrete. A total of 1030 datasets with eight factors (OPC, F. Ash, BFS, water, days,
M.K.; Bashir, Y. Predicting
SP, FA, and CA) were used as input variables for the prediction of concrete compressive strength
Compressive Strength of Blast
Furnace Slag and Fly Ash Based
(response) with the help of training and testing individual models. The reliability and accuracy of
Sustainable Concrete Using Machine the developed models are evaluated in terms of statistical analysis such as R2 , RMSE, MAD and
Learning Techniques: An Application SSE. Both models showed a strong correlation and high accuracy between predicted and actual
of Advanced Decision-Making Compressive Strength (CS) along with the eight factors. The DT model gave a significant relation
Approaches. Buildings 2022, 12, 914. to the CS with R2 values of 0.943 and 0.836, respectively. Hence, the ANNs and DT models can be
https://doi.org/10.3390/ utilized to predict and train the compressive strength of high-performance concrete and to achieve
buildings12070914 long-term sustainability. This study will help in the development of prediction models for composite
Academic Editors: materials for buildings.
Jurgita Antucheviciene and
Oldrich Sucharda Keywords: sustainability; reinforced concrete; industrial wastes; compressive strength; green and
sustainable concrete; high performance concrete
Received: 24 May 2022
Accepted: 17 June 2022
Published: 29 June 2022

Publisher’s Note: MDPI stays neutral 1. Introduction


with regard to jurisdictional claims in
The construction industry is one of the world’s largest producers of CO2 emissions.
published maps and institutional affil-
As a result, they have a tremendous impact on the environment [1]. Each year, carbon
iations.
emissions are produced mostly owing to the manufacturing and usage of cement in building
projects across the globe [2]. Similarly, cement consumption is estimated to be nearly
4 billion tons (BT) per year, with one ton of ordinary cement producing around about the
Copyright: © 2022 by the authors.
same rate of CO2 emissions [3]. Cement production accounts for 7% of all CO2 pollution
Licensee MDPI, Basel, Switzerland. in the environment [3]. Therefore, supplementary cementitious materials (SCMs) need
This article is an open access article to be employed as a cement replacement in the construction sector to achieve worldwide
distributed under the terms and sustainable development. Fly Ash is a fine powder that is produced during the combustion
conditions of the Creative Commons of pulverized coal-based power stations, while blast furnace slag (BFS) is a by-product
Attribution (CC BY) license (https:// obtained after the processing of iron ore; these are the most easily available SCMs in the
creativecommons.org/licenses/by/ world. Cement production is a major source of carbon dioxide (CO2 ) emission. As a result,
4.0/). they have a significant environmental impact. The modern industry currently produces

Buildings 2022, 12, 914. https://doi.org/10.3390/buildings12070914 https://www.mdpi.com/journal/buildings


Buildings 2022, 12, 914 2 of 19

a considerable amount of waste material and has a negative impact on the environment.
Concrete researchers are particularly interested in Fly Ash and BFS, which are among
the different industrial by-products. Making use of such materials will help to reduce
pollution while also providing a cost-effective approach for making concrete. Furthermore,
the incorporation of BFS and Fly Ash as a partial cement substitute in concrete could greatly
increase concrete qualities such as compressive strength ( f c0 ), permeability, and concrete
durability [4–8]. As a result, determining the BFS and Fly Ash proportions for design mix
is critical and important, especially in terms of improving concrete strength.
Several research studies have been conducted to estimate the BFS concentration in a
concrete design mixture. According to Oner and Akyuz [9], the optimal content of BFS is
about 55–59% of the overall OPC content. Shariq et al. [10] investigated the influence of
three BFS dosages ranging from 20 to 60% on the compressive strength (CS) of concrete at
three various water-to-binder ratios. For all W/C ratios, the concrete compressive strength
incorporating 40% of BFS content is superior to those of concrete specimens made with
20% and 60% BFS contents. Siddique and Kaur [11] found that replacing 20% of OPC with
BFS in high-temperature constructions is feasible. Another researcher [12] found that the
compressive strength of concrete containing 60% of BFS achieved the maximum CS value
after 28 days. However, Majhi et al. [13] had achieved the optimum CS of concrete with
40% of BFS substitution.
Moreover, numerous experimental studies have been conducted to determine the BFS
or Fly Ash substitution proportion in the concrete mixture. The strength of concrete has
been studied by Gehlot [14] in which samples made with a combination of BFS or F. Ash
at six various BFS/ Fly Ash wt % ratios (i.e., 0/0, 10/20, 20/10, 30/0, and 0/30). It was
observed that the more the BFS to Fly Ash wt %, the better the CS of the concrete. Similarly,
the concrete compressive strength was also studied using various ratios of Fly Ash and BFS
in this study [15]. The set of experimental results and the variation of the mix proportions
can have a significant impact on the reliability of concrete strength estimation. As a result,
a new technique is needed to reduce the amount of time and money spent on experiments
due to the large number of tests. There is a need to introduce and employ a universal
predictive model with a high predictive accuracy.
Nowadays, Artificial Intelligence (AI) has become a widely used model for addressing
a variety of challenges in the engineering and science field [16–20]. AI methods have
been built to forecast various concrete characteristics: for example, the shear resistance
of RC beams [21,22], corrosion factors in PCC sewers [23], concrete crack width [24], the
maximum strength of RC beams [25], compressive strength of recycled concrete aggregate
(RCA) [26], strength of concrete containing SF [27], BFS [28,29] and Fly Ash [30,31], and
concrete strength of geopolymer concrete (GPC) [32]. Artificial neural networks (ANNs) is
the most effective and powerful AI model for simulating complicated technical situations
right now [33,34]. The ANNs model can solve complex, non-linear equations, particularly
those in which the correlation between the multiple inputs or outputs variables is difficult
to establish directly. For instance, Bilim et al. [35] developed an ANNs model to estimate
the CS of the concrete using 225 datasets. Researchers used six design variables such as
OPC content, ground granulated blast furnace slag content (GBFS), water/binder ratio,
superplasticizer (SP), aggregate ratio, and curing age. The optimum R-square (R2 ) value
of 0.96 was achieved for this ANNs model. Chopra et al. [36] suggested an ANNs model
having one hidden layer and comprising 50 neurons to estimate the CS of the concrete
samples made with both BFS and Fly Ash contents, based on 204 test data. A coefficient
of determination (R2 ) of 0.92 was achieved, which is used to analyze the accuracy of such
an ANNs model. Furthermore, Yeh [37] developed an ANNs model on 990 datasets for
forecasting the CS of concrete incorporating various percentage ratios of BFS and F. Ash
contents. Similarly, other researchers also estimated the CS of concrete using an ANNs
model and found the optimum R-square value of 0.92, which showed the higher accuracy of
such a model [37]. However, the range of datasets and ANNs design are mainly dependent
upon the number of layers, or neurons (nodes) in each hidden layers, and they have a
ANN effectiveness.
The novelty and research significance of this study is to explore the effects of differe
factors on the compressive strength of concrete made with BFS and Fly Ash using ad
vanced machine learning algorithms such as Artificial Neural Networks (ANNs) and D
Buildings 2022, 12, 914 cision Tree (DT). The suggested ML algorithms were selected based on their 3 of reputatio
19

and higher precision and accuracy in different scientific fields, such as biomedical an
construction fields. Moreover, the main objective of this study is based on the performan
considerable impact on the efficiency of the ANNs model. As a result, determining the
and comparative analysis of compressive strength using a supervised learning algorith
number of neurons and the hidden layer is important for improving ANN effectiveness.
of machineThe learning. Theresearch
novelty and performance of the
significance adopted
of this study is models
to explorewas compared
the effects based on th
of different
coefficient
factorsof
oncorrelation (R )strength
the compressive2 and root mean made
of concrete square values
with orFly
BFS and errors (RMSE)
Ash using of each alg
advanced
rithm. The purpose of this study is to better comprehend the effect of modelTree
machine learning algorithms such as Artificial Neural Networks (ANNs) and Decision paramete
(DT). The suggested ML algorithms were selected based on their reputation and higher
and the reliability of the results from different machine learning techniques. The outcom
precision and accuracy in different scientific fields, such as biomedical and construction
of the fields.
experimental research
Moreover, the are also
main objective presented
of this alongonwith
study is based a comparative
the performance assessment
and compar-
both collective and single ML approaches. The statistical analyses and
ative analysis of compressive strength using a supervised learning algorithm of machine cross-validatio
with k-fold were
learning. The also used toofassess
performance each models
the adopted model's accuracy.
was compared based on the coefficient
of correlation (R2 ) and root mean square values or errors (RMSE) of each algorithm. The
purpose of this study is to better comprehend the effect of model parameters and the
2. Background
reliability of the results from different machine learning techniques. The outcomes of the
2.1. Supervised
experimentalMLresearch
Algorithms
are also presented along with a comparative assessment of both
collective and single ML approaches. The statistical analyses and cross-validation with
The model is presented with the specific dataset, also known as a feature vector
k-fold were also used to assess each model’s accuracy.
the supervised process of learning processes, which signifies that the databases have in
stances2. Background
of observations together with their desired output. Such predictive models ca
generate Supervised
2.1. ML function
an inferred Algorithms that defines the features vectors to result labels. Decisio
trees, randomThe model is presented
forest, with the
Ada Boost, specific dataset,
regression also known
analysis, as a feature
artificial neuralvector in the and
networks,
supervised process of learning processes, which signifies that the databases have instances
nearest neighbors are some of the most widely used and effective supervised machin
of observations together with their desired output. Such predictive models can generate an
learning approaches.
inferred function that defines the features vectors to result labels. Decision trees, random
forest, Ada Boost, regression analysis, artificial neural networks, and k-nearest neighbors
are some of the
2.2. Unsupervised MLmost widely used and effective supervised machine learning approaches.
Algorithms
Unsupervised
2.2. Unsupervisedlearning uses datasets that have very little or no information for th
ML Algorithms
result labels.
Unsupervised learning usesof
However, the goal such that
datasets models
have is
veryto little
findorano
link betweenforthe
information thevariabl
and/orresult
to discover inactivethe
labels. However, factors.
goal ofThe
suchprinciple
models is of
to independent component
find a link between analysis, h
the variables
and/or to discover inactive factors. The principle of independent component
erarchical and bi-clustering, and k-means clustering are some examples of unsupervise analysis,
hierarchical and bi-clustering, and k-means clustering are some examples of unsupervised
learning approaches as shown in Figure 1.
learning approaches as shown in Figure 1.

FigureFigure
1. Machine Learning
1. Machine Algorithms.
Learning Algorithms.
Buildings 2022, 12, 914 4 of 19

Artificial neural networks (ANNs), decision trees (DTs), deep learning (DL), genetic
programming (GP), and maximum margin classifier or SVM are all examples of machine
learning (ML) approaches that are now frequently utilized to forecast engineering and sci-
ence challenges [38]. To combat the limitations of conventional strategies, SVM is designed
in such a manner that it can cope with non-linear regression issues more effectively [39].
It can produce a superior optimized solution rather than a local optimum because of its
well-generalized potential. In contrast, decision tree (DTs) and random forest (RF) resemble
like a tree structure; they forecast results using roots and nodes [40]. DT employs a whole
database containing the target variable; however, RF chooses features factors at random
inside the variables that govern the number of trees in estimation. After that, the predic-
tion is summed and correlated to the maximum number of voters, resulting in a reliable
estimation. GP is one of the most modern ML-based algorithms that simulates Darwinian
evolution [41]. The non-linear interaction is described as an expression style tree. ML
techniques are frequently used to retrieve unknown trends, statistics, and relationships
from large datasets. However, this method makes use of database systems, along with ML
and statistical analysis. Simulations and predictions are completed using two methods [42].
One is the traditional strategy, which is based on a specific model, while the other is the
ensemble algorithm strategy [43]. According to previous studies on such approaches,
ensemble approaches appear to be more accurate than ordinary single ML algorithms [44].
Ensemble learning algorithms are used to teach so-called poor learners through data for
training before integrating them into a profound learner. For many decades, ML techniques
have been employed to forecast the outcome of numerous metrics. However, their applica-
tion in civil engineering has increased dramatically in recent years. The basic concept of
ML is like that of classical methods, although the non-linear function is more precise than
linear phenomena. Several prediction models (such as ANNs, DL, DT, GEP, RF, and SVM)
are often utilized nowadays. Jesika et al. [45] employed eleven models for the prediction of
shear strength in RC beams made with steel fiber. To forecast the mechanical behavior of
the SF concrete, Behnood et al. [4] employed ANNs based on multi-objective gray wolves
(MOGW) optimizer. For the forecasting of concrete strength, Kadir et al. [46] applied DT,
LR, ANNs, and SVR modeling. The mechanical strengths of waste concrete were predicted
using an ANNs approach by Getahun et al. [47]. Ling et al. [48] used SVM to estimate
concrete strength in coastal environments and compared it with the results of ANNs or
DT algorithms. Zaher et al. [49] forecast the mechanical properties of foamed concrete
by using several ML algorithms. Another researcher [50] also employed an ML model
for assessing the durability properties of RC structures. Suguru et al. [51] used machine
learning to construct an autonomous fracture detector for civil infrastructure. Learning data
were extracted from images of concrete specimens, and deep learning (DL) was utilized
to detect cracks. Wassim et al. [52] assess the performance and accuracy of the models.
Miao et al. [53] also employed ANNs, MLR and SVM to forecast the bond strength in
fiber-based RC structures. To conclude this, we used two ML algorithms based on their
higher precision and popularity in predicting the compressive strength of concrete. In
addition, the author plans to test more machine learning models to find the best accurate
and effective model for identifying mix design processes.

3. Materials and Methods


Machine learning (ML) is a promising artificial intelligence (AI) concept that is widely
used in Civil Engineering for an accurate prediction of material behavior. The present
study used ANNs and decision tree (DT)-based machine learning approaches to predict the
compressive strengths (CS) of concrete, as shown in Figure 2. These methods are the most
efficient data analysis algorithms. They were picked as they had already been utilized in
several past studies. The following section gives an overview of the ANNs, and DT-based
ML techniques employed in this study.
conventional models. The present study developed an empirical model to study the effects
of different parameters on the compressive strength of slag and silica fume-based con-
crete. For this purpose, the different models of machine learning were compared to esti-
mate the best predictive model or fit among them. In this study, two machine learning
Buildings 2022, 12, 914 models such as artificial neural networks and decision tree have been applied. The 5next
of 19
section describes each adopted machine learning approaches one by one.

FlowchartofofResearch
Figure2.2.Flowchart
Figure ResearchMethodology.
Methodology.

Machine
3.1. Decision learning
Tree is a quite efficient prediction technique in terms of computing and
(DT) Model
time consumption. It minimizes error margins to nearly negligible levels as compared to
A decision tree is a type of machine learning that resembles a tree structure. It is
conventional models. The present study developed an empirical model to study the effects
composed of tree-like branches and nodes. Inner nodes have outgoing corners, while
of different parameters on the compressive strength of slag and silica fume-based concrete.
leaves dopurpose,
For this not. The the
set different
of data used for regression
models of machineorlearning
classification is dividedto
were compared into multiple
estimate the
classes by an inner node according to a specific function. During the training
best predictive model or fit among them. In this study, two machine learning models such process, the
model parameters
as artificial neural are considered
networks as a given
and decision treefunction.
have been A procedure
applied. The which
next is considered
section as
describes
aeach
tree adopted
structuremachine
from given cases is the stimulant
learning approaches one by one. for the DT model. The performed ap-
proach gives the optimum decision tree by decreasing the fitness function. The DT model
based on regression
3.1. Decision Tree (DT)was employed in this research for the intended parameter utilizing
Model
the different
A decision tree is a type there
variables because were no
of machine classes in
learning theresembles
that records. The data
a tree collectionItisis
structure.
divided multiple times for individual parameters. At each split
composed of tree-like branches and nodes. Inner nodes have outgoing corners, node, the performed
while
leaves do not. The set of data used for regression or classification is divided into multiple
classes by an inner node according to a specific function. During the training process, the
model parameters are considered as a given function. A procedure which is considered
as a tree structure from given cases is the stimulant for the DT model. The performed
approach gives the optimum decision tree by decreasing the fitness function. The DT model
based on regression was employed in this research for the intended parameter utilizing
the different variables because there were no classes in the records. The data collection
Buildings 2022, 12, x FOR PEER REVIEW 6 of 19
Buildings 2022, 12, 914 6 of 19

algorithm computes the variation in between predicted and experimental readings for the
is divided weighting
pre-specified multiple times
factor.for individual
The parameters.
errors in the separatedAt eachare
nodes split node, thethrough-
comparable performed
outalgorithm computes
the variables, and thethe variation
split spot is in between predicted
determined and with
by the factors experimental readings
the least fit values.for
the
This pre-specified
process weighting
is continued factor.toThe
indefinitely errorsthe
improve inmodel
the separated nodes are comparable
efficiency.
throughout the variables, and the split spot is determined by the factors with the least fit
values. This
3.2. Artificial process
Neural is continued
Networks (ANNs)indefinitely
Model to improve the model efficiency.
The artificial neural network consists of neurons also known as perceptions. This
3.2. Artificial Neural Networks (ANNs) Model
model is based on non-linear modeling in which multiple inputs and output variables are
used to The artificial
predict neural network
the response consists
parameters. Thisofmodel
neurons also known
is working basedas on
perceptions. This
multi-linear
perceptions. This multi-layered structure of ANNs is responsible for the ability to estimateare
model is based on non-linear modeling in which multiple inputs and output variables
used to predict the response parameters. This model is working based on multi-linear
complex models. The simulation input is directed and multiplied by weight in a forward
perceptions. This multi-layered structure of ANNs is responsible for the ability to estimate
run. The bias is imposed to every sheet, and the observed prediction model is computed
complex models. The simulation input is directed and multiplied by weight in a forward
in the back pass. The estimated responses that fit the variables are measured. After that,
run. The bias is imposed to every sheet, and the observed prediction model is computed
loss is estimated. After considering the input data, the efficiency model predicts results.
in the back pass. The estimated responses that fit the variables are measured. After that,
The back-propagation mechanism can be used to evaluate the errors of the projected re-
loss is estimated. After considering the input data, the efficiency model predicts results.
sults. We use various loss mechanisms based on our performance and criteria. The back-
The back-propagation mechanism can be used to evaluate the errors of the projected
ward propagation of the ANNs model provides partial derivatives of the gradient descent
results. We use various loss mechanisms based on our performance and criteria. The
back into the system for the individual inputs. The loss of back propagation occurs during
backward propagation of the ANNs model provides partial derivatives of the gradient
this phase, and the model's weights are revised via a learning algorithm. The ANNs struc-
descent back into the system for the individual inputs. The loss of back propagation occurs
ture has three layers (input, intermediate and output layers). In this study, we used eight
during this phase, and the model’s weights are revised via a learning algorithm. The
parameters as an input
ANNs structure has variable to study
three layers the intermediate
(input, effect of eachand
parameter
outputon the compressive
layers). In this study,
strength
we usedof modified concrete as
eight parameters with
an BFS and
input Fly Ash,
variable towhich
study is theeffect
the output response
of each shown on
parameter
in Figure 3.
the compressive strength of modified concrete with BFS and Fly Ash, which is the output
response shown in Figure 3.

Figure
Figure 3. ANNs
3. ANNs Diagram
Diagram Using
Using ANNs
ANNs Modeling.
Modeling.

3.3. Activation Function and Learning Algorithms


3.3. Activation Function and Learning Algorithms
The artificial neural network (ANN) is a statistical projection mechanism constructed
The artificial neural network (ANN) is a statistical projection mechanism constructed
on existing attributes derived from the structure of the human brain. This framework is
on existing attributes derived from the structure of the human brain. This framework is
comprised of neurons, which are functional units. Weights are interconnected with neurons,
comprised of neurons, which are functional units. Weights are interconnected with neu-
which are normally chosen at random initially. Weights are adjusted accordingly by certain
rons, which are normally chosen at random initially. Weights are adjusted accordingly by
iterations in a learning phase to ultimately create the ideal network that can anticipate it
certain iterations in a learning phase to ultimately create the ideal network that can
with acceptable precision [54]. As a result, the desired outcome in a trained ANNs model
Buildings 2022, 12, 914 7 of 19

can be accomplished by processing the input data and considering the adjusted weights.
The network refines with time by computing the errors and matching the expected inputs
and outputs. The ML model improves with time, which implies that the precision of the
predictive model can be increased, and the projected outcomes are accurate. In this study,
non-linear activation algorithms such as sigmoid are utilized due to their superior reliability.
The main objective of training the ANNs model is to reduce the error function, which is
often known as the mean square error or MSE [55].

1 n
N i∑
MSE = (ti − ai ) (1)
=1

where “N” represents input variables, while the output and target output are represented
by “ti ” and “ai ”, respectively. The most common technique for learning ANNs is the back-
propagation approach. For improved curve fitting and productivity, the back-propagation
method is selected as a training model to predict the CS of the concrete in this study. The
RMSE was reduced during neural network optimization. Sigmoid transfer function has
been used for hidden and output layers. The eight neurons in the hidden layer were chosen
by a hit and trial approach, and it had been suggested in the literature that using fewer
neurons would improve data precision [56].

4. Material Properties and Description of Dataset


The collected dataset used for the predictions of concrete strength developed with
waste materials is extracted from previous research papers based on experimental test-
ing [37,57–61]. It consists of eight parameters, which are ordinary Portland cement (OPC),
blast furnace slag (BFS), Fly Ash (F. Ash), water, superplasticizer (SP), coarse aggregates
(CA), fine aggregates (FA) and curing days (Age); these were used to predict the compres-
sive strength of sustainable concrete for the overall 1030 dataset. Cement of type A and
coarse aggregate sizes up to 10 mm were utilized. Two waste materials such as F. Ash
collected from the power station and BFS from the local steel manufacturing industry were
employed. A superplasticizer called naphthalene formaldehyde was used. Due to the
greater size of coarse aggregates (above 20 mm) and under specific curing considerations,
and other certain factors, some of the concrete specimens were removed from the col-
lected data during the analysis. Moreover, previous studies used samples of various sizes
and forms, which were all transformed to samples comparable for a cylindrical 150 mm
(15 cm) dimension using standard procedures [37,62]. Their frequency distribution for each
parameter along with the concrete compressive strength is presented in Figure 4.
The performance of models was significantly influenced by the independent factors.
The variables utilized to train the models to estimate the compressive strength of high-
strength concrete were also beneficial in model prediction. The factors have a significant
impact on the data collected (output) in terms of the improved efficiency and performance.
ML is an efficient artificial modeling based on neurons to predict the multiple number of
characteristics by studying the relevant concentration of each factor in an artificial manner.
The statistical analysis of all the factors used for modeling is presented in Table 1. The
statistics metrics such as standard deviation, coefficient of variance, SE (standard error)
mean, mean, minimum and maximum ranges of all factors were evaluated. According to
study [63], the minimum ratio seen between independent factors and the dataset should be
at least three, and it was recommended to choose more than five for the effective prediction
of models. The numbers are considerably greater in this study, which used the 1030 dataset
of compressive strength with eight input factors. The processing step which determines the
SFC’s properties before developing a model is data collection. The validity and reliability
of ML-based empirical models were determined using 80% data for training and 20% data
for validation. The established model is assessed using the testing and validation of the
dataset, while the training set of data is used to develop it.
Q1 192 0 0 164.9 0 932 730.3 7 23.695
Median 272.9 22 0 185 6.35 968 779.51 28 34.443
Q3 350 143 118.27 192 10.16 1029.4 824.25 56 46.209
Min 102 0 0 121.75 0 801 594 1 2.332
Buildings
Max 2022, 12, 914 540 359.4 200.1 247 32.2 1145 992.6 365 8 of 19
82.599
Range 438 359.4 200.1 125.25 32.2 344 398.6 364 80.267

Figure 4. Frequency distribution plot for variables factors (a) OPC, (b) BFS, (c) Fly Ash, (d) Water,
Figure 4. Frequency distribution plot for variables factors (a) OPC, (b) BFS, (c) Fly Ash, (d) Water,
(e) SP, (f) CA, (g) FA, (h) Age, and (i) CS.
(e) SP, (f) CA, (g) FA, (h) Age, and (i) CS.
5. Evaluation
Table of Models
1. Descriptive Using
Statistical Statistical
Analysis Metrics
of the Dataset.
The reliability and efficiency of the models can be assessed using statistical measures
Variable OPC BFS Fly Ash values
such as R-square Water SP
and root mean squareCA FA for better
error (RMSE) AgepredictionCS
of the
Unit Kg/m3 Kg/m 3
developedKg/m 3 Kg/m 3
model. However, theKg/m 3
R-square value 3
Kg/mis also called3
Kg/m the coefficient
Day of determi-
MPa
Mean 281.17 nation and
73.9 is preferred
54.19 for better6.203
181.57 model prediction.
972.92 Several
773.58modeling approaches
45.66 have
35.818
St. Dev. 104.51 been explored
86.28 64 to develop
21.36 a forecasting
5.973 model for the final
77.75 concrete 63.17
80.18 strengths, owing
16.706 to
Variance 10,921.74 7444.08 4095.55
advancement 456.06
in machine 35.683
learning. 6045.66
We tried to 6428.1 and GT
construct ANNs 3990.44
regression279.08
models
Q1 192 0 0 164.9 0 932 730.3 7 23.695
Median 272.9 22 0 185 6.35 968 779.51 28 34.443
Q3 350 143 118.27 192 10.16 1029.4 824.25 56 46.209
Min 102 0 0 121.75 0 801 594 1 2.332
Max 540 359.4 200.1 247 32.2 1145 992.6 365 82.599
Range 438 359.4 200.1 125.25 32.2 344 398.6 364 80.267

5. Evaluation of Models Using Statistical Metrics


The reliability and efficiency of the models can be assessed using statistical measures
such as R-square values and root mean square error (RMSE) for better prediction of the
developed model. However, the R-square value is also called the coefficient of determi-
nation and is preferred for better model prediction. Several modeling approaches have
been explored to develop a forecasting model for the final concrete strengths, owing to
advancement in machine learning. We tried to construct ANNs and GT regression models
to promote the use of sustainable waste materials, i.e., slag and Fly Ash in concrete pro-
Buildings 2022, 12, 914 9 of 19

duction at an industrial scale. The compressive strength was then used to evaluate both
models and find out the best one forecasted among them to predict the result with minimal
or no variance. In this study, data analysis and the estimation of error indicators are used
to verify the models. These measurements can give us a lot of information about the errors
in our model. The performance of the proposed models is also assessed using the R-square
and RMSE.
The mean squared variation in between estimated and actual testing is called the
RMSE. It calculates the mean squared extent of the error. It is the standard deviation of the
anticipated error. This method gives higher weight to big deviations, such as outliers, due
to its higher square variations and lower square variations. When estimating the output
response for multiple input factors, the RMSE measures the mean standard errors of the
models. The algorithm is effective if the RMSE is minimal. The RMSE value of 0.5 signifies
that it is inadequate to anticipate the data accurately. Equation (1) is used to estimate the
RMSE of the models. These statistical matrices will help check the validity and accuracy of
the predicted models. R2 readings around 0.65 to 0.75 reflect positive outcomes, whereas
R2 values < 0.50 represent poor outcomes. R-square can be estimated using the following
Equation (2). In Equation (2), SSE stands for sum of square errors, while SSy is the overall
variation.
r
1 N
 
RMSE = ( ∑n=1 actual − predicted)2 (2)
N
SSE
R2 = 1 − (3)
SSy

6. Results and Discussions


6.1. Outcomes of DT Model
As demonstrated in Figures 5 and 6, the predicted response of concrete strength using
supervised-based machine learning modeling gave successful and accurate results. The
developed DT model showed a strong correlation with R-square values of 0.943 and 0.836
during the training and testing of the DT model, respectively. The RMSE values of 4.00
and 6.54 were achieved during the training and testing phases of the DT model. Moreover,
the data analysis shows that 82% of the overall dataset had a mean squared error (MSE)
value of 16.02 during model training. However, the remaining 18% of the dataset had
an MSE value of 42.82 in the testing phase. Similarly, the data distribution had mean
absolute deviation (MAD) values of 3.03 and 4.817, respectively, throughout the training
and testing processes. The mean absolute percent error (MAPE) values were 0.1143 and
0.1804, respectively. The statistical summary of the DT model is presented in Table 2. The
R-squared values vs. number of nodes are presented in Figure 5. The figure demonstrates
that optimal values of R2 equal to 0.98 and 0.843 were observed at 284 nodes during the
training and testing of DT model prediction of CS (MPa). The DT model developed a strong
relationship between CS (MPa) and other factors. However, the R-squared values of 0.943
and 0.836 were achieved at 90 nodes, which was quite close to the optimum R2 value at
284 nodes (reference node). The scatterplot of response fits vs. actual values using the DT
model is presented in Figure 6. The graph demonstrates a strong relationship between the
actual and predicted values with a minimal dataset error using DT modeling. The tree
diagram produced during decision tree modeling is also presented in Figure 7.

6.2. Outcomes of ANNs Model


The ANNs model is a type of supervised learning approach used in machine learning,
and its usage in concrete shows a significant correlation between predicted and real concrete
strength. The accuracy and validity of ANNs models were checked based on values of
R2 , RMSE and MAD in Table 3. The ANNs model showed R-squared values of 0.873 and
0.848 in the training and testing phase of concrete prediction. The overall dataset consists
of 1030 samples; of these, 686 samples were used for training the ANNs model, while the
Buildings 2022, 12, 914 10 of 19

Buildings 2022, 12, x FOR PEER REVIEW 10 of 19


Buildings 2022, 12, x FOR PEER REVIEW 10 of 19
remaining 344 were utilized for testing and validation purposes. Similarly, the RMSE values
of predicted CS using the ANNs model were 5.90 and 6.60 during the training and testing
stage
Table of the ANNs
2. Statistical
Statistical model.
Model The mean
Summary absolute
of Training
Training anddeviation (MAD)
Testing Dataset
Dataset values
Using of 4.549 and 5.086
DT Modeling.
Modeling.
Table
were 2.
found in theModel Summary
training of
and testing and
of the Testing
ANNs model for Using DT
CS strength. However, a
Data Set
Set N % of N Mean
considerable StDev
increase in R 2 wasMin
seen using Q1
the ANNs Median
approaches, Q3
as presented Max
in Figure 8.
Data N % of N Mean StDev Min Q1 Median Q3 Max
Training 842 81.7The predicted
35.861 and actual
16.822 values of
2.3318 CS made with
23.695 waste materials
34.084 also
46.229 showed a positive
82.599
Training 842 81.7 35.861 16.822 2.3318 23.695 34.084 46.229 82.599
Testing
Testing 188
188 18.3correlation
18.3 35.624
35.624
in both the training3.3198
16.215
16.215
and testing23.633
3.3198
phase. As a35.549
23.633
result, cross-validation
35.549 45.243
45.243
was advised
79.297
79.297
as a useful method for precisely measuring a model’s prediction performance and setting
R22
R RMSE
RMSE MSE
MSE MAD
MAD MAPE
model parameters. As a result, improved achievable accuracy for predictedMAPE
value of CS
Training
Training 0.9433 parameters 4.0032
0.9433 4.0032
was attained. Statistical 16.0254
16.0254
parameter estimations 3.0366
3.0366 0.1143 have
0.1143
using ANNs modeling
Testing
Testing 0.8362
0.8362 also been shown 6.5445
6.5445in Table 4. 42.8299
42.8299 4.8176
4.8176 0.1804
0.1804

Figure 5.
Figure 5. R-Squared
R-Squared Values
Values vs.
vs. Number
vs. Number of
of Terminal
Terminal Nodes.
Terminal Nodes.
Nodes.

Figure 6.
Figure 6. Predicted and
and Actual
Actual Values
Values of
of CS
CS During
During the
the Training
Training and
and Testing
Testing Phase.
Phase.
Figure 6. Predicted
Predicted and Actual Values of CS During the Training and Testing Phase.
Buildings 2022, 12, 914 11 of 19

Table 2. Statistical Model Summary of Training and Testing Dataset Using DT Modeling.

Data Set N % of N Mean StDev Min Q1 Median Q3 Max


Training 842 81.7 35.861 16.822 2.3318 23.695 34.084 46.229 82.599
Testing 188 18.3 35.624 16.215 3.3198 23.633 35.549 45.243 79.297
R2 RMSE MSE MAD MAPE
2, x FOR PEER REVIEW 11 of 19
Training 0.9433 4.0032 16.0254 3.0366 0.1143
Testing 0.8362 6.5445 42.8299 4.8176 0.1804

Figure 7. Decision Tree


FigureDiagram using
7. Decision DT Model
Tree Diagram forDT
using Training Dataset.
Model for Training Dataset.

6.2. Outcomes of ANNs Model


The ANNs model is a type of supervised learning approach used in machine learn-
ing, and its usage in concrete shows a significant correlation between predicted and real
concrete strength. The accuracy and validity of ANNs models were checked based on val-
ues of R2, RMSE and MAD in Table 3. The ANNs model showed R-squared values of 0.873
and 0.848 in the training and testing phase of concrete prediction. The overall dataset con-
Sum Freq 824 206

Table 4. Statistical parameter estimations using ANNs modeling.

Parameter H1_1 H1_2 H1_3 H1_4 H1_5 H1_6 H1_7 H1_8


Buildings 2022, 12, 914 12 of 19
OPC (Kg/m3) 0.01538 0.01484 0.0015 0.00453 0.00424 −0.004 0.00649 0.01068
BFS (Kg/m3) 0.01204 −0.0075 0.0050 0.01288 0.00602 −0.007 0.00073 −0.0048
F. Ash (Kg/m3) 0.01279 0.00379 −0.008 −0.009 0.0031 −0.001 −0.0067 0.02515
Table 3. Predicted Statistical Measures with Both Training and Validation Values.
Water (Kg/m3) −0.057 −0.044 −0.042 0.0653 0.0234 −0.011 0.01277 −0.1049
SP (Kg/m3) 0.03559
Measures −0.009 −0.181 −0.199 0.0622
Training 0.0189 −0.0682
Validation0.0359
CA (Kg/m3) 0.00298
R-Square −0.015 −0.0003 −0.008 0.0012
0.8730396 −0.0003 0.00169
0.8487035−0.0188
FA (Kg/m3) 0.01079
RMSE −0.001 0.0031 0.01179 −0.0068
5.9002536 0.0010 0.00171
6.6008002−0.0096
Age (Day) MSE
0.00309 −0.032 −0.0553 0.0104 34.8200
−0.0744 −0.0004 0.0889343.5700 −0.0151
Mean Abs Dev 4.5491846 5.0868193
Intercept −7.684 20.4865 6.341 −13.51 −2.331 3.1527 −7.1926 41.281
-Log Likelihood 2191.0386 1137.3085
Int. H1_1SSE H1_2 H1_3 H1_4 H1_5
23,881.713 H1_6 H1_714,988.274 H1_8
CS (MPa) 25.261 12.857Sum Freq 0.4490 −7.4665 −4.9352 −8.9412
824 −28.071 11.594 206 1.6980

Training Plot Validation Plot

(a)

(b)
Figure 8. Actual by predicted plot (a); Residual by predicted plot (b) in training and validation phase
Figure 8. Actual by predicted plot (a); Residual by predicted plot (b) in training and validation phase
of ANNs model.
of ANNs model.

Table 4. Statistical parameter estimations using ANNs modeling.

Parameter H1_1 H1_2 H1_3 H1_4 H1_5 H1_6 H1_7 H1_8


OPC (Kg/m3 ) 0.01538 0.01484 0.0015 0.00453 0.00424 −0.004 0.00649 0.01068
BFS (Kg/m3 ) 0.01204 −0.0075 0.0050 0.01288 0.00602 −0.007 0.00073 −0.0048
F. Ash (Kg/m3 ) 0.01279 0.00379 −0.008 −0.009 0.0031 −0.001 −0.0067 0.02515
Water (Kg/m3 ) −0.057 −0.044 −0.042 0.0653 0.0234 −0.011 0.01277 −0.1049
SP (Kg/m3 ) 0.03559 −0.009 −0.181 −0.199 0.0622 0.0189 −0.0682 0.0359
CA (Kg/m3 ) 0.00298 −0.015 −0.0003 −0.008 0.0012 −0.0003 0.00169 −0.0188
FA (Kg/m3 ) 0.01079 −0.001 0.0031 0.01179 −0.0068 0.0010 0.00171 −0.0096
Age (Day) 0.00309 −0.032 −0.0553 0.0104 −0.0744 −0.0004 0.08893 −0.0151
Intercept −7.684 20.4865 6.341 −13.51 −2.331 3.1527 −7.1926 41.281
Int. H1_1 H1_2 H1_3 H1_4 H1_5 H1_6 H1_7 H1_8
CS (MPa) 25.261 12.857 0.4490 −7.4665 −4.9352 −8.9412 −28.071 11.594 1.6980
Buildings 2022, 12, x FOR PEER REVIEW 13 of 19

Buildings 2022, 12, 914 13 of 19


6.3. Prediction Profiler
The prediction profiler can help with better understanding each parameter contribu-
tion in concreteProfiler
6.3. Prediction compressive strength. The relationships established between eight input
variables Theand output profiler
prediction response canare
helppresented as aunderstanding
with better prediction profiler in Figurecontribu-
each parameter 9. The target
optimization
tion in concrete compressive strength. The relationships established between eightside
value of CS (MPa) is 44.48, which is presented on the top left inputof the
prediction
variables profiler along
and output with combinations
response are presented asofa eight
predictionfactors. It was
profiler observed
in Figure 9. Thethat the com-
target
optimization value of CS (MPa) is 44.48, which is presented on the top
pressive strength of concrete was increased as the OPC content changed from 102 to 540left side of the predic-
tion3. profiler
kg/m along with
The optimum OPCcombinations
content was of eight factors.
achieved at It281.17
was observed that the compressive
kg/m3. Similarly, the content of
strength of concrete was increased as the OPC content changed from 102 to 540 kg/m3 . The
BFS and Fly Ash increased in the range of 0 to 3593 and 200 kg/m , respectively. The opti-
3
optimum OPC content was achieved at 281.17 kg/m . Similarly, the content of BFS and3 Fly
mum value of BFS and Fly Ash contents was achieved at 73.9 and 54.19 kg/m . On the
Ash increased in the range of 0 to 359 and 200 kg/m3 , respectively. The optimum value of
other
BFShand,
and Flythe
Asheffect of the
contents waswater factor
achieved hadand
at 73.9 a negative
54.19 kg/m influence
3 . On theon thehand,
other compressive
the
strength. The positive correlation was observed in case of curing ages ranging
effect of the water factor had a negative influence on the compressive strength. The positive between 1
and 365 dayswas
correlation andobserved
the CS in ofcase
the of
concrete. A slight
curing ages ranging increase
betweenof CS (MPa)
1 and 365 dayswas
andobserved
the CS in
case
of of
theSP, CA, and
concrete. FA. increase of CS (MPa) was observed in case of SP, CA, and FA.
A slight

Figure 9. Prediction
Figure 9. Predictionprofiler
profiler of
of CS vs.
vs. eight
eightfactors.
factors.

6.4. Interaction Profiler


6.4. Interaction Profiler
The interaction profiler plot of eight factors with corresponding compressive strength
The interaction profiler plot of eight factors with corresponding compressive
(CS) is presented in Figure 10. The minimum and maximum ranges of each predicted
strength
parameter(CS)are is presented
representedinin Figure
the form 10. The
of redminimum
and blueand trendmaximum ranges of each
lines, respectively. The pre-
dicted
interaction profiler can be used as a visual graphical aid to describe the effects of different The
parameter are represented in the form of red and blue trend lines, respectively.
interaction
factors onprofiler can be used
the CS properties as concrete.
of the a visual graphical
The effectsaid to describe
of various the effects
parameters on theofOPC
different
and CS
factors onarethedescribed
CS propertiesin the offirst row
the of FigureThe
concrete. 10. effects
The explanation
of various ofparameters
the interaction onplot
the OPC
is presented from left to right and so on. In the first row, a positive
and CS are described in the first row of Figure 10. The explanation of the interaction plot correlation was seen
between CSfrom
is presented and the
leftOPC content,
to right andwhichso on.increased from row,
In the first 102 toa540 kg/m3 correlation
positive along with other
was seen
factors, as they showed a slight increase in their values except for the water variable, which
between CS and the OPC content, which increased from 102 to 540 kg/m along with other 3
had an indirect effect on CS and OPC. In the 2nd row, the interaction plot among CS and
factors, as they showed a slight increase in their values except for the water variable,
BFS with other variables is presented and showed a positive relationship between increased
which
CS andhadBFS an content
indirectranging
effect on CS and
between OPC.
0 and 359Inkg/mthe 2nd row, thethe
3 . However, interaction plot among
CS of BFS-based
CSconcrete
and BFShad with other variables is presented and showed a positive
decreased strength with an increase in water content from 121 to 247 kg/m relationship between
3.

increased
The 3rd CS rowand BFS an
showed content ranging
interaction plotbetween
of CS and0 BFS andin 359 kg/m . However,
combination3
with other thefactors.
CS of BFS-
It showed
based concrete a significant
had decreasedincrease in the presence
strength with anofincrease
OPC, BFS inand
waterdays factorsfrom
content for the CS.to 247
121
However, a slight increase of CS was attained in case of SP,
kg/m . The 3rd row showed an interaction plot of CS and BFS in combination with other
3 CA, and FA factors. The change
in water
factors. content had
It showed a negativeincrease
a significant effect on in thethe
combined
presence CSof and BFS BFS
OPC, interaction
and daysplot.factors
The for
4th row of the plot describes the direct effect of water on CS and other factors. A significant
the CS. However, a slight increase of CS was attained in case of SP, CA, and FA factors.
increasing trend is observed in the 5th row, which shows a direct relation between three
Thefactors:
change in water content had a negative effect on the combined CS and BFS interaction
OPC, BFS and curing days. However, there is a declining trend of CS and SP with
plot.
an The 4th in
increase row of the
water plot describes
ranging between 121 theanddirect effect3 .of
247 kg/m water
The on CS and
incorporation of other
CA and factors.
FA A
significant
had a strong increasing trend
direct effect onisOPC,
observed
BFS and in days
the 5th row,but
factors, which shows
a slight increasea direct relation
was seen in be-
tween
case three
of Flyfactors:
Ash, CAOPC,and FA BFS and curing
factors. Similarly,days.
the However,
last row of there is a declining
the interaction trend of CS
plot showed
and SPthe
that with an increase
increased curingin water
days ranging
ranging from between
1 to 365 days 121significantly
and 247 kg/m 3. The incorporation
improved the CS of
the concrete,
of CA and FA had but water
a strong has direct
an inverse
effectrelation
on OPC, to CS,BFS which
andshows differentbut
days factors, varying trends.
a slight increase
was seen in case of Fly Ash, CA and FA factors. Similarly, the last row of the interaction
plot showed that the increased curing days ranging from 1 to 365 days significantly im-
proved the CS of the concrete, but water has an inverse relation to CS, which shows dif-
ferent varying trends.
Buildings 2022,12,
Buildings2022, 12,914
x FOR PEER REVIEW 14 of 19
14 19

Figure10.
Figure 10. Interaction
Interaction Profiler
Profilerof
ofCS
CSvs.
vs. Eight
Eight Different
DifferentFactors.
Factors.

6.5.
6.5. Relative
Relative Variable
Variable Importance
Importance
In
In this study, weused
this study, we usedeight variables
eight such
variables as OPC,
such BFS,BFS,
as OPC, Fly Ash, water,
Fly Ash, SP, days,
water, SP, sand,
days,
and coarse aggregates to carry out sensitivity analysis. The effects of each input
sand, and coarse aggregates to carry out sensitivity analysis. The effects of each input variable
var-
are described
iable in Figure
are described 11. However,
in Figure the Fly
11. However, theAsh
Fly and
Ash fine
and aggregates havehave
fine aggregates the minimal
the min-
sensitivity effect effect
imal sensitivity on theon development of models
the development for concrete
of models compressive
for concrete strength.
compressive OPC
strength.
(kg/m 3 ) and3)age
OPC (kg/m and(day)
age had thehad
(day) significant sensitivesensitive
the significant parameters for compressive
parameters strength.
for compressive
Both Fly Ash and FA played a small role in the creation of both models.
strength. Both Fly Ash and FA played a small role in the creation of both models.
6.6. Comparative Analysis of Statistical Metrics and K-Fold
To analyze the biased and variances reduction for the testing dataset, a conventional
method of K-fold based on cross-validation is used. The statistical measures such as
coefficient of determination (R2 ), root mean square error (RMSE), mean square error (MSE)
and mean absolute deviation (MAD) were evaluated and compared with both models, as
presented in Figure 12.
In contrast, the output prediction of both machine learning approaches displays small
swings in their statistical results. In comparison to ANNs and DT, the DT model has fewer
errors and a higher R2 value. This may be due to the increasing number of nodes during
the DT modeling prediction. However, the validation values of both models displayed
close results when compared to each other. The R-squared values of 0.943 and 0.836 were
achieved during training and testing of the DT model. In case of the ANNs model, R2 is
Buildings 2022, 12, 914 15 of 19

equal to 0.873 and 0.848, respectively, in the training and validation of the dataset. Each
Buildings 2022, 12, x FOR PEER REVIEW 15 of 19
single model showed fewer validation errors. The RMSE, MSE, MAD and MAPE values for
the DT model are 6.54 MPa, 42.82 MPa, 4.82 and 0.1804 MPa, respectively, as per the testing
or validation results. However, the RMSE, MSE, and MAD values for the ANNs prediction
model are 6.60 MPa, 43.57 MPa, and 5.086, respectively, as per the validation results. The
statistical analysis of both models DT and ANNs is also described in Tables 2 and 3. When
compared against each other, the DT and ANN model both showed less variance in their
Buildings 2022, 12, x FOR PEER REVIEW 15 of 19
predicted output. The R2 value is directly correlated to this check; the lower the error rates,
the better the R2 value of the predicted model.

Figure 11. Relative Variable Importance for Different Factors.

6.6. Comparative Analysis of Statistical Metrics and K-Fold


To analyze the biased and variances reduction for the testing dataset, a conventional
method of K-fold based on cross-validation is used. The statistical measures such as coef-
ficient of determination (R2), root mean square error (RMSE), mean square error (MSE)
Figure
and 11. Relative
mean absoluteVariable Importance
deviation (MAD)forfor Different
were Factors.
evaluated and compared with both models, as
Figure 11. Relative Variable Importance Different Factors.
presented in Figure 12.
6.6. Comparative Analysis of Statistical Metrics and K-Fold
50 To analyze the biased and variances reduction for the testing dataset, a conventional
method of K-fold basedR2on cross-validation
RMSE is MSE
used. The statistical
MAD measures such as coef-
ficient of determination (R2), root mean square error (RMSE), mean square error (MSE)
40 and mean absolute deviation (MAD) were evaluated and compared with both models, as
Statistical Values

presented in Figure 12.


30
50
20 R2 RMSE MSE MAD
40
Statistical Values

10
30
0
20 Training Testing Training Testing

DT (Model) ANNs Model


10
Figure 12. Comparative Analysis of Prediction Using DT and ANNs models.
Figure 12. Comparative Analysis of Prediction Using DT and ANNs models.
0
In contrast, this,
Training
To conclude the Testing
output prediction
this research of boththat
Training
demonstrates machine
multiplelearning
Testing
machine approaches displays
learning algorithms
small swings in their statistical results. In comparison to ANNs and
can be used to anticipate outcomes. It provides better knowledge to engineers about DT, the DT model hasa
fewer DT (Model) ANNs Model
better errors
choice and a higher R value.and Thisregressors
may be due to the increasing number
and of nodes
2
of model parameters to execute the algorithm produce
during
precise the DT modeling
projected results. prediction.
The machine However,
learning the validation
techniques valuesinof
utilized both
this models
study dis-
produce
played
Figure close
12. results
Comparative when compared
Analysis of to
Predictioneach
Usingother.
DT The
and R-squared
ANNs models. values
a significant correlation between the desired and projected outcomes. The significance of of 0.943 and
0.836
using were
models achieved during
with higher traininginand
precision civiltesting of the isDT
engineering model. In case
demonstrated of theusage
by their ANNsin
In R
model, contrast,
2 is equal the
to output
0.873 and prediction of both machine
0.848, respectively, in the learning approaches
training and displays
validation of the
small swings
dataset. Each in theirmodel
single statistical
showedresults. In comparison
fewer validation to ANNs
errors. Theand DT, the
RMSE, DT model
MSE, MAD and has
fewer errors
MAPE valuesand for athe
higher R2 value.
DT model are This
6.54 may
MPa,be due MPa,
42.82 to the4.82
increasing number
and 0.1804 MPa,ofrespec-
nodes
duringas
tively, theperDTthemodeling
testing orprediction.
validation However, the validation
results. However, values
the RMSE, of and
MSE, bothMAD
models dis-
values
Buildings 2022, 12, 914 16 of 19

predicting concrete parameters, which is a time-consuming process. Finally, the precise


formulations of proposed models will help to promote the utilization of waste materials
such as BFS and FA instead of dumping as industrial waste for future construction works.

7. Limitations and Future Recommendations


This research studied two types of machine learning techniques, ANNs and decision
tree, for predicting the compressive strength of high-strength concrete. However, there
are some certain limitations. The data provided are limited to the estimated model’s
performance. The database used in this research had a maximum size of 1030 records.
Moreover, this study was limited to predicting compressive strength and ignored the
flexural strength and corrosive performance of the concrete under elevated temperature
conditions. Certainly, an appropriate dataset and validation are required for the modeling
of engineering parameters in various civil engineering applications. This work used a
diverse variety of datasets comprising eight independent factors; however, the dataset
and input factors can be extended to further improve the performance and accuracy of
the models.
Moreover, additional new factors such as extreme temperature, moisture, and other
weather conditions should be included as input parameters to assess the model’s efficiency
for predicting desirable results. As climatic factors have a major effect on concrete charac-
teristics, other ML techniques such as genetic algorithms, ANFIS and k-nearest neighbors
should be used to investigate their effects under various scenarios.

8. Conclusions
Concrete producers all around world employ mixed design methodologies to produce
a mix design that meets the needs of their clients. It is vital for field engineers to understand
how a mix proportion was developed to properly modify the mix design based on field
constraints, comprehend fresh properties, and achieve standards of quality. However, iden-
tifying the process used to develop a mix design can be challenging because most design
methodologies offer identical mix design proportions for specific required features. This
research used two machine learning approaches, ANNs and DT, to tackle the classification
challenge by studying eight important parameters. The following conclusions are derived
from our study:
• The outcomes of individual ANNs and DT models showed a significant correlation
between the predicted and actual values of CS with R2 values of 0.848 and 0.836,
respectively, in testing of the model for the CS prediction. However, the R2 values of
0.873 and 0.943 were attained during the training phase of both models.
• The increased R2 values and reduced RMSE and MAD showed the higher precision
and accuracy of both predictive models. These statistical measures showed satisfactory
outcomes for both models, i.e., ANNs and decision tree.
• Sensitivity analysis demonstrated that OPC (kg/m3 ), Age (curing) and water content
are the key factors in the creation of a model for the CS of concrete. However, other
factors such as Fly Ash and fine aggregates (FA) had the least effect on the CS in the
created model.
• Decision tree and ANNs are types of supervised learning techniques which produced
significant correlations between the predicted and observed values. The DT proved
effective, according to the K-fold cross-validation technique, which was used to verify
the prediction performance.
• The precise formulations and models will help to promote the utilization of waste
materials such as SFG and Fly Ash instead of dumping as industrial waste for future
construction works. This research provides a long-term sustainability by reducing
energy use, disposal waste, and carbon emissions.
Future work will comprise training and validating ML models employing a detailed
experimental setup. In addition, the author plans to test more machine learning models to
find the most accurate and effective model for identifying mixed design processes.
Buildings 2022, 12, 914 17 of 19

Author Contributions: Conceptualization, S.A.R.S. and M.A.; methodology, S.A.R.S. and M.K.A.;
software, M.K.A. and H.M.S.E.; validation, M.A., O.B. and Y.B.; formal analysis, S.A.R.S. and M.K.A.;
investigation, O.B. and H.M.S.E.; writing—original draft preparation, S.A.R.S. and M.K.A.; writing—
review and editing, M.A. and Y.B. All authors have read and agreed to the published version of the
manuscript.
Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: Data will be available on suitable demand.
Acknowledgments: This research has been assisted by the data support of I-Cheng Yeh (Taiwan)
along with Dua, D. and Graff, C., UCI Machine Learning Repository, University of California, USA.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Xiao, H.; Duan, Z.; Zhou, Y.; Zhang, N.; Shan, Y.; Lin, X.; Liu, G. CO2 emission patterns in shrinking and growing cities: A case
study of Northeast China and the Yangtze River Delta. Appl. Energy 2019, 251, 113384. [CrossRef]
2. Yan, H.; Shen, Q.; Fan, L.C.; Wang, Y.; Zhang, L. Greenhouse gas emissions in building construction: A case study of One Peking
in Hong Kong. Build. Environ. 2010, 45, 949–955. [CrossRef]
3. Benhelal, E.; Zahedi, G.; Shamsaei, E.; Bahadori, A. Global strategies and potentials to curb CO2 emissions in cement industry.
J. Clean. Prod. 2013, 51, 142–161. [CrossRef]
4. Behnood, A.; Golafshani, E.M. Predicting the compressive strength of silica fume concrete using hybrid artificial neural network
with multi-objective grey wolves. J. Clean. Prod. 2018, 202, 54–64. [CrossRef]
5. Cheng, A.; Huang, R.; Wu, J.-K.; Chen, C.-H. Influence of GGBS on durability and corrosion behavior of reinforced concrete.
Mater. Chem. Phys. 2005, 93, 404–411. [CrossRef]
6. Özbay, E.; Erdemir, M.; Durmuş, H.I. Utilization and efficiency of ground granulated blast furnace slag on concrete properties—A
review. Constr. Build. Mater. 2016, 105, 423–434. [CrossRef]
7. Song, H.-W.; Saraswathy, V. Studies on the corrosion resistance of reinforced steel in concrete with ground granulated blast-furnace
slag—An overview. J. Hazard. Mater. 2006, 138, 226–233. [CrossRef] [PubMed]
8. Mai, H.-V.T.; Nguyen, T.-A.; Ly, H.-B.; Tran, V.Q. Investigation of ANN Model Containing One Hidden Layer for Predicting
Compressive Strength of Concrete with Blast-Furnace Slag and Fly Ash. Adv. Mater. Sci. Eng. 2021, 2021, 5540853. [CrossRef]
9. Oner, A.; Akyuz, S. An experimental study on optimum usage of GGBS for the compressive strength of concrete.
Cem. Concr. Compos. 2007, 29, 505–514. [CrossRef]
10. Shariq, M.; Prasad, J.; Masood, A. Effect of GGBFS on time dependent compressive strength of concrete. Constr. Build. Mater.
2010, 24, 1469–1478. [CrossRef]
11. Siddique, R.; Kaur, D. Properties of concrete containing ground granulated blast furnace slag (GGBFS) at elevated temperatures.
J. Adv. Res. 2012, 3, 45–51. [CrossRef]
12. Tüfekçi, M.M.; Çakır, Ö. An Investigation on Mechanical and Physical Properties of Recycled Coarse Aggregate (RCA) Concrete
with GGBFS. Int. J. Civ. Eng. 2017, 15, 549–563. [CrossRef]
13. Majhi, R.K.; Nayak, A.N.; Mukharjee, B.B. Development of sustainable concrete using recycled coarse aggregate and ground
granulated blast furnace slag. Constr. Build. Mater. 2018, 159, 417–430. [CrossRef]
14. Gehlot, T.; Sankhla, S.S.; Parihar, S. Modelling compressive strength, flexural strength and chloride ion permeability of high
strength concrete incorporating metakaolin and fly ash. Mater. Today Proc. 2021, in press. [CrossRef]
15. Li, Q.; Zhang, Q. Experimental study on the compressive strength and shrinkage of concrete containing fly ash and ground
granulated blast-furnace slag. Struct. Concr. 2019, 20, 1551–1560. [CrossRef]
16. Dao, D.V.; Ly, H.-B.; Vu, H.-L.T.; Le, T.-T.; Pham, B.T. Investigation and optimization of the C-ANN structure in predicting the
compressive strength of foamed concrete. Materials 2020, 13, 1072. [CrossRef] [PubMed]
17. Pham, B.T.; Nguyen, M.D.; Ly, H.-B.; Pham, T.A.; Hoang, V.; Van Le, H.; Le, T.-T.; Nguyen, H.Q.; Bui, G.L. Development of
Artificial Neural Networks for Prediction of Compression Coefficient of Soft Soil. In CIGOS 2019, Innovation for Sustainable
Infrastructure; Ha-Minh, C., Dao, D.V., Benboudjema, F., Derrible, S., Huynh, D.V.K., Tang, A.M., Eds.; Springer: Singapore, 2020;
Volume 54, pp. 1167–1172.
18. Pham, B.T.; Nguyen-Thoi, T.; Ly, H.-B.; Nguyen, M.D.; Al-Ansari, N.; Tran, V.-Q.; Le, T.-T. Extreme Learning Machine Based
Prediction of Soil Shear Strength: A Sensitivity Analysis Using Monte Carlo Simulations and Feature Backward Elimination.
Sustainability 2020, 12, 2339. [CrossRef]
19. Ly, H.-B.; Pham, B.T.; Le, L.M.; Le, T.-T.; Le, V.M.; Asteris, P.G. Estimation of axial load-carrying capacity of concrete-filled steel
tubes using surrogate models. Neural Comput. Appl. 2020, 33, 3437–3458. [CrossRef]
Buildings 2022, 12, 914 18 of 19

20. Ly, H.-B.; Le, T.-T.; Vu, H.-L.T.; Tran, V.Q.; Le, L.M.; Pham, B.T. Computational Hybrid Machine Learning Based Prediction of
Shear Capacity for Steel Fiber Reinforced Concrete Beams. Sustainability 2020, 12, 2709. [CrossRef]
21. Asteris, P.G.; Armaghani, D.J.; Hatzigeorgiou, G.D.; Karayannis, C.G.; Pilakoutas, K. Predicting the shear strength of reinforced
concrete beams using Artificial Neural Networks. Comput. Concr. Int. J. 2019, 24, 469–488.
22. Naderpour, H.; Poursaeidi, O.; Ahmadi, M. Shear resistance prediction of concrete beams reinforced by FRP bars using artificial
neural networks. Measurement 2018, 126, 299–308. [CrossRef]
23. Jiang, G.; Keller, J.; Bond, P.L.; Yuan, Z. Predicting concrete corrosion of sewers using artificial neural network. Water Res. 2016,
92, 52–60. [CrossRef] [PubMed]
24. Avila, C.; Shiraishi, Y.; Tsuji, Y. Crack width prediction of reinforced concrete structures by artificial neural networks. In
Proceedings of the 7th Seminar on Neural Network Applications in Electrical Engineering, Belgrade, Serbia, 23–25 September
2004.
25. Perera, R.; Barchín, M.; Arteaga, A.; De Diego, A. Prediction of the ultimate strength of reinforced concrete beams FRP-
strengthened in shear using neural networks. Compos. Part B Eng. 2010, 41, 287–298. [CrossRef]
26. Khademi, F.; Jamal, S.M.; Deshpande, N.; Londhe, S. Predicting strength of recycled aggregate concrete using Artificial Neural
Network, Adaptive Neuro-Fuzzy Inference System and Multiple Linear Regression. Int. J. Sustain. Built Environ. 2016, 5, 355–369.
[CrossRef]
27. Özcan, F.; Atiş, C.D.; Karahan, O.; Uncuoğlu, E.; Tanyildizi, H. Comparison of artificial neural network and fuzzy logic models
for prediction of long-term compressive strength of silica fume concrete. Adv. Eng. Softw. 2009, 40, 856–863. [CrossRef]
28. Kandiri, A.; Golafshani, E.M.; Behnood, A. Estimation of the compressive strength of concretes containing ground granulated
blast furnace slag using hybridized multi-objective ANN and salp swarm algorithm. Constr. Build. Mater. 2020, 248, 118676.
[CrossRef]
29. Han, I.-J.; Yuan, T.-F.; Lee, J.-Y.; Yoon, Y.-S.; Kim, J.-H. Learned Prediction of Compressive Strength of GGBFS Concrete Using
Hybrid Artificial Neural Network Models. Materials 2019, 12, 3708. [CrossRef]
30. Pitroda, D. Prediction of strength for fly ash cement concrete through soft computing approaches. Int. J. Adv. Res. Eng. Sci. Manag.
2014, 1, 1–11.
31. Feng, D.-C.; Liu, Z.-T.; Wang, X.-D.; Chen, Y.; Chang, J.-Q.; Wei, D.-F.; Jiang, Z.-M. Machine learning-based compressive strength
prediction for concrete: An adaptive boosting approach. Constr. Build. Mater. 2019, 230, 117000. [CrossRef]
32. Dao, D.V.; Ly, H.B.; Trinh, S.H.; Le, T.-T.; Pham, B.T. Artificial intelligence approaches for prediction of compressive strength of
geopolymer concrete. Materials 2019, 12, 983. [CrossRef] [PubMed]
33. Ahmadi, M.; Naderpour, H.; Kheyroddin, A. Utilization of artificial neural networks to prediction of the capacity of CCFT short
columns subject to short term axial load. Arch. Civ. Mech. Eng. 2014, 14, 510–517. [CrossRef]
34. Cachim, P.B. Using artificial neural networks for calculation of temperatures in timber under fire loading. Constr. Build. Mater.
2011, 25, 4175–4180. [CrossRef]
35. Bilim, C.; Atis, C.; Tanyildizi, H.; Karahan, O. Predicting the compressive strength of ground granulated blast furnace slag
concrete using artificial neural network. Adv. Eng. Softw. 2008, 40, 334–340. [CrossRef]
36. Chopra, P.; Sharma, R.K.; Kumar, M. Prediction of Compressive Strength of Concrete Using Artificial Neural Network and
Genetic Programming. Adv. Mater. Sci. Eng. 2016, 2016, 7648467. [CrossRef]
37. Yeh, I.-C. Modeling of strength of high-performance concrete using artificial neural networks. Cem. Concr. Res. 1998, 28, 1797–1808.
[CrossRef]
38. Foucquier, A.; Robert, S.; Suard, F.; Stéphan, L.; Jay, A. State of the art in building modelling and energy performances prediction:
A review. Renew. Sustain. Energy Rev. 2013, 23, 272–288. [CrossRef]
39. Lv, Y.; Liu, J.; Yang, T.; Zeng, D. A novel least squares support vector machine ensemble model for NOx emission prediction of a
coal-fired boiler. Energy 2013, 55, 319–329. [CrossRef]
40. Dou, J.; Yunus, A.P.; Bui, D.T.; Merghadi, A.; Sahana, M.; Zhu, Z.; Chen, C.-W.; Khosravi, K.; Yang, Y.; Pham, B.T. Assessment of
advanced random forest and decision tree algorithms for modeling rainfall-induced landslide susceptibility in the Izu-Oshima
Volcanic Island, Japan. Sci. Total Environ. 2019, 662, 332–346. [CrossRef]
41. Amershi, S.; Begel, A.; Bird, C.; DeLine, R.; Gall, H.; Kamar, E.; Nagappan, N.; Nushi, B.; Zimmermann, T. Software engineering
for machine learning: A case study. In Proceedings of the 2019 IEEE/ACM 41st International Conference on Software Engineering:
Software Engineering in Practice (ICSE-SEIP), Montreal, QC, Canada, 25–31 May 2019.
42. Chou, J.-S.; Pham, A.-D. Enhanced artificial intelligence for ensemble approach to predicting high performance concrete
compressive strength. Constr. Build. Mater. 2013, 49, 554–563. [CrossRef]
43. Galar, M.; Fernandez, A.; Barrenechea, E.; Bustince, H.; Herrera, F. A Review on Ensembles for the Class Imbalance Problem:
Bagging-, Boosting-, and Hybrid-Based Approaches. IEEE Trans. Syst. Man Cybern. Part C Appl. Rev. 2011, 42, 463–484. [CrossRef]
44. Gomes, H.M.; Barddal, J.P.; Enembreck, F.; Bifet, A. A Survey on Ensemble Learning for Data Stream Classification.
ACM Comput. Surv. 2018, 50, 1–36. [CrossRef]
45. Rahman, J.; Ahmed, K.S.; Khan, N.I.; Islam, K.; Mangalathu, S. Data-driven shear strength prediction of steel fiber reinforced
concrete beams using machine learning approach. Eng. Struct. 2021, 233, 111743. [CrossRef]
46. Güçlüer, K.; Özbeyaz, A.; Göymen, S.; Günaydın, O. A comparative investigation using machine learning methods for concrete
compressive strength estimation. Mater. Today Commun. 2021, 27, 102278. [CrossRef]
Buildings 2022, 12, 914 19 of 19

47. Getahun, M.A.; Shitote, S.M.; Gariy, Z.C.A. Artificial neural network based modelling approach for strength prediction of concrete
incorporating agricultural and construction wastes. Constr. Build. Mater. 2018, 190, 517–525. [CrossRef]
48. Ling, H.; Qian, C.X.; Kang, W.C.; Liang, C.Y.; Chen, H.C. Machine and K-Fold cross validation to predict compressive strength of
concrete in marine environment. Constr. Build. Mater. 2019, 206, 355–363. [CrossRef]
49. Yaseen, Z.M.; Deo, R.C.; Hilal, A.; Abd, A.M.; Bueno, L.C.; Salcedo-Sanz, S.; Nehdi, M.L. Predicting compressive strength of
lightweight foamed concrete using extreme learning machine model. Adv. Eng. Softw. 2018, 115, 112–125. [CrossRef]
50. Taffese, W.Z.; Sistonen, E. Machine learning for durability and service-life assessment of reinforced concrete structures: Recent
advances and future directions. Autom. Constr. 2017, 77, 1–14. [CrossRef]
51. Yokoyama, S.; Matsumoto, T. Development of an Automatic Detector of Cracks in Concrete Using Machine Learning. Procedia Eng.
2017, 171, 1250–1255. [CrossRef]
52. Ben Chaabene, W.; Flah, M.; Nehdi, M.L. Machine learning prediction of mechanical properties of concrete: Critical review.
Constr. Build. Mater. 2020, 260, 119889. [CrossRef]
53. Su, M.; Zhong, Q.; Peng, H.; Li, S. Selected machine learning approaches for predicting the interfacial bond strength between
FRPs and concrete. Constr. Build. Mater. 2020, 270, 121456. [CrossRef]
54. Park, Y.-S.; Lek, S. Artificial Neural Networks: Multilayer Perceptron for Ecological Modeling. In Developments in Environmental
Modelling; Elsevier: Amsterdam, The Netherlands, 2016; pp. 123–140.
55. BKA, M.A.R.; Ngamkhanong, C.; Wu, Y.; Kaewunruen, S. Recycled aggregates concrete compressive strength prediction using
artificial neural networks (ANNs). Infrastructures 2021, 6, 17.
56. Xiao, F.; Amirkhanian, S.N. Artificial Neural Network Approach to Estimating Stiffness Behavior of Rubberized Asphalt Concrete
Containing Reclaimed Asphalt Pavement. J. Transp. Eng. 2009, 135, 580–589. [CrossRef]
57. Yeh, I.-C. Analysis of Strength of Concrete Using Design of Experiments and Neural Networks. J. Mater. Civ. Eng. 2006, 18,
597–604. [CrossRef]
58. Yeh, I.-C. Modeling Concrete Strength with Augment-Neuron Networks. J. Mater. Civ. Eng. 1998, 10, 263–268. [CrossRef]
59. Yeh, I.-C. Design of High-Performance Concrete Mixture Using Neural Networks and Nonlinear Programming. J. Comput.
Civ. Eng. 1999, 13, 36–42. [CrossRef]
60. Yeh, I.-C. Prediction of strength of fly ash and slag concrete by the use of artificial neural networks. J. Chin. Inst. Civil
Hydraul. Eng. 2003, 15, 659–663.
61. Yeh, I.-C. A mix proportioning methodology for fly ash and slag concrete using artificial neural networks. Chung Hua J. Sci. Eng.
2003, 1, 77–84.
62. Metha, P.; Monteiro, P. Concrete: Structure, Properties and Materials; Pini: São Paulo, Brazil, 1994; p. 572.
63. Siddique, R.; Aggarwal, P.; Aggarwal, Y. Prediction of compressive strength of self-compacting concrete containing bottom ash
using artificial neural networks. Adv. Eng. Softw. 2011, 42, 780–786. [CrossRef]

You might also like