
Sustainable Cities and Society 72 (2021) 103009


Sustainable Cities and Society


journal homepage: www.elsevier.com/locate/scs

Applying machine learning in intelligent sewage treatment: A case study of chemical plant in sustainable cities

Sheng Miao a,1, Changliang Zhou a,1, Salman Ali AlQahtani b, Mubarak Alrashoud c,∗, Ahmed Ghoneim c,d, Zhihan Lv a

a School of Data Science and Software Engineering, Qingdao University, 308 Ningxia Road, Qingdao, 266071 Shandong, China
b Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
c Department of Software Engineering, College of Computer and Information Sciences, King Saud University, Riyadh 11543, Saudi Arabia
d Department of Mathematics and Computer Science, Faculty of Science, Menoufia University, Shebin El-Koom 32511, Egypt

ARTICLE INFO

Keywords:
Sustainable smart cities
Sewage treatment
Machine learning
COD prediction
Intelligent system

ABSTRACT

Sewage treatment in sustainable cities is attracting growing attention from both academic and industrial researchers. Because industrial sewage is normally highly toxic and can cause serious pollution in a city and health problems for residents, it is critical to monitor sewage treatment facilities in cities and to maintain them predictively. This paper presents an intelligent sewage treatment system based on machine learning and Internet of Things sensors that assists in managing sewage treatment in a fine chemical plant. The implemented system has operated for twenty months and has acquired multi-dimensional data such as temperatures in different treatment processes, operation parameters of devices, and real-time Chemical Oxygen Demand (COD). Since the change trend of outflow COD is highly related to operation status, this paper innovatively uses different types of temperature and water inflow data as model inputs and applies three algorithms to make predictions: Support Vector Regression (SVR), the Long Short-Term Memory (LSTM) neural network, and the Gated Recurrent Unit (GRU) neural network. The experimental results show that the GRU model performs better (MAPE = 10.18%, RMSE = 35.67, MAE = 31.16) than LSTM and SVR. This study can be extended to various sewage treatment scenarios in sustainable cities.

1. Introduction

With the development of the economy, there are more industries and more sewage discharges. Industrial sewage that is discharged freely without treatment not only damages the environment but also causes unexpected consequences. At present, many countries have industrial water pollution problems. Industrial sewage is normally highly toxic and can cause serious pollution in a city and health problems for residents, increasing morbidity and even threatening residents' lives. In short, sewage pollution has had negative impacts on the health of residents and the efficiency of industries. Besides, sustainability has become a hot research area for smart cities and modern societies (e.g. Islam & Jashimuddin, 2017; Silva, Khan, & Han, 2018), and the exploitation and utilization of water resources are significant in sustainable cities.

For the sustainable development of smart cities, the health of residents, and the prevention of further environmental deterioration, effective treatment of industrial sewage has become necessary. Sewage treatment in sustainable cities attracts more and more researchers from both academic and industrial communities, and many governments and relevant departments have begun to pay attention to sewage treatment and have invested a lot of money in building sewage treatment plants. However, most current sewage treatment plants are large-scale, and small-scale sewage treatment plants have not received enough attention. In special areas with incomplete infrastructure and few professionals, large-scale sewage treatment plants are not suitable, and the lack of professional supervision and maintenance of the sewage treatment process makes effective sewage treatment extremely difficult. With the development of sustainable smart cities, intelligent sewage treatment is highly recommended as a way to deal with sewage pollution. It can effectively supervise the operation of devices and predict performance. Moreover, intelligent sewage treatment can manage energy consumption accurately and reduce manpower, which can decrease the cost of production.

∗ Corresponding author.
E-mail addresses: smiao@qdu.edu.cn (S. Miao), zhouchangliang6688@qq.com (C. Zhou), salmanq@ksu.edu.sa (S.A. AlQahtani), malrashoud@ksu.edu.sa
(M. Alrashoud), ghoneim@ksu.edu.sa (A. Ghoneim), lvzhihan@gmail.com (Z. Lv).
1 Sheng Miao and Changliang Zhou contributed equally to this work and they are co-first authors of the article.

https://doi.org/10.1016/j.scs.2021.103009
Received 12 February 2021; Received in revised form 6 May 2021; Accepted 7 May 2021
Available online 14 May 2021
2210-6707/© 2021 Elsevier Ltd. All rights reserved.

With the rapid development of information technologies, the applications of some emerging technologies are becoming more and more extensive (Chen et al., 2018; Rahman, Rashid, Hossain, Hassanain, Alhamid, & Guizani, 2019; Rahman et al., 2018; Sangaiah et al., 2020). Internet of Things (IoT) technology is widely applied in infrastructure fields such as industry, agriculture, environment, transportation, smart home and healthcare (e.g. Alhussein et al., 2018; Alshehri & Muhammad, 2021; Amin et al., 2019; Muhammad, Hossain, & Kumar, 2021; Shorfuzzaman, Hossain, & Alhamid, 2021; Yassine et al., 2019). These technologies effectively promote the intelligent development of these fields and improve the efficiency of the industry (e.g. Fu et al., 2020; Han et al., 2020; Hu et al., 2015). Besides, deep learning models such as the Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN) have been widely used and have achieved many breakthroughs in multidisciplinary cooperation. Kong et al. (2017) propose a method for forecasting residential loads with the Long Short-Term Memory (LSTM) neural network. Guo, Lei, Li, Yan, and Li (2018) propose a method that automatically constructs health indicators based on CNN. It can be seen from the above that IoT and deep learning are playing important roles in people's daily life (Hossain & Muhammad, 2018). Applying machine learning and IoT technology to sewage treatment changes the operating modes of traditional water systems, so as to improve sewage treatment efficiency and save energy (Min et al., 2015).

In industries, the predictive maintenance of devices is necessary and affects the life and efficiency of intelligent devices: a device is maintained only when it is needed, and operators can deal with device faults before they occur (Carvalho et al., 2019). Through IoT technology, a large amount of data from the plant is collected in real time. Predictive maintenance of devices based on massive data has many advantages, such as reducing costs and maintenance times, extending device life, and ensuring the safety of operators (Peres, Rocha, Leitao, & Barata, 2018; Sezer, Romero, Guedea, Macchi, & Emmanouilidis, 2018).

The intelligent management of water resources is an indispensable part of the sustainable development of smart cities. In addition, in order to prevent further environmental deterioration, effective treatment of sewage is particularly important. Aiming at the safety, stability, energy saving, and consumption reduction of water plants, this study has established a complete production, management, and service system for sewage treatment, so as to realize the intelligent management of small-scale sewage treatment plants in sustainable cities.

In this paper, a chemical plant in China is taken as the case study, in which a "Self-Mixing Anaerobic Digester (SMAD)" and a "Baffled BioReactor (BBR)" are used to treat the sewage generated by the chemical plant. From this study, several findings are presented:

1. This paper presents an intelligent sewage treatment system based on machine learning and IoT sensors and applies it to a fine chemical plant.
2. The study changes the operating modes of traditional water systems, so as to ensure the safety and stability of water plants and reduce the cost of the plant.
3. In this paper, we innovatively use different types of temperature and water inflow data as model inputs and apply three machine learning algorithms to predict the change trend of outflow Chemical Oxygen Demand (COD) of the chemical plant. This paper provides a new idea for predicting the change trend of outflow COD.
4. According to the results, this study significantly reduces workload and working difficulty for sewage professionals. It can also be extended to various special sewage treatment scenarios, such as small-scale industries with toxic sewage in sustainable cities.

The rest of this paper is structured as follows: Section 2 mainly reviews the research status of sewage treatment and related technologies. Section 3 introduces the design idea and theoretical basis of the intelligent sewage treatment system. In Section 4, this paper presents the experimental results and analysis to prove the feasibility and effectiveness of the methodology. Finally, the future study direction and conclusion are given in Section 5.

2. Related works

One goal of city builders is to make cities smarter and more sustainable. The development of smart cities is necessary for the economic transformation of cities, lifestyle improvement of residents, environmental protection, and social management (Gu, Yang, & Liu, 2013; Hossain et al., 2018). Bibri and Krogstie (2017) give a detailed introduction to the current and future development of smart sustainable cities. Without sustainable development, cities cannot be really smart (Yigitcanlar et al., 2019). Singh et al. (2020) discuss the information technologies for sustainable smart cities.

At present, sewage treatment in smart cities has been widely used in health, transportation, energy, environmental protection, etc (Hossain, Muhammad and Alamri, 2018). Applying IoT sensor technology, administrators realize information management of the sewage treatment process (Edmondson et al., 2018; Wen-zhen, 2013). Data acquisition is the foundation of IoT technology, and sensor technology is an indispensable part of data acquisition. Through IoT sensor technology, multi-dimensional data in the objective world are obtained, such as temperature, humidity, and illumination (Liu & Zhou, 2012). By collecting the data of the sewage treatment plant in real time, the data of intelligent devices are captured in all aspects, and the method effectively reduces labor costs.

In the sewage treatment process, due to some uncertain factors and external interferences, the probability of faults increases greatly. The process may be affected by weather or toxic substances, and some sensors remain in an acidic environment for a long time and are easily corroded. Therefore, predictive maintenance of the sewage treatment process is necessary. In predictive maintenance of sewage treatment, the data-driven method is key. Combined with machine learning algorithms, data-driven predictive maintenance has been widely used in many fields (Zhang, Yang, & Wang, 2019). Zhang, Liang, Zhou, et al. (2015) utilize an optimized hybrid Support Vector Machine (SVM) model to detect motor bearing faults and classify the fault types and fault severity. Li et al. (2016) use a deep random forest algorithm combining acoustic and vibratory signals to predict gearbox faults. In Aydin and Guldamlasioglu (2017), an LSTM neural network model that processes a large quantity of sensor data is used to predict engine conditions. Data-driven predictive maintenance saves energy and ensures the stable operation of the sewage treatment process.

3. System design and methodology

Applying machine learning and IoT sensors to the sewage treatment process realizes intelligent sewage treatment, so as to improve sewage treatment efficiency and save energy. In this study, multiple machine learning algorithms are used to improve the intelligence of sewage treatment.

As shown in Fig. 1, a number of intelligent devices are placed in the fine chemical plant, and the collected data are transferred to the server. The intelligent sewage treatment system is mainly divided into two modules, namely the remote interaction module and the predictive maintenance module. In the next subsections, these two modules are introduced in detail.


Fig. 1. The architecture of intelligent sewage treatment system.

3.1. Remote interaction

The remote interaction module is divided into the PC monitoring platform and the mobile monitoring platform. The module mainly realizes real-time data collection from the intelligent devices and fault alarm notification by means of the IoT sensors and the Modbus protocol. The PC monitoring platform adopts the B/S architecture, is programmed in Java and JavaScript, and uses the MySQL relational database, which is small and fast. When device faults occur, the administrators and operators are notified in time to check and send operation instructions. The mobile monitoring platform removes the inconvenience of computers. Through the two platforms, all aspects of device data are obtained, so as to realize remote interaction with the sewage treatment process.

In industries, the Modbus protocol is an important communication protocol in control systems. This system uses a variety of intelligent devices that support the Modbus protocol, such as an electromagnetic flowmeter, COD monitor, smart temperature control sensor, pH monitor, etc. The system can obtain relevant data such as flow rate, pH, temperature, dissolved oxygen (DO), and real-time COD. The data collected at the plant are transferred to the database on the server through the Data Transfer Unit (DTU). Besides that, the Modbus protocol also supports sending operation instructions to intelligent devices in order to realize parameter correction, which is convenient for operators to manage and control the sewage treatment process.

3.2. Predictive maintenance

By predicting the change trend of outflow COD and reading sensor data (such as current, voltage, and liquid level), the predictive maintenance module realizes remaining useful life prediction of vulnerable devices and fault prediction, so that operators find faults in time and take appropriate measures. This paper applies three machine learning algorithms to make predictions, so as to realize the predictive maintenance of the sewage treatment process.

3.2.1. Support vector regression

SVM is a common data classification model in many applications of data mining. It is a generalized linear classifier that classifies data by supervised learning. It has been widely used in face recognition, text classification, bioinformatics, and other fields (e.g. Huang et al., 2018; Ramesh & Sathiaseelan, 2015). SVR is an extension of the SVM to regression problems. The core idea of the SVR algorithm is to find a separating hyperplane (hypersurface) to minimize the expected risk. In recent years, the applications of support vector regression have gradually increased. In the aspect of data prediction, the SVR algorithm has also been deeply studied by researchers (e.g. Ceperic, Ceperic, & Baric, 2013; Kazem, Sharifi, Hussain, Saberi, & Hussain, 2013).

3.2.2. Long short-term memory neural network

The LSTM neural network is a special RNN model, which is designed to deal with the long-term dependence problem of traditional RNN. LSTM can effectively maintain long-term memory. Due to its special design, LSTM is suitable for processing and predicting time series data with very long intervals and delays. It has been widely used in speech recognition, language modeling, text prediction, etc. LSTM has also been studied further for fault diagnosis and prediction (e.g. Zhang, Wang, Liu, & Bao, 2017; Zhao, Sun, & Jin, 2018).

Fig. 2. The structure of LSTM neural network model.

As shown in Fig. 2, the LSTM neural network is a gated RNN model with three gated units: the forget gate ($f_t$), the input gate ($i_t$), and the output gate ($o_t$). The model mainly depends on the memorized cell state $C$ and the current outputs $h$ to train the model. The cell state $C$ is equivalent to the path of information transmission, allowing important information to be transmitted.

The forget gate determines which unnecessary information in the cell state $C_{t-1}$ is forgotten:

$$f_t = \mathrm{Sigmoid}(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{1}$$

The input gate determines which new information is stored to the cell state $C_t$ and updates cell state $C_t$:

$$i_t = \mathrm{Sigmoid}(W_i \cdot [h_{t-1}, x_t] + b_i) \tag{2}$$

$$C_t' = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c) \tag{3}$$

$$C_t = f_t * C_{t-1} + i_t * C_t' \tag{4}$$

The output gate determines the current output $h_t$ based on the cell state $C_t$:

$$o_t = \mathrm{Sigmoid}(W_o \cdot [h_{t-1}, x_t] + b_o) \tag{5}$$

$$h_t = o_t * \tanh(C_t) \tag{6}$$
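
As an illustration of Eqs. (1)-(6), the following minimal sketch computes one LSTM step in plain Python. It is not the implementation used in this study: the helper names `affine` and `lstm_step`, the layer sizes, and the randomly initialized weights are assumptions made only for this example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Return W . v + b, where W is a list of rows and v, b are lists."""
    return [sum(w * x for w, x in zip(row, v)) + b_k for row, b_k in zip(W, b)]

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM step; W and b are dicts keyed by gate name 'f', 'i', 'c', 'o'."""
    z = h_prev + x_t                                               # [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(W["f"], z, b["f"])]          # Eq. (1)
    i_t = [sigmoid(a) for a in affine(W["i"], z, b["i"])]          # Eq. (2)
    c_cand = [math.tanh(a) for a in affine(W["c"], z, b["c"])]     # Eq. (3)
    c_t = [f * c + i * g
           for f, c, i, g in zip(f_t, c_prev, i_t, c_cand)]        # Eq. (4)
    o_t = [sigmoid(a) for a in affine(W["o"], z, b["o"])]          # Eq. (5)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]             # Eq. (6)
    return h_t, c_t

# Hypothetical sizes: 5 input attributes, 3 hidden units, 4 time-steps.
random.seed(0)
n_in, n_hid = 5, 3
W = {g: [[random.uniform(-0.1, 0.1) for _ in range(n_hid + n_in)]
         for _ in range(n_hid)] for g in "fico"}
b = {g: [0.0] * n_hid for g in "fico"}
h, c = [0.0] * n_hid, [0.0] * n_hid
for _ in range(4):
    x_t = [random.random() for _ in range(n_in)]   # stand-in for normalized inputs
    h, c = lstm_step(x_t, h, c, W, b)
print(len(h), len(c))
```

Each gate sees the concatenation of the previous hidden state and the current input, so every weight matrix in this sketch has `n_hid + n_in` columns.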


The Sigmoid function (Eq. (7)) and tanh function (Eq. (8)) are commonly used activation functions in deep learning models.

$$\mathrm{Sigmoid}(z) = \frac{1}{1 + e^{-z}} \tag{7}$$

$$\tanh(x) = \frac{\sinh(x)}{\cosh(x)} = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{8}$$

Three gates are used to control the status of the LSTM block, so as to realize the discarding and updating of information and status updating. $W_f$, $W_i$, $W_c$ and $W_o$ represent the corresponding weights; each $W$ includes the inner weight of the hidden layer and the input weight, and $b$ represents the corresponding bias of each gate.

3.2.3. Gated recurrent unit neural network

The GRU neural network model is an important variant of the LSTM neural network (see Cho et al., 2014). GRU has a simpler structure than LSTM, and it is also a very popular RNN model at present. GRU performs well in data processing and prediction (e.g. Chen, Jing, Chang, & Liu, 2019; Zhao et al., 2017).

Fig. 3. The structure of GRU neural network model.

As shown in Fig. 3, in the GRU block the input gate and forget gate are integrated into an update gate, that is, the GRU block has two gated units: the update gate and the reset gate. The hidden state $h$ is used for information transmission, and the cell state $C$ is removed. Therefore, the GRU model has low computational overhead and is more suitable for large-scale data sets. The output $h_t$ of GRU is calculated as follows:

$$U_t = \mathrm{Sigmoid}(W_U \cdot [h_{t-1}, x_t]) \tag{9}$$

$$R_t = \mathrm{Sigmoid}(W_R \cdot [h_{t-1}, x_t]) \tag{10}$$

$$h_t = (1 - U_t) * h_{t-1} + U_t * \tanh(W_h \cdot [R_t * h_{t-1}, x_t]) \tag{11}$$

3.2.4. Model evaluation

The study uses the Root Mean Squared Error (RMSE) (Eq. (12)), Mean Absolute Error (MAE) (Eq. (13)), and Mean Absolute Percentage Error (MAPE) (Eq. (14)) to evaluate the quality of the models. The three evaluation indexes are commonly used in the evaluation of regression models. The lower the error, the better the model. The difference between RMSE and MAE is that RMSE is more sensitive to large deviations. The model error reduction is calculated by Eq. (15).

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y_i - Y_i)^2} \tag{12}$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - Y_i \right| \tag{13}$$

$$MAPE = \frac{100\%}{N} \sum_{i=1}^{N} \left| \frac{y_i - Y_i}{y_i} \right| \tag{14}$$

$$Error\ Reduction = -\frac{Error_{Model} - Error_{baseline}}{Error_{baseline}} * 100\% \tag{15}$$

where,
$y_i$ = actual value of the sample,
$Y_i$ = predicted value of the model.

4. Experimental analysis

The intelligent sewage treatment system has been working steadily for about 20 months, and the collected data are stored in the database. Applying machine learning in intelligent sewage treatment can change the operating modes of traditional water systems.

4.1. Data collection and display

In the remote interaction module, the data of the sewage treatment plant are collected in real time to ensure the control of intelligent devices, thereby laying the foundation for the predictive maintenance of the sewage treatment process.

Taking the outflow temperature from the outflow electromagnetic flowmeter as an example, with the queried dates from July 16, 2020 to September 13, 2020, the data display is shown in Fig. 4. It can be seen that, for the outflow temperature in the experimental stage, the lowest temperature is about 25 °C and the highest temperature is close to 50 °C.

Fig. 4. The change trend of outflow temperature in the experimental stage.


Table 1
The details of experimental data.

Attribute name                                 Unit   Min      Max      Mean     Data type
SMAD temperature                               °C     33.99    36.42    35.22    Input
Inflow temperature                             °C     19.64    30.1     22.74    Input
Outflow temperature                            °C     28.41    49.86    39.69    Input
Low-concentration inflow (sum of past 48 h)    m³     0        5.41     2.0      Input
High-concentration inflow (sum of past 48 h)   m³     0.23     6.38     1.34     Input
Outflow COD                                    mg/L   200.62   452.29   339.56   Output
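
The min-max normalization of Eq. (16) can be illustrated with the SMAD temperature bounds from Table 1, together with one possible way of arranging records into the 4 × 5 inputs described in Section 4.2.3. This is a sketch under stated assumptions, not the authors' code: the function names `min_max` and `make_windows` and the toy records are hypothetical, and the sketch assumes that "the first four time series samples" means the four records immediately preceding each target.

```python
def min_max(x, min_value, max_value):
    """Eq. (16): scale x into [0, 1] using the attribute's minimum and maximum."""
    return (x - min_value) / (max_value - min_value)

def make_windows(rows, targets, time_step=4):
    """Pair each target with the `time_step` feature rows that precede it.

    rows    -- list of feature rows (five attributes per row in this study)
    targets -- list of outflow COD values aligned with `rows`
    Assumption: the window holds the rows immediately before the target.
    """
    samples = []
    for k in range(time_step, len(rows)):
        samples.append((rows[k - time_step:k], targets[k]))
    return samples

# SMAD temperature from Table 1: min 33.99, max 36.42, mean 35.22.
print(round(min_max(35.22, 33.99, 36.42), 3))

# Toy records (hypothetical values): 6 rows x 5 attributes, already in [0, 1].
toy_rows = [[0.1 * k] * 5 for k in range(6)]
toy_cod = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
windows = make_windows(toy_rows, toy_cod)
print(len(windows), len(windows[0][0]), len(windows[0][0][0]))
```

With six records and a time-step of 4, the sketch yields two samples, each a 4 × 5 block of inputs paired with one COD target.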

Table 2
The error analysis of three models.

Model name   MAPE     RMSE    MAE     Compared with baseline model
                                      RMSE reduction   MAE reduction
SVR          14.47%   48.03   43.05   –                –
LSTM         10.92%   37.14   32.82   22.67%           23.76%
GRU          10.18%   35.67   31.16   25.73%           27.62%
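
The evaluation indexes of Eqs. (12)-(14) and the error reduction of Eq. (15) can be sketched in a few lines of plain Python. The function names are illustrative and not taken from the study. The usage lines apply Eq. (15) to the RMSE and MAE values in Table 2, with SVR as the baseline, as a check on the reported reductions.

```python
import math

def rmse(actual, predicted):
    """Eq. (12): root mean squared error."""
    n = len(actual)
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Eq. (13): mean absolute error."""
    return sum(abs(y - p) for y, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Eq. (14): mean absolute percentage error, in percent."""
    n = len(actual)
    return 100.0 * sum(abs((y - p) / y) for y, p in zip(actual, predicted)) / n

def error_reduction(error_model, error_baseline):
    """Eq. (15): percentage by which a model lowers the baseline error."""
    return -(error_model - error_baseline) / error_baseline * 100.0

# Table 2 values, with SVR as the baseline (RMSE 48.03, MAE 43.05).
for name, model_rmse, model_mae in [("LSTM", 37.14, 32.82), ("GRU", 35.67, 31.16)]:
    print(name,
          round(error_reduction(model_rmse, 48.03), 2),
          round(error_reduction(model_mae, 43.05), 2))
```

A positive value means the model's error is lower than the baseline's, which is why Eq. (15) carries a leading minus sign.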

4.2. COD prediction

By collecting data through remote interaction and combining it with some algorithms, the change of sewage quality can be accurately predicted (Yaqub, Asif, Kim, & Lee, 2020; Zhou, Li, Snowling, Baetz, Na, & Boyd, 2019). Besides, many researchers have already conducted in-depth discussions and research on COD prediction (e.g. Najafzadeh & Ghaemi, 2019; Valente, Mendonça, Pereira, & Felix, 2014; Yang, Zhang, Wang, & Gao, 2013). Given the time-sensitive characteristic of the collected data, this study uses GRU, LSTM and SVR to predict the change trend of outflow COD of the chemical plant, so as to realize predictive maintenance of the sewage treatment process.

The process of COD prediction is shown in Fig. 5. After the real-time data collected from the chemical plant are extracted from the database, data cleaning, feature construction and data normalization operations are performed, and then three machine learning models are built separately, so as to make accurate predictions of COD.

Fig. 5. The process of COD prediction.

4.2.1. Experimental data description

Through a series of sensors and the remote interaction module, the data of intelligent devices are collected, in which the water inflow is divided into two types, namely high-concentration inflow and low-concentration inflow. In this study, six time series are used for modeling: outflow COD, SMAD temperature, outflow temperature, inflow temperature, high-concentration inflow, and low-concentration inflow. The data used in modeling are real-time data for 60 days, from July 16, 2020 to September 13, 2020. During this period, the sludge activity and efficiency of COD degradation are good, and the collected data are abundant, so it can be used as the time range for predicting the change trend of outflow COD.

Temperature is an important factor affecting the effect of biological treatment, and the growth of microorganisms has high requirements on the environment. Too high or too low a temperature affects the treatment of organic compounds in sewage by microorganisms. In an appropriate temperature environment, the degree of microbial activity and the efficiency of sewage treatment increase with increasing temperature. The inflow temperature, outflow temperature, and SMAD temperature affect the activity of microorganisms and the effect of biological treatment, and in turn affect the efficiency of COD degradation. In addition, water inflow also plays a key role in COD degradation. If the water inflow is too large and exceeds the sewage treatment capacity, it also has adverse effects on the outflow quality. Therefore, these five kinds of time series data are used as the inputs for modeling in this study.

4.2.2. Data preprocessing

This study obtains data every five minutes; the total number of data is 463 964. These data are then cleaned and data features are constructed. In this study, SMAD temperature, outflow temperature, inflow temperature, high-concentration inflow, and low-concentration inflow are used as the input attributes, and the average value of each attribute within six hours is used as the input attribute value. In particular, the water inflow refers to the sum over the past 48 h. The outputs are outflow COD values. The details are shown in Table 1. These data need to be normalized to convert all experimental data to values between 0 and 1 (Eq. (16)). The ratio of training data to validation data is 8 to 2.

$$x_{normalization} = \frac{x - MinValue}{MaxValue - MinValue} \tag{16}$$

4.2.3. Modeling

In this study, the construction of the GRU model is similar to that of LSTM. Based on many experiments and the characteristics of sewage treatment, the number of hidden layers is 3 and the time-step is 4, which means that each predicted COD depends on the first four time series samples. The input is a 4 × 5 matrix, and each input $x_t = (x_{t1}, x_{t2}, x_{t3}, x_{t4}, x_{t5})$ holds the five attribute values of the corresponding time-step, which are sent to the corresponding LSTM or GRU layer. Finally, the outputs of the final network layer are fed into a conventional fully connected neural network layer, which maps the outputs of the final LSTM or GRU layer to a value $Y_i$, that is, the predicted COD at the target time. Taking LSTM as an example, the prediction framework is shown in Fig. 6.

In the construction of the SVR algorithm, the kernel function is one of the important functions. The kernel function can effectively process high-dimensional input, thus avoiding the "Curse of Dimensionality", greatly reducing the amount of calculation, making the model more powerful, and improving the performance of the algorithm. In this study, the kernel function used is the Gaussian Radial Basis Function (RBF), which is one of the most commonly used kernel functions. Because the SVR algorithm does not have the concept of time-step, this study adds value
weights to the first 4 samples related to the predicted COD in the SVR algorithm. The weights are 0.1, 0.2, 0.3, and 0.4, respectively, that is, the closer a sample is to the predicted COD, the greater its value weight and its importance.

Fig. 6. The LSTM-based prediction framework.

4.2.4. Results and discussion

Because the SVR algorithm is widely used in data prediction and performs well, SVR is the baseline model in this study. The three machine learning models are built separately. Finally, in order to make the study more convincing, the three trained models are used to predict the same set of samples; the results are shown in Table 2 and Figs. 7 and 8.

Fig. 7. The histogram of the error comparison between models.

According to the results, the prediction errors of LSTM and GRU are much lower than those of SVR in COD prediction: LSTM and GRU are more than 20% lower than SVR in RMSE and MAE, and the prediction error of GRU is more than 25% lower than that of the baseline model. The prediction results of GRU and LSTM are basically the same, with the GRU model slightly better than the LSTM model. Compared with traditional machine learning algorithms, deep learning models perform better in COD prediction.

Besides, the system also supports judging whether an intelligent device is malfunctioning based on the sensor data. For example, in the early stage of system operation, due to the simultaneous debugging of SMAD and BBR, the concentration of SMAD outflow was too high, there was not enough clear water in the homogeneous regulating tank to dilute the SMAD outflow, and the water inflow was not well controlled. As a result, the water inflow load of BBR increased too fast, outflow COD was abnormal, and the concentration change was extremely unstable. By analyzing the trend of outflow COD in BBR and reading the outflow data and the inflow data, administrators found that the sewage might contain a large amount of organic macromolecular pollutants not degraded by SMAD as well as toxic and hazardous substances, resulting in sludge poisoning in BBR. The operators were then notified in time to take appropriate measures, such as adding new sludge after emptying BBR, so as to ensure the stability of sewage treatment.

From the above analysis, it can be seen that, compared with the SVR algorithm, the GRU and LSTM neural networks can better serve as the algorithm support for the predictive maintenance module of the intelligent sewage treatment system, and GRU has the best performance of the three models. The study provides a new idea for COD prediction. By reading sensor data and analyzing the change trend of the data, administrators judge whether there are problems in the sewage treatment. Through the two methods, predictive maintenance of the sewage treatment process is realized.

5. Conclusion

A large proportion of sewage comes from industrial manufacturing and is complex and poisonous. To protect the health of citizens and the sustainability of cities, it is important to monitor and predictively maintain the sewage treatment process effectively and accurately. This paper presents a methodology that is based on machine learning and IoT sensors and applies it to a fine chemical plant. The experimental results show that the proposed intelligent sewage treatment system works efficiently under this methodology. According to the results, the proposed system shows high performance in early fault alert, which can significantly reduce workload and working difficulty for sewage professionals and ensure the stability of sewage treatment. Multiple machine learning algorithms are implemented for outflow COD prediction in the presented predictive maintenance module, and the comparison results show that GRU performs better than SVR and LSTM. This study can be extended to various special sewage treatment scenarios, such as small-scale industries with toxic sewage in sustainable cities. In future work, the study will continue to collect data to improve and optimize the algorithms, and the performance of the three models will be tested on data samples of high-concentration sewage (outflow COD ≥ 500 mg/L). The impact of sludge activity, dissolved oxygen, seasonal
changes and other factors on COD degradation will also be considered to obtain more accurate COD prediction, so as to ensure the long-term stability of the intelligent sewage treatment in sustainable cities.

Fig. 8. Models prediction results.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors extend their appreciation to the Deanship of Scientific Research at King Saud University, Riyadh, Saudi Arabia for funding this work through the research group project no. RG-1440-135.

References

Alhussein, M., et al. (2018). Cognitive IoT-cloud integration for smart healthcare: Case study for epileptic seizure detection and monitoring. Mobile Networks and Applications, 23, 1624–1635.
Alshehri, F., & Muhammad, G. (2021). A comprehensive survey of the internet of things (IoT) and AI-based smart healthcare. IEEE Access, 9, 3660–3678.
Amin, S. U., et al. (2019). Deep learning for EEG motor imagery classification based on multi-layer CNNs feature fusion. Future Generation Computer Systems, 101, 542–554.
Aydin, O., & Guldamlasioglu, S. (2017). Using LSTM networks to predict engine condition on large scale data processing framework. In 2017 4th international conference on electrical and electronic engineering (ICEEE) (pp. 281–285). IEEE.
Bibri, S. E., & Krogstie, J. (2017). Smart sustainable cities of the future: An extensive interdisciplinary literature review. Sustainable Cities and Society, 31, 183–212.
Carvalho, T. P., Soares, F. A., Vita, R., Francisco, R. d. P., Basto, J. P., & Alcalá, S. G. (2019). A systematic literature review of machine learning methods applied to predictive maintenance. Computers & Industrial Engineering, 137, Article 106024.

7
S. Miao et al. Sustainable Cities and Society 72 (2021) 103009

Ceperic, E., Ceperic, V., & Baric, A. (2013). A strategy for short-term load forecasting Peres, R. S., Rocha, A. D., Leitao, P., & Barata, J. (2018). IDARTS–Towards intelligent
by support vector regression machines. IEEE Transactions on Power Systems, 28(4), data analysis and real-time supervision for industry 4.0. Computers in Industry, 101,
4356–4364. 138–146.
Chen, J., Jing, H., Chang, Y., & Liu, Q. (2019). Gated recurrent unit based recurrent Rahman, M. A., Rashid, M. M., Hossain, M. S., Hassanain, E., Alhamid, M. F., &
neural network for remaining useful life prediction of nonlinear deterioration Guizani, M. (2019). Blockchain and IoT-based cognitive edge framework for sharing
process. Reliability Engineering & System Safety, 185, 372–382. economy services in a smart city. IEEE Access, 7, 18611–18621.
Chen, M., et al. (2018). Urban healthcare big data system based on crowdsourced and Rahman, M. A., et al. (2018). Semantic multimedia fog computing and IoT environment:
cloud-based air quality indicators. IEEE Communications Magazine, 56(11), 14–20. Sustainability perspective. IEEE Communications Magazine, 56(5), 80–87.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., Ramesh, B., & Sathiaseelan, J. (2015). An advanced multi class instance selection
et al. (2014). Learning phrase representations using RNN encoder-decoder for based support vector machine for text classification. Procedia Computer Science, 57,
statistical machine translation. arXiv preprint arXiv:1406.1078. 1124–1130.
Edmondson, V., Cerny, M., Lim, M., Gledson, B., Lockley, S., & Woodward, J. (2018). Sangaiah, A. K., et al. (2020). Energy-aware green adversary model for cyberphysical
A smart sewer asset information model to enable an ‘Internet of Things’ for security in industrial system. IEEE Transactions on Industrial Informatics, 16(5),
operational wastewater management. Automation in Construction, 91, 193–205. 3322–3329.
Fu, H., Manogaran, G., Wu, K., Cao, M., Jiang, S., & Yang, A. (2020). Intelli- Sezer, E., Romero, D., Guedea, F., Macchi, M., & Emmanouilidis, C. (2018). An industry
gent decision-making of online shopping behavior based on internet of things. 4.0-enabled low cost predictive maintenance approach for smes. In 2018 IEEE
International Journal of Information Management, 50, 515–525. international conference on engineering, technology and innovation (ICE/ITMC) (pp.
Gu, S., Yang, J., & Liu, J. (2013). Problems in the development of smart city in China 1–8). IEEE.
and their solution. China Soft Science, 1, 6–12. Shorfuzzaman, M., Hossain, M. S., & Alhamid, M. F. (2021). Towards the sustainable
Guo, L., Lei, Y., Li, N., Yan, T., & Li, N. (2018). Machinery health indicator construction development of smart cities through mass video surveillance: A response to the
based on convolutional neural networks considering trend burr. Neurocomputing, COVID-19 pandemic. Sustainable Cities and Society, 64, Article 102582.
292, 142–150. Silva, B. N., Khan, M., & Han, K. (2018). Towards sustainable smart cities: A review of
Han, Y., Han, Z., Wu, J., Yu, Y., Gao, S., Hua, D., et al. (2020). Artificial intelligence trends, architectures, components, and open challenges in smart cities. Sustainable
recommendation system of cancer rehabilitation scheme based on IoT technology. Cities and Society, 38, 697–713.
IEEE Access, 8, 44924–44935. Singh, S., Sharma, P. K., Yoon, B., Shojafar, M., Cho, G. H., & Ra, I.-H. (2020).
Hossain, M. S., & Muhammad, G. (2018). Emotion-aware connected healthcare big data Convergence of blockchain and artificial intelligence in IoT network for the
towards 5G. IEEE Internet of Things Journal, 5(4), 2399–2406. sustainable smart city. Sustainable Cities and Society, 63, Article 102364.
Hossain, M. S., Muhammad, G., & Alamri, A. (2018). Smart healthcare monitoring: Valente, G., Mendonça, R., Pereira, J., & Felix, L. (2014). Artificial neural network
a voice pathology detection paradigm for smart cities. Multimedia Systems, 25(5), prediction of chemical oxygen demand in dairy industry effluent treated by
565–575. electrocoagulation. Separation and Purification Technology, 132, 627–633.
Hossain, M. S., et al. (2018). Cloud-assisted secure video transmission and sharing Wen-zhen, W. (2013). Intelligent monitoring system for wastewater treatment based on
framework for smart cities. Future Generation Computer Systems, 83, 596–606. internet of things. Control and Instruments in Chemical Industry, 2.
Hu, L., et al. (2015). Software defined healthcare networks. IEEE Wireless Yang, T., Zhang, L., Wang, A., & Gao, H. (2013). Fuzzy modeling approach to
Communications, 22(6), 67–75. predictions of chemical oxygen demand in activated sludge processes. Information
Huang, S., Cai, N., Pacheco, P. P., Narrandes, S., Wang, Y., & Xu, W. (2018). Sciences, 235, 55–64.
Applications of support vector machine (SVM) learning in cancer genomics. Cancer Yaqub, M., Asif, H., Kim, S., & Lee, W. (2020). Modeling of a full-scale sewage treatment
Genomics-Proteomics, 15(1), 41–51. plant to predict the nutrient removal efficiency using a long short-term memory
Islam, K. N., & Jashimuddin, M. (2017). Reliability and economic analysis of moving to- (LSTM) neural network. Journal of Water Process Engineering, 37, Article 101388.
wards wastes to energy recovery based waste less sustainable society in Bangladesh: http://dx.doi.org/10.1016/j.jwpe.2020.101388.
The case of commercial capital city Chittagong. Sustainable Cities and Society, 29, Yassine, A., et al. (2019). IoT big data analytics for smart homes with fog and cloud
118–129. computing. Future Generation Computer Systems, 91, 563–573.
Kazem, A., Sharifi, E., Hussain, F. K., Saberi, M., & Hussain, O. K. (2013). Support vector Yigitcanlar, T., Kamruzzaman, M., Foth, M., Sabatini-Marques, J., da Costa, E., & Iop-
regression with chaos-based firefly algorithm for stock market price forecasting. polo, G. (2019). Can cities become smart without being sustainable? A systematic
Applied Soft Computing, 13(2), 947–958. review of the literature. Sustainable Cities and Society, 45, 348–365.
Kong, W., Dong, Z. Y., Jia, Y., Hill, D. J., Xu, Y., & Zhang, Y. (2017). Short- Zhang, X., Liang, Y., Zhou, J., et al. (2015). A novel bearing fault diagnosis model
term residential load forecasting based on LSTM recurrent neural network. IEEE integrated permutation entropy, ensemble empirical mode decomposition and
Transactions on Smart Grid, 10(1), 841–851. optimized SVM. Measurement, 69, 164–179.
Li, C., Sanchez, R.-V., Zurita, G., Cerrada, M., Cabrera, D., & Vásquez, R. E. (2016). Zhang, S., Wang, Y., Liu, M., & Bao, Z. (2017). Data-based line trip fault prediction in
Gearbox fault diagnosis based on deep random forest fusion of acoustic and power systems using LSTM networks and SVM. IEEE Access, 6, 7675–7686.
vibratory signals. Mechanical Systems and Signal Processing, 76, 283–293. Zhang, W., Yang, D., & Wang, H. (2019). Data-driven methods for predictive
Liu, Y., & Zhou, G. (2012). Key technologies and applications of internet of things. In maintenance of industrial equipment: A survey. IEEE Systems Journal, 13(3),
2012 fifth international conference on intelligent computation technology and automation 2213–2227.
(pp. 197–200). IEEE. Zhao, H., Sun, S., & Jin, B. (2018). Sequential fault diagnosis based on LSTM neural
Min, W., et al. (2015). Cross-platform multi-modal topic modeling for personal- network. IEEE Access, 6, 12929–12939.
ized inter-platform recommendation. IEEE Transactions on Multimedia, 17(10), Zhao, R., Wang, D., Yan, R., Mao, K., Shen, F., & Wang, J. (2017). Machine
1787–1801. health monitoring using local feature-based gated recurrent unit networks. IEEE
Muhammad, G., Hossain, M. S., & Kumar, N. (2021). EEG-based pathology detection for Transactions on Industrial Electronics, 65(2), 1539–1548.
home health monitoring. IEEE Journal on Selected Areas in Communications, 39(2), Zhou, P., Li, Z., Snowling, S., Baetz, B. W., Na, D., & Boyd, G. (2019). A random forest
603–610. model for inflow prediction at wastewater treatment plants. Stochastic Environmental
Najafzadeh, M., & Ghaemi, A. (2019). Prediction of the five-day biochemical oxygen Research and Risk Assessment, 33(10), 1781–1792.
demand and chemical oxygen demand in natural streams using machine learning
methods. Environmental Monitoring and Assessment, 191(6), 380.

You might also like