
Environmental Challenges 4 (2021) 100155

Contents lists available at ScienceDirect

Environmental Challenges
journal homepage: www.elsevier.com/locate/envc

PM2.5 concentration prediction during COVID-19 lockdown over Kolkata metropolitan city, India using MLR and ANN models
Biswajit Bera a, Sumana Bhattacharjee b,∗, Nairita Sengupta c, Soumik Saha d
a Department of Geography, Sidho-Kanho-Birsha University, Ranchi Road, P.O. Purulia Sainik School, 723104, India
b Department of Geography, Jogesh Chandra Chaudhuri College (University of Calcutta), 30, Prince Anwar Shah Road, Kolkata 700 033, India
c Department of Geography, Diamond Harbour Women’s University, Sarisha, 743368, India
d Department of Geography, University of Calcutta, 35, Ballygunge Circular Road, Ballygunge, Kolkata-700019

Article info

Keywords: Concentration of PM2.5; Multiple linear regression (MLR); Artificial neural network (ANN); Accuracy level

Abstract

Kolkata is the third most densely populated city of India, and it ranks among the world’s 25 most polluted cities as well as among the 10 most polluted cities in India. Relevant studies claim that, due to the imposition of lockdown during the COVID-19 pandemic, the atmospheric pollution level has been significantly reduced over the metropolitan city of Kolkata, as in other cities of the world. The main objective of this study is to predict the concentration of PM2.5 using multiple linear regression (MLR) and artificial neural network (ANN) models and to compare the accuracy level of the two models. The PM2.5 concentration data have been obtained from the State Pollution Control Board, Govt. of West Bengal, and daily meteorological data have been collected from the world weather website. The results show that the non-linear artificial neural network model is more suitable than the multiple linear regression model due to its higher precision and accuracy (with respect to RMSE, MAE and R2). In this research the artificial neural network (ANN) model exhibited higher accuracy during the training and testing phases (root mean square error (RMSE), mean absolute error (MAE) and R2 are 3.74, 1.14 and 0.91 respectively in the training phase and 2.55, 4.32 and 0.69 respectively in the testing phase). This model (ANN) can be applied to predict the concentration of PM2.5 during the execution of an urban air quality management plan.

1. Introduction

The rapid transmission and fatal aftermath of the novel coronavirus (COVID-19) have created an unprecedented crisis across the entire world. At the end of 2019, COVID-19 began to spread its acute impact and, as a consequence, alarmed every sphere of modern human civilization (Wang et al., 2020). As the outbreak of this pandemic diffuses rapidly through physical interaction, social isolation is recommended as the safest remedial measure to arrest the infectious transmission of coronavirus (Bera et al., 2020b; Chakraborty et al., 2020, 2021). The worldwide acceptance of extended lockdown along with social distancing proves its inevitability for weakening the terminal effects of COVID-19 (Huang et al., 2020). In India, the government strictly imposed the lockdown along with social distancing regulation from 24th March 2020 to 31st May 2020 through four phases to handle the COVID-19 pandemic. Remarkably, the environmental pollution level was evidently reduced during the lockdown phase due to the halt of multi-dimensional anthropogenic actions (Dutheil et al., 2020). It has already been reported that the air quality of the major metropolitan areas of India improved during the COVID-19 lockdown as a consequence of the partial pausing of different economic sectors along with developmental projects (CPCB, 2020; Sharma et al., 2020). In this milieu, it is stated that the tremendous threat of air pollution can raise the probability of deadly cardiovascular and respiratory diseases (Pope et al., 2004). Suspended particulate matter (SPM) plays an important role in acute health disorders and environmental degradation, and a massive concentration of SPM over an area strongly influences regional climatic change (Haywood and Boucher, 2000). Microscopic SPM is capable of penetrating the human respiratory system, and it brings hazardous consequences like lethal cardiovascular and respiratory diseases (Liu et al., 2019; Sahu et al., 2019). Excessive concentration of PM2.5 in the lower atmosphere claims 3.15 million lives every year, whereas globally outdoor air pollution causes 3.3 million deaths per year (Lelieveld et al., 2015). It was registered that in 2015 around 27.1% of deaths were caused by chronic obstructive pulmonary disease (COPD), and the extreme accumulation of PM2.5 was considered a triggering factor for such disaster (Cohen et al., 2017). The fact


Corresponding author.
E-mail address: sumana.aarohi@gmail.com (S. Bhattacharjee).

https://doi.org/10.1016/j.envc.2021.100155
Received 15 April 2021; Received in revised form 16 May 2021; Accepted 21 May 2021
2667-0100/© 2021 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/)
B. Bera, S. Bhattacharjee, N. Sengupta et al. Environmental Challenges 4 (2021) 100155

must be evident that the perilous effects of PM2.5 and PM10 have the potential to accelerate the rate of human casualties (Qi et al., 2020; Stafoggia and Bellander, 2020).

In this situation, PM2.5 concentration can be predicted and simulated through different scientific models such as artificial intelligence, chemical transport, linear and nonlinear regression, time series analysis etc. (Ventura et al., 2019; Sun et al., 2013; Vlachogianni et al., 2011; Baker and Foley, 2011; Wang et al., 2012). PM2.5 has an aerodynamic diameter ≤ 2.5 μm and is regarded as a toxic component that can increase human morbidity and mortality (Walsh, 2014). A more accurate estimate of the PM2.5 concentration can be obtained by combining a few models such as the adaptive neuro fuzzy inference system (ANFIS), artificial neural network (ANN), multiple linear regression (MLR) and general regression neural network (GRNN). A few scholars claimed that the ANFIS model is more accurate than other models (for the concentration of PM2.5 over Tehran, Iran) (Mirzaei et al., 2019). Similarly, the multiple linear regression (MLR) model is also a suitable method in this perspective, as it has successfully projected the presence of NOX and PM10 in the lower atmosphere over Athens and Helsinki (Vlachogianni et al., 2011). Dust storm events in the western part of Iran and PM2.5 concentration in Sanandaj, Iran have been predicted through the MLR model considering the changing pattern of temperature over the Mediterranean Sea and the Damascus deserts, and various meteorological data, respectively (Amanollahi et al., 2015; Ausati and Amanollahi, 2016). Meanwhile, the main purpose of the artificial neural network model is to detect the linear and non-linear relationships between the independent and dependent variables, and this model has already effectively estimated and simulated PM2.5 concentration over the copper mines of India. This model has also proved effective in air quality warning systems (Fernando et al., 2012). The ANN model has also been applied to estimate the daily concentration of PM2.5 in Rio de Janeiro, Brazil (Ventura et al., 2019). Public health management depends largely on the prediction accuracy of PM2.5 concentrations. Both multiple linear regression and artificial neural network models have high precision and accuracy compared with other models for short- and long-term predictions. Presently, various types of modeling methodologies (linear and non-linear) have been developed through the rapid progress of science and technology. Simultaneously, the use of neural networks seems to be growing in studies by researchers worldwide. These classic statistical methods are now widely used for different cases and purposes, particularly in branches of environmental science such as pollution modeling and environmental modeling (Perez et al., 2000). The ANN model was used to predict 1-hr PM2.5 concentration in Santiago, Chile. In Kuopio, Finland and Jaipur, India this model was used for the prediction of maximum and averaged PM10 concentration (Chelani et al., 2002). A scientific study was done over Kolkata (India) to describe the inter-dependency of particulate matter with different climatic variables within the period 2015–2017. After that, a random forest machine learning algorithm was applied to predict the particulate matter concentration (Basu and Salui, 2021). Another study determined the long-term trend of different particulate matter (PM10 and PM2.5) over Kolkata metropolitan city on the basis of historical data and different statistical and deep learning algorithms (Nath et al., 2021). The ground-level concentration of emitted pollutants was also estimated by the Gaussian distribution model (Bandyopadhyay, 2010). Gaussian-type dispersion models are widely used by researchers, policy makers and environmental planners for the environmental impact assessment of diverse environmental projects (Bandyopadhyay, 2010). Various Gaussian models are widely used in India for the prediction of different pollutants. These models require various input parameters such as meteorology, land use and land cover, traffic etc. (Aggarwal et al., 2014). A Gaussian model has been applied over Dhaka, Bangladesh for the prediction of carbon monoxide concentration (Ferdous and Ali, 2005). Similarly, a Gaussian dispersion model has been applied over Kolkata city to predict and assess the concentration of one of the most important vehicular pollutants (CO) (Majumdar et al., 2010). Shahraiyni and Sodoudi (2016) analysed 36 different research works carried out in different parts of the world on particulate matter forecasting models and concentration analysis. Around 50 percent of these studies used the artificial neural network (ANN) model with multilayer perceptron and feed-forward back-propagation network topologies, whereas around 30% used the multiple linear regression model to predict PM10 concentration in urban areas. The remaining studies used different machine learning algorithms (Dutta and Jinsart, 2020). MLR and ARIMA have been successfully applied over Delhi and Hong Kong to predict respirable suspended particulate matter (Goyal et al., 2006). The ANN model was successfully applied over Delhi to predict NO2 dispersion (Nagendra and Khare, 2006). The multilayer perceptron class of ANN model has been applied successfully to predict the concentration of toxic metals and PM10 in Jaipur city (Chelani et al., 2002).

In the present study, the multiple linear regression model and the artificial neural network model have been used for simulating and predicting the PM2.5 concentration as the dependent variable. Meteorological data such as maximum temperature (°C), minimum temperature (°C), relative humidity (RH), air pressure (AP), wind speed (WS) etc. and pollutant factors (CO, NO2, O3, PM10, SO2) have been considered as independent variables to improve the results and accuracy of these models. A significant fact is that the accumulation of PM10 and PM2.5 was curtailed over 22 cities of India during the lockdown in 2020 compared with the year 2017 (Sharma et al., 2020). Delhi (the capital of India) witnessed a surprising reduction of particulate matter throughout the quarantine period (Rodríguez-Urrego and Rodríguez-Urrego, 2020). The lockdown period thus improved the clarity of the entire environment and intensified holistic environmental restoration (Gautam, 2020). Kolkata, the economic growth pole of eastern India, is listed among the 10 most polluted cities of India and the 25 most polluted cities of the world (WHO, 2011). This metropolitan city has been severely affected by COVID-19, as almost 1261 people had lost their lives to coronavirus as of 29th August 2020 (Health and Family Welfare Department, Govt. of West Bengal, 2020). Previous research works have shown that if the concentration of lethal pollutants increases by 10 μg m−3 in the troposphere, it may markedly raise the daily proportion of symptomatic novel coronavirus positive cases (Mehmood et al., 2020). The concentration of PM2.5 over Kolkata metropolitan city notably dwindled by about 17.5% in 2020 compared with the preceding years due to the closure of transport movement and economic activities throughout the lockdown period (Bera et al., 2020a). PM is now recognized as carcinogenic to humans (IARC 2013), and it is also considered one of the leading factors of diseases such as stroke, asthma, ischemic heart disease, bronchitis and chronic obstructive pulmonary disease (COPD), and of an estimated reduction of life expectancy (WHO 2013). Such studies are carried out with high-accuracy models from a short-term perspective. The long-term approach is used to find the effects of prolonged exposure (at least ten years), while the short-term approach is applied to determine the severe health effects and risks linked with severe air pollution, particularly of PM2.5. The aim of this research was therefore to fill an existing gap of knowledge on the short-term effect of PM2.5 on the health of the residents of Kolkata, one of the biggest agglomerations in eastern India, and to compare the results with other cities worldwide. The lockdown remarkably reduced the pollution rate and concentration over the Kolkata and Howrah municipality areas, and the AQI (air quality index) improved from the poor to the good category. After analysis, it has been observed that PM10 and PM2.5 are the primary pollutants here (Sarkar et al., 2020). A recent study analyzed the spatiotemporal variation of PM10 and NO2 over three megacities of India (Delhi, Mumbai and Kolkata). It observed that a significant reduction of pollutants was registered over the megacities during the lockdown period and that the concentration level of PM10 has significantly


Fig. 1. The structure of the artificial neural network (ANN) model.

dropped over the Kolkata metropolitan area mainly due to the imposition of lockdown (Ganguly et al., 2021). But that study has some limitations, because it was conducted within a short time frame and, in India, the air pollution parameters are not routinely updated or monitored. Different pollutant and climatic parameters are the main indicators of this study. These are NO2, O3, SO2, maximum temperature, minimum temperature, wind speed, relative humidity etc. These parameters are selected on the basis of previous important studies (Amanollahi and Ausati, 2019; Mirzaei et al., 2019). This paper focuses on the simulation and prediction of PM2.5 concentration with a high accuracy level to portray a more relevant and vivid scenario in the contemporary context. The main objectives of this paper are (i) to predict and simulate the PM2.5 concentration (24th March to 31st May 2020) through the application of linear and non-linear models and (ii) to compare the accuracy level of the two models by obtaining the model validation values and to detect the factors which have the maximum association with PM2.5 concentration during the lockdown phase over Kolkata metropolitan city.

2. Method and materials

2.1. Data source and data acquisition

The spatiotemporal concentration data of PM2.5 have been collected from the State Pollution Control Board, Govt. of West Bengal, from 24th March to 31st May 2020 (lockdown period). Five automatic pollution measurement stations, namely IACS Jadavpur, Fort William, Victoria Memorial, Rabindra Sarobar and Rabindrabharati University, have been considered to maintain spatial integrity and to improve data accuracy. The PM2.5 concentration in the troposphere has been obtained from these stations. Other important parameters such as PM10, CO, NO2, SO2 and O3 have been taken from the same monitoring stations on a daily basis and averaged. Similarly, the daily meteorological data have been taken from the world weather website (World Weather Online, 2020). The scatter plot and correlation matrix have been produced with R Studio.

2.2. Multiple linear regression model (MLR)

Regression analysis is frequently used for prediction, and the objective of this model is to construct a mathematical model that can be used to predict the dependent variable from the inputs of the independent variables, or predictors (Juneng et al., 2011). The MLR model has been used to obtain the significant relationship as well as the correlation between the dependent variable and the predictors (Table 1). Here, around 75% of the data has been used for training and 25% for testing of this model. Statistical Package for Social Science (SPSS version 25) has been used to run the model.

Table 1
The coefficients with VIF and tolerance values of the independent variables.

                     Unstandardized coefficients
Predictor            Estimate      SE        t       VIF    Tolerance
Intercept            −171.73     260.67    −0.65
PM10                    0.35       0.05     6.01     3.13     0.32
CO                     41.50      10.77     3.85     3.99     0.25
NO2                     0.17       0.31     0.56     4.30     0.23
O3                      0.00       0.06     0.13     2.22     0.45
SO2                     0.73       0.53     1.36     3.25     0.30
Max temperature        −0.03       0.42    −0.07     6.30     0.15
Min temperature        −0.39       0.24    −1.63     1.51     0.66
Wind speed             −0.04       0.09    −0.46     2.40     0.41
Relative humidity      −0.22       0.08    −2.56     7.71     0.13
Air pressure            0.17       0.25     0.67     2.93     0.34

The multiple linear regression model is:

$$ Y = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_n X_n + \varepsilon \tag{1} $$

where $B_0$ is the Y-intercept, $X_1, X_2, \dots, X_n$ are the independent variables, $B_1, B_2, \dots, B_n$ are the coefficients of the independent variables, $\varepsilon$ is the error term and $Y$ is the dependent variable.
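As a minimal sketch of Eq. (1), and not the SPSS workflow used in the study, the coefficients can be estimated by ordinary least squares on a 75%/25% train/test split. The data below are synthetic stand-ins, and the helper names `fit_mlr` and `predict_mlr` as well as the coefficient values are illustrative assumptions.

```python
import numpy as np

def fit_mlr(X, y):
    """Estimate [B0, B1, ..., Bn] of Eq. (1) by ordinary least squares."""
    # A leading column of ones lets the solver estimate the intercept B0.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Apply Y = B0 + B1*X1 + ... + Bn*Xn to new predictor rows."""
    return coef[0] + X @ coef[1:]

# Synthetic stand-in data (NOT the Kolkata measurements): 80 rows, 3 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(80, 3))
true_coef = np.array([10.0, 0.35, 2.0, -0.2])  # hypothetical B0..B3
y = true_coef[0] + X @ true_coef[1:] + rng.normal(scale=0.1, size=80)

# About 75% of the rows are used for training, the rest for testing.
n_train = int(0.75 * len(X))
order = rng.permutation(len(X))
train, test = order[:n_train], order[n_train:]

coef = fit_mlr(X[train], y[train])
y_hat = predict_mlr(coef, X[test])
print("estimated coefficients:", np.round(coef, 3))
```

Shuffling before the split is a design choice of this sketch; the paper does not state how its training and testing subsets were drawn.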


Table 2
The importance of predictors as depicted by the artificial neural network.

Predictors             Importance    Normalized importance
PM10                      0.28             100.0%
CO                        0.19              66.5%
NO2                       0.11              40.4%
O3                        0.01               5.2%
SO2                       0.03              10.4%
Maximum temperature       0.08              28.4%
Minimum temperature       0.04              15.7%
Wind speed                0.02               7.6%
Relative humidity         0.07              27.3%
Air pressure              0.13              46.1%

Fig. 2. The importance of independent variables in terms of the prediction of PM2.5 accumulation.

Collinearity arises when two predictors have a linear relationship. It obscures the individual contribution of each variable and introduces redundancy. Similarly, it makes the model excessively sensitive to the data. Here, the multicollinearity problem is checked with the Variance Inflation Factor (VIF); a VIF value below 10 indicates that there is no multicollinearity problem and that the regression is fit.

The VIF is:

$$ \mathrm{VIF} = \frac{1}{1 - R_J^2} \tag{2} $$

where VIF stands for the variance inflation factor and $R_J^2$ is the multiple coefficient of determination obtained by regressing the $J$-th predictor on the remaining predictors.

2.3. Artificial neural network (ANN) model

The artificial neural network (ANN) model was first suggested by McCulloch and Pitts, who were inspired by neural network systems and the brain of living organisms. The artificial neural network model is essentially a statistical and mathematical system of complex interconnections that imitates the biological neurons fundamental to human brain processes. The multi-layer perceptron (MLP) is widely applied in ANN models. An ANN has three kinds of layers: (i) the input layer, in which the data are distributed over the network, (ii) the hidden layer, where the data are processed, and (iii) the output layer, where the results for given inputs are produced (Amanollahi and Ausati, 2019) (Fig. 1). The ANN model is built in SPSS (v25). There can be more than one hidden layer, which is a key parameter of the model. The ANN model has been broadly used for air quality estimation, simulation and prediction purposes (Alimissis et al., 2018). Here, around 75% of the data is used for training and 25% for testing of the model. In the ANN, a hyperbolic tangent or a sigmoid function is used as the activation function for mathematical convenience.

The hyperbolic tangent is:

$$ f(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{3} $$

Back propagation supplies the derivative of the loss function that the gradient descent method requires. The following equations describe the back propagation algorithm.

The squared-error loss function is

$$ E = L(t, y) \tag{4} $$

where $E$ is the loss for the output $y$ and the target value $t$; $t$ denotes the target output of a training sample and $y$ is the actual output of the neuron.

For each neuron $j$, the output $o_j$ is defined as

$$ o_j = \varphi(\mathrm{net}_j) = \varphi\left( \sum_{k=1}^{n} w_{kj}\, o_k \right) \tag{5} $$

where the activation function $\varphi$ is non-linear. The main activation function is the logistic function:

$$ \varphi(z) = \frac{1}{1 + e^{-z}} \tag{6} $$

The input $\mathrm{net}_j$ to a neuron is the weighted sum of the outputs $o_k$ of the neurons in the previous layer.

2.4. Model validation

Any scientific prediction model requires validation to determine its performance on the dependent variable. Three measures have been used here for the validation of the models: the root mean square error (RMSE), the mean absolute error (MAE) and Pearson’s correlation coefficient (R). MAE and RMSE show the average error of the models. Both range from 0 to ∞ and are negatively oriented scores, so lower values of RMSE and MAE indicate better model predictions. R2 is the statistical parameter which has been used for the validation of the model, and it ranges from 0 to 1. An R2 value near 1 indicates a strong association between variables, whereas a value near 0 indicates a weak association. The standardized coefficients and the changing pattern of R2 are used to determine the importance of a factor in any regression model. Here, the importance of the predictors is determined on the basis of these values. The models are not highly fitted here because of some data shortage and the short time frame of the study.

These equations are given below:

$$ \mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( y_i - \bar{y}_i \right)^2 } \tag{7} $$

where $(y_i - \bar{y}_i)$ is the difference between the $i$-th observed and predicted values and $N$ denotes the sample size.

$$ \mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \bar{y}_i \right| \tag{8} $$

where $|y_i - \bar{y}_i|$ is the absolute error and $N$ is the number of errors.

$$ R = \frac{ \sum (x_i - \bar{x})(y_i - \bar{y}) }{ \sqrt{ \sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2 } } \tag{9} $$

where $x_i$ denotes the values of the x variable in the sample, $\bar{x}$ is the mean value of the x variable, $y_i$ denotes the values of the y variable and $\bar{y}$ is the mean value of the y variable.

3. Result and discussion

3.1. Prediction of PM2.5 accumulation over Kolkata through multiple linear regression model

In the current study, the results showed that multiple linear regression (MLR) is a high-quality model, which is used here for the prediction


Fig. 3. The comparative analysis of MLR and ANN model regarding PM2.5 concentration over Kolkata during lockdown period. a. observed and simulated values
using MLR b. observed and predicted values using MLR c. observed and simulated values using ANN d. observed and predicted values using ANN.

Fig. 4. Scatter diagram displaying the trend of correlation of different variables with PM2.5 (with 95% confidence level).

of the PM2.5 concentration over Kolkata during the lockdown. The coefficients of the corresponding predictors are listed in Table 1. In the training phase, the MLR model gives RMSE = 3.77, MAE = 1.69 and R2 = 0.833. In the testing phase, it gives RMSE = 3.33, MAE = 5.19 and R2 = 0.510. A better R2 result for prediction compared with the regression model might be due to the uneven distribution of sample data (Zhao et al., 2018).

3.2. Outcomes of artificial neural network model in the prediction of PM2.5 accumulation over Kolkata

The artificial neural network model is the other type of model used here for simulation and prediction of PM2.5 concentration over


Table 3
Correlation matrix (Pearson’s method) showing the association between variables.
Units: PM2.5, PM10, NO2, O3 and SO2 in μg/m3; CO in mg/m3; temperatures in °C; wind speed (WS) in km/h; relative humidity (RH) in %; air pressure (AP) in mb.

Variable             PM2.5   PM10     CO    NO2     O3    SO2  Max T  Min T     WS     RH     AP
PM2.5                 1      0.84   0.83   0.76   0.55   0.73   0.47   0.05  −0.26  −0.66   0.41
PM10                  0.84   1      0.76   0.60   0.39   0.62   0.29   0.10  −0.02  −0.42   0.22
CO                    0.83   0.76   1      0.73   0.55   0.58   0.25   0.06  −0.15  −0.44   0.19
NO2                   0.76   0.60   0.73   1      0.60   0.73   0.24  −0.15  −0.29  −0.49   0.40
O3                    0.55   0.39   0.55   0.60   1      0.61   0.37   0.05  −0.16  −0.48   0.14
SO2                   0.73   0.62   0.58   0.73   0.61   1      0.40  −0.05  −0.17  −0.53   0.39
Max. temperature      0.47   0.29   0.25   0.24   0.37   0.40   1      0.44  −0.29  −0.87   0.43
Min. temperature      0.05   0.10   0.06  −0.15   0.05  −0.05   0.44   1     −0.08  −0.29   0.10
Wind speed           −0.26  −0.02  −0.15  −0.29  −0.16  −0.17  −0.29  −0.08   1      0.45  −0.69
Relative humidity    −0.66  −0.42  −0.44  −0.49  −0.48  −0.53  −0.87  −0.29   0.45   1     −0.55
Air pressure          0.41   0.22   0.19   0.40   0.14   0.39   0.43   0.10  −0.69  −0.55   1
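The collinearity check of Eq. (2) can be sketched from a correlation matrix alone, because for standardized predictors the VIF of predictor j equals the j-th diagonal element of the inverse of the predictor correlation matrix. The example uses only the PM10, CO and NO2 entries of Table 3 as an illustration. Since Table 1 was computed with all ten predictors, these numbers are not expected to reproduce its VIF column. The function name `vif_from_corr` is an assumption of this sketch.

```python
import numpy as np

def vif_from_corr(corr):
    """Return VIF_j = 1 / (1 - R_j^2) for each predictor.

    The values are read off the diagonal of the inverse correlation matrix.
    """
    return np.diag(np.linalg.inv(corr))

# Pairwise Pearson correlations among PM10, CO and NO2, taken from Table 3.
names = ["PM10", "CO", "NO2"]
corr = np.array([
    [1.00, 0.76, 0.60],
    [0.76, 1.00, 0.73],
    [0.60, 0.73, 1.00],
])

vif = vif_from_corr(corr)
tolerance = 1.0 / vif  # tolerance is the reciprocal of VIF

for name, v, t in zip(names, vif, tolerance):
    print(f"{name}: VIF = {v:.2f}, tolerance = {t:.2f}")
```

With this three-predictor subset every VIF stays below the threshold of 10 used in Section 2.2.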

Table 4
Tropical and sub-tropical high pollution tolerant plant species.

Scientific name             Local name               Pollutants absorbed by the plant
Psidium guajava             Guava                    SO2, PM2.5, PM10
Ficus benghalensis          Banyan                   CO, NO2, SO2, PM2.5, PM10
Azadirachta indica          Neem                     CO, SO2, PM2.5, PM10, NO2
Hibiscus rosa-sinensis      China rose               PM10, PM2.5, SO2
Neolamarckia cadamba        Kadam / Burflower tree   CO, NO2, SO2, PM2.5, PM10
Mangifera indica            Mango                    CO, NO2, SO2, PM2.5, PM10
Eucalyptus globulus         Southern blue gum        CO, NO2, SO2, PM2.5, PM10
Ficus religiosa             Peepul                   CO, NO2, SO2, PM2.5, PM10
Bougainvillea spectabilis   Bougainvillea            PM2.5, PM10, NO2, SO2, CO
Ricinus communis            Castor oil plant         CO, NO2
Cascabela thevetia          Yellow oleander          NO2, SO2, PM2.5, PM10
Cassia siamea               Cassod tree              SO2, NO2, CO

Kolkata (Fig. 1). In this case, an input layer, a hidden layer and an output layer have been incorporated for prediction. The ANN model has been widely applied by researchers for quantification and air quality prediction (Suleiman et al., 2019; Radojevic et al., 2019). In the training stage the applied ANN model gives RMSE = 3.74, MAE = 1.14 and R2 = 0.916, while in the testing phase it gives RMSE = 2.55, MAE = 4.32 and R2 = 0.697. It has been noticed that PM10 is the most efficient predictor of PM2.5, with a high importance value, whereas O3 is the least effectual predictor of PM2.5, with a low importance value (Table 2; Fig. 2).

Fig. 5. The correlation among the variables affecting the prediction of PM2.5 accumulation in lower atmosphere.

3.3. Comparative study in predicting PM2.5 concentration over Kolkata

In this research two different machine learning algorithms have been applied for the prediction of PM2.5 concentration during the lockdown period (24th March to 31st May). The comparison between the results of the two models shows that the ANN model has better prediction precision and simulation results due to its lower RMSE (Fig. 3). Here, the average concentration of PM2.5 of each day during the study period has been calculated. The RMSE and MAE values of the ANN model are 3.74 and 1.14 respectively in the training phase, whereas those of the MLR model are 3.77 and 1.69. In the testing phase the RMSE and MAE values are 2.55 and 4.32 respectively for the ANN model, whereas they are 3.33 and 5.19 for the MLR model. The prediction of PM2.5 concentration is a complicated issue because it can be easily affected by different factors or predictors. The correlation matrix (based on Pearson’s method) shows the degree of association between variables (Table 3; Fig. 4). PM10, CO, SO2 and NO2 are highly correlated with PM2.5, which means that the air pollution level mostly controls the concentration of PM2.5 (Table 2; Table 3), whereas the meteorological factors (wind speed and relative humidity) are negatively related with the concentration of PM2.5 (Fig. 5). This scenario shows that rainfall and high wind speed can restore atmospheric purity and directly diminish the PM2.5 concentration in the lower atmosphere. Comparatively, better results of these models can be derived during the cold season than during the warm season (Mirzaei et al., 2019). It is supposed that the temperature inversion during the cold season may play a vital role because this event confines the suspended particulate matter to the lower part of the atmosphere and subsequently decreases the accuracy level of the models.


4. Conclusions Bera, B., Bhattacharjee, S., Shit, P.K., Sengupta, N., Saha, S., 2020a. Significant impacts of
COVID19 lockdown on urban air pollution in Kolkata (India) and amelioration of en-
vironmental health. Environ. Dev. Sustainability doi:10.1007/s10668-020-00898-5.
Kolkata metropolitan city is regarded as one of the most polluted cities in India and in the world, so projecting the future levels of its harmful atmospheric pollutants would benefit both human health and environmental quality. This research compares the precision of a linear model (MLR) and a nonlinear model (ANN) in predicting the PM2.5 concentration in Kolkata during the lockdown period. The study indicates that the nonlinear model predicted PM2.5 concentration over this metropolitan city more precisely than the linear model. The comparative analysis shows that the ANN model attained the higher accuracy in both the training and testing stages. The most appropriate model is therefore the ANN, which is principally composed of three distinct layers. It can be concluded that the Artificial Neural Network (ANN) model is better suited than the Multiple Linear Regression (MLR) model for predicting the concentration of PM2.5 over Kolkata during the lockdown period. This ANN model can also be applied to estimate the spatiotemporal concentration of PM2.5 over any city of the world during the implementation of a long-term environmental management plan.

However, lockdown is not a permanent solution to the threat of pollution, so an alternative, sustainable management method should be applied to maintain environmental quality. Relevant applied research has shown that plants are the primary receivers of various types of air pollutants and act as a massive sink (Kaur and Nagpal, 2017; Letter and Jäger, 2020). A contemporary study revealed that certain important plant species have a high absorption capacity for specific pollutants (Salih et al., 2017; Table 4). Introducing air pollution-tolerant species in vacant urban spaces is highly necessary to improve environmental health along with the ecosystem values of urban life (Bamniya et al., 2011). Expanding green belts by planting tolerant species can reduce high air pollution to a certain level.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References

Aggarwal, A., Haritash, A.K., Kansal, G., 2014. Air pollution modelling - a review. Int. J. Adv. Technol. Eng. Sci. 2, 255–264.
Alimissis, A., Philippopoulos, K., Tzanis, C.G., Deligiorgi, D., 2018. Spatial estimation of urban air pollution with the use of artificial neural network models. Atmos. Environ. 191, 205–213. doi:10.1016/j.atmosenv.2018.07.058.
Amanollahi, J., Ausati, S., 2019. PM2.5 concentration forecasting using ANFIS, EEMD-GRNN, MLP, and MLR models: a case study of Tehran, Iran. Air Qual. Atmosphere Health doi:10.1007/s11869-019-00779-5.
Amanollahi, J., Kaboodvandpour, S.H., Qhavami, S., Mohammadi, B., 2015. Effect of the temperature variation between Mediterranean Sea and Syrian deserts on the dust storm occurrence in the western half of Iran. Atmos. Res. 154, 116–125. doi:10.1016/j.atmosres.2014.11.003.
Ausati, S., Amanollahi, J., 2016. Assessing the accuracy of ANFIS, EEMD-GRNN, PCR, and MLR models in predicting PM2.5. Atmos. Environ. 142, 465–474. doi:10.1016/j.atmosenv.2016.08.007.
Baker, K.R., Foley, K.M., 2011. A nonlinear regression model estimating single source concentrations of primary and secondarily formed PM2.5. Atmos. Environ. 45, 3758–3767. https://www.researchgate.net/deref/http%3A//dx.doi.org/10.1016/j.atmosenv.2011.03.074.
Bamniya, B.R., Kapoor, C.S., Kapoor, K., Kapasya, V., 2011. Harmful effect of air pollution on physiological activities of Pongamia pinnata (L.) Pierre. Clean Technol. Environ. Policy 14, 115–124. doi:10.1007/s10098-011-0383-z.
Bandyopadhyay, A., 2010. Dispersion modeling in assessing air quality of industrial projects under Indian regulatory regime. Int. J. Energy Environ. 1.
Basu, E., Salui, C.L., 2021. Estimating Particulate Matter Concentrations from MODIS AOD Considering Meteorological Parameters Using Random Forest Algorithm. Spatial Modeling and Assessment of Environmental Contaminants. Environmental Challenges and Solutions In: Shit P.K., Adhikary P.P., Sengupta D. (eds). Springer, Cham doi:10.1007/978-3-030-63422-3_29.
Bera, B., Bhattacharjee, S., Sengupta, N., 2020b. Human behavior, trustworthiness, and attitude during COVID-19 lockdown in Indian modern societal and cultural antiquity. J. Hum. Behav. Social Environ. doi:10.1080/10911359.2020.1829241.
Chakraborty, B., Roy, S., Bera, A., Adhikary, P.P., Bera, B., Sengupta, D., Bhunia, G.S., Shit, P.K., 2020. Cleaning the river Damodar (India): impact of COVID-19 lockdown on water quality and future rejuvenation strategies. Environ. Dev. Sustainability doi:10.1007/s10668-020-01152-8.
Chakraborty, B., Roy, S., Bera, A., Adhikary, P.P., Bera, B., Sengupta, D., Bhunia, G.S., Shit, P.K., 2021. Eco-restoration of river water quality during COVID-19 lockdown in the industrial belt of eastern India. Environ. Sci. Pollution Res. doi:10.1007/s11356-021-12461-4.
Chelani, A.B., Gajghate, D.G., Hasan, M.Z., 2002. Prediction of ambient PM10 and toxic metals using artificial neural networks. J. Air Waste Manag. Assoc. 52 (7), 805–810. doi:10.1080/10473289.2002.10470827.
Cohen, A.J., Brauer, M., Burnett, R., Anderson, H.R., Frostad, J., Estep, K., et al., 2017. Estimates and 25-year trends of the global burden of disease attributable to ambient air pollution: an analysis of data from the Global Burden of Diseases Study 2015. Lancet 389 (10082), 1907–1918. doi:10.1016/S0140-6736(17)30505-6.
CPCB, 2020. Impact of lockdown (25th March to 15th April) on air quality. Ministry Environ. Forest Clim. Change, Govt. of India, Delhi 1–62. https://cpcb.nic.in/latest-cpcb.php.
Dutheil, F., Baker, S.J., Navel, V., 2020. COVID-19 as a factor influencing air pollution? Environ. Pollut. doi:10.1016/j.envpol.2020.114466.
Dutta, A., Jinsart, W., 2020. Risks to health from ambient particulate matter (PM2.5) to the residents of Guwahati city, India: an analysis of prediction model. Hum. Ecol. Risk Assess. doi:10.1080/10807039.2020.1807902.
Ferdous, M.R., Ali, M.A., 2005. Air Quality Modelling for Predicting Traffic Pollution in Dhaka City. UAP J. Civil Environ. Eng. 1 (1), 27–33.
Fernando, H.J., Mammarella, M.C., Grandoni, G., Fedele, P., Di Marco, R., Dimitrova, R., Hyde, P., 2012. Forecasting PM10 in metropolitan areas: efficacy of neural networks. Environ. Pollut. 163, 62–67. doi:10.1016/j.envpol.2011.12.018.
Ganguly, R., Sharma, D., Kumar, P., 2021. Short-term impacts of air pollutants in three megacities of India during COVID-19 lockdown. Environ. Dev. Sustain. doi:10.1007/s10668-021-01434-9.
Gautam, S., 2020. COVID-19: air pollution remains low as people stay at home. Air Qual. Atmosphere Health doi:10.1007/s11869-020-00842-6.
Goyal, P., Chan, A.T., Jaiswal, N., 2006. Statistical models for the prediction of respirable suspended particulate matter in urban cities. Atmos. Environ. 40 (11), 2068–2077. doi:10.1016/j.atmosenv.2005.11.041.
Haywood, J., Boucher, O., 2000. Estimates of the direct and indirect radiative forcing due to tropospheric aerosols: a review. Rev. Geophys. 38 (4), 513–543. doi:10.1029/1999RG000078.
Health and Family Welfare Department, 2020. West Bengal COVID-19 Health Bulletin – 15th May 2020. Govt. of West Bengal. https://www.wbhealth.gov.in/pages/corona/bulletin.
Huang, X., Ding, A., Gao, J., Zheng, B., Zhou, D., Qi, X., Tang, R., Ren, C., Nie, W., Chi, X., Wang, J., 2020. Enhanced Secondary Pollution Offset Reduction of Primary Emissions during COVID-19 Lockdown in China. EarthArXiv doi:10.31223/osf.io/hvuzy.
Juneng, L., Latif, M.T., Tangang, F., 2011. Factors influencing the variations of PM10 aerosol dust in Klang Valley, Malaysia during the summer. Atmos. Environ. 45, 4370–4378. doi:10.1016/j.atmosenv.2011.05.045.
Kaur, M., Nagpal, A.K., 2017. Evaluation of air pollution tolerance index and anticipated performance index of plants and their application in development of green space along the urban areas. Environ. Sci. Pollut. Res. 24, 18881–18895. doi:10.1007/s11356-017-9500-9.
Lelieveld, J., Evans, J.S., Fnais, M., Giannadaki, D., Pozzer, A., 2015. The contribution of outdoor air pollution sources to premature mortality on a global scale. Nature 525, 367–371. doi:10.1038/nature15371.
Letter, C., Jäger, G., 2020. Simulating the potential of trees to reduce particulate matter pollution in urban areas throughout the year. Environ. Dev. Sustain. 22, 4311–4321. doi:10.1007/s10668-019-00385-6.
Liu, X., Nie, D., Zhang, K., Wang, Z., Li, X., Shi, Z., Wang, Y., Huag, L., Chen, M., Ge, X., Ying, Q., Yu, X., Liu, X., Hu, J., 2019. Evaluation of particulate matter deposition in the human respiratory tract during winter in Nanjing using size and chemically resolved ambient measurements. Air Qual. Atmos. Health 12 (5), 529–538. doi:10.1007/s11869-019-00663-2.
Majumdar, B.K., Dutta, A., Chakrabarty, S., et al., 2010. Assessment of vehicular pollution in Kolkata, India, using CALINE 4 model. Environ. Monit. Assess. 170, 33–43. doi:10.1007/s10661-009-1212-2.
Mehmood, K., China, P.R., Saifullah, Abrar, M.M., Iqbal, M., Haider, E., Shoukat, H.M.H., 2020. Can PM2.5 pollution worsen the death rate due to COVID-19 in India and Pakistan? Sci. Total Environ. 742, 140557. doi:10.1016/j.scitotenv.2020.140557.
Mirzaei, M., Amanollahi, J., Tzanis, C.G., 2019. Evaluation of linear, nonlinear, and hybrid models for predicting PM2.5 based on a GTWR model and MODIS AOD data. Air Qual. Atmos. Health 12 (10), 1215–1224. doi:10.1007/s11869-019-00739-z.
Nagendra, S.M.S., Khare, M., 2006. Artificial neural network approach for modelling nitrogen dioxide dispersion from vehicular exhaust emissions. Ecol. Model. 190 (1–2), 99–115. doi:10.1016/j.ecolmodel.2005.01.062.
Nath, P., Saha, P., Middya, A.I., et al., 2021. Long-term time-series pollution forecast using statistical and deep learning methods. Neural Comput. Applic. doi:10.1007/s00521-021-05901-2.
Perez, P., Trier, A., Reyes, J., 2000. Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile. Atmos. Environ. 34, 1189–1196. doi:10.1016/S1352-2310(99)00316-7.


Pope, C.A., Burnett, R.T., Thurston, G.D., Thun, M.J., Calle, E.E., Krewski, D., Godleski, J.J., 2004. Cardiovascular mortality and long-term exposure to particulate air pollution: epidemiological evidence of general pathophysiological pathways of disease. Circulation 109 (1), 71–77. doi:10.1161/01.cir.0000108927.80044.7f.
Qi, J., Ruan, Z., Qian, Z., Yin, P., Yang, Y., Acharya, B.K., et al., 2020. Potential gain in life expectancy by attaining daily ambient fine particulate matter pollution standards in mainland China: a modelling study based on nationwide data. PLoS Med 17, e1003027. doi:10.1371/journal.pmed.1003027.
Radojevic, D., Antanasijevic, D., Peric-Grujic, A., Ristic, M., Pocajt, V., 2019. The significance of periodic parameters for ANN modelling of daily SO2 and NOX concentrations: a case study of Belgrade, Serbia. Atmos. Pollut. Res. 10 (2), 621–628. doi:10.1016/j.apr.2018.11.004.
Rodríguez-Urrego, D., Rodríguez-Urrego, L., 2020. Air quality during the COVID-19: PM2.5 analysis in the 50 most polluted capital cities in the world. Environ. Pollut. 266, 115042. doi:10.1016/j.envpol.2020.115042.
Sahu, S.K., Zhang, H., Guo, H., Hu, J., Ying, Q., Kota, S.K., 2019. Health risk associated with potential source regions of PM2.5 in Indian cities. Air Qual. Atmos. Health 12 (3), 327–340. doi:10.1007/s11869-019-00661-4.
Salih, A.A., Mohamed, A.A., Abahussain, A.A., Tashtoosh, F., 2017. Use of some trees to mitigate air and soil pollution around oil refinery, Kingdom of Bahrain wastewater. J. Env. Sci. Pollut. Res. 3 (2), 167–170. http://www.jacsdirectory.com/jespr.
Sarkar, M., Das, A., Mukhopadhyay, S., 2020. Assessing the immediate impact of COVID-19 lockdown on the air quality of Kolkata and Howrah, West Bengal, India. Environ. Dev. Sustain. doi:10.21203/rs.3.rs-38142/v1.
Shahraiyni, H.T., Sodoudi, S., 2016. Statistical Modeling Approaches for PM10 Prediction in Urban Areas; A Review of 21st-Century Studies. Atmosphere (Basel) 7, 15. doi:10.3390/atmos7020015.
Sharma, S., Zhang, M., Gao, J., Zhang, H., Kota, S.H., 2020. Effect of restricted emissions during COVID-19 on air quality in India. Sci. Total Environ. 728, 1–8. doi:10.1016/j.scitotenv.2020.138878.
Stafoggia, M., Bellander, T., 2020. Short-term effects of air pollutants on daily mortality in the Stockholm county – a spatiotemporal analysis. Environ. Res. doi:10.1016/j.envres.2020.109854.
Suleiman, A., Tight, M.R., Quinn, A.D., 2019. Applying machine learning methods in managing urban concentrations of traffic-related particulate matter (PM10 and PM2.5). Atmos. Pollut. Res. 10 (1), 134–144. doi:10.1016/j.apr.2018.07.001.
Sun, W., Zhang, H., Palazoglu, A., Singh, A., Zhang, W., Liu, S., 2013. Prediction of 24-hour-average PM2.5 concentrations using a hidden Markov model with different emission distributions in northern California. Sci. Total Environ. 443, 93–103. doi:10.1016/j.scitotenv.2012.10.070.
Ventura, L.M.B., Pinto, F.O., Soares, L.M., Luna, A.S., Gioda, A., 2019. Forecast of daily PM2.5 concentrations applying artificial neural networks and Holt-Winters models. Air Qual. Atmos. Health 12 (3), 317–325. doi:10.1007/s11869-018-00660-x.
Vlachogianni, A., Kassomenos, P., Karppinen, A., Karakitsios, S., Kukkonen, J., 2011. Evaluation of a multiple regression model for the forecasting of the concentrations of NOx and PM10 in Athens and Helsinki. Sci. Total Environ. 409 (8), 1559–1571. doi:10.1016/j.scitotenv.2010.12.040.
Walsh, M.P., 2014. PM2.5: global progress in controlling the motor vehicle contribution. Environ. Sci. Engineer. 8 (1), 1–17. doi:10.1007/s11783-014-0634-4.
Wang, C., Horby, P.W., Hayden, F.G., Gao, G.F., 2020. A novel coronavirus outbreak of global health concern. Lancet 395, 470–473. doi:10.1016/s0140-6736.
Wang, Y., Wang, J., Zhao, G., Dong, Y., 2012. Application of residual modification approach in seasonal ARIMA for electricity demand forecasting: a case study of China. Energ. Policy 48, 284–294. doi:10.1016/j.enpol.2012.05.026.
WHO, 2011. Urban outdoor air pollution database. https://www.who.int/phe/health_topics/outdoorair/databases/cities-2011/en/.
World Weather Online, 2020. https://www.worldweatheronline.com/.
Zhao, R., Gu, X., Xue, B., Zhang, J., Ren, W., 2018. Short period PM2.5 prediction based on multivariate linear regression model. PLoS One 13 (7), e0201011. doi:10.1371/journal.pone.0201011.
