
Short-term forecasting of atmospheric air pollutants using Long Short-Term Memory networks

Nicholas Danesi, BEng BCom

The thesis is submitted to University College Dublin in part fulfilment of the
requirements for the degree of Master of Science in Environmental Sustainability

School of Biology and Environmental Science

Supervisor: Assistant Prof. Soumyabrata Dev

August 2021
Abstract
Air quality degradation driven by rapid global development, industrialisation, and
urbanisation puts human health at serious risk. Accurate air quality forecasting and early
warning systems are critical to allow health authorities to provide warnings and health
recommendations to citizens.

This paper uses Holt Winters’ triple exponential smoothing (TES) and Long Short-Term
Memory (LSTM) models to forecast PM2.5, O3 and NO2 concentrations on a short-term basis
(1-6 hours) using hourly-captured historical data from ground-based observation stations in
different Queensland climate zones. The results indicated that predictions made with the TES
model did not significantly outperform the benchmarking random walk model for all climates
and pollutants. In addition, when forecasting PM2.5, the LSTM model did not significantly
outperform the benchmarking random walk model. However, when forecasting ozone, the
LSTM model (3.92 ppb O3, 5.65 ppb O3, 7.22 ppb O3) had significantly lower RMSE values
at all lead times than the TES (5.68 ppb O3, 8.01 ppb O3, 9.72 ppb O3), average (14.97 ppb
O3, 15.15 ppb O3, 15.98 ppb O3) and persistence models (5.74 ppb O3, 8.40 ppb O3, 10.73
ppb O3).

The strong prediction performance of ozone by the LSTM model indicates that it has
potential to be used reliably in an early warning system in the Rocklea area for ozone
exceedances.

Table of Contents
Abstract
Glossary
1. Introduction
1.1. Background
1.2. Aim
1.3. Objectives
2. Methodology
2.1. Data description and extraction
2.2. Evaluation metrics
2.3. Experiment methods
2.3.1. Identifying the correlation between meteorological variables and air pollutants
2.3.2. Assessing a data-driven model’s forecasting accuracy across different climates
2.3.3. Assessing an LSTM model’s forecasting accuracy
2.3.4. Comparing forecasting accuracy from a literature review
2.4. Literature review assessment method
3. Results
3.1. Interdependence of meteorological parameters
3.2. Prediction performance of data driven models
3.2.1. Particulate matter (<2.5μm)
3.2.2. Ozone
3.2.3. Nitrogen dioxide
3.3. Prediction performance of LSTM model
3.3.1. Particulate matter (<2.5μm)
3.3.2. Ozone
3.4. Literature review
4. Discussion and Conclusion
4.1. Main findings
4.1.1. Triple exponential smoothing model performance
4.1.2. LSTM model development and performance
4.1.3. Study limitations and reliability
4.1.4. Study Implications and further research
4.2. Conclusion
5. Appendices
Appendix A
6. Bibliography
List of Figures
Figure 1. Comparison between standard feedforward neural network and an RNN. The red arrow could be classified as the recurrent layer. (Hochreiter & Schmidhuber, 1997)
Figure 2. The structure of an LSTM memory cell. (Hochreiter & Schmidhuber, 1997)
Figure 3. Climatic map of Queensland, showing locations of meteorological stations. (Australian Building Codes Board, 2019)
Figure 4. LSTM model development configuration
Figure 5. Observed PM2.5 values over a period of 3 years split into training and testing data.
Figure 6. Frequency of PM2.5 concentrations over the training period.
Figure 7. Normalised (Box-Cox) frequency data over the training period.
Figure 8. a) Learning rate vs loss curve b) Loss vs number of epoch graph
Figure 9. Heatmaps showing the relationships between meteorological parameters and pollutant concentrations across tropical, subtropical, and arid climates.
Figure 10. PM2.5 RMSE comparison of the TES model benchmarked against the persistence and average models using subtropical data.
Figure 11. PM2.5 RMSE comparison of the TES model performance across tropical, subtropical, and arid climates.
Figure 12. The PM2.5 prediction performance of the TES model in different climates using regression analysis. Red markers indicate the tropical climate dataset, blue markers indicate the arid climate dataset and green markers indicate the subtropical climate dataset.
Figure 13. Ozone RMSE comparison of the TES model benchmarked against the persistence and average models using subtropical data.
Figure 14. The ozone prediction performance of the TES model in different climates using regression analysis. Blue markers indicate the arid climate dataset and green markers indicate the subtropical climate dataset.
Figure 15. Nitrogen dioxide RMSE comparison of the TES model benchmarked against the persistence and average models using subtropical data.
Figure 16. The NO2 prediction performance of the TES model in different climates using regression analysis. Red markers indicate the tropical climate dataset, blue markers indicate the arid climate dataset and green markers indicate the subtropical climate dataset.
Figure 17. PM2.5 RMSE comparison between the LSTM, TES and benchmarking models using the subtropical dataset at lead times of 1 - 8 hours.
Figure 18. A 48-hour snapshot of the LSTM and persistence models’ PM2.5 prediction performance using a 1-hour lead time.
Figure 19. A 48-hour snapshot of the LSTM and persistence models’ PM2.5 prediction performance using a 4-hour lead time.
Figure 20. A 48-hour snapshot of the LSTM and persistence models’ PM2.5 prediction performance using an 8-hour lead time.
Figure 21. Ozone RMSE comparison between the LSTM, TES and benchmarking models using the subtropical dataset at lead times of 1 - 8 hours.
Figure 22. A 48-hour snapshot of the LSTM and persistence models’ ozone prediction performance using a 1-hour lead time.
Figure 23. A 48-hour snapshot of the LSTM and persistence models’ ozone prediction performance using a 4-hour lead time.
Figure 24. A 48-hour snapshot of the LSTM and persistence models’ ozone prediction performance using an 8-hour lead time.
Figure 25. Comparison of PM2.5 RMSE values for different lead times separated by case study. Table 6 provides details for each case study.
Figure 26. Comparison of PM2.5 MAPE values for different lead times separated by case study. Table 6 provides the details for each case study.

Glossary
Acronym Definition
ARIMA Autoregressive integrated moving average
CAMS Copernicus Atmosphere Monitoring Service
LSTM Long Short-Term Memory
LSTME Extended LSTM
A-LSTM Aggregated LSTM
C-LSTME Extended spatiotemporal convolutional LSTM
MAPE Mean absolute percentage error
Naïve Another name for the persistence model
NO2 Nitrogen dioxide
O3 Ozone
PM10 Particulate matter less than 10 μm in diameter
PM2.5 Particulate matter less than 2.5 μm in diameter
ppb Parts per billion
RMSE Root mean square error
RNN Recurrent Neural Network
SO2 Sulphur dioxide
TES Triple exponential smoothing
VOCs Volatile Organic Compounds

1. Introduction
1.1. Background
Rapid global development, industrialisation and urbanisation are putting significant pressure
on the environment. Dependence on fossil fuels for power and industrial processes,
particularly in emerging economies, has significantly deteriorated air quality (Manisalidis, et
al., 2020). Through its continued degradation, air quality has become a critical issue in human
health (Manisalidis, et al., 2020). Recent data from the World Health Organisation indicates
that air quality in most cities fails to meet the minimum air quality standards, with developing
countries having the highest concentrations due to uncontrolled urbanisation and minimal air
quality regulation (WHO, 2014).

Major air pollutants such as particulate matter, SO2, O3, NO2 and volatile organic compounds
(VOCs) have serious long-term exposure effects such as heart disease, nerve damage, lung
cancer, and respiratory diseases such as emphysema (Manisalidis, et al., 2020). The short-
term effects are also serious and can include irritation to the nose, throat, eyes, skin, and
illnesses such as pneumonia or bronchitis (Manisalidis, et al., 2020). Research documents that
even limited exposure to atmospheric pollutants can seriously impact human health (Kelly, et
al., 2012). Therefore, health authorities need to provide accurate and timely short-term (1 – 6
hour) forecasts of air pollutants for early warning purposes. Early warning systems can
prevent prolonged exposure to unsafe air quality conditions by providing citizens with pre-
emptive warnings and health recommendations (Bai, et al., 2018).

Air pollution disasters in the mid-20th century demonstrated to governments that air quality is
an important health issue and propelled research into, and control of, air pollution
(Boissoneault, 2018). These included the 1948 Donora smog in Pennsylvania and the Great
Smog of London in 1952, which was estimated to have directly caused the deaths of 4000
people and respiratory damage to a further 100 000 (Bell, et al., 2004). The disasters prompted the first
air quality legislation to be developed in the mid-1950s in the United States and the United
Kingdom. The U.S. 1963 Clean Air Act outlined the first set of minimum ambient air quality
standards (Boissoneault, 2018). These guidelines were established for human health and have
been continuously developed and amended; however, they are more of a reactionary control
(Bell, et al., 2004). Air quality forecasting is needed to anticipate poor air quality and warn
citizens to protect human health prior to exceedances (Manisalidis, et al., 2020).

As stated, accurate air quality forecasting and early warning systems are critical for health
authorities to provide the population with timely ambient air quality warnings and health
recommendations, especially to those most vulnerable to exceedances (Manisalidis, et al.,
2020). Since the adoption of air quality standards in the 1960s, scientists and statisticians
have been working to develop accurate air quality forecasting, which has resulted in the
development of many statistical forecasting models (Bai, et al., 2018). Previous models have
used data from pollutant sources, weather, and meteorological conditions, ranging in
complexity from simple qualitative and semi-qualitative to quantitative methods (Bai, et al.,
2018). Various forecasting models have been used to issue health warnings to citizens;
however, the most common approach for health authorities is to issue warnings based on live
air quality conditions (Kelly, et al., 2012) (Australian Disaster Resilience, 2021).

The traditional methods for air quality forecasting have included statistical forecasting,
artificial intelligence methods and numerical models. The majority of these models have been
statistical and limited in both forecasting range and accuracy. Bai et al. have composed a
comprehensive overview of various methods for forecasting air quality. Their overview
included and discussed a variety of data-driven forecasting methods and models, including
regression, autoregressive integrated moving average (ARIMA), Projection Pursuit Model
(PP), and Principal Component Analysis (PCA) (Bai, et al., 2018). These forecasting methods
have the advantage of requiring low computational capacity to make predictions; however,
their performance is generally poorer relative to deep neural network models (Bai, et al.,
2018).

Significant advances in computation have allowed modern forecasting tools such as neural
networks and deep learning network models to provide more accurate time series forecasting
than statistical methods on short-term timescales (Bai, et al., 2018) (Hochreiter &
Schmidhuber, 1997). Popular deep learning techniques used in forecasting include
Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN),
Autoencoders and Multi-Layer Perceptrons (MLP) (Bai, et al., 2018).

This paper focuses on examining Long Short-Term Memory (LSTM) deep learning models
for short-term pollutant forecasting; LSTM models are a type of RNN capable of learning
order dependence in time series forecasting (Hochreiter & Schmidhuber, 1997). RNNs differ
from standard feedforward neural networks as they contain feedback connections and can
process sequences of data points (Hochreiter & Schmidhuber, 1997). Fig. 1 shows the comparison
between a standard feedforward neural network and an RNN.

Figure 1 – Comparison between standard feedforward neural network and an RNN. The red arrow could be classified as the
recurrent layer (Hochreiter & Schmidhuber, 1997)

LSTM neural network models were proposed to improve the memory of both long- and short-term
dynamic features of a time series. At a basic level, LSTM models differ from standard RNN
models by preventing the error backpropagated through the recurrent layer from vanishing or
exploding, while still using the feedback system. Essentially, LSTM cells are trained to remember useful information for
prediction capability and forget useless information (Hochreiter & Schmidhuber, 1997). The
‘memory unit’ component of an LSTM network allows the storage of information of any time
length and can obtain a more accurate time series forecast than regular RNN models
(Hochreiter & Schmidhuber, 1997).

Figure 2 - The structure of an LSTM memory cell (Hochreiter & Schmidhuber, 1997)

Fig. 2 shows the structure of an LSTM memory cell, where $x_t$ is the input, $h_t$ is the output,
and $c_t$ is the cell state. The memory cell contains three separate gate structures: an input gate,
a forgetting gate, and an output gate. The gates open or close depending on the input
information to control the increase or decrease of information within
the memory cell and LSTM network (Chang, et al., 2020). The gates are controlled through a
nonlinear function that determines the information throughput. Fig. 2 shows a sigmoid
function, $\sigma$; however, in this paper, a tanh activation function was used in the LSTM layers.
The methodology for the development of the LSTM model used in this paper can be seen in
Section 2.3.3.

The formulas for the above memory cell for the forgetting gate, input gate, output gate, input
conversion, cell state update and output of the hidden LSTM layer of the memory cell are
shown as:

Table 1 - Explanatory equations for an LSTM memory cell (Hochreiter & Schmidhuber, 1997)

Forgetting gate: $f_t = \sigma_g(W_f x_t + U_f h_{t-1} + b_f)$ (1)
Input gate: $i_t = \sigma_g(W_i x_t + U_i h_{t-1} + b_i)$ (2)
Output gate: $o_t = \sigma_g(W_o x_t + U_o h_{t-1} + b_o)$ (3)
Input conversion: $\bar{c}_t = \sigma_c(W_c x_t + U_c h_{t-1} + b_c)$ (4)
Cell state update: $c_t = f_t \circ c_{t-1} + i_t \circ \bar{c}_t$ (5)
Hidden layer output: $h_t = o_t \circ \sigma_h(c_t)$ (6)

where W and U contain the respective weights of the input and recurrent connections for the input gate, output gate, forget
gate or memory cell, b is the offset vector for the gates and $\circ$ denotes the Hadamard product.
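
To make the six equations of Table 1 concrete, the following is a minimal sketch of a single memory-cell update in plain Python. It is an illustration only, not the implementation used in this paper. It assumes that the gate activation is the sigmoid and that the input conversion and hidden output both use tanh; the function names and the toy parameter values are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, assumed here for the gate activation sigma_g."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(params, x_t, h_prev, activation):
    """Evaluate activation(W x_t + U h_prev + b) element by element."""
    W, U, b = params
    return [
        activation(
            sum(w * x for w, x in zip(w_row, x_t))
            + sum(u * h for u, h in zip(u_row, h_prev))
            + b_k
        )
        for w_row, u_row, b_k in zip(W, U, b)
    ]


def lstm_step(x_t, h_prev, c_prev, p):
    """One memory-cell update; p maps "f", "i", "o", "c" to (W, U, b)."""
    f_t = gate(p["f"], x_t, h_prev, sigmoid)      # equation (1)
    i_t = gate(p["i"], x_t, h_prev, sigmoid)      # equation (2)
    o_t = gate(p["o"], x_t, h_prev, sigmoid)      # equation (3)
    c_bar = gate(p["c"], x_t, h_prev, math.tanh)  # equation (4), assuming tanh
    # Equation (5): element-wise (Hadamard) products.
    c_t = [f * c + i * cb for f, c, i, cb in zip(f_t, c_prev, i_t, c_bar)]
    # Equation (6), assuming sigma_h is tanh.
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical toy cell: 1 input, 2 hidden units, all weights and offsets zero.
zero = ([[0.0], [0.0]], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
params = {name: zero for name in ("f", "i", "o", "c")}
h_t, c_t = lstm_step([3.0], [0.0, 0.0], [1.0, -2.0], params)
```

With every weight set to zero each gate evaluates to 0.5 and the input conversion to 0, so the cell state is simply halved. This shows how the forgetting gate scales the previous cell state in equation (5).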

LSTM deep learning models used to predict air quality on a short-term basis (<8 hours) have
been successfully tested through several case studies across South Korea, India, and China. In
a recent paper, Kim et al. developed a deep neural network based on LSTM for the daily
forecasting of PM10 and PM2.5 concentrations in South Korea (Kim, et al., 2019). Liu et al.
also proposed a hybrid type prediction model using wavelet decomposition, LSTM network
and information gain to forecast particulate matter PM2.5 across Beijing (Liu, et al., 2020).
Rao et al. have also predicted the air quality of Visakhapatnam in India using LSTM
networks (Rao, et al., 2019). These research groups were able to generate networks with accurate
prediction capabilities for hourly-based air pollutant forecasting.

1.2. Aim
As outlined, even limited exposure to high concentrations of air pollutants is detrimental to
human health; therefore, there is a clear need for accurate, short-term (1 – 6 hour)
forecasting of atmospheric air pollutants for early warning purposes.

Australia is a nation with areas of both high urban population density and heavy industrial
areas with significant air pollutant emissions (Australian Disaster Resilience, 2021). So, it has
a considerable need for accurate short-term air pollutant forecasting to provide health
authorities with an early warning system to protect citizens from adverse health impacts due
to air pollutants. A timely and accurate forecasting system could also allow industrial emitters
to adjust their production output based on the forecasted air quality conditions. As of the 29th
of October 2020, the Australian Bureau of Meteorology does not use LSTM or any other deep
learning models for air quality forecasting, as confirmed by phone through the
Commonwealth Scientific and Industrial Research Organisation (CSIRO); instead, it uses a
chemical transport model for forecasting aerosols, AQFx. In addition, Australia's distinct
climate provides a unique opportunity to examine the accuracy of predictive models in
different climate zones.

This thesis aims to provide practical support to Australian health authorities and regulatory
bodies to provide early air pollutant exceedance warnings to citizens and industrial emitters.
The results will also fill a gap in research, as there has been limited prior research comparing
the effectiveness of air quality forecasting models in different climates in Oceania.

1.3. Objectives

To achieve the aim of this thesis, the primary objective is to develop an LSTM forecasting
model that can reliably and accurately predict atmospheric air pollutant concentrations on a
short-term timescale (1 – 6 hours) in Queensland, Australia.

In addition to the primary objective, several further objectives will be completed to better
understand the topic and build on the foundation for future research in the area. These further
objectives include the following:

1. To provide a review of the current state of the art deep LSTM learning-based models
used to forecast air pollutants.
2. To develop a data scraping tool that effectively gathers and collates existing,
historical air contaminant data from ground monitoring stations in Queensland.
3. To identify key meteorological variables that correlate with air pollutant
concentrations.
4. To assess a data-driven model’s accuracy in forecasting atmospheric air pollutants
across three different Queensland climates (tropical, subtropical, and arid).
5. To assess an LSTM model’s accuracy in forecasting atmospheric air pollutants.

2. Methodology
This paper's research and design methodology develops and tests the pollutant
forecasting accuracy of a triple exponential smoothing (TES) model and an LSTM deep learning
model. The data extraction and models developed for this paper were generated in Python
unless stated otherwise. Information on the code used in this paper can be viewed in
Appendix A.

2.1. Data description and extraction


The meteorological and air pollutant data used in this paper was obtained from ground
monitoring stations located across Queensland, Australia. Queensland data was selected as it
has a variable climate; Fig. 3 shows the location distribution of the monitoring stations across
Queensland climatic zones. The data was systematically scraped from the Queensland
Government website, using a data scraping tool explicitly developed for this thesis (Appendix
A). The Queensland Government data was chosen as it is the only publicly available
meteorological and air pollutant data in Queensland (Queensland Government, 2020). Its
quality is supported by its usage: the data is currently used by major air quality indexes such as
IQAir and the World Air Quality Project (IQAir, 2021) (World Air Quality Project, 2021).

All the stations used within the experiment follow the Australian and New Zealand Standards
(AS/NZS) for collecting data; however, not every station uses the same operating equipment
for data measurements (Queensland Government, 2020).

Figure 3 – Climatic map of Queensland, showing locations of meteorological stations (Australian Building Codes Board,
2019)

The collected data contains the timestamped, hourly averages of the various meteorological
parameters and pollutants. The meteorological variables include air temperature, wind
velocity and direction, relative humidity, barometric pressure, and rainfall. The pollutant
concentrations extracted comprise the hourly averages of PM2.5, PM10, O3, NO2 and SO2.
Observations were obtained for the years 2017 – 2019. Each station’s dataset contained
approximately 5000-8000 observations per year that were used within the experiment.

The data from several monitoring stations was excluded when it was not recorded for at least
70% of the year to ensure consistency across the period. In addition, not all stations collected
every previously listed pollutant, so data was only extracted from stations that recorded at
least 2 of the following pollutants: PM2.5, O3 and NO2.

2.2. Evaluation metrics
The quantitative evaluation metrics used in this paper (including in the literature review
component) include standard methods such as root mean square error (RMSE), the coefficient of
determination (R2) from linear regression and mean absolute percentage error (MAPE). In addition, variance and
outliers were used to visually compare results. RMSE and MAPE compare the difference
between a model’s predicted values and the actual observed values. MAPE was used to cross-
compare literature forecasting errors as it normalises errors which facilitates a direct
comparison. MAPE was not used for this paper’s data due to the high frequency of zero
concentration observations.

$RMSE = \sqrt{\sum_{i=1}^{n} \frac{(x_i - \hat{x}_i)^2}{n}}$ (7)

$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{x_i - \hat{x}_i}{x_i} \right| \times 100$ (8)

where
$x_i$ is the actual observed pollutant concentration
$\hat{x}_i$ is the predicted pollutant concentration
$i$ is the index of the pollutant observation
$n$ is the number of measured pollutant observations
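
As an illustration of equations (7) and (8), the sketch below computes both metrics for a short pair of series. The function names and the sample values are hypothetical and are not taken from this paper's data. The explicit zero check reflects why MAPE was not applied to this paper's observations: the percentage error divides by the observed value.

```python
import math


def rmse(observed, predicted):
    """Root mean square error, following equation (7)."""
    n = len(observed)
    return math.sqrt(sum((x - p) ** 2 for x, p in zip(observed, predicted)) / n)


def mape(observed, predicted):
    """Mean absolute percentage error in percent, following equation (8)."""
    if any(x == 0 for x in observed):
        # Each term divides by the observed value, so zeros are not allowed.
        raise ValueError("MAPE is undefined when an observed value is zero")
    n = len(observed)
    return sum(abs((x - p) / x) for x, p in zip(observed, predicted)) / n * 100


# Hypothetical concentrations, for illustration only.
observed = [10.0, 20.0, 40.0]
predicted = [12.0, 18.0, 40.0]
rmse_value = rmse(observed, predicted)
mape_value = mape(observed, predicted)
```

For these values the squared errors are 4, 4 and 0, so the RMSE is the square root of 8/3, and the percentage errors of 20%, 10% and 0% average to a MAPE of 10%.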

2.3. Experiment methods


2.3.1. Identifying the correlation between meteorological variables and air pollutants
The interdependence of meteorological parameters on atmospheric pollutants was tested
using three datasets containing 2017 – 2019 data aggregated from tropical, subtropical, and
arid stations, respectively. Pearson correlation coefficients were generated from the data as
test statistics and presented in correlation matrix heatmaps, produced in Python, for simple data
visualisation.

The Pearson correlation coefficients were used to identify correlated atmospheric variables to
be included in a future multivariate LSTM model.
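
The correlation matrix behind such a heatmap can be sketched as follows. This is a minimal plain-Python illustration rather than the code used for this paper; the function names, the variable names and the four sample values per variable are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Pairwise Pearson coefficients for a dict of named series."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names}
            for a in names}


# Hypothetical hourly values, for illustration only.
columns = {
    "temperature": [18.0, 21.0, 25.0, 30.0],
    "humidity": [80.0, 70.0, 55.0, 40.0],
    "ozone": [10.0, 14.0, 19.0, 27.0],
}
matrix = correlation_matrix(columns)
```

Each cell of the resulting matrix corresponds to one cell of a heatmap: values near +1 or -1 mark strongly correlated pairs, which are the candidates for inputs to a multivariate model.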

2.3.2. Assessing a data-driven model’s forecasting accuracy across different climates
The data-driven model selected for this paper was an additive triple exponential smoothing
(TES) model that used Holt Winters’ seasonal method to forecast three key pollutants, PM2.5,
O3 and NO2, on a 1 – 6-hour timescale. Triple exponential smoothing forecasts using
a weighted sum of past observations, with weights that decrease exponentially as
observations get older (Hyndman & Athanasopoulos, 2018). The TES weights were assigned to the
level, trend, and seasonal components of the time series.

This paper used hourly timestamped pollutant data, using historical concentrations to predict
future concentrations after a certain time, t. The future values of pollutants m time steps
ahead, $P_{t+m}$, are modelled as

$P_{t+m} = S_t + m B_t + C_{t-L+1+((m-1) \bmod L)}$ (9)

where $S_t$ is the smoothed version of the constant part of the observation, $B_t$ is the best linear
trend estimate, L is the length of a season, and C is the series of seasonal corrections
(Hyndman & Athanasopoulos, 2018).
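
The forecast step in equation (9) can be sketched as below. The sketch covers only the final forecast, not the smoothing updates that produce the level, trend and seasonal corrections, and it is not the code used in this paper. The function name, argument names and sample values are hypothetical; a season of three steps is used purely to keep the example short.

```python
def tes_forecast(level, trend, seasonals, m):
    """Additive Holt-Winters forecast m steps ahead, following equation (9).

    level     -- S_t, the smoothed level at time t
    trend     -- B_t, the linear trend estimate at time t
    seasonals -- the last L seasonal corrections [C_{t-L+1}, ..., C_t]
    m         -- lead time in time steps, m >= 1
    """
    if m < 1:
        raise ValueError("lead time m must be at least 1")
    season_length = len(seasonals)
    # C_{t-L+1+((m-1) mod L)} sits (m-1) mod L places into the last season.
    return level + m * trend + seasonals[(m - 1) % season_length]


# Hypothetical state with a season of L = 3 steps, for illustration only.
forecasts = [tes_forecast(10.0, 0.5, [1.0, -1.0, 2.0], m) for m in range(1, 7)]
```

The modulo term reuses the most recent season's corrections once the lead time exceeds the season length, which is why the fourth forecast picks up the same seasonal correction as the first.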

In addition to assessing the TES model’s performance at forecasting PM2.5, O3 and NO2, its
performance will be benchmarked against two standard benchmarking models: the
persistence and the average models.

The persistence model forecasts future values by assuming that the last observation value
remains constant, i.e., simulating a random walk model. This persistence benchmark is
modelled as:

𝑃𝑡+𝑚 = 𝑃𝑡 (10)

The average model forecasts future values by assuming that future values are the same as a
historical average and is modelled as:

$P_{t+m} = \frac{1}{t} \sum_{k=1}^{t} P_k$ (11)
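
Both benchmarks in equations (10) and (11) can be written in a few lines. The sketch below is illustrative; the function names and the sample history are hypothetical. The lead time m is accepted only to keep the same interface as a real forecasting model, since neither benchmark depends on it.

```python
def persistence_forecast(history, m):
    """Equation (10): the forecast at any lead time m is the last observation."""
    return history[-1]


def average_forecast(history, m):
    """Equation (11): the forecast at any lead time m is the historical mean."""
    return sum(history) / len(history)


# Hypothetical hourly concentrations, for illustration only.
history = [4.0, 6.0, 8.0, 10.0]
persistence_value = persistence_forecast(history, m=3)
average_value = average_forecast(history, m=3)
```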

When generating the TES model and the two benchmarking models, 1000 hours of historical
data was used for ‘training’ so that the seasonality component of the model was suitably
adjusted for, and the model was tested in 30 individual experiments to reduce sampling bias.
Pollutant concentrations were forecasted for lead times of 1 – 6 hours. The three models’
performance was compared using aggregated datasets from weather stations and assessed
quantitatively using RMSE values.

The performance of the TES model in different climatic zones (tropical, subtropical, arid)
was compared by forecasting pollutants using the three individual datasets. The performance
was assessed using both RMSE values and R2 values from linear regression. RMSE values
were again generated for lead times of 1- 6 hours using a 2017 – 2019 historical data range.
The model was trained with 1000 hours of historical data to incorporate seasonality and tested in 20
individual experiments. R2 values were generated using the same method with 50 individual
experiments.

2.3.3. Assessing an LSTM model’s forecasting accuracy


An LSTM deep neural network model was trained and tested using three years of aggregated
data from the Rocklea weather station (subtropical climate) for PM2.5 and ozone. The model
was tasked with forecasting concentrations at 1 – 8 hours, given the past 72 hours of data.
The model was built using two 32-unit LSTM hidden layers using tanh activation and two
feedforward layers, one with 1 unit and linear activation and one with 16 units and rectified
linear unit activation. The LSTM model in this paper has been developed initially as a
univariate model, with the only variable input as observed pollutant concentrations. The
development configuration of the LSTM model and layer structure can be seen in Fig. 4.
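
The forecasting task described above, predicting the concentration a given number of hours ahead from the past 72 hours, implies that the series is cut into input windows and matching targets before training. A minimal sketch of that step follows. It is an assumption about how such windows could be built, not the code used in this paper, and the function name and sample series are hypothetical.

```python
def make_windows(series, n_input=72, lead=1):
    """Pair each run of n_input hourly values with the value `lead` hours later.

    Returns (inputs, targets), where targets[k] is the observation `lead`
    hours after the last value of inputs[k].
    """
    inputs, targets = [], []
    for start in range(len(series) - n_input - lead + 1):
        end = start + n_input          # index one past the window's last value
        inputs.append(series[start:end])
        targets.append(series[end + lead - 1])
    return inputs, targets


# Hypothetical series of 100 hourly values, for illustration only.
series = [float(v) for v in range(100)]
inputs, targets = make_windows(series, n_input=72, lead=8)
```

Longer lead times leave fewer usable windows: with 100 hourly values, a 72-hour input and an 8-hour lead time, only 21 input and target pairs remain.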

Figure 4 - LSTM model development configuration


To develop the model, the raw time series data (Fig. 5) was broken into data for training the
model and data for testing the model using a 4:1 training to testing ratio. The frequency
distribution of the training data (Fig. 6) was normalised using a Box-Cox transformation
(Fig. 7).
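
The two preparation steps can be sketched as follows. This is a hedged illustration rather than this paper's code. It assumes the 4:1 split is chronological, with the earliest 80% of observations used for training, and it uses a hypothetical transformation parameter of 0.5; the value of the parameter and the handling of zero concentrations are not stated here. The Box-Cox transformation itself is only defined for strictly positive values.

```python
import math


def chronological_split(series, train_fraction=0.8):
    """Split a series into earlier training data and later testing data (4:1)."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def box_cox(x, lam):
    """Box-Cox transform of a strictly positive value x with parameter lam."""
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Map a transformed value back to the original concentration scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1.0) ** (1.0 / lam)


# Hypothetical positive concentrations and parameter, for illustration only.
series = [float(v) for v in range(1, 11)]
train, test = chronological_split(series)
lam = 0.5
transformed = [box_cox(x, lam) for x in train]
restored = [inverse_box_cox(y, lam) for y in transformed]
```

The inverse transform matters in practice because forecasts made on the transformed scale need to be mapped back to concentrations before error metrics such as RMSE are computed.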

Figure 5 - Observed PM2.5 values over a period of 3 years split into training and testing data

Figure 6 - Frequency of PM2.5 concentrations over the training period

Figure 7 - Normalised (Box-Cox) frequency data over the training period

The LSTM model was trained using the Adam optimiser with the default settings (Table 2).
The learning rate was minimised using the design function in Table 2, and the loss vs the
learning rate can be seen in Fig. 8 (Jain, et al., 2020). The learning rate, 𝜂, was minimised to
allow the model to learn a more optimal set of weights at the cost of additional training time.
The learning rate function was obtained through observing multiple experiments while trying
to minimise loss. Instead of using MSE or RMSE, Huber loss was used as it has been
observed to perform significantly better in the literature (Jain, et al., 2020) (Friedman, 2001).

Table 2 - LSTM model parameters and configuration

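Library – TensorFlow 2.5
Huber Loss – ($\delta = 1$)
Optimizer – Adam ($\eta$, $\beta_1 = 0.9$; $\beta_2 = 0.999$; $\epsilon = 10^{-7}$)
Batch Size – 128
Number of epochs – 180
Learning rate – $\eta = \begin{cases} 10^{-4} \times 10^{epoch/20} & \text{if } \eta < 10^{-2} \\ 10^{-2} & \text{otherwise} \end{cases}$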
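
Two entries of Table 2 can be illustrated in plain Python. The sketch below is not this paper's training code. It reads the learning rate entry as 10^-4 × 10^(epoch/20), capped at 10^-2, which is an interpretation of the table, and it uses the standard per-error definition of the Huber loss with δ = 1. The function and argument names are hypothetical.

```python
def learning_rate(epoch, base=1e-4, cap=1e-2, epochs_per_decade=20):
    """Learning rate for a given epoch under the assumed reading of Table 2."""
    # Grows tenfold every `epochs_per_decade` epochs until it reaches the cap.
    return min(base * 10 ** (epoch / epochs_per_decade), cap)


def huber_loss(error, delta=1.0):
    """Huber loss for one prediction error: quadratic near zero, linear beyond delta."""
    magnitude = abs(error)
    if magnitude <= delta:
        return 0.5 * error ** 2
    return delta * (magnitude - 0.5 * delta)


# Learning rate at a few epochs of the 180-epoch run.
rates = [learning_rate(epoch) for epoch in (0, 20, 40, 180)]
# Loss for a small and a large prediction error.
small, large = huber_loss(0.5), huber_loss(3.0)
```

Under this reading the learning rate reaches its cap at epoch 40 and stays there for the remaining epochs. The linear branch of the Huber loss penalises large errors less heavily than a squared error would, which makes training less sensitive to concentration spikes.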

Figure 8 - a) Learning rate vs loss curve b) Loss vs number of epoch graph

The LSTM model was trained for 180 epochs with a batch size of 128. While it seems
visually apparent that the loss could be reduced further with more epochs (Fig. 8 shows the loss
curve), further epochs were not added due to the increasing computational time of the model.

The forecasting accuracy of the LSTM model was assessed using RMSE values, comparing
against the Holt Winters’ TES model and two benchmarking models: the persistence model,
and the average model. In addition, 48 hours of results were qualitatively assessed for
predictions at 1, 4 and 8 hours to compare the predictive capabilities of the LSTM model for
PM2.5 against ozone and its ability to map concentration peaks.

2.3.4. Comparing forecasting accuracy from a literature review


The data from relevant literature case studies that used LSTM to forecast particulate matter
concentrations were gathered using the methodology outlined in Section 2.4.

The gathered literature data was split based on the error metric used to analyse its accuracy
i.e., RMSE, MAPE. The split data was organised in Excel and examined qualitatively to
assess the industry standard and compare the industry standard against this paper’s
forecasting results.

2.4. Literature review assessment method
The literature review conducted within this paper provides a background on forecasting
models and evidence of their accuracy to predict atmospheric pollutants. A systematic
approach was taken to review prior pollutant forecasting literature based on relevance and
quality.

The literature utilised within the paper focuses on the previous 20 years (2000-2021); older
studies were excluded due to their limited relevance or outdated modelling unless they specifically
stated that they used additive triple exponential smoothing or LSTM. All information
pertaining to predictive modelling was reviewed, as the data sources for LSTM pollutant
forecasting are limited due to their novel nature. Only literature in English was reviewed,
though there was no exclusion of foreign data. The types of literature reviewed were
primarily composed of peer-reviewed journal papers, as this is where most forecasting case
studies are documented.

The quality constraints put on the literature excluded case studies with no transparent
forecasting methodologies or that did not explicitly state their data sources. No literature was
reviewed that included controversial opinions or views.

As LSTM atmospheric pollutant forecasting is a novel concept, the available literature was
limited, so the search terms used in electronic databases such as Google Scholar, OneSearch
and Scopus were broad. Search terms included “triple exponential smoothing pollutant
forecasting”, “air pollutant forecasting”, “LSTM pollutant forecasting”, and “deep learning
air pollutant forecasting”.

Results data from the literature review (primarily forecasting accuracy error metrics such as
RMSE and MAPE for PM2.5) were extracted and consolidated into visual data for ease of
comparison with this paper’s results, to provide a high-level verification. In addition,
methodology from case studies was examined and compared to identify any unforeseen
limitations of this paper and any future prospects.

3. Results
3.1. Interdependence of meteorological parameters
Fig. 9 demonstrates the relationship between meteorological parameters and pollutant
concentrations across tropical, subtropical, and arid climates based on Pearson correlation
coefficients. It can be observed that in all three climates, PM10 and PM2.5 are strongly
positively correlated with each other, and that wind speed and air temperature are
strongly positively correlated. In all three climates, there is a weak negative correlation
between relative humidity and both PM10 and PM2.5.
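
As an illustration of how such a correlation matrix is built, the sketch below computes pairwise Pearson coefficients in plain Python. The column names and values are hypothetical and are not taken from the Queensland datasets.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences.

    Assumes neither sequence is constant (non-zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Nested dict of Pearson coefficients for every pair of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical hourly readings.
readings = {
    "PM10": [20.0, 25.0, 30.0, 35.0],
    "PM2.5": [8.0, 10.0, 12.0, 14.0],
    "Relative humidity": [70.0, 65.0, 66.0, 60.0],
}

matrix = correlation_matrix(readings)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Each cell of a heatmap such as Fig. 9 corresponds to one entry of this matrix, with the diagonal equal to 1 by construction.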

Figure 9 - Heatmaps showing the relationships between meteorological parameters and pollutant concentrations across tropical, subtropical, and arid
climates.

In the subtropical climate, there are strong correlations between all four of the meteorological
variables (wind speed, air temperature, relative humidity, solar radiation) and the pollutants
O3 and NO2. A moderate correlation between O3 and both PM10 and PM2.5 can also be observed.
In the arid climate, similar to the subtropical climate, there are moderate correlations between
the four meteorological variables and O3 and NO2. There is also a weak correlation between
PM2.5 and O3. In the tropical climate, aside from the aforementioned shared correlations
across the three climates, there is only a weak negative correlation between NO2 and air
temperature and a weak positive correlation between SO2 and NO2.

The strong correlations observed between meteorological parameters and air pollutants in the
subtropical climate (particularly O3 and NO2) suggest that a multivariate LSTM model could
be more effective at forecasting O3 and NO2 than a univariate model.

3.2. Prediction performance of data driven models
3.2.1. Particulate matter (<2.5μm)
Fig. 10 compares the RMSE values and their ranges for the subtropical data. The figure
shows that the average model performs considerably worse than the TES and persistence
models. The TES model slightly outperformed the persistence model, particularly at lead
times of 1 - 4 hours. The persistence model performs well relative to the TES model, likely
due to the slow-changing nature of PM2.5 concentrations in the atmosphere (Zhao, et al.,
2021), aside from instances of abrupt changes in PM2.5 generating activities. This slow-
moving characteristic of PM2.5 makes random walk forecasts reasonable at low lead times. It
can also be observed that the error progressively increases with an increasing lead time for
both the TES and persistence models, as is expected with time advanced forecasting.

Figure 10 – PM2.5 RMSE comparison of the TES model benchmarked against the persistence and average models using
subtropical data

The performance of the TES model in different climatic zones (tropical, subtropical, arid)
was compared by predicting PM2.5 values using the three individual datasets. Fig. 11
compares the RMSE values of the TES model for the three climate zones. Fig. 11 shows that
the TES model performed the best using data from the tropical climate, where it consistently
showed significantly lower variance and RMSE values than the subtropical and arid data;
though this may be partially explained by its relatively lower mean concentrations. The TES
model performs better on the arid data than on the subtropical data at all lead times except
for a lead time of 4 hours, likely due to significant outliers.

Figure 11 - PM2.5 RMSE comparison of the TES model performance across tropical, subtropical and arid climates

To assess the year-on-year performance of the three models in the three climates, the same
tests were conducted using an average of 2017-2019 data, and the results can be viewed in
Table 3. The table shows that the year-on-year performance of the TES model in all three
climates has considerable variance; however, the results for the tropical data were
consistently better than those for the arid and subtropical data. In addition, the TES model
consistently outperforms the average model in all climates and at all lead times, and largely
outperforms the persistence model in all climates and at all lead times.

Table 3 – PM2.5 RMSE comparison of TES, average, and persistence model performance across tropical, subtropical, and
arid climates over 2017 - 2019. All values are in µg/m³.

                                      Lead time (hours)
Climate       Model                   1      2      3      4      5      6
Tropical      TES model               0.54   1.13   0.78   1.46   1.48   1.66
Tropical      Average model           2.28   3.24   1.69   2.14   2.84   2.99
Tropical      Persistence model       0.59   1.25   0.93   1.84   1.56   1.75
Subtropical   TES model               1.93   3.41   3.35   3.87   3.91   5.39
Subtropical   Average model           5.26   5.04   4.10   5.20   4.70   6.92
Subtropical   Persistence model       2.02   3.55   3.25   3.61   3.95   5.32
Arid          TES model               1.17   2.77   3.12   3.08   3.17   3.59
Arid          Average model           3.77   5.40   5.18   4.17   3.85   4.55
Arid          Persistence model       1.35   3.09   3.36   3.36   3.44   4.01

Fig. 12 shows the performance of the TES model in different climates using regression
analysis. It can be observed at lead times of 1 - 4 hours that the TES model can accurately
predict PM2.5 values in all climates with comparable accuracy. At lead times of 6 hours, the
TES model's performance sharply drops in the arid and tropical data, though it can still
predict PM2.5 values in the subtropical climate with moderate accuracy. At lead times of
8 hours, the TES model has poor or no capability to predict PM2.5 values in any of the three
climates.

It can also be observed that the variance of the tropical data is significantly lower than the
arid and subtropical data. Even at lead times of 6 and 8 hours, the variance of the tropical
data is very low, though its R² values are poor due to significant outliers. In contrast, while
the subtropical data outperformed the tropical data, it had a significantly higher variance at all
lead times.
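
The effect of outliers on R² noted above can be illustrated with a short sketch. It assumes the regression analysis fits an ordinary least-squares line of predicted values against observed values; the thesis does not spell out the exact procedure, so the function name, the direction of the regression, and the sample data are all hypothetical.

```python
def linear_fit_r2(observed, predicted):
    """Least-squares line of predicted on observed; returns slope, intercept, R^2."""
    n = len(observed)
    mean_x = sum(observed) / n
    mean_y = sum(predicted) / n
    sxx = sum((x - mean_x) ** 2 for x in observed)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(observed, predicted))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(observed, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in predicted)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical data: a perfect forecast, then the same forecast with one outlier.
observed = [2.0, 3.0, 4.0, 5.0, 6.0]
clean_forecast = [2.0, 3.0, 4.0, 5.0, 6.0]
outlier_forecast = [2.0, 3.0, 4.0, 5.0, 20.0]

_, _, r2_clean = linear_fit_r2(observed, clean_forecast)
_, _, r2_outlier = linear_fit_r2(observed, outlier_forecast)
print(r2_clean, r2_outlier)
```

A single badly missed point is enough to pull R² well below 1 even when every other point lies exactly on the line, which is consistent with a low-variance dataset still showing poor R² values.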

Figure 12 - The PM2.5 prediction performance of the TES model in different climates using regression analysis. Red
markers indicate the tropical climate dataset, blue markers indicate the arid climate dataset and green markers indicate
the subtropical climate dataset
3.2.2. Ozone
Fig. 13 compares the ozone RMSE values and their ranges for the subtropical data. It can be
observed that, similar to the PM2.5 forecasting, the average model for ozone performs
considerably worse than the TES and persistence models. The TES model slightly
outperformed the persistence model at lead times of 1 – 6 hours. Again, the persistence
model performs well relative to the TES model, indicating that random walk forecasting is
almost as effective as the TES model. The error progressively increases with an increasing
lead time for both the TES and persistence models, as is expected with time advanced
forecasting.

Figure 13 - Ozone RMSE comparison of the TES model benchmarked against the persistence and average models using
subtropical data

To assess the year-on-year performance of the three models in the subtropical and arid
climates, the same tests were conducted using an average of 2017-2019 data, and the results
can be viewed in Table 4 (no ozone recordings are available from the tropical monitoring
stations). Year on year, the TES and persistence models considerably outperform the average
model in both climates, and the TES model slightly outperforms the persistence model in both
climates. The near parity between the TES and persistence models’ results indicates that the
TES model only marginally outperforms random walk.

Table 4 - Ozone RMSE comparison of TES, average, and persistence model performance across subtropical and arid
climates over 2017 - 2019. All values are in ppb O3.

                                      Lead time (hours)
Climate       Model                   1      2      3      4      5      6
Subtropical   TES model               2.76   5.01   5.60   7.31   7.93   10.38
Subtropical   Average model           9.02   10.78  10.36  11.05  10.71  11.11
Subtropical   Persistence model       2.56   5.42   5.51   7.36   8.20   10.51
Arid          TES model               2.06   3.23   4.03   6.48   7.29   7.09
Arid          Average model           6.05   6.73   7.54   8.86   9.03   8.73
Arid          Persistence model       4.67   5.64   6.22   8.72   8.83   7.82

Fig. 14 shows the performance of the TES model in predicting ozone in arid and subtropical
climates using regression analysis. At lead times of 1 - 2 hours, the TES model can accurately
predict ozone values in the arid and subtropical climates with comparable accuracy. At lead
times of 4, 6 and 8 hours, the TES model has poor or no capability to accurately forecast
ozone concentrations. Both datasets also have similar variance, indicating that the TES model
performs fairly uniformly across both climates.

Figure 14 - The ozone prediction performance of the TES model in different climates using regression analysis. Blue markers indicate
the arid climate dataset and green markers indicate the subtropical climate dataset
3.2.3. Nitrogen dioxide
Fig. 15 compares the NO2 RMSE values and their ranges for the subtropical data. Similar to
the PM2.5 and O3 forecasting, the average model for NO2 performs considerably worse than
the TES and persistence models. The persistence model performed on par with the TES model
at lead times of 1 – 6 hours, indicating that the TES model has accuracy comparable to
random walk. Again,
the error progressively increases with an increasing lead time for both the TES and
persistence models, as is expected with time advanced forecasting.

Figure 15 – Nitrogen dioxide RMSE comparison of the TES model benchmarked against the persistence and average models
using subtropical data

To assess the year-on-year performance of the three models in the three climates, the same
tests were conducted using an average of 2017-2019 data, and the results can be viewed in
Table 5. Year on year, the TES and persistence models considerably outperform the average
model in the subtropical and arid climates, while the average model performs on par with
them in the tropical climate. Again, the TES model slightly outperforms the persistence
model in all three climates, though not at a significant level.

Table 5 – Nitrogen dioxide RMSE comparison of TES, average, and persistence model performance across tropical,
subtropical, and arid climates over 2017 - 2019. All values are in ppb NO2.

                                      Lead time (hours)
Climate       Model                   1      2      3      4      5      6
Tropical      TES model               1.56   1.55   1.45   1.45   1.81   2.02
Tropical      Average model           1.62   1.86   1.73   1.58   2.11   1.73
Tropical      Persistence model       1.64   1.52   1.60   1.54   1.83   2.01
Subtropical   TES model               1.97   2.49   3.51   4.09   5.60   5.54
Subtropical   Average model           5.59   4.74   5.30   5.42   6.13   6.17
Subtropical   Persistence model       2.07   2.54   3.54   4.15   5.65   5.54
Arid          TES model               0.47   0.58   0.82   1.03   1.06   1.12
Arid          Average model           0.99   1.04   1.31   1.13   1.28   1.25
Arid          Persistence model       0.55   0.67   0.92   1.05   1.09   1.20

Fig. 16 shows the performance of the TES model in predicting NO2 in
arid, tropical, and subtropical climates using regression analysis. The TES model cannot
accurately predict future NO2 concentrations in the tropical climate even at a lead time of one
hour. It performs marginally better in the arid climate, but it still can only predict NO2 values
at a lead time of 1 hour with moderate accuracy. In the subtropical climate, the TES model
forecasts NO2 with a high degree of accuracy at lead times of 1 hour and a moderate degree
of accuracy at lead times of 2 hours.

The TES model’s regression performance is considerably poorer when predicting NO2 than
when predicting PM2.5 and ozone. It has little capability at any lead time to make
accurate predictions. This may be partially explained by the very low concentrations of NO2
observations (<10 ppb NO2), especially in the tropical and arid climate data. The high number
of zero observations leaves fewer informative data points, which significantly diminishes a
model’s predictive capability.

Figure 16 - The NO2 prediction performance of the TES model in different climates using regression analysis. Red markers
indicate the tropical climate dataset, blue markers indicate the arid climate dataset and green markers indicate the
subtropical climate dataset
3.3. Prediction performance of LSTM model
3.3.1. Particulate matter (<2.5μm)
Fig. 17 compares the PM2.5 RMSE values for the LSTM, TES and benchmarking models
using the subtropical dataset. While the LSTM model has been well-trained, the test data
shows that the persistence model outperforms the LSTM model at lead times of 1 – 4 hours,
after which the LSTM model slightly outperforms the persistence model. Some of the strong
early relative performance of the persistence model may be attributed to the slow-moving
characteristic of PM2.5 in the atmosphere; however, the gap between the persistence model’s
test and training RMSE indicates that the training and testing data differ significantly.

Fig. 5 in the methodology shows the raw PM2.5 data, and from the figure, it can be observed
that the testing data has significantly more outliers and a general increasing trend compared
with the training data. In literature, this underperformance is often explained by the
overfitting of the LSTM model in training (Tripathi, 2020); however, this is unlikely to be the case
due to the lack of a trend within the data. The LSTM model is outperformed by the TES
model at lead times 1 – 5 hours.

The mediocre performance of the LSTM model using PM2.5 data from the Rocklea station
compared with the persistence model (random walk) indicates that it would likely be a poor
predictor for an early warning system at Rocklea.

[Plot: PM2.5 RMSE of LSTM & persistence model over time. Axes: PM2.5 RMSE (µg/m³) against lead time (hours). Series: LSTM test, LSTM train, Persistence test, Persistence train, Average test, TES test.]

Figure 17 - PM2.5 RMSE comparison between the LSTM, TES and benchmarking models using the subtropical dataset at
lead times of 1 - 8 hours.

Figures 18, 19, and 20 show a 48-hour snapshot of the testing performance of the LSTM and
the persistence (naïve) models at lead times of 1, 4 and 8 hours. From a qualitative
perspective, Fig. 18 shows that, with a 1-hour lead time, the LSTM model deviates
considerably from the actual data for sustained periods but still manages to map concentration
peaks. At the 4- and 8-hour lead times (Fig. 19 and 20), the LSTM model’s predictions deviate
very substantially from the actual data; the model follows the general trend but fails to map
the concentration peaks.

Figure 18 – A 48-hour snapshot of the LSTM and persistence models’ PM2.5 prediction performance using a 1-hour lead time

Figure 19 - A 48-hour snapshot of the LSTM and persistence models’ PM2.5 prediction performance using a 4-hour lead time

Figure 20 - A 48-hour snapshot of the LSTM and persistence models’ PM2.5 prediction performance using an 8-hour lead
time

3.3.2. Ozone
Fig. 21 compares the ozone RMSE values for the LSTM, TES and benchmarking models
using the subtropical (Rocklea) dataset. As with the PM2.5 forecasting, the LSTM model has
been well trained for ozone, though the ozone test performance is significantly better than the
PM2.5 test performance. The test data shows that the LSTM model significantly outperforms
the persistence model at all lead times. The strong relative performance of the persistence
model at lead times of 1 – 3 hours may be again attributed to the slow-moving characteristic
of ozone within the atmosphere. The almost identical performance of the persistence model in
the train and test set indicates that, unlike the PM2.5 data, there is no change of trend or
significant outliers between the training and testing data.

The LSTM model also substantially outperforms the statistical TES model in forecasting
ozone at all lead times. The LSTM model’s strong performance relative to the TES and
benchmarking models indicates that it may be used reliably in an early warning system at
Rocklea for ozone exceedances. The LSTM model’s greater performance in predicting ozone
compared to PM2.5 indicates that further verification testing is needed for PM2.5 datasets.

[Plot: Ozone RMSE of LSTM & persistence model over time. Axes: Ozone RMSE (ppb) against lead time (hours). Series: LSTM test, LSTM train, Persistence test, Persistence train, TES test, Average test.]

Figure 21 - Ozone RMSE comparison between the LSTM, TES and benchmarking models using the subtropical dataset at
lead times of 1 - 8 hours.

From a qualitative perspective, the testing performance of the LSTM and the persistence
models can be seen in Fig. 22, 23 and 24 at lead times of 1, 4 and 8 hours. Fig. 22 shows that
at a 1-hour lead time the LSTM model deviates very little from the actual data and
accurately maps concentration peaks. Fig. 23 and Fig. 24 show that, even when forecasting 4
and 8 hours into the future, the LSTM model deviates only slightly and still manages to map
the concentration peaks.

Figure 22 - A 48-hour snapshot of the LSTM and persistence models’ ozone prediction performance using a 1-hour lead
time

Figure 23 - A 48-hour snapshot of the LSTM and persistence models’ ozone prediction performance using a 4-hour lead
time

Figure 24 - A 48-hour snapshot of the LSTM and persistence models’ ozone prediction performance using an 8-hour lead
time

3.4. Literature review
Fig. 25 and Fig. 26 compare the PM2.5 RMSE and MAPE values over lead times of 1 – 8
hours from various case studies around East Asia that used LSTM modelling to predict PM2.5,
alongside the PM2.5 LSTM modelling results from this paper (Rocklea). Table 6 provides
further background information on the locations, sources, and specific model types used. It
should be noted that direct comparison of RMSE values between sites is not a reasonable
indicator of relative model performance due to the significant variance in mean
concentrations between sites.
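
The scale dependence of RMSE can be shown with a small sketch. Two hypothetical sites are forecast with the same 10% relative error; the site names, values, and helper functions are invented for illustration and do not come from the case studies.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the units of the data."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; undefined if any actual value is zero."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical PM2.5 observations at a low- and a high-concentration site.
low_site = [5.0, 6.0, 7.0, 8.0]
high_site = [50.0, 60.0, 70.0, 80.0]

# Both forecasts overestimate every observation by 10%.
low_forecast = [1.1 * value for value in low_site]
high_forecast = [1.1 * value for value in high_site]

print(rmse(low_site, low_forecast), rmse(high_site, high_forecast))
print(mape(low_site, low_forecast), mape(high_site, high_forecast))
```

The two forecasts are equally good in relative terms, yet the RMSE at the high-concentration site is ten times larger, while the MAPE is identical. MAPE has its own limitation: it cannot be computed when observations are zero.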

[Plot: PM2.5 RMSE of various LSTM models over time. Axes: PM2.5 concentration (µg/m³) against lead time (hours). Series: Beijing, Seoul, China (LSTM), Taiwan, Beijing-Tianjin-Hebei, Taiwan 2, China (C-LSTME), Rocklea.]

Figure 25 - Comparison of PM2.5 RMSE values for different lead times separated by case study. Table 6 provides details for
each case study

[Plot: PM2.5 MAPE of various LSTM models over time. Axes: PM2.5 concentration error (%) against lead time (hours). Series: Beijing, Beijing 2, China (LSTM), China (C-LSTME), Taiwan.]

Figure 26 - Comparison of PM2.5 MAPE values for different lead times separated by case study. Table 6 provides the details
for each case study

Table 6 - Case study details, including locations and modelling type

Legend name             Researcher                  Model type       Location
Beijing                 (Li, et al., 2017)          LSTME            12 monitoring stations in Beijing
Seoul                   (Xayasouk, et al., 2020)    LSTM             25 monitoring stations in Seoul
China                   (Wen, et al., 2019)         C-LSTME, LSTM    1233 monitoring stations in China
Taiwan                  (Chang, et al., 2020)       A-LSTM           5 regions of Taiwan
Beijing-Tianjin-Hebei   (Seng, et al., 2021)        LSTM             35 monitoring stations in the Beijing-Tianjin-Hebei locale
Taiwan 2                (Tsai, et al., 2018)        LSTM             66 monitoring stations in Taiwan
Beijing 2               (YongMing, et al., 2019)    LSTM             Shanghai

Wen et al. found that a spatiotemporal convolutional LSTM extended model (C-LSTME)
outperformed a conventional LSTM model using the same dataset, with an average
improvement of 7.1 µg/m³ (RMSE) and 7% (MAPE) (Wen, et al., 2019). This improved
performance indicates that modifying an LSTM model can provide better forecasting
accuracy. However, other modified LSTM models did not show significantly different
prediction performance from conventional LSTM models applied to the same locations in
other papers, such as an extended LSTM (LSTME) in Beijing (Li, et al., 2017) (Seng, et al.,
2021) and an aggregated LSTM (A-LSTM) in Taiwan (Chang, et al., 2020) (Tsai, et al.,
2018).

This improved performance of the C-LSTME model compared with the LSTME and A-LSTM
models is likely due to its increased complexity and use of data. The spatiotemporal
component of the C-LSTME model captures and integrates data from neighbouring stations,
such as temperature, humidity, wind speed, planetary boundary layer heights and aerosol
optical depths (Wen, et al., 2019). As these meteorological parameters have been shown
(including in this paper) to correlate with PM2.5 (Owoade, et al., 2012), their integration into
the model is expected to improve its prediction performance.

While none of the literature case studies used the same benchmarking models (average and
persistence) as this paper to outline a relative performance, most authors indicated their
results were consistently able to forecast PM2.5 on a short-term timescale more accurately
than other statistical models such as ARIMA (Wen, et al., 2019) (Chang, et al., 2020). The
papers overwhelmingly concluded that the potential use of LSTM modelling in early warning
systems was promising from their results; however, some cited a need for more focus on how
traffic (Chang, et al., 2020) and industrial emissions affect pollution rates (Xayasouk, et al.,
2020), more consistent data-recording (Wen, et al., 2019) and further testing prior to any
implementation.

4. Discussion and Conclusion
4.1. Main findings
4.1.1. Triple exponential smoothing model performance
The statistical TES model performed best in forecasting PM2.5, with the lowest RMSE values
obtained from the tropical climate dataset; however, the TES model performed best using the
subtropical dataset when assessing accuracy using regression. The TES model had a poorer
performance forecasting ozone where it performed comparably across all climates and lead
times. The TES model’s poorest performance was when forecasting NO2, which it could not
do reliably beyond a lead time of 1 hour. The poor performance in predicting NO2 can be
partially attributed to the very low concentrations and large amounts of zero concentration
recordings compared with literature case studies (Kim, et al., 2019) (Wu, et al., 2017). The
sporadic data can decrease the performance of statistical time series models.

The lack of visible seasonality within the Queensland data was likely significantly
detrimental to the TES performance compared with the baseline persistence model; literature
indicates that the Holt Winters model performs poorly without seasonality within data (Wu,
et al., 2017) and performs strongly with seasonal patterned data (Muna & Kuntoro, 2021).
Nath et al. found that, on data with defined seasonal trends, the Holt Winters model
outperformed an LSTM model when forecasting PM2.5 (Nath, et al., 2021).
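
To make the role of seasonality concrete, the following is a minimal sketch of additive triple exponential smoothing in plain Python. It uses the classical Winters update equations with a simple initialisation from the first two seasons; the smoothing parameters, season length, and sample series are hypothetical, and this is not the implementation used to produce the results in this paper.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Additive Holt-Winters forecast for `horizon` steps beyond the series."""
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialisation from the first two seasons.
    level = sum(series[:m]) / m
    trend = (sum(series[m:2 * m]) - sum(series[:m])) / (m * m)
    seasonals = [series[i] - level for i in range(m)]

    for t, value in enumerate(series):
        season = seasonals[t % m]
        previous_level = level
        # Level: deseasonalised observation blended with the previous projection.
        level = alpha * (value - season) + (1 - alpha) * (level + trend)
        # Trend: change in level blended with the previous trend.
        trend = beta * (level - previous_level) + (1 - beta) * trend
        # Seasonal: deviation from the new level blended with the old component.
        seasonals[t % m] = gamma * (value - level) + (1 - gamma) * season

    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Hypothetical series with a clean 4-step seasonal cycle and no trend.
seasonal_series = [10.0, 20.0, 30.0, 20.0] * 6
forecast = holt_winters_additive(seasonal_series, season_length=4,
                                 alpha=0.3, beta=0.1, gamma=0.2, horizon=3)
print(forecast)
```

On a strongly seasonal series the seasonal component carries most of the forecast. When the data have no stable cycle, that component contributes little and the forecast tends to behave much like a smoothed persistence forecast.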

For overall prediction performance, the statistical TES model performed comparably with the
persistence model across all climates, lead times and pollutants. While this may partially be
attributed to the slow-changing nature of pollutant concentrations in the atmosphere, which
favours a persistence model, it indicates that random walk is essentially as effective
as the TES model in forecasting pollutant concentrations. This suggests that using the
TES model for early warning systems based on Queensland monitoring data would be largely
ineffective.

4.1.2. LSTM model development and performance


Correlation between pollutants and meteorological parameters differed significantly across
different climate datasets, most notably in the tropical climate dataset, which had very limited
correlations between parameters. The arid and subtropical climate data showed correlations
more in line with literature (Table 7).

Table 7 - Correlations identified between meteorological parameters and air pollutants

Relationship identified in paper   Correlation strength (Pearson correlation coefficient)   Supporting literature
PM10 & PM2.5                       Strong positive (>0.6)                                   (Luo, et al., 2021) (Owoade, et al., 2012)
PM10 & Relative humidity           Weak negative (0.2 – 0.4)                                (Owoade, et al., 2012)
PM10 & Wind speed                  Weak positive (0.2 – 0.4)                                (Owoade, et al., 2012)
PM2.5 & Relative humidity          Weak negative (0.2 – 0.4)                                (Owoade, et al., 2012)
PM2.5 & Wind speed                 Weak positive (0.2 – 0.4)                                (Owoade, et al., 2012)
O3 & Air temperature               Strong positive (>0.6)                                   (Zoran, et al., 2020)
O3 & Relative humidity             Moderate negative (0.4 – 0.6)                            (Watcharavitoon, et al., 2013) (Zoran, et al., 2020)
O3 & Wind speed                    Strong positive (>0.6)                                   (Liu, et al., 2020)
O3 & NO2                           Moderate negative (0.4 – 0.6)                            (Venkitasamy, 2015)
NO2 & Air temperature              Moderate negative (0.4 – 0.6)                            (Watcharavitoon, et al., 2013) (Liu, et al., 2020)
NO2 & Wind speed                   Moderate negative (0.4 – 0.6)                            (Watcharavitoon, et al., 2013) (Zoran, et al., 2020)
NO2 & Relative humidity            Moderate positive (0.4 – 0.6)                            (Watcharavitoon, et al., 2013)

Table 7 lists the correlations found in this paper’s data, their direction and strength, and
supporting literature. While there is literature supporting each of the correlations detected,
some of the case studies presented conflicting correlations; for example, this paper found a
strong positive relationship between ozone and wind speed, whereas Liu et al. detected a
moderate negative correlation (Liu, et al., 2020). This suggests that correlations between
meteorological parameters and air pollutants vary considerably between sites. This is
supported by Venkitasamy’s finding that, across five Indian cities with similar climates
and urban densities, the correlation between O3 and NO2 varied between strongly negative
and strongly positive (Venkitasamy, 2015).

This inconsistency of correlations between sites, even those with similar characteristics,
indicates that correlations should be re-tested when developing multivariate LSTM models to
ensure their accuracy.

Compared with the statistical TES model, the LSTM model performed significantly better
when predicting ozone concentrations over lead times of 1 – 8 hours, and its performance in
predicting peaks indicates that it may be effectively used in a novel early warning system in
the Rocklea area. Ozone’s correlations with air temperature, relative humidity, and wind
speed at the Rocklea site (Table 7) suggest that a multivariate LSTM model using these input
parameters may provide more accurate ozone forecasting.
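
As a sketch of what a multivariate input would look like, the snippet below arranges several aligned hourly series into sliding windows of shape (samples, timesteps, features), which is the layout Keras LSTM layers expect. The window length, lead time, function name, and sample values are hypothetical; no multivariate model was trained in this paper.

```python
def make_windows(features, target, window, lead):
    """Build sliding-window inputs and lead-time targets.

    features: list of equally long series, one per input parameter.
    target:   series to forecast, aligned with the feature series.
    Returns (inputs, outputs), where inputs[i] is a `window` x n_features
    block and outputs[i] is the target value `lead` steps after the block.
    """
    inputs, outputs = [], []
    n = len(target)
    for start in range(n - window - lead + 1):
        end = start + window
        block = [[series[t] for series in features] for t in range(start, end)]
        inputs.append(block)
        outputs.append(target[end + lead - 1])
    return inputs, outputs


# Hypothetical aligned hourly readings.
ozone = [20.0, 22.0, 25.0, 30.0, 28.0, 24.0]
temperature = [18.0, 19.0, 21.0, 24.0, 23.0, 21.0]
humidity = [70.0, 68.0, 60.0, 52.0, 55.0, 61.0]
wind_speed = [2.0, 2.5, 3.0, 3.5, 3.0, 2.5]

inputs, outputs = make_windows([ozone, temperature, humidity, wind_speed],
                               ozone, window=3, lead=1)
print(len(inputs), len(inputs[0]), len(inputs[0][0]))
print(outputs)
```

A univariate model corresponds to passing only the ozone series as the single feature, so the same windowing can be reused to compare the two configurations on identical samples.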

The TES and LSTM models performed comparably when predicting PM2.5 concentrations,
though this may be attributed to the variance between the training and testing data used.
Further experiments using the LSTM model to forecast PM2.5 with datasets from different
weather stations may perform better, considering the strong ozone prediction accuracy at the
Rocklea site. At this stage, however, the LSTM model would be ineffective at accurately
forecasting PM2.5 in an early warning system in the Rocklea area.

The significantly different pollutant concentrations between this paper’s data and the
literature case studies mean that the quantitative performance of this paper’s LSTM model
was not directly comparable with those examined in the literature review (Fig. 25 & Fig. 26).
In addition to climatic factors, the accuracy of forecasting in different regions is affected by
various environmental factors, including urban density and industrial activity (Bai, et al., 2018).

This paper compared the RMSEs of several different LSTM literature case studies in a
similar geographic area with similar observed ambient PM2.5 concentrations. The most
significant observation that can be taken from comparing literature case studies is the strong
relative performance of the spatiotemporal modified C-LSTME model compared with a
traditional LSTM model (Wen, et al., 2019). The improved prediction performance after the
integration of meteorological data from nearby stations supports further research into using
multivariate LSTM models for forecasting pollutants.

4.1.3. Study limitations and reliability


While this paper focused on collecting peer-reviewed case studies, the limited supply of
applied research using LSTM models for pollutant forecasting meant a wider range of
literature was reviewed.

In addition, while the collection methods were consistent for pollutants across sites, the
recording devices differed between the ground-based weather stations. There were also very
few weather stations, and a smaller range of pollutants measured, in tropical areas
(Queensland Government, 2020). In Queensland, and Australia in general, there are
insufficient measurement stations to obtain accurate pollutant measurements throughout the
country (Royal Commission, 2020). The Australian Royal Commission into the 2019-2020
bushfires found that many areas, especially rural and remote areas, did not have access to
timely and relevant air quality information (Royal Commission, 2020). However, the Royal
Commission concluded that establishing fixed air quality monitoring stations in every town
in Australia would be inefficient due to the high establishment costs ($120 000 –
$250 000 per site) (Royal Commission, 2020).

Finally, the data resolution was limited to hourly averages; more granular data with 5- or
15-minute recording intervals could have improved the forecasting accuracy of the TES and
LSTM models.

4.1.4. Study Implications and further research


The current air quality warning systems used by health authorities in Australia primarily
relay live hourly averages to citizens. Recently, however, the government has
developed a PM2.5 forecasting model using statistical chemical transport modelling and
trajectory plume modelling (AQFx) (Australian Disaster Resilience, 2021) (DELWP, 2016).

The AQFx chemical transport and trajectory plume model was developed after the 2009
Black Saturday bushfires and is accurate for modelling smoke dispersion and aerosols from
bushfires and other independent disasters (DELWP, 2016). LSTM model forecasting will not
replace the AQFx model, but could instead provide a more accurate and cost-effective
day-to-day pollutant forecast. In any case, the results from this paper and previous literature
indicate that LSTM models (including modified or hybridised approaches) are promising for
forecasting atmospheric pollutants in early warning systems. Rapid advancement in
technological capacity and artificial intelligence is also likely to provide further
opportunities for more accurate predictions in the future.

For early warning systems to be effectively introduced by health authorities across
Queensland and Australia using LSTM or other deep learning forecasting models, better data
are needed, both in terms of data frequency resolution (i.e., 5-minute recordings instead
of hourly) and a greater range of weather monitoring stations. Even though the Royal
Commission was dismissive of installing additional fixed air quality monitoring stations,
state and federal governments need to reassess their current operational procedures before
introducing any accurate early warning systems based on modelling (Royal Commission,
2020) (Kelly, et al., 2012).

One potential method to increase data frequency resolution and achieve broader geographical
coverage without changing practices or creating more weather monitoring stations is to use
satellite-based data. The Copernicus Atmosphere Monitoring Service (CAMS) is a
satellite-based data provider that monitors a wide range of meteorological parameters and
pollutants such as ozone, nitrogen dioxide, sulphur dioxide, carbon monoxide, methane and
aerosols (particulate matter) (CAMS, 2021). CAMS freely provides atmospheric pollutant
concentrations and allows data to be isolated for highly specific
regions, including areas without ground-based monitoring stations (CAMS, 2021). Further
research is needed to compare the accuracy of satellite and ground-based readings,
including the effectiveness of forecast modelling using both datasets. This research may
remove the need for further installation of ground-based monitoring stations.

The significant correlations between meteorological parameters and air pollutants identified
in this paper, together with the improved forecasting performance of a C-LSTME model
reported in the literature, indicate that further research should examine the prediction
performance of multivariate LSTM models using spatiotemporal data.
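
A multivariate LSTM receives a vector of features at each time step rather than a single pollutant value. As a minimal sketch of the data preparation this implies (not the author's implementation), hourly records can be sliced into overlapping windows, with the pollutant value a chosen number of hours ahead as the target. The function name `make_windows`, the feature columns, the window length and the sample values below are all hypothetical; readings from neighbouring stations could be appended as further columns to add a spatial component.

```python
def make_windows(records, target_index, n_steps, horizon):
    """Slice a multivariate hourly series into (window, target) pairs.

    records: list of per-hour feature lists, e.g. [o3, no2, temperature].
    target_index: position of the pollutant to forecast in each record.
    n_steps: number of past hours in each input window.
    horizon: lead time in hours between the window's last hour and the target.
    """
    inputs, targets = [], []
    last_start = len(records) - n_steps - horizon
    for start in range(last_start + 1):
        end = start + n_steps
        inputs.append(records[start:end])
        targets.append(records[end + horizon - 1][target_index])
    return inputs, targets


# Hypothetical hourly records: [O3 (ppb), NO2 (ppb), temperature (deg C)].
hourly = [
    [12.0, 8.0, 18.5],
    [15.0, 7.5, 20.1],
    [19.0, 6.9, 22.4],
    [24.0, 6.0, 24.0],
    [27.0, 5.4, 25.2],
    [25.0, 5.8, 24.7],
]

# Three past hours of all features are used to predict O3 one hour ahead.
X, y = make_windows(hourly, target_index=0, n_steps=3, horizon=1)
print(len(X), len(X[0]), len(X[0][0]))  # samples, time steps, features
print(y)
```

Each element of `X` has a (time steps, features) layout, so a batch of windows has the (samples, time steps, features) shape that Keras LSTM layers expect as input.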

Further research and publications on LSTM pollutant forecasting are helping to provide a
scientific foundation for health authorities to develop early warning systems to protect
citizens. To further this goal and achieve the aim of this thesis, discussion papers developed
from this thesis have already been submitted to and accepted at two conferences as:

• N. Danesi, M. Jain, Y. H. Lee, and S. Dev, 2021. Monitoring Atmospheric Pollutants
from Ground-based Observations, Proc. IEEE AP-S Symposium on Antennas and
Propagation and USNC-URSI Radio Science Meeting, 2021.
• N. Danesi, M. Jain, Y. H. Lee, and S. Dev, 2021. Predicting Ground-based PM2.5
Concentration in Queensland, Australia, Proc. PIERS Symposium on Photonics and
Electromagnetics Research Meeting, 2021.

In addition, a manuscript is being prepared for submission to the journal Environmental
Pollution for publication as:

• N. Danesi, M. Jain, Y. H. Lee, and S. Dev, 2021. Short-term forecasting of
atmospheric air pollutants using Long Short-Term Memory networks. Environmental
Pollution.

4.2. Conclusion
This paper described statistical TES and univariate LSTM approaches for forecasting PM2.5,
O3 and NO2 concentrations using hourly-captured historical data from ground-based
observation stations in different Queensland climate zones. The results indicated that for all
climates and pollutants, predictions made with the TES model did not significantly
outperform the benchmarking random walk model. In addition, when forecasting PM2.5, the
LSTM model did not significantly outperform the benchmarking random walk model;
however, when forecasting ozone, the LSTM model had significantly lower RMSE values at
all lead times, averaging roughly a 32% reduction in RMSE compared to the persistence model.
The LSTM model’s strong ozone prediction performance relative to the TES and
benchmarking models, and its ability to map concentration peaks, indicate that it has
potential to be used reliably in an early warning system at the Rocklea station for ozone exceedances.
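
The relative improvement over the persistence benchmark can be checked directly from the ozone RMSE values reported in the abstract. The following is a minimal sketch of that arithmetic; the function name `relative_reduction` and the variable names are illustrative and are not taken from the author's repository.

```python
# Ozone RMSE values (ppb O3) as reported in the abstract,
# one value per reported lead time.
lstm_rmse = [3.92, 5.65, 7.22]
persistence_rmse = [5.74, 8.40, 10.73]


def relative_reduction(model_rmse, baseline_rmse):
    """Return the fractional RMSE reduction of a model relative to a baseline."""
    return 1.0 - model_rmse / baseline_rmse


# Reduction at each lead time, then the mean across lead times.
reductions = [
    relative_reduction(m, b) for m, b in zip(lstm_rmse, persistence_rmse)
]
mean_reduction = sum(reductions) / len(reductions)

for r in reductions:
    print(f"{r:.1%}")
print(f"mean: {mean_reduction:.1%}")
```

The same function can be applied to the TES and average-model RMSE values to compare the LSTM model against the other benchmarks.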

Based on the strong performance of the LSTM model in forecasting ozone relative to
forecasting PM2.5, the model should be retested for PM2.5 using different datasets to verify the
results. Further research needs to be conducted to compare the accuracy of satellite and
ground-based pollutant observations to investigate if there is a need to install additional
ground-based monitoring stations. In addition, this paper identified several significant
correlations between atmospheric pollutants and meteorological parameters, establishing a
basis for the future development of a multivariate LSTM model to improve forecasting
accuracy.

5. Appendices
Appendix A
All the organised code for the paper can be found at the author’s GitHub repository:
https://github.com/ndanesi/pollutantforecasting

6. Bibliography
Australian Building Codes Board, 2019. Climate Zone Map Queensland, Brisbane: Bureau of
Meteorology.

Australian Disaster Resilience, 2021. The New South Wales air quality alert system: a brief
history. Australian Journal of Emergency Management, pp. 21-24.

Bai, L., Wang, J., Ma, X. & Lu, H., 2018. Air Pollution Forecasts: An Overview.
International Journal of Environmental Research and Public Health, 15(4).

Bell, M., Davis, D. & Fletcher, T., 2004. A retrospective assessment of mortality from the
London smog episode of 1952: the role of influenza and pollution. Environmental Health
Perspectives, 112(1), pp. 6-8.

Boissoneault, L., 2018. The Deadly Donora Smog of 1948 Spurred Environmental
Protection—But Have We Forgotten the Lesson?, Washington D.C.: Smithsonian.

CAMS, 2021. Air Quality information. [Online]
Available at: https://atmosphere.copernicus.eu/
[Accessed 16 05 2021].

Chang, Y.-S. et al., 2020. An LSTM-based aggregated model for air pollution forecasting.
Atmospheric Pollution Research, 11(8), pp. 1451-1463.

Dean, A. & Green, D., 2017. Climate Change, Air Pollution and Health in Australia, Sydney:
UNSW.

DELWP, 2016. Smoke Emission and Transport Modelling, Melbourne: Department of
Environment, Land, Water and Planning.

Friedman, J., 2001. Greedy function approximation: A gradient boosting machine. Ann.
Statistics, 29(5), pp. 1189-1232.

Hochreiter, S. & Schmidhuber, J., 1997. Long Short-Term Memory. Neural Computation,
9(8), pp. 1735-1780.

Hossen, T. et al., 2017. Short-term load forecasting using deep neural networks (DNN). s.l.,
North American Power Symposium (NAPS).

Hyndman, R. & Athanasopoulos, G., 2018. Forecasting: Principles and Practice. 2nd ed.
s.l.:OTexts.

IQAir, 2021. World Air Quality. [Online]
Available at: https://www.iqair.com/au/
[Accessed 10 10 2020].

Jain, M., Manandhar, S., Lee, Y. H. & Dev, S., 2020. Forecasting Precipitable Water Vapor
Using LSTMs. Montreal, IEEE.

Kaya, K. & Öğüdücü, S., 2020. Deep Flexible Sequential (DFS) Model for Air Pollution
Forecasting. Sci. Rep., 10(3346).

Kelly, F., Fuller, G., Walton, H. & Fussell, J., 2012. Monitoring air pollution: Use of early
warning systems for public health. Official Journal of the Asian Pacific Society of
Respirology, Volume 17, pp. 7-19.

Kim, H. et al., 2019. Development of daily PM10 and PM2.5 prediction system using a deep
long short-term memory neural network model. Atmos. Chem. Phys.

Liu, B., Guo, X. & Lai, M. W. Q., 2020. Air Pollutant Concentration Forecasting Using Long
Short-Term Memory Based on Wavelet Transform and Information Gain: A Case Study of
Beijing. Computational Intelligence and Neuroscience.

Liu, Y., Zhou, Y. & Lu, J., 2020. Exploring the relationship between air pollution and
meteorological conditions in China under environmental governance. Sci. Rep., 10(14518).

Li, X. et al., 2017. Long short-term memory neural network for air pollutant concentration
predictions: Method development and evaluation. Environmental Pollution, pp. 997-1004.

Luo, H., Zhou, W., Jiskani, I. & Wang, Z., 2021. Analyzing Characteristics of Particulate
Matter Pollution in Open-Pit Coal Mines: Implications for Green Mining. Energies, Volume
14.

Manisalidis, I., Stavropoulou, E., Stavropoulos, A. & Bezirtzoglou, E., 2020.
Environmental and Health Impacts of Air Pollution: A Review. Frontiers in Public Health,
8(14).

Muna, S. & Kuntoro, 2021. Application of the Holt-Winters Exponential Smoothing
Method on the Air Pollution Standard Index in Surabaya. Jurnal Biometrika dan
Kependudukan, 10(1), pp. 53-60.

Nath, P., Saha, P., Middya, A. I. & Roy, S., 2021. Long-term time-series pollution forecast
using statistical and deep learning methods. Neural Computing and Applications.

Owoade, O., Olise, F., Ogundele, L., Fawole, O. & Olaniyi, H., 2012. Correlation
Between Particulate Matter Concentrations and Meteorological Parameters at a Site in
Ile-Ife, Nigeria. Ife Journal of Science, 14(1).

Queensland Government, 2020. Air Quality Monitoring data, Brisbane: Queensland
Government.

Rao, S., Devi, G. & Ramesh, N., 2019. Air Quality Prediction in Visakhapatnam with LSTM
based Recurrent Neural Networks. Intelligent Systems and Applications, Volume 2,
pp. 18-24.

Royal Commission, 2020. Royal Commission into National Natural Disaster Arrangements:
Natural disasters and poor air quality, Canberra: Australian Government.

Seng, D. et al., 2021. Spatiotemporal prediction of air quality based on LSTM neural
network. Alexandria Engineering Journal, 60(2), pp. 2021-2032.

Tripathi, M., 2020. Underfitting and Overfitting in Machine Learning, Manchester:
Datascience Foundation.

Tsai, Y.-T., Zeng, Y.-R. & Chang, Y.-S., 2018. Air Pollution Forecasting Using RNN with
LSTM. s.l., IEEE.

Venkitasamy, S., 2015. Relationship between ozone with nitrogen dioxide and climatic
impacts over major cities in India. Sustainable Environment, Volume 25, pp. 295-304.

Watcharavitoon, P., Chio, C.-P. & Chan, C.-C., 2013. Temporal and Spatial Variations in
Ambient Air Quality during 1996–2009 in Bangkok, Thailand. Aerosol and Air Quality
Research, Volume 13, pp. 1741-1754.

Wen, C. et al., 2019. A novel spatiotemporal convolutional long short-term neural network
for air pollution prediction. Science of The Total Environment, pp. 1091-1099.

WHO, 2014. Air quality deteriorating in many of the world’s cities, Geneva: World Health
Organisation.

World Air Quality Project, 2021. World's Air Pollution: Real-time Air Quality Index.
[Online]
Available at: https://waqi.info/
[Accessed 10 10 2020].

Wu, L. et al., 2017. Using grey Holt–Winters model to predict the air quality index for cities
in China. Natural Hazards, pp. 1003-1012.

Xayasouk, T., Lee, H. & Lee, G., 2020. Air Pollution Prediction Using Long Short-Term
Memory (LSTM) and Deep Autoencoder (DAE) Models. Sustainability, 12(6), p. 2570.

YongMing, P., YaJie, W. & MingZhao, L., 2019. Research of Air Pollutant Concentration
Forecasting Based on Deep Learning Algorithms. Tianjin, IOP Conference Series: Earth and
Environmental Science.

Zhao, C., Pan, J. & Zhang, L., 2021. Spatio-Temporal Patterns of Global Population
Exposure Risk of PM2.5 from 2000-2016. Sustainability, Volume 13, p. 7427.

Zoran, M., Savastru, R., Savastru, D. & Tautan, M., 2020. Assessing the relationship between
ground levels of ozone (O3) and nitrogen dioxide (NO2) with coronavirus (COVID-19) in
Milan, Italy. The Science of the total environment, Volume 740.
