
Environment International 183 (2024) 108430

Contents lists available at ScienceDirect

Environment International
journal homepage: www.elsevier.com/locate/envint

Review article

A comprehensive review of the development of land use regression approaches for modeling spatiotemporal variations of ambient air pollution: A perspective from 2011 to 2023
Xuying Ma a, b, c, *, Bin Zou d, *, Jun Deng b, e, Jay Gao f, Ian Longley g, Shun Xiao h, Bin Guo a,
Yarui Wu a, Tingting Xu i, Xin Xu j, Xiaosha Yang k, Xiaoqi Wang a, Zelei Tan a, Yifan Wang a,
Lidia Morawska c, *, Jennifer Salmond f
a College of Geomatics, Xi’an University of Science and Technology, Xi’an 710054, China
b College of Safety Science and Engineering, Xi’an University of Science and Technology, Xi’an 710054, China
c International Laboratory for Air Quality and Health, Queensland University of Technology, Brisbane, Queensland 4000, Australia
d School of Geosciences and Info-Physics, Central South University, Changsha, Hunan 410083, China
e Shaanxi Key Laboratory of Prevention and Control of Coal Fire, Xi’an University of Science and Technology, Xi’an 710054, China
f School of Environment, Faculty of Science, University of Auckland, Auckland 1010, New Zealand
g National Institute of Water and Atmospheric Research, Auckland 1010, New Zealand
h School of Geography and Tourism, Shaanxi Normal University, Xi’an 710119, China
i School of Software Engineering, Chongqing University of Post and Telecommunications, Chongqing 400065, China
j Xi’an Institute for Innovative Earth Environment Research, Xi’an 710061, China
k Shandong Nova Fitness Co., Ltd., Baoji, Shaanxi 722404, China

ARTICLE INFO

Keywords: Air pollution; Land use regression; Multi-source observations; Spatiotemporal modeling; Linear regression; Advanced statistical methods

ABSTRACT

Land use regression (LUR) models are widely used in epidemiological and environmental studies to estimate humans’ exposure to air pollution within urban areas. However, the early models, developed using linear regressions and data from fixed monitoring stations and passive sampling, were primarily designed to model traditional and criteria air pollutants and had limitations in capturing high-resolution spatiotemporal variations of air pollution. Over the past decade, there has been a notable development of multi-source observations from low-cost monitors, mobile monitoring, and satellites, in conjunction with the integration of advanced statistical methods and spatially and temporally dynamic predictors, which have facilitated significant expansion and advancement of LUR approaches. This paper reviews and synthesizes the recent advances in LUR approaches from the perspectives of the changes in air quality data acquisition, novel predictor variables, advances in model-developing approaches, improvements in validation methods, model transferability, and modeling software as reported in 155 LUR studies published between 2011 and 2023. We demonstrate that these developments have enabled LUR models to be developed for larger study areas and encompass a wider range of criteria and unregulated air pollutants. LUR models in the conventional spatial structure have been complemented by more complex spatiotemporal structures. Compared with linear models, advanced statistical methods yield better predictions when handling data with complex relationships and interactions. Finally, this study explores new developments, identifies potential pathways for further breakthroughs in LUR methodologies, and proposes future research directions. In this context, LUR approaches have the potential to make a significant contribution to future efforts to model the patterns of long- and short-term exposure of urban populations to air pollution.

1. Introduction

Numerous epidemiological studies have conclusively established the link between exposure to air pollution and detrimental health outcomes (Hao et al., 2007; Miller et al., 2007; Cesaroni et al., 2013; Liu and He, 2016; Huang et al., 2018; Lim et al., 2018; Renzi et al., 2018; Ma et al., 2020c; Owusu and Sarkodie, 2020; Zou et al., 2020; Guo et al., 2021a; Li et al., 2021; Bai et al., 2022a). Quantifying exposure to air pollution is of

* Corresponding authors.
E-mail addresses: xma295@aucklanduni.ac.nz (X. Ma), 210010@csu.edu.cn (B. Zou), l.morawska@qut.edu.au (L. Morawska).

https://doi.org/10.1016/j.envint.2024.108430
Received 3 September 2023; Received in revised form 26 November 2023; Accepted 4 January 2024
Available online 7 January 2024
0160-4120/© 2024 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

great importance in the development of effective mitigation policies (Ma et al., 2020b). Approaches for air pollution modeling include deterministic modeling (dispersion and chemical transport modeling), remote sensing inversion, interpolation methods, and land use regression (LUR) models (Jerrett et al., 2005; Oyjinda and Pochai, 2017; Ma et al., 2019; Ma et al., 2022a; Bai et al., 2023; Li et al., 2023). LUR approaches (Fig. S1 illustrates a general LUR modeling framework) were first introduced by Briggs et al. in 1997 and have since gained widespread popularity for modeling ambient air quality in many epidemiological studies such as the SAVIAH (Small Area Variations In Air Quality and Health) study and ESCAPE (European Study of Cohorts for Air Pollution Effects) project in Europe, the MESA Air (Multi-Ethnic Study of Atherosclerosis and Air Pollution) and ACT-AP (Adult Changes in Thought – Air Pollution) studies in the US (Briggs et al., 1997; Beelen et al., 2007; Hoek et al., 2008; Sampson et al., 2011; de Hoogh et al., 2013; Wang et al., 2013a; van Nunen et al., 2017; Bi et al., 2022). In comparison with alternative approaches, LUR modeling stands out as a cost-effective and approachable method (Allen et al., 2011; Song et al., 2019).

Since its advent, LUR modeling has undergone rapid development, with the concept being extended over time. From a narrow perspective, the original format of an LUR model was developed using multiple linear regressions with temporally averaged observations from routine stations or passive sampling and traditional land-use-related predictor variables, and its main purpose was to capture long-term spatial variations of air pollution. Therefore, there were limitations in capturing local-scale spatial patterns of air pollution with high temporal resolution. Furthermore, although moderate to good performance was achieved in modeling traditional and criteria air pollutants (e.g., fine particulate matter [PM2.5], nitrogen dioxide [NO2], black carbon [BC]), modeling of some unregulated air pollutants (e.g., non-tailpipe particles, secondary organic aerosols) was less effective (Liu et al., 2022). Over the last decade, with the escalating demands of epidemiological research, LUR approaches have been further developed and expanded to model the spatiotemporal variations of a wider range of air pollutants. Therefore, in a broader and more recent context, an LUR model can be described as an empirical or statistical model developed within the original or extended modeling framework. These models use multi-source observations (data obtained from either fixed monitoring stations [FMSs], passive diffusion tubes, low-cost monitors, mobile monitoring, or satellites or a combination of them) of air pollution and are developed by advanced statistical techniques incorporating a wide range of spatially and temporally variable predictors. They can estimate spatial and temporal variations of various air pollutants, making them suitable for both long-term and short-term exposure studies. In the context of our review, the term “LUR” refers to this comprehensive definition.

The development of LUR approaches has been reviewed by many scholars. The three older reviews (Jerrett et al., 2005; Ryan and LeMasters, 2007; Hoek et al., 2008) only covered 36 early LUR studies. However, hundreds of new studies have emerged since then, reporting advances in various aspects of LUR approaches, including changes in air quality data acquisition, availability of novel predictor variables, model development, and model validation techniques, among others. The three recent studies by Amini et al. (2017), Ma et al. (2020a), and He et al. (2018) only reported on the application of LUR for volatile organic compounds (VOCs) and ozone (O3) worldwide and the criteria air pollutants (PM2.5, PM10, and NO2) solely in China, while their wider applications for other unregulated air pollutants were not discussed. Additionally, the review by Rybarczyk and Zalakeviciute (2018) emphasized the introduction of model construction algorithms, neglecting the development of other key aspects within LUR approaches. Therefore, to the best of our knowledge, there is currently no comprehensive review available that effectively summarizes and synthesizes the recent advances in LUR modeling of ambient air pollution in the above-mentioned context.

The aim of this paper is to provide a comprehensive review and synthesis of the recent advances in LUR approaches for modeling the spatiotemporal variations of ambient air pollution based on publications from 2011 to 2023. After giving an overview of the identified studies, the review was structured according to the modeling framework of LUR. Some application tools and software that can automate LUR approaches were also recommended. Lastly, new developments and future research directions in LUR studies were also discussed.

2. Methods

2.1. Literature search

Extensive literature searches in Google Scholar, ScienceDirect, and Web of Science were carried out to identify a wide range of studies that use LUR approaches. Search strings in various combinations used in the search were: “air pollution modeling”, “air pollution mapping”, “land use regression”, “exposure modeling”, “spatial variations”, “spatiotemporal variations”, “machine learning”, and “statistical modeling”. The final search was implemented on 20 May 2023. Only fully published online papers in English between 2011 and 2023 were included in this review. This was supplemented by papers listed in the references of the traced papers and relevant papers that were acquired through contact with the authors of the traced papers. Thirteen papers suggested by the reviewers were also included.

2.2. Eligibility criteria

Articles included in our review were required to meet at least one of the following criteria: (1) the authors explicitly identified the developed models in their studies as LUR models; (2) the studies developed statistical or hybrid models for estimating spatial or spatiotemporal variations of air pollution following an LUR framework in a general view, even if the authors did not explicitly label the models as LUR models; (3) the focus of the studies was on exploring factors influencing the performance of LUR modeling; and (4) the studies included the use of tools and software designed to automate the implementation of LUR modeling. In cases where multiple studies reported a similar approach, we selected and included only the earliest one in our review. Fig. 1 shows a roadmap illustrating the strategy used to perform the literature search. It is noted that despite the systematic search, we do not claim this review to be exhaustive or to cover all the relevant papers.

3. Overview of the identified studies

3.1. Identified studies

We identified 155 relevant research articles. Among them, 137 were case-specific studies for LUR modeling of different air pollutants (Table S1 in the Supplementary Information [SI]); 5 studies investigated the effects of different factors (e.g., spatial extent, sampling design) on the performance of LUR modeling; 7 studies (refer to Section 8) explored the transferability of developed models, and 6 studies (refer to Section 9) introduced LUR modeling tools and software.

3.2. Study locations

Fig. 2 shows the study locations of the reviewed studies. Most were carried out in Western Europe, the USA, and China at the city or regional scale. Larkin et al. (2017) even developed a global scale model using data from 5,220 sites in 58 countries across the world. The trend towards large study area models on national, continental, and even global scales allows the implementation of multi-center epidemiological studies. An increasing number of studies (Rose et al., 2011; Knibbs et al., 2014; Miskell et al., 2015; Weissert et al., 2018; Yeganeh et al., 2018; Cowie et al., 2019; Ma et al., 2019; Weissert et al., 2019; Rahman et al., 2020) in Oceania explored the performance of LUR approaches in a Southern


Fig. 1. A roadmap illustrating the process of the literature search.

Fig. 2. Study locations in the reviewed articles (areas highlighted in color indicate that they were modeled either at the national or regional scale).

Hemisphere setting. More recently, several pilot studies (Abera et al., 2020; Coker et al., 2021; Tularam et al., 2021) were also implemented in Africa. It has to be noted that no studies were reported from South America. Therefore, future studies could further examine the applications of LUR modeling in South American countries.

3.3. Air pollutants modeled

Initially, LUR approaches were primarily designed to model air pollutants associated with traffic emissions, specifically focusing on NO2 and NOX (Stedman et al., 1997; Ryan and LeMasters, 2007; Hoek et al., 2008). From 2001 to 2010, LUR modeling expanded successfully to encompass various other criteria air pollutants, including particulate


matter (PM), O3, carbon monoxide (CO), and sulfur dioxide (SO2). Subsequently, the advances in LUR approaches led to the modeling of a wider range of air pollutants, including BC, VOCs, and particle number concentration (PNC). More recently, many unregulated air pollutants including PM components, non-tailpipe particles, and organic aerosols have been modeled by novel LUR approaches in some pilot studies (Li et al., 2017; Meng et al., 2018; Robinson et al., 2019; Tripathy et al., 2019; Liu et al., 2022; Yin et al., 2022). This rapid development has expanded the scope of LUR beyond traditional traffic-related air pollutants (e.g., NO2) and allowed for the effective modeling of diverse air pollutants. Fig. 3 illustrates the frequency distribution of air pollutants modeled in the reviewed articles. Among the various pollutants, fine particulate matter (PM2.5) and NO2 are the most extensively studied. They are followed by particles with a diameter of 10 µm or less (PM10), PM compositions, BC, O3, PNC, and SO2. The figure highlights the prevalence of research focused on these specific air pollutants in the reviewed literature.

3.4. Model performance

Fig. 4 provides an overview of the performance of LUR models for various air pollutants as reported in the reviewed articles. The range of R² values observed in these studies spans from 0.07 to 0.97. It is important to note that the performance of LUR models can differ significantly for different air pollutants. In general, LUR models for criteria air pollutants tend to perform better than LURs for unregulated air pollutants. As illustrated in Fig. 4, the model R² for NO2 ranges from 0.16 to 0.96, whereas the model R² for PM1 ranges from 0.45 to 0.80. Considering the temporal aspect, long-term LUR models (e.g., annual models) tend to perform better than short-term LURs (e.g., hourly models). Additionally, LUR models based on fixed-site monitoring data generally perform better than those developed using mobile monitoring data. Another factor influencing the model performance is the number of monitoring sites used in model development. LUR models developed using a small number of sites often perform better than those developed using a large number of sites. This discrepancy can be attributed to the potential inflation of the model R² when using small samples.

3.5. The evolution of LUR

In the last decade, LUR approaches have been significantly expanded due to the emergence of multi-source observations and advances in statistical techniques. This has led to a substantial evolution in LUR models. First, there is a trend toward developing LUR models for larger study areas to facilitate the implementation of multi-center epidemiological studies. These models are often for regulated air pollutants, based on data from routine monitoring followed by researcher monitoring. More recently, researchers have become increasingly interested in applying this to air pollutants that are difficult to measure over large areas, such as ultrafine particles (UFP). Second, the recent approaches have advanced to encompass a wider range of air pollutants, including some unregulated pollutants such as non-tailpipe particles and organic aerosols. This expansion allows for a more comprehensive understanding of air pollution and its potential health impacts. Third, LUR models in the conventional spatial structure have been complemented by more complex spatiotemporal structures. This enhancement allows for the incorporation of temporal dynamics, enabling a better understanding of how air pollutant concentrations vary over time in addition to their spatial distribution. The evolution in these three aspects has further addressed the demands of epidemiological and environmental studies.

4. Changes in air quality data acquisition

This section provides a summary of how the latest advances in air quality data acquisition techniques have contributed to the evolution of LUR approaches. Additionally, we delve into the understanding of how the design of monitoring networks impacts the modeling results, consolidating insights in this area.

4.1. Low-cost monitors and sensors

Recent advances in technologies have enabled expensive and complex stationary equipment for air pollution monitoring to be supplemented with more affordable, portable monitors or sensors (Snyder et al., 2013; Masiol et al., 2018; Masiol et al., 2019; Dong et al., 2021). Low-cost monitors (LCMs; note that the ongoing debate revolves around what constitutes an LCM in terms of the threshold price) enable air pollution to be measured within an affordable budget, especially wireless distributed sensor networks (WDSNs) that can even provide extremely high-resolution spatiotemporal patterns of air pollution in real time (Miskell et al., 2017; Caubel et al., 2019; Cao et al., 2020; Weissert et al., 2020; Shafran-Nathan et al., 2021).

Large-scale practical LCM networks have been deployed in major cities in several leading countries, such as the PurpleAir network in the US and micro-station networks in China (Han et al., 2020; Zhang et al., 2020; PurpleAir, 2021; Guo et al., 2022). There has been marked growth

Fig. 3. The frequency of different air pollutants shown in the reviewed articles from 2011 to 2023.


Fig. 4. Performance (model R²) of LUR models for different air pollutants in the reviewed articles.
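
The small-sample inflation of the model R² noted in Section 3.4 can be illustrated with a minimal simulation. This sketch is not taken from any of the reviewed studies: the function names, the choice of five predictors, and the site counts (10 and 200) are illustrative assumptions. It fits ordinary least squares to pure-noise "concentrations" and pure-noise "predictors", so any in-sample R² above zero is an artifact of sample size alone. Under these assumptions the expected in-sample R² is approximately p/(n − 1) for p predictors and n sites.

```python
import numpy as np


def in_sample_r2(n_sites, n_predictors, rng):
    """Fit OLS (with intercept) to pure-noise data and return the in-sample R^2."""
    X = rng.normal(size=(n_sites, n_predictors))  # hypothetical predictors, unrelated to y
    y = rng.normal(size=n_sites)                  # hypothetical concentrations, pure noise
    A = np.column_stack([np.ones(n_sites), X])    # design matrix with intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def mean_r2(n_sites, n_predictors=5, n_repeats=2000, seed=0):
    """Average the in-sample R^2 over many simulated monitoring campaigns."""
    rng = np.random.default_rng(seed)
    values = [in_sample_r2(n_sites, n_predictors, rng) for _ in range(n_repeats)]
    return float(np.mean(values))


small = mean_r2(n_sites=10)    # few monitoring sites
large = mean_r2(n_sites=200)   # many monitoring sites
print(f"mean in-sample R^2, 10 sites:  {small:.3f}")
print(f"mean in-sample R^2, 200 sites: {large:.3f}")
```

Because the predictors carry no information here, the gap between the two averages reflects overfitting rather than model skill, which is one reason hold-out or cross-validated R² is usually preferred when comparing models built on different numbers of sites.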

in using the PurpleAir data because they are publicly available and calibrated by the US Environmental Protection Agency (EPA) ensuring reliability (Bi et al., 2022; Liu et al., 2022). Generally, two strategies can be employed in modeling when incorporating these observations: (1) using them as inputs on the dependent side of the model, and (2) using them as a predictor variable on the independent side of the model. Both strategies can enhance performance compared with the baseline model. Furthermore, strategy (1) tends to yield superior results, particularly when data are appropriately calibrated (Bi et al., 2022; Liu et al., 2022).

The use of LCM data offers numerous advantages. First, its deployment in diverse settings, including urban, suburban, and natural areas, surpasses the limited coverage of sparse regular stations primarily concentrated in urban regions (Abera et al., 2020; Bi et al., 2022). Second, they can provide predictions at a finer spatial resolution, capturing localized pollution hotspots and variations within neighborhoods or specific areas (Weissert et al., 2019; Xu et al., 2019b; Munir et al., 2020). This level of detail is particularly valuable for assessing air quality impacts on a smaller scale or near pollution sources. Third, the integration of LCM data (e.g., as a secondary predictor variable) can facilitate the modeling of unregulated air pollutants that require extensive and costly field measurements. Liu et al. (2022) used PurpleAir PM2.5 data as an auxiliary variable in modeling the spatial variations of brake and tire-wear metals, as well as oxidative stress potential, in Southern California.

However, low-cost monitors and sensors also have limitations. On the one hand, while the sensors are cheap, the operational costs for running a dense network are still high, almost as high as for compliance monitoring. From this viewpoint, the traditional passive samplers are possibly the best low-cost monitoring solution for air pollutants such as NO2 to improve modeling for weekly resolutions or longer. On the other hand, LCMs are not generally used for regulatory purposes and the most contentious point is reliability, particularly when weather conditions change during measurements (Clifford et al., 2011; Yeganeh et al., 2018; Rahman et al., 2020; Coker et al., 2021). The tolerance for measurement uncertainty may become an issue when incorporating a large number of LCM sites in LUR modeling. Further research is necessary to quantitatively assess the trade-offs between adding LCMs and the quality of the measurements (Bi et al., 2022). Therefore, careful attention should be paid to the quality of the sensors, maintenance of the monitoring network, calibration, and correction of the raw measurements. Recent studies have explored some solutions to improve the reliability of low-cost monitoring (Boubrima et al., 2015; Smith et al., 2017; Miskell et al., 2018a; Williams, 2019; Mao et al., 2019; Considine et al., 2021; Han et al., 2021). Lastly, it is important to note that, for air pollutants requiring non-routine monitoring approaches, such as UFP and particle composition, there are currently no affordable, low-cost sensors available.

4.2. Mobile monitoring

Air quality data collected by mobile monitoring have been widely used to develop LUR models in numerous recent studies (Hankey and Marshall, 2015; Hatzopoulou et al., 2017; Lim et al., 2019; Liu et al., 2019a; Van den Hove et al., 2020; Xu et al., 2021; Xu et al., 2022b). Mobile monitoring shows great promise for understanding fine-scale spatial pollutant patterns and identifying emission sources and urban air pollution hotspots efficiently (Villa et al., 2016; Miller et al., 2020; Wu et al., 2020).

Generally, mobile measurement campaigns can be carried out by five kinds of mobile monitoring platforms: (1) pedestrian-based platforms (Van den Bossche et al., 2016; Van den Bossche et al., 2018); (2) bicycle-based platforms (Hankey and Marshall, 2015; Hankey et al., 2019; Zhou and Lin, 2019); (3) customized vehicle-based platforms (Shi et al., 2016; Liu et al., 2019a); (4) the mobile lab (e.g., Google Street View [GSV] cars) (Apte et al., 2017; Messier et al., 2018; Qi and Hankey, 2021; Kerckhoffs et al., 2022); and (5) mobile fleets (e.g., hundreds of monitoring taxis) (Wu et al., 2020; Zhao et al., 2021).

Two types of data collection protocols can be applied: (1) short-term stationary monitoring (Li et al., 2018a; Blanco et al., 2021); and (2) non-stationary or on-road mobile monitoring (Kerckhoffs et al., 2016; Minet et al., 2018). This difference can potentially result in minor or substantial differences in the developed models. Kerckhoffs et al. (2016)


found that LUR models based on the two methods generated highly correlated UFP and BC concentration surfaces; in contrast, in a similar study, Minet et al. (2018) found that the predictors captured in each model were different, and the exposure surfaces were also dissimilar. Nevertheless, both studies indicated a common observation that LUR models derived from on-road mobile monitoring tended to consistently overestimate pollutant concentrations. This can be attributed to the fact that the measurements were only representative of concentrations along the road (Shi et al., 2016).

For on-road mobile monitoring, special attention should be given to the processing of the data because the data quality is highly influenced by the surrounding environment during instantaneous sampling. To strengthen data representativeness and minimize uncertainties, it is essential to increase the number of revisits during the monitoring campaign and implement appropriate spatial aggregation techniques for the raw measurements (Blanco et al., 2021). These measures can contribute to more robust and reliable models. The performance of models based on these data is also sensitive to various factors, including the locations visited, the number of locations, the frequency of visits at each location, and the chosen aggregation method (Hankey and Marshall, 2015; Hatzopoulou et al., 2017).

The strengths of mobile monitoring are straightforward. First, it is a cost-effective way to cover large study areas and capture the spatial variations of air pollution at the street scale. Mobile monitoring conducted by two GSV cars enables air pollution patterns to be revealed at a spatial precision of 4–5 orders of magnitude greater than that of traditional FMSs (Apte et al., 2017; Messier et al., 2018). Second, its mobility offers an opportunity to explore understudied areas (e.g., rural communities). Third, comparable data-only mapping of air pollution can be achieved solely through dense mobile monitoring (sufficient repeat visits on roads) without the need for additional modeling efforts (Apte et al., 2017; Messier et al., 2018). Fourth, it also facilitates the measurement of unregulated air pollutants across a wide geographical area. With the help of mobile monitoring, Li et al. (2018a) and Robinson et al. (2019) for the first time developed LUR models for organic carbon and its thermally resolved fractions and PM1 and sub-components in Pittsburgh, PA, and Oakland, CA, respectively.

However, mobile monitoring also faces shortcomings and challenges. On the one hand, a key challenge is the need for extensive work on sampling and the careful handling of data aggregation. This is crucial to minimize uncertainties caused by temporal variations, short-term events, and other influential factors that may arise during mobile monitoring (Hatzopoulou et al., 2017). On the other hand, measurements or model outcomes obtained from mobile monitoring may not directly reflect outdoor human air pollution exposure. This is because the data often capture on-road air pollutants, which can lead to biased results if they are directly applied to background areas without proper calibration. Furthermore, most existing mobile monitoring studies only focus on data collection during the daytime, with limited reporting on nighttime results, leading to a gap in the understanding of on-road air pollution throughout the day. However, this limitation can be overcome by the recent transfer-learning method proposed by Yuan et al. (2022) and the deployment of 24-hour continuous mobile monitoring networks in China (Zhao et al., 2021).

4.3. Satellite observations

In addition to in situ monitoring, the use of satellite observations has gained increasing prominence in LUR modeling (Lee et al., 2016; Meng et al., 2016; de Hoogh et al., 2018; Lee, 2019; Stafoggia et al., 2019; Sorek-Hamer et al., 2020). A significant benefit is that satellite data offer extensive and synoptic coverage, capturing both spatial and temporal variations across vast geographical areas (Vienneau et al., 2013; Van Donkelaar et al., 2015). The integration of satellite observations has significantly enhanced the spatiotemporal capabilities of LUR models, benefiting from both the resolved spatial information from the LUR models and the spatiotemporal information from the satellite models (Kloog et al., 2011).

A primary satellite observation commonly used in modeling ground-level PM concentrations is aerosol optical depth (AOD) (Sorek-Hamer et al., 2020). The ongoing evolution of AOD products, exemplified by the transition from Deep Blue to Multiangle Implementation of Atmospheric Correction (MAIAC), with spatial resolutions being enhanced from 10 km to 1 km, has played a pivotal role in advancing the development of PM2.5 and PM10 LUR models (Kloog et al., 2014; Just et al., 2015; Lee et al., 2016; Meng et al., 2016; de Hoogh et al., 2018; Lee, 2019; Stafoggia et al., 2019). This enables researchers to capture more localized variations in air pollutant concentrations and develop more accurate LUR models. Furthermore, the use of AOD products from multiple satellites (e.g., MODIS, SeaWiFS, and MISR) has proven valuable in developing a global PM2.5 model. This approach can be further improved by incorporating GEOS-Chem simulation and geographically weighted regression (Van Donkelaar et al., 2015; Van Donkelaar et al., 2016). Satellite observations offer insight into long-term changes and spatial characterization in ambient PM2.5 concentrations on a global scale.

The use of other satellite-retrieved products in LUR studies is also abundant. The tropospheric vertical column density (VCD) retrieved from the Ozone Monitoring Instrument (OMI) has emerged as a valuable data source for modeling ground-level NO2 concentrations across various spatial scales (Novotny et al., 2011; Anand and Monks, 2017; Qi et al., 2022). The incorporation of OMI data in a multistage framework with mixed-effect and random forest models has been used to develop a NO2 model with fine-scale spatial (100 × 100 m) and temporal (daily) resolution (de Hoogh et al., 2019). Similar to OMI but with enhanced capabilities, the Tropospheric Monitoring Instrument (TROPOMI) products have been used in the modeling of ground-level NO2 (Kim et al., 2021), NOX (Goldberg et al., 2019), and O3 (Wang et al., 2022). Its superior spatial resolution enables more detailed and precise observations, allowing for a finer-scale characterization of pollutant distributions. More recently, observations from the new generation of geostationary satellites were also incorporated into modeling. She et al. (2020) and Vu et al. (2022) used AOD products retrieved by the Geostationary Ocean Color Imager (GOCI) and Geostationary Operational Environmental Satellite-16 (GOES-16) to estimate hourly PM2.5 in California, USA, and the Yangtze River Delta, China, respectively. These advanced geostationary satellites provide near-real-time and continuous measurements of key atmospheric parameters, enabling researchers to capture and analyze air pollution with higher temporal resolution (e.g., sub-hour).

However, satellite observations also face limitations and challenges. One of the primary challenges is the coarse spatial resolution (Bai et al., 2022b). While newer satellite instruments provide improved spatial resolution compared with their predecessors, they may still lack the fine-scale detail required for accurate modeling at the local level. Another challenge is the potential for uncertainties and biases in the data because satellites measure air pollution indirectly. Various factors, such as atmospheric conditions, instrument calibration, and data processing algorithms, can introduce errors in satellite observations. These uncertainties need to be carefully addressed and accounted for in the modeling process to avoid inaccuracies in the results. Furthermore, they are limited to certain atmospheric pollutants (e.g., NO2, O3) and may not capture the full spectrum of air pollutants of interest in modeling. Lastly, satellite data availability and coverage can vary across regions and periods. This can pose challenges when conducting comparative studies or when attempting to capture temporal variations in air pollution over extended periods.

4.4. New insights into the impact of monitoring network design

This section delves into the impact of various factors, such as the type, number, density, and distribution of sites on the fixed-site network,


as well as the number of visits, road coverage, and sampling temporality on the mobile monitoring network.

4.4.1. Fixed sites

As NO2 is an indicator of traffic-related air pollution and is easy to measure, LUR studies primarily employ it as a key parameter for analysis of the design of monitoring networks. Models including stations in highly populated areas tend to yield better NO2 predictions across residential locations. However, models including more roadside stations better capture the higher end of residential NO2 concentrations, albeit with larger errors in representing the full range of concentrations (Wu et al., 2017). Regarding the influence of the number of sites, models using more stations tend to yield better predictions and enhanced model stability, especially when the number of sites is below 30 (Wu et al., 2017; Dong et al., 2021). However, the model performance slightly decreases with a continuous increase in the number of training sites when measured through internal validation. Conversely, the trend is the opposite if measured by external validation. This is because models with small samples tend to exhibit an inflated R² compared with external validation results (Basagaña et al., 2012; Wang et al., 2012). It is recommended that a minimum of 30 sites should be used for developing an appropriate model, with an ideal number exceeding 60 for specific studies and at least 80 sites for complex urban settings. Linear NO2 models can be robust to variations in spatial sampling density and extent. While the model’s performance may not be influenced by the spatial extent, the spatial variability of NO2 surfaces can be affected (Maddix and Adams, 2020). Therefore, careful consideration should be given to the selection of sampling density and extent to capture the desired spatial variation in air pollution surfaces.

4.4.2. Mobile monitoring network

For on-road measurement campaigns, studies have revealed the significance of road coverage (number of road segments) and repeat visits in determining the model’s performance. Hatzopoulou et al. (2017) demonstrated that a stable long-term model for PNC and NO2 could be achieved with approximately 150 to 200 road segments and 10–12 visits per segment. This was further supported by Saha et al. (2019), who found that at least 10–15 days with 1 h of sampling per day were necessary to develop PNC models with good precision and low error. Among these factors, the number of repeat visits is particularly influential. Messier et al. (2018) revealed that data-only NO and BC mapping yielded poor results with only a few (1–2) repeat visits. However, when models were developed using only 30 % of the roads in the modeling domain but 4–8 repeat visits per road segment, the results became comparable with those obtained with full data. Similarly, the inclusion of road segments with at least three visits resulted in R² values of 0.60 for PNC and 0.51 for BC, while including road segments with at least 16 visits raised these values to 0.74 for PNC and 0.55 for BC (Hatzopoulou et al., 2017). Increasing the number of repeat visits can significantly enhance the model’s performance.

Two recent studies by Blanco et al. (2021, 2022) have provided valuable insights into the design of short-term stationary monitoring campaigns. Consistent with previous findings, their results further revealed that the number of visits holds greater significance than the number of sites. It was observed that as the number of sites and repeat site visits increased, the model performances improved, approaching those of the all-data campaign. However, the improvement exhibited diminishing returns at higher stop counts. Interestingly, the results showed that increasing the number of sites versus visits did not significantly impact the model performances for any of the pollutants, as long as the total stop count was met. Furthermore, they also confirmed that collecting temporally balanced samples can lead to additional model enhancements. This was evident in the sampling designs that incorporated more extensive sampling periods, covering various seasons, days of the week, and hours. The findings underscore the importance of a temporally balanced sampling design for mobile monitoring aiming to accurately assess long-term exposure in epidemiologic cohorts.

5. Novel predictor variables

In recent years, there has been a remarkable expansion of the predictor variable pool, owing to the rapid growth in data availability. The wide range of novel predictor variables considered in models is introduced below.

5.1. Traffic volumes

Traffic volumes, or traffic intensity, play a crucial role in modeling traffic-related air pollution (TRAP). NO2 and NOX models including traffic volume variables generally exhibit a 10 % higher R² compared with models that do not consider them (Hoek et al., 2011; Beelen et al., 2013; Hassanpour Matikolaei et al., 2019). However, site-specific variations such as pollutant choice and the proportion of local emissions from traffic sources can also influence the effectiveness of the incorporation of these variables (Dons et al., 2013). For long-term modeling, incorporating long-term traffic volume variables can be advantageous, although they can be substituted with the length of specific road types when such data are unavailable (Brauer et al., 2003; Madsen et al., 2007; Henderson et al., 2007).

5.2. Emission sources

Industrial and traffic emissions are major contributors to air pollution in most cities, and including them helps to enhance the model performance (Chen et al., 2012; Guttikunda et al., 2014; Huang et al., 2017). They have been identified as significant predictors in models conducted at various scales for different air pollutants. Chen et al. (2012) incorporated point industrial source predictors in the modeling for Tianjin, China, at the city scale. They proved to be significant in the models for SO2, NO2, and PM10, improving the model performance. Similarly, at the country scale, the density of point source NOX emissions was a significant variable in the NO2 model developed for Australia (Knibbs et al., 2014). It is anticipated that the incorporation of emission data on a finer temporal scale, such as hourly data, will help develop models with fine temporal resolution.

5.3. Meteorological parameters

Air pollution dispersion is influenced by meteorological factors, and incorporating them can lead to improvements in model performance (Guttikunda and Gurjar, 2012; Meng et al., 2016; Shen et al., 2021). Meteorological variables such as boundary layer height, cloud cover, precipitation, and solar exposure are important to models developed for large areas (Liu et al., 2015; Yeganeh et al., 2018; Liu et al., 2019a; Ma et al., 2021a). Temporally dynamic meteorological variables (e.g., hourly parameters) are the key factors in improving the temporal resolution of the models (Su et al., 2008a; Miskell et al., 2018b; Son et al., 2018). Furthermore, Yin et al. (2022) confirmed that meteorological variables of high resolution are an essential aspect of spatiotemporal modeling.

The development of reanalysis products has significantly enhanced the spatiotemporal resolution and accessibility of meteorology data (Azizian and Ramezani, 2019; Marcos et al., 2019). More recently, the release of the High-Resolution Rapid Refresh (HRRR) project and Dark Sky API (replaced by Apple’s WeatherKit API) has made real-time higher-resolution meteorology data available (Vu et al., 2022; Yin et al., 2022). Additionally, some LCMs can also provide real-time on-site meteorological parameters.
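The role of temporally dynamic meteorological variables described in Section 5.3 can be illustrated with a minimal sketch. Everything below is synthetic: the predictor names (road_length, wind_speed), the coefficients, and the numbers of sites and hours are assumptions chosen for illustration and are not taken from any study cited here. The sketch fits an ordinary least-squares model to hourly concentrations, once with a static land-use predictor only and once with an hourly meteorological predictor added.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination of predictions y_hat against y."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns coefficients and fitted values."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta, A @ beta

rng = np.random.default_rng(0)
n_sites, n_hours = 40, 24                      # hypothetical campaign size

# Static predictor: one value per site (e.g., road length within a buffer, km).
road_length = rng.uniform(0.0, 5.0, n_sites)
# Dynamic predictor: one value per hour, shared by all sites (e.g., wind speed, m/s).
wind_speed = rng.uniform(0.5, 8.0, n_hours)

# Expand both predictors to one row per (site, hour) observation.
road = np.repeat(road_length, n_hours)
wind = np.tile(wind_speed, n_sites)

# Synthetic hourly concentrations: higher near roads, lower under stronger wind.
no2 = 20.0 + 4.0 * road - 1.5 * wind + rng.normal(0.0, 2.0, road.size)

_, pred_static = fit_ols(road[:, None], no2)
_, pred_dynamic = fit_ols(np.column_stack([road, wind]), no2)

r2_static = r_squared(no2, pred_static)
r2_dynamic = r_squared(no2, pred_dynamic)
print(f"static predictors only:   R2 = {r2_static:.2f}")
print(f"static + hourly met data: R2 = {r2_dynamic:.2f}")
```

In this toy setting the static-only model predicts the same value for a site at every hour, so it cannot reproduce any hour-to-hour variation; adding the hourly predictor is what gives the model temporal resolution. Real applications would use measured or reanalysis meteorology and would need out-of-sample validation.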


5.4. Socioeconomic factors

Air pollution is influenced by human activities, making socioeconomic factors a contributor to accurate modeling. The commonly used variables include population growth, gross domestic product (GDP), and median family income (Vizcaino and Lavalle, 2018; Ma et al., 2019; Liu et al., 2019a). These data can be obtained from local councils, census bureaus, and model estimation (Kummu et al., 2018). While socioeconomic factors have been widely used in many LUR studies, special attention should be paid to whether the developed models are designed for epidemiological studies and investigation of environmental justice. This is because these variables wield significant influence as confounding factors in such studies, often overlapping with socioeconomic status data. Consequently, it becomes necessary to exclude socioeconomic variables during the development of models in scenarios of this nature.

5.5. Urban morphology

Incorporating urban morphology variables can contribute to a 2 %–13 % improvement in R² of the models (Su et al., 2008b; Eeftens et al., 2013). More recently, urban morphology variables have been extensively applied in the micro-scale and street-level modeling based on crowdsourced LCMs and mobile monitoring (Shi et al., 2016; Miskell et al., 2018b; Shi et al., 2018; Zhou and Lin, 2019; Liu et al., 2021a). Additionally, incorporating wind direction factors when generating these variables can further enhance the model performance. Ghassoun et al. (2019) introduced pseudo dynamic parameters (PDPs) that describe the interaction between wind and urban morphology, resulting in improved performance of a UFP model. However, incorporating urban morphology data can present challenges because of the limited data accessibility. Furthermore, generating certain variables, such as sky view factor (SVF), involves complex 3D spatial calculations (Miao et al., 2020; Li et al., 2020).

5.6. Remote-sensing-derived variables

Satellite-derived variables promoted by the advancement of remote sensing have proven valuable, particularly in modeling criteria air pollutants (Mao et al., 2012; Larkin et al., 2017; Xu et al., 2019a). The commonly used variables in modeling NO2 and PM are NO2 VCD from the OMI/TROPOMI, and AOD from the MODIS and other stationary instruments (He et al., 2018). Novotny et al. (2011) developed a national-scale NO2 LUR model for the contiguous United States and observed a significant improvement when including OMI tropospheric NO2 column data. In addition to VCD and AOD data, satellite-derived land cover, enhanced vegetation index (EVI), greenness, brightness, wetness, and nighttime light variables are also introduced in LUR modeling (Vienneau et al., 2013; Huang et al., 2019; Ji et al., 2019). However, one limitation of these variables is their relatively low spatial resolution, typically ranging from 1 to 10 km. Therefore, they provide information on background rather than local concentrations (and are thus more suitable for larger-scale modeling) and are generally used together with more refined data.

5.7. Imagery-derived variables

Fine-scale spatial modeling encounters a challenge when traditional variables fail to align with the high spatial resolution of monitoring data. To address this issue, a promising solution is to use imagery to extract hyperlocal variables. Ganji et al. (2020) extracted built environment features from Google aerial and street view images, providing insights into the micro-characteristics and functional aspects of urban locations. Expanding on this concept, Qi and Hankey (2021) used deep learning scene parsing techniques to extract variables from GSV imagery, capturing street-level characteristics of both the built and natural environment. These imagery-derived variables have proven effective in developing models for PNC, BC, and NO2, at both city and national scales. The inclusion of imagery in LUR modeling allows for the generation of predictor variables that align with the fine spatial scale of data collected through dense LCM networks and mobile monitoring. Moreover, this approach ensures the generation of variables consistently across administrative and political boundaries, facilitating comparisons and analysis on a broader scale.

5.8. Variables derived from other simulation outputs

There has been a growing trend to incorporate outputs from other simulations as variables in model development, creating what is known as hybrid models. Michanowicz et al. (2016a, 2016b) integrated AERMOD-based PM2.5 and Caline3-based NO2 predictions to construct hybrid PM2.5 LUR/AERMOD and NO2 LUR/Caline3 models. They observed an improvement in predictive power ranging from 2 % to 10 %. More recent studies have further extended this approach by incorporating additional variables, including air quality model (AQM) estimates, GEOS-Chem simulated PM2.5 constituents, chemical transport model (CTM) estimates for PM2.5, NO2, and O3, SIMAIR estimates for NOX, CALPUFF/ADMS-Urban simulated dispersion of emissions, as well as AERMOD simulated PM2.5, BC, and metal components (de Hoogh et al., 2016; Di et al., 2016; Wang et al., 2016; Korek et al., 2017; Yang et al., 2017; Meng et al., 2018; Tripathy et al., 2019; Tularam et al., 2021). The inclusion of these variables in the hybrid models has not only enhanced the spatial predictive power of the models but also significantly improved their temporal resolution.

5.9. New variable generation techniques

While circular buffers have been predominantly used, recent research has indicated that variables extracted using irregular buffers may be a more suitable choice. Li et al. (2015) introduced the semicircular-buffer-based (SCBB) variable generation method, which takes wind direction into account. The SCBB LUR models demonstrated an improvement of 7 % to 16 % in R². Naughton et al. (2018) took a further step and extracted variables within wind-dependent sectors or “wedges.” This approach was applied to develop a national-scale LUR model for the Republic of Ireland, which accounted for 78 % of the spatial variability in NO2 concentrations.

The selection of buffer sizes also influences the spatial resolution and performance of the model. Ideally, the strategy for selecting buffer sizes should consider the dispersion patterns of air pollutants and the scale of the modeling exercise. Liu et al. (2021b) developed an optimal buffer selection strategy based on dichotomy, and the resulting model explained an additional 5 % of the variability in PM2.5 concentrations.

In summary, the use of an extensive array of spatially and temporally variable predictors has greatly enhanced the predictive capabilities of current models. Variables such as urban morphology and imagery-derived predictors contribute to a better understanding of spatial variations, while meteorological variables significantly improve the model’s ability to explain temporal variations. These advances facilitate a more comprehensive modeling of air quality, ultimately leading to a more accurate representation of air pollution from a spatiotemporal perspective.

6. Advances in model-developing approaches

6.1. Modeling algorithms

A wide range of modeling algorithms can be employed to develop the mathematical relationship between air pollution and predictor variables. These algorithms are typically classified into five categories, which are outlined below (Table 1).
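As a point of orientation for the categories in Table 1, the relationship being estimated can be written generically as follows. This is a sketch only: the symbols $C$, $s_i$, $X_k$, $\beta_k$, $f$, and $\varepsilon_i$ are notation introduced here for illustration and are not taken from the reviewed studies.

```latex
C(s_i) = f\bigl(X_1(s_i), \dots, X_p(s_i)\bigr) + \varepsilon_i ,
\qquad
\text{MLR special case:}\quad
C(s_i) = \beta_0 + \sum_{k=1}^{p} \beta_k X_k(s_i) + \varepsilon_i .
```

Here $C(s_i)$ denotes the measured concentration at monitoring site $s_i$, $X_k(s_i)$ the $k$-th predictor variable evaluated at that site (for example, a buffer-based land-use or traffic variable), and $\varepsilon_i$ a residual term. Under this reading, the algorithm families differ mainly in the form they allow for $f$ and in how the residuals are treated.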


Table 1
LUR studies using different modeling algorithms.

Categories (sub-categories) | Algorithms | Studies
Linear | MLR | Chen et al. (2012); Kheirbek et al. (2012); Beelen et al. (2013); Miskell et al. (2015); Huang et al. (2017); Ma et al. (2019); Guo et al. (2020)
Linear (Regularization) | LASSO | Larkin et al. (2017); Son et al. (2018); Chen et al. (2019); Kerckhoffs et al. (2019); Coker et al. (2021)
Linear (Regularization) | Elastic Net | Suleiman et al. (2016); Mandal et al. (2019); Mandal et al. (2020); Ren et al. (2020); Zhou et al. (2020)
Linear (Regularization) | RIDGE | Polat and Gunay (2015); Kerckhoffs et al. (2019); Mahanta et al. (2019); Coker et al. (2021)
Linear | GWR | Hu et al. (2013); Robinson et al. (2013); Zhai et al. (2018); Guo et al. (2021b); Xuan et al. (2021)
Linear | LME | Rose et al. (2011); Meng et al. (2016); Son et al. (2018); Mirzaei et al. (2020); Kerckhoffs et al. (2022)
Linear | GLM | Sun et al. (2013); Garcia et al. (2016); Chen et al. (2019); Gujral and Sinha (2021); Zhang et al. (2021a)
Advanced interpolation | UK | Mercer et al. (2011); Sampson et al. (2013); Xu et al. (2019a); Xu et al. (2019c); Nori-Sarma et al. (2020)
Nonlinear | GAM | Clifford et al. (2011); Zou et al. (2016); Li et al. (2017); Chen et al. (2019); Gerling et al. (2020); Zhang et al. (2021a)
Nonlinear | MARS | Nieto and Antón (2014); Shahraiyni et al. (2015); Roy et al. (2016); Kisi et al. (2017); Chen et al. (2019)
Nonlinear | BME | Akita et al. (2012); Akita et al. (2014); Adam-Poupart et al. (2014); Reyes and Serre (2014); Chen et al. (2018); Cowie et al. (2019); DeLang et al. (2021)
Machine learning (can be applied in both spatial and spatiotemporal) | ANN | Singh et al. (2012); Zou et al. (2015b); Alimissis et al. (2018); Wang et al. (2019); Park et al. (2020); Chen et al. (2021)
Machine learning | SVM | Yeganeh et al. (2012); Awad et al. (2017); Yang et al. (2018); Su et al. (2020); Mogollón-Sotelo et al. (2021); Zhang et al. (2021b)
Machine learning | RF | Araki et al. (2018); Brokamp et al. (2018); Zhan et al. (2018); Yuchi et al. (2019); Shao et al. (2020); Ma et al. (2021a); Ma et al. (2021b)
Machine learning | KNN | Marjovi et al. (2015); Bozdağ et al. (2020); Cihan et al. (2021); Tella et al. (2021); Tella and Balogun (2021)
Machine learning | XGBoost | Hu et al. (2017); Wang et al. (2020a); AlThuwaynee et al. (2021); Liu et al. (2021c); Zhao et al. (2021)
Machine learning | Ensemble | Di et al. (2019); Lim et al. (2019); Van Roode et al. (2019); Adams et al. (2020); Liu (2021b); Huang et al. (2022)
More complex spatiotemporal methods | ST-LUR | Sampson et al. (2011); Keller et al. (2015); Wang et al. (2015); Wang et al. (2016); Wang et al. (2023)
More complex spatiotemporal methods | Two-step local regression | Leung et al. (2019)

6.1.1. Linear models

Multiple linear regression (MLR) is the most widely used algorithm (Kerckhoffs et al., 2019; Liu et al., 2019b). Among various methods in developing an MLR model, the standardized supervised forward stepwise regression method introduced by the ESCAPE (European Study of Cohorts for Air Pollution Effects) project is particularly popular in relevant studies (Eeftens et al., 2012; van Nunen et al., 2017). However, MLR models can encounter challenges when dealing with a large number of variables that are highly correlated or collinear. They may produce unstable or unreliable estimates of the coefficients, which can lead to difficulties in interpreting the model and making accurate predictions.

In this regard, regularization methods are introduced to address the potential correlation among predictor variables by applying different penalties (Kerckhoffs et al., 2019). They provide valuable tools for managing correlated predictors, enhancing model performance, and facilitating model interpretation (Chen et al., 2019; Mahanta et al., 2019). Additionally, MLR faces another challenge in capturing local spatial variations of air pollution. To address this issue and account for spatial heterogeneity, geographically weighted regression (GWR) is introduced, enabling the coefficients to vary spatially to generate local regression results, providing a better depiction of spatial patterns (Hu et al., 2013). Linear mixed-effects (LME) regression, another extension of MLR, allows for the incorporation of both fixed and random effects and is particularly suitable for handling non-independent data arising from a hierarchical structure (Rose et al., 2011). Unlike other linear models, a generalized linear model (GLM) allows the expected value of the outcome to be modeled as a non-linear transformation of the standard linear function of regression parameters and predictor variables (Nelder and Wedderburn, 1972). Compared with MLR, GLM allows response variables to follow non-normal distributions from the exponential family rather than only the normal distribution (Zhang et al., 2021a).

In summary, linear models offer several advantages, such as simplicity, efficiency, stability, and interpretability. However, to leverage these benefits, it is also crucial to carefully consider the assumptions, assess the linearity of relationships, address multicollinearity issues, and handle outliers and influential points appropriately.

6.1.2. Advanced interpolation

Universal kriging (UK), also known as regression-kriging, is an advanced interpolation technique that offers significant improvements over traditional interpolation methods, and is especially useful in large study areas (Mercer et al., 2011; Sampson et al., 2013). By combining the strengths of MLR and kriging, UK leverages both the global trend captured by MLR and the local spatial autocorrelation captured by kriging, leading to more accurate and reliable interpolation results. Studies have demonstrated that UK models outperform corresponding MLR models, making them a state-of-the-art technique for spatial interpolation (Mercer et al., 2011; Ma et al., 2019; Xu et al., 2019a).

6.1.3. Nonlinear models

Nonlinear models have been widely used to develop LUR models in recent years. A generalized additive model (GAM) is a specialized form of a GLM where linear terms are replaced by smooth functions of predictor variables (Zou et al., 2016). This allows GAM to offer greater flexibility in fitting the data and capturing complex relationships, potentially resulting in a substantially different model compared with the corresponding GLM (Clifford et al., 2011). Multivariate adaptive regression spline (MARS) extends linear models to automatically model nonlinearities and interactions among predictor variables (Friedman, 1991; Kisi et al., 2017). MARS does not rely on assuming a specific functional relationship between the response and predictors, making it particularly useful for estimating general functions in high-dimensional settings with limited data (Shahraiyni et al., 2015; Roy et al., 2016). Bayesian maximum entropy (BME) is an estimation approach that allows the use of multivariate data from diverse sources with varying


quality, spatial resolution, and measurement techniques (Christakos and Li, 1998; Chen et al., 2018). This can be particularly valuable in modeling for capturing within-city variability and is applicable to large geographic domains, as other methods may have limited capabilities in managing spatial non-stationarity over large areas and addressing the uncertainty associated with missing monitoring data (Akita et al., 2012). The BME modeling framework offers the capability to merge monitoring data with outputs from existing LUR and CTM models (Akita et al., 2014). It has demonstrated superior performance when compared with more traditional approaches because it jointly considers both the global-scale variability in concentration derived from CTM outputs and the intraurban-scale variability from LUR model outputs.

In summary, the use of nonlinear models in LUR modeling brings several benefits, including the flexibility to capture complex relationships, the ability to handle non-normal data distributions, and improved predictive performance.

6.1.4. Machine learning

Machine learning (ML) algorithms have gained widespread use in various LUR studies. The commonly used ML algorithms are the artificial neural network (ANN), support vector machine (SVM), random forest (RF), K-nearest neighbors (KNN), extreme gradient boosting (XGBoost), and ensemble learning (Liu, 2021; Tella et al., 2021; Wang et al., 2020a; Cabaneros et al., 2019; Di et al., 2019; Yang et al., 2018; Araki et al., 2018). ML approaches can be applied to develop both fine-scale spatial and spatiotemporal models.

Overall, ML algorithms offer powerful tools for modeling complex relationships and interactions in data, and their application in various scenarios has demonstrated their effectiveness in generating accurate predictions and uncovering patterns that may not be captured by traditional linear approaches. However, to fully leverage these benefits, it is essential to address certain considerations, including managing the increased computational requirements, addressing potential overfitting issues, and handling model interpretation appropriately.

6.1.5. More complex spatiotemporal methods

In addition, there are several methods specifically designed for developing spatiotemporal models. The spatiotemporal LUR (ST-LUR) approach was initially developed for the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air) project (Sampson et al., 2011). It provides a hierarchical modeling framework specifically designed to address air pollution datasets consisting of a small number of sites that offer information on temporal variations, along with a larger number of sites with short-term measurements, providing broader spatial coverage. This approach, as well as its extensions incorporating CTM and satellite data, has proven to be successful in various studies (Wang et al., 2015; Keller et al., 2015; Wang et al., 2016; Xu et al., 2022a; Wang et al., 2023). One benefit is its ability to capture spatially varying temporal trends and nonstationary spatial correlation structures. This is achieved by incorporating complex spatiotemporal interactions that account for the variation in seasonal patterns across different locations. Additionally, it is also designed to handle missing data effectively, even in scenarios where data may be incomplete or irregularly sampled.

Another noteworthy method is the two-step local regression approach employed by Leung et al. (2019) for modeling spatiotemporal patterns of PM2.5 concentrations based on an integration of fixed-site and mobile monitoring data. It combines the strengths of GWR in the spatial dimension and local polynomial smoothing techniques in the temporal dimension (Yan and Mei, 2014). The benefit is that it not only considers both spatial and temporal information of the data but also corrects the so-called boundary effect by using the local linear smoother. Furthermore, it also allows different bandwidths to be chosen at different time points and spatial locations, which allows flexible exploration of the underlying local spatiotemporal patterns of the data.

6.2. Comparison among different algorithms

In recent years, several model-developing algorithms have been compared by Clifford et al. (2011), Mercer et al. (2011), Adam-Poupart et al. (2014), Brokamp et al. (2017), Kisi et al. (2017), Cowie et al. (2019), Lim et al. (2019), and Bozdağ et al. (2020) to assess their predictive performance. These studies offer valuable insights into the benefits of using various modeling algorithms across diverse contexts. However, these comparative studies also have certain limitations, making it challenging to draw consistent conclusions about the best-performing algorithm because they compared different pairs of algorithms in each study.

More recently, three insightful studies have significantly contributed to our understanding of this issue. Chen et al. (2019) conducted a comprehensive comparison of 16 algorithms to predict annual mean concentrations of air pollutants using a large dataset comprising 543 PM2.5 and 2399 NO2 monitoring sites across Europe. Kerckhoffs et al. (2019) evaluated the performance of 20 algorithms for predicting long-term UFP concentrations based on 8200 segments of mobile monitoring data and 368 sites of short-term stationary measurements collected in multiple cities. Ren et al. (2020) compared 13 algorithms for fine-resolution estimation of ambient O3 concentrations based on data from 1313 monitoring sites across the contiguous United States. The comparisons in these studies were more rigorous and comprehensive.

Several general conclusions and consistent patterns regarding the predictive performance of these algorithms can be drawn from the available studies:

(1) For predicting long-term PM2.5, NO2, and UFP across large areas, most algorithms exhibit similar performance, with only a few ML approaches such as RF and bagging slightly outperforming others.
(2) In the context of predicting O3 concentrations, ML approaches generally produce better predictions than linear models.
(3) The ensemble approach may not necessarily improve the performance of the best individual model if most models perform similarly across algorithms. The effectiveness of ensemble learning depends on the diversity and variability among the alternative models.
(4) “Simple” linear models can perform comparably well when compared with ML algorithms, particularly when the relationships between air pollutant concentrations and predictor variables are not complex.
(5) ML algorithms demonstrate their strength in handling data with complex relationships and interactions, especially in modeling spatiotemporal variations of air pollution.

6.3. Advancements in model structures

6.3.1. Spatial models with more refined temporal resolutions

The early annual LUR models were not sufficient to capture seasonal and diurnal variations of air pollution that are needed in epidemiological studies (Dons et al., 2013; Lu et al., 2020). Therefore, to expand the temporal scope of LUR models, several approaches have been introduced to integrate temporal variabilities into their model structures, enhancing the time resolutions of these models from annual to more refined temporal scales.

One simple method involves adjusting the temporal trend observed at an FMS and applying it to LUR predictions. Gan et al. (2011) used month-year adjustment factors derived from regulatory monitoring data to estimate monthly mean concentrations based on an annual mean source model. Another approach introduced a dummy variable to indicate different periods in modeling and has been used to develop models with various temporal resolutions (e.g., monthly [MacIntyre et al., 2011; Mao et al., 2012] and weekday/weekend [Noth et al., 2011]). However, these approaches assume that temporal trends are consistent across all


sites and are not suitable for modeling air pollutants with significant spatiotemporal variability (e.g., BC). To address this issue, one approach was introduced by recalibrating the existing LUR model with new measurements at the same sites to account for different periods (Wang et al., 2013b). Subsequently, an improved approach was employed that allowed for variables and coefficients to change over time, resulting in the development of several separate models. Masiol et al. (2018) developed 24 hourly PM LUR models that included 16–26 significant predictor variables with the model R² ranging from 0.63 to 0.77. Don et al. (2013) further compared the performance of hourly BC models developed by different methodologies (e.g., employing dummy variables, with dynamic dependent variables and/or with dynamic and static independent variables). The results indicate that developing new independent hourly models, with static or dynamic covariates, is considered the optimal solution.

Advancements in developing spatial models with more refined temporal resolutions have substantial implications and advantages. The inclusion of temporal variability in LUR modeling allows for a deeper understanding of the health effects of air pollution when combined with human space-time activity. Employing air pollution predictions from more refined or multiple temporally aggregated models in long-term exposure studies can account for the considerable effect of temporal variability in air pollution, resulting in a more accurate exposure assessment (Lu et al., 2020).

6.3.2. Spatiotemporal models

To improve LUR models’ capability to capture fine-scale variations of air pollution in both space and time, researchers in recent years have made great efforts to develop spatiotemporal LUR models (Wang et al., 2016; Chen et al., 2018; Xu et al., 2019; Van den Bossche et al., 2020). These spatiotemporal models predict air pollutant concentrations at a specific time-point (e.g., on a specific day or hour) across a spatial domain, while traditional spatial models with more refined temporal resolutions still predict the average concentration but in a more refined time resolution than the annual average. The development of epidemiological studies and monitoring techniques has promoted the advancement of spatiotemporal modeling. On one hand, cohort studies investigating the association between air pollution exposure and chronic health effects necessitate accurate exposure predictions across various locations over extended periods. On the other hand, the availability of abundant yet irregular spatiotemporal monitoring data, collected through a combination of diverse monitoring techniques, also requires the developed models to effectively handle both spatial and temporal dimensions of the data (Sampson et al., 2011).

More recently, the increase in spatiotemporal modeling was driven by the emergence of fine-scale satellite data (e.g., MAIAC AOD), the use of advanced statistical methods (e.g., ST-LUR, ML), and the incorporation of temporally dynamic predictors (e.g., high-resolution meteorology data). These developments have enabled the extension of LUR approaches to create more robust and flexible spatiotemporal models capable of handling high-resolution spatial and temporal information. Advanced statistical methods employ more flexible structures to effectively capture the complex correlations and interactions between high spatiotemporal resolution air quality measurements and various types of static and dynamic predictor variables. Additionally, satellite observations offer extensive spatial coverage, fine-scale resolution, and rich temporal dynamics, making them indispensable in enhancing the spatiotemporal modeling capabilities of LUR models. Moreover, temporally dynamic predictors can effectively capture and describe the intricate temporal variations in air pollution, accounting for factors that change over time and have a direct impact on pollutant concentrations. For example, Rahman et al. (2020) developed spatiotemporal (hourly) PNC models in Brisbane, Australia, based on the RF algorithm, and the best models explained 73 %, 64 %, and 88 % of the spatiotemporal variability for three particle size ranges of <30 nm, <414 nm, and <3000 nm, respectively. In a more recent study by Wang et al. (2023), spatiotemporal models for criteria air pollutants were developed, based on an advanced geostatistical framework (ST-LUR) incorporating satellite data and estimated daily Air Quality Index (AQI) levels at 100-meter resolution in Shanghai, China.

LURs in the spatial form have been enriched by the introduction of spatiotemporal models, a development that carries significant implications and benefits (Zhang et al., 2018). Spatial models estimate concentrations at a specific averaged time point, mostly assuming that the relationship between response and variables remains constant over time. In contrast, spatiotemporal models incorporate temporal information, allowing for the relationship to vary both across space and over time. This transition facilitates more accurate predictions of air pollution patterns, improves the assessment of exposure across space and over time, and enables the identification of temporal trends and hotspots. Ultimately, the application of spatiotemporal models can also enhance our ability to address the challenges and complexities associated with air pollution research, policy-making, and public health interventions.

7. Improvements in model validation

The commonly used validation methods can be divided into two categories: cross-validation (CV) and hold-out validation (HV). CVs can be further grouped into two types: (1) k-fold CV (KCV), and (2) leave-one-out CV (LOOCV) (Rose et al., 2011; Olvera et al., 2012; Weissert et al., 2018; Rahman et al., 2020). Although these two CV methods are widely used, they have been criticized for tending to overestimate performance as the CV-R² of the model may be inflated when models are based on a small number of training sites and when many variables are available (Wang et al., 2013a; Dong et al., 2021). To partly mitigate the overestimation, they have been extended to some new forms. KCV is implemented multiple times in a bootstrap fashion to reduce uncertainties (Larkin et al., 2017; Simon et al., 2018). The LOOCV method has been extended to more forms such as leave-one-group-out CV (Weissert et al., 2018; Weissert et al., 2019), leave-one-day-out CV (Di et al., 2016), leave-one-area-out CV (Wang et al., 2014; van Nunen et al., 2017; Xu et al., 2019a), and leave-one-month-out temporal CV (Ren et al., 2020). A key concern in adopting CV is whether in the process a new model is built for each fold. This can lead to significant differences in model structures and statistical results. However, most authors did not clarify this matter in our reviewed studies. The HV method is suitable for a large dataset as it can reduce uncertainties raised by the random split of the data. Moreover, it can also be used iteratively in a bootstrap fashion (Hankey and Marshall, 2015).

Early studies primarily focused on validating the spatial variability of LUR models. However, as models in the spatial structure have been complemented by spatiotemporal structures, recent studies have adopted validations in both spatial and temporal dimensions to assess the spatiotemporal variability of the developed model (Wang et al., 2015; Ren et al., 2020). Furthermore, they have demonstrated the importance of employing a combination of validation methods alongside different data-splitting strategies instead of relying on a single method. Ren et al. (2020) used six data splitting and validation strategies including 75 % criterion HV, 10-repeated HV, 10-fold CV, 10-fold spatial validation, leave-month-out temporal validation, and peak validation to fully compare the predictive skills of models developed using different algorithms in their study.

In summary, by incorporating both spatial and temporal dimensions into the validation process and adopting a multi-method, multi-data-splitting approach, researchers can gain a more comprehensive understanding of the spatiotemporal variability and predictive capabilities of their LUR models.

8. Model transferability

Transferability refers to the degree to which an LUR model developed in one place can be applied to other places (Patton et al., 2015;


Zalzal et al., 2019). It has predominantly been studied in the context of transferring models between cities. Direct transfer leads to a significant decrease in predictive performance, whereas indirect transfer results in a minor-to-moderate drop in performance (Zalzal et al., 2019). Furthermore, although the transferred models exhibited reduced performance compared with the source models, they still outperformed simple binary or continuous road proximity metrics (Allen et al., 2011).

In recent studies, the transferability was tested in a wider scope. Large-scale regional models have modest-to-good performance when they are directly transferred to other regions; however, the regional source model is unable to capture local-scale variations of air pollution (Wang et al., 2014; Marcon et al., 2015). Regarding the temporal transferability, it was observed that transferring long-term NO2 models over time with the recalibration of coefficients yielded satisfactory performance (R² = 0.67–0.80) as annual NO2 concentrations were highly correlated and stable over time (Marcon et al., 2015).

More recently, there has been a focus on the within-city transferability of neighborhood-specific or area-specific LUR models. This is particularly relevant for developing countries’ megacities, where the lack of routine monitoring makes it challenging to develop citywide models. Directly transferred neighborhood-specific models can effectively estimate relative ranking in exposures within a city but struggle to predict absolute values accurately (Patton et al., 2015; Yang et al., 2020). Consistent with findings from inter-city transferability studies, indirectly transferred models exhibit performance levels close to those of the source models and significantly outperform directly transferred models.

Previous studies have consistently demonstrated that direct transfer often yields poor performance and successful transfer requires indirect recalibration of the source model to suit the local setting (Poplawski et al., 2009; Marcon et al., 2015; Miskell et al., 2015; Patton et al., 2015; Weissert et al., 2018; Ma et al., 2019; Zalzal et al., 2019; Yang et al., 2020). However, the indirect transfer also faces two limitations. First, it involves additional cost and time for data collection. Furthermore, a new local-specific model can be developed based on the recalibration monitoring campaign as well as the recalibrated transferred model. Many studies (Miskell et al., 2015; Patton et al., 2015; Yang et al., 2020) have demonstrated that this new local model is superior to the transferred model. Consequently, the desirability of indirect transfer of the source model diminishes. Second, after recalibration, the sign of the coefficient for at least one predictor often changes, rendering the interpretation of the transferred model challenging. The improvement in fitting comes at the expense of physical interpretability rather than true transferability.

In conclusion, our analysis indicates that LUR models exhibit limited transferability to areas beyond their original development locations. This limitation stems from the empirical nature of these models.

9. Application tools and modeling software

While LUR approaches have been widely used in studies, documentation is scarce regarding the automation of the modeling process and the availability of accessible application tools and software. Currently, the options for such resources are limited, with only one R package (SpatioTemporal), two ArcGIS toolboxes (LUR Tools and XLUR), and four software applications (RLUR, PyLUR, OpenLUR, and eLUR) publicly available. Table 2 provides a summary of the general information about these tools and software, while their merits and suitability are briefly discussed in Text S2 in the SI.

The limited availability of automated tools and software for LUR modeling signifies the need for further development and accessibility in this field. Efforts should be directed toward creating user-friendly applications and resources that facilitate the automation of the modeling process, thus streamlining and expediting the application of LUR modeling in research and practice.

10. New developments and future research directions

This section highlights various potential avenues for the further development and improvement of LUR methodology. By focusing on these areas of improvement, the LUR methodology can evolve and progress, leading to more robust and versatile models for assessing and understanding air pollution in diverse settings.

10.1. Modeling unregulated air pollutants

Recent studies (Strak et al., 2017; Verma et al., 2015) have highlighted the adverse health effects of unregulated air pollutants on humans, leading to increased demand for modeling these pollutants. However, this poses a current challenge as previous research indicates that LUR approaches perform less effectively for unregulated pollutants compared with criteria air pollutants within studies. Robinson et al.

Table 2
General information on accessible application tools and software for LUR modeling.

| Tool names | Authors | Languages | Interface | Running environment | Functions | Built-in modules | Modeling algorithms |
|---|---|---|---|---|---|---|---|
| SpatioTemporal | Lindström et al. (2013) | R | N/A | R | Fitting and evaluation of a class of Gaussian spatiotemporal processes | Smooth temporal functions, parameter estimation, predictions, cross-validation | Hierarchical spatiotemporal modeling framework |
| LUR tools | Akita (2014) | N/A | Wizard-style interface | ArcGIS 10.0+ | Predictor variable generation | Cell count tools, extract tools, feature size tools, general tools | N/A |
| RLUR | Morley and Gulliver (2018) | R | Graphical user interface | RStudio | Modeling, prediction, and visualization | Variable creation, model builder, prediction | MLR |
| PyLUR | Ma et al., (2020d) | Python | N/A | Python 2.x | Modeling and prediction | Variable generation, regression modeling, model validation, prediction & mapping | MLR |
| OpenLUR | Lautenschlager et al. (2020) | Python | N/A | Python 3.x | Modeling and prediction | Feature extraction, modeling, regression runner | AutoML |
| XLUR | Mölter and Lindley (2021); Mölter (2020) | Python | Wizard-style interface | ArcGIS Pro v.2.2+ | Modeling, prediction, and visualization | BuildLUR, ApplyLUR | MLR |
| eLUR | Li (2022) | Python | Graphical user interface | Python 3.x | Modeling, prediction, and visualization | File uploading, model builder, mapping, model validation | MLR |


(2019) revealed that LUR models for secondary PM components exhibited poorer performance compared with the models for primary PM components. This is because secondary species, which form over long atmospheric time scales and are subject to regional transport, cannot be accurately described using traditional predictor variables.

To address this issue, future studies can consider the following solutions. One major challenge in modeling unregulated air pollutants is the limited availability of regular monitoring data. This can be overcome by incorporating complementary data sources, such as LCMs and mobile monitoring (Liu et al., 2022; Robinson et al., 2019). In addition to traditional land-use variables, future models can incorporate novel predictors (e.g., incorporate CTM when generating predictor variables, and introduce CTM-derived variables) that capture regional transport patterns, atmospheric chemistry, and meteorology. They can provide a more comprehensive understanding of the spatiotemporal variations of unregulated air pollutants. Furthermore, it is valuable to explore the integration of other modeling approaches, such as air dispersion models and CTMs, in the modeling process. Unregulated air pollutants often result from intricate chemical processes that are not fully captured by data-driven models alone. By incorporating these deterministic models, researchers can gain insights into the underlying processes and improve the accuracy of predictions for unregulated air pollutants.

10.2. Integration of stationary and mobile monitoring data

Stationary monitoring provides adequate temporal coverage but limited spatial coverage, whereas mobile monitoring offers satisfactory spatial coverage but limited temporal coverage. To address these limitations, the integration of stationary and mobile data can provide spatiotemporally dense observations, enabling the development of models with enhanced spatial and temporal resolution. Several pilot studies have explored this field. Simon et al. (2018) developed hybrid models by combining measurements from mobile monitoring and a fixed reference station to estimate PNC in Chelsea and Boston, MA, USA. The results showed that the hybrid model noticeably outperformed the mobile-only model. Leung et al. (2019) integrated measurements from stations and mobile monitoring to generate spatiotemporally dense datasets based on the assumption that spatial variations of PM2.5 concentrations are insignificant within a distance of 10 s of driving time, and then a model for estimating PM2.5 concentrations was developed using the integrated datasets and two-step local regression. Zhao et al. (2021) developed a model based on combined stationary and mobile monitoring to estimate PM2.5 concentrations in Beijing, China, at 1 km by 1 km and 1 h resolutions. The core of the proposed methodology is to transform the surrounding mobile monitoring data to approximate the readings of PM2.5 concentrations in the grid without a stationary site. This model significantly outperformed the baseline model using only stationary monitoring data.

While these studies have made strides in integrating stationary and mobile data for modeling purposes, this field remains largely unexplored, presenting opportunities for future research. Two key questions that could be explored to further advance this approach are: (1) What are the spatiotemporal correlations between stationary observations and surrounding mobile monitoring data? Understanding these correlations can help establish the relationships between different monitoring platforms and inform data integration strategies, and (2) How can we appropriately weigh measurements from different platforms in the modeling process, considering that data quality from stationary sites and mobile sensors may differ? Developing robust methodologies for incorporating and weighting data from distinct platforms is crucial for reliable modeling results.

10.3. Introducing more novel predictor variables

By applying the latest techniques, we can enhance the acquisition of predictor data, thereby expanding the range of variables considered in LUR modeling. For instance, recent technologies such as panoramic images, 3D point clouds, and the use of GSV cars have revolutionized the field, making it more feasible to gather such data on a larger scale and at a lower cost (Wu et al., 2021; Qi and Hankey, 2021; Bonczak and Kontokosta, 2019). In the future, we can leverage advanced techniques such as drone-based data collection, aerial LiDAR, and remote sensing to obtain vast amounts of data. These data can encompass a wide array of novel predictor variables that capture spatial and temporal characteristics relevant to modeling, providing a better understanding of the complex relationships between air pollution and its influencing factors and enhancing predictive accuracy.

10.4. Optimizing predictor variable selection strategies

Currently, most studies focus on data fitting and aim to maximize the R² of the model, particularly when using ML algorithms. However, relying solely on data fitting for predictor variable selection strategies can result in limited robustness, interpretability, and transferability of the fitted model.

To address this limitation, future studies should strive to optimize predictor variable selection strategies by considering additional factors. Integrating air pollution dispersion mechanisms into the variable selection process can lead to a significant breakthrough. Qi and Hankey (2021) proposed an advanced variable selection strategy that combined three methods: theory-driven variable selection, data-driven variable selection, and integrated variable selection. Building upon this idea, future studies should address the following questions: (1) What types of variables are essential among the pool of potential predictor variables? and (2) What criteria should be employed in the predictor variable selection strategies? Additionally, dimensionality reduction techniques such as principal component analysis (PCA) can be used to reduce the number of variables in the model while retaining the most important information (Xu et al., 2022b; Coker et al., 2021; Olvera et al., 2012). Finally, the developed novel strategies should aim not only to improve model fitting but also to enhance the robustness, interpretability, and transferability of the resulting models.

10.5. Incorporating remote sensing and downscaling techniques

The integration of remote sensing observations and the downscaling technique holds great potential for advancing the modeling of spatiotemporal variations in air pollution across wider areas with finer resolutions. A notable example is the work of Huang et al. (2021), who proposed a novel modeling framework incorporating remote sensing data and downscaling techniques to estimate spatiotemporal variations of ground-level PM2.5 concentrations at both national (1 km) and local (100 m) scales.

To further enhance this modeling framework and generate more robust estimations with even higher spatial and temporal resolution, future studies could consider incorporating additional elements. One potential avenue is the integration of satellite data with LCM or mobile monitoring observations. This combination would leverage the strengths of both approaches, enabling the capture of more detailed and comprehensive information on air pollution modeling.

10.6. Deep integration of air pollution dispersion mechanisms in modeling

Previous hybrid models demonstrated slightly improved performance in terms of fitting accuracy compared to standalone LUR models (Wang and Xu, 2021; Wu et al., 2018). However, the level of integration remained relatively low, primarily relying on the use of outputs from deterministic models as predictor variables in LUR modeling. Despite the progress made, these models still exhibited a limited capacity to explain the mechanisms behind the formation, atmospheric transport, and dispersion of air pollutants. Thus, achieving a deep integration between LUR and deterministic models while considering air pollution


dynamics in the modeling remains a significant challenge.

To address this challenge, future studies should focus on developing novel methods for generating variables and variable selection strategies by incorporating deterministic models. Moreover, it is crucial to consider air pollution dispersion mechanisms within the statistical modeling procedures to establish advanced hybrid LUR models capable of providing deeper insights into the formation, transportation, and distribution of air pollutants.

10.7. Keeping a balance between efficient computation and high-quality model performance

The growing trend of using integrated modeling approaches incorporating diverse data combinations and multiple algorithms has prompted a crucial question of how to strike a balance between achieving more efficient computation and maintaining high-quality model performance. To address this challenge, several potential strategies are outlined below.

(1) Prioritization of predictor variables: identify the most important predictors by optimizing the predictor variable selection to simplify the model structure.
(2) Algorithm selection and optimization: consider the computational complexity and scalability of different algorithms, opting for those that offer a good trade-off between efficiency and performance. Fine-tune hyperparameters and optimize algorithmic settings to achieve the best compromise between computational speed and model quality.
(3) Use of parallel computing and distributed systems: leverage parallel computing techniques and distributed systems to distribute the computational workload across multiple processors or machines.
(4) Incremental learning and model updating: instead of retraining the entire model from scratch, consider updating the model incrementally, taking advantage of the existing knowledge, and only incorporating new data as needed to minimize computational costs.

10.8. Expanding current model validation frameworks

The current point-based validation framework, which assesses the model at discrete sites, may not fully reveal its predictive performance. To address this limitation, some studies (Hankey and Marshall, 2015; Boniardi et al., 2019) introduced the line-based validation framework that assessed the model’s ability to simulate continuous variations of air pollution along specific routes. In addition, Zou et al., (2015a) proposed an innovative area-based validation framework that employed information entropy as a metric to evaluate the model’s performance in mapping surface concentrations of air pollutants. The method provided additional information and revealed subtle differences that the traditional point-based validation could not capture. More recently, Li et al., (2018b) validated the spatial patterns of PM2.5 LUR models by comparing them with PM2.5 maps derived from MAIAC AOD products. This study introduced the concept of verifying air pollution models using uniformly observed spatial data.

Future studies could adopt a comprehensive validation framework that combines point-based, line-based, and area-based methods to appropriately evaluate a particular model. By incorporating these approaches, researchers can gain a more comprehensive understanding of the strengths and limitations of LUR models, enabling them to refine and improve the models for accurate predictions.

10.9. How to determine the spatial resolution of an LUR model

Determining the temporal resolution of an LUR model is relatively straightforward and is primarily dictated by the temporal scale of the training dataset. However, defining the spatial resolution of a model poses a greater challenge. Many studies directly associated the model’s spatial resolution with the resolution of pollution maps (ranging from 50 m to 1 km in the reviewed studies). However, we believe that this approach may be inappropriate and inaccurate.

It is essential to recognize that a pollution map can be generated at any desired spatial resolution based on the developed model. To illustrate this, consider a scenario where a model is built using measurements from a sparse monitoring network comprising only 20 sites across the entire city and coarse GIS data. While it is possible to generate a pollution map at a fine resolution of, for example, 50 m × 50 m, the “true” spatial resolution of the model might be at a kilometer level due to the limitations imposed by the data used for its development.

Several factors influence the spatial resolution of a model, such as the density and distribution of monitoring sites and the spatial scale of predictor variables. Addressing these concerns represents a significant research gap that necessitates further exploration. Future studies should aim to develop appropriate methodologies for identifying the spatial resolution of LUR models. By bridging this gap, researchers can establish a clearer understanding of the spatial characteristics and limitations of LUR models, leading to more informed and accurate assessments of air pollution at various scales.

10.10. Improving the transferability of LUR models

Improving the transferability of LUR models is crucial for promoting their wider applications in different studies (Hoek et al., 2008). Despite the significance, most previous studies primarily focused on assessing the ability to transfer models across different locations, with only a few taking the additional step of investigating approaches to enhance transferability. Miskell et al. (2015) developed a multi-scale LUR modeling approach using datasets at the local and city scales to improve the model’s intra-urban adaptability between the two datasets. Zalzal et al. (2019) developed a transferable LUR model by combining data from two cities under an assumption of spatial invariance for some of the model coefficients. While these two methods demonstrated significant improvements in the model’s adaptability within the given datasets, it is important to note that the transferability of the model may still be limited when applied to a third location that differs from the training data. This highlights the need for future research to focus on alternative approaches that enhance transferability without relying heavily on costly local recalibration.

A promising approach to developing transferable LUR models lies in the careful selection of predictor variables used during model development. Previous studies (Hoek et al., 2008; Briggs et al., 2000) have provided evidence that the choice of predictor variables in the final model significantly impacts its transferability. Ma et al., (2022b) introduced a method to develop transferrable neighborhood-scale NO2 LUR models with comparable predictive power based on only micro-scale predictor variables. The proposed models had the strongest direct transferability and moderate-to-good indirect transferability but with much better model interpretability. The findings of this study hold considerable significance as they inspire researchers to delve into novel approaches aimed at enhancing the transferability of LUR models, particularly by focusing on the selection of predictor variables. Another notable gap in the current literature is the lack of knowledge about the transferability of those models developed using ML algorithms. Therefore, future studies could focus on this aspect.

10.11. Extension to 3D estimates of spatial variations of air pollution

LUR modeling has traditionally focused on estimating pollution surfaces in a two-dimensional manner. However, recent studies have demonstrated its potential to incorporate vertical variations in air pollutants, thus enabling three-dimensional applications. Ho et al. (2015) measured PM2.5 and elements at different heights in Kaohsiung and


developed models to fit vertical variations of measurements. The results showed that floor level was identified as a predictor variable in the PM2.5, Si, and Fe models. Building upon this concept, Barratt et al. (2018) and Jin et al. (2019) introduced the concept of vertical decay rates and extended the ground-based LUR model to predict variations in the vertical direction based on a linear relationship assumption. More recently, Xu et al., (2022c) developed a nonlinear GAM model to describe the vertical decay rates and further proposed a modeling framework for estimating 3D PM2.5 exposure by combining a standard LUR model with a vertical variation model.

To facilitate the modeling of city-wide 3D air quality, promising techniques such as vehicle-based mobile monitoring, unmanned aerial vehicle (UAV)-based vertical monitoring, and lidar vertical observation networks have emerged. These offer potential avenues for capturing spatial variations in air pollution across different locations and heights (Cai et al., 2020; Lv et al., 2020; Wang et al., 2020b; Xiang et al., 2021). The integration of vertical dimensions into LUR modeling opens up new possibilities for understanding and managing air pollution in three-dimensional space.

10.12. Extension to estimates of air quality in multiple application scenarios

LUR approaches have predominantly been employed for estimating outdoor air quality; however, their applicability extends beyond that domain. Yuchi et al. (2019) carried out a pilot study that used LUR modeling to estimate indoor air quality for 447 rooms in 342 apartments in Ulaanbaatar, Mongolia. The developed blended model (a combination of MLR and RF) explained 82 % of the variability in indoor PM2.5 concentrations. This pioneering application highlights the potential of LUR approaches in diverse scenarios. To further expand their utility, future studies could explore the application of LUR models in additional settings, such as predicting air quality in subway systems and school classrooms. By adapting and refining the modeling techniques to suit these specific environments, LUR approaches could provide valuable insights into modeling indoor air quality.

11. Conclusion

This paper has comprehensively reviewed the recent development of LUR approaches for modeling spatiotemporal variations of ambient air pollution. Through our review, we discovered that the emergence of multi-source observations along with advances in statistical techniques and the availability of novel predictor variables, has revolutionized LUR

computation and high-quality model performance. An interesting question that needs to be answered is how to determine the spatial resolution of the developed LUR model. It is anticipated that the investigation of the integration of stationary and mobile monitoring data, the combination of remote sensing and downscaling techniques, and the deep integration of air pollution dispersion mechanisms in the modeling could be three research pathways to more breakthroughs in LUR methodologies. Ultimately, this review offers insights into possible future areas of LUR studies for other researchers.

CRediT authorship contribution statement

Xuying Ma: . Bin Zou: Funding acquisition, Writing – review & editing. Jun Deng: Conceptualization, Methodology, Writing – review & editing. Jay Gao: Methodology, Writing – review & editing. Ian Longley: Resources. Shun Xiao: Funding acquisition, Resources. Bin Guo: Funding acquisition, Writing – review & editing. Yarui Wu: Resources. Tingting Xu: Resources. Xin Xu: Resources. Xiaosha Yang: Resources. Xiaoqi Wang: Resources. Zelei Tan: Resources. Yifan Wang: Resources. Lidia Morawska: . Jennifer Salmond: Methodology, Resources, Writing – review & editing.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Data availability

No data was used for the research described in the article.

Acknowledgements

The authors thank Helen Jeays for polishing the English writing. We would like to express our sincere gratitude to the five anonymous reviewers for their invaluable feedback, which greatly contributed to enhancing the quality of our paper.

Funding

This work was primarily funded by the National Natural Science Foundation of China (Grant Number: 42201469; Grant Number:
approaches. These developments have enabled LUR models to encom­ 41871317), China Scholarship Council (File No. 202208610078), and
pass a wider range of criteria and unregulated air pollutants. Moreover, the Australian Research Council (ARC) Linkage Grant (Project Identifi­
LUR models in the conventional spatial structure have been com­ cation Number: LP180100516). This work was also partly funded by the
plemented by more complex spatiotemporal structures. Some general Key Research and Development Program of Shaanxi Province (Grant
findings are: (1) LCM and mobile monitoring have significantly Number: 2021SF-435) and the Science and Technology Department of
improved our capability to obtain high-resolution air quality data, but Shaanxi Province (Grant Number: 2021JM-388/01).
we should pay close attention to the data quality and reliability; (2)
involving novel predictor variables can notably improve the fitting R2 of Appendix A. Supplementary data
the models; (3) major factors contributing to the increase in spatio­
temporal modeling are fine-scale satellite data, advanced statistical Supplementary data to this article can be found online at https://doi.
methods, and incorporation of temporally dynamic predictors; (4) org/10.1016/j.envint.2024.108430.
compared with linear models, advanced statistical methods can yield
better predictions when handling data with complex relationships and
References
interactions; (5) LUR models, as an empirical approach, have limited
transferability. Some application tools and software that can automate Abera, A., Mattisson, K., Eriksson, A., Ahlberg, E., Sahilu, G., Mengistie, B., Bayih, A.G.,
LUR approaches are also recommended. We suggest that future studies Aseffaa, A., Malmqvist, E., Isaxon, C., 2020. Air pollution measurements and land-
use regression in urban Sub-Saharan Africa using low-cost sensors—possibilities and
could concentrate on modeling unregulated air pollutants, introducing
pitfalls. Atmosphere 11 (12), 1357.
novel predictor variables, optimizing predictor variable selection stra­ Adam-Poupart, A., Brand, A., Fournier, M., Jerrett, M., Smargiassi, A., 2014.
tegies, proposing a comprehensive validation framework, improving the Spatiotemporal modeling of ozone levels in Quebec (Canada): a comparison of
transferability of LUR models, extending LURs to 3D, and extending the kriging, land-use regression (LUR), and combined Bayesian maximum entropy–LUR
approaches. Environmental Health Perspectives 122 (9), 970–976.
applications of LURs in multiple scenarios. A practical concern that Adams, M.D., Massey, F., Chastko, K., Cupini, C., 2020. Spatial modelling of particulate
needs to be addressed is how to keep a balance between efficient matter air pollution sensor measurements collected by community scientists while

cycling, land use regression with spatial cross-validation, and applications of machine learning for data correction. Atmospheric Environment 230, 117479.
Akita, Y., 2014. LURTools: ArcGIS Toolbox for Land Use Regression (LUR) Model. Available online at https://hub.arcgis.com/content/de058ff6b6d44ac98a180fa1b7bcbf82/about.
Akita, Y., Chen, J.C., Serre, M.L., 2012. The moving-window Bayesian maximum entropy framework: estimation of PM2.5 yearly average concentration across the contiguous United States. Journal of Exposure Science & Environmental Epidemiology 22 (5), 496–501.
Akita, Y., Baldasano, J.M., Beelen, R., Cirach, M., De Hoogh, K., Hoek, G., Nieuwenhuijsen, M., Serre, M.L., De Nazelle, A., 2014. Large scale air pollution estimation method combining land use regression and chemical transport modeling in a geostatistical framework. Environmental Science & Technology 48 (8), 4452–4459.
Alimissis, A., Philippopoulos, K., Tzanis, C.G., Deligiorgi, D., 2018. Spatial estimation of urban air pollution with the use of artificial neural network models. Atmospheric Environment 191, 205–213.
Allen, R.W., Amram, O., Wheeler, A.J., Brauer, M., 2011. The transferability of NO and NO2 land use regression models between cities and pollutants. Atmospheric Environment 45 (2), 369–378.
AlThuwaynee, O.F., Kim, S.W., Najemaden, M.A., Aydda, A., Balogun, A.L., Fayyadh, M.M., Park, H.J., 2021. Demystifying uncertainty in PM10 susceptibility mapping using variable drop-off in extreme-gradient boosting (XGB) and random forest (RF) algorithms. Environmental Science and Pollution Research 28 (32), 43544–43566.
Amini, H., Yunesian, M., Hosseini, V., Schindler, C., Henderson, S.B., Künzli, N., 2017. A systematic review of land use regression models for volatile organic compounds. Atmospheric Environment 171, 1–16.
Anand, J.S., Monks, P.S., 2017. Estimating daily surface NO2 concentrations from satellite data–a case study over Hong Kong using land use regression models. Atmospheric Chemistry and Physics 17 (13), 8211–8230.
Apte, J.S., Messier, K.P., Gani, S., Brauer, M., Kirchstetter, T.W., Lunden, M.M., Marshall, J.D., Portier, C.J., Vermeulen, R.C., Hamburg, S.P., 2017. High-resolution air pollution mapping with Google Street view cars: exploiting big data. Environmental Science & Technology 51 (12), 6999–7008.
Araki, S., Shima, M., Yamamoto, K., 2018. Spatiotemporal land use random forest model for estimating metropolitan NO2 exposure in Japan. Science of the Total Environment 634, 1269–1277.
Awad, Y.A., Koutrakis, P., Coull, B.A., Schwartz, J., 2017. A spatio-temporal prediction model based on support vector machine regression: Ambient Black Carbon in three New England States. Environmental Research 159, 427–434.
Azizian, A., Ramezani, H., 2019. Assessing the Accuracy of European Center for Medium Range Weather Forecasts (ECMWF) Reanalysis Datasets for Estimation of Daily and Monthly Precipitation. Iranian Journal of Soil and Water Research 50 (4), 777–791.
Bai, K., Li, K., Guo, J., Cheng, W., Xu, X., 2022a. Do more frequent temperature inversions aggravate haze pollution in China? Geophysical Research Letters 49 (4), e2021GL096458.
Bai, K., Li, K., Guo, J., Chang, N.B., 2022b. Multiscale and multisource data fusion for full-coverage PM2.5 concentration mapping: Can spatial pattern recognition come with modeling accuracy? ISPRS Journal of Photogrammetry and Remote Sensing 184, 31–44.
Bai, K., Li, K., Sun, Y., Wu, L., Zhang, Y., Chang, N.B., Li, Z., 2023. Global synthesis of two-decade of research on improving PM2.5 estimation models: From remote sensing and data science perspectives. Earth-Science Reviews, 104461.
Barratt, B., Lee, M., Wong, P., Tang, R., Tsui, T.H., Cheng, W., Yang, Y., Lai, P.C., Tian, L., Thach, T.Q., Allen, R., 2018. A dynamic three-dimensional air pollution exposure model for Hong Kong. Research Reports: Health Effects Institute, 2018.
Basagaña, X., Rivera, M., Aguilera, I., Agis, D., Bouso, L., Elosua, R., Foraster, M., de Nazelle, A., Nieuwenhuijsen, M., Vila, J., Künzli, N., 2012. Effect of the number of measurement sites on land use regression models in estimating local air pollution. Atmospheric Environment 54, 634–642.
Beelen, R., Hoek, G., Fischer, P., van den Brandt, P.A., Brunekreef, B., 2007. Estimated long-term outdoor air pollution concentrations in a cohort study. Atmospheric Environment 41 (7), 1343–1358.
Beelen, R., Hoek, G., Vienneau, D., Eeftens, M., Dimakopoulou, K., Pedeli, X., Tsai, M.Y., Künzli, N., Schikowski, T., Marcon, A., Eriksen, K.T., 2013. Development of NO2 and NOx land use regression models for estimating air pollution exposure in 36 study areas in Europe-The ESCAPE project. Atmospheric Environment 72, 10–23.
Bi, J., Carmona, N., Blanco, M.N., Gassett, A.J., Seto, E., Szpiro, A.A., Larson, T.V., Sampson, P.D., Kaufman, J.D., Sheppard, L., 2022. Publicly available low-cost sensor measurements for PM2.5 exposure modeling: Guidance for monitor deployment and data selection. Environment International 158, 106897.
Blanco, M.N., Bi, J., Austin, E., Larson, T.V., Marshall, J.D., Sheppard, L., 2022. Impact of Mobile Monitoring Network Design on Air Pollution Exposure Assessment Models. Environmental Science & Technology.
Blanco, M.N., Doubleday, A., Austin, E., Marshall, J.D., Seto, E., Larson, T., Sheppard, L., 2021. Design and evaluation of mobile monitoring campaigns for air pollution exposure assessment in epidemiologic cohorts. medRxiv, 2021-04.
Bonczak, B., Kontokosta, C.E., 2019. Large-scale parameterization of 3D building morphology in complex urban landscapes using aerial LiDAR and city administrative data. Computers, Environment and Urban Systems 73, 126–142.
Boniardi, L., Dons, E., Campo, L., Van Poppel, M., Int Panis, L., Fustinoni, S., 2019. Is a land use regression model capable of predicting the cleanest route to school? Environments 6 (8), 90.
Boubrima, A., Matigot, F., Bechkit, W., Rivano, H., Ruas, A., 2015. Optimal deployment of wireless sensor networks for air pollution monitoring. In: 2015 24th International Conference on Computer Communication and Networks (ICCCN). IEEE, pp. 1–7.
Bozdağ, A., Dokuz, Y., Gökçek, Ö.B., 2020. Spatial prediction of PM10 concentration using machine learning algorithms in Ankara, Turkey. Environmental Pollution 263, 114635.
Brauer, M., Hoek, G., van Vliet, P., Meliefste, K., Fischer, P., Gehring, U., Heinrich, J., Cyrys, J., Bellander, T., Lewne, M., Brunekreef, B., 2003. Estimating long-term average particulate air pollution concentrations: application of traffic indicators and geographic information systems. Epidemiology 228–239.
Briggs, D.J., Collins, S., Elliott, P., Fischer, P., Kingham, S., Lebret, E., Pryl, K., Van Reeuwijk, H., Smallbone, K., Van Der Veen, A., 1997. Mapping urban air pollution using GIS: a regression-based approach. International Journal of Geographical Information Science 11 (7), 699–718.
Briggs, D.J., de Hoogh, C., Gulliver, J., Wills, J., Elliott, P., Kingham, S., Smallbone, K., 2000. A regression-based method for mapping traffic-related air pollution: application and testing in four contrasting urban environments. Science of the Total Environment 253 (1–3), 151–167.
Brokamp, C., Jandarov, R., Rao, M.B., LeMasters, G., Ryan, P., 2017. Exposure assessment models for elemental components of particulate matter in an urban environment: A comparison of regression and random forest approaches. Atmospheric Environment 151, 1–11.
Brokamp, C., Jandarov, R., Hossain, M., Ryan, P., 2018. Predicting daily urban fine particulate matter concentrations using a random forest model. Environmental Science & Technology 52 (7), 4173–4179.
Cabaneros, S.M., Calautit, J.K., Hughes, B.R., 2019. A review of artificial neural network models for ambient air pollution prediction. Environmental Modelling & Software 119, 285–304.
Cai, M., Huang, Y., Wang, Z., 2020. Dynamic three-dimensional distribution of traffic pollutant at urban viaduct with the governance strategy. Atmospheric Pollution Research 11 (8), 1418–1428.
Cao, R., Li, B., Wang, Z., Peng, Z.R., Tao, S., Lou, S., 2020. Using a distributed air sensor network to investigate the spatiotemporal patterns of PM2.5 concentrations. Environmental Pollution 264, 114549.
Caubel, J.J., Cados, T.E., Preble, C.V., Kirchstetter, T.W., 2019. A distributed network of 100 black carbon sensors for 100 days of air quality monitoring in West Oakland, California. Environmental Science & Technology 53 (13), 7564–7573.
Cesaroni, G., Badaloni, C., Gariazzo, C., Stafoggia, M., Sozzi, R., Davoli, M., Forastiere, F., 2013. Long-term exposure to urban air pollution and mortality in a cohort of more than a million adults in Rome. Environmental Health Perspectives 121 (3), 324–331.
Chen, J., de Hoogh, K., Gulliver, J., Hoffmann, B., Hertel, O., Ketzel, M., Bauwelinck, M., Van Donkelaar, A., Hvidtfeldt, U.A., Katsouyanni, K., Janssen, N.A., 2019. A comparison of linear regression, regularization, and machine learning algorithms to develop Europe-wide spatial models of fine particles and nitrogen dioxide. Environment International 130, 104934.
Chen, L., Wang, Y., Li, P., Ji, Y., Kong, S., Li, Z., Bai, Z., 2012. A land use regression model incorporating data on industrial point source pollution. Journal of Environmental Sciences 24 (7), 1251–1258.
Chen, L., Gao, S., Zhang, H., Sun, Y., Ma, Z., Vedal, S., Mao, J., Bai, Z., 2018. Spatiotemporal modeling of PM2.5 concentrations at the national scale combining land use regression and Bayesian maximum entropy in China. Environment International 116, 300–307.
Chen, B., You, S., Ye, Y., Fu, Y., Ye, Z., Deng, J., Wang, K., Hong, Y., 2021. An interpretable self-adaptive deep neural network for estimating daily spatially-continuous PM2.5 concentrations across China. Science of the Total Environment 768, 144724.
Christakos, G., Li, X., 1998. Bayesian maximum entropy analysis and mapping: a farewell to kriging estimators? Mathematical Geology 30 (4), 435–462.
Cihan, P., Ozel, H., Ozcan, H.K., 2021. Modeling of atmospheric particulate matters via artificial intelligence methods. Environmental Monitoring and Assessment 193 (5), 1–15.
Clifford, S., Choy, S.L., Hussein, T., Mengersen, K., Morawska, L., 2011. Using the generalised additive model to model the particle number count of ultrafine particles. Atmospheric Environment 45 (32), 5934–5945.
Coker, E.S., Amegah, A.K., Mwebaze, E., Ssematimba, J., Bainomugisha, E., 2021. A Land Use Regression Model using Machine Learning and Locally Developed Low Cost Particulate Matter Sensors in Uganda. Environmental Research, 111352.
Considine, E.M., Reid, C.E., Ogletree, M.R., Dye, T., 2021. Improving accuracy of air pollution exposure measurements: Statistical correction of a municipal low-cost airborne particulate matter sensor network. Environmental Pollution 268, 115833.
Cowie, C.T., Garden, F., Jegasothy, E., Knibbs, L.D., Hanigan, I., Morley, D., Hansell, A., Hoek, G., Marks, G.B., 2019. Comparison of model estimates from an intra-city land use regression model with a national satellite-LUR and a regional Bayesian Maximum Entropy model, in estimating NO2 for a birth cohort in Sydney, Australia. Environmental Research 174, 24–34.
de Hoogh, K., Wang, M., Adam, M., Badaloni, C., Beelen, R., Birk, M., Cesaroni, G., Cirach, M., Declercq, C., Dedele, A., Dons, E., 2013. Development of land use regression models for particle composition in twenty study areas in Europe. Environmental Science & Technology 47 (11), 5778–5786.
de Hoogh, K., Gulliver, J., van Donkelaar, A., Martin, R.V., Marshall, J.D., Bechle, M.J., Cesaroni, G., Pradas, M.C., Dedele, A., Eeftens, M., Forsberg, B., 2016. Development of West-European PM2.5 and NO2 land use regression models incorporating satellite-derived and chemical transport modelling data. Environmental Research 151, 1–10.

de Hoogh, K., Héritier, H., Stafoggia, M., Künzli, N., Kloog, I., 2018. Modelling daily PM2.5 concentrations at high spatio-temporal resolution across Switzerland. Environmental Pollution 233, 1147–1154.
de Hoogh, K., Saucy, A., Shtein, A., Schwartz, J., West, E.A., Strassmann, A., Puhan, M., Roosli, M., Stafoggia, M., Kloog, I., 2019. Predicting fine-scale daily NO2 for 2005–2016 incorporating OMI satellite data across Switzerland. Environmental Science & Technology 53 (17), 10279–10287.
DeLang, M.N., Becker, J.S., Chang, K.L., Serre, M.L., Cooper, O.R., Schultz, M.G., Schröder, S., Lu, X., Zhang, L., Deushi, M., Josse, B., 2021. Mapping yearly fine resolution global surface ozone through the Bayesian Maximum Entropy data fusion of observations and model output for 1990–2017. Environmental Science & Technology 55 (8), 4389–4398.
Di, Q., Koutrakis, P., Schwartz, J., 2016. A hybrid prediction model for PM2.5 mass and components using a chemical transport model and land use regression. Atmospheric Environment 131, 390–399.
Di, Q., Amini, H., Shi, L., Kloog, I., Silvern, R., Kelly, J., Sabath, M.B., Choirat, C., Koutrakis, P., Lyapustin, A., Wang, Y., 2019. Assessing NO2 concentration and model uncertainty with high spatiotemporal resolution across the contiguous United States using ensemble model averaging. Environmental Science & Technology 54 (3), 1372–1384.
Dong, J., Ma, R., Cai, P., Liu, P., Yue, H., Zhang, X., Xu, Q., Li, R., Song, X., 2021. Effect of sample number and location on accuracy of land use regression model in NO2 prediction. Atmospheric Environment 246, 118057.
Dons, E., Van Poppel, M., Kochan, B., Wets, G., Panis, L.I., 2013. Modeling temporal and spatial variability of traffic-related air pollution: Hourly land use regression models for black carbon. Atmospheric Environment 74, 237–246.
Eeftens, M., Beelen, R., De Hoogh, K., Bellander, T., Cesaroni, G., Cirach, M., Declercq, C., Dedele, A., Dons, E., De Nazelle, A., Dimakopoulou, K., 2012. Development of land use regression models for PM2.5, PM2.5 absorbance, PM10 and PMcoarse in 20 European study areas; results of the ESCAPE project. Environmental Science & Technology 46 (20), 11195–11205.
Eeftens, M., Beekhuizen, J., Beelen, R., Wang, M., Vermeulen, R., Brunekreef, B., Huss, A., Hoek, G., 2013. Quantifying urban street configuration for improvements in air pollution models. Atmospheric Environment 72, 1–9.
Friedman, J.H., 1991. Multivariate adaptive regression splines. The Annals of Statistics 19 (1), 1–67.
Gan, W.Q., Koehoorn, M., Davies, H.W., Demers, P.A., Tamburic, L., Brauer, M., 2011. Long-term exposure to traffic-related air pollution and the risk of coronary heart disease hospitalization and mortality. Environmental Health Perspectives 119 (4), 501–507.
Ganji, A., Minet, L., Weichenthal, S., Hatzopoulou, M., 2020. Predicting traffic-related air pollution using feature extraction from built environment images. Environmental Science & Technology 54 (17), 10688–10699.
Garcia, J.M., Teodoro, F., Cerdeira, R., Coelho, L.M.R., Kumar, P., Carvalho, M.G., 2016. Developing a methodology to predict PM10 concentrations in urban areas using generalized linear models. Environmental Technology 37 (18), 2316–2325.
Gerling, L., Löschau, G., Wiedensohler, A., Weber, S., 2020. Statistical modelling of roadside and urban background ultrafine and accumulation mode particle number concentrations using generalized additive models. Science of the Total Environment 703, 134570.
Ghassoun, Y., Löwner, M.O., Weber, S., 2019. Wind direction related parameters improve the performance of a land use regression model for ultrafine particles. Atmospheric Pollution Research 10 (4), 1180–1189.
Goldberg, D.L., Lu, Z., Streets, D.G., de Foy, B., Griffin, D., McLinden, C.A., Lamsal, L.N., Krotkov, N.A., Eskes, H., 2019. Enhanced capabilities of TROPOMI NO2: estimating NOx from North American cities and power plants. Environmental Science & Technology 53 (21), 12594–12601.
Gujral, H., Sinha, A., 2021. Association between exposure to airborne pollutants and COVID-19 in Los Angeles, United States with ensemble-based dynamic emission model. Environmental Research 194, 110704.
Guo, R., Qi, Y., Zhao, B., Pei, Z., Wen, F., Wu, S., Zhang, Q., 2022. High-Resolution Urban Air Quality Mapping for Multiple Pollutants Based on Dense Monitoring Data and Machine Learning. International Journal of Environmental Research and Public Health 19 (13), 8005.
Guo, B., Wang, X., Zhang, D., Pei, L., Zhang, D., Wang, X., 2020. A land use regression application into simulating spatial distribution characteristics of particulate matter (PM2.5) concentration in city of Xi’an, China. Polish Journal of Environmental Studies 29 (6).
Guo, B., Wang, X., Pei, L., Su, Y., Zhang, D., Wang, Y., 2021a. Identifying the spatiotemporal dynamic of PM2.5 concentrations at multiple scales using geographically and temporally weighted regression model across China during 2015–2018. Science of the Total Environment 751, 141765.
Guo, B., Wang, Y., Pei, L., Yu, Y., Liu, F., Zhang, D., Wang, X., Su, Y., Zhang, D., Zhang, B., Guo, H., 2021b. Determining the effects of socioeconomic and environmental determinants on chronic obstructive pulmonary disease (COPD) mortality using geographically and temporally weighted regression model across Xi’an during 2014–2016. Science of the Total Environment 756, 143869.
Guttikunda, S.K., Goel, R., Pant, P., 2014. Nature of air pollution, emission sources, and management in the Indian cities. Atmospheric Environment 95, 501–510.
Guttikunda, S.K., Gurjar, B.R., 2012. Role of meteorology in seasonality of air pollution in megacity Delhi, India. Environmental Monitoring and Assessment 184 (5), 3199–3211.
Han, P., Mei, H., Liu, D., Zeng, N., Tang, X., Wang, Y., Pan, Y., 2021. Calibrations of Low-Cost Air Pollution Monitoring Sensors for CO, NO2, O3, and SO2. Sensors 21 (1), 256.
Han, L., Zhao, J., Gao, Y., Gu, Z., Xin, K., Zhang, J., 2020. Spatial distribution characteristics of PM2.5 and PM10 in Xi’an City predicted by land use regression models. Sustainable Cities and Society 61, 102329.
Hankey, S., Marshall, J.D., 2015. Land use regression models of on-road particulate air pollution (particle number, black carbon, PM2.5, particle size) using mobile monitoring. Environmental Science & Technology 49 (15), 9194–9202.
Hankey, S., Sforza, P., Pierson, M., 2019. Using mobile monitoring to develop hourly empirical models of particulate air pollution in a rural Appalachian community. Environmental Science & Technology 53 (8), 4305–4315.
Hao, J., He, K., Duan, L., Li, J., Wang, L., 2007. Air pollution and its control in China. Frontiers of Environmental Science & Engineering in China 1 (2), 129–142.
Hassanpour Matikolaei, S.A.H., Jamshidi, H., Samimi, A., 2019. Characterizing the effect of traffic density on ambient CO, NO2, and PM2.5 in Tehran, Iran: an hourly land-use regression model. Transportation Letters 11 (8), 436–446.
Hatzopoulou, M., Valois, M.F., Levy, I., Mihele, C., Lu, G., Bagg, S., Minet, L., Brook, J., 2017. Robustness of land-use regression models developed from mobile air pollutant measurements. Environmental Science & Technology 51 (7), 3938–3947.
He, B., Heal, M.R., Reis, S., 2018. Land-use regression modelling of intra-urban air pollution variation in China: current status and future needs. Atmosphere 9 (4), 134.
Henderson, S.B., Beckerman, B., Jerrett, M., Brauer, M., 2007. Application of land use regression to estimate long-term concentrations of traffic-related nitrogen oxides and fine particulate matter. Environmental Science & Technology 41 (7), 2422–2428.
Ho, C.C., Chan, C.C., Cho, C.W., Lin, H.I., Lee, J.H., Wu, C.F., 2015. Land use regression modeling with vertical distribution measurements for fine particulate matter and elements in an urban area. Atmospheric Environment 104, 256–263.
Hoek, G., Beelen, R., De Hoogh, K., Vienneau, D., Gulliver, J., Fischer, P., Briggs, D., 2008. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmospheric Environment 42 (33), 7561–7578.
Hoek, G., Beelen, R., Kos, G., Dijkema, M., Zee, S.C.V.D., Fischer, P.H., Brunekreef, B., 2011. Land use regression model for ultrafine particles in Amsterdam. Environmental Science & Technology 45 (2), 622–628.
Hu, K., Rahman, A., Bhrugubanda, H., Sivaraman, V., 2017. HazeEst: Machine learning based metropolitan air pollution estimation from fixed and mobile sensors. IEEE Sensors Journal 17 (11), 3517–3525.
Hu, X., Waller, L.A., Al-Hamdan, M.Z., Crosson, W.L., Estes Jr, M.G., Estes, S.M., Quattrochi, D.A., Sarnat, J.A., Liu, Y., 2013. Estimating ground-level PM2.5 concentrations in the southeastern US using geographically weighted regression. Environmental Research 121, 1–10.
Huang, C., Hu, J., Xue, T., Xu, H., Wang, M., 2021. High-resolution spatiotemporal modeling for ambient PM2.5 exposure assessment in China from 2013 to 2019. Environmental Science & Technology 55 (3), 2152–2162.
Huang, C.S., Lin, T.H., Hung, H., Kuo, C.P., Ho, C.C., Guo, Y.L., Chen, K.C., Wu, C.F., 2019. Incorporating satellite-derived data with annual and monthly land use regression models for estimating spatial distribution of air pollution. Environmental Modelling & Software 114, 181–187.
Huang, L., Zhang, C., Bi, J., 2017. Development of land use regression models for PM2.5, SO2, NO2 and O3 in Nanjing, China. Environmental Research 158, 542–552.
Huang, L., Qian, H., Deng, S., Guo, J., Li, Y., Zhao, W., Yue, Y., 2018. Urban residential indoor volatile organic compounds in summer, Beijing: Profile, concentration and source characterization. Atmospheric Environment 188, 1–11.
Huang, C., Sun, K., Hu, J., Xue, T., Xu, H., Wang, M., 2022. Estimating 2013–2019 NO2 exposure with high spatiotemporal resolution in China using an ensemble model. Environmental Pollution 292, 118285.
Jerrett, M., Arain, A., Kanaroglou, P., Beckerman, B., Potoglou, D., Sahsuvaroglu, T., Morrison, J., Giovis, C., 2005. A review and evaluation of intraurban air pollution exposure models. Journal of Exposure Science & Environmental Epidemiology 15 (2), 185–204.
Ji, G., Tian, L., Zhao, J., Yue, Y., Wang, Z., 2019. Detecting spatiotemporal dynamics of PM2.5 emission data in China using DMSP-OLS nighttime stable light data. Journal of Cleaner Production 209, 363–370.
Jin, L., Berman, J.D., Warren, J.L., Levy, J.I., Thurston, G., Zhang, Y., Xu, X., Wang, S., Zhang, Y., Bell, M.L., 2019. A land use regression model of nitrogen dioxide and fine particulate matter in a complex urban core in Lanzhou, China. Environmental Research 177, 108597.
Just, A.C., Wright, R.O., Schwartz, J., Coull, B.A., Baccarelli, A.A., Tellez-Rojo, M.M., Moody, E., Wang, Y., Lyapustin, A., Kloog, I., 2015. Using high-resolution satellite aerosol optical depth to estimate daily PM2.5 geographical distribution in Mexico City. Environmental Science & Technology 49 (14), 8576–8584.
Keller, J.P., Olives, C., Kim, S.Y., Sheppard, L., Sampson, P.D., Szpiro, A.A., Oron, A.P., Lindström, J., Vedal, S., Kaufman, J.D., 2015. A unified spatiotemporal modeling approach for predicting concentrations of multiple air pollutants in the multi-ethnic study of atherosclerosis and air pollution. Environmental Health Perspectives 123 (4), 301–309.
Kerckhoffs, J., Hoek, G., Messier, K.P., Brunekreef, B., Meliefste, K., Klompmaker, J.O., Vermeulen, R., 2016. Comparison of ultrafine particle and black carbon concentration predictions from a mobile and short-term stationary land-use regression model. Environmental Science & Technology 50 (23), 12894–12902.
Kerckhoffs, J., Hoek, G., Portengen, L., Brunekreef, B., Vermeulen, R.C., 2019. Performance of prediction algorithms for modeling outdoor air pollution spatial surfaces. Environmental Science & Technology 53 (3), 1413–1421.
Kerckhoffs, J., Khan, J., Hoek, G., Yuan, Z., Ellermann, T., Hertel, O., Ketzel, M., Jensen, S.S., Meliefste, K., Vermeulen, R., 2022. Mixed-Effects Modeling Framework for Amsterdam and Copenhagen for Outdoor NO2 Concentrations Using Measurements Sampled with Google Street View Cars. Environmental Science & Technology.

Kheirbek, I., Johnson, S., Ross, Z., Pezeshki, G., Ito, K., Eisl, H., Matte, T., 2012. Spatial variability in levels of benzene, formaldehyde, and total benzene, toluene, ethylbenzene and xylenes in New York City: a land-use regression study. Environmental Health 11, 1–12.
Kim, M., Brunner, D., Kuhlmann, G., 2021. Importance of satellite observations for high-resolution mapping of near-surface NO2 by machine learning. Remote Sensing of Environment 264, 112573.
Kisi, O., Parmar, K.S., Soni, K., Demir, V., 2017. Modeling of air pollutants using least square support vector regression, multivariate adaptive regression spline, and M5 model tree models. Air Quality, Atmosphere & Health 10 (7), 873–883.
Kloog, I., Koutrakis, P., Coull, B.A., Lee, H.J., Schwartz, J., 2011. Assessing temporally and spatially resolved PM2.5 exposures for epidemiological studies using satellite aerosol optical depth measurements. Atmospheric Environment 45 (35), 6267–6275.
Kloog, I., Chudnovsky, A.A., Just, A.C., Nordio, F., Koutrakis, P., Coull, B.A., Lyapustin, A., Wang, Y., Schwartz, J., 2014. A new hybrid spatio-temporal model for estimating daily multi-year PM2.5 concentrations across northeastern USA using high resolution aerosol optical depth data. Atmospheric Environment 95, 581–590.
Knibbs, L.D., Hewson, M.G., Bechle, M.J., Marshall, J.D., Barnett, A.G., 2014. A national satellite-based land-use regression model for air pollution exposure assessment in Australia. Environmental Research 135, 204–211.
Korek, M., Johansson, C., Svensson, N., Lind, T., Beelen, R., Hoek, G., Pershagen, G., Bellander, T., 2017. Can dispersion modeling of air pollution be improved by land-use regression? An example from Stockholm, Sweden. Journal of Exposure Science & Environmental Epidemiology 27 (6), 575–581.
Kummu, M., Taka, M., Guillaume, J.H., 2018. Gridded global datasets for gross domestic product and Human Development Index over 1990–2015. Scientific Data 5 (1), 1–15.
Larkin, A., Geddes, J.A., Martin, R.V., Xiao, Q., Liu, Y., Marshall, J.D., Brauer, M., Hystad, P., 2017. Global land use regression model for nitrogen dioxide air pollution. Environmental Science & Technology 51 (12), 6957–6964.
Lautenschlager, F., Becker, M., Kobs, K., Steininger, M., Davidson, P., Krause, A., Hotho, A., 2020. OpenLUR: Off-the-shelf air pollution modeling with open features and machine learning. Atmospheric Environment 233, 117535.
Lee, H.J., 2019. Benefits of high resolution PM2.5 prediction using satellite MAIAC AOD and land use regression for exposure assessment: California examples. Environmental Science & Technology 53 (21), 12774–12783.
Liu, Z., Guan, Q., Luo, H., Wang, N., Pan, N., Yang, L., Xiao, S., Lin, J., 2019b. Development of land use regression model and health risk assessment for NO2 in different functional areas: A case study of Xi’an, China. Atmospheric Environment 213, 515–525.
Liu, Z., Guan, Q., Lin, J., Yang, L., Luo, H., Wang, N., 2021b. A new buffer selection strategy for land use regression model of PM2.5 in Xi’an, China. Environmental Science and Pollution Research 28 (17), 21245–21255.
Liu, H., He, K., 2016. China keeps carrying forward the key special project of “Air Pollution Causes and Control”. Frontiers of Environmental Science & Engineering 10 (5), 18.
Liu, W., Li, X., Chen, Z., Zeng, G., León, T., Liang, J., Huang, G., Gao, Z., Jiao, S., He, X., Lai, M., 2015. Land use regression models coupled with meteorology to model spatial and temporal variability of NO2 and PM10 in Changsha, China. Atmospheric Environment 116, 272–280.
Liu, M., Peng, X., Meng, Z., Zhou, T., Long, L., She, Q., 2019a. Spatial characteristics and determinants of in-traffic black carbon in Shanghai, China: Combination of mobile monitoring and land use regression model. Science of the Total Environment 658, 51–61.
Lu, M., Soenario, I., Helbich, M., Schmitz, O., Hoek, G., van der Molen, M., Karssenberg, D., 2020. Land use regression models revealing spatiotemporal co-variation in NO2, NO, and O3 in the Netherlands. Atmospheric Environment 223, 117238.
Lv, L., Xiang, Y., Zhang, T., Chai, W., Liu, W., 2020. A synergistic approach for regional particle pollution tracking using multiple mobile vehicle-based lidars. Atmospheric Environment 233, 117585.
Ma, R., Ban, J., Wang, Q., Li, T., 2020a. Statistical spatial-temporal modeling of ambient ozone exposure for environmental epidemiology studies: A review. Science of the Total Environment 701, 134463.
Ma, R., Ban, J., Wang, Q., Zhang, Y., Yang, Y., He, M.Z., Li, S., Shi, W., Li, T., 2021b. Random forest model based fine scale spatiotemporal O3 trends in the Beijing-Tianjin-Hebei region in China, 2010 to 2017. Environmental Pollution 276, 116635.
Ma, Z., Dey, S., Christopher, S., Liu, R., Bi, J., Balyan, P., Liu, Y., 2022b. A review of statistical methods used for developing large-scale and long-term PM2.5 models from satellite data. Remote Sensing of Environment 269, 112827.
Ma, X., Longley, I., Gao, J., Kachhara, A., Salmond, J., 2019. A site-optimised multi-scale GIS based land use regression model for simulating local scale patterns in air
Lee, H.J., Chatfield, R.B., Strawa, A.W., 2016. Enhancing the applicability of satellite pollution. Science of the Total Environment 685, 134–149.
remote sensing for PM2.5 estimation using MODIS deep blue AOD and land use Ma, X., Longley, I., Gao, J., Salmond, J., 2020b. Evaluating the Effect of Ambient
regression in California, United States. Environmental Science & Technology 50 Concentrations, Route Choices, and Environmental (in) Justice on Students’ Dose of
(12), 6546–6555. Ambient NO2 While Walking to School at Population Scales. Environmental Science
Leung, Y., Zhou, Y., Lam, K.Y., Fung, T., Cheung, K.Y., Kim, T., Jung, H., 2019. & Technology 54 (20), 12908–12919.
Integration of air pollution data collected by mobile sensors and ground-based Ma, X., Longley, I., Gao, J., Salmond, J., 2020c. Assessing schoolchildren’s exposure to
stations to derive a spatiotemporal air pollution profile of a city. International air pollution during the daily commute-A systematic review. Science of the Total
Journal of Geographical Information Science 33 (11), 2218–2240. Environment, 140389.
Li, R. 2022. Easy land use regression software (eLUR) for detailed and fast air pollution Ma, X., Longley, I., Salmond, J., Gao, J., 2020d. PyLUR: Efficient software for land use
modeling. https://pan.baidu.com/s/15ggXOerrb0GBilFHTTL30w, Code: hdvz. regression modeling the spatial distribution of air pollutants using GDAL/OGR
Li, H.Z., Dallmann, T.R., Li, X., Gu, P., Presto, A.A., 2018a. Urban organic aerosol library in Python. Frontiers of Environmental Science & Engineering 14 (3), 44.
exposure: spatial variations in composition and source impacts. Environmental Ma, X., Gao, J., Longley, I., Zou, B., Guo, B., Xu, X., Salmond, J., 2022a. Development of
Science & Technology 52 (2), 415–426. transferable neighborhood land use regression models for predicting intra-urban
Li, J., Huang, L., Han, B., van der Kuijp, T.J., Xia, Y., Chen, K., 2021. Exposure and ambient nitrogen dioxide (NO2) pollution exposure. Environmental Science and
perception of PM2.5 pollution on the mental stress of pregnant women. Environment Pollution Research 1–16.
International 156, 106686. Ma, M., Yao, G., Guo, J., Bai, K., 2021a. Distinct spatiotemporal variation patterns of
Li, X., Liu, W., Chen, Z., Zeng, G., Hu, C., León, T., Liang, J., Huang, G., Gao, Z., Li, Z., surface ozone in China due to diverse influential factors. Journal of Environmental
Yan, W., 2015. The application of semicircular-buffer-based land use regression Management 288, 112368.
models incorporating wind direction in predicting quarterly NO2 and PM10 MacIntyre, E.A., Karr, C.J., Koehoorn, M., Demers, P.A., Tamburic, L., Lencar, C.,
concentrations. Atmospheric Environment 103, 18–24. Brauer, M., 2011. Residential air pollution and otitis media during the first two years
Li, R., Ma, T., Xu, Q., Song, X., 2018b. Using MAIAC AOD to verify the PM2.5 spatial of life. Epidemiology 81–89.
patterns of a land use regression model. Environmental Pollution 243, 501–509. Maddix, M., Adams, M.D., 2020. Effects of spatial sampling density and spatial extent on
Li, S., Zhai, L., Zou, B., Sang, H., Fang, X., 2017. A generalized additive model combining linear land use regression modelling of NO2 estimates in an automobile-oriented
principal component analysis for PM2.5 concentration estimation. ISPRS city. Atmospheric Environment 238, 117735.
International Journal of Geo-Information 6 (8), 248. Madsen, C., Carlsen, K.C.L., Hoek, G., Oftedal, B., Nafstad, P., Meliefste, K., Jacobsen, R.,
Li, Z., Zhang, H., Wen, C.Y., Yang, A.S., Juan, Y.H., 2020. Effects of frontal area density Nystad, W., Carlsen, K.H., Brunekreef, B., 2007. Modeling the intra-urban variability
on outdoor thermal comfort and air quality. Building and Environment 180, 107028. of outdoor traffic pollution in Oslo, Norway—A GA2LEN project. Atmospheric
Li, S., Zou, B., Ma, X., Liu, N., Zhang, Z., Xie, M., Zhi, L., 2023. Improving air quality Environment 41 (35), 7500–7511.
through urban form optimization: A review study. Building and Environment Mahanta, S., Ramakrishnudu, T., Jha, R.R., Tailor, N., 2019, October.. Urban air quality
110685. prediction using regression analysis. In: TENCON 2019–2019 IEEE Region 10
Lim, C.C., Hayes, R.B., Ahn, J., Shao, Y., Silverman, D.T., Jones, R.R., Garcia, C., Conference (TENCON). IEEE, pp. 1118–1123.
Thurston, G.D., 2018. Association between long-term exposure to ambient air Mandal, S., Madhipatla, K., Prabhakaran, D., Schwartz, J., 2019. High resolution
pollution and diabetes mortality in the US. Environmental Research 165, 330–336. spatiotemporal assessment of ambient air pollution using ensemble modeling and
Lim, C.C., Kim, H., Vilcassim, M.R., Thurston, G.D., Gordon, T., Chen, L.C., Lee, K., links with hypertension in a Delhi based cohort. Environmental Epidemiology 3,
Heimbinder, M., Kim, S.Y., 2019. Mapping urban air quality using mobile sampling 259.
with low-cost sensors and machine learning in Seoul. South Korea. Environment Mandal, S., Madhipatla, K.K., Guttikunda, S., Kloog, I., Prabhakaran, D., Schwartz, J.D.
International 131, 105022. and Team, G.H.I., 2020. Ensemble averaging based assessment of spatiotemporal
Lindström, J., Szpiro, A., Sampson, P.D., Bergen, S. and Sheppard, L., 2013. variations in ambient PM2.5 concentrations over Delhi, India, during 2010–2016.
Spatiotemporal: An r package for spatio-temporal modelling of air-pollution. J stat Atmospheric Environment 224, 117309.
softw (in press)(http://cran. rproject. org/web/packages/SpatioTemporal/index. Mao, F., Khamis, K., Krause, S., Clark, J., Hannah, D.M., 2019. Low-cost environmental
html). sensor networks: recent advances and future directions. Frontiers in Earth Science 7,
Liu, J., 2021. Mapping high resolution national daily NO2 exposure across mainland 221.
China using an ensemble algorithm. Environmental Pollution 279, 116932. Mao, L., Qiu, Y., Kusano, C., Xu, X., 2012. Predicting regional space–time variation of
Liu, J., Banerjee, S., Oroumiyeh, F., Shen, J., Del Rosario, I., Lipsitt, J., Paulson, S., PM2.5 with land-use regression model and MODIS data. Environmental Science and
Ritz, B., Su, J., Weichenthal, S., Lakey, P., 2022. Cokriging with a low-cost sensor Pollution Research 19 (1), 128–138.
network to estimate spatial variation of brake and tire-wear metals and oxidative Marcon, A., de Hoogh, K., Gulliver, J., Beelen, R., Hansell, A.L., 2015. Development and
stress potential in Southern California. Environment International 168, 107481. transferability of a nitrogen dioxide land use regression model within the Veneto
Liu, M., Chen, H., Wei, D., Wu, Y., Li, C., 2021a. Nonlinear relationship between urban region of Italy. Atmospheric Environment 122, 696–704.
form and street-level PM2.5 and CO based on mobile measurements and gradient Marcos, R., González-Reviriego, N., Torralba, V., Soret, A., Doblas-Reyes, F.J., 2019.
boosting decision tree models. Building and Environment 205, 108265. Characterization of the near surface wind speed distribution at global scale: Era-

18
X. Ma et al. Environment International 183 (2024) 108430

interim reanalysis and ecmwf seasonal forecasting system 4. Climate Dynamics 52 Nelder, J.A., Wedderburn, R.W., 1972. Generalized linear models. Journal of the Royal
(5), 3307–3319. Statistical Society: Series A (general) 135 (3), 370–384.
Marjovi, A., Arfire, A. and Martinoli, A., 2015, June. High resolution air pollution maps Nieto, P.G., Antón, J.Á., 2014. Nonlinear air quality modeling using multivariate
in urban environments using mobile sensor networks. In 2015 International adaptive regression splines in Gijón urban area (Northern Spain) at local scale.
Conference on Distributed Computing in Sensor Systems (pp. 11-20). IEEE. Applied Mathematics and Computation 235, 50–65.
Masiol, M., Zíková, N., Chalupa, D.C., Rich, D.Q., Ferro, A.R., Hopke, P.K., 2018. Hourly Nori-Sarma, A., Thimmulappa, R.K., Venkataramana, G.V., Fauzie, A.K., Dey, S.K.,
land-use regression models based on low-cost PM monitor data. Environmental Venkareddy, L.K., Berman, J.D., Lane, K.J., Fong, K.C., Warren, J.L., Bell, M.L., 2020.
Research 167, 7–14. Low-cost NO2 monitoring and predictions of urban exposure using universal kriging
Masiol, M., Squizzato, S., Chalupa, D., Rich, D.Q., Hopke, P.K., 2019. Spatial-temporal and land-use regression modelling in Mysore. India. Atmospheric Environment 226,
variations of summertime ozone concentrations across a metropolitan area using a 117395.
network of low-cost monitors to develop 24 hourly land-use regression models. Noth, E.M., Hammond, S.K., Biging, G.S., Tager, I.B., 2011. A spatial-temporal regression
Science of the Total Environment 654, 1167–1178. model to predict daily outdoor residential PAH concentrations in an epidemiologic
Meng, X., Fu, Q., Ma, Z., Chen, L., Zou, B., Zhang, Y., Xue, W., Wang, J., Wang, D., study in Fresno. CA. Atmospheric Environment 45 (14), 2394–2403.
Kan, H., Liu, Y., 2016. Estimating ground-level PM10 in a Chinese city by combining Novotny, E.V., Bechle, M.J., Millet, D.B., Marshall, J.D., 2011. National satellite-based
satellite data, meteorological information and a land use regression model. land-use regression: NO2 in the United States. Environmental Science & Technology
Environmental Pollution 208, 177–184. 45 (10), 4407–4414.
Meng, X., Hand, J.L., Schichtel, B.A., Liu, Y., 2018. Space-time trends of PM2.5 Olvera, H.A., Garcia, M., Li, W.W., Yang, H., Amaya, M.A., Myers, O., Burchiel, S.W.,
constituents in the conterminous United States estimated by a machine learning Berwick, M., Pingitore Jr, N.E., 2012. Principal component analysis optimization of a
approach, 2005–2015. Environment International 121, 1137–1147. PM2.5 land use regression model with small monitoring network. Science of the
Mercer, L.D., Szpiro, A.A., Sheppard, L., Lindström, J., Adar, S.D., Allen, R.W., Avol, E.L., Total Environment 425, 27–34.
Oron, A.P., Larson, T., Liu, L.J.S., Kaufman, J.D., 2011. Comparing universal kriging Owusu, P.A., Sarkodie, S.A., 2020. Global estimation of mortality, disability-adjusted life
and land-use regression for predicting concentrations of gaseous oxides of nitrogen years and welfare cost from exposure to ambient air pollution. Science of the Total
(NOx) for the Multi-Ethnic Study of Atherosclerosis and Air Pollution (MESA Air). Environment 742, 140636.
Atmospheric Environment 45 (26), 4412–4420. Oyjinda, P. and Pochai, N., 2017. Numerical simulation to air pollution emission control
Messier, K.P., Chambliss, S.E., Gani, S., Alvarez, R., Brauer, M., Choi, J.J., Hamburg, S.P., near an industrial zone. Advances in Mathematical Physics, 2017.
Kerckhoffs, J., LaFranchi, B., Lunden, M.M., Marshall, J.D., 2018. Mapping air Park, Y., Kwon, B., Heo, J., Hu, X., Liu, Y., Moon, T., 2020. Estimating PM2.5
pollution with Google Street View cars: Efficient approaches with mobile monitoring concentration of the conterminous United States via interpretable convolutional
and land use regression. Environmental Science & Technology 52 (21), neural networks. Environmental Pollution 256, 113395.
12563–12572. Patton, A.P., Zamore, W., Naumova, E.N., Levy, J.I., Brugge, D., Durant, J.L., 2015.
Miao, C., Yu, S., Hu, Y., Zhang, H., He, X., Chen, W., 2020. Review of methods used to Transferability and generalizability of regression models of ultrafine particles in
estimate the sky view factor in urban street canyons. Building and Environment 168, urban neighborhoods in the Boston area. Environmental Science & Technology 49
106497. (10), 6051–6060.
Michanowicz, D.R., Shmool, J.L., Cambal, L., Tunno, B.J., Gillooly, S., Hunt, M.J.O., Polat, E., Gunay, S., 2015. The comparison of partial least squares regression, principal
Tripathy, S., Shields, K.N., Clougherty, J.E., 2016a. A hybrid land use regression/ component regression and ridge regression with multiple linear regression for
line-source dispersion model for predicting intra-urban NO2. Transportation predicting pm10 concentration level based on meteorological parameters. Journal of
Research Part d: Transport and Environment 43, 181–191. Data Science 13 (4), 663–692.
Michanowicz, D.R., Shmool, J.L., Tunno, B.J., Tripathy, S., Gillooly, S., Kinnee, E., Poplawski, K., Gould, T., Setton, E., Allen, R., Su, J., Larson, T., Henderson, S.,
Clougherty, J.E., 2016b. A hybrid land use regression/AERMOD model for Brauer, M., Hystad, P., Lightowlers, C., Keller, P., 2009. Intercity transferability of
predicting intra-urban variation in PM2.5. Atmospheric Environment 131, 307–315. land use regression models for estimating ambient concentrations of nitrogen
Miller, D.J., Actkinson, B., Padilla, L., Griffin, R.J., Moore, K., Lewis, P.G.T., Gardner- dioxide. Journal of Exposure Science & Environmental Epidemiology 19 (1),
Frolick, R., Craft, E., Portier, C.J., Hamburg, S.P., Alvarez, R.A., 2020. Characterizing 107–117.
elevated urban air pollutant spatial patterns with mobile monitoring in Houston. PurpleAir, 2021. PurpleAir — Real Time Air Quality Monitoring. PurpleAir, Inc.
Texas. Environmental Science & Technology 54 (4), 2133–2142. Qi, M., Hankey, S., 2021. Using street view imagery to predict street-level particulate air
Miller, K.A., Siscovick, D.S., Sheppard, L., Shepherd, K., Sullivan, J.H., Anderson, G.L., pollution. Environmental Science & Technology 55 (4), 2695–2704.
Kaufman, J.D., 2007. Long-term exposure to air pollution and incidence of Qi, M., Dixit, K., Marshall, J.D., Zhang, W., Hankey, S., 2022. National Land Use
cardiovascular events in women. New England Journal of Medicine 356 (5), Regression Model for NO2 Using Street View Imagery and Satellite Observations.
447–458. Environmental Science & Technology 56 (18), 13499–13509.
Minet, L., Liu, R., Valois, M.F., Xu, J., Weichenthal, S., Hatzopoulou, M., 2018. Rahman, M.M., Karunasinghe, J., Clifford, S., Knibbs, L.D., Morawska, L., 2020. New
Development and comparison of air pollution exposure surfaces derived from on- insights into the spatial distribution of particle number concentrations by applying
road mobile monitoring and short-term stationary sidewalk measurements. non-parametric land use regression modelling. Science of the Total Environment
Environmental Science & Technology 52 (6), 3512–3519. 702, 134708.
Mirzaei, M., Bertazzon, S., Couloigner, I., Farjad, B., Ngom, R., 2020. Estimation of local Ren, X., Mi, Z., Georgopoulos, P.G., 2020. Comparison of Machine Learning and Land
daily PM2.5 concentration during wildfire episodes: integrating MODIS AOD with Use Regression for fine scale spatiotemporal estimation of ambient air pollution:
multivariate linear mixed effect (LME) models. Air Quality, Atmosphere & Health 13 Modeling ozone concentrations across the contiguous United States. Environment
(2), 173–185. International 142, 105827.
Miskell, G., Salmond, J., Longley, I., Dirks, K.N., 2015. A novel approach in quantifying Renzi, M., Cerza, F., Gariazzo, C., Agabiti, N., Cascini, S., Di Domenicantonio, R.,
the effect of urban design features on local-scale air pollution in central urban areas. Davoli, M., Forastiere, F., Cesaroni, G., 2018. Air pollution and occurrence of type 2
Environmental Science & Technology 49 (15), 9004–9011. diabetes in a large cohort study. Environment International 112, 68–76.
Miskell, G., Salmond, J., Williams, D.E., 2017. Low-cost sensors and crowd-sourced data: Reyes, J.M., Serre, M.L., 2014. An LUR/BME framework to estimate PM2.5 explained by
Observations of siting impacts on a network of air-quality instruments. Science of the on road mobile and stationary sources. Environmental Science & Technology 48 (3),
Total Environment 575, 1119–1129. 1736–1744.
Miskell, G., Salmond, J.A., Williams, D.E., 2018a. Solution to the problem of calibration Robinson, D.P., Lloyd, C.D., McKinley, J.M., 2013. Increasing the accuracy of nitrogen
of low-cost air quality measurement sensors in networks. ACS Sensors 3 (4), dioxide (NO2) pollution mapping using geographically weighted regression (GWR)
832–843. and geostatistics. International Journal of Applied Earth Observation and
Miskell, G., Salmond, J.A., Williams, D.E., 2018b. Use of a handheld low-cost sensor to Geoinformation 21, 374–383.
explore the effect of urban design features on local-scale spatial and temporal air Robinson, E.S., Shah, R.U., Messier, K., Gu, P., Li, H.Z., Apte, J.S., Robinson, A.L.,
quality variability. Science of the Total Environment 619, 480–490. Presto, A.A., 2019. Land-use regression modeling of source-resolved fine particulate
Mogollón-Sotelo, C., Casallas, A., Vidal, S., Celis, N., Ferro, C., Belalcazar, L., 2021. matter components from mobile sampling. Environmental Science & Technology 53
A support vector machine model to forecast ground-level PM2.5 in a highly populated (15), 8925–8937.
city with a complex terrain. Air Quality, Atmosphere & Health 14 (3), 399–409. Rose, N., Cowie, C., Gillett, R., Marks, G.B., 2011. Validation of a spatiotemporal land use
Mölter, A., 2020. XLUR: A land use regression wizard for ArcGIS Pro. Journal of Open regression model incorporating fixed site monitors. Environmental Science &
Source Software 5 (50), 2177. Technology 45 (1), 294–299.
Mölter, A., Lindley, S., 2021. Developing land use regression models for environmental Roy, S.S., Pratyush, C., Barna, C., 2016. In: August. Predicting Ozone Layer
science research using the XLUR tool–more than a one-trick pony. Environmental Concentration Using Multivariate Adaptive Regression Splines, Random Forest and
Modelling & Software, 105108. Classification and Regression Tree. Springer, Cham, pp. 140–152.
Morley, D.W., Gulliver, J., 2018. A land use regression variable generation, modelling Ryan, P.H., LeMasters, G.K., 2007. A review of land-use regression models for
and prediction tool for air pollution exposure assessment. Environmental Modelling characterizing intraurban air pollution exposure. Inhalation Toxicology 19 (sup1),
& Software 105, 17–23. 127–133.
Munir, S., Mayfield, M., Coca, D., Mihaylova, L.S., 2020. A nonlinear land use regression Rybarczyk, Y., Zalakeviciute, R., 2018. Machine learning approaches for outdoor air
approach for modelling NO2 concentrations in urban areas—Using data from low- quality modelling: A systematic review. Applied Sciences 8 (12), 2570.
cost sensors and diffusion tubes. Atmosphere 11 (7), 736. Saha, P.K., Li, H.Z., Apte, J.S., Robinson, A.L., Presto, A.A., 2019. Urban ultrafine particle
Naughton, O., Donnelly, A., Nolan, P., Pilla, F., Misstear, B.D., Broderick, B., 2018. exposure assessment with land-use regression: influence of sampling strategy.
A land use regression model for explaining spatial variation in air pollution levels Environmental Science & Technology 53 (13), 7326–7336.
using a wind sector based approach. Science of the Total Environment 630, Sampson, P.D., Szpiro, A.A., Sheppard, L., Lindström, J., Kaufman, J.D., 2011. Pragmatic
1324–1334. estimation of a spatio-temporal air quality model with irregular monitoring data.
Atmospheric Environment 45 (36), 6593–6606.


Sampson, P.D., Richards, M., Szpiro, A.A., Bergen, S., Sheppard, L., Larson, T.V., Kaufman, J.D., 2013. A regionalized national universal kriging model using Partial Least Squares regression for estimating annual PM2.5 concentrations in epidemiology. Atmospheric Environment 75, 383–392.
Shafran-Nathan, R., Etzion, Y., Broday, D.M., 2021. Fusion of land use regression modeling output and wireless distributed sensor network measurements into a high spatiotemporally-resolved NO2 product. Environmental Pollution 271, 116334.
Shahraiyni, H.T., Shahsavani, D., Sargazi, S., Habibi-Nokhandan, M., 2015. Evaluation of MARS for the spatial distribution modeling of carbon monoxide in an urban area. Atmospheric Pollution Research 6 (4), 581–588.
Shao, Y., Ma, Z., Wang, J., Bi, J., 2020. Estimating daily ground-level PM2.5 in China with random-forest-based spatiotemporal kriging. Science of the Total Environment 740, 139761.
She, Q., Choi, M., Belle, J.H., Xiao, Q., Bi, J., Huang, K., Meng, X., Geng, G., Kim, J., He, K., Liu, M., 2020. Satellite-based estimation of hourly PM2.5 levels during heavy winter pollution episodes in the Yangtze River Delta, China. Chemosphere 239, 124678.
Shen, L., Zhao, T., Wang, H., Liu, J., Bai, Y., Kong, S., Zheng, H., Zhu, Y., Shu, Z., 2021. Importance of meteorology in air pollution events during the city lockdown for COVID-19 in Hubei Province, Central China. Science of the Total Environment 754, 142227.
Shi, Y., Lau, K.K.L., Ng, E., 2016. Developing street-level PM2.5 and PM10 land use regression models in high-density Hong Kong with urban morphological factors. Environmental Science & Technology 50 (15), 8178–8187.
Shi, Y., Xie, X., Fung, J.C.H., Ng, E., 2018. Identifying critical building morphological design factors of street-level air pollution dispersion in high-density built environment using mobile monitoring. Building and Environment 128, 248–259.
Simon, M.C., Patton, A.P., Naumova, E.N., Levy, J.I., Kumar, P., Brugge, D., Durant, J.L., 2018. Combining measurements from mobile monitoring and a reference site to develop models of ambient ultrafine particle number concentration at residences. Environmental Science & Technology 52 (12), 6985–6995.
Singh, K.P., Gupta, S., Kumar, A., Shukla, S.P., 2012. Linear and nonlinear modeling approaches for urban air quality prediction. Science of the Total Environment 426, 244–255.
Smith, K.R., Edwards, P.M., Evans, M.J., Lee, J.D., Shaw, M.D., Squires, F., Wilde, S., Lewis, A.C., 2017. Clustering approaches to improve the performance of low cost air pollution sensors. Faraday Discussions 200, 621–637.
Snyder, E.G., Watkins, T.H., Solomon, P.A., Thoma, E.D., Williams, R.W., Hagler, G.S., Shelow, D., Hindin, D.A., Kilaru, V.J., Preuss, P.W., 2013. The changing paradigm of air pollution monitoring. Environmental Science & Technology 47 (20), 11369–11377.
Son, Y., Osornio-Vargas, Á.R., O’Neill, M.S., Hystad, P., Texcalac-Sangrador, J.L., Ohman-Strickland, P., Meng, Q., Schwander, S., 2018. Land use regression models to assess air pollution exposure in Mexico City using finer spatial and temporal input parameters. Science of the Total Environment 639, 40–48.
Song, W., Jia, H., Li, Z., Tang, D., Wang, C., 2019. Detecting urban land-use configuration effects on NO2 and NO variations using geographically weighted land use regression. Atmospheric Environment 197, 166–176.
Sorek-Hamer, M., Chatfield, R., Liu, Y., 2020. Strategies for using satellite-based products in modeling PM2.5 and short-term pollution episodes. Environment International 144, 106057.
Stafoggia, M., Bellander, T., Bucci, S., Davoli, M., De Hoogh, K., De’Donato, F., Gariazzo, C., Lyapustin, A., Michelozzi, P., Renzi, M., Scortichini, M., 2019. Estimation of daily PM10 and PM2.5 concentrations in Italy, 2013–2015, using a spatiotemporal land-use random-forest model. Environment International 124, 170–179.
Stedman, J.R., Vincent, K.J., Campbell, G.W., Goodwin, J.W., Downing, C.E., 1997. New high resolution maps of estimated background ambient NOx and NO2 concentrations in the UK. Atmospheric Environment 31 (21), 3591–3602.
Strak, M., Janssen, N., Beelen, R., Schmitz, O., Vaartjes, I., Karssenberg, D., van den Brink, C., Bots, M.L., Dijst, M., Brunekreef, B., Hoek, G., 2017. Long-term exposure to particulate matter, NO2 and the oxidative potential of particulates and diabetes prevalence in a large national health survey. Environment International 108, 228–236.
Su, X., An, J., Zhang, Y., Zhu, P., Zhu, B., 2020. Prediction of ozone hourly concentrations by support vector machine and kernel extreme learning machine using wavelet transformation and partial least squares methods. Atmospheric Pollution Research 11 (6), 51–60.
Su, J.G., Brauer, M., Ainslie, B., Steyn, D., Larson, T., Buzzelli, M., 2008a. An innovative land use regression model incorporating meteorology for exposure analysis. Science of the Total Environment 390 (2–3), 520–529.
Su, J.G., Brauer, M., Buzzelli, M., 2008b. Estimating urban morphometry at the neighborhood scale for improvement in modeling long-term average air pollution concentrations. Atmospheric Environment 42 (34), 7884–7893.
Suleiman, A., Tight, M.R., Quinn, A.D., 2016. Hybrid neural networks and boosted regression tree models for predicting roadside particulate matter. Environmental Modeling & Assessment 21 (6), 731–750.
Sun, W., Zhang, H., Palazoglu, A., 2013. Prediction of 8 h-average ozone concentration using a supervised hidden Markov model combined with generalized linear models. Atmospheric Environment 81, 199–208.
Tella, A., Balogun, A.L., Adebisi, N., Abdullah, S., 2021. Spatial assessment of PM10 hotspots using Random Forest, K-Nearest Neighbour and Naïve Bayes. Atmospheric Pollution Research 12 (10), 101202.
Tella, A., Balogun, A.L., 2021. GIS-based air quality modelling: spatial prediction of PM10 for Selangor State, Malaysia using machine learning algorithms. Environmental Science and Pollution Research 1–17.
Tripathy, S., Tunno, B.J., Michanowicz, D.R., Kinnee, E., Shmool, J.L., Gillooly, S., Clougherty, J.E., 2019. Hybrid land use regression modeling for estimating spatio-temporal exposures to PM2.5, BC, and metal components across a metropolitan area of complex terrain and industrial sources. Science of the Total Environment 673, 54–63.
Tularam, H., Ramsay, L.F., Muttoo, S., Brunekreef, B., Meliefste, K., de Hoogh, K., Naidoo, R.N., 2021. A hybrid air pollution/land use regression model for predicting air pollution concentrations in Durban, South Africa. Environmental Pollution 274, 116513.
Van den Bossche, J., Theunis, J., Elen, B., Peters, J., Botteldooren, D., De Baets, B., 2016. Opportunistic mobile air pollution monitoring: a case study with city wardens in Antwerp. Atmospheric Environment 141, 408–421.
Van den Bossche, J., De Baets, B., Verwaeren, J., Botteldooren, D., Theunis, J., 2018. Development and evaluation of land use regression models for black carbon based on bicycle and pedestrian measurements in the urban environment. Environmental Modelling & Software 99, 58–69.
Van den Bossche, J., De Baets, B., Botteldooren, D., Theunis, J., 2020. A spatio-temporal land use regression model to assess street-level exposure to black carbon. Environmental Modelling & Software 133, 104837.
Van den Hove, A., Verwaeren, J., Van den Bossche, J., Theunis, J., De Baets, B., 2020. Development of a land use regression model for black carbon using mobile monitoring data and its application to pollution-avoiding routing. Environmental Research 183, 108619.
Van Donkelaar, A., Martin, R.V., Brauer, M., Boys, B.L., 2015. Use of satellite observations for long-term exposure assessment of global concentrations of fine particulate matter. Environmental Health Perspectives 123 (2), 135–143.
Van Donkelaar, A., Martin, R.V., Brauer, M., Hsu, N.C., Kahn, R.A., Levy, R.C., Lyapustin, A., Sayer, A.M., Winker, D.M., 2016. Global estimates of fine particulate matter using a combined geophysical-statistical method with information from satellites, models, and monitors. Environmental Science & Technology 50 (7), 3762–3772.
van Nunen, E., Vermeulen, R., Tsai, M.Y., Probst-Hensch, N., Ineichen, A., Davey, M., Imboden, M., Ducret-Stich, R., Naccarati, A., Raffaele, D., Ranzi, A., 2017. Land use regression models for ultrafine particles in six European areas. Environmental Science & Technology 51 (6), 3336–3345.
Van Roode, S., Ruiz-Aguilar, J.J., González-Enrique, J., Turias, I.J., 2019. An artificial neural network ensemble approach to generate air pollution maps. Environmental Monitoring and Assessment 191 (12), 1–15.
Verma, V., Fang, T., Xu, L., Peltier, R.E., Russell, A.G., Ng, N.L., Weber, R.J., 2015. Organic aerosols associated with the generation of reactive oxygen species (ROS) by water-soluble PM2.5. Environmental Science & Technology 49 (7), 4646–4656.
Vienneau, D., De Hoogh, K., Bechle, M.J., Beelen, R., Van Donkelaar, A., Martin, R.V., Millet, D.B., Hoek, G., Marshall, J.D., 2013. Western European land use regression incorporating satellite- and ground-based measurements of NO2 and PM10. Environmental Science & Technology 47 (23), 13555–13564.
Villa, T.F., Salimi, F., Morton, K., Morawska, L., Gonzalez, F., 2016. Development and validation of a UAV based system for air pollution measurements. Sensors 16 (12), 2202.
Vizcaino, P., Lavalle, C., 2018. Development of European NO2 Land Use Regression Model for present and future exposure assessment: Implications for policy analysis. Environmental Pollution 240, 140–154.
Vu, B.N., Bi, J., Wang, W., Huff, A., Kondragunta, S., Liu, Y., 2022. Application of geostationary satellite and high-resolution meteorology data in estimating hourly PM2.5 levels during the Camp Fire episode in California. Remote Sensing of Environment 271, 112890.
Wang, M., Beelen, R., Eeftens, M., Meliefste, K., Hoek, G., Brunekreef, B., 2012. Systematic evaluation of land use regression models for NO2. Environmental Science & Technology 46 (8), 4481–4489.
Wang, M., Beelen, R., Basagana, X., Becker, T., Cesaroni, G., De Hoogh, K., Dedele, A., Declercq, C., Dimakopoulou, K., Eeftens, M., Forastiere, F., 2013a. Evaluation of land use regression models for NO2 and particulate matter in 20 European study areas: the ESCAPE project. Environmental Science & Technology 47 (9), 4357–4364.
Wang, M., Beelen, R., Bellander, T., Birk, M., Cesaroni, G., Cirach, M., Cyrys, J., de Hoogh, K., Declercq, C., Dimakopoulou, K., Eeftens, M., 2014. Performance of multi-city land use regression models for nitrogen dioxide and fine particles. Environmental Health Perspectives 122 (8), 843–849.
Wang, R., Henderson, S.B., Sbihi, H., Allen, R.W., Brauer, M., 2013b. Temporal stability of land use regression models for traffic-related air pollution. Atmospheric Environment 64, 312–319.
Wang, Y., Huang, L., Huang, C., Hu, J., Wang, M., 2023. High-resolution modeling for criteria air pollutants and the associated air quality index in a metropolitan city. Environment International 172, 107752.
Wang, M., Keller, J.P., Adar, S.D., Kim, S.Y., Larson, T.V., Olives, C., Sampson, P.D., Sheppard, L., Szpiro, A.A., Vedal, S., Kaufman, J.D., 2015. Development of long-term spatiotemporal models for ambient ozone in six metropolitan regions of the United States: the MESA Air study. Atmospheric Environment 123, 79–87.
Wang, W., Liu, X., Bi, J., Liu, Y., 2022. A machine learning model to estimate ground-level ozone concentrations in California using TROPOMI data and high-resolution meteorology. Environment International 158, 106917.
Wang, M., Sampson, P.D., Hu, J., Kleeman, M., Keller, J.P., Olives, C., Szpiro, A.A., Vedal, S., Kaufman, J.D., 2016. Combining land-use regression and chemical transport modeling in a spatiotemporal geostatistical model for ozone and PM2.5. Environmental Science & Technology 50 (10), 5111–5118.
Wang, D., Wang, Z., Peng, Z.R., Wang, D., 2020b. Using unmanned aerial vehicle to investigate the vertical distribution of fine particulate matter. International Journal of Environmental Science and Technology 17, 219–230.

Wang, A., Xu, J., Tu, R., Saleh, M., Hatzopoulou, M., 2020a. Potential of machine learning for prediction of traffic related air pollution. Transportation Research Part d: Transport and Environment 88, 102599.
Wang, J., Xu, H., 2021. A novel hybrid spatiotemporal land use regression model system at the megacity scale. Atmospheric Environment 244, 117971.
Wang, W., Zhao, S., Jiao, L., Taylor, M., Zhang, B., Xu, G., Hou, H., 2019. Estimation of PM2.5 concentrations in China using a spatial back propagation neural network. Scientific Reports 9 (1), 1–10.
Weissert, L.F., Alberti, K., Miskell, G., Pattinson, W., Salmond, J.A., Henshaw, G., Williams, D.E., 2019. Low-cost sensors and microscale land use regression: Data fusion to resolve air quality variations with high spatial and temporal resolution. Atmospheric Environment 213, 285–295.
Weissert, L., Alberti, K., Miles, E., Miskell, G., Feenstra, B., Henshaw, G.S., Papapostolou, V., Patel, H., Polidori, A., Salmond, J.A., Williams, D.E., 2020. Low-cost sensor networks and land-use regression: Interpolating nitrogen dioxide concentration at high temporal and spatial resolution in Southern California. Atmospheric Environment 223, 117287.
Weissert, L.F., Salmond, J.A., Miskell, G., Alavi-Shoshtari, M., Williams, D.E., 2018. Development of a microscale land use regression model for predicting NO2 concentrations at a heavy trafficked suburban area in Auckland, NZ. Science of the Total Environment 619, 112–119.
Williams, D.E., 2019. Low cost sensor networks: how do we know the data are reliable? ACS Sensors 4 (10), 2558–2565.
Wu, H., Reis, S., Lin, C., Heal, M.R., 2017. Effect of monitoring network design on land use regression models for estimating residential NO2 concentration. Atmospheric Environment 149, 24–33.
Wu, Y., Wang, Y., Wang, L., Song, G., Gao, J., Yu, L., 2020. Application of a taxi-based mobile atmospheric monitoring system in Cangzhou, China. Transportation Research Part d: Transport and Environment 86, 102449.
Wu, B., Yu, B., Shu, S., Liang, H., Zhao, Y., Wu, J., 2021. Mapping fine-scale visual quality distribution inside urban streets using mobile LiDAR data. Building and Environment 206, 108323.
Wu, C.D., Zeng, Y.T., Lung, S.C.C., 2018. A hybrid kriging/land-use regression model to assess PM2.5 spatial-temporal variability. Science of the Total Environment 645, 1456–1464.
Xiang, Y., Zhang, T., Ma, C., Lv, L., Liu, J., Liu, W., Cheng, Y., 2021. Lidar vertical observation network and data assimilation reveal key processes driving the 3-D dynamic evolution of PM2.5 concentrations over the North China Plain. Atmospheric Chemistry and Physics 21 (9), 7023–7037.
Xu, H., Bechle, M.J., Wang, M., Szpiro, A.A., Vedal, S., Bai, Y., Marshall, J.D., 2019a. National PM2.5 and NO2 exposure models for China based on land use regression, satellite measurements, and universal kriging. Science of the Total Environment 655, 423–433.
Xu, X., Qin, N., Qi, L., Zou, B., Cao, S., Zhang, K., Yang, Z., Liu, Y., Zhang, Y., Duan, X., 2021. Development of season-dependent land use regression models to estimate BC and PM1 exposure. Science of the Total Environment, 148540.
Xu, X., Qin, N., Zhao, W., Tian, Q., Si, Q., Wu, W., Iskander, N., Yang, Z., Zhang, Y., Duan, X., 2022c. A three-dimensional LUR framework for PM2.5 exposure assessment based on mobile unmanned aerial vehicle monitoring. Environmental Pollution 301, 118997.
Xu, J., Yang, W., Han, B., Wang, M., Wang, Z., Zhao, Z., Bai, Z., Vedal, S., 2019b. An advanced spatio-temporal model for particulate matter and gaseous pollutants in Beijing, China. Atmospheric Environment 211, 120–127.
Xu, J., Yang, Z., Han, B., Yang, W., Duan, Y., Fu, Q., Bai, Z., 2022a. A unified empirical modeling approach for particulate matter and NO2 in a coastal city in China. Chemosphere, 134384.
Xu, J., Yang, W., Bai, Z., Zhang, R., Zheng, J., Wang, M., Zhu, T., 2022b. Modeling spatial variation of gaseous air pollutants and particulate matters in a Metropolitan area using mobile monitoring data. Environmental Research 210, 112858.
Xu, S., Zou, B., Lin, Y., Zhao, X., Li, S., Hu, C., 2019c. Strategies of method selection for fine-scale PM2.5 mapping in an intra-urban area using crowdsourced monitoring. Atmospheric Measurement Techniques 12 (5), 2933–2948.
Xuan, W., Zhang, F., Zhou, H., Du, Z., Liu, R., 2021. Improving geographically weighted regression considering directional nonstationary for ground-Level PM2.5 estimation. ISPRS International Journal of Geo-Information 10 (6), 413.
Yan, N., Mei, C.L., 2014. A two-step local smoothing approach for exploring spatio-temporal patterns with application to the analysis of precipitation in the mainland of China during 1986–2005. Environmental and Ecological Statistics 21, 373–390.
Yang, W., Deng, M., Xu, F., Wang, H., 2018. Prediction of hourly PM2.5 using a space-time support vector regression model. Atmospheric Environment 181, 12–19.
Yang, Z., Freni-Sterrantino, A., Fuller, G.W., Gulliver, J., 2020. Development and transferability of ultrafine particle land use regression models in London. Science of the Total Environment 740, 140059.
Yang, X., Zheng, Y., Geng, G., Liu, H., Man, H., Lv, Z., He, K., de Hoogh, K., 2017. Development of PM2.5 and NO2 models in a LUR framework incorporating satellite remote sensing and air quality model data in Pearl River Delta region. China. Environmental Pollution 226, 143–153.
Yeganeh, B., Motlagh, M.S.P., Rashidi, Y., Kamalan, H., 2012. Prediction of CO concentrations based on a hybrid Partial Least Square and Support Vector Machine model. Atmospheric Environment 55, 357–365.
Yeganeh, B., Hewson, M.G., Clifford, S., Tavassoli, A., Knibbs, L.D., Morawska, L., 2018. Estimating the spatiotemporal variation of NO2 concentration using an adaptive neuro-fuzzy inference system. Environmental Modelling & Software 100, 222–235.
Yin, X., Franklin, M., Fallah-Shorshani, M., Shafer, M., McConnell, R., Fruin, S., 2022. Exposure models for particulate matter elemental concentrations in Southern California. Environment International 165, 107247.
Yuan, Z., Kerckhoffs, J., Hoek, G., Vermeulen, R., 2022. A knowledge transfer approach to map long-term concentrations of hyperlocal air pollution from short-term mobile measurements. Environmental Science & Technology 56 (19), 13820–13828.
Yuchi, W., Gombojav, E., Boldbaatar, B., Galsuren, J., Enkhmaa, S., Beejin, B., Naidan, G., Ochir, C., Legtseg, B., Byambaa, T., Barn, P., 2019. Evaluation of random forest regression and multiple linear regression for predicting indoor fine particulate matter concentrations in a highly polluted city. Environmental Pollution 245, 746–753.
Zalzal, J., Alameddine, I., El Khoury, C., Minet, L., Shekarrizfard, M., Weichenthal, S., Hatzopoulou, M., 2019. Assessing the transferability of landuse regression models for ultrafine particles across two Canadian cities. Science of the Total Environment 662, 722–734.
Zhai, L., Li, S., Zou, B., Sang, H., Fang, X., Xu, S., 2018. An improved geographically weighted regression model for PM2.5 concentration estimation in large areas. Atmospheric Environment 181, 145–154.
Zhan, Y., Luo, Y., Deng, X., Grieneisen, M.L., Zhang, M., Di, B., 2018. Spatiotemporal prediction of daily ambient ozone levels across China using random forest for human exposure assessment. Environmental Pollution 233, 464–473.
Zhang, P., Ma, W., Wen, F., Liu, L., Yang, L., Song, J., Wang, N., Liu, Q., 2021b. Estimating PM2.5 concentration using the machine learning GA-SVM method to improve the land use regression model in Shaanxi, China. Ecotoxicology and Environmental Safety 225, 112772.
Zhang, L., Tian, X., Zhao, Y., Liu, L., Li, Z., Tao, L., Wang, X., Guo, X., Luo, Y., 2021a. Application of nonlinear land use regression models for ambient air pollutants and air quality index. Atmospheric Pollution Research 12 (10), 101186.
Zhang, Z., Wang, J., Hart, J.E., Laden, F., Zhao, C., Li, T., Zheng, P., Li, D., Ye, Z., Chen, K., 2018. National scale spatiotemporal land-use regression model for PM2.5, PM10 and NO2 concentration in China. Atmospheric Environment 192, 48–54.
Zhang, Q., Wu, S., Wang, X., Sun, B., Liu, H., 2020. A PM2.5 concentration prediction model based on multi-task deep learning for intensive air quality monitoring stations. Journal of Cleaner Production 275, 122722.
Zhao, B., Yu, L., Wang, C., Shuai, C., Zhu, J., Qu, S., Taiebat, M., Xu, M., 2021. Urban Air Pollution Mapping Using Fleet Vehicles as Mobile Monitors and Machine Learning. Environmental Science & Technology 55 (8), 5579–5588.
Zhou, S., Lin, R., 2019. Spatial-temporal heterogeneity of air pollution: The relationship between built environment and on-road PM2.5 at micro scale. Transportation Research Part d: Transport and Environment 76, 305–322.
Zhou, X., Tong, W., Li, L., 2020. Deep learning spatiotemporal air pollution data in China using data fusion. Earth Science Informatics 13 (3), 859–868.
Zou, B., Luo, Y., Wan, N., Zheng, Z., Sternberg, T., Liao, Y., 2015a. Performance comparison of LUR and OK in PM2.5 concentration mapping: a multidimensional perspective. Scientific Reports 5 (1), 1–7.
Zou, B., Wang, M., Wan, N., Wilson, J.G., Fang, X., Tang, Y., 2015b. Spatial modeling of PM2.5 concentrations with a multifactoral radial basis function neural network. Environmental Science and Pollution Research 22 (14), 10395–10404.
Zou, B., Chen, J., Zhai, L., Fang, X., Zheng, Z., 2016. Satellite based mapping of ground PM2.5 concentration using generalized additive modeling. Remote Sensing 9 (1), 1.
Zou, B., Li, S., Lin, Y., Wang, B., Cao, S., Zhao, X., Peng, F., Qin, N., Guo, Q., Feng, H., Matthew, C.J., 2020. Efforts in reducing air pollution exposure risk in China: State versus individuals. Environment International 137, 105504.

Further reading

Lee, E.G., Magrm, R., Kusti, M., Kashon, M.L., Guffey, S., Costas, M.M., Boykin, C.J., Harper, M., 2017. Comparison between active (pumped) and passive (diffusive) sampling methods for formaldehyde in pathology and histology laboratories. Journal of Occupational and Environmental Hygiene 14 (1), 31–39.
