
Mathematics and Computers in Simulation 54 (2000) 1–21

Spatial distribution characteristics of some air pollutants in Sydney


Hiep Duc a,∗, Ian Shannon a, Merched Azzi b
a Environment Protection Authority, Air Science Section, P.O. Box 29, Lidcombe, NSW 1825, Australia
b Division of Coal and Energy Technology, CSIRO, North Ryde, Australia
Accepted 26 April 2000

Abstract
Spatial distribution characteristics may be used to help in siting of air monitoring stations. This technique is also
helpful in predicting variations in the concentrations of air pollutants due to changes in meteorology.
This paper discusses the spatial distribution patterns at a number of monitoring sites in Sydney. The pollutants
modelled are ozone, nitrogen oxides and particles as determined by nephelometry. Concentrations, as monthly
averages, covering a summer and winter period, are used. Spatial cross-correlation of time series between sites is
also investigated. The correlation analysis shows that for most pollutants, the effect is only significant within 30 km
around the site.
The daily variations in correlation between sites, caused by changing meteorological conditions, can be minimised
by deriving the correlation coefficient for each hour in a particular year. This allows the correlation pattern at each
hour to be seen and the average effect of meteorology can be revealed by the changing pattern over a 24 h period.
This approach is used for each hour over 1993 and 1994, based on the correlation matrices of ozone, nitrogen
dioxide and nephelometry.
Crown Copyright © 2000 Published by Elsevier Science B.V. on behalf of IMACS. All rights reserved.
Keywords: Spatial distribution; Kriging; Nephelometry; Sydney; Correlation field

1. Introduction

In 1992, the Metropolitan Air Quality Study (MAQS) was initiated by the NSW government. From
1993, the number of monitoring stations was significantly increased, as was the number of pollutants
and meteorological parameters measured. Parameters measured include carbon monoxide (CO),
nitrogen oxides (NO, NO2, NOx), ozone, visibility (nephelometer) and sulphur dioxide (SO2).
In 1994, the Sydney basin had 17 monitoring stations scattered throughout the region. Air pollutants
and meteorological parameters are measured continuously on a 2 min basis and consolidated into hourly
values which are used as the basis for all statistical analyses. Since spatial analysis requires an adequate

Corresponding author.
E-mail address: sydow@first.gmd.de (H. Duc).

0378-4754/00/$20.00 Crown Copyright © 2000 Published by Elsevier Science B.V. on behalf of IMACS. All rights reserved.
PII: S0378-4754(00)00165-8

Fig. 1. Sydney air quality monitoring network.

number of measurements at different sites, the data period to be analysed is from 1993 to 1994. The
Sydney basin and the monitoring network as of 1994 are depicted in Fig. 1.
There is a need to analyse the collected data both in the temporal and spatial domain and to study
the regional air pollution pattern throughout this region. This also has an implication for network design
issues such as adding, removing or relocating monitoring stations.
Other network design issues to be addressed are
• The adequacy of the current number and the density of monitoring stations to represent the Sydney
metropolitan area and to permit a reconstruction of the underlying concentration field within a given
limit of error.
• The identification of different sub-regions which have different pollution characteristics.
• The adequacy of the number of monitoring stations within each identified sub-region to collect infor-
mation to interpret, represent and infer (or predict) pollution concentration at any nominated location
within the sub-region.
• The possibility of using data from a neighbouring station when only limited data is available from a
monitoring station.
In the study of the effect of pollution on health such as asthma, the air quality data from the monitoring
stations which reflect accurately or are relevant to the area under consideration should be used. It may also
be more appropriate to evaluate the dosage area product (DAP) rather than the peak concentration to gauge
the effect of pollution exposure on human health. The DAP index requires spatial data interpolation [5].
To adequately address these issues, the study of spatial distribution of pollutants over different periods
under different meteorological conditions is required.

2. Methodology

As the number of measurements across a region is limited by the available monitoring stations,
estimating the real surface that represents the data values at all locations requires an appropriate
interpolation method. One of the most widely used methods is kriging.

This is a statistically-based interpolation method. The generation of the kriged surface is similar to
the minimum curvature spline surface except that the basis function used is a form of covariance kernel.
This basis function is related to the semivariogram function of distance. The semivariogram specifies the
spatial relationship between the data points.
If the pollution field is homogeneous and isotropic then the semivariogram function will depend only
on the separation distance and not on the positions of the monitoring stations. Such an assumption is
often made to simplify the analysis. However, actual data analysis shows a departure from the expected
theoretical behaviour. Hence, this assumption is usually not valid.
Casado et al. [2] used a variogram model known as the spherical model for the visualisation of spatial
hourly ozone data in the south-eastern US. The identifying parameters of their model are the range (the
distance at which the modelled function plateaus) and the sill (the value of the modelled function at that
plateau).
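As a minimal sketch of the spherical model mentioned above, the function below evaluates a spherical semivariogram for a given separation distance. The parameter names (`sill`, `range_`, `nugget`) and the optional nugget term are assumptions made for illustration; the fitted values used by Casado et al. [2] are not given here.

```python
def spherical_variogram(h, sill, range_, nugget=0.0):
    """Spherical semivariogram at separation distance h.

    sill   : plateau value reached at and beyond range_ (assumed convention)
    range_ : distance at which the model plateaus (same units as h)
    nugget : discontinuity just above zero distance (illustrative, default 0)
    """
    if h <= 0.0:
        return 0.0  # the semivariogram is zero at zero separation
    if h >= range_:
        return sill  # beyond the range the model stays at the sill
    r = h / range_
    return nugget + (sill - nugget) * (1.5 * r - 0.5 * r ** 3)
```

For example, with a hypothetical range of 30 km and unit sill, the function rises smoothly from 0 at zero distance to 1 at 30 km and stays there.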
Kriging originated in geostatistics. The idea is to estimate the value at a grid point as a
weighted average of all the observed values. The weights depend on the distances to the
observed values, and the estimator is the Best Linear Unbiased Estimator (BLUE). Data
points are honoured exactly in the kriging method.
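The weighted-average idea can be sketched with ordinary kriging on a handful of sites. This is only an illustration under stated assumptions: the linear variogram is a placeholder (not the model used in this paper), the coordinates and values in the usage note are invented, and the function names are hypothetical.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def linear_variogram(h, slope=1.0):
    """Placeholder variogram (assumption): semivariance grows linearly with distance."""
    return slope * h

def ordinary_kriging(sites, values, target, variogram=linear_variogram):
    """Return (estimate, weights) at target from observations at sites."""
    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])
    n = len(sites)
    # Kriging system: variogram between sites, bordered by the unbiasedness constraint.
    a = [[variogram(dist(sites[i], sites[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(dist(s, target)) for s in sites] + [1.0]
    weights = solve(a, b)[:n]  # last unknown is the Lagrange multiplier
    estimate = sum(w * v for w, v in zip(weights, values))
    return estimate, weights
```

Calling `ordinary_kriging([(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)], [2.0, 4.0, 3.0], (3.0, 3.0))` returns weights that sum to one, and evaluating at one of the sites reproduces the observed value there, which is the "honoured exactly" property.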
The kriging technique is now widely applied in many fields outside geostatistics [1,2,9]. Venkatram [17]
applied kriging in the spatial analysis of acid precipitation data and showed that universal kriging,
in which the drift or trend effect is taken into account, is most appropriate for this kind of data. Gilbert
and Simpson [6] used the technique on logarithmically transformed data to estimate the spatial pattern
of radionuclide contaminants in an underground nuclear test area. They refer to this method as lognormal
kriging.
Another widely used technique is the use of the spatial cross-correlation of time series between sites.
Goldstein and Landovitz [7,8] used the spatial cross-correlation matrices of SO2 and smoke particles in
the metropolitan area of New York to analyse the pollution pattern. Lack of correlation between sites
indicates that the use of data at a particular monitoring station is not representative for the area around
that station. Spatial correlation between stations has also been studied by many authors, such as Van
Egmond and Onderdelinden [16] for SO2, ozone and NO2 over The Netherlands, and Chock and
Levitt [3] for oxidant ozone and CO in Los Angeles. Specifically, Van Egmond and Onderdelinden [16]
considered three different correlation coefficients with respect to the station mean (space), the hourly
mean field (time) and the overall mean. In this paper, we are mainly concerned with the correlation with
respect to the station mean.
One of the requirements of correlation analysis is that the data should be normally distributed [4,14].
As air pollution data are generally accepted to have a positively skewed distribution, the logarithm of
the concentration values is often used to approximate a normal distribution [5].
However, a more general transformation such as the Box–Cox transform is much more appropriate
[10]. This transform is specified by

\[
Y(t) = \frac{X(t)^{\alpha} - 1}{\alpha}, \qquad \lim_{\alpha \to 0} Y(t) = \ln X(t)
\]

When α = 0.5, it is the classic square root transform, which makes a Weibull distribution normal, and in the
limiting case of α = 0, it is the log transformation. It has been suggested [10] that an α value of 0.2 is
best for Sydney data. This transform of the data stabilises the variance.
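The transform can be sketched in a few lines. The function name `box_cox` and the guard against non-positive concentrations are assumptions for illustration; the value α = 0.2 is the one suggested in [10].

```python
import math

def box_cox(x, alpha):
    """Box-Cox transform of a positive value x with exponent alpha."""
    if x <= 0.0:
        raise ValueError("Box-Cox transform requires a positive value")
    if alpha == 0.0:
        return math.log(x)  # limiting case of the transform
    return (x ** alpha - 1.0) / alpha

# Example: transform a hypothetical hourly series with the suggested alpha.
transformed = [box_cox(v, 0.2) for v in (1.0, 2.5, 4.0)]
```

For small α the result approaches the natural logarithm, which is consistent with the limiting case stated above.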

Elsom [4] used the correlation technique to construct a correlation field with respect to a reference
station to assess the degree to which daily concentration for any urban location can be derived, using
data from one reference site. The ability to estimate missing values at a monitoring station from another
reference site is very useful as in practice data from monitoring stations is often incomplete. It can also
be used to derive the spacing or density of the monitoring station network required to give a degree of
accuracy in the construction of isopleth maps.
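One simple way to estimate missing values at a station from a reference site is a least-squares line fitted on the hours where both sites report. This is an illustrative sketch only, not necessarily the procedure used by Elsom [4]; the function name and the use of `None` for missing values are assumptions.

```python
def fill_from_reference(target, reference):
    """Fill gaps (None) in target using a straight-line fit against reference."""
    pairs = [(r, t) for r, t in zip(reference, target)
             if r is not None and t is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two overlapping observations")
    mean_r = sum(r for r, _ in pairs) / n
    mean_t = sum(t for _, t in pairs) / n
    sxx = sum((r - mean_r) ** 2 for r, _ in pairs)
    sxy = sum((r - mean_r) * (t - mean_t) for r, t in pairs)
    if sxx == 0.0:
        raise ValueError("reference series is constant; no line can be fitted")
    slope = sxy / sxx
    intercept = mean_t - slope * mean_r
    filled = []
    for r, t in zip(reference, target):
        if t is not None:
            filled.append(t)  # keep observed values untouched
        elif r is not None:
            filled.append(intercept + slope * r)  # estimate from the reference site
        else:
            filled.append(None)  # nothing to estimate from
    return filled
```

The quality of such estimates depends directly on the correlation between the two sites, which is why the correlation field is informative here.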
The correlation field can be used to rationalise the monitoring station network [4]. If the correlation
coefficients appear to decay much less in a particular direction than in any other (i.e. anisotropy), then
fewer stations would be required along this direction than in any other directions. As noted by Elsom [4],
the design of the network ultimately also depends upon the objectives of monitoring, and if the absolute
levels are important and steep pollution concentration gradients are present, a larger number of monitors
might be required despite excellent temporal correlations. Nevertheless, the correlation field method offers
a useful approach.
Shannon et al. [15] used the correlation technique to determine the correlation structure of daily average
of turbidity (particles between 0.1 and 1.0 μm) using a sun photometer over the north-eastern US. A general
correlation structure function (CSF) taking into account anisotropy is then used to fit the correlation data.
This CSF is used for the optimum placement of a limited number of sensor stations measuring turbidity
using a dichopyranometer. The placement of sensors is such that it minimises the normalised residual
(unexplained) variance obtained from objective analyses performed at grid points across the region of
interest. Non-linear programming was used to fit the CSF and the search for the optimum placement of
sensors. This technique was used to place five sensors in addition to five fixed reference sensors using
both sequential and simultaneous placement of sensors. Other approaches have also been suggested
for the network design. One is to evaluate an initial array of sensors then determine the smallest number
of sensors that would explain as much variance as required. The other is to increase the number of sensors
until the incremental value of the CSF is less than some fraction of the incremental value due to the first
sensor.
To analyse the correlation between sites more accurately, the time series at each site should be modelled
formally. More specifically, the trend, the seasonality, external factors such as meteorological wind data,
temperature or stability class and the remaining residual characteristics should be included in the model.
To simplify the model, we remove the trend and the seasonality effect by studying two short time series
at each site corresponding to the summer and winter periods of 1993. An auto-regressive (AR) model of order 2
is applied to the series. The AR(2) model was found to be adequate in the study of Finzy et al. [5] for air
quality data. It will be shown that the time series of hourly pollutant concentration at Sydney monitoring
sites can be described and fitted using an AR(2) model. Therefore, an AR(2) model was adopted for the
correlation analysis.
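A minimal sketch of fitting an AR(2) model by least squares is given below. The estimation method, the function name `fit_ar2` and the `demean` option are assumptions for illustration; the fitting procedure actually used in this study is not specified here.

```python
def fit_ar2(series, demean=True):
    """Least-squares fit of x[t] = a1*x[t-1] + a2*x[t-2] + e[t].

    Returns (a1, a2, residuals). With demean=True the series mean is
    removed first, so the model is fitted to deviations from the mean.
    """
    n = len(series)
    if n < 4:
        raise ValueError("series too short to fit an AR(2) model")
    mean = sum(series) / n if demean else 0.0
    x = [v - mean for v in series]
    # Normal equations for the two lagged regressors.
    s11 = sum(x[t - 1] * x[t - 1] for t in range(2, n))
    s22 = sum(x[t - 2] * x[t - 2] for t in range(2, n))
    s12 = sum(x[t - 1] * x[t - 2] for t in range(2, n))
    r1 = sum(x[t] * x[t - 1] for t in range(2, n))
    r2 = sum(x[t] * x[t - 2] for t in range(2, n))
    det = s11 * s22 - s12 * s12
    if det == 0.0:
        raise ValueError("lagged series are collinear; cannot fit AR(2)")
    a1 = (r1 * s22 - r2 * s12) / det
    a2 = (r2 * s11 - r1 * s12) / det
    residuals = [x[t] - a1 * x[t - 1] - a2 * x[t - 2] for t in range(2, n)]
    return a1, a2, residuals
```

The residuals returned by such a fit are what remains once the serial dependence at each site has been removed.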
The above methods will be used to characterise the whole Sydney basin. However, in order to further
delineate the sub-regions which share similar characteristics, classic cluster analysis can be used.
Clustering methods are applied based on a correlation similarity measure. A correlation similarity measure was
used by Wolff and Parsons [18] and Kendall [11] to group sites or categories into different clusters using
the simple nearest-neighbour linkage method.
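Nearest-neighbour (single) linkage with a correlation similarity measure can be sketched as follows: two sites end up in the same cluster whenever a chain of pairwise correlations at or above a chosen threshold connects them. The threshold, the site labels and the function name are hypothetical, and the correlation values in the usage note are invented for illustration.

```python
def single_linkage_clusters(names, corr, threshold):
    """Group sites whose correlations chain together at or above threshold.

    names : list of site labels
    corr  : square matrix (list of lists) of pairwise correlations
    """
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the cluster representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if corr[i][j] >= threshold:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return sorted(clusters.values())
```

With four hypothetical sites where A and B correlate at 0.9, B and C at 0.8, and every other pair at 0.2, a threshold of 0.75 groups A, B and C together and leaves D on its own, even though A and C are not directly similar.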
To characterise the stations based on their pollution levels, especially in the high ranges
above the goal level, the statistical distribution at a site can be used as a similarity measure as a
basis to cluster the stations. Similarity measure based on the pollutant distribution is more difficult to
categorise.

3. Spatial characteristics of Sydney air

The two main indicators of air pollution are the ozone and nephelometer measurements at monitoring
stations. Nitrogen oxides (NOx) are closely related to ozone formation and are mainly due to motor
vehicle emissions, which are the main source of air pollution in Sydney. The spatial characteristics
of these three parameters will be examined for two different periods. The smog produced by air pollution
during summer is characterised by ozone. Conversely, particles (as measured by nephelometer) are the
dominant pollutants during winter. The degree of pollutant effect on health is also different in each season
of the year [12]. In this paper, the characteristics of the spatial distribution in summer and winter will be
discussed in detail. Validated monitoring data which was initially available for analysis during the early
stage of this study covers the period of January 1993 up to June 1994. For this reason, January and June
were chosen as representative months for summer and winter, respectively.

3.1. Monthly daily maxima characteristics

The monthly statistics can be used to study the overall behaviour of the spatial distribution of pollutant
data. Two monthly statistics are used for this purpose: the monthly average values and the monthly
average of daily maxima. To see whether the result of the analysis is repeatable, the data for both 1993 and
1994 were analysed. The monthly summaries for January and June of 1993 and 1994 at different monitoring
stations are listed in Table 1.
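The monthly average of daily maxima can be computed from hourly values as sketched below. The data layout (a list of days, each a list of hourly values with `None` for missing hours) and the rule of skipping days with no valid data are assumptions for illustration.

```python
def monthly_average_of_daily_maxima(hourly_by_day):
    """Average, over the days of a month, of each day's maximum hourly value.

    hourly_by_day : list of days, each a list of hourly values (None = missing).
    Days with no valid hourly value are skipped; returns None if no day is usable.
    """
    daily_maxima = []
    for day in hourly_by_day:
        valid = [v for v in day if v is not None]
        if valid:
            daily_maxima.append(max(valid))
    if not daily_maxima:
        return None
    return sum(daily_maxima) / len(daily_maxima)
```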

Table 1
Monthly average of daily maximum concentration of nephelometer, nitrogen oxides and ozone at different monitoring stations^a

              January 1993         June 1993            January 1994         June 1994
Site          Ozone  Neph   NOx    Ozone  Neph   NOx    Ozone  Neph   NOx    Ozone  Neph   NOx
Kensington    2.95   0.44   7.2    0.88   0.50   20.68  2.05   0.88   6.24   0.76   0.86   16.94
Rozelle       2.29   0.38   5.57   1.44   1.17   19.82  2.54   0.84   5.31   1.07   0.97   12.45
Lidcombe      4.10   ∗      5.94   0.98   ∗      20.68  2.04   ∗      7.00   1.35   ∗      22.46
Earlwood      3.59   ∗      8.60   1.23   ∗      37.77  2.86   ∗      9.06   1.38   ∗      31.10
Westmead      4.32   0.64   2.60   1.63   1.47   21.94  3.53   1.65   8.48   1.35   1.08   26.02
Woolooware    3.60   0.46   ∗      1.44   1.21   ∗      2.99   1.20   4.07   2.02   1.17   14.94
Liverpool     4.45   0.66   4.95   0.73   1.45   16.33  4.00   1.10   5.93   1.54   1.32   25.75
Campbelltown  3.34   ∗      2.24   2.18   ∗      17.66  3.91   ∗      5.88   ∗      ∗      ∗
Bringelly     3.91   ∗      ∗      2.12   0.67   5.41   4.92   1.53   2.14   2.33   0.88   5.08
Blacktown     4.47   0.61   4.26   1.39   1.28   23.85  3.76   1.82   5.36   1.76   1.62   14.80
St. Mary      4.65   ∗      1.86   1.73   0.87   14.58  4.85   1.95   3.41   2.34   0.84   14.66
Richmond      4.29   0.70   1.73   2.17   0.70   6.48   4.43   3.36   2.66   1.97   1.34   6.92
Vineyard      ∗      ∗      ∗      ∗      ∗      ∗      ∗      ∗      ∗      2.42   1.66   5.70
Lindfield     ∗      ∗      ∗      ∗      ∗      ∗      ∗      ∗      ∗      2.10   ∗      12.46
Peakhurst     ∗      ∗      ∗      ∗      ∗      ∗      3.98   1.50   5.11   ∗      ∗      ∗
Camden        ∗      ∗      ∗      ∗      ∗      ∗      ∗      ∗      ∗      4.43   ∗      4.67
Smithfield    ∗      ∗      ∗      ∗      ∗      ∗      5.02   2.80   5.51   ∗      ∗      ∗

^a The symbol (∗) denotes that data are not available or missing. Ozone and nitrogen oxides are in pphm;
nephelometer readings are in backscattering units.

Note that in early January 1994, there were periods of intense bush fires in and around Sydney
(north and south of Sydney and far west in the Blue Mountains). This caused elevated concentrations
of particles throughout the Sydney region.
When the isopleths are plotted for both the monthly average and the monthly average of daily maxima, they
show very much the same distribution pattern across the Sydney region. For the purpose of the discussion
below, the isopleths of the monthly average of daily maxima are considered. They are plotted using
kriging interpolation in Fig. 2.
The isopleth of the monthly average of daily maxima of nitrogen oxides (NOx ) for January/June 1993
shows high concentration around Earlwood. For nephelometer, in June 1993, high concentration occurs
around Westmead, Lidcombe and Fairfield area. For ozone, low concentration occurs mostly in the eastern
and central part of Sydney for January/June 1993.
From the isopleths of ozone for January 1993 and January 1994, the two patterns are very similar. The ozone
level increases gradually from the eastern part of Sydney toward the west, south and north of Sydney.
Similarly, for June 1993 and 1994, the eastern and central parts of Sydney have low concentrations, which
increase gradually toward the west, south and north. The ozone patterns are very similar for each year
in both the summer and winter periods.

3.2. Hourly analysis using correlation functions

The above analysis was done using the monthly average of daily maxima data. The average statistical
characteristics may not reveal enough insight into the spatial distribution of air pollutants. For this reason,
the spatial statistical distribution characteristics were next investigated based on hourly time series data at
different sites.
The use of a spatial correlation matrix which is derived after correcting the data by a Box–Cox trans-
form is helpful in determining the spatial characteristics of air pollution data. The time series at each
monitoring station can be modelled by the AR model. An AR(2) model was chosen because an anal-
ysis of the time series at each station showed that the auto-correlation was not significant beyond lags
of 3 h.
To minimise and isolate the effect of meteorology, which can cause daily changes in correlation between
sites, the correlation coefficient is derived for each hour in each season for a particular year. In this way,
the correlation pattern at each representative hour can be seen. Then the average effect of meteorology
can be revealed by the changing pattern in the 24 h period. Correlation matrices of ozone, nephelometer
and NO were derived for each hour in summer and winter period of 1993 and 1994. These correlation
matrices were then used in the following correlation field analysis and in the classifying of stations, using
cluster technique in Section 3.3.
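The hour-by-hour correlation matrices described above can be sketched as follows: for a fixed hour of the day, the values at that hour on successive days form one series per site, and the Pearson correlation is taken between each pair of sites. The data layout, the handling of missing values and the function names are assumptions for illustration; in practice the series would first be transformed as described above.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length series, skipping missing (None) pairs."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return None  # too few overlapping observations
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    if sxx == 0.0 or syy == 0.0:
        return None  # a constant series has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def hourly_correlation_matrix(data, hour):
    """Correlation between every pair of sites using only values at the given hour.

    data : {site: list of days, each a list of 24 hourly values}
    Returns {(site_a, site_b): correlation or None}.
    """
    sites = sorted(data)
    at_hour = {s: [day[hour] for day in data[s]] for s in sites}
    return {(a, b): pearson(at_hour[a], at_hour[b]) for a in sites for b in sites}
```

Repeating the call for `hour` in `range(24)` gives the sequence of matrices whose changing pattern reflects the average effect of meteorology over the day.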
Appendix A contains a series of plots of the correlation between pairs of stations plotted against the
separation of the pairs of stations. There are 12 graphs, one for each of ozone, Neph and NOx in each of the four
study periods of summer and winter 1993, and summer and winter 1994. In all cases, a fitted spline
shows an overall decline of the correlation as distance increases, although there is a fair degree of scatter
about the line. Correlations range from approximately 0.8 down to 0.2 and the distances (between pairs
of stations) range from 8 to 60 km. Table 2 summarises the correlation trend with distance as given by
the fitted splines.
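The distance at which a fitted correlation curve falls below a threshold such as 0.65 can be read off by interpolation. The sketch below uses piecewise-linear interpolation between sampled points of an already fitted curve, which is a simplification of the spline fit used here; the function name and the sample points in the usage note are hypothetical.

```python
def distance_below(curve, threshold=0.65):
    """Distance at which a fitted correlation curve first drops below threshold.

    curve : list of (distance_km, correlation) points sampled from the fitted curve.
    Returns None if the curve never drops below the threshold.
    """
    pts = sorted(curve)
    if pts[0][1] < threshold:
        return pts[0][0]  # already below at the shortest sampled distance
    for (d0, c0), (d1, c1) in zip(pts, pts[1:]):
        if c0 >= threshold > c1:
            # Linear interpolation between the two bracketing points.
            return d0 + (c0 - threshold) * (d1 - d0) / (c0 - c1)
    return None
```

For instance, a hypothetical curve passing through (10 km, 0.85), (30 km, 0.75) and (50 km, 0.55) crosses 0.65 at about 40 km.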
As can be seen from the summary information in Table 2, the ozone correlations are higher and more
pervasive in that a strong relationship is maintained for pairs of sites further apart when compared with

Fig. 2. (a) Isopleths of monthly average of daily maximum of nephelometer, ozone and NOx (January 1993/1994); (b) isopleths
of monthly average of daily maximum of nephelometer, ozone and NOx (June 1993/1994).
Table 2
Correlation coefficient at 10 km distance and the distance at which the correlation falls below 0.65, for the fitted
splines of the relationships between correlation and distance given in Appendix A

Period        Ozone         Neph          NOx
Summer 1993   0.84, 38 km   0.67, 21 km   0.68, 13 km
Winter 1993   0.75, 21 km   0.65, 10 km   0.70, 16 km
Summer 1994   0.85, 46 km   0.77, 20 km   0.72, 20 km
Winter 1994   0.70, 25 km   0.70, 21 km   0.60, 6 km

the other two pollutants which demonstrate more local impacts. There is also a tendency for the winter
periods to have lower correlations (that is more localised effects) than the summer periods.

3.3. Wind pattern

Meteorology is an important factor in understanding the spatial pattern of pollutants. The size of
the meteorology network in the Sydney region increased significantly from 1993 to 1994 compared to
previous years. Before 1993, there were only two monitoring stations that measured wind data. In 1993, the
stations which measured meteorological data were Campbelltown and Lidcombe. In 1994, meteorological
data were measured at Blacktown, Bringelly, Earlwood, Lindfield, Lidcombe, Liverpool, Rozelle and
Westmead. The variables measured include wind speed, wind direction, sigma theta, temperature, relative
humidity and solar radiation.
The frequency statistics of wind data can be summarised by using wind rose plots at each individual
site. However, the frequency statistics as depicted in the wind roses at different sites do not
reveal the overall pattern of the wind for each typical day in the 24 h period. To see the overall pattern for
each hour in the 24 h period, the frequency analysis of wind speed and wind direction for each hour in each
season can be derived. Figs. 3 and 4 show the average wind pattern, as measured by all monitoring
stations, across the Sydney region for the 24 h period in winter and summer, respectively.
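One way to obtain an average wind for a given hour is to average the wind vectors rather than the directions themselves, since directions wrap around at 360°. The sketch below is an assumption about how such an average could be formed, not a description of how Figs. 3 and 4 were produced; it uses the meteorological convention that direction is the bearing the wind blows from, in degrees clockwise from north.

```python
import math

def mean_wind(observations):
    """Vector-mean wind from a list of (speed, direction_deg) observations.

    direction_deg is the bearing the wind blows FROM, clockwise from north.
    Returns (mean_speed, mean_direction_deg) of the averaged vector.
    """
    if not observations:
        raise ValueError("no observations to average")
    n = len(observations)
    # Eastward (u) and northward (v) components of the air motion.
    u = sum(-s * math.sin(math.radians(d)) for s, d in observations) / n
    v = sum(-s * math.cos(math.radians(d)) for s, d in observations) / n
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction
```

Applying this to all observations at, say, 15.00 h over a season gives one arrow per station for that hour; winds from 350° and 10° average to a northerly, which a plain arithmetic mean of the directions would not give.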

3.4. Correlation field

Using a reference station, the correlation field can be constructed using a spatial interpolation scheme.
The kriging method for interpolation will be used in this paper. Van Egmond and Onderdelinden [16] have
shown that the differences in the results from using the optimal interpolation, the eigenvector interpolation,
and the distance and density weighting interpolation methods are small. Both the optimal interpolation
and the eigenvector interpolation are based on the covariance or correlation kernel matrix which is also
used by the kriging method. However, the eigenvector interpolation can be used without the stringent
assumption of homogeneity and isotropy.
As mentioned above, the correlation field is useful in assessing the degree to which daily concentration
for a location can be derived using data at a reference site or whether the data from a monitoring station
can be used to represent the area of interest, as in the study of the health effects of pollutants on the
population in the area. The correlation field is also useful in the grouping of the monitoring stations
to form a sub-region with respect to the common characteristics of the pollutant being considered. The
grouping of the stations based on the high correlation coefficients among the stations is performed here

Fig. 3. Average wind pattern across Sydney for each hour (winter 1993).

using the cluster method used by Anh et al. [1]. As a result, this technique can be used to rationalise the
monitoring network or to determine the optimal density of the network.
As the correlation between stations depends on the season and the hour of the day, the correlation
matrices for each pollutant are derived using hourly data for each hour in the 24 h period in each season
of 1993/1994 data set. From the correlation matrices, the correlation field can be plotted easily using one
of the mentioned interpolating methods.

3.4.1. Correlation field with Lidcombe as a reference station


The correlation field can be constructed using a site as a reference station. Lidcombe is chosen because
it is a central Sydney site which is assumed to share some characteristics in terms of source emission and
pollution formation with some sites in the west and the south-west.

3.4.1.1. Ozone. The correlation field is constructed by using the correlation coefficients between each
pair of stations for both the spring of 1993 and the summer of 1993/1994 periods. The correlation
coefficients were derived using the hourly data of the season being considered for each hour in the 24 h
period. The correlation field can then be plotted for each hour of the 24 h of the day. From these plots, we
can see the correlation behaviour of all stations with respect to Lidcombe. The correlation field changes
with time as it is influenced by meteorological conditions. Figs. 5 and 6 show the correlation fields at
09.00 and 15.00 h, respectively.

Fig. 4. Average wind pattern across Sydney for each hour (summer 1993).

Fig. 5. Correlation field of ozone (spring 1993 at 09.00 h).



Fig. 6. Correlation field of ozone (spring 1993 at 15.00 h).

The correlation changes rapidly to the north of Lidcombe but rather slowly in the south-east toward
Woolooware and there is a uniform change in the other directions. It means that even though the distance
between Lidcombe and Woolooware is larger than that between Lidcombe and Westmead, the correlation
between Lidcombe and Woolooware is higher than that of Lidcombe and Westmead.
From this, we can say that the areas around Lidcombe, Liverpool and Fairfield in the south-west,
Earlwood and Rozelle in the east and Woolooware in the south share similar characteristics in terms of ozone
behaviour and formation.
At 15.00 h, the correlation field shows that the correlation coefficient changes slowly in the south-west
direction from Lidcombe. This is expected, as a prevailing north-easterly wind flow occurs in the
afternoon.
For the 1993/1994 summer period, the correlation coefficients were also derived from the correlation
matrix. It is expected that the correlation field is different for each season. The correlation fields for the
summer of 1993/1994 at 09.00 and 15.00 h are shown in Figs. 7 and 8.
At 09.00 h, the correlation changes slowly in the east and south-west directions from
Lidcombe. A relatively strong correlation between Lidcombe and Rozelle and Earlwood in the east is also
evident.
At 15.00 h, the correlation changes slowly with distance in the south-west direction from Lidcombe,
i.e. the Lidcombe, Lansvale, Peakhurst and Bankstown areas share the same airshed at this time. In contrast,
the correlation changes rather steeply in the north-west direction from Lidcombe.

3.4.1.2. Nitrogen oxides (NOx). For nitrogen oxides (NOx), the correlation fields for spring 1993 at
09.00 and 15.00 h are shown in Figs. 9 and 10.
At 09.00 h, the NOx correlation between Lidcombe and other stations is uniformly inversely related to
distance in all directions in an area encompassing Rozelle, Earlwood, Peakhurst, Smithfield, Liverpool,
Westmead and Lindfield. There is no prevalent direction even though to the south, the correlation changes

Fig. 7. Correlation field of ozone (summer 1993/1994 at 09.00 h).

more slowly than in other directions. Compare this with the pattern of the ozone correlation field at the same
time. The difference is due to the fact that the NOx distribution is much more influenced by local emissions
than by meteorology.
The correlation of NOx between Lidcombe and other stations at 15.00 h is low compared to that between
Lidcombe and other stations at 09.00 h. In fact, the correlation coefficients are mostly below 0.5. At this
time, the NOx distribution is independent and randomly distributed across all sites. From the correlation

Fig. 8. Correlation field of ozone (summer 1993/1994 at 15.00 h).



Fig. 9. Correlation field of NOx (spring 1993 at 09.00 h).

Fig. 10. Correlation field of NOx (spring 1993 at 15.00 h).



matrix, no particular groups of stations can be clustered together based on the high correlation coefficients
between themselves.

4. Summary and discussion

This paper attempts to show general characteristics of the Sydney airshed. Other studies such as those
in The Netherlands [6,16] studied the spatial correlation of SO2 , NO, NO2 and ozone over a long distance
network. The correlation distance is in the order of 100 km and the distance when the correlation coefficient
drops to 0.6 is comparable to this study using Sydney data.
It has been shown that within the 30 km radius, the sites have a reasonable correlation for both ozone
and nephelometer. The homogeneity is generally held within this area. However, the isotropic condition
is not met. For ozone, there is no structure dependence within the 40 km distance in both the summer and
winter season. This means that the ozone formation occurs independently across the Sydney basin.
From the correlation field isopleth using Lidcombe as a reference station, the anisotropic nature of the
distribution is evident. The rapid change in correlation with distance between Lidcombe and the eastern
monitoring stations (Rozelle, Kensington, Bondi Junction) suggests that the pollution concentration is
governed by the local meteorological or emission characteristics of the area rather than the mesoscale
meteorological condition.
For ozone, the gradual decrease in correlation with distance between Lidcombe and the south-west
stations (Liverpool, Campbelltown, Bringelly) suggests that they share much the same characteristics
and there is no need to increase the network density in this area. This could be explained by the prevalent
south-east wind in the afternoon during the spring and summer period.
The relationship between the correlation and distance between sites as well as meteorological factors
such as wind speed, wind direction and stability class will be explored in the future.
There are many criteria for the design of the monitoring network depending on the function of the
network. Seinfeld [13] listed a few functions and determined which criteria are most appropriate to be used.
The function of the network can be one or more of the following:
1. To provide a basis for air pollution control regulation and strategy.
2. To determine the effectiveness of control action on ambient air quality.
3. To provide real-time data and trends for an episode alert and warning system.
These functions lead to the criteria needed to determine the effect of source emission changes on air
quality. Alternatively, if the function of the network is to determine the source–receptor relations then the
criteria should be to assess the effective dose level to the population. The network configuration is likely
to be different under different criteria.
In the analysis described above, one of the main questions to ask is whether the current air quality
monitoring network is adequate or not as a warning system to provide data and trends for detecting an
exceedance that can occur in the whole Sydney basin. The other related question is whether the current
network configuration is optimal or not in the sense that some of the stations should be relocated to other
locations which can measure and reveal more information. In other words, the aim is to maximise the
information that can be collected from the current number of monitoring stations available.
If we adopt the aim of detecting air quality exceedance as the basis for the network configuration, then the
clustering of stations exhibiting similar distribution statistics results in clusters which can contain stations
that are far apart geographically from each other. The classification of stations based on the correlation
similarity measure results in clusters containing stations which are usually geographically close and
hence share the same airshed. Using the results of these two clustering methods, the rationalisation of the
network configuration design can be done by relocation or removal of stations within the same cluster.
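The grouping of stations by a correlation similarity measure can be sketched as follows. This is only an illustration, not the classification method used in this study: it assumes a simple rule in which two stations join the same cluster whenever their Pearson correlation reaches a chosen threshold (equivalent to single-linkage clustering cut at a dissimilarity of 1 − threshold). The station labels A to D, the threshold of 0.7 and the short series are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def cluster_by_correlation(series, threshold=0.7):
    """Group stations that are linked by a correlation >= threshold.

    `series` maps a station label to its concentration time series.
    Stations are merged with a union-find structure, so a chain of
    pairwise links puts all chained stations in one cluster.
    """
    names = sorted(series)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster root, shortening the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if pearson(series[a], series[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(clusters.values())


# Synthetic series for four hypothetical stations (not measured data).
demo_series = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 11],
    "C": [5, 3, 4, 1, 2],
    "D": [6, 3, 5, 1, 2],
}
print(cluster_by_correlation(demo_series, threshold=0.7))
```

Stations that fall in the same cluster under such a rule are the candidates for relocation or removal; the threshold controls how aggressively the network would be thinned.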
It is important to note that the results of clustering of stations are different for different pollutants. This
is due mainly to their air chemistry and dispersion characteristics. The optimal configuration for one
pollutant is not necessarily the optimal configuration for the others.
As meteorology can produce abnormal phenomena outside its usual pattern, and if the aim is to gain a
thorough understanding of the Sydney area airshed, then it is worthwhile to point out that there are some
areas in Sydney where the density of the network stations should be increased to a level at which
they can detect these changes.

Appendix A. Correlations between pairs of stations plotted against corresponding distance

A.1. Ozone

A.1.1. Summer
The plot of correlation versus distance shows a significant correlation at separations of 10 to 30 km. The
correlation coefficient is between about 0.7 and 0.9 within this radius (Figs. 11 and 12).
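A correlation-versus-distance plot of this kind needs one (distance, correlation) pair for every pair of stations. The sketch below shows one way such pairs could be assembled; it is not the procedure used to produce the figures in this appendix. It assumes station positions are given as latitude and longitude in degrees, uses the haversine great-circle distance on a spherical Earth of radius 6371 km, and all names in it are hypothetical.

```python
from itertools import combinations
from math import asin, cos, radians, sin, sqrt


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * radius_km * asin(sqrt(a))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_vs_distance(coords, series):
    """Return (distance_km, correlation, site_a, site_b) for every site pair.

    `coords` maps a site label to (latitude, longitude) in degrees and
    `series` maps the same labels to concentration time series.
    The result is sorted by increasing distance, ready for plotting.
    """
    pairs = []
    for a, b in combinations(sorted(coords), 2):
        distance = haversine_km(*coords[a], *coords[b])
        pairs.append((distance, pearson(series[a], series[b]), a, b))
    return sorted(pairs)
```

With n stations this yields n(n − 1)/2 points, which is why a modest network still produces a dense scatter in plots of this type.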

A.1.2. Winter
During June 1993, the correlation between sites is about 0.8 at 10 km and slowly drops to 0.7 at about
30 km. The ozone correlation coefficient is significant within this 30 km radius. The extrapolation of
correlation to 0 km distance intercepts the correlation axis at a value of <1. This indicates that the data
contains measurement and sampling errors (Figs. 13 and 14).
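The extrapolation to zero distance can be illustrated with a least-squares straight line. This is a minimal sketch that assumes a linear decay of correlation with distance, which is a simplification, and it uses only the two approximate values quoted above for June 1993 (0.8 at 10 km and 0.7 at 30 km) rather than the full set of station pairs in Fig. 13. The function name is hypothetical.

```python
def fit_line(distances, correlations):
    """Ordinary least-squares fit of correlation = intercept + slope * distance."""
    n = len(distances)
    mean_d = sum(distances) / n
    mean_r = sum(correlations) / n
    sdd = sum((d - mean_d) ** 2 for d in distances)
    sdr = sum((d - mean_d) * (r - mean_r)
              for d, r in zip(distances, correlations))
    slope = sdr / sdd
    intercept = mean_r - slope * mean_d
    return intercept, slope


# Approximate values quoted in the text for June 1993 (illustrative only).
intercept, slope = fit_line([10.0, 30.0], [0.8, 0.7])
print(f"intercept at 0 km: {intercept:.2f}, slope per km: {slope:.4f}")
```

For these two values the line has a slope of −0.005 per km and reaches about 0.85 at 0 km. The shortfall from 1 is what the text attributes to measurement and sampling errors.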

Fig. 11. Ozone correlation between stations vs. distance (January 1993).

Fig. 12. Ozone correlation between stations vs. distance (January 1994).

Fig. 13. Ozone correlation between stations vs. distance (June 1993).

Fig. 14. Ozone correlation between stations vs. distance (June 1994).

Fig. 15. Nephelometer correlation between stations vs. distance (January 1993).

A.2. Nephelometer

A.2.1. Summer
The spatial correlation slowly decreases with distance starting at about 0.7 and decreasing to 0.6 at
about 30 km. This indicates a low correlation between sites in the Sydney area (Figs. 15 and 16).

A.2.2. Winter
The inter-site correlation versus distance for the June 1993 period is shown in Fig. 9. The correlation
coefficient starts at about 0.7 and drops to 0.6 at a distance of about 15 km. For the June 1994 period,
the correlation coefficient drops to 0.6 at a larger distance of about 30 km. A correlation coefficient of 0.5
or below indicates that at most 25% of the variance at a site can be explained by the other site (Figs. 17 and 18).
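The 25% figure follows from squaring the correlation coefficient: for a linear relationship between two sites, the fraction of variance at one site explained by the other is r². The short sketch below, with a hypothetical function name, applies this to the coefficients quoted in this appendix.

```python
def variance_explained_percent(r):
    """Percentage of variance explained by a linear relation with correlation r."""
    return round(100 * r * r)


for r in (0.7, 0.6, 0.5):
    print(f"r = {r}: about {variance_explained_percent(r)}% of variance explained")
```

Coefficients of 0.7, 0.6 and 0.5 therefore correspond to about 49%, 36% and 25% of the variance, so even the highest nephelometer correlations leave roughly half of the variance at a site unexplained by its neighbour.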

Fig. 16. Nephelometer correlation between stations vs. distance (January 1994).

Fig. 17. Nephelometer correlation between stations vs. distance (June 1993).

Fig. 18. Nephelometer correlation between stations vs. distance (June 1994).

Fig. 19. Nitrogen oxides correlation between stations vs. distance (January 1993).

Fig. 20. Nitrogen oxides correlation between stations vs. distance (January 1994).

Fig. 21. Nitrogen oxides correlation between stations vs. distance (June 1993).

Fig. 22. Nitrogen oxides correlation between stations vs. distance (June 1994).

A.3. Nitrogen oxides (NOx)

A.3.1. Summer
For the January 1993 period, the correlation coefficient is about 0.7 at short distances and then drops to 0.6 at about
20 km. For the January 1994 period, the correlation coefficient drops to 0.6 at a larger distance of about
30 km (Figs. 19 and 20).

A.3.2. Winter
The correlation coefficient is about 0.7 at short distances and drops to 0.5 at a distance of about 30 km.
The decrease of the correlation coefficient with distance is faster in the June 1994 period than in the June 1993
period (Figs. 21 and 22).

References

[1] V. Anh, M. Azzi, H. Duc, G. Johnson, Classification of air quality monitoring stations in the Sydney region, in:
Proceedings of the Asia Pacific Environment Conference, Singapore, June 1996.
[2] L. Casado, S. Rouhani, C. Cardelino, A. Ferrier, Geostatistical analysis and visualization of hourly ozone data, Atmos.
Environ. 28 (12) (1994) 2105–2118.
[3] P. Chock, S. Levitt, A space–time correlation study of oxidant and carbon monoxide in the Los Angeles basin, Atmos.
Environ. 10 (1976) 107–113.
[4] D. Elsom, Spatial correlation analysis of air pollution data in an urban area, Atmos. Environ. 12 (1978) 1103–1107.
[5] G. Finzi, G. Fronza, S. Rinaldi, Stochastic modelling and forecast of the dosage area product, Atmos. Environ. 12 (1978)
831–838.
[6] R. Gilbert, J. Simpson, Kriging for estimating spatial pattern of contaminants: potential and problems, Environ. Monitoring
Assessment 5 (1985) 113–135.
[7] I. Goldstein, L. Landovitz, Analysis of air pollution patterns in New York city. II. Can one aerometric station represent the
area surrounding it? Atmos. Environ. 11 (1977) 53–57.
[8] I. Goldstein, L. Landovitz, Analysis of air pollution patterns in New York city. I. Can one station represent the large
metropolitan area? Atmos. Environ. 11 (1977) 47–52.
[9] P. Guttorp, P. Sampson, Methods for estimating heterogeneous spatial covariance functions with environmental
applications, in: G. Patil, C. Rao (Eds.), Handbook of Statistics, Vol. 12, 1994, pp. 661–689.
[10] G. Johnson, M. Azzi, P. Best, K. Lunney, V. Anh, An initial analysis of Sydney ozone measurements, Investigation
report CET/1R377R for NSW EPA, Sydney, 1995.
[11] M. Kendall, Multivariate Analysis, 2nd Edition, Griffin, London, 1980.
[12] B. McNeney, J. Petkau, Overdispersed Poisson regression models for studies of air pollution and human health, Can. J. Stats.
22 (1994) 421–440.
[13] J. Seinfeld, Optimal location of pollutant monitoring stations in the airshed, Atmos. Environ. 6 (1972) 847–858.
[14] Z. Sen, Regional air pollution assessment by cumulative semivariogram technique, Atmos. Environ. 29 (4) (1995) 543–548.
[15] J. Shannon, M. Wesely, J. Brady, Objective sensor placement for sampling regional turbidity, Atmos. Environ. 12 (1978)
937–943.
[16] N. Van Egmond, D. Onderdelinden, Objective analysis of air pollution monitoring network data; spatial interpolation and
network density, Atmos. Environ. 15 (6) (1981) 1035–1046.
[17] A. Venkatram, On the use of kriging in the spatial analysis of acid precipitation data, Atmos. Environ. 22 (9) (1988)
1963–1975.
[18] D. Wolff, M. Parsons, Pattern Recognition Approach to Data Interpretation, Plenum Press, New York, 1984.
