Professional Documents
Culture Documents
Abstract
Mapping to predict the air pollution concentration in Da Nang city is an urgent issue for Nhận 01.08.2018
management agencies and researchers of environmental pollution. Although the simulation of Được duyệt 10.2018
spatial location has become popular, it uses the classical interpolation methods with low Công bố 06.01.2019
reliability. Based on the distribution of air quality monitoring stations located in industrial
parks, residential areas, transport axes ... and sources of air pollution, the application of
geostatistical theories, this study presents the results of the Cokriging's interpolation selection
which provides forecast results of air pollution distribution in Da Nang city with high reliability.
In this article, we use the recorded TSP concentrations (one of major air pollution causes at Keywords
large metropolis) at several observational stations in Da Nang city, employ the Cokriging Air pollution,
interpolation method to find suitable models, then predict TSP dust concentrations at some geostatistics, Cokriging,
unmeasured stations in the city. Our key contribution is finding good statistical models by variogram
several criteria, then fitting those models with high precision.
® 2018 Journal of Science and Technology - NTTU
variance of the difference between the attribute values at all value of variogram corresponding to a vector with origin in
points separated by has followed [4]: si and extremity in sj.
1
γ(h) = ∑N(h) )
i=1 [Z(si − Z(si + h)]
2
(1) In fact, we can also use the multiple parameters in the
2N(h)
relation to each other. We can estimate certain parameters,
where Z(s) indicates the magnitude of the variable, and
in addition to information that may contain enough by
N(h) is the total number of pairs of attributes that are
itself, one might use information of other parameters that
separated by a distance h.
have more details. Cokriging is simply an extension of
Under the second-order stationary conditions [5], one
auto-kriging in that it takes into account additional
obtains:
correlated information in the subsidiary variables. It
[Z(s)] appears more complex because the additional variables
and the covariance: increase the notational complexity.
Cov[Z(s), Z(s h)] [(Z(s) )(Z(s h) )] Suppose that at each spatial location si, i 1,2,...,n we
[Z(s)Z(s h) 2 ] (2) observe k variables as follows:
C(h) Z1 (s1 ) Z1 (s 2 ) Z1 (s n )
Z2 (s1 ) Z2 (s 2 ) Z2 (s n )
Then Var[Z(s)] C(0) [Z(s) ]2
1
(h) [Z(s) Z(s h)]2 C(0) C(h) Zk (s1 ) Zk (s2 ) Zk (s n )
2
The most commonly used models are spherical, We want to predict Z1(s0), i.e. the value of variable Z1 at
exponential, Gaussian, and pure nugget effect (Isaaks & location s0.
Srivastava,1989) [6]. The adequacy and validity of the This situation that the variable under consideration (the
developed variogram model is tested satisfactorily by a target variable) occurs with other variables (co-located
technique called cross-validation. variables) arises many times in practice and we want to
Crossing plot of the estimate and the true value shows the explore the possibility of improving the prediction of
correlation coefficient R2. The most appropriate variogram variable Z1 by taking into account the correlation of Z1 with
was chosen based on the highest correlation coefficient by these other variables.
trial and error procedure. The predictor assumption:
k n
w Z (s ) w
Kriging technique is an exact interpolation estimator used
to find the best linear unbiased estimate. The best linear Ẑ1 (s0 ) ji j i 11Z1 (s1 )
unbiased estimator must have a minimum variance of j1 i 1
estimation error. We used ordinary kriging for spatial and (5)
w1n Z1 (s n ) w 21Z2 (s1 ) w 2n Z2 (s n )
temporal analysis, respectively. Ordinary kriging method is
mainly applied for datasets without and with a trend,
respectively. w k1Zk (s1 ) w kn Zk (s n )
The general equation of linear kriging estimator is We see that there are weights associated with variable Z1
n but also with each one of the other variables. We will
Ẑ(s0 )
w Z(s )
i 1
i i (3) examine ordinary cokriging, which means that
[Z j (si )] j for all j and i. In vector form:
In order to achieve unbiased estimations in ordinary kriging [Z1 (s)] 1
the following set of equations should be solved
[Z2 (s)] 2
simultaneously. [Z(s)] (6)
n
i 1
w i (si ,s j ) (s0 ,si ) [Zk (s)] k
We want the predictor Ẑ1 (s0 ) to be unbiased, that is
n (4)
ˆ (s )] . We take expectations of (5)
[Z1 0 1
wi 1
i 1
where Ẑ(s0 ) is the kriged value at location s0, Z(si) is the
known value at location si, wi is the weight associated with
the data, 𝜆 is the Lagrange multiplier, and 𝛾(𝑠𝑖 , 𝑠𝑗 ) is the
k n n n
[Zˆ 1 (s0 )]
wj1 i 1
ji [Z j (si )] min e2 [(Z1 (s0 )
w Z (s ) w
i 1
1i 1 i
i 1
2i Z2 (si )
(13)
w11 [Z1 (s1 )] w1n [Z1 (s n )] n
or
w k1 [Zk (s1 )] w kn [Zk (s n )] n
and using (6), we have
[Zˆ 1 (s0 )] w111 w1n 1 w 21 2 w 2n 2
min e2 [(Z1 (s0 ) 1 )
w [Z (s ) ]
i 1
1i 1 i 1
(14)
w k1k w kn k (8) n
n n n
w 2i [Z2 (si ) 2 ]]
2
w w
i 1
1i 1
i 1
2i 2
w
i 1
ki k 1 i 1
We complete the square (14) to get:
Therefore, we must have the following set of constraints: n
n n n [Z1 (s0 ) 1 ]2 2
w [Z (s ) ][Z (s ) ]
1i 1 0 1 1 i 1
w1i 1, w 2i 0, , w ki 0 (9) i 1
i 1 i 1 i 1 n
As with the other forms of kriging, cokriging minimizes the
mean squared error of prediction (MSE):
2
w
i 1
2i [Z1 (s 0 ) 1 ][Z 2 (si ) 2 ]
min 2 [Z (s ) Zˆ (s )]2 n n
w w [Z (s ) ][Z (s ) ]
e 1 0 1 0
or 1i 1j 1 i 1 1 j 1 (15)
k n i 1 j1
min e2 [Z1 (s0 )
w Z (s )]
j1 i 1
ji j i
2
(10)
n
w
n
w
n n n
wi 1
1i 1,
w
i 1
2i 0, ,
wi 1
ki 0 (11) 2[
i 1
w1i [Z1 (si ) 1 ][
i 1
2i [Z2 (si ) 2 ]]
For simplicity, lets assume k = 2, in other words, we It can be shown that the last term of the expression (15) is
observe variables Z1 and Z2 and we want to predict Z1. equal to:
n n
w [Z (s ) ][ w
Therefore, from (10) (with k = 2) we have
n n 2[ 2i [Z2 (si ) 2 ]
w Z (s ) w
1i 1 i 1
min e2 [Z1 (s0 ) 1i 1 i 2i Z2 (si )]
2 (12)
i 1 i 1
(16)
i 1 i 1 n n
n n 2
w w 2 j[Z1 (si ) 1 ][Z2 (s j ) 2 ]
w w
1i
From (9), we have 0 2i 2i 2 . Let's add the i 1 j1
i 1 i 1 Find now the expected value of the expression (15):
n
following quantities: 1 1
w i 1
2i 2 on (12), we
have:
n n
min [Z1 (s0 ) 1 ]2 2
i 1
w1i [Z1 (s0 ) 1 ][Z1 (si ) 1 ] 2C12 (s0 ,si ) 2
w
j1
2 jC 22 (si ,s j )
n (21)
n
2
i 1
w 2i [Z1 (s0 ) 1 ][Z2 (si ) 2 ]
2
w C
j1
1j 21 (si ,s j ) 2 2 0, i 1,..., n
n n
w w
(17)
[Z1 (si ) 1 ][Z1 (s j ) 1 ] n n
w 1, w
1i 1j
i 1 j1 1i 2i 0
n n i 1 i 1
w 2i w 2 j [Z2 (si ) 2 ][Z2 (s j ) 2 ] Put
i 1 j1 C11 (s1 ,s1 ) C11 (s1 ,s n )
n n
[C11 ]
2
w w
i 1 j1
1i 2j [Z1 (si ) 1 ][Z2 (s j ) 2 ] C (s ,s ) C (s ,s )
11 n 1 11 n n
;
w
1 0 w1n w 2n
min 12 2 w1i C11 (s0 ,si ) 2 2i C12 (s 0 ,si )
i 1 i 1
C11 (s0 ,s1) C12 (s0 ,s1 )
n n n n [C11 (s 0 ,s i )] ; [C12 (s0 ,si )]
i 1 j1
w1i w1jC11 (si ,s j )
w
i 1 j1
2i w 2 jC 22 (si ,s j )
(19)
C (s ,s )
11 0 n
C (s ,s )
12 0 n
[C11 (s0 ,si )] Table 2 isotropic variogram values of the dust TSP
Nugget Sill Range r2 RSS
[C (s ,s )]
c 12 0 i
1 Linear 2106 2499 19295 0.03 6.02E+07
Gaussian 1 2482 2252 0.081 5.73E+07
0
We have Gw = c Spherical 1 2479 2930 0.078 5.76E+07
where i 1,2,...,n , C12(h) may not be the same as C21(h), Exponetial 1 2481 3480 0.07 5.83E+07
h = |si – sj|. This is because of definition of cross-
covariance: 𝐶12 (ℎ) = 𝔼{[𝑍1 (𝑠) − 𝜇1 ][𝑍2 (𝑠 + ℎ) − 𝜇2 ]}
1
and 𝐶̂12 (ℎ) = ∑ 𝑍1 (𝑠)𝑍2 (𝑠 + ℎ) − 𝜇̂ 1 𝜇̂ 2 , obviously,
𝑁(ℎ)
1
𝐶̂21 (ℎ) = ∑ 𝑍2 (𝑠)𝑍1 (𝑠 + ℎ) − 𝜇̂ 2 𝜇̂ 1
𝑁(ℎ)
is not necessarily equal to 𝐶̂12 .
The Cokriging system is written as Gw = c, where the
vector w, c have dimensions (2n + 2) × 1 and the matrix G
has dimensions (2n + 2) × (2n + 2). The weights will be
obtained by w = G-1c.
The GS+ software (version 5.1.1) was used for
geostatistical analysis in this study (Gamma Design
Software, 2001) [7].
Through Semi-variance map of these two parameters, the From Fig. 8 and Fig. 9, we see that, in March 2016 at K49.3
model of isotropic is suitable. The variograms values are neighborhood has low pollution levels, due to transport and
presented in Table 4. less population density. The process of urbanization has not
Table 4 Isotropic variogram values of tsp and NO2 developed as today. Neighborhood of K40.3 have high
Nugget Sill Range r2 RSS pollution levels, so at this point density traffic caused high
Linear 302 302 19295 0.0 1539545 proportion in pollution. This is one of the focal areas of the
city. It is the intersection of districts and there are many
Gaussian 1 330 2460 0.079 1424179
roads with crowded transport volume. The process of
Spherical 1 329 3270 0.076 1433748 urbanization is growth.
Exponetial 1 327 3510 0.068 1452090
Model Testing: The reliable result of model selection using
appropriate interpolation is expressed in Table 5 by
coefficient of regression, coefficient of correlation and
interpolated values, in addition to the error values as the
standard error (SE) and the standard error prediction (SE
Prediction).
Table 5 Testing the model parameters
Coefficient Coefficient
SE SE Prediction
regression correlation
1.026 1 0.001 0.141
Figure 8 2D Cokriging Interpolation Map of TSP
References
[1] Robert Angus Smith, “On the Air of Towns”, Journal of the Chemical Society, 9, pp. 196-235, 1859.
[2] Anh Pham Thi Viet, “Application of airborne pollutant emission models in assessing the current state of the
air environment in Hanoi area caused by industrial sources”, 6th Women's Science Conference, Ha Noi national
university, pp. 8-17, 2001.
[3] Yen Doan Thi Hai, “Applying the Meti-lis model to calculate the emission of air pollutants from traffic and
industrial activities in Thai Nguyen city, orienting to 2020”, Journal of Science and Technology, Volume 106
No. 6, Thai Nguyen university, 2013.
[4] S.H. Ahmadi and A.Sedghamiz, “Geostatistical analysis of Spatial and Temporal Variations of groundwater
level”, Environmental Monitoring and Assessment, 129, 277-294, 2007.
[5] R.Webster and M.A. Oliver, Geostatistics for Enviromental Scientists, 2nd Edition, John Wiley and Sonc
LTD, The Atrium, Southern Gate, Chichester, West Sussex PO19, England, 6-8, 2007.
[6] E.Isaaks and M.R. Srivastava, An introduction to applied geostatistics, New York: Oxford University Press,
1989.
[7] Gamma Design Software, GS+ Geostatistics for the Environmental Science, version 5.1.1, Plainwell USA:
MI, 2001.
[8] P.Goovaerts, Geostatistics for natural resources Evaluation, New York: Oxford University Press, 1997.
Ứng dụng phương pháp nội suy Cokriging để dự báo chỉ số chất lượng không khí cho
nồng độ bụi TSP thành phố Đà Nẵng
Nguyễn Công Nhựt1,*, Lai Văn Phút1, Bùi Hùng Vương1
1
Khoa Công nghệ thông tin, Trường Đại học Nguyễn Tất Thành, Việt Nam
*ncnhut@ntt.edu.vn
Tóm tắt Việc lập bản đồ để dự đoán nồng độ ô nhiễm không khí ở thành phố Đà Nẵng là một vấn đề cấp bách
đối với các cơ quan quản lí và các nhà nghiên cứu về ô nhiễm môi trường. Mặc dù mô phỏng về vị trí không
gian đã trở nên phổ biến, nó sử dụng các phương thức nội suy cổ điển với độ tin cậy thấp. Dựa trên sự phân bố
các trạm quan trắc chất lượng không khí nằm trong khu công nghiệp, khu dân cư, trục giao thông ... và nguồn ô
nhiễm không khí, ứng dụng các lý thuyết địa chất, nghiên cứu này trình bày kết quả lựa chọn phương pháp nội
suy Cokriging dự báo ô nhiễm ở thành phố Đà Nẵng với độ tin cậy cao. Trong bài viết này, tôi sử dụng nồng độ
TSP được ghi nhận (một trong những ô nhiễm không khí chính gây ra tại các đô thị lớn) tại một số trạm quan sát
ở thành phố Đà Nẵng, sử dụng phương pháp nội suy Cokriging để tìm mô hình phù hợp, sau đó dự báo nồng độ
bụi TSP tại một số trạm không có dữ liệu quan trắc trong thành phố. Đóng góp quan trọng của tôi là tìm kiếm
các mô hình thống kê tốt theo một số tiêu chí, sau đó tìm các mô hình phù hợp với độ chính xác cao.
Từ khóa Ô nhiễm không khí, địa lý, Cokriging, variogram