for their given application. Obviously, RMSE is an important index
for geometric accuracy. Therefore, it is suggested that the RMSE, area
omission error, and area commission error should be used for the
majority of applications.
Level III: Shape similarity indexes (corner difference, perimeter difference, area difference, and moment-derived shape similarity). Some
applications such as cadastral management may have shape similarity
requirements. The first three indexes are easy to calculate and can be
used to roughly estimate the shape similarity. The moment-derived
index has a robust theoretical background and is more rigorous. Certain
applications such as building visualization may require shape similarity.
For three-dimensional building evaluation, some of the indexes can
be directly used (detection rate, correctness, corner difference). RMSE can
be calculated using x, y, z coordinates. In addition, a three-dimensional
index can be defined to represent the volume difference between the
extracted building and the reference. Similarly, the area omission and
commission errors can be replaced by volume omission and commission errors.
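As a rough illustration of an RMSE computed from x, y, z coordinates, the sketch below assumes that each extracted building corner has already been matched to a reference corner and that the error of a corner is its three-dimensional Euclidean distance to the reference. The text does not give the formula, so this definition and the name `rmse_3d` are assumptions made for the example.

```python
import math

def rmse_3d(extracted, reference):
    """RMSE over matched corners, each given as an (x, y, z) tuple.

    Assumption: the per-corner error is the 3-D Euclidean distance
    between an extracted corner and its matched reference corner.
    """
    if len(extracted) != len(reference) or not extracted:
        raise ValueError("need two equally long, non-empty corner lists")
    total = 0.0
    for (xe, ye, ze), (xr, yr, zr) in zip(extracted, reference):
        total += (xe - xr) ** 2 + (ye - yr) ** 2 + (ze - zr) ** 2
    return math.sqrt(total / len(extracted))

# Hypothetical corner coordinates in meters, for illustration only.
extracted_corners = [(10.2, 5.1, 12.0), (20.0, 5.0, 12.4)]
reference_corners = [(10.0, 5.0, 12.5), (20.3, 5.2, 12.5)]
print(f"3-D RMSE: {rmse_3d(extracted_corners, reference_corners):.2f} m")
```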
A comprehensive building accuracy assessment approach has been
presented. Unlike the more popular assessment approaches that only
rely on building counts, this research provides ten quantitative indexes
to add to the assessment of extraction completeness, correctness, geometric accuracy, and feature shape similarity.
Using only count-based indexes may generate an overoptimistic performance evaluation. Also using too few indexes may allow biased assessment results to be perpetuated. Implementing the proposed ten indexes can evaluate the accuracy of the building extraction process more
extensively. Accuracy assessment is a complex task, and different indexes should be combined to provide insight on the implications of
inaccuracy from several different perspectives.
[1] A. Gruen, E. P. Baltsavias, and O. Henricsson, Automatic Extraction of Man-Made Objects from Aerial and Space Images (II). Basel, Switzerland: Birkhäuser Verlag, 1997.
[2] H. Mayer, "Automatic object extraction from aerial imagery: A survey focusing on buildings," Comput. Vis. Image Understanding, vol. 74, no. 2, pp. 138–149, May 1999.
[3] R. Nevatia and A. Huertas, "Research in knowledge-based automatic feature extraction," Inst. Robotics Intell. Syst., Univ. Southern California, Los Angeles, Tech. Rep. 00383, 1999. [Online]. Available:
[4] T. Kim and J. P. Muller, "A technique for 3D building reconstruction," Photogramm. Eng. Remote Sens., vol. 64, no. 9, pp. 923–930, Sep.
[5] J. A. Shufelt, "Performance evaluation and analysis of monocular building extraction from aerial imagery," IEEE Trans. Pattern Anal. Mach. Intell., vol. 21, no. 4, pp. 311–326, Apr. 1999.
[6] O. Henricsson and E. Baltsavias, "3-D building reconstruction with ARUBA: A qualitative and quantitative evaluation," in Proc. Conf. Automatic Extraction of Man-Made Objects from Aerial and Space Images (II), A. Gruen, E. Baltsavias, and O. Henricsson, Eds., 1997, pp.
[7] A. Brunn and U. Weidner, "Hierarchical Bayesian nets for building extraction using dense digital surface models," ISPRS J. Photogramm. Remote Sens., vol. 53, pp. 296–307, Oct. 1998.
[8] S. Theodoridis and K. Koutroumbas, Pattern Recognition. San Diego, CA: Academic, 1999, pp. 245–249.
[9] M. K. Hu, "Visual pattern recognition by moment invariants," IRE Trans. Inf. Theory, vol. IT-8, no. 2, pp. 179–187, Feb. 1962.

Accuracy, Reliability, and Depuration of SPOT HRV and Terra ASTER Digital Elevation Models
Aurora Cuartero, A. M. Felicísimo, and F. J. Ariza
Abstract: The aim of this communication is to study the accuracy and
reliability of digital elevation models (DEMs) generated from two different
satellite sources [Terra Advanced Spaceborne Thermal Emission and
Reflection Radiometer (ASTER) and Système Pour l'Observation
de la Terre (SPOT) High Resolution Visible (HRV) stereoscopic images],
using three different photogrammetric software packages. The main reason for the
study is the heterogeneity and absence of agreement found in previous research concerning several significant aspects of DEM generation methods.
A set of 91 DEMs were generated from SPOT data and 55 DEMs from
ASTER data. Error control was performed with 315 check points determined by differential global positioning systems. Results for the Terra ASTER
DEMs show an elevation root mean square error (RMSE) of 12.6 m.
The corresponding RMSE value for the SPOT HRV DEMs is 7.3 m. In both
cases, the error is less than the pixel size. Furthermore, this communication proposes a technique to improve DEM structure, based on an objective criterion to cleanse redundancy in DEMs without a significant loss of
accuracy. This criterion is based on removing all points with a correlation
value below a threshold value.
Index Terms: Accuracy, Advanced Spaceborne Thermal Emission and
Reflection Radiometer (ASTER), digital elevation model (DEM), High Resolution Visible (HRV), reliability, Système Pour l'Observation de la Terre
(SPOT), Terra.

A digital elevation model (DEM) can be extracted automatically
from stereo satellite images. Numerous applications are based on DEMs,
and their validity depends directly on the quality of the original elevation data. High-quality DEMs are seldom available, even though photogrammetric technology, the most common way to produce DEMs, has
been around for many years.
The possibility of using stereoscopic satellite images for the
global production of digital elevation data did not arise until the launch
of the Système Pour l'Observation de la Terre (SPOT) series in 1986.
Today, several satellites also offer the possibility of stereoscopic
acquisition: SPOT [1], the Modular Optoelectronic Multispectral
Stereo Scanner [2], the Indian Remote Sensing satellite, the Korea
Multipurpose Satellite, the Advanced Visible and Near-Infrared Radiometer sensor [3], Terra [4], and more recently, the high-resolution
pushbroom scanners IKONOS, EROS-A1, QUICKBIRD-2, SPOT 5,
and ORBVIEW-3. Thus, some studies focus on constructing DEMs
from stereoscopic images acquired by high-resolution pushbroom
scanners such as IKONOS [5], [6], EROS-A1 [7], and SPOT-5 [8];
furthermore, it is assumed that the automatic generation of a DEM
from remotely sensed data with subpixel Z accuracy is possible [9].
Automation allows the construction of DEMs with an almost arbitrarily large point density. The selection of very important points,
common in manual processing for the construction of triangulated irregular network (TIN) structures, is not applicable to automatic photogrammetric processes. The result is often a very large DEM
from which a lot of redundant or irrelevant information can be removed. In
a literature review, we could find no references to possible optimization
strategies for this phase of the process.
Accuracy estimation can be carried out by comparing the DEM data
with a set of check points measured by high-precision methods. The
basic conditions for a correct workflow are: 1) high accuracy of check
points and 2) enough points to guarantee error control reliability. We
have found that most research does not satisfy those conditions.
The common sources of check points are topographic contour maps,
whose accuracy is not well known (e.g., see Table I). Also, the error
control is frequently performed with a number of points that is clearly
insufficient to guarantee the test reliability.
Deriving DEMs from stereoscopic satellite images is a well-known
technique; however, the reported DEM accuracy, the method
used to capture the check points, and the way reliability is calculated differ
across the literature reviewed. This variation may be due to the
method used to estimate DEM error, in both the number and the
source of the check points used.
Table I shows some significant examples of the accuracy of SPOT
DEMs. The root mean square error (RMSE) values are very
different, varying from 3.3 to 33 m. The number of check points is also
very different, from 6 to 40, and many authors do not provide information
about this issue. Also, other aspects that may be crucial, such as the
terrain topography, remain unknown.
The Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), onboard the National Aeronautics and Space Administration Terra satellite, provides along-track near-infrared stereoscopic images at 15-m resolution. Terra ASTER is a fairly recent sensor;
thus, little research analyzes the accuracy of the DEMs generated, and most of it relies on simulated ASTER data [4], [17]. The few studies
that focus on DEM generation report widely varying elevation
RMSE values, between 7 and 60 m. Table II shows results of research on the accuracy of DEMs derived from ASTER images.
From previous work, we may conclude that the results for both SPOT
High Resolution Visible (HRV) and Terra ASTER data are very different. Important questions such as the number of check points and the
capture method are not standardized. Some authors do not even report
their control methods.
For this reason, we have conducted a set of experiments that guarantee reliability in error control and analyze factors such as the influence of software, disregarded in previous research.

This communication aims to do the following.
1) Verify the influence of pixel size and stereoscopic capture method
(along/cross-track) on the accuracy of DEMs generated from two different sources (Terra ASTER and SPOT HRV). In order to make
results more consistent, we have used different photogrammetric
software applications: Erdas Imagine with OrthoBase Pro (Leica
Geosystems), Geomatica Ortho Engine (PCI Geomatics), and
Socet Set (Leica Geosystems).
2) Propose a method to improve the DEM structure
without a loss in accuracy. This simplification process enables the data structure to be better adapted for integration in a
geographical information system (GIS).

Manuscript received February 26, 2004; revised November 8, 2004.
This work was supported in part by the Junta de Extremadura (II Plan Regional de Investigación, Desarrollo Tecnológico e Innovación de Extremadura)
and in part by the Fondo Europeo de Desarrollo Regional (FEDER) as part of
Project 2PR03A105.
A. Cuartero and A. M. Felicísimo are with the University of Extremadura,
10071 Cáceres, Spain (e-mail:
F. J. Ariza is with the University of Jaén, 23071 Jaén, Spain.
Digital Object Identifier 10.1109/TGRS.2004.841356
0196-2892/$20.00 © 2005 IEEE

A. Study Area, Data, and Software
The study area is a 23 km × 28 km rectangle in Granada (southern
Spain). It is an area with complex topography: steep slopes in
the south and flat surfaces in the north. Elevations are in the range
300–2800 m with an average of 1060 m.
We have used a pair of panchromatic SPOT HRV stereo images
(10-m pixel) and two pairs of Terra ASTER scenes (15-m pixel). The
SPOT images were taken on November 2, 1991 and January 2, 1992,
and the Terra ASTER images were taken on August 22, 2000.
ASTER data were processed with Erdas Imagine and Ortho Engine.
Erdas Imagine and Socet Set were used for processing SPOT data.
B. DEM Generation
The automatic extraction of DEMs is facilitated if the specific sensor
model information is available. To obtain the most accurate DEMs that SPOT HRV and Terra ASTER images can provide, we
have analyzed the influence of aspects such as the number and spatial distribution of ground control points, the data structure [TIN or uniform
regular grid (URG)], and the sample interval. With some applications,
the algorithms and correlation coefficient threshold can also be tested.
We have conducted several experiments to determine the optimal
value of each influential variable. We constructed 91 SPOT-derived
DEMs and 55 ASTER-derived DEMs (see Section V).
C. Accuracy and Reliability
DEM accuracy is estimated by comparing DEM Z values
with the true elevations at many check points. These pairwise comparisons allow the calculation of the mean error (ME), RMSE,
standard deviation (SD), or similar statistics. The number of check
points is an important factor in reliability because it influences the range of stochastic variations in the SD values [18]. Another factor is obvious: the
accuracy of check points must be sufficient for the control objectives.
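The pairwise comparison described above can be sketched as follows. This is a minimal example that assumes the error at each check point is the DEM elevation minus the check-point elevation and that the SD uses the sample (n - 1) denominator; the function name and the sample values are hypothetical.

```python
import math

def error_statistics(dem_z, check_z):
    """Return (ME, SD, RMSE) of the errors DEM minus check point.

    Assumptions: one DEM elevation per check point, and a sample
    standard deviation with an (n - 1) denominator.
    """
    if len(dem_z) != len(check_z) or len(dem_z) < 2:
        raise ValueError("need two equally long sequences of at least 2 values")
    errors = [d - c for d, c in zip(dem_z, check_z)]
    n = len(errors)
    me = sum(errors) / n
    sd = math.sqrt(sum((e - me) ** 2 for e in errors) / (n - 1))
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    return me, sd, rmse

# Hypothetical elevations in meters, for illustration only.
dem_elevations = [102.0, 98.0, 105.0, 110.0]
gps_elevations = [100.0, 100.0, 100.0, 108.0]
me, sd, rmse = error_statistics(dem_elevations, gps_elevations)
print(f"ME = {me:.2f} m, SD = {sd:.2f} m, RMSE = {rmse:.2f} m")
```

The ME exposes a systematic bias, while the SD and RMSE describe the spread of the errors; the RMSE coincides with the SD only when the ME is close to zero and n is large.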
Errors in DEMs are usually estimated by following the U.S.
Geological Survey recommendation of a minimum of 28 check points
[19]. Li [18] showed, however, that many more points are needed to
achieve a reliability closer to what is accepted in most statistical tests.
The expression that relates reliability to the number of check points is

$$R(e) = \frac{1}{\sqrt{2(n-1)}} \times 100\%$$

where $R(e)$ represents the confidence value in percent, and $n$ is the
number of check points used in the accuracy test. As an inverse example, if we wish to obtain an SD confidence value of 5%, we need
about 200 check points. If we used 28 check points, we would reach a
confidence value of about 14%.
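The relation between reliability and the number of check points can be explored numerically. The sketch below assumes the expression has the form R(e) = 100% / sqrt(2(n - 1)); the function names are illustrative and are not taken from the references.

```python
import math

def reliability_percent(n):
    """R(e) in percent for n check points, assuming 100 / sqrt(2 (n - 1))."""
    if n < 2:
        raise ValueError("at least two check points are required")
    return 100.0 / math.sqrt(2.0 * (n - 1))

def points_needed(target_percent):
    """Smallest n for which R(e) does not exceed target_percent."""
    if target_percent <= 0:
        raise ValueError("target must be positive")
    return math.ceil(1.0 + 0.5 * (100.0 / target_percent) ** 2)

for n in (28, 100, 315):
    print(f"n = {n:3d}: R(e) = {reliability_percent(n):.1f}%")
print(f"check points needed for R(e) = 5%: {points_needed(5.0)}")
```

Under this assumption, 315 check points give an R(e) of about 4%, which agrees with the 96% reliability quoted for the error control in this communication.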
Therefore, the number of check points must guarantee stability in
the error estimates. The research reviewed is rather heterogeneous regarding
the number and accuracy of check points, and no author has verified
the reliability of these results.
Most research used a number of check points that proved clearly
insufficient for guaranteeing the validity of error results [14], [16].
Some articles describe the use of check points taken from preexisting cartography [10], [11]; this procedure is not recommended, as the quality
of the control map itself tends to be unknown. Methods based
on global positioning systems (GPS) constitute the ideal source for these points, since they yield the coordinates with great accuracy
and also allow planning a spatially well-distributed sample covering the
whole area under analysis.
To ensure error reliability, we used a set of 315 randomly distributed
check points whose coordinates were determined by differential GPS
techniques. We calculated the difference between these
points and the elevation values of the DEM, and estimated the ME, SD,
and RMSE. The confidence interval (CI) of the standard deviation was
also calculated (see Table IV).
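The text does not state how the confidence interval of the SD was computed. One common large-sample approximation, shown here purely as an assumption, treats the standard error of the SD as SD / sqrt(2(n - 1)) and uses the normal quantile for the chosen confidence level.

```python
import math
from statistics import NormalDist

def sd_confidence_interval(sd, n, confidence=0.95):
    """Approximate CI for a standard deviation estimated from n errors.

    Assumption: large-sample normal approximation with standard error
    sd / sqrt(2 (n - 1)); this may differ from the authors' procedure.
    """
    if n < 2 or not 0.0 < confidence < 1.0:
        raise ValueError("need n >= 2 and 0 < confidence < 1")
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * sd / math.sqrt(2.0 * (n - 1))
    return sd - half_width, sd + half_width

# Hypothetical SD of 8.0 m estimated from 315 check points.
low, high = sd_confidence_interval(8.0, 315)
print(f"95% CI for SD: {low:.2f} m to {high:.2f} m")
```

The interval narrows with the square root of the number of check points, which is why a few dozen points leave the SD poorly determined.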







D. DEM Depuration Procedure

Due to their very high point density, DEMs generated on digital photogrammetric
workstations may attain massive computational sizes.
Their integration into a GIS may lead to conflict between these huge
data sizes and the analysis, map algebra, and simulation operations. Therefore, it is clearly advisable to avoid a blind inclusion of all of the data derived from the photogrammetric process and
to take only those points that show both good quality and significance
in representing the relief. Eliminating poor-quality data can improve model accuracy; eliminating unnecessary data reduces redundancy. This process can be applied to TIN
structures. Because of their fixed cell size, the procedure is not applicable to
gridded DEMs.
Both effects can increase the effectiveness of the characteristic GIS operations. In a GIS, the DEM is just one more of
the layered variables to be considered. The working hypothesis is that
the correlation coefficient associated with each elevation value may be
interpreted as a reliability index. If this assumption is correct, we can
eliminate the data with a poor correlation value without a significant
loss of accuracy. All the accuracy tests were carried out with the set of
315 check points.
A. Accuracy and Reliability Results
We constructed 146 DEMs, 91 from SPOT images and 55 from
ASTER images, testing numerous combinations of variables until
reaching the most accurate DEM. A synthesis of the most accurate
DEMs is given in Table IV, which lists the values of the ME, SD, its
confidence interval (CI = 95%; α = 0.05), and RMSE. In our case,
the availability of 315 check points enabled the error control to have a
reliability of 96%. Optimal findings include the following.
1) Erdas Imagine generates its most accurate ASTER DEM
(34.8-m RMSE) using 12 ground control points, and its most accurate SPOT
DEM (7.7-m RMSE) using 14 ground control points. Both are
TIN structures.
2) Geomatica OrthoEngine obtains the best ASTER DEM (12.6-m
RMSE) as a URG structure (30-m cell size), using 15 ground
control points.
3) Socet Set obtains the best SPOT DEM (7.3-m RMSE) as a URG
structure (20-m cell size), using 13 ground control points.

Based on the results obtained in this study, DEMs can be generated
from Terra ASTER and SPOT HRV stereo images with
methods of digital restitution, leading to RMSE values less than the
pixel size. The sampling interval is one of the factors that influences
the quality of the DEM: the best results are obtained for a cell size
twice the pixel size (i.e., 30 m for Terra ASTER; 20 m for SPOT
HRV). Increasing the distance among sampled points further is not a good
strategy because it is equivalent to a progressive generalization of the
DEM structure.
The influence of software is obvious from the experiments carried
out. Erdas Imagine shows worse results with ASTER data, whereas
the accuracy of the SPOT DEMs is similar for both Erdas and Socet Set.
These results require some explanation. We believe that the main
reason is an absence of specific geometric satellite models: Erdas can
work with ASTER data, but it forces the use of a generic model
unable to take full advantage of the data. In contrast, we conclude that
the SPOT model is fully implemented, and the results are very similar.
Ortho Engine includes an ASTER-specific model that compensates for the
shortage of orbital parameters.
B. DEM Depuration Results
We have conducted the depuration process based on the hypothesis
of a certain correspondence between correlation and data reliability:
a low correlation value is not definitive proof of poor
quality, but it is a valid warning signal and has statistical significance.
The huge DEM (with no points yet removed) was denoted as
MDE00. Other DEMs were generated by first deleting those
points whose correlation coefficient was less than a threshold value
(Table III). For example, MDE50 was the result of taking a threshold
value of 0.50 for the correlation coefficient.
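The thresholding step can be sketched as a simple filter. The example assumes each mass point carries the correlation coefficient produced by the image matcher; the `MassPoint` record, the function names, and the sample values are hypothetical, and the naming helper only mirrors the MDE00/MDE50 convention described above.

```python
from typing import NamedTuple

class MassPoint(NamedTuple):
    x: float
    y: float
    z: float
    correlation: float  # matching correlation coefficient, assumed in [0, 1]

def depurate(points, threshold):
    """Delete the points whose correlation is less than the threshold."""
    return [p for p in points if p.correlation >= threshold]

def model_name(threshold):
    """Name a depurated DEM after its threshold, e.g. 0.50 gives MDE50."""
    return f"MDE{round(threshold * 100):02d}"

# Hypothetical mass points, for illustration only.
points = [
    MassPoint(0.0, 0.0, 510.2, 0.30),
    MassPoint(10.0, 0.0, 512.9, 0.50),
    MassPoint(0.0, 10.0, 515.1, 0.93),
    MassPoint(10.0, 10.0, 518.4, 0.97),
]
for threshold in (0.0, 0.50, 0.93):
    kept = depurate(points, threshold)
    share = 100.0 * len(kept) / len(points)
    print(f"{model_name(threshold)}: {len(kept)} points kept ({share:.0f}%)")
```

Each depurated point set would then be triangulated into a TIN and checked against the same check points, so that the error statistics can be compared across thresholds.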
Table IV shows error evolution versus the correlation coefficient
threshold. The error did not rise significantly as
the number of eliminated points increased, at least until a correlation
threshold of 0.93 (SD = 7.9) or 0.94 (SD = 8.0) was reached.
On moving to 0.95, the quality of the DEM drops significantly (SD = 12.2).
MDE94 contains only 18.5% of the points of the massive original DEM (MDE00), while MDE93 contains 23%. We emphasize
that the depuration process does not imply an improvement in accuracy
statistics, but it contributes to making the structure much more manageable in a GIS environment.
Automated DEM extraction using cross-track SPOT satellite images has
been known for 17 years. The addition of along-track ASTER provides
an alternative for the extraction of DEM data. In addition, ASTER data
are very attractive because they can be downloaded and are very affordable.
We concluded that both the along-track Terra ASTER and cross-track SPOT images provide the opportunity to generate
DEMs with RMSE Z values less than the pixel size. We cannot conclude that the accuracy results are affected by other factors such as the
stereo capture method (along-track versus cross-track).
Photogrammetric programs are not identical. SPOT geometry and
data are fully implemented, but ASTER data cause more problems. Geomatica shows good ASTER RMSE values, but blunders are common.
Erdas shows bad ASTER RMSE values, but blunders are infrequent.
We emphasize the obligatory use of many accurate check points. The
use of a very limited number of points implies a very unreliable error
control that can make the results useless. We suggest a minimum of 100
points, which corresponds to a confidence value of about 0.07. Since
quality control procedures are always required, carrying out the type of
tests described in this communication should not be a burden. Instead,
this experimentation should be done to lighten the DEM before it can
be regarded as a finished product.
Thanks to A. Curado for the linguistic revision of this
[1] R. Priebbenow and E. E. Clerici, "Cartographic applications of SPOT imagery," Int. Arch. Photogramm., vol. 37, pp. 289–297, 1988.
[2] F. Lanzl, P. Seige, F. Lehmann, and P. Hausknecht, "Using multispectral and stereo MOMS-02 data from the Priroda mission for remote sensing applications," in Proc. Int. Symp. Spectral Sensing Research, Melbourne, Australia, 1995.
[3] T. Hashimoto, "DEM generation from stereo AVNIR image," Adv. Space Res., vol. 25, pp. 931–936, 2000.
[4] R. Welch, T. Jordan, H. Lang, and H. Murakami, "ASTER as a source for topographic data in the late 1990s," IEEE Trans. Geosci. Remote Sens., vol. 36, no. 4, pp. 1282–1289, Jul. 1998.
[5] R. Li, G. Zhou, S. Yang, G. Tuell, N. J. Schmid, and C. Fowler, "A study of the potential attainable geometric accuracy of IKONOS satellite imagery," in Proc. 19th ISPRS Congress, Amsterdam, The Netherlands,
[6] T. Toutin, "DEM generation from new VIR sensors: IKONOS, ASTER and Landsat-7," in Proc. IGARSS, Sydney, Australia, 2001.
[7] L. Chen and T. Teo, "Orbit adjustment for EROS A1 high resolution satellite image," in Proc. 22nd Asian Conf. Remote Sensing, Singapore,
[8] G. Petrie, "The future direction of the SPOT programme: SPOT5 International Conference," in GeoInformatics, vol. 4, 2001, pp. 12–17.
[9] P. Krzystek, "New investigations into the practical performance of automatic DEM generation," in Proc. ACSM/ASPRS Annu. Convention, Charlotte, NC, 1995.
[10] Y. Mukai, T. Sugimura, and K. Arai, "Automated generation of digital elevation model using system corrected SPOT data," in Proc. 23rd Int. Symp. Remote Sensing Environment, Bangkok, Thailand, 1990.
[11] K. C. Sasowsky and G. W. Petersen, "Accuracy of SPOT digital elevation model and derivatives: Utility for Alaska's North slope," Photogramm. Eng. Remote Sens., vol. 58, pp. 815–824, 1992.
[12] N. Al-Rousan and G. Petrie, "System calibration, geometric accuracy testing and validation of DEM and orthoimages data extracted from SPOT stereopairs using commercially available image processing systems," Int. Arch. Photogramm. Remote Sens., vol. 32, pp. 8–15, 1998.
[13] L. Hae-Yeoun, P. Wonkyu, K. Taejung, K. Seungbum, K. LHeung, and K. Tag-gon, "The development of an accurate DEM extraction strategy for satellite image pairs using epipolarity of linear pushbroom sensor and intelligent interpolation scheme," Int. Arch. Photogramm. Remote Sens., vol. 33, pp. 705–712, 2000.
[14] T. Toutin and P. Cheng, "DEM generation with ASTER stereo data," Earth Obs. Mag., vol. 10, pp. 10–13, 2001.
[15] A. Kääb, "Monitoring high-mountain terrain deformation from repeated air- and spaceborne optical data: Examples using digital aerial imagery and ASTER data," ISPRS J. Photogramm. Remote Sens., vol. 57, pp. 39–52, 2002.
[16] A. Hirano, R. Welch, and H. Lang, "Mapping from ASTER stereo image data: DEM validation and accuracy assessment," ISPRS J. Photogramm. Remote Sens., vol. 1255, pp. 1–15, 2003.
[17] M. Abrams and S. J. Hook, "Simulated ASTER data for geological studies," IEEE Trans. Geosci. Remote Sens., vol. 33, no. 3, pp. 692–699, May 1995.
[18] Z. Li, "Effects of check point on the reliability of DTM accuracy estimates obtained from experimental test," Photogrammetric Engineering.
[19] USGS, Digital Elevation Models: Data Users. Reston, VA: U.S. Geol. Surv., 1987.