Professional Documents
Culture Documents
net/publication/305695578
CITATIONS READS
8 273
3 authors, including:
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Qian Zhang on 30 October 2017.
Abstract—This letter proposes an efficient framework for build- new solutions to exploit the spatial details and extract accurate
ing detection from coarse to fine using morphological technique building information [1]. High spatial resolution images have
for high-resolution optical satellite imagery over urban areas. now become one of the most important input sources for rapidly
First, the preliminary result of building regions is obtained by the obtaining building information.
recently developed morphological building index (MBI) method,
which is able to detect potential building structures. However, the A number of studies have used additional information as a
raw results derived from the MBI can be subject to a number of supplement for building verification or distinction. Sources of
false alarms, which are caused by bright soil, roads, and open ar- additional information have included light detection and rang-
eas. In this letter, we propose to use morphological spatial pattern ing (LiDAR) images, digital elevation models, digital surface
analysis as a postprocessing to further optimize the MBI result and models (DSMs) [2], [3], stereo imagery [4], and so on. Qin and
remove the commission errors. The original MBI result is then Fang [2] used both a DSM and aerial imagery to hierarchically
separated into seven mutually exclusive categories—core, islet,
extract buildings. Zhang et al. [3] used ultrahigh-resolution
loop, bridge, perforation, edge, and branch—by applying a series
of morphological transformations such as erosions, geodesic dila- orthophotographs and a DSM to obtain a classification map
tion, reconstruction by dilation, anchored skeletonization, etc. The based on a dual morphological top-hat profile. Shaker et al. [4]
objects corresponding to the generic categories are then analyzed, used IKONOS stereo imagery and digital surface and terrain
and the categories corresponding to building parts are maintained, models to extract building planimetric positions and heights,
while the others are abandoned. After this postprocessing, the respectively. Due to the great contribution of the additional in-
small noisy patches and narrow roads, which were wrongly ex- formation, the aforementioned methods generally perform well.
tracted by the MBI, can be removed. In addition, the shape of
the buildings can also be regularized by removing the branches, However, a major problem is that auxiliary data sources (e.g.,
and the holes contained in the building objects can be identified the LiDAR point cloud) are not available in most situations.
and filled. Extensive experiments performed on GeoEye-1 and On the other hand, image primitives such as shadows [5]–[7],
WorldView-2 images confirm the effectiveness and robustness of corners, edges, or line segments have also been considered
the proposed morphological building detection framework. for building extraction. Ok et al. [5] proposed a new fuzzy
Index Terms—Building detection, high resolution, morphologi- landscape generation approach (denoted as Shadow-based) to
cal, postprocessing. model the directional spatial relationship between buildings and
their shadows, and the final arbitrarily shaped building regions
I. I NTRODUCTION are automatically detected by a GrabCut partitioning approach
from monocular QuickBird and GeoEye-1 imagery. Ok [6] then
B UILDINGS are a key factor in region management and
planning, change monitoring, etc., as buildings can reflect
the status and trends of a region’s development. Therefore,
extended the previous approach [5] and proposed an effective
shadow postprocessing step (denoted as Shadow-GC, GC =
having reliable and accurate information about building objects graph cuts) to refine the building shadow and a two-level graph
is critical, particularly in rapidly developing countries such as theory approach to detect building regions. Manno-Kovacs and
China. Extensive research has been devoted to building extrac- Ok [7] also proposed an approach for building detection from
tion in recent decades. The emergence of available high spatial monocular very high resolution optical images by integrated
resolution images from satellite sensors has provided more urban area knowledge.
spatial information and opportunities for building extraction More recently, Huang and Zhang [8] proposed the mor-
but has also resulted in more challenges. This has prompted phological building index (MBI), which can be quickly im-
scholars to break through the traditional thinking and look for plemented, does not need training samples, and has proved
to be effective for automatic building extraction from high-
resolution imagery. Buildings are extracted by performing a
Manuscript received March 14, 2016; revised June 7, 2016 and June 26, threshold (T) operation on the MBI feature image, which is
2016; accepted July 9, 2016. Date of publication July 27, 2016; date of current
version August 5, 2016. This work was supported in part by the National
extracted by a set of morphological operators (e.g., top-hat
Natural Science Foundation of China under Grants 41301472 and 91338111, by reconstruction, granulometry, and directionality) according
by the China National Science Fund for Excellent Young Scholars under Grant to the intrinsic properties of the buildings (e.g., brightness,
41522110, and by the Foundation for the Author of National Excellent Doctoral size, contrast, directionality, and shape). However, there are
Dissertation of China under Grant 201348.
Q. Zhang and G. Zhang are with the Department of Computer Science and drawbacks to the MBI, as analyzed in [8]. Due to its bright-
Technology, East China Normal University, Shanghai 200241, China. ness hypothesis, i.e., building regions are brighter than their
X. Huang is with the School of Remote Sensing and Information Engineer- surroundings, bright objects such as bright soil, open areas, and
ing, Wuhan University, Wuhan 430079, China (e-mail: huang_whu@163.com). roads can cause a number of false alarms since they are brighter
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org. than their neighborhoods and show similar spectral properties
Digital Object Identifier 10.1109/LGRS.2016.2590481 to buildings.
1545-598X © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
ZHANG et al.: MORPHOLOGICAL BUILDING DETECTION FRAMEWORK FOR OPTICAL IMAGERY OVER URBAN AREAS 1389
II. M ETHODOLOGY
B. MSPA
A. MBI
In a binary map (foreground and background), MSPA is
The MBI [8] can be viewed as a high-level feature for used to analyze the geometry and connectivity of the image
building extraction, which possesses multiscale and multidirec- components. MSPA consists of a series of sequential math-
tional characteristics, free parameters, and low computational ematical morphological operators, such as erosion, geodesic
complexity. The basic principle of the MBI is that building dilation, reconstruction by dilation, and anchored skeletoniza-
regions are brighter than their surroundings (building shadows). tion. Each pixel of the foreground is segmented as one of the
Building objects are extracted by the MBI according to the mutually exclusive sets of structural classes: core, islet, edge,
spectral–spatial properties of the buildings (e.g., brightness, loop, perforation, bridge, and branch. In different applications,
size, contrast, directionality, and shape) by a set of morpho- the corresponding objects in the generic MSPA categories are
logical operators. The calculation of the MBI features is briefly different [9]–[11]. MSPA borrows the notion of path connec-
introduced as follows. tivity to determine the 8-connected or 4-connected regions of
the foreground (FGconn). There are two key parameters in
1) Brightness image. Each pixel’s maximum value in the
MSPA, i.e., the foreground connectivity (FGconn) and the size
visible bands is kept as the brightness. This is because
parameter (Ew). Parameter Ew defines the width or thickness
the visible bands make the most significant contribution
of the noncore classes in the pixels.
to the spectral property of buildings.
Fig. 1 shows the calculation process of MSPA. Core pixels
2) Top-hat morphological profiles. The differential morpho-
(X1 ) are groups of foreground pixels whose distance to the
logical profiles of the top-hat transformation (DMPTH )
background is greater than the given size parameter Ew. The
represent the spectral–structural property of the buildings
core pixels are detected by the erosion of input image (X) with
a Euclidean disk of radius equal to Ew. In practice, the core
DMPTH = [ΔTH(d, s), d ∈ D, s ∈ S] , pixels X1 are obtained by thresholding the Euclidean distance
with, ΔTH(d, s) = |TH(d, s) − TH(d, s − Δs)| transform (EDT) of the foreground by a threshold value Ew
TABLE I
N UMBERS O F G ROUND T RUTH S AMPLES FOR
T HE F IVE VALIDATION A REAS
Fig. 2. Example of the proposed approach. (a) Original image. (b) Original
MBI features. (c) Ground reference. (d) MSPA result, Ew = 2. (e) Final
building result extracted from (d).
Fig. 4. Results of building detection. Row 1: Columns 1–5 represent the original images of R1–R5, respectively. Rows 2–5 represent the building maps extracted
by the Shadow-based, the Shadow-GC, the MBI, and the proposed method, respectively.
as background and samples that are wrongly classified as build- of the buildings, the optimal values of Ew for images with
ings, respectively. The overall accuracy (OA) and Kappa co- different spatial resolutions will be different. For all the test
efficient (Ka) for the building–background classification were images, the OA and Ka of Ew = 1 are higher than those of
calculated based on the confusion matrix. Ew = 2, and the OE is lower in Fig. 3. Thus, the optimal value
The MBI refers to two parameters: directionality (d) and the of Ew is 1 for the five test images.
size (s) of the SE. Eight directions (d) were considered, ranging The postprocessing involves two parameters: the threshold of
from 22.5 to 180 with an interval of 22.5. The linear SE size (s) the smallest area of objects (t) and the length–width ratio (t2)
was varied from 2 to 22 with a step size of 5, which was set of narrow objects, which were set to 30 and 9.6, respectively.
according to the spatial resolution of the image and the size of These parameters were kept constant in all the test data sets.
the buildings in the study area. Further details of MBI parameter
selection can be found in [8].
C. Results and Discussion
Fig. 3 shows the sensitivity of the MSPA parameters, namely,
FGconn and Ew, to the accuracy of the R1–R5 test images. The proposed morphological building detection framework
Clearly, the results are not sensitive to the FGconn parameter. was compared with the original MBI features and the Shadow-
When Ew is kept the same, the results with different FGconn based and Shadow-GC methods proposed by Ok et al. [5] and
values show only very slight differences, which can be ignored. Ok [6]. Fig. 4 shows the extracted building maps, and the
Based on our extensive experiments, objects can be more accuracies are given in Table II.
easily separated with less computation time when FGconn uses It can be seen that many buildings are not detected by the
4-connectivity rather than 8-connectivity. Therefore, FGconn Shadow-based (see row 2 of Fig. 4), and the average OE of
was fixed as 4 in this letter. Parameter Ew has a greater impact the Shadow-based is the highest, as listed in Table II. The
on the results than FGconn, as shown in Fig. 3. According to aforementioned problem is significantly reduced in the results
the definition of Ew, the image resolution, and the average size of the shadow-GC, but there is slight overcorrecting. Many
1392 IEEE GEOSCIENCE AND REMOTE SENSING LETTERS, VOL. 13, NO. 9, SEPTEMBER 2016
TABLE II
A CCURACY A SSESSMENT OF THE B UILDING E XTRACTION FOR THE S HADOW-BASED , THE S HADOW-GC,
THE MBI, AND THE P ROPOSED M ETHOD ( THE U NITS OF OE, CE, AND OA A RE IN P ERCENT )
roads are detected as buildings by the Shadow-GC (see row 3 dense buildings, and they are not designed for use in rural
of Fig. 4). Thus, the CE of the Shadow-GC shows a significant areas. However, the outstanding performance of the proposed
increase when compared with that of the Shadow-based. The framework is more apparent in rural areas with sparse buildings,
Shadow-based and Shadow-GC obtain the best results in R3 compared with urban areas with dense buildings.
and R2, respectively.
There are a number of nonbuilding objects that are detected
ACKNOWLEDGMENT
as buildings by the MBI in all the test data sets (see row 4
of Fig. 4). The observations are consistent with the CE results The authors would like to thank Prof. A. O. Ok for providing
listed in Table II. The bare land, roads, and vegetation (grass or the results of his approach for comparison and the anonymous
farmland) classes are detected as small patches of buildings by reviewers for the constructive comments.
the MBI, which has a significant negative effect on the accuracy
in all seven test images. However, most of these categories are R EFERENCES
classified as islet (small patches) or bridge (narrow roads) by
[1] X. Huang and L. Zhang, “An SVM ensemble approach combining
MSPA and are therefore discarded. A few wide roads are re- spectral, structural, and semantic features for the classification of high-
moved by the length–width ratio, i.e., larger than the threshold resolution remotely sensed imagery,” IEEE Trans. Geosci. Remote Sens.,
(t2). Overall, the CE of the proposed approach is reduced by vol. 51, no. 1, pp. 257–272, Jan. 2013.
[2] R. Qin and W. Fang, “A hierarchical building detection method for
12.9% on average compared with the CE of the MBI. The very high resolution remotely sensed images combined with DSM using
proposed approach is also effective in the rural area where graph cut optimization,” Photogramm. Eng. Remote Sens., vol. 80, no. 9,
buildings are not dominant, as shown in R5. pp. 873–883, 2014.
The OA of the proposed method shows an average increase [3] Q. Zhang, R. Qin, X. Huang, Y. Fang, and L. Liu, “Classification of ultra-
high resolution orthophotos combined with DSM using a dual morpho-
of 15.1%, 6.7%, and 7.4%, respectively, compared with the logical top hat profile,” Remote Sens., vol. 7, no. 12, pp. 16 422–16 440,
Shadow-based, the Shadow-GC, and the MBI. Furthermore, 2015.
the Kappa coefficient shows an average increase of 0.303, [4] I. F. Shaker, A. Abd-Elrahman, A. K. Abdel-Gawad, and M. A. Sherief,
0.135, and 0.152, respectively. The proposed approach is also “Building extraction from high resolution space images in high density
residential areas in the Great Cairo region,” Remote Sens., vol. 3, no. 4,
very fast. The average computation times of the Shadow-based pp. 781–791, 2011.
and Shadow-GC are about 300 and 480 s, respectively, for a [5] A. O. Ok, C. Senaras, and B. Yuksel, “Automated detection of arbitrarily
computer with a 2.40-GHz Intel i7 processor and 16-GB RAM. shaped buildings in complex environments from monocular VHR optical
satellite imagery,” IEEE Trans. Geosci. Remote Sens., vol. 51, no. 3,
The average computation times of the MBI and the proposed pp. 1701–1717, Mar. 2013.
method are 3.3 and 7.7 s, respectively, for a computer with a [6] A. O. Ok, “Automated detection of buildings from single VHR mul-
2.00-GHz Intel i7 processor and 8-GB RAM. Thus, despite the tispectral images using shadow information and graph cuts,” ISPRS J.
complexity of the test images, the results are encouraging and Photogramm. Remote Sens., vol. 86, pp. 21–40, 2013.
[7] A. Manno-Kovacs and A. O. Ok, “Building detection from monocular
prove the proposed method’s viability. VHR images by integrated urban area knowledge,” IEEE Geosci. Remote
Sens. Lett., vol. 12, no. 10, pp. 2140–2144, Oct. 2015.
[8] X. Huang and L. Zhang, “A multidirectional and multiscale morphologi-
IV. C ONCLUSION cal index for automatic building extraction from multispectral GeoEye-1
imagery,” Photogramm. Eng. Remote Sens., vol. 77, no. 7, pp. 721–732,
The main contribution of this letter is that morphological 2011.
component classification is used with the raw results of the [9] P. Soille and P. Vogt, “Morphological segmentation of binary patterns,”
MBI. According to the morphological classification results, Pattern Recognit. Lett., vol. 30, no. 4, pp. 456–459, Mar. 2009.
the intrinsic CEs of the MBI are removed. The extensive [10] P. Vogt, J. R. Ferrari, T. R. Lookingbill, R. H. Gardner,
K. H. Riitters, and K. Ostapowicz, “Mapping functional connectivity,”
experiments undertaken in this letter indicate that the proposed Ecological Indicators, vol. 9, no. 1, pp. 64–71, 2009.
morphological building detection framework has significant [11] M. Suarez-Rubio, T. R. Lookingbill, and A. J. Elmore, “Exurban develop-
advantages and outperforms the three state-of-the-art methods ment derived from Landsat from 1986 to 2009 surrounding the District of
chosen for comparison. The proposed approach also has a Columbia, USA,” Remote Sens. Environ., vol. 124, pp. 360–370, 2012.
[12] R. G. Congalton and K. Green, Assessing the Accuracy of Remotely
very high computational efficiency. The traditional building Sensed Data: Principles and Practices. Boca Raton, FL, USA: CRC
detection algorithms are mainly designed for urban areas with Press, 2008.