
AN APPROACH FOR AUTOMATIC BUILDING

EXTRACTION FROM HIGH RESOLUTION SATELLITE


IMAGES USING SHADOW ANALYSIS AND ACTIVE
CONTOURS MODEL

GÖLGE ANALİZİ VE AKTİF YÜKSELTİ EĞRİLERİ


MODELİ KULLANARAK YÜKSEK ÇÖZÜNÜRLÜKLÜ
UYDU GÖRÜNTÜLERİNDEN OTOMATİK BİNA
ÇIKARIMI İÇİN BİR YAKLAŞIM

SALAR GHAFFARIAN

PROF. DR. MUSTAFA TÜRKER

Supervisor

Submitted to the Institute of Sciences of Hacettepe University

in Partial Fulfillment of the Requirements

for the Degree of Master of Science

in Geomatics Engineering

May 2015
To my parents and my

dear wife
ABSTRACT

AN APPROACH FOR AUTOMATIC BUILDING


EXTRACTION FROM HIGH RESOLUTION SATELLITE
IMAGES USING SHADOW ANALYSIS AND ACTIVE
CONTOURS MODEL

Salar GHAFFARIAN

Master of Science, Department of Geomatics Engineering

Supervisor: Prof. Dr. Mustafa TÜRKER

May 2015, 92 pages

Building extraction is an important task for detecting changes in buildings, detecting destroyed buildings, updating vector maps, and reconstructing 3D building models. Satellite and airborne images are widely used for these purposes in various integrated forms. In this study, a novel approach is presented for the automatic extraction of buildings from high resolution multispectral space imagery. The main goal of the study was to develop an automatic method for building extraction by utilizing the relationship between cast shadows and buildings of various types, shapes, sizes, heights and environmental scenes. When the characteristics of shadow areas and the buildings that cast them are considered, it can be seen that the geometry of a shadow depends strongly on the geometry of the building with which it shares a border. Therefore, the method was developed with regard to the meaningful relations between buildings and their cast shadows.

In the beginning, the shadow regions of the buildings are extracted using a newly developed technique that operates on the features of shadows in the LAB color space. Next, to segment the buildings automatically, a novel sampling method called 'buffer zone generation' is carried out. This method utilizes the geometric relations between the buildings and their cast shadows as well as the illumination direction. Then, the generated buffer zone is enlarged to provide appropriate initial contours for the Gradient Vector Flow (GVF) Snake segmentation algorithm. To enlarge the sampling region and localize the initial contours in reliable positions, a pixel-based region-growing segmentation algorithm is applied to produce the vectors required by the GVF Snake algorithm. The region-growing segmentation serves as an intermediate step, a connectivity bridge between the generated buffer zone and the GVF segmentation, which automates the GVF Snake algorithm. Its results are then converted into vector form and used as initial contours for the GVF Snake segmentation algorithm to extract the buildings. This thesis makes three main contributions. First, it automatically extracts the cast shadows of the buildings, which are the essential input for the method. Second, it automatically collects samples from the rooftops of the buildings. Third, it segments the building areas in an automatic manner by means of the GVF Snake algorithm.
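The shadow-extraction step can be illustrated with a minimal sketch (in Python rather than the MATLAB used in the thesis). The lightness computation follows the standard sRGB-to-CIELAB conversion; the threshold value of 30 is an illustrative placeholder, not the value used in the study.

```python
import numpy as np

def rgb_to_lab_lightness(rgb):
    """Convert an sRGB image (floats in [0, 1], shape HxWx3) to the
    L (lightness) channel of CIELAB, in the range [0, 100]."""
    # sRGB -> linear RGB (inverse gamma companding)
    lin = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    # linear RGB -> CIE Y (D65 white point); only Y is needed for L*
    y = (0.2126729 * lin[..., 0] + 0.7151522 * lin[..., 1]
         + 0.0721750 * lin[..., 2])
    # Y -> L* via the CIELAB companding function
    eps = (6.0 / 29.0) ** 3
    f = np.where(y > eps, np.cbrt(y), y / (3 * (6.0 / 29.0) ** 2) + 4.0 / 29.0)
    return 116.0 * f - 16.0

def shadow_mask(rgb, threshold=30.0):
    """Flag pixels whose CIELAB lightness falls below a threshold as
    candidate shadow pixels (threshold is an illustrative value)."""
    return rgb_to_lab_lightness(rgb) < threshold
```

On a toy image containing one dark and one bright pixel, only the dark pixel is flagged; the thesis additionally applies morphological filling and noise removal, omitted here.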

The developed approach was tested on 50 test sites selected from urban and sub-urban
areas in Ankara, the capital of Turkey. The high resolution multispectral satellite images
used for the test sites were obtained from Google Earth. For each of the three main steps
followed throughout the approach, the accuracy values were computed using the well-
known metrics Precision, Recall, and FB-score. For each metric, the overall accuracy and
the individual accuracies were computed. The results achieved are quite satisfactory. For
the shadow extraction part, the overall FB-score result was computed to be 87.3%, while
for the final extracted buildings the overall FB-score result was computed to be 84.9%. The
results achieved in this study also demonstrated that the GVF Snake algorithm improved the overall results of the region-growing segmentation by about 10%.
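The three metrics can be computed directly from the pixel counts. The sketch below (Python, for illustration) uses the standard definitions; the weight beta=1 is an assumption here, the actual FB weighting used in the thesis is specified in Chapter 4.

```python
def precision_recall_fbeta(tp, fp, fn, beta=1.0):
    """Compute Precision, Recall and the F-beta score from pixel counts.
    Assumes tp + fp > 0 and tp + fn > 0. beta weights recall relative
    to precision; beta = 1 gives the familiar F1 score."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    fb = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
    return precision, recall, fb
```

For example, 90 true-positive, 10 false-positive and 30 false-negative pixels give a precision of 0.90 and a recall of 0.75.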

Keywords: Building Extraction, Shadow Detection, Region Growing Segmentation, GVF Snake, Google Earth Images, LAB Color Space.

ÖZET

GÖLGE ANALİZİ VE AKTİF YÜKSELTİ EĞRİLERİ


MODELİ KULLANARAK YÜKSEK ÇÖZÜNÜRLÜKLÜ
UYDU GÖRÜNTÜLERİNDEN OTOMATİK BİNA
ÇIKARIMI İÇİN BİR YAKLAŞIM

Salar Ghaffarian

Yüksek Lisans, Geomatik Mühendisliği Bölümü

Tez Danışmanı: Prof. Dr. Mustafa TÜRKER

Mayıs 2015, 92 Sayfa

Bina çıkarma işlemi bina değişimlerinin tespit edilmesi için çok önemli bir görevdir. Bunun yanısıra yıkılan binaların tespit edilmesi, vektör haritaların güncellenmesi ve binaların 3 boyutlu modellerinin oluşturulması için de kullanılır. Uydu ve hava görüntüleri, bu amaçlar için çeşitli entegre edilmiş formlarda yaygın olarak kullanılmaktadır. Bu çalışmada, yüksek çözünürlüklü çok bantlı uzay görüntülerinden, binaların otomatik çıkarılması için yeni bir yaklaşım önerilmiştir. Önerilen yaklaşımdaki temel amaç, çeşitli boyut, yükseklik, renk, şekil ve çevresel durumlara sahip binalar ile gölgeleri arasındaki ilişkiye dayanarak binaların otomatik çıkarılması için bir yöntem geliştirmektir. Eğer gölgeler ve onları oluşturan binaların geometrik özellikleri dikkate alınırsa, gölgelerin geometrisinin yüksek oranda gölgeyi oluşturan binanın geometrisine bağlı olduğu görülebilir. Bu nedenle, sunulan yöntem binalar ve onların gölgeleri arasındaki anlamlı ilişkiye dayalı olarak geliştirilmiştir.

Başlangıçta, binaların gölge alanları, LAB renk uzayında gölgelerin özellikleri üzerinde çalışan yeni bir teknik yardımıyla çıkarılmaktadır. Sonra, binaların anahatlarının bölütlenmesi için 'Buffer zone generation' adlı yeni bir örnekleme yöntemi uygulanmaktadır. Önerilen yöntem, binalar ve onların gölgeleri arasındaki geometrik ilişki ile aydınlatma yönü bilgilerini kullanmaktadır. Daha sonra oluşturulan tampon bölge, GVF Snake bölütleme algoritmasının ihtiyaç duyduğu uygun başlangıç konturlarını oluşturacak şekilde büyütülmektedir. Örnekleme alanının büyütülmesi ve başlangıç konturlarının güvenilir pozisyonlara yerleştirilmesi için piksel bazlı Alan-Büyüme bölütleme algoritması kullanılmış ve bu algoritma sayesinde GVF Snake algoritması tarafından kullanılan gerekli vektörler üretilmiştir. Alan-Büyüme bölütlemesi, oluşturulan tampon bölge ile GVF Snake algoritması arasındaki bağlantıyı kurmak ve GVF Snake algoritmasını otomatik hale getirmek için ara adım olarak kullanılmıştır. Alan-Büyüme algoritmasının sonuçları daha sonra vektör formata dönüştürülmüş ve GVF Snake algoritmasının binaları çıkarmak için kullandığı başlangıç konturlarını üretmek için kullanılmıştır. Bu tez çalışması üç ana katkı sunmaktadır. Birincisi, binaların gölgelerini, yöntemin temel gereksinimi olarak, otomatik çıkarmaktadır. İkincisi, yöntem otomatik olarak binaların çatıları üzerinden örnek toplamaktadır. Üçüncü katkı olarak da, binaları GVF Snake algoritması ile otomatik olarak çıkarmaktadır.

Geliştirilen yaklaşım, Ankara'nın yakın çevresindeki kentsel ve banliyö bölgelerde yer alan 50 test alanı üzerinde uygulanmıştır. Bu çalışmada kullanılan yüksek çözünürlüklü uydu görüntüleri Google Earth'den elde edilmiştir. Yöntem boyunca takip edilen her üç ana aşama için doğruluk değerleri, iyi bilinen ölçüm teknikleri olan Precision, Recall ve FB-score ile hesaplanmıştır. Elde edilen sonuçlar oldukça tatmin edicidir. Önerilen algoritmada gölge çıkarma bölümü için toplam kalite değerlendirme sonucu %87.3 olurken, binaların son çıkarılma durumu için toplam kalite değerlendirme sonucu %84.9 olarak hesaplanmıştır. Ayrıca bu çalışmada elde edilen sonuçlar, GVF Snake bölütleme algoritmasının Alan-Büyüme bölütleme sonuçlarını yaklaşık %10 iyileştirdiğini göstermiştir.

Anahtar kelimeler: Google Earth Görüntüleri, Gölge Tespiti, LAB Renk uzayı, Alan-
Büyüme bölütleme, GVF Snake, Bina çıkarma.

ACKNOWLEDGEMENTS

I would like to gratefully thank my supervisor Prof. Dr. Mustafa TÜRKER for his support and motivation during the development of this thesis, and I wish him all the best.

I wish to express my deepest gratitude to my brother Saman GHAFFARIAN for his great collaboration on my thesis.

I would like to thank and acknowledge Google Earth for providing the data sets used in this thesis.

I cordially would like to dedicate this thesis to my dear parents Mortezagholi GHAFFARIAN and Ferangis DARAI. Their continuous affection, encouragement, patience and belief made me who I am; I am really lucky to have a father and mother like you. I also would like to thank my father-in-law Rasoul SOHRABI, mother-in-law Iran FOURUGHI, sister-in-law Bahareh SOHRABI and her husband Amir JALİLZADEH for their encouragement and support during the preparation of this thesis.

Last but not least, I would like to express my special thanks to my wife Sharareh
SOHRABI for her love, encouragement, help and understanding. Her supportive attitude
gave me strength to overcome the difficulties I faced throughout this study, and I am really
lucky to have a wife like you. Thank you for everything.

TABLE OF CONTENTS
Page

ABSTRACT ........................................................................................................................... i

ÖZET .................................................................................................................................... iii

ACKNOWLEDGEMENTS .................................................................................................. v

TABLE OF CONTENTS ..................................................................................................... vi

LIST OF TABLES ............................................................................................................. viii

LIST OF FIGURES .............................................................................................................. ix

SYMBOLS AND ABBREVIATIONS…………………………………………………….xi

1.Introduction ........................................................................................................................ 1

1.1.Motivation .................................................................................................................... 1
1.2.The objectives ............................................................................................................. 3
1.3.Software ....................................................................................................................... 3
1.4. Thesis Outline ............................................................................................................. 3
2. The Literature Review .................................................................................................... 5

2.1. Previous Methods for Building Detection, Extraction and Reconstruction ................ 5
3. Methodology ................................................................................................................ 13

3.1. The General Structure of the Algorithm ..................................................... 13


3.2.Shadow Extraction ..................................................................................................... 17
3.3.Buffer Zone Generation ............................................................................................. 21
3.3.1.Preprocessing .......................................................................................................... 22
3.3.2.Buffer Zone Generation .......................................................................................... 23
3.3.2.1. Creating a candidate buffer zone inside the shadow region ............................... 23
3.3.2.2. Shifting the candidate buffer zone into building area ......................................... 27
3.3.2.3. Enlarging the shifted initial buffer zone ............................................................. 28
3.3.2.4. Removing the noise ............................................................................................ 29
3.4. Region-Growing segmentation ................................................................................. 32
3.5. Snake active contours segmentation ......................................................................... 35
3.5.1.Traditional Snake .................................................................................................... 35
3.5.2. Gradient Vector Flow (GVF) Snake ...................................................................... 36

4. The Experiments, Results and Discussion..................................................................... 41

4.1. Image Data Set .......................................................................................................... 41


4.2. The Assessment Strategy and Parameter Selection .................................................. 46
4.3.Results ....................................................................................................................... 48
4.4.Discussions ............................................................................................................... 62
4.5. The advantages of the proposed method .................................................................. 65
4.6. The limitations of the method ................................................................................... 66
5. Conclusions and Future works ..................................................................................... 67

5.1. Conclusions ................................................................................................ 68
5.2. Future works ............................................................................................................. 69
REFERENCES .................................................................................................................... 71

CURRICULUM VITAE ................................................................................................... 76

LIST OF TABLES

Page

Table 3.1. The parameters and the parameter values used in the study…………………...40
Table 4.1. Test images……………………………………………………………………. 43
Table 4.2. The computed Precision, Recall, FB, TP, FP, FN and TN values for the
extracted shadow regions of the test fields………………………………………………...58
Table 4.3. The computed Precision, Recall, FB, TP, FP, FN and TN values for the
segmented building areas of the test fields through the region growing method…………59
Table 4.4. The computed Precision, Recall, FB, TP, FP, FN and TN values for the
segmented building areas of the test fields through the GVF Snake method……………..60
Table 4.5. The grouping of the test sites with respect to the FB accuracy of shadow
extraction…………………………………………………………………………………..61
Table 4.6. The grouping of the test images with respect to the FB accuracy of Region-
Growing segmentation……………………………………………………………………..61
Table 4.7. The grouping of the test images with respect to the FB accuracy of GVF Snake
segmentation……………………………………………………………………………….61

LIST OF FIGURES
Page
Figure 3.1 The Proposed Building Extraction Algorithm.……………………..…….…..14

Figure 3.2 A detailed flowchart of the developed approach………………………...…..16

Figure 3.3 LAB color space figurative form ………………………………….……..….17

Figure 3.4 The original Google Earth Image…………………………………….…..….18

Figure 3.5 The contrast stretched image using the histogram equalization method…….19

Figure 3.6 The Luminance channel of LAB color space…………………………..……19

Figure 3.7 The masked out shadow areas using the default threshold value…………….20

Figure 3.8 The shadow areas before the morphological operations: the image with the
small noisy islands removed………………………………………………………….…...….20

Figure 3.9 Morphological filling………………………………...……………….………21

Figure 3.10 (a), (b), (c), (d), (e), (f), (g), (h) and (i) are the separated shadow areas shown
in Figure 3.9……………………………...………………………………...………….…..22

Figure 3.11 Separation of the shadow areas of buildings with connected shadow areas….24

Figure 3.12 Figurative demonstration of D1, D2, S1, S2, Boundary 1 and Boundary 2.....25

Figure 3.13 Sequential steps of buffer zone generation…...…...……………..………....26

Figure 3.14 Illumination and shifting direction…………………………………………..27

Figure 3.15 A problem case in the shift of the initial buffer……………………………..28

Figure 3.16 (a) Original image. (b) Extracted shadow. (c) Erosion of shadow region
under S1. (d) Second erosion result under S2. (e) Boundary 1. (f) Third erosion result
under S2. (g) Boundary 2. (h) Candidate buffer zone. (i) Shifted candidate buffer zone. (j)
Trimmed buffer zone. (k) Morphological dilation under S3. (l) The final buffer zone after
removing the noise…...………...……………………………………….…………...…...30

Figure 3.17 Buffer zone before de-noising operation………………..………………….31

Figure 3.18 Buffer zone after de-noising operation……………………………….……..31

Figure 3.19 Start of growing region………………………………………..…….………33

Figure 3.20 Growing process after a few iterations……………….……………….….…33

Figure 3.21 The sharpened image using a median filter as a previous step of Region
Growing segmentation…………………………………………………………………….33

Figure 3.22 The segmented image using the Region-Growing segmentation method…..34

Figure 3.23 The segmented image after applying a morphological filling operation …...34

Figure 3.24 Parametric representation of an enclosed curve………………………...…...35

Figure 3.25 The boundaries of the segmented image .……...…………….……...……...38

Figure 3.26 The results of Canny-edge detection algorithm (Edge-map)………..……...38

Figure 3.27 The result of the GVF Snake model ……………………………………......39

Figure 4.1 The red rectangles show the areas from which the test images (50 in total)
were selected. One of these test images is illustrated in the lower right…………..…….42

Figure 4.2 The Illustration of True Positive (TP), True Negative (TN), False Negative
(FN) and False Positive (FP)………………………………………………………...…….47

Figure 4.3 (a) The original images, (b) the results of shadow extraction, (c) the results of
region growing segmentation, and (d) the results of GVF Snake segmentation. The green
areas show the TP pixels, the blue areas show the FN pixels, and the red areas show the
FP pixels.............................................................................................................................. 57

SYMBOLS AND ABBREVIATIONS

GIS Geographical Information Systems

GVF Gradient Vector Flow

NIR Near Infra-Red

BV Brightness Value

LAB Lightness (L) and the A and B color-opponent channels (CIELAB color space)

LiDAR Light Detection and Ranging

CHAPTER 1

Introduction

1.1 Motivation

Automatic object detection and/or extraction from high resolution satellite and aerial images has been an important research topic in computer vision for many years. Useful applications of this subject include updating Geographic Information System (GIS) databases, urban planning and land use analysis. The fundamental challenges in building extraction are edge or line extraction and image segmentation, which have been the main subjects of many studies. In these studies, researchers have tried to find a desired object and separate it from the background in the presence of distractions caused by other features, such as surface markings, vegetation, shadows, and highlights.

A large variety of building detection and/or extraction algorithms have been reported in the literature. However, most of these algorithms have emphasized only one or a few characteristics of the buildings, such as elevation, size, color, geometry and shape, and to the best of the author's knowledge most of them are not quite general. On the other hand, only a few algorithms detect and extract buildings in a fully automatic manner. Further, although a number of high and very high spatial resolution satellite sensors with a spatial resolution of 1 m in sharpened RGB format exist on the market, for the time being, free and quick access to their imagery is not possible.

Thus, considering the above situation, a novel approach is proposed in this study to automatically extract buildings from high-resolution satellite images. As the data set, freely downloadable and easily accessible Google Earth images were utilized; their main advantages are their low cost and easy accessibility. Most of the proposed building detection and extraction algorithms are based on non-automatic or semi-automatic methods that require a significant amount of user contribution [48], [49]. Therefore, developing a fully automatic building detection and extraction technique is very important. For example, an automatic technique would minimize user-interference errors and reduce the time needed for data collection. In addition, this thesis proposes a general approach to detect various types of buildings with different shapes, colors, sizes, heights, and complexities under challenging environmental conditions.

The first motivation behind this work is to use LAB color information for the automatic extraction of the cast shadows of the buildings. The second motivation is to automatically seed sample data over building areas by utilizing the illumination direction and the extracted shadows, in order to initialize the region-growing segmentation and extract initial building boundaries. The third motivation is to increase the reliability of the extracted initial building boundaries using an improved Snake active contour model and therefore to extract building boundaries more reliably without the need for human intervention.
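To make the seeding idea concrete, the following is a simplified sketch of shifting an extracted shadow mask toward the sun so that the shifted region falls on the rooftop that cast the shadow. The actual method derives the buffer-zone geometry from morphological operations on the shadow region itself (Chapter 3); the fixed `shift_px` parameter here is a hypothetical simplification for illustration only.

```python
import numpy as np

def seed_buffer_from_shadow(shadow, azimuth_deg, shift_px):
    """Shift a binary shadow mask toward the sun (opposite to the
    shadow-cast direction) so the shifted region lands on the rooftop.

    azimuth_deg: sun azimuth, clockwise from north (image 'up').
    shift_px: shift distance in pixels (a hypothetical stand-in for
    the thesis's shadow-derived buffer geometry)."""
    az = np.deg2rad(azimuth_deg)
    # Shadows fall away from the sun, so seeds move toward it:
    # negative row offset = up (north), positive column offset = right (east).
    dr = int(round(-shift_px * np.cos(az)))
    dc = int(round(shift_px * np.sin(az)))
    h, w = shadow.shape
    seeds = np.zeros_like(shadow)
    # Copy the mask shifted by (dr, dc), clipping at the image borders.
    src = shadow[max(0, -dr):h - max(0, dr), max(0, -dc):w - max(0, dc)]
    seeds[max(0, dr):h - max(0, -dr), max(0, dc):w - max(0, -dc)] = src
    return seeds
```

For a sun due south (azimuth 180 degrees), the mask is shifted straight down the image, i.e. from the shadow on the north side of a building onto its roof.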

1.2 The objectives

The objectives of this thesis are as follows:


 To develop a robust and reliable algorithm for the automatic extraction of buildings
with different shapes, sizes, colors, heights and types, in simple or complex
environments.
 To develop a novel technique for the automatic generation of buffer zones around
building areas.
 To extract the cast shadows of the buildings.
 To develop an approach for the automatic initialization of the Snake active contour
model to extract building boundaries.
 To extract building areas using a region-growing segmentation method.
 To test the proposed method on Google Earth images that contain buildings with
different colors, sizes, shapes, heights, and environmental conditions.
 To assess the accuracy of the developed shadow extraction method in different
regions under different environmental conditions.
 To improve the results of the region-growing segmentation using the GVF Snake.
 To compare the results of the GVF Snake and the region-growing segmentation
methods.

1.3 Software

The proposed automatic building extraction method was implemented on a computer system with the following characteristics: 16 GB of RAM, a 3.30 GHz CPU, a 64-bit Windows 7 operating system, and MATLAB R2013a.

1.4 Thesis Outline

The remaining part of the thesis is organized into four chapters. Chapter 2 reviews previous work in the field of building detection and extraction from remotely sensed data. Chapter 3 presents the proposed methodology, entitled "An approach for automatic extraction of the building outlines from the high resolution Google Earth Images". In Chapter 4, the study area and the data sets are described, and the results of applying the proposed method to the test images are presented and critically discussed. Finally, the conclusions and recommendations for future investigations are given in Chapter 5.

CHAPTER 2

The Literature review

2.1 Previous Methods for Building Detection, Extraction and Reconstruction

Building detection has attracted many researchers in computer vision, remote sensing, surveying and mapping because of its many important applications, such as map updating, city modeling, and urban planning. Moreover, building detection has become more important with the increase in population in urban and sub-urban areas. In the mapping context, it is very important to have up-to-date maps that provide quick access to current information. Automatic detection and extraction of buildings can help automate the map updating process; as a result, it reduces user interaction and the time needed to perform updating.

An extensive number of applications and methods exist for the detection and extraction of buildings from remotely sensed images, and many research studies have evaluated and compared the methods proposed and developed in previous work. These studies present and discuss the pros and cons of the previous methods by demonstrating their advantages and limitations. The primary aim of such papers is to give investigators a detailed review of the methods so that the challenging aspects can be overcome. There are many studies in the context of building extraction, detection and reconstruction.

In the past, a number of different methods for building extraction have been presented. Mayer [1] reported on several methods developed up to 1990. Baltsavias [2] examined different aspects of knowledge that can be used for object extraction and lead to its practical use. Furthermore, a comparative analysis of the performance of the approaches developed until 2003 is reported in [3]. Another review, covering both optical images and LiDAR data, is presented in [4], which analyzes the two data sources simultaneously. A valuable review of the extraction of man-made objects, in particular buildings, from aerial images and laser scanning data up to late 2002 is given in [5].

This thesis is concerned with the automatic detection and extraction of buildings from high spatial resolution multispectral Google Earth images. Therefore, this section discusses the automatic detection and extraction of buildings from optical images. The initial studies in the monocular context used simple region growing methods along with simple models of building geometry [6], [7]. Later, edge, line and/or corner information was used as the principal element for extracting buildings [8], [9]. In these two studies, shadow information was also used as an important indicator of buildings. To extract building boundaries from single aerial images, Liow and Pavlidis [10] used shadow information and a grouping process for the boundary segments. Shufelt [11] conducted a comparative study of the methods proposed in [9], [12], and [13]. He evaluated these methods and concluded that none of them was capable of handling all the challenging tasks in building detection. Moreover, most of these systems suffer from limitations, for example detecting only buildings with specific shapes because of their dependency on low-level image features such as edges. In a different study, Lin and Nevatia [14] used shadow information as a key verification factor for building roofs in a shape-based model. Turker and San [15] demonstrated the use of the relationships between the cast shadows of buildings before and after an earthquake for detecting the buildings collapsed by the earthquake. Peng and Liu [16] presented a model-based method with a shadow-context model to extract building outlines. In this method, the shadow effects and the physical models of the buildings are the main considerations.

In the past decade, the availability of very high resolution satellite images has motivated scientists to develop new methods for building detection and extraction from such images. Pesaresi and Benediktsson [44] and Benediktsson et al. [45] tested differential morphological profiles (DMPs) for the classification of IRS 1-C and IKONOS-2 image data sets. In these studies, features were extracted from the DMPs defined by morphological transforms, and a neural network was used to classify urban regions. Their approach assumes that the morphological profile of a structure contains only one significant derivative maximum; however, that is usually not the case for structures commonly observed in complex environments. A new classification method based on supervised classification and the Hough transform for automatic building extraction from IKONOS imagery was proposed in [17]. The authors demonstrated that their algorithm relies mostly on the supervised classification results to obtain a comprehensive set of building roofs. A different approach, used by Inglada [18], was based on recognizing man-made objects in SPOT-5 images using a support vector machine (SVM) classifier to extract building areas. However, this approach was not quite successful in recognizing building areas, and only certain patches with specific sizes were labeled as buildings. Later, a study was conducted by Koc-San and Turker [19] to assess the effect of additional bands, such as the nDSM, the NDVI, and several texture measures, used together with the original bands of an IKONOS image for finding building patches with SVM classification. It was demonstrated that the additional bands increase the accuracy by about ten percent. In a recent study conducted by Huang and Zhang [20], morphological building and shadow indices were used in building extraction. Their aim was to increase the accuracy of the morphological building index (MBI) by integrating it with other methods. The improvements of this method are presented in three steps: 1) using a morphological shadow index (MSI) to detect shadows, which are used as a spatial constraint on buildings; 2) using dual threshold filtering to integrate the information of the MBI and MSI; and 3) implementing the framework in an object-based environment. Then, a geometrical index and a vegetation index are used to remove noise from narrow roads and bright vegetation areas. They stated that their method improved the original MBI significantly.

Izadi and Saeedi [21] developed a graph-based search algorithm based on the orientation of lines at their intersections, examining the relations of the lines with each other through a graph-based search to generate a set of rooftop hypotheses. Moreover, they used shadow information and corner vertices as additional data for the automatic extraction and 3D modeling of buildings with polygonal flat or flat-looking rooftops in monocular satellite images. The method proposed by Tanchotsrinon et al. [22] integrates color segmentation, texture analysis and neural classification techniques to detect buildings automatically from remotely sensed images. An approach based on an adaptive fuzzy-genetic algorithm for building detection from IKONOS-2 satellite images was proposed by Sumer and Turker [23]. In their approach, the sample data are collected with user interaction. Then, an evolutionary system, which combines a traditional classification method, Fisher's linear discriminant, with an image-based adaptive fuzzy logic component, is used. The evolutionary process is repeated until a predefined number of generations is reached. The main problem of this method is that its accuracy depends mainly on the accuracy of the collected sample data. In a recent study conducted by Ghaffarian and Ghaffarian [24], an automatic method to detect buildings from high resolution Google Earth images was proposed. They improved the FastICA algorithm with automatic initialization, naming the resulting method Purposive FastICA (PFICA), and proposed a novel masking approach to initialize it. Although they demonstrated that their method is efficient and reliable, it failed to accurately detect buildings whose rooftops have significantly different colors, due to the structure of the PFICA algorithm.

Some of the graph-based building detection methods are reviewed below. Krishnamachari and Chellappa [25] used a graph-based method that consists of three main steps. First, an edge detection method is used to extract straight lines. Second, a Markov random field (MRF) is defined on the extracted lines with a suitable neighborhood, and a probabilistic model is used to favor typical building shapes (rectangular, L-shaped). Finally, an energy function associated with the MRF is minimized, leading to the grouping of lines that delineate buildings. The main limitation of this method is that it works only on buildings with specific structures.

Kim and Muller [26] proposed an algorithm based on graph theory to detect buildings in aerial images. They used linear features as the vertices of the graph and shadow information to verify the building appearance. Several algorithms have been developed to overcome the limitations of the graph-based approaches, some of which extract various features of the buildings [27], [28], [29], [30], [31]. Recently, several efficient graph-based building detection approaches were developed. Sirmacek and Unsalan [30] utilized the scale invariant feature transform (SIFT) and graph theoretical tools to detect buildings in urban areas from satellite images. Teke et al. [46] and Ok et al. [32] proposed new approaches for the automatic detection of buildings from a single very high resolution optical satellite image using shadow evidence integrated with fuzzy logic and the GrabCut partitioning algorithm. Thereafter, Ok [33] increased the accuracy of the previous work by using a new method to detect the shadow areas that are the evidence of buildings. Although the proposed method benefits from eliminating the user input of the GrabCut algorithm, it has one main disadvantage: if the building sizes within an image differ considerably from each other, the method produces more false negatives (FN). This is because a constant landscape value is determined for each single image.

There also exist several approaches for the detection and extraction of buildings from remotely sensed images with active contours, in particular with Snake active contour models. The use of Snake active contour models to extract objects from medical images, as well as to track objects in image sequences, has also become an active research area in computer vision. In this thesis, the main approach developed for the automatic extraction of buildings from high resolution satellite imagery is based on the GVF Snake model. A brief history of the traditional Snake segmentation method and the GVF Snake method used in this thesis is given below.

The traditional Snake (active contour) was proposed by Kass et al. [34], who defined it as an energy-minimizing spline guided by external constraint forces and image forces that pull it toward features such as lines and edges. The traditional Snake model has several disadvantages; the two main weaknesses are: i) its inability to progress into boundary concavities, and ii) its strong dependency on the location of the initial contours, which must be placed very close to the boundaries to catch them. In [35], Xu and Prince improved the Snake active contour model by creating a novel external energy function that overcomes the disadvantages of the traditional Snake model. By replacing the external energy used in the traditional Snake method with this new energy, the efficiency of the method was improved. The new energy function overcomes the weaknesses of the traditional Snake model because of its vector form, which leads to a new vector edge map that is less sensitive to the location of the initial Snake contours. Therefore, the model increases the attraction of the initial contour to the nearby boundaries and also lets the contours catch the concave parts of the objects.

The Snake and its improved variants have been used to extract buildings from satellite images. Theng and Chiong proposed a new approach for the automatic extraction of buildings using an improved Snake model [36]. Their approach has three energy terms: continuity, curvature, and image. The image term attracts the Snake to the image points with the minimum gradient magnitude. The continuity term creates equally spaced Snake control points and thereby prevents the contour points from clustering with little distance among them. The curvature term expresses the curvature of the Snake contour. Since the initial curve points have to be located in a meaningful location, this method uses a corner detection method to find the corners of the buildings and initialize the points of the first curve to be processed. This approach depends on many factors to work, and determining all the parameters so that such methods run in an automatic manner can be very time consuming. Moreover, the radial casting can be a problem for buildings that are larger or smaller than the size of the initial curve. The
method proposed by Peng et al. [37] has two main improvements: one is the selection criteria for the initial seeding points, while the other is the external energy function. They used a mathematical solution to overcome the disadvantages of the traditional Snake. Their method emphasizes building regions with simple shapes, and they did not assess it on buildings with complex shapes. Cao and Yang proposed an approach [38] to detect buildings automatically from aerial images through a multi-stage level set algorithm. They modified the level set method developed by Chan and Vese [39] for detecting buildings. The main contribution to the level set algorithm is the determination of a new factor for the traditional active contour models. This factor does not depend on the gradient of the image but is instead related to a particular segmentation of the image. In addition, the initial contour points are seeded through circles that cover the image. However, their method is capable of extracting only the regions of man-made objects and not the single building patches. Moreover, the method provides incorrect results in the form of false positives (FP) due to the initialization process. The approach presented by Karantzalos and Paragios [40] for the automatic extraction of buildings is based on knowledge of the shapes of the buildings. They incorporated the shape data in the level set active contour model and let the contours deform to extract buildings. The accuracies of the results were reported to be over 80%. However, their method has a high dependency on the shapes of the buildings in the image under investigation, and this is one of the limitations of the algorithm. The approach developed by Ahmadi et al. [41] is based on the level set method previously developed by Chan and Vese [39]. In their work, the initialization process starts without any information about the probable locations of buildings, which could be gained from the building shadow regions. They produced their initial contours as circles of a fixed size distributed regularly over the whole image. Since these contours are not placed according to any prior information, they create some disadvantages, such as segmenting non-building areas as buildings. Although this method has a sound logic for the automatic detection of building areas, it may not be reliable for all environments. In particular, this method may have serious problems if the background varies highly throughout the image.

Recently, the detection and extraction of buildings using their cast shadow regions have become an active topic of investigation. For example, Ok [32], [33] used shadow regions to create a landscape over the building areas on which a graph cut is performed to detect building regions. This method produces a landscape of constant size for collecting data from the building areas, which brings the disadvantage of collecting data from the wrong place and makes the method less robust for automatic building extraction. In an image with buildings of various sizes, this method finds some false areas instead of building areas and also may not extract the larger buildings. To overcome this problem, in this study the shapes of the buildings are considered and building-based buffer zones are generated. Therefore, the shapes and sizes of the buffer zones depend on the shapes and sizes of the buildings. As a result, the samples that represent buildings are collected more reliably and the accuracy of building extraction is increased.

As an important achievement of this study, the proposed algorithm is fully automatic and extracts building areas with high accuracy values, which will be presented in the results section of the thesis. Several automatic methods of building extraction exist in the literature, such as those developed in [23], [41], and [42]. The fact that most of these methods use a combination of various types of data makes data acquisition difficult and expensive. For example, the method developed by Awrangjeb [42] for automatic residential building detection used LIDAR data and multispectral images in an integrated manner. As is well known, LIDAR data are quite expensive and not always available, as in the case of the present study. Therefore, one of the challenges of this study was to develop a method that can be implemented at low cost. For this reason, the proposed method uses highly accessible and free Google Earth images.

CHAPTER 3

The Methodology

3.1 THE GENERAL STRUCTURE OF THE ALGORITHM

The proposed algorithm introduces two main novelties. The first novelty is the building shadow extraction, while the second one is the automatic sample collection from within the buffer zones generated on building areas adjacent to the cast shadows of the buildings. A region growing algorithm is used to segment the image, and a refinement operation is carried out using the GVF Snake algorithm in order to achieve a more accurate segmentation. The general steps followed in the proposed approach for the automatic extraction of buildings from high resolution space images are given in Figure 3.1.

Input Image → Building Shadow Extraction → Buffer Zone Generation → Region Growing Segmentation → Segmentation Using the GVF Snake Algorithm → Extracted Buildings

Figure 3.1 The Proposed Building Extraction Algorithm

A more detailed flowchart of the proposed approach, which includes the intermediate
steps, is given in Figure 3.2. The shadow extraction, which is the first stage of the
approach, consists of five steps: (i) contrast stretching of the image through the histogram
equalization method, (ii) transforming the input image from RGB (Red, Green, Blue) color
space to LAB (Lightness (L) and two color channels (a and b)) color space, (iii) masking
out the shadow areas in the lightness channel through applying a threshold, (iv) removing
the shadow islands smaller than a user defined size (150 connected pixels in the present
case), and (v) performing a morphological filling operation to improve the accuracy of the
extracted building shadow regions by means of filling the holes.

The approach starts with the extraction of the cast shadow regions of the buildings. After extracting the shadow regions, a buffer zone is generated for each of the detected shadow areas. As illustrated in Figure 3.12, morphological erosion operations are carried out sequentially to create Boundary1 and Boundary2. Then, Boundary2 is shifted over Boundary1, and the intersection region of the shifted boundary and Boundary1 is generated. Next, a noise removal operation is carried out to eliminate the noise. This is followed by trimming the intersected shadow regions on both ends using a trimming operation; the amount of trimming does not have to be the same on both ends. After that, the regions from which the samples for seeding the initial segmentation points are to be automatically collected are enlarged by means of morphological operations. The final part of this stage is a de-noising process based on the mean and standard deviation values of the three bands (Red, Green, and Blue) of the original image. This procedure is carried out to achieve purer data for sampling.

The last stage of the approach is the segmentation of the image, which is carried out in two steps. In the first step, a median filter is passed over the original image to remove noise inside the buildings while preserving the edges. Next, the region-growing segmentation is carried out over the median filtered image to segment the buildings. The results of the region-growing stage alone may not be satisfactory. Therefore, to overcome the problems of the region-growing segmentation, the GVF Snake is employed. The GVF Snake algorithm uses the boundaries extracted through the region growing segmentation as the initial contours and benefits from the results of the Canny edge detection performed on the original image. The GVF Snake deforms until it catches the boundaries of the buildings and extracts the building boundaries.

The detailed descriptions of the main steps followed in the proposed approach are given
below.

Figure 3.2 A detailed flowchart of the developed approach

3.2 SHADOW EXTRACTION

In this study, a novel method is proposed for the extraction of shadow areas from a single RGB image. The detection of the shadow regions is important for building detection in the sense that where there is a shadow, there exists a building adjacent to it. Therefore, with the knowledge of the sun illumination direction and the extracted cast shadows of the buildings, we can obtain information about the heights and shapes of the buildings. More importantly, the detected cast shadows of buildings can help improve building detection from high resolution images.

The developed method is based on the LAB color space, which consists of one channel for luminance (lightness, L) and two color channels (A and B) (Figure 3.3). The luminance or lightness value L is in the range 0-100, representing the darkest black at L = 0 and the brightest white at L = 100. The red/green opponent colors are represented along the A axis, with green at negative A values and red at positive A values. The yellow/blue opponent colors are represented along the B axis, with blue at negative B values and yellow at positive B values. The A and B values run in the range of ±100, or −128 to +127.

Figure 3.3 LAB color space figurative form

In this study, the reason for choosing the LAB color space is that the detection of the shadows by means of the luminance channel, which represents the brightness of the objects in the image, is easier and more accurate in this color space. The method generates the luminance channel by converting the image from the RGB color space to the LAB color space [43]. To improve the capability of the algorithm, the method is applied to a contrast stretched image obtained with histogram equalization, since the contrast stretched image provides better detection of the shadow areas. This is due to the fact that the contrast stretching operation makes the dark areas darker and therefore easier to extract from the image. Since the shadow regions are darker and less illuminated than their surroundings, they can be extracted by applying a simple thresholding technique to the image. Figure 3.4 is an original RGB Google Earth image of a selected area. Figures 3.5 and 3.6 illustrate, respectively, the contrast stretched image and the luminance channel of the LAB color space of this image. To find the threshold value for segmenting the luminance channel, more than 100 experiments were carried out to determine the most suitable range of threshold values for segmenting the shadow areas in the image. The values found throughout this experimental work are given in Table 3.1.

The above described pixel-based thresholding operation separates the shadow regions from the other objects without considering any other characteristics. The thresholding operation is carried out using the predefined threshold value: the values under the threshold, which represent the shadow areas, are kept, and the values above the threshold are eliminated from further processing. By this way, the shadow regions are masked out.

Figure 3.4 The original Google Earth Image.

Figure 3.5 The contrast stretched image using the histogram equalization method

L ≤ (Mean(L) − StdDev(L)) / 3    (1)

Figure 3.6 The Luminance channel of LAB color space

Figure 3.7 The masked out shadow areas using the default threshold value

Figure 3.8 The shadow areas after removing the small noisy islands, before the morphological operations

Figure 3.9 Morphological filling

Figure 3.7 illustrates the shadow areas segmented on the Luminance channel using the
default threshold value. Figures 3.8 and 3.9 show the segmented shadow areas before and
after applying the morphological operations.
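Steps (iv) and (v) of the shadow extraction, dropping islands below the 150-pixel size limit and filling the holes, can be sketched with SciPy's morphology tools. This is a hypothetical helper, not the thesis implementation:

```python
import numpy as np
from scipy import ndimage

def clean_shadow_mask(mask, min_size=150):
    """Remove connected shadow islands smaller than `min_size` pixels
    (step iv), then fill the internal holes (step v)."""
    labels, n = ndimage.label(mask)
    sizes = ndimage.sum(mask, labels, index=np.arange(1, n + 1))
    keep = np.zeros(n + 1, dtype=bool)
    keep[1:] = sizes >= min_size
    cleaned = keep[labels]          # map each pixel's label to keep/drop
    return ndimage.binary_fill_holes(cleaned)
```

A large shadow region with an internal hole is kept and filled, while a small isolated island is discarded.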

3.3 BUFFER ZONE GENERATION

In order to extract buildings using a region-based segmentation, samples need to be collected on the building areas to initiate the segmentation operation. In this study, an innovative method is proposed for the automatic collection of samples from building areas based on their cast shadows and the knowledge of the sun illumination direction. In order to develop a generic algorithm with no restrictions for finding the building locations and constructing proper buffer zones on the buildings, several points must be considered:

1- The shapes of the buildings are related to the shapes of the shadows cast at their bases.
2- The building heights are directly proportional to the lengths of the shadows.
3- The shadows form on the sides of the buildings opposite to the illumination direction.

3.3.1 Preprocessing

To increase the capability of the proposed algorithm, to check the efficiency of the method for each building, and to achieve more accurate results in cases where the building shadows are very close to each other, the algorithm requires a preprocessing step that separates each shadow area into a unique data matrix. Furthermore, by separating the shadow regions into single matrices, the method allows a building with a complex shape, whose shadow is composed of more than one part, to be detected twice or more in separate regions, and processing each part independently improves the accuracy of the method. Figure 3.10 illustrates the separated shadow areas of the shadow mask shown in Figure 3.9.

Figure 3.10 (a)-(i) The separated shadow areas shown in Figure 3.9
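The separation of the extracted shadow mask into one matrix per region is essentially connected-component labeling; a minimal sketch (the function name is ours):

```python
import numpy as np
from scipy import ndimage

def separate_shadow_regions(mask):
    """Return one full-size binary matrix per connected shadow region,
    so that each region can be processed independently."""
    labels, n = ndimage.label(mask)
    return [labels == k for k in range(1, n + 1)]
```

Each returned matrix has the same shape as the input mask and contains exactly one shadow region.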

3.3.2 Buffer Zone Generation

A building can be defined by the extent of its cast shadow region in the direction opposite to the illumination. In the past, several methods have been developed for the detection of buildings based on their cast shadows. However, most of these algorithms have limitations, such as restrictions on detecting buildings with arbitrary shapes, or restrictions on the sizes, colors or heights of the buildings. Of the past techniques, only a few present a generic algorithm to detect buildings of various shapes, sizes, and heights. To overcome some of these problems and generalize the building extraction process, a novel method is proposed in this thesis. The proposed method does not depend on the shapes, colors, heights, densities and, to some extent, the sizes of the buildings. It is not claimed that the proposed method is the most general method for building detection, but the belief is that it can open new perspectives toward producing more effective and generic building detection methods. The proposed buffer zone generation procedure consists of four main parts:

1- Creating a candidate buffer zone inside the shadow region
2- Shifting the candidate buffer zone into the building area
3- Enlarging the shifted buffer zone to make it more efficient to use
4- De-noising the created buffer zone

As described above, the separated and labeled data sets of the shadow regions are available as the result of preprocessing. Therefore, the buffer zone generation operation is carried out for each region separately. The buffer zones generated on the building areas will be used for the automatic initiation of the segmentation operation, eliminating the user interaction. A synthetic visualization is shown in Figure 3.13 to illustrate the steps of the proposed method, and Figure 3.16 demonstrates these steps on a real satellite image.

3.3.2.1 Creating a candidate buffer zone inside the shadow region

The aim of this step is to generate an approximate candidate buffer zone inside the shadow region with respect to the cast shadow of the building. The shadows contain useful information about the adjacent buildings that cast them, such as their geometry, shape and height. Here, the algorithm benefits from the shape and geometry of the buildings, which makes it more general with respect to the shapes of the buildings.

The processing consists of several morphological operations. First, the cast shadows are eroded using a structuring element S1 with a size ranging from 7 to 9. This size range was determined empirically in order to deal with extracted shadow regions that contain tree shadow noise around the real shadow boundaries. This operation thus removes part of the tree shadows merged with the building shadow area. Furthermore, by means of these morphological operations, the joint cast shadows of closely located buildings are separated. Figure 3.11 presents the cast shadows of two closely located buildings before and after the morphological erosion operation.

Figure 3.11 Separation of the connected shadow areas of two buildings (before and after erosion; the joint area between the two shadows is removed)

After performing many experiments on the selected test images, it was found that, although increasing the amount of erosion increases the accuracy of the final results of the method, the amount of erosion is restricted by the length of the cast shadow along the illumination direction. Due to this restriction, if the size of the structuring element S1 is chosen too large, some of the cast shadows are deleted entirely and no shadow region remains from which to produce a buffer zone. The eroded shadow region obtained as the result of this operation (E1) is shown in Figure 3.13b.
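The first erosion with S1 is a plain binary erosion with a square structuring element. The sketch below, an illustration rather than the thesis code, shows how a 7x7 element both shrinks each shadow and breaks a thin joint area connecting two cast shadows:

```python
import numpy as np
from scipy import ndimage

def erode_shadow(mask, s1=7):
    """First erosion with structuring element S1 (size 7-9): trims tree
    shadow noise at the borders and separates joint cast shadows."""
    return ndimage.binary_erosion(mask, structure=np.ones((s1, s1)))
```

Two 11x11 shadows linked by a 2-pixel-thick bridge become two separate 5x5 regions after the erosion.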

After the first erosion step, a double erosion stage is carried out with the aim of producing two adjacent boundaries with the same width W2. The size of the structuring element S2 used in this part is 5 pixels. By using this structuring element and performing the erosion operation twice, the regions E2 and E3, shown in Figures 3.13b and c, are generated. Next, Boundary1 and Boundary2 (Figure 3.12) are generated using equations (4) and (5); Figure 3.13d demonstrates the result of this operation. In addition to Figure 3.12, which shows Boundary1 and Boundary2, the formula for calculating the width of each boundary is given in equation (6).

Figure 3.12 Figurative demonstration of D1, D2, S1, S2, Boundary1 and Boundary2 (showing the extracted shadow region, the building area, the illumination direction, and a sample 5x5 square structuring element)

D1 = (S1 − 1) / 2    (2)

D2 = D3 = (S2 − 1) / 2    (3)

Boundary1 = E1 − E2    (4)

Boundary2 = E2 − E3    (5)

Wn = (n − 1) / 2    (6)

Shifting Boundary2 over Boundary1 is the final operation in creating the candidate buffer zone within the shadow region before shifting it toward the building area. The direction of the shift is opposite to the illumination direction, and the amount of the shift is 2 pixels, which is equal to the width of Boundary1 and Boundary2. The shift amount is calculated using equation (7), where n is the size of the structuring element that eroded regions E2 and E3, and V is the shift vector. The shift amount should be the same as the width of these two boundaries, which is why the structuring element sizes for both Boundary1 and Boundary2 are chosen to be the same. By selecting the region of intersection between the shifted Boundary2 and Boundary1, the algorithm detects the initial buffer zone inside the shadow region. The results of this operation are shown in Figures 3.13e and f.

|V| = (n − 1) / 2    (7)
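The shift-and-intersect step can be sketched as a pure NumPy translation of the Boundary2 mask followed by a logical AND with Boundary1; the helper names and the toy masks below are ours:

```python
import numpy as np

def shift_mask(mask, dy, dx):
    """Translate a binary mask by (dy, dx) pixels; pixels shifted in
    from outside the frame become False."""
    h, w = mask.shape
    out = np.zeros_like(mask)
    out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
        mask[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    return out

def candidate_buffer_zone(boundary1, boundary2, dy, dx):
    """Shift Boundary2 by the vector of equation (7), opposite to the
    illumination direction, and intersect it with Boundary1."""
    return boundary1 & shift_mask(boundary2, dy, dx)
```

With two parallel two-pixel-wide strips, shifting the outer strip by its own width makes it coincide with the inner one, so the intersection recovers the full inner strip.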

Figure 3.13 (a)-(j) The sequential steps of buffer zone generation

3.3.2.2 Shifting the candidate buffer zone into the building area

After constructing the initial buffer zone inside the building's shadow area, it is shifted toward the building, which is located in the direction opposite to the illumination. To do this, two main factors are considered: the direction and the magnitude of the displacement vector. In Figure 3.14, the direction of the displacement vector is −45 degrees from the x axis, assuming that the illumination direction is 45 degrees from the x axis. The magnitude of the displacement vector gives the amount of displacement. In the proposed algorithm, the amount of shift (P) is calculated using equation (8).

Figure 3.14 Illumination and shifting direction

P = D2 + D1 + Є + D4    (8)

Where P is the amount of shift, D2 is the width of Boundary1, D1 is the amount of area decrease in the first erosion, Є is an empirical amount used to put a confidence distance between the building side and the shifted buffer zone, and D4 is the distance added in order to allow a dilation step after the trimming stage. The value used for the parameter Є is given in Table 3.1. This value was determined in order to prevent the case shown in Figure 3.15: a large value of Є may lead to faults in the automatic collection of the sample data. In other words, this case may occur if the initial buffer zone is shifted too far.
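A small helper can turn the shift magnitude P of equation (8) and a shift direction into an integer pixel offset. The angle convention below (degrees from the x axis, with image rows growing downward) is our assumption, chosen to match the figurative setup of Figure 3.14:

```python
import math

def shift_vector(p, shift_deg):
    """Integer (dy, dx) offset of magnitude P along `shift_deg` degrees
    from the x axis; dy is negated because image rows grow downward."""
    theta = math.radians(shift_deg)
    dx = round(p * math.cos(theta))
    dy = round(-p * math.sin(theta))
    return dy, dx
```

For the −45 degree shift direction of Figure 3.14, a magnitude of 10 pixels yields a down-and-right offset in image coordinates.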

Figure 3.15 A problem case in the shift of the initial buffer zone (legend: shadow region, initial buffer zone before the shift, illumination direction, generated buffer zone, unwanted region in the buffer zone area, building area)

3.3.2.3 Enlarging the shifted initial buffer zone

Before performing the enlargement operation, the shifted initial buffer zone is trimmed based on its shape center. In the present case, the size of the trimming was selected in the range of 20-30 pixels based on the experimental tests (Table 3.1). With the trimming operation, the 20-30 pixels whose Euclidean distances from the shape center are the longest are deleted and excluded from further processing. The whole operation is carried out using equations (9)-(12), and this step is shown in Figure 3.13h.

Shape center c = (X̄, Ȳ)    (9)

X̄ = (Σ_{i=1}^{n} X_i) / n    (10)

Ȳ = (Σ_{i=1}^{n} Y_i) / n    (11)

d_i = √((X_i − X̄)² + (Y_i − Ȳ)²)    (12)

Where d_i is the distance of pixel i from the shape center c, (X̄, Ȳ) are the coordinates of the shape center of the initial buffer zone, and n is the total number of pixels within the buffer zone. The trimming operation removes the isolated noisy pixels at both ends of the initial buffer zone and provides space for the subsequent dilation operation that enlarges it. Therefore, the size of the trimming and the size of the dilation have a direct relationship.
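Equations (9)-(12) amount to computing the centroid of the buffer zone pixels and discarding the farthest ones; a minimal sketch (the helper name and the fixed trim count are ours):

```python
import numpy as np

def trim_buffer(mask, n_trim=25):
    """Remove the `n_trim` buffer-zone pixels with the largest Euclidean
    distances d_i from the shape center (equations (9)-(12))."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()            # shape center, equations (10)-(11)
    d = np.hypot(ys - cy, xs - cx)           # distances, equation (12)
    drop = np.argsort(d)[::-1][:n_trim]      # farthest pixels first
    out = mask.copy()
    out[ys[drop], xs[drop]] = False
    return out
```

On an elongated zone, the pixels removed are exactly those at both ends, leaving the middle of the zone intact.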

After the trimming operation, a morphological dilation operation is carried out in order to enlarge the area of the initial buffer zone, so that the area from which the sample data are collected on the building becomes large enough. The enlargement operation thereby improves the reliability of the automatic sample collection used to start the segmentation. The dilation operation is carried out using the structuring element S3 with a size of 9 pixels. The trimming stage and the dilation operation are shown in Figures 3.13h and 3.13i. In equation (13), D4 is the amount by which each side is enlarged by the dilation operation; for example, D4 is computed as 4 if the size of the structuring element is 9.

D4 = (S3 − 1) / 2    (13)
3.3.2.4 Removing the noise

The noise removal operation discards the noisy pixels in the image as well as the isolated small shadow islands on the extracted shadow area. The isolated shadows and the other noisy pixels in the image affect the result of the region-growing segmentation. Therefore, to reduce the effect of noise, a de-noising operation is carried out within the buffer zone area using the conditional equations (14)-(16). A pixel is retained only if it satisfies all three of these conditions; otherwise, it is labeled as noise and removed. Figures 3.16k and 3.16l show the results of this step on a real image. Similarly, Figures 3.17 and 3.18 illustrate the buffer zones before and after the noise removal operation.

𝑀𝑅 − 𝜎𝑅 ≤ 𝑅𝑖 ≤ 𝑀𝑅 + 𝜎𝑅 (14)

𝑀𝐺 − 𝜎𝐺 ≤ 𝐺𝑖 ≤ 𝑀𝐺 + 𝜎𝐺 (15)

𝑀𝐵 − 𝜎𝐵 ≤ 𝐵𝑖 ≤ 𝑀𝐵 + 𝜎𝐵 (16)

Where M_R, M_G, M_B are the mean values and σ_R, σ_G, σ_B are the standard deviations of the pixels inside the buffer zone for the red, green, and blue bands, respectively, and R_i, G_i, B_i denote the values of the i'th pixel within the buffer zone in the red, green and blue bands.
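The band-wise test of conditions (14)-(16) can be sketched as follows; pixels outside one standard deviation of any band mean are dropped from the buffer zone (the helper name is ours):

```python
import numpy as np

def denoise_buffer(rgb, buffer_mask):
    """Keep only buffer-zone pixels whose R, G and B values all lie within
    one standard deviation of the zone's band means, per conditions
    (14)-(16); pixels failing any band test are treated as noise."""
    out = buffer_mask.copy()
    for band in range(3):
        vals = rgb[..., band][buffer_mask]
        m, s = vals.mean(), vals.std()
        out &= (rgb[..., band] >= m - s) & (rgb[..., band] <= m + s)
    return out
```

A single bright outlier inside an otherwise homogeneous zone is removed, while the remaining pixels are kept.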


Figure 3.16 (a) Original image. (b) Extracted shadow. (c) Erosion of the shadow region using the structuring element S1. (d) Second erosion using the structuring element S2. (e) Boundary 1. (f) Third erosion using the structuring element S2. (g) Boundary 2. (h) Candidate buffer zone. (i) Shifted candidate buffer zone. (j) Trimmed buffer zone. (k) Morphological dilation using the structuring element S3. (l) The final buffer zone after removing the noise.
Figure 3.17 Buffer zone before de-noising operation

Figure 3.18 Buffer zone after de-noising operation

3.4. Region-Growing segmentation

To obtain the segmented building areas, the proposed method uses a maximum-minimum region-growing method. The region-growing segmentation method requires small regions (seeds) from which to grow and segment the objects. The statistical values (minimum and maximum) of the initial mass of collected seeds are calculated, and the new pixels in the neighborhood of the initial mass are compared against them. A neighboring pixel becomes a member of the central mass if and only if it satisfies equations (17), (18), and (19) at the same time. This operation continues until no neighboring pixel satisfying the given conditions remains; in other words, new pixels continue to be added to the mass until no pixel meeting the necessary conditions is left. Figures 3.19 and 3.20 demonstrate the region growing segmentation.

𝑀𝑖𝑛 𝑅𝑖𝑛𝑖𝑡𝑖𝑎𝑙 ≤ 𝐾𝑖 𝑅 ≤ 𝑀𝑎𝑥 𝑅𝑖𝑛𝑖𝑡𝑖𝑎𝑙 (17)

𝑀𝑖𝑛 𝐺𝑖𝑛𝑖𝑡𝑖𝑎𝑙 ≤ 𝐾𝑖 𝐺 ≤ 𝑀𝑎𝑥 𝐺𝑖𝑛𝑖𝑡𝑖𝑎𝑙 (18)

𝑀𝑖𝑛 𝐵𝑖𝑛𝑖𝑡𝑖𝑎𝑙 ≤ 𝐾𝑖 𝐵 ≤ 𝑀𝑎𝑥 𝐵𝑖𝑛𝑖𝑡𝑖𝑎𝑙 (19)
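A minimal 4-connected implementation of this min-max growth is sketched below; the seed mask stands in for the automatically collected buffer-zone samples (an illustration, not the thesis code):

```python
import numpy as np
from collections import deque

def region_grow(rgb, seed_mask):
    """Grow from the seed pixels; a 4-connected neighbour joins the region
    only while its R, G and B values all stay within the [min, max] range
    of the initial seeds (conditions (17)-(19))."""
    lo = rgb[seed_mask].min(axis=0)          # per-band minima of the seeds
    hi = rgb[seed_mask].max(axis=0)          # per-band maxima of the seeds
    region = seed_mask.copy()
    frontier = deque(zip(*np.nonzero(seed_mask)))
    h, w = seed_mask.shape
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not region[ny, nx]
                    and (rgb[ny, nx] >= lo).all() and (rgb[ny, nx] <= hi).all()):
                region[ny, nx] = True
                frontier.append((ny, nx))
    return region
```

Starting from a small seed patch on a homogeneous rooftop, the region grows to cover the whole rooftop and stops at the brighter background.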

Obviously, one of the pivotal points in applying this technique is placing the seeds in
the image reliably. At least one seed region must lie inside the building area for the
growing action to start. To make the segmentation process automatic, the seeds must be
inserted automatically; the automatic collection of the seeds is described above in the
buffer zone generation section. A decision on whether to grow toward a new adjacent pixel
is made by comparing it against the maximum and minimum values of the initial zone's
pixels. The method is therefore sensitive to noise and may otherwise produce unacceptable
results. To improve the results of the proposed segmentation algorithm, a median filter
is applied to the original RGB image. The median filter suppresses local noise while
preserving the boundaries of the buildings, and decreases the rate of leakage beyond the
building areas. Figure 3.21 shows the result of the 11x11 median filter as applied on an
experimental test image. The result of the region-growing segmentation is shown in
figure 3.22.
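The maximum-minimum region growing described above can be sketched as follows. This is a simplified illustration; the function name and array layout are our assumptions, and the per-band acceptance bounds correspond to equations (17)-(19).

```python
import numpy as np
from collections import deque

def region_grow(rgb, seeds):
    """Max-min region growing over an RGB image.

    `seeds` is a boolean mask of the initial pixel mass.  The per-band
    minimum and maximum of the seed pixels define the acceptance range;
    8-connected neighbours are added while they satisfy all three band
    conditions (Eqs. 17-19) simultaneously.
    """
    lo = np.array([rgb[..., b][seeds].min() for b in range(3)])
    hi = np.array([rgb[..., b][seeds].max() for b in range(3)])
    grown = seeds.copy()
    queue = deque(map(tuple, np.argwhere(seeds)))
    h, w = seeds.shape
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and not grown[rr, cc]:
                    px = rgb[rr, cc]
                    # all three band conditions must hold at once
                    if np.all(px >= lo) and np.all(px <= hi):
                        grown[rr, cc] = True
                        queue.append((rr, cc))
    return grown
```

Growth stops automatically once no 8-connected neighbour of the mass falls inside the per-band range.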

Figure 3.19 Start of the growing region    Figure 3.20 Growing process after a few
iterations
(Legend: seed point, grown pixels, direction of growth, pixels being considered.)

Figure 3.21 The median-filtered image used as a preprocessing step for the Region-
Growing segmentation

Figure 3.22 The segmented image using the Region-Growing segmentation method

After segmenting the image using the Region-Growing segmentation method, small gaps
may appear in the building areas. Therefore, to obtain a more accurate segmentation
result, a typical morphological hole-filling operation is applied to fill the gaps in the
binary segmented image. The segmented image after applying the morphological filling
operation is shown in figure 3.23.

Figure 3.23 The segmented image after applying a morphological filling operation
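A morphological hole-filling step of this kind can be sketched without any image-processing library by flood-filling the background from the image border; any background pixel not reached is a hole and is relabelled as foreground. This is an illustrative equivalent of the operation, not the thesis implementation, and the function name is ours.

```python
import numpy as np
from collections import deque

def fill_holes(binary):
    """Fill holes: background pixels not connected to the image border
    (4-connectivity) are relabelled as foreground."""
    h, w = binary.shape
    outside = np.zeros_like(binary, dtype=bool)
    queue = deque()
    # seed the flood fill with every border background pixel
    for r in range(h):
        for c in range(w):
            if (r in (0, h - 1) or c in (0, w - 1)) and not binary[r, c]:
                outside[r, c] = True
                queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= rr < h and 0 <= cc < w
                    and not binary[rr, cc] and not outside[rr, cc]):
                outside[rr, cc] = True
                queue.append((rr, cc))
    # everything that is not reachable background becomes foreground
    return binary | ~outside
```

The same result is what a standard morphological hole-filling routine would produce on a binary segmentation.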

3.5. Snake active contours segmentation

The region growing segmentation operation may not extract all parts of the buildings.
This may be due to the heterogeneity of the pixel values of the buildings as well as the
spectral similarity between the buildings and their surroundings. Therefore, the result
of the region growing segmentation operation requires improvement. In this study, the
improvement is carried out using a well-known active contour model, the gradient vector
flow (GVF) Snake.

Active contour models, also called Snakes, are popular models which are widely used in
image applications such as object tracking, shape recognition, segmentation, and edge
detection. In the proposed algorithm, a developed Snake model named GVF Snake, proposed
by Xu [35], is used.

3.5.1. Traditional Snake


A Snake is a controlled continuity spline that moves and localizes onto a specific contour
under the influence of an objective function. The contour is represented in parametric
form. Figure 3.24 is a parametric representation of the curve v(s) = (x(s), y(s)), in
which 0 ≤ s ≤ 1. In this representation, s is a parameter related to the arc length.

Figure 3.24 Parametric representation of an enclosed curve

With the contour specified as v(s), the model is formulated as the sum of energy terms in
the continuous spatial domain. The energy terms can be categorized as follows:
1) Internal Energy: the internal energy is a function of the contour v(s) itself and
controls the tension and smoothness of the curve. Therefore, the internal energy is
related only to the characteristics of the curve.
2) External Energy: the external energy describes how well the curve matches the image
data locally. Numerous forms can be used, attracting the curve toward different
image features.
The Snake model tries to minimize the total energy Etotal stated in equation (20), where
Ein is the internal energy and Eex is the external energy of the curve. Equations (21)
and (22) are the general formulas of the internal and external energies, respectively.
Equation (23) gives the internal energy term, and equations (24) and (25) give two forms
of the external energy.

Etotal = Ein + Eex  (20)

Ein = ∫₀¹ Ein(v(s)) ds  (21)

Eex = ∫₀¹ Eex(v(s)) ds  (22)

Ein(v(s)) = α(s) |dv/ds|² + β(s) |d²v/ds²|²  (23)
(the first term controls elasticity, the second stiffness)

𝐸𝑒𝑥 = 𝛾 − |∇𝐼(𝑥, 𝑦)|²  (24)

𝐸𝑒𝑥 = 𝛾 − |𝛻(𝐺𝜎 (𝑥, 𝑦) ∗ 𝐼(𝑥, 𝑦))|²  (25)

where α and β are the weighting parameters that control the Snake's tension and rigidity,
respectively, and 𝛾 is the weighting parameter in the external energy function that
controls its viscosity. Furthermore, 𝐼(𝑥, 𝑦) is the gray-level image, 𝐺𝜎 (𝑥, 𝑦) is the
two-dimensional Gaussian function with standard deviation 𝜎, and 𝛻 is the gradient
operator. In these equations, a large value of 𝜎 causes the image boundaries to become
blurred and distorted.
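Equation (25) can be illustrated with a short sketch: the image is smoothed with a separable Gaussian and the external energy is taken as γ minus the squared gradient magnitude, so that the energy minima lie on strong edges. The function name and the discrete kernel truncation at 3σ are our assumptions.

```python
import numpy as np

def external_energy(image, sigma=1.0, gamma=0.0):
    """Compute E_ex = gamma - |grad(G_sigma * I)|^2  (Eq. 25).

    The Gaussian blur is applied separably along rows and columns.
    Edge pixels get a large gradient magnitude, hence a low
    (attracting) external energy.
    """
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    # separable Gaussian smoothing (zero padding at the borders)
    blurred = np.apply_along_axis(
        lambda m: np.convolve(m, kernel, mode="same"), 0, image.astype(float))
    blurred = np.apply_along_axis(
        lambda m: np.convolve(m, kernel, mode="same"), 1, blurred)
    gy, gx = np.gradient(blurred)
    return gamma - (gx**2 + gy**2)
```

Because the snake minimizes energy, the negative squared gradient pulls the contour toward strong intensity transitions.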

3.5.2. Gradient Vector Flow (GVF) Snake


There are three difficulties with the traditional parametric Snake model. First, the
initial contour must, in general, be close to the true boundary, or else it will likely
converge to a wrong result. Second, Snake active contours do not progress into boundary
concavities. Third, parametric Snake active contours are non-convex. To solve these
problems, the GVF Snake model is used. The GVF Snake uses a vector field v(p(s), q(s)) to
minimize the energy function, replacing the image energy with the function in equation
(26).
𝜀 = ∬ 𝜇(𝑝𝑥2 + 𝑝𝑦2 + 𝑞𝑥2 + 𝑞𝑦2 ) + |∇𝑓|2 |𝑉 − ∇𝑓|2 𝑑𝑥𝑑𝑦 (26)

where ∇𝑓 is the gradient of the edge map f derived from the input image I(x,y), 𝜇 is an
adjustment parameter, the subscripts x and y denote partial derivatives, and 𝑑𝑥𝑑𝑦 is the
area element. In implementation, the curve cannot be represented by a continuous
function; a discretized form of the curve, consistent with the continuous formulation, is
therefore applied. Generally, the contours are pushed toward boundaries with high
intensity changes and deform to capture the best fit of the objects.
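The minimization of equation (26) is usually carried out by gradient descent on the two field components. The following is a simplified sketch with our naming; the periodic boundary handling via `np.roll` is a simplification, not part of the original formulation.

```python
import numpy as np

def gradient_vector_flow(edge_map, mu=0.2, iterations=80, dt=0.25):
    """Iteratively minimise Eq. (26) for the GVF field (p, q).

    Gradient-descent update per component:
        v <- v + dt * (mu * laplacian(v) - (v - grad_f) * |grad_f|^2),
    starting from the gradient of the edge map itself.  Smoothing (mu)
    diffuses the field into homogeneous regions, extending the snake's
    capture range; the data term keeps it equal to grad_f near edges.
    """
    fy, fx = np.gradient(edge_map.astype(float))
    mag2 = fx**2 + fy**2
    p, q = fx.copy(), fy.copy()

    def laplacian(a):
        # 5-point Laplacian with periodic boundaries (simplification)
        return (np.roll(a, 1, 0) + np.roll(a, -1, 0) +
                np.roll(a, 1, 1) + np.roll(a, -1, 1) - 4 * a)

    for _ in range(iterations):
        p = p + dt * (mu * laplacian(p) - (p - fx) * mag2)
        q = q + dt * (mu * laplacian(q) - (q - fy) * mag2)
    return p, q
```

After iteration, the field is nonzero even at pixels some distance from the edges, which is precisely what lets the GVF Snake converge from a farther initial contour.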

As can be seen in the original image shown in figure 3.4 and the stretched image shown in
figure 3.5, the areas near the boundaries of buildings have a different tone when
compared to the tone near the central parts of the buildings. Therefore, it is reasonable
to expect that the region growing segmentation method may fail to extract all parts of
the building areas. Although these near-boundary parts have colors different from the
inner parts of the buildings, the differences are not strong enough to become a problem
for the GVF Snake segmentation algorithm. Similarly, although there are edges inside the
buildings, they can be suppressed by applying an edge detection algorithm whose lower and
upper thresholds are controllable. To do that, a well-known edge detection algorithm (the
Canny detector) is used to derive the edge map required by the GVF Snake, and its
thresholds are adjusted to obtain more accurate segmentation results. The first
contribution here is to raise the lower threshold so that the weak edges, which are
mostly located inside the buildings, are not detected. The second contribution is to
automate the GVF Snake algorithm by composing initial vectors to start the energy-
minimizing contour segmentation. To do that, the method uses the result of the region
growing segmentation, which is in raster form; it finds the boundaries in the raster,
generates a vector index of the extracted boundary, and ultimately uses these vector
boundaries as the input for the Snake model to extract buildings. To minimize the energy
function of the Snake model, 15 iterations were used for the test images. This iteration
number was determined experimentally to optimize the performance of the method. A higher
iteration number would increase the computational load and consequently take longer to
reach a result. On the other hand, a lower iteration number would cause incomplete and/or
inaccurate segmentation of the objects of interest. The lower and upper thresholds for
the Canny edge detector used in this section are given in Table 3.1. The results of this
part are given in figures 3.25, 3.26 and 3.27.

Figure 3.25 The boundaries of the segmented image.

Figure 3.26 The results of the Canny-edge detection algorithm (Edge-map)

Figure 3.27 The result of the GVF Snake model

In the proposed approach, the GVF Snake is applied on the result of the region-growing
segmentation, which improves the accuracy of the segmentation. However, the main
contribution in this part of the thesis is the automation of the Snake model by
determining the locations of the initial contours, which are seeded close to the
boundaries of the buildings. Therefore, in the proposed approach, the GVF Snake algorithm
is automated by using the output of the region-growing segmentation as its input.
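The seeding step can be sketched as follows: the boundary of the binary region-growing mask is obtained as the mask minus its erosion, and the boundary pixel coordinates serve as the initial contour vertices. This is an illustrative sketch with our function name, not the exact thesis implementation.

```python
import numpy as np

def mask_boundary_points(mask):
    """Return the boundary pixel coordinates of a binary mask
    (mask minus its 4-connected erosion), usable as the initial
    contour vertices for the GVF Snake."""
    padded = np.pad(mask, 1, mode="constant")
    # a pixel survives erosion if it and its 4 neighbours are foreground
    eroded = (padded[1:-1, 1:-1] & padded[:-2, 1:-1] & padded[2:, 1:-1]
              & padded[1:-1, :-2] & padded[1:-1, 2:])
    boundary = mask & ~eroded
    return np.argwhere(boundary)
```

Ordering these points along the contour (and resampling them to the minimum/maximum point spacing in Table 3.1) would then yield the vector input expected by the Snake.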

Table 3.1. The parameters and the parameter values used in the study

Initial parameter settings for the proposed approach

Shadow Extraction
  - Maximum-brightness value threshold: 20-40 pixels
  - Minimum area to remove: 150 pixels (8-connectivity)

Buffer Zone Generation
  - Sizes of structuring elements: S1: 5-7 pixels; S2: 5 pixels; S3: 11 pixels
  - Minimum area to remove: 1-3 pixels (8-connectivity)
  - Trimming size: 20-30 pixels
  - Coefficient of standard deviation used in the noise removing procedure: 1
  - є: 5-9 pixels

Region-Growing segmentation
  - Median filter size: 11 x 11
  - Connectivity condition: 8-neighbour pixels

Snake segmentation
  - Iteration number: 25-40
  - Canny lower and upper thresholds: C1: 0.09; C2: 1
  - Interpolation: maximum distance between two Snake points: 2; minimum distance
    between two Snake points: 0.5
  - External forces: edge force coefficient: 0.6
  - Internal forces: Alpha (elasticity): 0.05; Beta (rigidity): 0; Gamma (viscosity): 1

CHAPTER 4

The Experiments, Results and Discussion

4.1 Image Data Set

The test images were selected from Google Earth and cover urban, rural and suburban areas
in and near Ankara, the capital of Turkey (Figure 4.1). The Google Earth images of the
test area were collected by the IKONOS satellite sensor with 1-meter spatial resolution
and three spectral bands (blue, green, red). The selected test areas contain buildings
with various characteristics, such as shape, size, and height. The assessments were
carried out over 50 test images (Table 4.1) selected from 36 different locations in
Ankara. The test images were particularly selected from areas that cover diverse building
characteristics, such as shape, size, color, density, and roof type (flat and non-flat).
Indeed, the main reason for selecting test images with various types of buildings and
various environments in their neighborhoods was to measure and demonstrate the weaknesses
and the efficiency of the processing steps of shadow extraction, buffer zone generation,
region growing segmentation and segmentation using the GVF Snake model.


Figure 4.1 The red rectangles show the areas from which the test images (50 in total) were
selected. One of these test images is illustrated in the lower right.

The test sites were grouped into four categories based on the number of buildings
contained within them. The first group contains the test sites with one building only:
test sites #(1-6), #9, #10, #(16-18), #20, #(27-30), #38, #41 and #45. The test sites in
the second group contain two or three buildings: test sites #7, #11, #(13-15), #22,
#(24-26), #33, #35, #43 and #44. The numbers of buildings in the test sites of the third
group range from 4 to 10: test sites #19, #21, #31, #36, #37, #40, #42, #46 and #50. The
fourth group contains the test sites with more than ten buildings: test sites #8, #12,
#23, #32, #34, #39 and #(47-49).

As can be seen in Table 4.1, the images of the selected test sites are highly complex,
with diversity in many aspects of the buildings' characteristics. Although the test
images were grouped according to the number of buildings contained within them, they also
exhibit various building characteristics, such as size, shape, color, rooftop type and
roof slope, as well as complex environments, which indeed make the proposed building
detection approach challenging. In addition, there are various illumination cases. For
example, the extraction of both the cast shadows and the buildings themselves will be
quite challenging for those buildings that are obscured by the surrounding trees.
Furthermore, there also exist some exposed soil surfaces whose reflectance values are
similar to those of the buildings.

Table 4.1 Test images


[Image grid of the 50 original test images, grouped by the number of buildings they
contain: one building (images 1-6, 9, 10, 16-18, 20, 27-30, 38, 41, 45); two to three
buildings (7, 11, 13-15, 22, 24-26, 33, 35, 43, 44); four to ten buildings (19, 21, 31,
36, 37, 40, 42, 46, 50); over ten buildings (8, 12, 23, 32, 34, 39, 47-49).]

4.2 The Assessment Strategy and Parameter Selection

The performance of the proposed building detection method was evaluated by comparing the
results with reference data. For each building, the reference building boundaries and
shadow boundaries were manually extracted through on-screen digitization. Buildings that
are only partially visible in the images were also included in the reference data.

The shadow areas (for the shadow performance evaluation) and the building areas (for the
building accuracy evaluation) were considered as the foreground of the binary image, and
the other parts of the images were regarded as background. The algorithm tries to find
the foreground parts and segment these areas out from the rest of the images.

To evaluate the performance of the approach, pixel-based measures have been used. In the
pixel-based evaluation, the pixels are classified into the following distinct categories [47].

 True positive (TP)
 False positive (FP)
 False negative (FN)
 True negative (TN)

A simple illustration of these four categories is given in Figure 4.2. TP denotes a pixel
that is labeled as a building by the proposed approach and also corresponds to a building
in the reference data. FP indicates a pixel that does not correspond to any of the pixels
labeled as buildings in the reference data, FN corresponds to a pixel that is labeled as
a building in the reference data but is not found by the proposed approach, and TN refers
to a pixel that is neither labeled by the proposed approach nor corresponds to a building
in the reference data.

Figure 4.2 The illustration of True Positive (TP), True Negative (TN), False Negative
(FN) and False Positive (FP). The sharp-cornered polygon denotes the objects classified
by the algorithm; the round-cornered polygon denotes the reference data.

In this study, the three well-known metrics Precision, Recall, and Fβ-score (Eqs. 27-29)
were used to evaluate the pixel-based performance of the proposed approach [47], [32],
[33].

Precision = #TP / (#TP + #FP)  (27)

Recall = #TP / (#TP + #FN)  (28)

Fβ = (1 + β²) · precision · recall / (β² · precision + recall)  (29)

In Equations 27 and 28, the sign # denotes the number of pixels assigned to each
category. In Equation 29, the Fβ-score, which is the weighted harmonic mean of the
Precision and Recall values, combines the two metrics into a single measure through a
nonnegative real constant β. In a building detection scenario, the Precision and Recall
metrics are equally important, so they were evenly weighted in this study by fixing the
tuning parameter to one (β = 1).
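The pixel-based evaluation of equations (27)-(29) can be sketched as follows. The function name is ours, and the sketch assumes at least one predicted and one reference foreground pixel so that the denominators are nonzero.

```python
import numpy as np

def pixel_based_scores(predicted, reference, beta=1.0):
    """Pixel-based evaluation: TP/FP/FN/TN counts and the
    Precision, Recall and F-beta metrics (Eqs. 27-29).

    `predicted` and `reference` are boolean masks where True marks
    foreground (building or shadow) pixels.
    """
    tp = int(np.sum(predicted & reference))
    fp = int(np.sum(predicted & ~reference))
    fn = int(np.sum(~predicted & reference))
    tn = int(np.sum(~predicted & ~reference))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_beta = ((1 + beta**2) * precision * recall
              / (beta**2 * precision + recall))
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn,
            "precision": precision, "recall": recall, "f_beta": f_beta}
```

With β = 1, as used in this study, the Fβ-score reduces to the balanced harmonic mean of Precision and Recall.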

4.3 Results

The test images along with the results of the three main steps of shadow extraction,
region-growing segmentation and segmentation through the GVF Snake algorithm are given in
Figure 4.3, which enables a visual comparison of the outputs of the proposed building
extraction procedure. In Figure 4.3, column "a" shows the original images, column "b" the
results of shadow extraction, column "c" the outputs of the region-growing segmentation,
and column "d" the segmentation results of the GVF Snake algorithm.

[Figure 4.3 image grid: test images #1-#50 in rows, columns (a)-(d).]

Figure 4.3 (a) The original images, (b) the results of shadow extraction, (c) the results
of region growing segmentation, and (d) the results of GVF Snake segmentation. The green
areas show the TP pixels, the blue areas show the FN pixels, and the red areas show the
FP pixels.

The computed Precision, Recall, FB, TP, FP, FN, and TN values as well as the overall
values for each of the test sites are given in Tables 4.2, 4.3, and 4.4, respectively for
the extracted shadow regions, the building areas segmented through region-growing
segmentation, and the building areas segmented through the GVF Snake algorithm.

Table 4.2. The computed Precision, Recall, FB, TP, FP, FN and TN values for the extracted
shadow regions of the test fields.

Numerical Results of The Proposed Shadow Detection Approach


Shadow extraction
Pixel-based accuracy assessment
# precision recall FB TP FP FN TN
#1 94,3% 89,6% 91,9% 4114 250 477 41699
#2 97,3% 92,1% 94,6% 2674 74 230 14732
#3 97,7% 74,8% 84,8% 2183 51 734 12842
#4 94,3% 80,3% 86,7% 10568 639 2592 56071
#5 99,4% 96,9% 98,1% 7271 43 232 35924
#6 96,5% 97,6% 97,0% 6035 218 150 57089
#7 97,3% 94,6% 96,0% 10556 288 598 68916
#8 96,5% 90,9% 93,6% 23703 860 2363 220768
#9 100,0% 59,0% 74,2% 752 0 522 18774
#10 92,2% 95,7% 93,9% 3655 309 163 50851
#11 97,7% 67,0% 79,5% 5270 126 2596 31875
#12 83,2% 64,5% 72,7% 9537 1921 5239 89143
#13 79,5% 76,7% 78,1% 6918 1784 2101 54882
#14 87,1% 80,7% 83,8% 6748 1001 1612 47401
#15 71,4% 88,2% 79,0% 6755 2701 900 51033
#16 97,4% 88,8% 92,9% 4364 117 550 94901
#17 95,6% 90,1% 92,7% 12253 567 1354 56848
#18 91,2% 88,4% 89,8% 6596 637 862 28115
#19 93,1% 88,2% 90,6% 22312 1661 2977 224429
#20 96,0% 94,6% 95,3% 5419 225 311 26220
#21 90,0% 82,7% 86,2% 12661 1412 2654 161225
#22 77,5% 90,7% 83,6% 5896 1714 605 69656
#23 94,3% 86,5% 90,2% 15189 912 2381 303917
#24 88,2% 92,7% 90,4% 2122 284 168 49973
#25 98,3% 94,3% 96,2% 10531 188 642 61185
#26 96,7% 84,8% 90,3% 2565 89 461 17297
#27 92,5% 83,4% 87,7% 2072 168 412 16958
#28 99,2% 87,6% 93,1% 2478 19 350 16809
#29 86,8% 94,1% 90,3% 2222 339 140 45443
#30 74,4% 96,9% 84,2% 2658 913 85 37048
#31 91,9% 89,2% 90,6% 24327 2139 2936 378814
#32 89,3% 86,3% 87,8% 42059 5043 6692 787264
#33 99,8% 74,9% 85,6% 9272 20 3112 79552
#34 94,9% 82,8% 88,5% 56848 3063 11776 843582
#35 98,8% 88,5% 93,4% 5825 73 755 80185
#36 83,4% 58,1% 68,6% 3321 639 2398 82892
#37 95,1% 79,6% 86,7% 29806 1528 7647 259835
#38 92,9% 94,1% 93,5% 4156 320 259 32138
#39 87,3% 86,2% 86,7% 11762 1719 1877 155610
#40 99,6% 74,5% 85,2% 3717 14 1273 41310
#41 98,4% 85,2% 91,3% 5693 94 988 55735
#42 98,9% 81,4% 89,3% 25117 282 5748 124981
#43 97,9% 80,6% 88,4% 11915 251 2871 80098
#44 95,5% 85,5% 90,3% 8253 387 1397 98167
#45 96,4% 94,0% 95,2% 1216 46 77 16463
#46 98,0% 83,5% 90,1% 14885 311 2949 183767
#47 58,5% 45,7% 51,3% 5638 3998 6694 251646
#48 93,7% 82,9% 88,0% 26253 1753 5432 314590
#49 96,3% 87,4% 91,6% 21733 840 3148 335887
#50 95,2% 55,9% 70,5% 7019 351 5532 343762
overall 92,3% 83,7% 87,3% 534892 42381 108022 6608302

Table 4.3. The computed Precision, Recall, FB, TP, FP, FN and TN values for the building
areas of the test fields segmented through the region growing method.

Numerical Results of Region Segmentation Detection Approach


Region Growing segmentation
Pixel-based accuracy assessment
# precision recall FB TP FP FN TN
#1 100,0% 67,0% 80,3% 12181 0 5989 28370
#2 97,5% 69,3% 81,0% 3014 79 1337 13280
#3 89,7% 34,6% 49,9% 935 107 1768 13000
#4 99,4% 96,6% 98,0% 13547 80 480 55763
#5 99,7% 89,2% 94,2% 10295 31 1241 31903
#6 98,3% 96,1% 97,2% 7906 139 325 55122
#7 97,5% 70,4% 81,8% 10217 266 4291 65584
#8 94,4% 76,6% 84,5% 43132 2584 13187 188791
#9 100,0% 45,0% 62,1% 3103 0 3796 13149
#10 99,4% 81,1% 89,3% 15853 93 3702 35330
#11 100,0% 35,0% 51,9% 2669 0 4955 32243
#12 97,3% 62,9% 76,4% 18379 506 10836 76119
#13 95,1% 59,2% 72,9% 9202 478 6355 49650
#14 96,4% 65,6% 78,0% 7464 283 3920 45095
#15 96,6% 83,8% 89,7% 16761 590 3241 40797
#16 97,7% 74,8% 84,7% 32409 776 10910 55837
#17 100,0% 40,1% 57,2% 5716 2 8542 56762
#18 98,6% 74,7% 85,0% 4660 65 1575 29910
#19 96,9% 75,2% 84,7% 23707 752 7824 219096
#20 99,9% 65,2% 78,9% 4469 6 2382 25318
#21 96,5% 81,8% 88,5% 29571 1082 6593 140706
#22 97,3% 86,0% 91,3% 15872 435 2580 58984
#23 99,3% 58,4% 73,6% 28300 198 20154 273747
#24 98,8% 57,0% 72,3% 5125 62 3873 43487
#25 99,5% 47,7% 64,5% 10139 52 11128 51227
#26 93,7% 79,7% 86,1% 3839 260 978 15335
#27 100,0% 38,1% 55,2% 2483 0 4037 13090
#28 92,9% 86,1% 89,4% 2535 195 409 16517
#29 99,9% 79,4% 88,4% 14075 18 3663 30388
#30 98,0% 52,8% 68,7% 4007 80 3576 33041
#31 95,7% 79,9% 87,1% 66214 2974 16658 322370
#32 95,9% 69,6% 80,6% 112713 4823 49355 674167
#33 98,1% 64,3% 77,7% 19265 373 10711 61607
#34 81,1% 43,6% 56,7% 76717 17882 99357 721313
#35 97,4% 43,5% 60,1% 7268 192 9458 69920
#36 93,8% 25,9% 40,6% 5658 377 16189 67026
#37 99,3% 48,8% 65,4% 44796 325 47078 206617
#38 100,0% 62,5% 76,9% 8429 0 5068 23376
#39 94,0% 60,4% 73,5% 22209 1419 14577 132763
#40 98,1% 53,4% 69,1% 6484 125 5665 34040
#41 99,0% 50,3% 66,7% 6567 70 6490 49383
#42 97,5% 75,3% 84,9% 51237 1323 16845 86723
#43 99,1% 60,5% 75,1% 21913 191 14318 58713
#44 97,5% 73,4% 83,8% 13881 358 5027 88938
#45 99,3% 56,1% 71,7% 3447 23 2700 11632
#46 97,1% 74,3% 84,2% 18725 560 6489 176138
#47 98,5% 63,3% 77,1% 26923 402 15580 225071
#48 96,3% 78,8% 86,7% 65105 2471 17489 262963
#49 98,3% 74,0% 84,4% 34354 613 12076 314565
#50 99,8% 57,1% 72,6% 38008 80 28610 289966
overall 97,3% 64,9% 76,6% 1011478 43800 553387 5684932

Table 4.4. The computed Precision, Recall, FB, TP, FP, FN and TN values for the building
areas of the test fields segmented through the GVF Snake method.

Numerical Results of the Proposed Automatic Building Extraction method


Snake-segmentation
Pixel-based accuracy assessment
# precision recall FB TP FP FN TN
#1 94,8% 96,4% 95,6% 17512 961 658 27409
#2 93,9% 96,6% 95,2% 4203 275 148 13084
#3 82,4% 34,2% 48,3% 924 198 1779 12909
#4 97,8% 96,3% 97,1% 13511 302 516 55541
#5 96,8% 94,5% 95,6% 10905 366 631 31568
#6 92,5% 99,7% 95,9% 8202 666 29 54595
#7 93,9% 91,7% 92,8% 13304 866 1204 64984
#8 91,6% 89,2% 90,4% 50251 4626 6068 186749
#9 95,1% 51,6% 66,9% 3562 185 3337 12964
#10 97,8% 88,6% 92,9% 17315 389 2240 35034
#11 89,4% 74,1% 81,0% 5649 668 1975 31575
#12 85,0% 68,3% 75,8% 19965 3511 9250 73114
#13 86,3% 77,8% 81,8% 12105 1921 3452 48207
#14 87,5% 88,9% 88,2% 10115 1447 1269 43931
#15 90,6% 89,3% 90,0% 17866 1844 2136 39543
#16 95,2% 99,9% 97,5% 43272 2179 47 54434
#17 99,8% 91,2% 95,3% 12998 27 1260 56737
#18 92,3% 92,9% 92,6% 5791 483 444 29492
#19 94,6% 91,4% 93,0% 28805 1645 2726 218203
#20 97,8% 78,9% 87,4% 5408 120 1443 25204
#21 91,0% 95,3% 93,1% 34455 3397 1709 138391
#22 92,8% 97,4% 95,1% 17979 1392 473 58027
#23 83,3% 82,7% 83,0% 40061 8015 8393 265930
#24 91,6% 89,8% 90,7% 8084 744 914 42805
#25 91,8% 53,8% 67,9% 11448 1027 9819 50252
#26 83,5% 74,0% 78,5% 3566 706 1251 14889
#27 95,9% 49,5% 65,3% 3225 139 3295 12951
#28 88,8% 94,4% 91,5% 2780 351 164 16361
#29 93,4% 99,3% 96,3% 17614 1245 124 29161
#30 94,6% 98,1% 96,3% 7440 427 143 32694
#31 89,4% 86,9% 88,1% 71971 8519 10901 316825
#32 90,1% 75,8% 82,3% 122882 13533 39186 665457
#33 87,6% 79,9% 83,6% 23935 3385 6041 58595
#34 77,3% 54,2% 63,7% 95383 28071 80691 711124
#35 93,3% 53,9% 68,3% 9013 651 7713 69461
#36 83,9% 39,2% 53,4% 8561 1641 13286 65762
#37 90,7% 67,0% 77,1% 61551 6334 30323 200608
#38 92,8% 87,0% 89,8% 11748 912 1749 22464
#39 84,9% 80,7% 82,7% 29670 5293 7116 128889
#40 87,3% 74,0% 80,1% 8991 1308 3158 32857
#41 96,7% 70,7% 81,7% 9237 320 3820 49133
#42 92,9% 83,9% 88,2% 57146 4363 10936 83683
#43 94,2% 83,6% 88,6% 30286 1855 5945 57049
#44 85,8% 98,1% 91,5% 18546 3079 362 86217
#45 94,0% 71,2% 81,0% 4375 278 1772 11377
#46 91,3% 92,9% 92,1% 23429 2238 1785 174460
#47 86,8% 90,8% 88,8% 38600 5884 3903 219589
#48 89,5% 86,6% 88,1% 71551 8356 11043 257078
#49 90,1% 89,9% 90,0% 41727 4608 4703 310570
#50 92,5% 63,4% 75,2% 42225 3412 24393 286634
overall 91,1% 81,1% 84,9% 1229142 144162 335723 5584570

For each of shadow extraction, region growing segmentation, and segmentation through the
GVF Snake algorithm, a grouping of the test sites with respect to the FB accuracy results
is shown in Tables 4.5, 4.6 and 4.7. Table 4.5 contains the results for the shadow
extraction operation. In a similar manner, Table 4.6 contains the grouping of the
accuracy results of the Region-Growing segmentation and Table 4.7 the grouping of the
results of the GVF Snake segmentation. These three tables provide more detail about the
accuracies of the test images with respect to the three defined accuracy ranges. They
also show the percentage of the total number of test sites assigned to each of the three
accuracy ranges.

Table 4.5 The grouping of the test sites with respect to FB accuracy of shadow extraction.

Shadow Detection

FB accuracy range | Test images assigned to each range of FB accuracy | Number of elements | Percentage
Below 70%  | 36 | 1 | 2%
70% - 85%  | 3, 9, 11, 12, 13, 14, 15, 22, 30, 50 | 10 | 20%
Over 85%   | 1, 2, 4, 5, 6, 7, 8, 10, 16, 17, 18, 19, 20, 21, 23, 24, 25, 26, 27, 28,
             29, 31, 32, 33, 34, 35, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
             49 | 39 | 78%

Table 4.6 The grouping of the test images with respect to FB accuracy of Region-Growing
segmentation.

Region Growing Segmentation

FB accuracy range | Test images assigned to each range of FB accuracy | Number of elements | Percentage
Below 70%  | 3, 9, 11, 17, 25, 27, 30, 34, 35, 36, 37, 40, 41 | 13 | 26%
70% - 85%  | 1, 2, 7, 8, 12, 13, 14, 16, 19, 20, 23, 24, 32, 33, 38, 39, 42, 43, 44,
             45, 46, 47, 49, 50 | 24 | 48%
Over 85%   | 4, 5, 6, 10, 15, 18, 21, 22, 26, 28, 29, 31, 48 | 13 | 26%

Table 4.7 The grouping of the test images with respect to FB accuracy of GVF Snake
segmentation.

GVF Snake Segmentation

FB accuracy range | Test images assigned to each range of FB accuracy | Number of elements | Percentage
Below 70%  | 3, 9, 25, 27, 34, 35, 36 | 7 | 14%
70% - 85%  | 11, 12, 13, 23, 26, 32, 33, 37, 39, 40, 41, 45, 50 | 13 | 26%
Over 85%   | 1, 2, 4, 5, 6, 7, 8, 10, 14, 15, 16, 17, 18, 19, 20, 21, 22, 24, 28, 29,
             30, 31, 38, 42, 43, 44, 46, 47, 48, 49 | 30 | 60%

4.4 Discussions

 The results given in Figure 4.3 illustrate the numbers of pixels assigned to TP, FP and
FN in the pixel-based evaluation. Based on visual inspection, it can be stated that the
proposed approach for shadow-based building extraction is quite successful.
 Several factors affect the accuracy of segmentation and consequently the final building
extraction results. Notably, the FP values computed for the results of the
Region-Growing segmentation are quite low for all test sites, which is a clear
indication of the robustness of the developed algorithm. The low FP values in the
Region-Growing segmentation depend strongly on the shadow extraction and buffer zone
generation steps.
 As can be seen in Table 4.2, the precision, recall and FB ratios for the extracted
shadows are in the ranges 58.5%-100%, 45.7%-97.6% and 51.3%-98.1%, respectively. The
overall precision, recall and quality (FB) values of 92.3%, 83.7% and 87.3% illustrate
the robustness of the proposed algorithm in shadow extraction. The high shadow
extraction results also prove that the algorithm is capable of overcoming the mixing
problem between shadow and vegetation areas when they are fully or partially adjacent,
even in the absence of the NIR band, with which vegetation and shadow areas would
otherwise be easily separable. In addition, the proposed method has been successful in
obtaining acceptable results even for Google Earth images with relatively low spatial
and radiometric resolutions.

 Since the algorithm highly depends on the success of shadow extraction, a failure in the
shadow extraction step reasonably leads to low success in the final extracted building
regions. Nevertheless, in several cases high accuracy values were obtained even under
partial extraction of the shadow areas; the method thus has some ability to extract
buildings even when the cast shadows are only partially detected. For example, for test
site #47 the extracted shadow area has an FB accuracy of only 51.3%, yet the ultimate
GVF Snake result of the method reaches a quite high accuracy of 88.8%, which is above
the overall accuracy and quite acceptable for this method.
 Since the proposed method is based on the detection of the cast shadows of the
buildings, it is completely incapable of extracting buildings whose cast shadows are not
inside the test site. There are such cases in the test images; for example, in image #32
the shadow area of one building is not inside the test scene, and this building
therefore remained undetected by the approach.
 More specifically, it can be seen in Table 4.5 that only one test site has an FB
accuracy value below 70%, which also indicates the robustness of the method in shadow
extraction. The FB accuracy of test image #36 stayed below 70% (Table 4.5) because a
number of factors, such as vegetation areas near the buildings mixing strongly with the
shadow regions, together with the heights and small sizes of the buildings, caused the
shadow regions to become small and thus more difficult to extract. Further, all
buildings in this test site (#36) had heterogeneous brightness values on their rooftops,
which also led to rather lower accuracy in the segmentation steps.
 For the region growing segmentation part, the minimum and maximum values computed for
the precision, recall and FB metrics were 81.1%-100%, 25.9%-96.6% and 40.6%-98%,
respectively. The high precision ratios demonstrate that the pixels labeled as buildings
are highly correct. However, in several cases the number of pixels assigned as buildings
was quite low. The overall FB value of the evaluation is 76.6% (Table 4.3), which can be
considered adequate for building detection but not for building extraction purposes. The
main reason for the high Precision values is the efficiency of the buffer zone
generation, during which the sample pixels were accurately collected from the building
areas. On the other hand, the low Recall values are due to the small number of pixels
labeled as building in the Region-Growing segmentation part. This also affects the FB
results, because the FB metric combines the Precision and Recall assessments into an
average evaluation.
 From another point of view, the numbers of test images assigned to each accuracy range
show that rather few test images reach a reliable FB accuracy in this step, so the
Region-Growing segmentation alone is not an adequate methodology for delineating the
outlines of the buildings. Therefore, the methodology needs an extra step to increase
the accuracy of the results and make them more reliable and robust.
 The developed method uses the GVF Snake segmentation algorithm to improve
the outputs of the region growing algorithm. The results proved that the GVF
Snake segmentation increased the accuracies of the region growing segmentation
for 45 of the 50 test sites. The improvement is mainly due to two facts. First, the
number of TP pixels increases because the GVF Snake overcomes local noise to
which the region growing algorithm is sensitive. Second, the number of FP pixels
decreases, raising the accuracy, because the GVF Snake uses the gradients of the
image and tries to snap the contour to them. Overall, the GVF Snake algorithm
raised the results of the region growing segmentation to 84.9% for the FB
evaluation and 81.1% for the Recall evaluation.
 From a quantitative point of view, Tables 4.6 and 4.7 show that the accuracy
results increased considerably. The GVF Snake algorithm increased the number of
test sites with higher accuracies and decreased the number of test sites with lower
accuracies.
 One of the problematic cases in extracting the building outlines arises from the
varying slopes of building rooftops. Although the whole roof surface of a building
is covered with the same material, the roof can have different brightness values
due to variations in the reflectance from different parts of the roof surface.
Therefore, the varying rooftop slopes lead to problems in segmenting the building
outlines. This impact can be clearly seen in the results of test images #3, #8, #12,
#13, #15, #24, #26, #27, #32-#41 and #50.
 There are also challenging cases in which the environments around the buildings
have the same reflectance values as the rooftops, which makes it difficult to
correctly separate the building area from its surroundings. Test images #1, #23,
#24, #26 and #32 illustrate such cases, for which the algorithm achieved FB results
of 95.6%, 83%, 90.7%, 78.5% and 82.3%, respectively. These results prove the
robustness of the algorithm in these cases.

4.5 The advantages of the proposed method

1- The proposed method is useful and generic for buildings with both simple and
complex shapes. Therefore, the method can be successfully applied to
automatically extract buildings with arbitrary shapes.
2- The proposed method is also quite successful in extracting buildings of
different sizes. In particular, if the processed image contains a single building,
the success of the method increases. This is because the developed method
extracts buildings based on their cast shadows, which provide a major hint
about the shapes and sizes of the buildings. Regardless of the size of a
building, the buffer zone generated from the extracted shadow provides a seed
area from which the region growing segmentation starts, and the result is then
refined through the GVF Snake model.
3- The proposed approach can also be used efficiently under varying illumination
settings and produces reasonable building extraction performance even under
the challenging environmental conditions seen in the Google Earth test
images.
4- The method can be considered fully automatic as it needs no user intervention
during processing. This makes the method suitable even for inexperienced
users.
5- One of the most valuable advantages of the method is that it uses freely
available and easily accessible high resolution Google Earth images.
6- The high accuracies of the extracted buildings also make the method suitable
for 3D building reconstruction.

7- The method may produce reasonably effective results even under relatively
poor shadow conditions, although this does not hold for all poor-shadow areas.

4.6 The limitations of the method

1- The proposed method is incapable of detecting buildings that cast no shadows.
Therefore, the proposed approach may lose its efficiency in cases where
(self-)occlusion hides the shadows. Thus, the proposed method is highly
dependent on the extracted shadows.
2- On buildings with non-flat roofs, the rooftop slopes may cause different
reflectance values in different parts of the building's roof. This becomes a
barrier not only for the region growing segmentation to grow but also for the
GVF Snake algorithm to segment the parts of the roof with a different slope.
3- If a group of connected buildings is handled simultaneously, the connected
buildings may be detected as a single building area rather than as individual
buildings.

CHAPTER 5

Conclusions and Future Works

In this thesis, a new approach was proposed for the automatic extraction of buildings
from high resolution multispectral Google Earth images. To test and evaluate the success
of the approach, extensive experiments were carried out on 50 test sites selected in and
near Ankara, the capital of Turkey. The results of the experiments illustrate that the
performance of the approach is quite high for both shadow detection and building
extraction. In this chapter, the conclusions derived from the developed approach are
stated and recommendations regarding possible further studies are given.

5.1 Conclusions

The conclusions derived from this study are as follows:

 The proposed approach takes full advantage of the luminance channel of the
LAB color space to extract shadow regions even if they are mixed with nearby
vegetation areas. A fact explored in this study is that the shadow areas have the
lowest brightness values in the luminance channel. As a result, in most cases, the
shadow areas could be separated from the vegetation areas using a pre-defined
threshold.
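A minimal, library-free sketch of this idea is given below; note that a simple weighted luminance is used here as a stand-in for the true L channel of the LAB color space, and the threshold value is illustrative only:

```python
def shadow_mask(rgb_image, threshold=40.0):
    """Label dark pixels as shadow candidates.

    rgb_image: 2-D grid of (r, g, b) tuples in [0, 255].
    threshold: luminance cutoff; shadow pixels fall below it.
    Note: a weighted luma stands in here for the L channel of
    the LAB color space used in the thesis.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            # Rec. 601 luma as a rough stand-in for LAB lightness
            luminance = 0.299 * r + 0.587 * g + 0.114 * b
            mask_row.append(luminance < threshold)
        mask.append(mask_row)
    return mask

img = [[(20, 20, 25), (200, 180, 160)],
       [(30, 35, 30), (90, 120, 80)]]
print(shadow_mask(img))  # only the dark pixels are flagged
```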
 The evaluation results for the shadow extraction part of the study clearly
demonstrate the reliability and efficiency of the method used to extract the cast
shadows of the buildings. For the extracted shadow areas, the average accuracy
value over all test areas was computed to be 87.3%. The visual assessments of the
results also confirm this.
 In cases where vegetation exists, the shadow extraction method may wrongly
extract vegetation areas instead of the cast shadows of the buildings. However, the
proposed approach reduces the adverse effects caused by the vegetation areas by
performing successive morphological erosion operations, which reduce the noise in
the buffer zone generation part. With the sequential erosion operations, carried out
to shrink the buffer zone, the wrongly detected vegetation areas whose spectral
reflectance values are similar to those of shadow areas are reduced.
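The effect of the successive erosions can be illustrated with a minimal hand-rolled binary erosion (a stand-in for the morphological operator actually used, with a 3x3 square structuring element assumed):

```python
def erode(mask):
    """One binary erosion with a 3x3 square structuring element:
    a pixel survives only if itself and all 8 neighbours are 1.
    Border pixels are always eroded in this simplified version."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if all(mask[y + dy][x + dx]
                   for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out

# After one erosion only interior pixels of the blob remain, so thin
# protrusions (e.g. mislabeled vegetation fringes) vanish.
blob = [[1, 1, 1, 1, 0],
        [1, 1, 1, 1, 0],
        [1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0]]
print(erode(blob))
```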
 Removing the noise in the buffer zone generation stage improved the initial sample
collection operation and made it more reliable and efficient. This is because the
noise removal operation discarded those seed points whose spectral values lay
above the decision range determined as the thresholds for this operation. The
buffer zones before and after noise removal can be seen in Figures 3.17 and 3.18,
respectively.
 This study sequentially uses the region growing segmentation method and the GVF
Snake algorithm to extract the boundaries of the buildings. Both the visual
assessments and the quantitative evaluation of the results demonstrate that the
GVF Snake algorithm improves the results of the region growing segmentation by
about 15 percent in overall accuracy. Further, the GVF Snake algorithm increased
the segmentation accuracy results to up to 96%, with improvements ranging from
0.5% to 38.1%.
 The GVF Snake segmentation algorithm does not produce satisfactory results if it
is implemented directly. Therefore, in this study, a region growing segmentation is
used as the initial segmentation in order to generate a region for seeding the initial
contours of the GVF Snake algorithm. In this way, the proposed approach
eliminates user interaction in the creation of the initial contours and makes the
GVF Snake algorithm automatic.
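The seeding idea can be sketched as follows; this is a simplified stand-in for the thesis' region growing stage, assuming 4-connectivity and an absolute intensity tolerance around the seed mean:

```python
from collections import deque

def region_grow(image, seeds, tol=20):
    """Grow a region from seed pixels over a 2-D intensity grid.
    A neighbour joins if its value is within `tol` of the seed mean.
    Simplified stand-in for the thesis' segmentation stage."""
    h, w = len(image), len(image[0])
    seed_mean = sum(image[y][x] for y, x in seeds) / len(seeds)
    region = set(seeds)
    queue = deque(seeds)
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and (ny, nx) not in region
                    and abs(image[ny][nx] - seed_mean) <= tol):
                region.add((ny, nx))
                queue.append((ny, nx))
    return region

# Bright 2x2 rooftop block grows from a single buffer-zone seed;
# the darker surroundings are excluded.
roof_img = [[200, 205, 40],
            [198, 210, 35],
            [50, 45, 30]]
print(sorted(region_grow(roof_img, seeds=[(0, 0)])))
```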
 The Canny edge detector was applied with predefined lower and upper threshold
values, which make the edge detector less sensitive to weak edges inside the
buildings and increase the robustness of the method in detecting the strong edges
on the boundaries.
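The role of the two thresholds can be sketched on a 1-D row of gradient magnitudes (real Canny hysteresis works on 2-D neighbourhoods; the threshold values here are illustrative):

```python
def hysteresis_1d(grads, low, high):
    """Classify a 1-D run of gradient magnitudes the way Canny's
    double threshold does: magnitudes >= high are strong edges,
    magnitudes < low are discarded, and weak responses in between
    survive only when connected to a strong one."""
    strong = [g >= high for g in grads]
    keep = list(strong)
    changed = True
    while changed:  # propagate keep-decisions along the row
        changed = False
        for i, g in enumerate(grads):
            if not keep[i] and low <= g < high:
                left = i > 0 and keep[i - 1]
                right = i + 1 < len(grads) and keep[i + 1]
                if left or right:
                    keep[i] = True
                    changed = True
    return keep

# The weak edge (60) attached to a strong one (120) survives;
# the isolated weak response (55) does not.
print(hysteresis_1d([10, 120, 60, 20, 55, 20], low=50, high=100))
```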
 The GVF Snake segmentation method, with its particular external energy, provides
much better segmentation even along broken boundary lines. Moreover, it can
capture concavities and achieve more precise segmentation in comparison to
traditional active contour models.
 Despite the relatively low radiometric quality of the three-band (RGB) Google
Earth satellite images used in the experimental tests, the obtained results have
proven the efficiency and effectiveness of the proposed approach.

5.2 Future Works

o The main assumption used in this thesis is that the shadow regions to be
extracted lie at an angle of about 135 degrees from the x axis. As a future work,
the aim is to estimate the shadow illumination direction without knowledge of the
solar azimuth and elevation angles, since this information may be unknown to a
user who wants to perform building detection from a single image through a
software tool. To achieve fully automatic building detection, the present algorithm
needs to find the direction of the shadows itself. An innovative approach could be
to use the shadow geometry to calculate the shadow directions by applying the
Hough transform for line detection after extracting the shadow boundaries.
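The suggested Hough-based direction estimate could look roughly like the toy accumulator below (quantized angles, rounded rho; all parameter choices are illustrative):

```python
import math

def dominant_line_angle(points, n_theta=180):
    """Toy Hough transform: vote over (theta, rho) cells and return
    the normal angle (degrees) of the strongest line through the
    given edge points."""
    votes = {}
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            rho = round(x * math.cos(theta) + y * math.sin(theta))
            votes[(t, rho)] = votes.get((t, rho), 0) + 1
    (t_best, _), _ = max(votes.items(), key=lambda kv: kv[1])
    return 180.0 * t_best / n_theta

# Points along the line y = x: the line's normal direction is
# 135 degrees, matching the near-135-degree shadow assumption.
pts = [(i, i) for i in range(0, 200, 10)]
print(dominant_line_angle(pts))  # 135.0
```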

o There are some problems in detecting building roofs with various slopes, which
can naturally have the same color but appear different due to illumination and the
reflectance rates from their surfaces. As a next project, I intend to integrate
LIDAR data of the region to strengthen the algorithm against this weakness in
finding buildings with differently sloped roofs.

o Access to very high resolution images with an NIR band would enable the
proposed automatic building extraction method to improve its algorithm by means
of the NDVI vegetation index. The additional NIR band could be used to reduce
the negative effects of the vegetation areas on shadow extraction by applying the
NDVI index to mask out the vegetation areas, raising the method's efficiency.
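The NDVI computation itself is standard; the sketch below, with an illustrative 0.3 cutoff, shows how such a mask could flag vegetation pixels before shadow extraction:

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red), in [-1, 1]."""
    return (nir - red) / (nir + red) if (nir + red) else 0.0

def vegetation_mask(nir_band, red_band, threshold=0.3):
    """Flag likely-vegetation pixels; the 0.3 cutoff is illustrative
    and would need tuning for real imagery."""
    return [[ndvi(n, r) > threshold for n, r in zip(nr, rr)]
            for nr, rr in zip(nir_band, red_band)]

# Vegetation reflects strongly in NIR relative to red.
nir = [[180, 60], [170, 50]]
red = [[40, 55], [45, 52]]
print(vegetation_mask(nir, red))  # vegetation in the left column
```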

o The robust shadow extraction method used in the shadow extraction stage of the
proposed approach also offers the opportunity to remove shadow areas from
satellite images and enhance their visual interpretation. This idea can be used to
improve the visibility of structures located under the cast shadows of taller
structures.

o Another new and attractive research direction arising from this thesis is to
strengthen the proposed method so that it fully extracts buildings with variously
sloped roofs by using other color spaces, considering that such roofs are naturally
the same color but reflect different amounts of brightness due to their differing
slopes. This problem may be solved by considering the color variation ratios in the
hue channel of the HSV color space or in the A and B channels of the LAB color
space.

o With little modification, the hierarchical use of the region growing and GVF
Snake segmentation analysis can also be used to detect other important man-made
features such as roads. Furthermore, it can serve as the initial step of road-network
updating, taking the characteristics of the roads into account.

o Building heights can be estimated by combining the shadow extraction part of
this thesis with other methods, using the extracted shadows to estimate the
elevations of the buildings.
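Under simple assumptions (flat terrain, a fully visible cast shadow and a known sun elevation angle), the basic relation between shadow length and building height gives a first-order estimate:

```python
import math

def building_height(shadow_length_m, sun_elevation_deg):
    """First-order height estimate: h = L * tan(elevation).
    Assumes flat ground and a fully visible cast shadow; real
    imagery would also need the sensor viewing geometry."""
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))

# A 12 m shadow under a 45-degree sun implies a ~12 m building.
print(round(building_height(12.0, 45.0), 1))  # 12.0
```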

CURRICULUM VITAE

Credentials

Name, Surname : Salar GHAFFARIAN

Place of Birth : Tabriz, Iran

Marital Status : Married

E-mail : salarghaffarian1363@gmail.com

Address : Hacettepe University

Department of Geomatics Engineering

06800 Beytepe - Ankara - Turkey

Education

High School : Taleghani High School, Iran

BSc. : Civil Engineering, Islamic Azad University, Iran

MSc. : -

PhD. : -

Foreign Languages

English, Persian, Turkish

Work Experiences

 Member of the Iranian Young Researchers Club


 Iranian Organization for Engineering Order of Building, East Azarbaijan Province

Areas of Experiences

 Reviewer/Referee for the IEEE Transactions on Cybernetics journal.

Projects and Budgets

Publications

Journal papers

 Saman Ghaffarian, Salar Ghaffarian, Automatic building detection based on Purposive FastICA
(PFICA) algorithm using monocular high resolution Google Earth images, ISPRS Journal of
Photogrammetry and Remote Sensing, Volume 97, November 2014, Pages 152-159, ISSN 0924-
2716,
http://dx.doi.org/10.1016/j.isprsjprs.2014.08.017

 Saman Ghaffarian, Salar Ghaffarian, Automatic histogram-based fuzzy C-means clustering for
remote sensing imagery, ISPRS Journal of Photogrammetry and Remote Sensing, Volume 97,
November 2014, Pages 46-57, ISSN 0924-2716,
http://dx.doi.org/10.1016/j.isprsjprs.2014.08.006 .

Conference paper

 “Automatic Building Detection Based on Supervised Classification using High Resolution Google
Earth Images”, Salar Ghaffarian and Saman Ghaffarian, ISPRS Technical Commission III,
Photogrammetric Computer Vision Conference in Zurich Switzerland, 2014.
Doi:10.5194/isprsarchives-XL-3-101-2014.

Oral and Poster Presentations

 Poster presentation of the paper entitled Automatic Building Detection Based on
Supervised Classification using High Resolution Google Earth Images at the ISPRS
Technical Commission III Photogrammetric Computer Vision conference in Zurich,
Switzerland, 2014.

