
International Journal of Digital Earth

ISSN: (Print) (Online) Journal homepage: https://www.tandfonline.com/loi/tjde20

Enhanced urban functional land use map with free and open-source data

T. T. Vu, N. V. A. Vu, H. P. Phung & L. D. Nguyen

To cite this article: T. T. Vu, N. V. A. Vu, H. P. Phung & L. D. Nguyen (2021) Enhanced urban
functional land use map with free and open-source data, International Journal of Digital Earth,
14:11, 1744-1757, DOI: 10.1080/17538947.2021.1970262

To link to this article: https://doi.org/10.1080/17538947.2021.1970262

Published online: 29 Aug 2021.


T. T. Vu (a), N. V. A. Vu (b), H. P. Phung (b) and L. D. Nguyen (b)

(a) Faculty of Engineering and Science, Curtin University Malaysia, Miri, Malaysia; (b) Vietnam National Space Center, Vietnam Academy of Science and Technology (VAST), Ho Chi Minh city, Vietnam

ABSTRACT
The study aims at developing an applicable methodology to produce the functional land-use map using only free and open-source data. Top-view Sentinel images and ground-view Open Street Map (OSM) data are chosen due to their extensive availability. The three-stage framework, including object-based image analysis, OSM data cleaning, and ontology-based decision fusion, is proposed and implemented with open-source tools. We applied the developed approach to districts 1, 4, and 7 of Ho Chi Minh city, representing the complexities of the dynamic change in big cities. The result showed a good functional land use map with 78.70% overall accuracy. The outcome presents the mismatch between the data-driven approach and human knowledge, which can be improved by ontology-based fusion with OSM data. The ontology-based framework comprises the common urban land-use classes and OSM attributes, which can be applied and extended in other urban areas. Additional text attributes may be applicable only locally and can be modified in our open-source framework. Object-based image analysis takes advantage of Google Earth Engine computing power, whereas ontology-based processing works well on a local computer. In future studies, natural language processing will be adopted to pre-process OSM data, and ontology-based fusion will be implemented on a cloud-computing platform to enhance computational efficiency.

ARTICLE HISTORY
Received 4 May 2021
Accepted 16 August 2021

KEYWORDS
Data fusion; Remote Sensing; Digital City; Land use; GIS

1. Introduction
Since the 1990s, the urban population in Vietnam has been growing steadily at approximately 3% annually (Pimhidzai et al. 2019). As of 2019, 36.63% of Vietnam's population lived in cities, and the biggest city, Ho Chi Minh city, houses over 8 million residents (WorldBank 2020). Rapid urbanisation and the emerging economy greatly transform the land cover and land use situation in the cities. Timely updated land cover and land use (LULC) maps are essential for monitoring the impact on the environment and on urban residents' well-being, as well as for better planning of future development (Hu et al. 2016). In the complex urban scene, however, a finer detailed urban functional map would reflect the real picture better than the formal land-use map.
Existing multi-source remote sensing images enable the production of LULC maps at different levels of detail. Images of appropriate resolutions have been widely deployed to tackle the complexity and heterogeneity in urban mapping (Momeni, Aplin, and Boyd 2016; Vu, Thy, and Nguyen 2018; Cai et al. 2019). Sub-meter spatial resolution was found to be more appropriate for residential buildings, whereas other classes can be mapped with a medium resolution of 2–5 m (Cai et al. 2019). Qin et al. (2018) proved that outlines of individual buildings can be successfully extracted from very high-resolution (0.1–0.4 m) optical images. Three ranges of spatial resolution, i.e. 0.8–3 m, 6–8 m, and equal to or greater than 20 m, were identified as optimal for classifying isolated objects, vegetation areas, and urban districts, respectively (Tran et al. 2011). Likewise, an experiment with three different resolutions for mapping a mixed development area in Ho Chi Minh city, Vietnam, revealed that 10–15 m cannot produce good urban LULC maps, though it is sufficient to extract entire low-rise, high-density, old residential areas (Vu, Thy, and Nguyen 2018).

CONTACT T. T. Vu tuongthuy.vu@curtin.edu.my Faculty of Engineering and Science, Curtin University Malaysia, CDT 250, Miri, Sarawak, Malaysia
© 2021 Informa UK Limited, trading as Taylor & Francis Group. The International Journal of Digital Earth is an Official Journal of the International Society for Digital Earth

Although the use of multiple resolutions has proven cost-effective in producing urban LULC maps, high-cost sub-meter images are still required. Moreover, in the complex urban environment, even a very high-resolution satellite image cannot classify a detected object and label it with its correct function (Jia et al. 2018; Du et al. 2020). For instance, using only a high-resolution satellite image, it is hard to tell whether a group of buildings is commercial shop-houses or residential terrace houses. Information on human social and economic activities can help to clarify the true functional use of a building (Soliman et al. 2017; Tu et al. 2018).
In the 2000s, citizens as sensors emerged as an additional geo-information collection method besides conventional collection by national mapping agencies (Goodchild 2007). Using relevant keywords such as citizen science, collaborative mapping, or crowdsourcing, See et al. (2016) identified over 25,000 research works published between 1990 and 2015. Volunteered geographic information (VGI) can be used as a supplement to update existing national GIS maps (Pourabdollah et al. 2013) or be integrated into remote sensing image analysis for LULC mapping and change detection (Johnson et al. 2017; Schultz et al. 2017; Wan et al. 2017; Yang et al. 2017). In the past two decades, volunteers have helped to create substantial ground information in great detail, and VGI keeps increasing in quantity and importance in mapping, e.g. regional forest mapping (Pekkarinen, Reithmaier, and Strobl 2009) or damage mapping (Kerle and Hoffman 2013; Wang et al. 2018), to name a few.
Human social information in many forms, such as mobile phone records (Ratti et al. 2006; Pei et al. 2014; Tu et al. 2018), taxi data (Pan et al. 2013; Anugraha, Chu, and Ali 2020), and social networking data (Jiang et al. 2015; Soliman et al. 2017), has been extensively employed in urban land-use mapping. Open Street Map (OSM), the greatest voluntary effort worldwide, provides timely and rich details of street networks and points of interest (POI). Most, if not all, relevant studies have used OSM in integration with remote sensing. A coarse urban land-use map was produced from nighttime light data (Cao et al. 2020), and fine-resolution maps were produced from high-resolution satellite images (Liu et al. 2017; Wan et al. 2017); both used OSM as supplementary data. Schultz et al. (2017) even used OSM products as the primary data in deriving the Open Global Land Cover, with the gaps filled by free remote sensing images. High-resolution satellite images integrated with social data produced highly detailed urban land-use maps with good accuracy (Liu et al. 2017; Wan et al. 2017; Du et al. 2020). It is also noted that large volumes of POI data from various platforms, in addition to OSM, were used in those studies. Anugraha, Chu, and Ali (2020) attempted to work with lower-resolution, 10 m Sentinel-2 images, but took advantage of bike and taxi data together with OSM data. Such data sets, however, are not always available for every city.
In this study, we aim at developing a solution using only free and open-source data available for cities in developing countries. For most cities, we can employ only medium-resolution satellite images like Landsat or Sentinel and OSM data. The question is how detailed and reliable a functional land-use map can be derived from them. In the following, Section 2 introduces our study areas and the data used. Section 3 presents the developed method for urban functional land-use mapping, followed by experiments in Section 4. The final section draws some concluding remarks and recommendations for further studies.

2. Research material
2.1. Study area
In the past few decades, Ho Chi Minh city, the biggest city in Vietnam, has experienced fast growth both horizontally and vertically. In District 1, the central business district (CBD), many new high-rises have been built, and new business activities have developed in the old, low-rise commercial and residential areas. Considerable confusion arises here when only remote sensing images are used. From a bird's-eye view, the southeast area of Ho Chi Minh city shows obvious changes. Since the development of Phu My Hung Township, District 7 has been drastically transformed with many new commercial and residential areas. The growing District 7 triggered the need for a better connection to the CBD and hence also transformed District 4, located between districts 1 and 7.
The three districts cover a total area of 47.59 km², about 2.3% of the area of Ho Chi Minh city, whereas their total population is 666,000, about 7.5% of the city population (Department of Statistics, Ho Chi Minh city, 2018). Given the availability of ground truth data and the rapid changes of the past decades, we tested our developed solution using free and open-source data for districts 1, 4, and 7 of Ho Chi Minh city (Figure 1).

2.2. Data used


As aforementioned, our study focused on employing free and open-source satellite images like Sentinel-2 and OSM data. We selected only four spectral bands of the Sentinel-2 image: blue, green, red and near-infrared, which have the finest spatial resolution of 10 m. Since we focused on mapping urban activities, the free image with the best spatial resolution was preferred. Cloud-free Sentinel-2 images were selected from the period from November 2018 to March 2019.

Figure 1. Study area in Ho Chi Minh city (false colour composite Sentinel-2 image).

The free OSM data are available for download from geofabrik.de. Relevant OSM layers were extracted and clipped exactly to the boundary of districts 1, 4, and 7. The six layers are road, building, landuse, water, and two layers of points of interest (POI), in polygon and point forms. We rigorously checked the completeness and currency of all layers against our local knowledge. The check confirmed that the road and water layers are complete and up to date. The other layers, however, showed incomplete and inconsistent data across the three districts. In addition, the polygons and points on these layers partially overlapped and hence were redundant. These layers therefore require some pre-processing before integration with the satellite images.
The common functional zones in developing Asian cities can be categorised into two levels. On the first level, there are Residential, Commercial, Public Services, Industry, Greenspace, Water, Transportation and Open land. Each class is further subdivided on level 2; for instance, Low-density House and Townhouse/Condominium/Apartment belong to Residential, whereas Shopping Mall, Office Building/Complex, Hotel/Shoplot/Market, and Commercial Entertainment (cinema, restaurant, playground) belong to the Commercial class. A high-resolution satellite image, i.e. better than 5 m, is suitable for identifying the first level, whereas a sub-meter resolution image can be used to map the second level, depending on how different a zone appears from its surroundings in the top view.
A detailed field survey was undertaken in the study area during the same period as the Sentinel-2 image acquisition, and all functional zones were digitised as shown in Figure 2. This digitised zoning layer is used as the reference data for accuracy assessment. To avoid bias in the assessment, the field survey and the subsequent digitising of functional zones on Google and Sentinel images were done independently by a group of local volunteers. All of them are students who have been living in Ho Chi Minh city for over 10 years and are learning GIS and mapping at a basic level. They have a good understanding of land-use zoning, and as long-term residents of Ho Chi Minh city, their awareness and definition of the usage of each city block are taken as the ‘truth’. Note that road and river/canal networks were not digitised, since we expected such networks to be easily extracted from OSM data. We focus on assessing the performance of our developed method in the extraction of the other classes.

Figure 2. Reference data.

As shown in Figure 2, residential areas are the dominant class in our study areas, whereas open land and industrial areas are located mainly in District 7 and are quite distinguishable. Figure 3 depicts three typical residential and commercial areas and how they appear on the Sentinel-2 image. A block of high-density small townhouses can be discriminated with assistance from street networks. Small patches of green space interspersed among low-density villas and terrace houses make this zone quite distinguishable from others. Commercial areas present a mixture of high-rise buildings and other complex structures. While the three types look distinguishable in very high-resolution images, this is not the case with 10-m resolution Sentinel-2 images. Moreover, the functional use of each building cannot be seen from the roof, and is expected to be further clarified with OSM data.

3. Methods
Figure 4 presents the workflow of the developed processing approach. Satellite images and OSM
layers are pre-processed separately prior to the decision-based fusion.

3.1. Sentinel-2 segmentation and classification


Sentinel-2 images are collected and processed directly on the Google Earth Engine platform (https://code.earthengine.google.com/). Firstly, low-pass Gaussian and morphological opening filters are applied to the 4-band Sentinel-2 images to decrease the level of complexity and enhance the separation between clusters. Subsequently, the 4-band Sentinel-2 image is transformed into a 1-band segmented cluster image using the Simple Non-iterative Clustering (SNIC) algorithm (Achanta and Süsstrunk 2017), a proven algorithm in terms of simplicity, computational efficiency, and segmentation quality. In SNIC, the size of the segmented clusters is defined by the ‘seed’ value. The bigger the seed value, the bigger the segmented clusters.

Figure 3. Typical commercial and residential areas as depicted with high-resolution Google Earth (left) and false colour composite Sentinel-2 images (right).

Figure 4. Sentinel-2 and OSM data fusion flowchart.

To define the most suitable seed value for our study area, we observed the statistical behaviour of cluster values as the cluster size increases. We took random samples, scattered across our study area and covering all functional land-use classes, and created a series of buffers from 10 pixels to over 30 pixels, corresponding to 100–300 m. The buffer areas representing the sample clusters were then applied to the first principal component image of the Sentinel-2 image, and the standard deviation values were computed. As the size increases, the standard deviation increases due to the inclusion of a greater variety of pixel values, and hence possibly of different land-use types. Between 180 and 220 m, the standard deviation showed almost no change, which suggested that a size of 200 m is the best overall choice in the study area. To focus on functional uses, we use the OSM road and river layers to mask out all relevant pixels; the remnants, therefore, are mainly buildings, green space and open land.
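The seed-size selection described above can be illustrated with a minimal Python sketch. It is not the authors' Google Earth Engine code: the function names, the 2% relative tolerance, and the numbers in `curve` are assumptions made only for illustration. The first helper computes the standard deviation inside a square buffer of a single-band image (standing in for the first principal component); the second looks for the range of buffer sizes over which that value stops changing.

```python
from statistics import pstdev


def buffer_std(image, row, col, half_width):
    """Standard deviation of pixel values in a square buffer around (row, col)."""
    rows = range(max(0, row - half_width), min(len(image), row + half_width + 1))
    cols = range(max(0, col - half_width), min(len(image[0]), col + half_width + 1))
    return pstdev([image[r][c] for r in rows for c in cols])


def plateau_range(curve, rel_tol=0.02):
    """Return (start, end) of the first run of sizes where the value barely changes.

    curve maps buffer size to mean standard deviation; rel_tol is an assumed
    threshold for "almost no change".
    """
    sizes = sorted(curve)
    start = None
    for small, large in zip(sizes, sizes[1:]):
        flat = abs(curve[large] - curve[small]) <= rel_tol * curve[small]
        if flat and start is None:
            start = small
        if not flat and start is not None:
            return start, small
    return (start, sizes[-1]) if start is not None else None


# Hypothetical mean standard deviations (buffer size in metres -> value).
curve = {100: 10.0, 140: 14.0, 180: 17.0, 200: 17.1, 220: 17.2, 260: 20.0}
start, end = plateau_range(curve)
seed_size = (start + end) // 2  # midpoint of the flat range
```

With these made-up values the flat range is 180–220 m and its midpoint is 200 m, which mirrors the behaviour reported in the text; real curves would come from averaging `buffer_std` over the random sample points.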
The next step is to categorise the spectral information of the remaining objects into building, open land and greenspace classes. To prepare for the classification, we scanned through the whole of Ho Chi Minh city and carefully investigated the appearance of different land-use classes in our study area. In general, the following classes could be observed quite distinctly: open land, green space, high-density old housing areas, and new low-density built-up housing areas. To some extent, we could also identify parts of factory and port areas, but in some cases they looked very similar to residential areas. The remaining groups of objects could be seen as a mixture of condominiums, apartments, office buildings, and townhouses. Based upon our investigation, seven sets of reference samples were prepared for the following classes: open land, greenspace, low-density houses, high-density old houses, industrial areas, mix building 1 and mix building 2.
Aiming to apply the developed methodology to other areas, and taking into consideration the diversity and complexity of different urban areas, we adopted simple K-means clustering. We set the number of classes to 9 instead of 7, as we expected that some additional classes might emerge during the clustering. In addition, as a generic implementation, we randomly selected 5000 samples to train the clusterer in our algorithm. After clustering, a spectral similarity comparison was carried out to match each clustered class with the corresponding informational reference class defined above. After segmentation and classification, each cluster is represented by a vector C = [ID, SP], where ID is the identification number and SP is the spectral class.
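As a rough illustration of the clustering and class-matching step, the following pure-Python sketch runs a plain K-means and then labels each cluster centre with the nearest reference spectrum. It is a toy under stated assumptions, not the Earth Engine clusterer used in the study: it uses two bands and two classes instead of four bands and nine classes, a deterministic start instead of random training samples, squared Euclidean distance as the similarity measure, and made-up reflectance values.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two spectra."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centres):
    """Index of the centre closest to point."""
    return min(range(len(centres)), key=lambda i: squared_distance(point, centres[i]))


def kmeans(points, k, iterations=10):
    """Plain Lloyd's K-means; returns the k cluster centres."""
    step = len(points) // k
    centres = [points[i * step] for i in range(k)]  # deterministic start (assumption)
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centres)].append(p)
        centres = [
            tuple(sum(band) / len(g) for band in zip(*g)) if g else centres[i]
            for i, g in enumerate(groups)
        ]
    return centres


def label_centres(centres, references):
    """Name each centre after the reference class with the closest mean spectrum."""
    names = list(references)
    ref_means = [references[n] for n in names]
    return [names[nearest(c, ref_means)] for c in centres]


# Hypothetical (red, near-infrared) mean reflectances of six segmented objects.
pixels = [(0.05, 0.45), (0.06, 0.50), (0.04, 0.40),
          (0.30, 0.35), (0.28, 0.33), (0.32, 0.37)]
references = {"greenspace": (0.05, 0.50), "open land": (0.30, 0.30)}

centres = kmeans(pixels, k=2)
names = label_centres(centres, references)
# Each object becomes a vector C = [ID, SP].
clusters = [(i, names[nearest(p, centres)]) for i, p in enumerate(pixels)]
```

Here the first three objects end up as greenspace and the last three as open land. Running more clusters than reference classes, as in the text, simply means several centres may receive the same name.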

3.2. OSM data cleaning


Except for road and river layers which are used as the masks in segmentation, four other layers are
used to assist the classification of functional uses: landuse (polygon), POI (polygon), building (poly-
gon) and POI (point). To prepare for the integration with segmented and classified clusters from
Sentinel-2, the attributes of each point/polygon need to be categorised according to the defined
classes as mentioned in Section 2. Specifically, the detailed reclassification is illustrated in Table 1.
In Ho Chi Minh city, like many other cities in Vietnam, we found many small business activities, such as coffee shops, convenience stores, or clothes shops, located in residential areas. Blindly using the information from OSM points of interest may create confusion in data integration and classification. Moreover, there are many cases in which the attribute fclass of a POI is NULL and further detail is given only in the name field. Those unclear points are filtered out and not used in the subsequent data fusion.
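The reclassification and filtering can be pictured as a lookup followed by a filter. The sketch below encodes two of the four layers from Table 1 and drops any record whose fclass is NULL or not listed. The record layout (dictionaries with `fclass` and `name` keys), the layer keys, and the function names are assumptions for illustration; attribute values are spelled as in Table 1, and the spelling in an actual OSM extract may differ.

```python
# Lookup for two of the four layers, with values as listed in Table 1.
RECLASS = {
    "landuse": {
        "Residential": {"residential"},
        "Commercial": {"commercial", "retail"},
        "Public services": {"cemetery", "military", "recreation-ground"},
        "Industrial": {"industrial"},
        "Green space": {"forest", "grass", "meadow", "park"},
    },
    "poi_point": {
        "Residential": {"apartments", "house", "residential"},
        "Commercial": {"bank", "hotel", "office", "retail"},
        "Public services": {"chapel", "church", "mosque", "temple",
                            "civic", "hospital", "school", "university"},
    },
}


def reclassify(layer, value):
    """Return the functional class for an attribute value, or None if unclear."""
    if value is None:
        return None
    for new_value, old_values in RECLASS[layer].items():
        if value.lower() in old_values:
            return new_value
    return None


def clean(layer, records):
    """Keep only records whose fclass maps to a functional class."""
    kept = []
    for record in records:
        new_value = reclassify(layer, record.get("fclass"))
        if new_value is not None:
            kept.append({**record, "class": new_value})
    return kept


# Hypothetical POI records: one usable, one with NULL fclass, one unlisted.
pois = [
    {"fclass": "bank", "name": "Bank A"},
    {"fclass": None, "name": "Coffee shop B"},
    {"fclass": "bench", "name": ""},
]
usable = clean("poi_point", pois)
```

Only the bank survives in this example. A name-based recovery of the NULL records is what the paper leaves to future natural language processing, so the sketch does not attempt it.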
To simplify the later data fusion, spatial-relation analysis is employed to unite the landuse polygons and POI polygons. If there is an overlap, the latter is normally contained within the former, and in most cases they have the same attributes. Interestingly, however, there are cases where they differ. For instance, a polygon from the POI layer is marked as park and lies within a polygon from the land-use layer that is marked as residential. This is a typical setting in urban areas, where a small park is built within a residential area. Both attributes are preserved when merging with satellite-extracted clusters, but the land-use attribute is given a higher priority. The POI attribute is considered when there is a cluster of comparable size to that of the POI polygon. Similarly, we unite the newly created polygon layer with the building polygon layer. After this step, we have prepared two OSM layers (landuse polygon and POI point) to fuse with satellite-extracted information.
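The priority rule for conflicting attributes can be sketched as a small decision function. The text does not quantify "comparable size", so the area ratio threshold of 0.5 below is purely an assumption, as are the function name and the example areas.

```python
def attribute_for_cluster(landuse_attr, poi_attr, cluster_area, poi_area, min_ratio=0.5):
    """Choose which preserved attribute to use for one satellite-extracted cluster.

    The land-use attribute has priority; the POI polygon attribute is used only
    when the cluster and the POI polygon are of comparable size. min_ratio is an
    assumed definition of "comparable" (smaller area / larger area).
    """
    if poi_attr is None or poi_attr == landuse_attr:
        return landuse_attr
    if landuse_attr is None:
        return poi_attr
    smaller, larger = sorted((cluster_area, poi_area))
    comparable = larger > 0 and smaller / larger >= min_ratio
    return poi_attr if comparable else landuse_attr


# A 2 ha park (POI polygon) inside a residential land-use polygon.
large_cluster = attribute_for_cluster("Residential", "Green space", 400000, 20000)
small_cluster = attribute_for_cluster("Residential", "Green space", 25000, 20000)
```

In this example the 40 ha cluster keeps the residential label, while the 2.5 ha cluster, close in size to the park polygon, takes the green space label.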

3.3. Data fusion


Though data-driven approaches with innovative and high-performance algorithms have been well developed and employed, there remains a big gap between data-driven outcomes and knowledge-driven information (Arvor et al. 2019), i.e. the understandable and meaningful information for practical uses. The ground data collected by the crowd provide details from the human view, i.e. knowledge-driven information. Figure 5 illustrates the gap between the knowledge-driven land-use functional zones and the clusters extracted from remote sensing, and the possible links between these two groups. To fill the gap, we build semantic rules and employ ontology-based classification.

Table 1. Re-classified attributes of the four OSM layers.

Layer | Attribute (values) | New value
Landuse (polygon) | fclass (residential) | Residential
Landuse (polygon) | fclass (commercial, retail) | Commercial
Landuse (polygon) | fclass (cemetery, military, recreation-ground) | Public services
Landuse (polygon) | fclass (industrial) | Industrial
Landuse (polygon) | fclass (forest, grass, meadow, park) | Green space
POI (polygon) | fclass (apartments, house, residential) | Residential
POI (polygon) | fclass (bank, bookshop, cafe, hotel, mall, office, restaurant, retail) | Commercial
POI (polygon) | fclass (cinema, college, community-centre, court-house, hospital, museum, school, university) | Public services
POI (polygon) | fclass (park) | Green space
Building (polygon) | type (apartments, house, residential) | Residential
Building (polygon) | type (commercial, hotel, office, retail) | Commercial
Building (polygon) | type (chapel, church, mosque, temple, civic, hospital, school, university) | Public services
Building (polygon) | type (industrial, manufacture, warehouse) | Industrial
POI (point) | fclass (apartments, house, residential) | Residential
POI (point) | fclass (bank, hotel, office, retail) | Commercial
POI (point) | fclass (chapel, church, mosque, temple, civic, hospital, school, university) | Public services

Figure 6 illustrates the ontology applied in the fusion and classification of the two data sets. As aforementioned, after segmentation each cluster has a spectral class, categorised as green space, open land, or building, presented as SGreen, SOpen and SBuilding in Figure 6. Further separation of building types may create more ambiguity and confusion. The reclassified attributes of the ground-data layers are presented as POpen, PGreen, PRe, PCom, PPub, and PInd, standing for open land, green space, residential, commercial, public services and industrial, respectively. In merging with the clusters extracted from the satellite image, spatial overlap is used for land-use polygons, whereas the majority rule is applied for the POI point layer. The semantic rules are built using Protege (https://protege.stanford.edu/), and the classification process follows the graph-based classification developed by Lampoltshammer and Wiegand (2015), which proved to be computationally efficient. The classification is formalised as follows.
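\[
\begin{aligned}
\text{OpenCluster} &\equiv \text{Cluster} \sqcap \text{hasSpectral}.\text{SOpen} \sqcap \text{overlap}.\text{POpen} \sqcap \text{majority}.\text{POpen} \\
\text{GreenCluster} &\equiv \text{Cluster} \sqcap \text{hasSpectral}.\text{SGreen} \sqcap \text{overlap}.\text{PGreen} \sqcap \text{majority}.\text{PGreen} \\
\text{ResidentCluster} &\equiv \text{Cluster} \sqcap \text{hasSpectral}.\text{SBuilding} \sqcap \text{overlap}.\text{PRe} \sqcap \text{majority}.\text{PRe} \\
\text{ComCluster} &\equiv \text{Cluster} \sqcap \text{hasSpectral}.\text{SBuilding} \sqcap \text{overlap}.\text{PCom} \sqcap \text{majority}.\text{PCom} \\
\text{PubCluster} &\equiv \text{Cluster} \sqcap \text{hasSpectral}.\text{SBuilding} \sqcap \text{overlap}.\text{PPub} \sqcap \text{majority}.\text{PPub} \\
\text{IndustryCluster} &\equiv \text{Cluster} \sqcap \text{hasSpectral}.\text{SBuilding} \sqcap \text{overlap}.\text{PInd} \sqcap \text{majority}.\text{PInd}
\end{aligned}
\]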
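To make the rules concrete, here is a minimal Python sketch that evaluates them for one cluster. It is a simplified stand-in for the Protege ontology and the graph-based classifier, not a reproduction of either. Each rule is read as a conjunction of a spectral class with the OSM evidence. Because Table 1 lists no green-space, open-land or industrial values for the POI point layer, the sketch assumes that a missing evidence source does not block a rule: at least one of overlap or POI majority must be present, and all that are present must agree. That reading is an assumption.

```python
from collections import Counter

# Rule name -> (required spectral class, required OSM class), as formalised above.
RULES = {
    "OpenCluster": ("SOpen", "POpen"),
    "GreenCluster": ("SGreen", "PGreen"),
    "ResidentCluster": ("SBuilding", "PRe"),
    "ComCluster": ("SBuilding", "PCom"),
    "PubCluster": ("SBuilding", "PPub"),
    "IndustryCluster": ("SBuilding", "PInd"),
}


def majority(values):
    """Most frequent value in a list, or None if the list is empty."""
    return Counter(values).most_common(1)[0][0] if values else None


def classify(spectral, overlap=None, poi_points=()):
    """Return the matching rule name for one cluster, or None if no rule fires.

    spectral: spectral class of the cluster (SOpen, SGreen or SBuilding)
    overlap: class of the overlapping land-use polygon, if any
    poi_points: classes of the POI points falling inside the cluster
    """
    evidence = {e for e in (overlap, majority(list(poi_points))) if e is not None}
    for name, (required_spectral, required_osm) in RULES.items():
        if spectral == required_spectral and evidence == {required_osm}:
            return name
    return None


commercial = classify("SBuilding", "PCom", ["PCom", "PCom", "PRe"])
green = classify("SGreen", "PGreen")
conflict = classify("SGreen", "PRe", ["PRe"])
```

The last call returns `None`: a vegetated cluster inside residential land use has no matching rule here, which is the kind of satellite-versus-ground mismatch the paper discusses. How the authors resolve such cases is not stated in this section.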

4. Results and discussions


Firstly, Sentinel-2 images of the study area were analysed on the Google Earth Engine platform. To match our ground data and other reference data sources, the cloud-free images acquired from November 2018 to March 2019 were selected and processed. The segmented and classified clusters are shown in Figure 7. Overall, the result is acceptable: it clearly shows the green space and the building areas as a mixture of high-density, low-density, and industrial and port areas.
As the result of a data-driven approach without any supervision from human knowledge, the extracted land-use map falls far short of the requirements of a functional land-use map for practical uses. Comparing Figures 2 and 7, one big area in District 7 was marked in the reference data as open land, planned for new construction. The satellite image was captured when vegetation fully covered this area, so the extraction from Sentinel-2 classified it as green space. More complicated, however, are the functional uses of buildings. The extraction from the satellite image alone was unable to show direct relevance to residential, commercial, public service or industrial areas, but only high-density, low-density and mixed building use.

Figure 5. The gap between human-knowledge and data-driven extraction for landuse mapping.

Figure 6. Applied ontology in fusion and classification.

We further checked the relationship of each class extracted from the Sentinel image with the ground-truth data presented in Figure 2. As depicted in Figure 8, there is high ambiguity in the classified results, especially among the functional uses of buildings. We could only confirm that high-density houses are most likely residential areas. The outcome confirmed the mismatch between what a satellite image can provide and the practical functional uses.
The integrated analysis with crowd-sourced ground data in an ontology-based framework produced the functional land-use map depicted in Figure 9. The legends in Figures 2 and 9 give the overall pixel distribution of each functional land-use class, which is similar between the two. Quantitatively, Table 2 presents the detailed accuracy assessment in the form of a confusion matrix, with its graphical illustration shown in Figure 10. The confusion matrix was computed from the extracted objects and the ground-truth polygons. For each cluster, we took the majority overlapping ground-truth class and treated it as true. Usually, the majority class occupied at least 40% of the entire cluster area. In total, 1634 samples were used for quantitative accuracy assessment.
The overall accuracy reached 78.70%, with acceptable producer and user accuracy for all classes.
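The way a reference label is assigned to each cluster can be sketched in a few lines. The function name and the overlap areas below are hypothetical; the sketch only shows the majority rule and the share of the cluster area that the winning class occupies.

```python
def reference_class(overlap_areas, cluster_area):
    """Return the majority ground-truth class and its share of the cluster area.

    overlap_areas maps each ground-truth class to the area of the cluster it covers.
    """
    best = max(overlap_areas, key=overlap_areas.get)
    return best, overlap_areas[best] / cluster_area


# Hypothetical 4 ha cluster; part of it is road and carries no ground-truth class.
areas = {"Residential": 18000, "Commercial": 12000, "Green space": 5000}
label, share = reference_class(areas, cluster_area=40000)
```

In this example the cluster is treated as Residential with a 45% share, in line with the observation that the majority class usually covers at least 40% of a cluster.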
The approach proved to be robust and easy to implement with open-source tools. The main challenge, however, is the reliability and completeness of the crowd-sourced data. Introducing wrong ground data may create more confusion and alter the final classification. In this study, we checked the text attributes of each crowd-sourced record against our local knowledge and filtered out the unclear records. For extended application, the pre-processing of OSM data prior to data fusion needs to be improved.
In our study area, plenty of information is available for District 1, the central business area, but information is lacking for the other two districts. It is true that there is more confusion and ambiguity of building uses among residential, commercial and public services in such a central business area than elsewhere. However, the lack of ground data had some impact on the final result. As mentioned above, in a typical Asian city like Ho Chi Minh city, the boundary between residential and commercial functions is unclear. We identified this confusion even in the ground-truth and OSM data, where a block with shops on the ground floor was marked as commercial although people live on the upper floors.

Figure 7. Segmented and classified result from Sentinel-2 image.

Figure 8. The distribution of extracted classes from Sentinel image within each knowledge-driven class.

Figure 9. Functional landuse map derived from Sentinel-2 and OSM data.

Table 2. Confusion matrix (rows: classified; columns: reference).

Classified | Resident | Commercial | Public | Industry | Green | Open | User Acc.
Resident | 532 | 107 | 26 | 1 | 3 | 16 | 77.66%
Commercial | 35 | 321 | 17 | 3 | 2 | 2 | 84.47%
Public | 28 | 21 | 179 | 2 | 1 | 3 | 76.50%
Industry | 20 | 12 | 5 | 113 | 2 | 3 | 72.90%
Green | 1 | 1 | 0 | 0 | 21 | 2 | 84.00%
Open | 19 | 13 | 0 | 1 | 2 | 120 | 77.42%
Producer Acc. | 83.78% | 67.58% | 78.85% | 94.17% | 67.74% | 82.19% |
Overall accuracy | 78.70%
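The figures in Table 2 can be recomputed directly from the counts. The short Python sketch below uses the standard definitions: user accuracy is the diagonal count divided by its row (classified) total, producer accuracy is the diagonal count divided by its column (reference) total, and overall accuracy is the diagonal sum divided by the number of samples. The variable and function names are illustrative only.

```python
CLASSES = ["Resident", "Commercial", "Public", "Industry", "Green", "Open"]

# Rows are classified classes, columns are reference classes (Table 2).
MATRIX = [
    [532, 107, 26, 1, 3, 16],
    [35, 321, 17, 3, 2, 2],
    [28, 21, 179, 2, 1, 3],
    [20, 12, 5, 113, 2, 3],
    [1, 1, 0, 0, 21, 2],
    [19, 13, 0, 1, 2, 120],
]


def accuracies(matrix):
    """Return (overall, user accuracies, producer accuracies) as fractions."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))
    user = [matrix[i][i] / sum(matrix[i]) for i in range(n)]
    producer = [matrix[i][i] / sum(row[i] for row in matrix) for i in range(n)]
    return correct / total, user, producer


overall, user, producer = accuracies(MATRIX)
total_samples = sum(sum(row) for row in MATRIX)
```

The counts sum to 1634 samples, 1286 of which lie on the diagonal, giving the reported overall accuracy of 78.70%. The lowest producer accuracies are Commercial (67.58%) and Green (67.74%), the two classes the discussion singles out.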

The low producer accuracy of green space is further evidence of the mismatch between satellite observation and practical usage on the ground. In Ho Chi Minh city, many pieces of grassland or garden are big enough to be captured and extracted from satellite images, yet they are in fact part of a company (commercial) compound or lie inside a residential condominium. The assessment, hence, showed them as wrongly classified. Additional information could help to clarify the ambiguity and provide a much better outcome. The main aim of this study, however, was to test what can be achieved using only free Sentinel and OSM data, and the concept was proven.

Figure 10. Distribution of reference objects in each detected class.

5. Conclusion
The paper has presented an experiment using only free and open data sources to produce a functional land-use map for a crowded and busy city area. We showed that, at 10-m resolution, the satellite image alone falls far short of an acceptable functional map as defined by human knowledge. The details missing from the 10-m top-view Sentinel image were filled in by ground-view OSM data, which closed the gap between the data-driven product and the human-knowledge requirement. The final map was produced with acceptable accuracy, and since both data sets are freely available worldwide, the approach is potentially applicable to other areas.
The developed methodology utilised the computing power of the Google Earth Engine platform, which enables fast object-based image analysis. The study also contributed an ontology-based framework for decision-based data fusion. The developed framework can be easily adopted in similarly complex urban settings. Moreover, all steps were implemented with open-source and accessible tools. Heavy computation can be expected with the ontology decision-based fusion. We plan to implement it on a cloud-computing platform for application to a large area or an entire city. Future studies will first look into better pre-processing of OSM data, for example by including natural language processing, and subsequently extend the application of the developed methodology to all Vietnamese cities.

Acknowledgments
The authors are grateful for the data and tool support from the Google Earth Engine and Open Street Map (via Geofabrik) platforms.

Data availability statement


The data that support the findings of this study are available on Google Earth Engine (https://earthengine.google.com) and Geofabrik (https://www.geofabrik.de).

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
The study is a part of the research project ‘Enhance the quality of remote-sensing-derived information with crowd-sourced data’ [102.99-2018.16] funded by The National Foundation for Science and Technology Development (NAFOSTED), Vietnam.

ORCID
T. T. Vu http://orcid.org/0000-0002-6656-8016
H. P. Phung http://orcid.org/0000-0003-4201-9917
L. D. Nguyen http://orcid.org/0000-0002-7581-6982

References
Achanta, Radhakrishna, and Sabine Süsstrunk. 2017. “Superpixels and Polygons Using Simple Non-iterative
Clustering.” In 2017 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2017, Honolulu, HI,
USA, July 21–26, 2017, 4895–4904. IEEE Computer Society.
Anugraha, Adindha Surya, Hone-Jay Chu, and Muhammad Zeeshan Ali. 2020. “Social Sensing for Urban Land Use
Identification.” ISPRS International Journal of Geo-Information 9 (9): 550.
Arvor, Damien, Mariana Belgiu, Zoe Falomir, Isabelle Mougenot, and Laurent Durieux. 2019. “Ontologies to
Interpret Remote Sensing Images: Why Do We Need Them?” GIScience and Remote Sensing 56: 911–939.
Cai, Guoyin, Huiqun Ren, Liuzhong Yang, Ning Zhang, Mingyi Du, and Changshan Wu. 2019. “Detailed Urban
Land Use Land Cover Classification At the Metropolitan Scale Using a Three-Layer Classification Scheme.”
Sensors 19 (14): 3120.
Cao, Wenpu, Lei Dong, Lun Wu, and Yu Liu. 2020. “Quantifying Urban Areas with Multi-source Data Based on
Percolation Theory.” Remote Sensing of Environment 241: 111730.
Du, Shouji, Shihong Du, Bo Liu, Xiuyuan Zhang, and Zhijia Zheng. 2020. “Large-scale Urban Functional Zone
Mapping by Integrating Remote Sensing Images and Open Social Data.” GIScience & Remote Sensing 57 (3):
411–430.
Goodchild, Michael F. 2007. “Citizens As Sensors: Web 2.0 and the Volunteering of Geographic Information.”
Geofocus 7: 8–10.
Hu, Tengyun, Jun Yang, Xuecao Li, and Peng Gong. 2016. “Mapping Urban Land Use by Using Landsat Images and
Open Social Data.” Remote Sensing 8 (2): 151.
Jia, Yuanxin, Yong Ge, Feng Ling, Xian Guo, Jianghao Wang, Le Wang, Yuehong Chen, and Xiaodong Li. 2018.
“Urban Land Use Mapping by Combining Remote Sensing Imagery and Mobile Phone Positioning Data.”
Remote Sensing 10 (3): 446.
Jiang, Shan, Ana Alves, Filipe Rodrigues, Joseph Ferreira Jr., and Francisco C. Pereira. 2015. “Mining
Point-of-interest Data From Social Networks for Urban Land Use Classification and Disaggregation.” Computers,
Environment and Urban Systems 53: 36–46.
Johnson, Brian A., Kotaro Iizuka, Milben A. Bragais, Isao Endo, and Damasa B. Magcale-Macandog. 2017.
“Employing Crowdsourced Geographic Data and Multi-temporal/multi-sensor Satellite Imagery to Monitor
Land Cover Change: A Case Study in An Urbanizing Region of the Philippines.” Computers, Environment and
Urban Systems 64: 184–193.
Kerle, N., and R. R. Hoffman. 2013. “Collaborative Damage Mapping for Emergency Response: The Role of Cognitive
Systems Engineering.” Natural Hazards & Earth System Sciences 13: 97–113.
Lampoltshammer, Thomas, and Stefanie Wiegand. 2015. “Improving the Computational Performance of Ontology-
Based Classification Using Graph Databases.” Remote Sensing 7: 9473–9491.
Liu, Xiaoping, Jialv He, Yao Yao, Jinbao Zhang, Haolin Liang, Huan Wang, and Ye Hong. 2017. “Classifying Urban
Land Use by Integrating Remote Sensing and Social Media Data.” International Journal of Geographical
Information Science 31: 1675–1696.
Momeni, Rahman, Paul Aplin, and Doreen Boyd. 2016. “Mapping Complex Urban Land Cover From Spaceborne
Imagery: The Influence of Spatial Resolution, Spectral Band Set and Classification Approach.” Remote Sensing
8: 88.
Pan, Gang, Guande Qi, Zhaohui Wu, Daqing Zhang, and Shijian Li. 2013. “Land-Use Classification Using Taxi GPS
Traces.” IEEE Transactions on Intelligent Transportation Systems 14 (1): 113–123.
Pei, Tao, Stanislav Sobolevsky, Carlo Ratti, Shih-Lung Shaw, Ting Li, and Chenghu Zhou. 2014. “A New Insight Into
Land Use Classification Based on Aggregated Mobile Phone Data.” International Journal of Geographical
Information Science 28: 1988–2007.
Pekkarinen, Anssi, Lucia Reithmaier, and Peter Strobl. 2009. “Pan-European Forest/non-forest Mapping with
Landsat ETM+ and CORINE Land Cover 2000 Data.” ISPRS Journal of Photogrammetry & Remote Sensing 64:
171–183.
Pimhidzai, Obert, Mathilde Sylvie Maria Lebrand, Roman Constantin Skorzus, Brian G. Mtonya, Charles Kunaka,
Steven M. Jaffee, Jung Eun Oh, and Pham Minh Duc. 2019. Vietnam Development Report 2019: Connecting
Vietnam for Growth and Shared Prosperity. Technical Report. World Bank.
http://documents.worldbank.org/curated/en/590451578409008253/Vietnam-Development-Report-2019-Connecting-Vietnam-for-Growth-and-Shared-Prosperity
Pourabdollah, Amir, Jeremy G. Morley, Steven Feldman, and Mike Jackson. 2013. “Towards An Authoritative
OpenStreetMap: Conflating OSM and OS OpenData National Maps’ Road Network.” ISPRS International
Journal of Geo-Information 2 (3): 704–728.
Qin, Xuebin, Shida He, Xiucheng Yang, Masood Dehghan, Qiming Qin, and Martin Jägersand. 2018. “Accurate
Outline Extraction of Individual Building From Very High-Resolution Optical Images.” IEEE Geoscience and
Remote Sensing Letters 15 (11): 1775–1779.
Ratti, Carlo, Dennis Frenchman, Riccardo Maria Pulselli, and Sarah Williams. 2006. “Mobile Landscapes: Using
Location Data From Cell Phones for Urban Analysis.” Environment and Planning B: Planning and Design 33:
727–748.
Schultz, Michael, Janek Voss, Michael Auer, Sarah Carter, and Alexander Zipf. 2017. “Open Land Cover From
OpenStreetMap and Remote Sensing.” International Journal of Applied Earth Observation and Geoinformation
63: 206–213.
See, Linda, Peter Mooney, Giles Foody, Lucy Bastin, Alexis Comber, Jacinto Estima, Steffen Fritz, et al. 2016.
“Crowdsourcing, Citizen Science or Volunteered Geographic Information? The Current State of Crowdsourced
Geographic Information.” ISPRS International Journal of Geo-Information 5 (5): 55.
https://www.mdpi.com/2220-9964/5/5/55
Soliman, Aiman, Kiumars Soltani, Junjun Yin, Anand Padmanabhan, and Shaowen Wang. 2017. “Social Sensing of
Urban Land Use Based on Analysis of Twitter Users’ Mobility Patterns.” PLoS ONE 12 (7): e0181657.
Tran, Thi Dong-Binh, Anne Puissant, Dominique Badariotti, and Christiane Weber. 2011. “Optimizing Spatial
Resolution of Imagery for Urban Form Detection: The Cases of France and Vietnam.” Remote Sensing 3: 2128–
2147.
Tu, Wei, Zhongwen Hu, Lefei Li, Jinzhou Cao, Jincheng Jiang, Qiuping Li, and Qingquan Li. 2018. “Portraying
Urban Functional Zones by Coupling Remote Sensing Imagery and Human Sensing Data.” Remote Sensing 10
(1): 141.
Vu, Tuong-Thuy, Pham Thi Mai Thy, and Lam Dao Nguyen. 2018. “Multiscale Remote Sensing of Urbanization in
Ho Chi Minh City, Vietnam: A Focused Study of the South.” Applied Geography 92: 168–181.
Wan, Taili, Hongyang Lu, Qikai Lu, and Nianxue Luo. 2017. “Classification of High-Resolution Remote-Sensing
Image Using OpenStreetMap Information.” IEEE Geoscience and Remote Sensing Letters 14 (12): 2305–2309.
Wang, Han, Erik Skau, Hamid Krim, and Guido Cervone. 2018. “Fusing Heterogeneous Data: A Case for Remote
Sensing and Social Media.” IEEE Transactions on Geoscience and Remote Sensing 56 (12): 6956–6968.
WorldBank. 2020. “Vietnam Urban Population 1960–2020.”
https://www.macrotrends.net/countries/VNM/vietnam/urban-population
Yang, Di, Chiung-Shiuan Fu, Audrey C. Smith, and Qiang Yu. 2017. “Open Land-use Map: A Regional Land-use
Mapping Strategy for Incorporating OpenStreetMap with Earth Observations.” Geo-spatial Information Science
20: 269–281.
