
Measuring surfaces in orthophotos based on color segmentation using k-means

Enriquez F.1[0000-0002-6689-8357], Delgado G.1[0000-0003-0413-9520], Arbito F.1[0000-0002-9680-282X], Cabrera A.1[0000-0002-1539-2292],
Iturralde D.1[0000-0003-2134-0453]
1 University of Azuay, Cuenca, Ecuador

efejaramillo@es.uazuay.edu.ec, gabrieldelgado@uazuay.edu.ec, teslamillingec@outlook.com,
apcabrera@uazuay.edu.ec, diturralde@uazuay.edu.ec

Abstract. The abstract should summarize the contents of the paper in short terms, i.e. 150-250 words.

Keywords: First Keyword, Second Keyword, Third Keyword.

1 Introduction.

Measurements of particular urban or rural areas support research projects in civil engineering, mining
engineering, high-voltage cable routing, biology, natural reserves, agriculture, etc. These measurements are usually
taken from satellite images, which are of limited use for detecting patterns because of their low quality and their
large proportion of grey shades. However, new techniques that apply artificial intelligence algorithms make it possible
to obtain better measurements [1].

[2].

Image segmentation is a fundamental process and, at the same time, a classic problem in most computer
vision applications. Usually, image segmentation is performed on a greyscale image, which means that only intensity
information is used. Color image segmentation, on the other hand, offers far more levels of discrimination, on the
order of millions [3].

The most widely used methods for image segmentation are based on histograms and on color association. Color
association methods fall into two branches: supervised and unsupervised. Supervised algorithms are simple because they
work with input data and the expected (output) data vector. However, they have a major disadvantage: the loss of
information related to color [ref]. By contrast, unsupervised methods are trained on input vectors with no specified
target. Instead of searching for specific data, the unsupervised algorithm autonomously finds a pattern in the data,
resulting in a small error rate and a higher segmentation success rate, though at a higher computational cost [3] [4].

Classification systems of this kind help to develop algorithms capable of searching for and categorizing patterns by
color in photographs and, in this context, orthophotographs.

Methodology.

This research focuses on obtaining a system able to measure areas using machine learning and unsupervised
algorithms, such as the one shown in Fig. 1. The algorithm takes as input an orthophotographic reconstruction of the
area, classifies each pixel according to its color components, groups the pixels, and creates masks based on color.

Orthophotography

Before the algorithm is applied, orthophotographs with different spatial resolutions are needed. Fig. 2 shows an
orthophotograph produced with “Agisoft PhotoScan” by processing 250 aerial images covering an area of 2.1 hectares
with a spatial resolution of 1.78 cm/px.

Clustering using k-means

K-means is one of the most widely used unsupervised learning algorithms for cluster analysis. The term was first used
by MacQueen. The goal of the algorithm is to cluster the data into k groups by assigning each point to the nearest
mean, which results in a partition of the data space into Voronoi cells.

The k-means algorithm groups data with high similarity so that each group is, at the same time, very different from
the other clusters. Similarity is determined by the distance of each data point to the mean value (centroid) of its
group.

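The algorithm minimizes the objective function in equation (1) using the method of least squares:

$$J = \sum_{j=1}^{k} \sum_{i=1}^{n} \lVert x_i^{(j)} - c_j \rVert^2 \qquad (1)$$

where $\lVert x_i^{(j)} - c_j \rVert$ is the distance between a data point $x_i^{(j)}$ and its centroid $c_j$. The k-means algorithm is shown in Fig. 3.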
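
The clustering step can be illustrated with a minimal sketch in plain Python. It is not the implementation used in this work: the function names, the random initialization and the stopping rule are assumptions made for illustration, and the four sample pixels are invented. Each pixel is assigned to its nearest centroid by squared Euclidean distance, and each centroid is then moved to the mean of its members, which is the least-squares criterion referred to as equation (1).

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two color vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return tuple(sum(component) / len(vectors) for component in zip(*vectors))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)),
                key=lambda j: squared_distance(p, centroids[j]))
            for p in points]


def kmeans(points, k, max_iterations=100, seed=0):
    """Lloyd-style k-means; initial centroids are k randomly sampled points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # An empty cluster keeps its previous centroid.
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return centroids, assign(points, centroids)


# Hypothetical pixels: two reddish and two bluish colors.
pixels = [(250, 10, 10), (240, 20, 15), (10, 10, 240), (20, 5, 250)]
centroids, labels = kmeans(pixels, k=2)
```

For an orthophotograph, `pixels` would be the flattened list of color vectors of the image and k would be 12, as in the results reported here.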

Image reconstruction

Image reconstruction consists of recreating the original image with the limited number of colors previously obtained
by the k-means algorithm. The image is reconstructed by replacing each pixel with the centroid value of its group.
Figure 4 shows the details of this process.
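
As a hedged illustration of this step, the sketch below rebuilds a small image from cluster labels and centroids. The function name `reconstruct`, the row-major layout of the labels and the sample values are assumptions, not part of the original system.

```python
def reconstruct(labels, centroids, width):
    """Replace every pixel by the centroid of its cluster.

    `labels` holds one cluster index per pixel in row-major order;
    the result is a list of rows, each `width` pixels long.
    """
    flat = [centroids[label] for label in labels]
    return [flat[start:start + width] for start in range(0, len(flat), width)]


# Hypothetical 2x2 image quantized to two colors.
centroids = [(245, 15, 12), (15, 8, 245)]
labels = [0, 0, 1, 1]
image = reconstruct(labels, centroids, width=2)
```

The reconstructed image contains at most k distinct colors, which is what makes the later per-color masks possible.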

Measuring algorithm with binary images

Area measurement consists of counting the pixels of the same color, which are then converted into units of area
(cm2, m2, km2) using the scale given by the orthophotographic procedure. This process is detailed in Fig. 5.

To obtain the number of pixels, the system uses a threshold mask to generate a binary image (two levels of
intensity). Equation (2) determines the threshold used to classify the pixels into the two groups, where LS is the highest
color value, LI is the lowest color value, 𝑃𝑖 is the original pixel value, and 𝑃0 is the final pixel value [20].
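
A minimal sketch of such a threshold mask follows. It assumes that a pixel becomes white (255) when every color component lies between LI and LS, and black (0) otherwise; the exact output levels, the per-channel comparison and the sample bounds are assumptions rather than values taken from this work.

```python
def threshold_mask(image, lower, upper):
    """Binary mask: 255 where lower <= pixel <= upper on every channel, else 0."""
    return [[255 if all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))
             else 0
             for pixel in row]
            for row in image]


# Hypothetical HSV-like pixels and bounds (LI = lower, LS = upper).
image = [[(60, 200, 200), (10, 50, 50)],
         [(70, 255, 180), (60, 20, 200)]]
mask = threshold_mask(image, lower=(35, 100, 100), upper=(85, 255, 255))
```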

To eliminate noise in the binary image, basic morphological transformations are applied, such as closing (dilation
followed by erosion), which helps to fill in the interstices of an image [20].
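
The effect of closing can be sketched as follows. The 3x3 square neighborhood and the choice to ignore neighbors that fall outside the image are assumptions made for this illustration; the structuring element actually used is not stated here.

```python
def _morph(image, pick):
    """Apply `pick` (max or min) over the in-bounds 3x3 neighborhood of each pixel."""
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(height, y + 2))
                      for i in range(max(0, x - 1), min(width, x + 2))]
            row.append(pick(window))
        result.append(row)
    return result


def dilate(image):
    return _morph(image, max)


def erode(image):
    return _morph(image, min)


def closing(image):
    """Morphological closing: dilation followed by erosion."""
    return erode(dilate(image))


# Hypothetical 7x7 mask: a 3x3 white block with a one-pixel hole in its center.
mask = [[0] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        mask[y][x] = 1
mask[3][3] = 0
closed = closing(mask)
```

In this example the hole at the center is filled while the outline of the block is unchanged, so the pixel count of the region goes from 8 to 9.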

The shape of an object in an image can be characterized by its moments. The zeroth moment M(0,0) is the sum of all
pixels of a certain value, and equation (3) gives the total number of white pixels [27]. Thus, the area in square
meters is given by equation (4), where 𝐷𝑝 is the spatial resolution (cm2/px).
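
The conversion from pixel count to area can be sketched as below. The sketch assumes that the area of one pixel is the square of the linear resolution in cm/px and that 1 m2 = 10 000 cm2; the function names and the sample mask are hypothetical.

```python
def white_pixel_count(mask):
    """Zeroth moment of a binary mask: the number of non-zero (white) pixels."""
    return sum(1 for row in mask for value in row if value)


def area_m2(mask, resolution_cm_per_px):
    """Area covered by the white pixels, in square meters."""
    pixel_area_cm2 = resolution_cm_per_px ** 2      # cm2 covered by one pixel
    return white_pixel_count(mask) * pixel_area_cm2 / 10_000.0


# Hypothetical mask with 250 white pixels at 10 cm/px:
# 250 px * 100 cm2/px = 25 000 cm2 = 2.5 m2.
mask = [[255] * 50 for _ in range(5)]
area = area_m2(mask, resolution_cm_per_px=10.0)
```

Under the same assumption, one pixel covers about 3.17 cm2 at 1.78 cm/px and 100 cm2 at 10 cm/px, so a single misclassified pixel weighs roughly 30 times more in the coarser image.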

Quality measure and Validity index

To verify which color space gives the best result, the Calinski-Harabasz Index (CHI) is used. This index is a
variance ratio criterion that gives an indication of the structure of the data. The method is described in equation (5) [23].

where N is the total number of data points, k is the total number of clusters, 𝑆𝑆𝑏 is the between-cluster variance and
𝑆𝑆𝑤 is the within-cluster variance [23]. A larger CHI value indicates better clustering.
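
A minimal sketch of the index, assuming its usual definition CHI = [SSb / (k - 1)] / [SSw / (N - k)], with SSb and SSw computed as sums of squared distances between and within clusters. The function name and the one-dimensional sample data are illustrative only, and the sketch does not handle the degenerate cases k = 1 or SSw = 0.

```python
def calinski_harabasz(points, labels):
    """Variance ratio criterion for a labeled set of points."""
    def mean(vectors):
        return tuple(sum(c) / len(vectors) for c in zip(*vectors))

    def squared_distance(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    n, k = len(points), len(clusters)
    overall = mean(points)
    # Between-cluster dispersion: cluster size times squared distance
    # from the cluster centroid to the overall mean.
    ss_b = sum(len(members) * squared_distance(mean(members), overall)
               for members in clusters.values())
    # Within-cluster dispersion: squared distance of each point to its centroid.
    ss_w = sum(squared_distance(point, mean(members))
               for members in clusters.values() for point in members)
    return (ss_b / (k - 1)) / (ss_w / (n - k))


# Hypothetical one-dimensional data with two well-separated clusters:
# SSb = 100, SSw = 4, so CHI = (100 / 1) / (4 / 2) = 50.
chi = calinski_harabasz([(0,), (2,), (10,), (12,)], [0, 0, 1, 1])
```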

RESULTS.

The analyzed data were obtained from orthophotographs with spatial resolutions of 1.78 cm/px and 10 cm/px, covering
an area of approximately 2 hectares, using a segmentation with k=12.

Figure 4 shows the results obtained after the image reconstruction with the values given by the algorithm after the
clustering applied to the orthophotographs in HSV color space. It is important to mention that among the 12 final
clusters white is included, which is discarded in the masks because it belongs to the background of the image.

(a) Spatial resolution of 1.78 cm/px. (b) Spatial resolution of 10 cm/px.

Orthophotograph reconstruction results (HSV) with different spatial resolutions and k=12

As expected, the reconstruction values (centroids given by the k-means algorithm) in Fig. 6a and Fig. 6b are
similar, although their spatial resolutions differ.

For each centroid (color) obtained by the k-means algorithm, the corresponding area is calculated.
The results are then shown to the user, who decides which ones are of interest.

The measurements of each segmentation with its HSV value are shown in Table 1.

Table 1. AREA MEASUREMENTS OF MASKS IN HSV (1.78 CM/PX)

The same procedure is applied to segment the image with a spatial resolution of 10 cm/px in the HSV color space.
Table 2 shows the dimensions of the masks with their respective segmentation values.

Table 2. AREA MEASUREMENTS OF MASKS IN HSV (10 CM/PX)

For comparison, the algorithm described above is also applied in the RGB color space. The measurements of the RGB
masks for the 1.78 cm/px image are shown in Table 3.

Table 3. AREA MEASUREMENTS OF MASKS IN RGB (1.78 CM/PX)

The procedure is repeated by applying the algorithm to the 10 cm/px image in the RGB color space. The result is shown
in Table 4.

Table 4. AREA MEASUREMENTS OF MASKS IN RGB (10 CM/PX)

An important aspect in assessing the proposed algorithm is the execution time, measured from the fitting of the data
samples to the area measurement. Table 5 shows the execution time of the algorithm using masks, which is the sum of
the execution times of the sub-processes that make up the algorithm.

Table 5. Execution time using masks in HSV color space

Similarly, Table 6 shows the algorithm execution times for the masks in the RGB color space.

Table 7 shows the index values obtained using the CHI criterion, where the larger value is the better one.

Table 7. Index values for clustering using k-means for the orthophotographs

Table 8 shows the results of the clusters selected by the user (green tones) to obtain the total area measurement.

Table 8. Comparisons between measurements and execution time

To calculate the errors, a measurement based on polygon approximation is taken as the reference (Figure 6). The
resulting area of green space is 10 558.4 m2.

Figure 6. Area approximation using polygons.


DISCUSSION

Tables 1, 2, 3 and 4 show that the centroid values change as the spatial resolution of the image changes, because
the amount of information contained in each pixel (color) is modified. The centroid values also change with the
color space, because the HSV color space is a nonlinear transformation of the RGB color space.
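
The consequence of this nonlinearity for the centroids can be illustrated with a small sketch that uses Python's standard `colorsys` module. The two colors are chosen for illustration only; the sketch averages them once in RGB and once in HSV and shows that the two means do not correspond to the same color.

```python
import colorsys

red = (1.0, 0.0, 0.0)
blue = (0.0, 0.0, 1.0)


def mean(vectors):
    """Component-wise mean of a list of vectors."""
    return tuple(sum(c) / len(vectors) for c in zip(*vectors))


# Centroid computed in RGB, then expressed in HSV.
hsv_of_rgb_mean = colorsys.rgb_to_hsv(*mean([red, blue]))

# Centroid computed directly on the HSV coordinates.
mean_of_hsv = mean([colorsys.rgb_to_hsv(*red), colorsys.rgb_to_hsv(*blue)])
```

The RGB mean has a hue of 5/6 (a dark magenta), whereas the plain average of the two HSV vectors has a hue of 1/3 (green). A k-means centroid, being a mean, therefore lands on a different color depending on the space in which it is computed.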

The time needed to obtain the clustering result is very similar for RGB and HSV. However, the same cannot be said of
the measurements obtained with the two color spaces. The values in Table 7 show that the HSV color space yields
better clustering, expressed as a larger CHI value than the one obtained for RGB.

Comparing the measurement values shown in Table 8 with the data obtained with the “Agisoft PhotoScan” software, the
error is 0.08% for the image with a spatial resolution of 1.78 cm/px, and 10.61% for the image with a 10 cm/px
resolution. Thus, we can conclude that the error increases as the spatial resolution of the orthophotograph becomes coarser.
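
For reference, a percentage error of this kind can be computed as sketched below, assuming it is the absolute difference relative to the reference area. The reference value of 10 558.4 m2 is the one stated in the results; the measured value in the example is hypothetical and is not a result of this work.

```python
def relative_error_percent(measured_m2, reference_m2):
    """Absolute error relative to the reference area, as a percentage."""
    return abs(measured_m2 - reference_m2) / reference_m2 * 100.0


REFERENCE_M2 = 10558.4                   # polygon-based reference area
hypothetical_measurement_m2 = 10000.0    # illustrative value only
error = relative_error_percent(hypothetical_measurement_m2, REFERENCE_M2)
```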
