

Automatic Classification of Common Building Materials from 3D Terrestrial Laser Scan Data

Liang Yuan 1, Jingjing Guo 2, Qian Wang 2*

1 School of Construction Management and Real Estate, Chongqing University, Chongqing 400045, China
2 Department of Building, School of Design and Environment, National University of Singapore, 4 Architecture Drive, Singapore 117566
* Corresponding author. Email: bdgwang@nus.edu.sg

Abstract

Automatic building material classification has been a popular research interest over the past decades because it is useful for construction management and facility management. Most existing methods for automatic material classification are based on 2D images and rely on the visual features of building materials. A terrestrial laser scanner (TLS) with a built-in camera can generate a set of coloured laser scan data that contain the surface geometries of building materials. The laser scan data include not only the visual features of building materials but also other attributes such as material reflectance and surface roughness. With more attributes provided, laser scan data have the potential to improve the accuracy of building material classification. Therefore, this research aims to develop a TLS data-based classification method for common building materials using machine learning techniques. The developed technique uses material reflectance, HSV colour values, and surface roughness as the features for material classification. A database containing the laser scan data of ten common building materials was created and used for model training and validation with machine learning techniques. Different machine learning algorithms were compared, and the best algorithm showed an average classification accuracy of 96.7%, which demonstrated the feasibility of the developed method.

Keywords: Building material classification; Terrestrial laser scanning; Machine learning

1. Introduction

In the past decade, automatic building material classification based on state-of-the-art information technologies has been a promising research direction in the architecture, engineering, and construction (AEC) industry. Automatic material classification can improve the efficiency of a variety of tasks including damage detection and onsite material management and tracking [1,2]. Moreover, building information modelling (BIM) has received extensive attention from both the academic and industrial communities and has been increasingly adopted in the design, construction, and operation stages of construction projects. It has become an important task to generate as-is building information models (BIMs) that reflect the as-is conditions of buildings, which can serve various applications such as operation and maintenance (O&M) of existing buildings and building performance analysis [3,4]. An as-is BIM contains not only the geometric information of a building but also non-geometric information of building elements, including building materials [5]. The material information is essential for many BIM applications such as the three-dimensional representation of objects and building energy simulation. Therefore, there is a high demand for automatic building material classification in order to generate semantically rich as-is BIMs containing material information.

Applying machine learning techniques for automatic building material classification has been a popular approach in the AEC industry over the past years. The proposed material classification methods can be categorized as image-based or laser scan data-based according to the type of the collected data. Currently, image-based material classification methods are extensively used. The core technique of image-based methods focuses on using the visual features of building materials, such as colour, texture, roughness, and projection [3,6-8], for automatic classification. However, image-based methods are heavily influenced by illumination conditions. Different illumination conditions strongly affect the visual characteristics of materials, causing difficulty for image-based building material classification. Moreover, poor textures on objects and unknown viewpoints also negatively affect the robustness and accuracy of image-based material classification [8].

Terrestrial laser scanning (TLS) provides a new and promising perspective for automatic building material classification. A TLS with a built-in camera can capture not only the visual features but also intrinsic properties of building materials, such as the material reflectance. Meanwhile, unlike passive imaging, which is critically dependent on environmental lighting conditions, TLS uses an active measurement technique with infrared light, which is not affected by environmental illumination conditions [9]. Therefore, TLS has great potential to achieve more accurate material classification, considering the additional types of information provided and its higher robustness to changeable lighting conditions. Moreover, TLS has been extensively adopted for constructing as-is BIMs that represent the as-is conditions of buildings [10], due to its high measurement accuracy and speed. As a result, using TLS data for building material classification does not require extra data collection if TLS data are already collected for constructing as-is BIMs. Despite the advantages of TLS, few previous studies have adopted TLS data for building material classification.

This research aims to develop an automatic classification method for common building materials based on TLS data. The developed method will be useful for various applications, especially when using TLS data to construct semantically rich as-is BIMs where material information is needed. In the rest of this paper, Section 2 presents a comprehensive literature review of automatic material classification. Based on the review, the features for TLS data-based material classification are determined in Section 3. Then, Section 4 describes a validation experiment under real-world environments to validate the accuracy of the proposed method on ten common building materials, and the experimental results are presented and discussed in Section 5. The limitations of this study and potential future works are discussed in Section 6, and lastly Section 7 concludes this study.

2. Related works

A building material is any material that can be used for construction purposes, and the term construction material is also used as a synonym in some papers. Some previous studies have contributed to automatic building material classification using image-based methods or TLS data-based methods.

Image-based material classification using machine learning techniques [3,6-8,11] has been the dominant non-destructive material classification approach. Image-based material classification approaches can be classified into two types: pixel-based methods and object-based methods. In pixel-based methods, each pixel is classified into a material category. For example, using colour as a feature, Son et al. [6] explored the performances of single classifiers and ensemble classifiers for classifying concrete, steel, and wood. Zhu and Brilakis [11] used material colour and texture as features to extract concrete material regions from construction site images based on different classification algorithms, including support vector data description, C-support vector classification, and artificial neural networks. On the other hand, object-based methods classify each image patch into a material category. For example, Han and Golparvar-Fard [3] used material colour and texture as features and multiple discriminative machine learning models as classifiers to classify square image patches into fifteen common construction materials. Dimitrov and Golparvar-Fard [7] used texture and colour as features and various support vector machine algorithms as classifiers to classify 200x200 pixel image patches into twenty typical construction materials under real-world construction site conditions. Similarly, Lu et al. [8] developed a method to recognize building objects from images and classify the objects into material categories using five features, including projection results from two directions, ratio values, RGB colour values, roughness values, and hue values.

On the other hand, some other studies have explored building material classification based on TLS data. The classification approaches used in these studies were point-based, which is similar to the pixel-based methods for images. The reflected intensity values collected by TLS were first adopted for material classification. For instance, Franceschi et al. [12] focused on the recognition of rocks in simple sedimentary successions that mainly consisted of limestones and marls using the intensity values from TLS. The results of a series of experiments indicated that the intensity values could provide a reliable method to classify the rocks in outcrop conditions. Armesto-González et al. [13] processed laser scan data using digital image processing techniques to detect damages on stony materials of historical buildings. Unsupervised classification algorithms were used for the classification of 2D reflected intensity images derived from 3D laser scan data. This work showed the potential of using the reflected intensity from TLS for the recognition and characterization of certain damages in building materials of historical buildings. Riveiro et al. [14] presented a novel algorithm for automatic segmentation of masonry blocks from 3D laser scan data based on the reflected intensity values. Moreover, Sánchez-Aparicio et al. [15] developed an approach to detecting and classifying different pathological processes commonly present on the masonry of cultural heritage by combining the reflected intensity values and point cloud coordinates acquired by TLS.

In addition to the reflected intensity values, a TLS is often equipped with a built-in camera to capture the colour information of the scanned targets, and some studies have utilized this colour information for material classification. For example, Hassan et al. [16] confirmed the feasibility of material identification using the reflected intensity and Red-Green-Blue (RGB) values from TLS. They obtained the scan data of structural concrete, light-weight concrete, and clay brick samples for experiments. The experimental results showed that the scanned materials had different reflected intensity distributions, and the recorded RGB colour values could be used as a secondary parameter for material classification. Valero et al. [17] achieved automatic segmentation of individual masonry units and mortar regions in digitized rubble stone constructions based on coloured laser scan data acquired by TLS. The scan data of the target surface were converted into 2D depth maps as one feature for automatic segmentation, and colour information was used as another feature. The experimental results demonstrated the effectiveness of the technique. Also, Valero et al. [18] achieved automatic detection and classification of material defect areas in ashlar masonry walls by using TLS data including point cloud coordinates, reflected intensity, and RGB colour values. A supervised machine learning technique was used in that study.

Although image-based material classification methods have made great advancements and are more extensively adopted than TLS data-based methods, their applications in real-world environments still face challenges due to complex field conditions and their dependence on environmental illumination conditions. The ability to capture more types of information and the use of an active measurement technique make TLS data-based material classification more promising. However, previous studies on TLS data-based material classification are either based only on laboratory experiments in a controlled environment or limited to the recognition of only one or two categories of materials. To tackle the limitations of previous studies, this study examines the feasibility of using TLS data for the classification of ten different common building materials in real-world environments.

3. Determination of features

This section aims to determine the features used for automatic building material classification. For each scan point, the TLS collects a set of attributes comprising the reflected laser beam intensity, RGB colour values, and x-y-z coordinates. A set of features for material classification are calculated from the collected attributes. The features include 1) material reflectance, 2) colour, and 3) surface roughness, as explained in the following subsections.

3.1 Material reflectance

3.1.1 Theoretical fundamentals

For each scan point, the TLS provides a reflected laser intensity value (I_r), which is determined by the type of material and the scanning parameters. Although I_r is recommended as a feature for material classification in some studies [14,16], different materials are likely to present similar I_r values because various scanning parameters can also significantly affect I_r. Instead, among all the factors that affect I_r, the material reflectance ρ is the only intrinsic property of a certain material. In other words, the same material always has the same ρ value even when the other factors vary. However, it is worth noting that, for a specific material, the ρ values estimated from TLS data can differ within a certain range. Therefore, the material reflectance ρ is adopted as a feature for material classification in this paper, and the ρ value is estimated from the I_r value.

Previous studies have investigated how to obtain the ρ value from the I_r value. Tan and Cheng [19] proposed a polynomial calculation method to recover the absolute ρ value of the scanned surface by eliminating the effects on I_r of the scanning range R from the TLS to the target and of the laser beam incident angle θ. However, this method requires a target with known ρ as a reference in the scanning scene. Franceschi et al. [12] proposed a linear fitting method to obtain the ρ values of different stone materials. However, their study was limited to cases with R > 15 m, and the same reference target (with unknown ρ) was required in different scanning scenes. Because it is inconvenient to place a reference target in real-world environments, this study develops a reference target-free method for calculating the ρ values of different materials.

Because the laser intensity is the laser power per unit area [20], Equation (1) can be obtained:

\[ I_r = P_r / \Delta S \tag{1} \]

where P_r is the received laser signal power and ΔS is the unit area.

For each scan point, P_r is determined by the sensor, the target, and atmospheric parameters. Equation (2) illustrates the specific factors that determine the received signal power P_r:

\[ P_r = \frac{P_t D_r^2}{4\pi R^4 \beta_t^2}\,\eta_{sys}\,\eta_{atm}\,\sigma \tag{2} \]

where P_t is the transmitted power, σ is the effective target cross-section, D_r is the receiver aperture diameter, R is the range from the TLS to the target, β_t is the laser beam width, η_atm is the atmospheric transmission factor, and η_sys is the system transmission factor [21]. Equation (2) is only valid under two assumptions: 1) the receiver field of view matches the beam divergence, and 2) the laser emitter and the laser detector have the same distance to the scanned target. The cases in this study fulfil both assumptions.

According to [21], Equation (2) can be simplified into Equation (3):

\[ P_r = \frac{P_t D_r^2 \rho}{4R^2}\,\eta_{sys}\,\eta_{atm}\cos\theta \tag{3} \]

where θ is the incident angle of the laser beam on the scanned object.

Three conditions need to be met for simplifying Equation (2) into Equation (3): 1) the entire laser footprint is reflected on only one surface (extended target) and the laser footprint area is circular, 2) the laser footprint area has a solid angle Ω of π steradians (Ω = 4π for scattering into the whole sphere), and 3) the target surface presents Lambertian scattering characteristics [22]. In the case of TLS measurement in the AEC industry, the size of the reflecting surface usually greatly exceeds the size of the laser footprint, and the scanned object can be regarded as an extended target with a solid angle of π steradians. Meanwhile, the laser reflection on the target surface can be approximately treated as Lambertian scattering for most building materials [23].

Equation (4) can be obtained by combining Equations (1) and (3):

\[ I_r = \frac{I_t D_r^2 \rho}{4R^2 \Delta S}\,\eta_{sys}\,\eta_{atm}\cos\theta \tag{4} \]

In the AEC industry, the range R is relatively short (no more than 20 m). In this case, the η_atm value within the scanned region can be regarded as a constant. In addition, the I_t, D_r, and η_sys values are also fixed when using the same TLS for scanning. Hence, η_atm, I_t, D_r, η_sys, and ΔS can be substituted by a total coefficient K, and Equation (4) can be simplified into Equation (5):

\[ I_r = \rho K \frac{\cos\theta}{R^2} \tag{5} \]

In Equation (5), the R and θ values of a scan point can be calculated from the x-y-z coordinates as follows. As shown in Fig. 1, a laser scan point P_i can be represented in both the spherical coordinate system as P_i(R, α, β) and the Cartesian coordinate system as P_i(x, y, z). The relations between the two coordinate systems are shown in Equation (6):

\[ x = R\cos\alpha\cos\beta, \quad y = R\sin\alpha\cos\beta, \quad z = R\sin\beta \tag{6} \]

where α is the horizontal angle of the laser beam in the TLS's coordinate system, and β is equivalent to θ.

Fig. 1 The relationship between the spherical coordinate system and the Cartesian coordinate system for a scan point P_i
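For illustration, the computation behind Equations (6)-(7) can be sketched in a few lines of MATLAB; the point matrix P and variable names below are illustrative, not code from the paper.

```matlab
% Minimal MATLAB sketch: derive R and cos(theta)/R^2 for scan points from
% their x-y-z coordinates in the scanner's frame, per Equations (6)-(7).
P = [1.2 0.8 0.5; 3.1 -1.4 1.9];               % two example scan points [x y z]
R = sqrt(sum(P.^2, 2));                         % range R = sqrt(x^2 + y^2 + z^2)
cosTheta = sqrt(P(:,1).^2 + P(:,2).^2) ./ R;    % cos(beta), with beta taken as theta
regressor = cosTheta ./ R.^2;                   % cos(theta)/R^2, the regressor in Eq. (8)
```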

Combining Equations (5) and (6), Equation (7) can be obtained as:

\[ I_r = \rho K \frac{\cos\theta}{R^2} = \rho K \frac{\sqrt{x^2 + y^2}}{(x^2 + y^2 + z^2)^{3/2}} \tag{7} \]

According to Equation (7), I_r should be proportional to cosθ/R². However, several previous studies show that I_r and cosθ/R² present linear rather than proportional relationships when obtaining scan data by TLS [23-25]. This may be because the TLS normalizes the I_r value in some way such that I_r is not proportional to P_r, which makes Equation (7) invalid. According to previous studies, these linear relationships can be expressed as:

\[ I_r = \rho K \frac{\cos\theta}{R^2} + b \tag{8} \]

where b is a constant term.

3.1.2 Experimental validation

Experiments were conducted to validate Equation (8) using real TLS data. Taking realistic scanning scenes of buildings into consideration, this study assumes that the range R is less than 10 m. Although previous studies show linear relationships between I_r and cosθ/R², these studies consider only long-distance scanning (with R larger than 10 m). For R within 10 m, the relationship between I_r and cosθ/R² has not been investigated in previous studies. Therefore, this study examined the relationship using a set of experiments as follows.

In the first step, we examined the relationships between I_r and cosθ and between I_r and 1/R² separately. When one factor (e.g. cosθ) is fixed, I_r should show a linear relationship with the other factor (e.g. 1/R²). This study used a FARO FocusS70 laser scanner to collect scan data, and the scan data of white latex painting and white metal plate were taken as examples. Data points with different θ values but the same R value were selected first. The I_r and cosθ values of these selected points were plotted in Fig. 2(a). Two different lines were fitted to the scattered data points of the two building materials separately, as shown in Fig. 2(a). The result showed that I_r presented an approximately linear relationship with cosθ, with an R-squared (r²) value larger than 0.9. Then, data points with different R values but the same θ value were selected. The I_r and 1/R² values of these selected points were plotted in Fig. 2(b). According to Fig. 2(b), the relationship between I_r and 1/R² presented three different trends when 1/R² < 0.06 (approximately 4 m < R < 10 m), 0.06 < 1/R² < 0.25 (approximately 2 m < R < 4 m), and 1/R² > 0.25 (approximately R < 2 m). When 1/R² < 0.06 or 0.06 < 1/R² < 0.25, the relationship between I_r and 1/R² was approximately linear with r² > 0.9. When 1/R² > 0.25, the r² values of the linear relationship between I_r and 1/R² were about 0.8.

Fig. 2 The relationship between I_r and cosθ and 1/R²: (a) the relationship between I_r and cosθ when R is fixed, where ρ_m and ρ_p are the respective material reflectance of metal plate and latex painting, and (b) the relationship between I_r and 1/R² when θ is fixed, where K₁, K₂ and K₃ are the different values of K.

We tested the relationships between I_r and cosθ and 1/R² for various common building materials. All test results demonstrated that I_r had a linear relationship with cosθ. Also, I_r had different linear relationships with 1/R² when 1/R² < 0.06 (approximately 4 m < R < 10 m), 0.06 < 1/R² < 0.25 (approximately 2 m < R < 4 m), and 1/R² > 0.25 (approximately R < 2 m). The different linear relationships over different R ranges were also observed in other studies, because the scanner is equipped with a brightness reducer to protect the scanner from extremely high received laser intensity [25,26].

In the second step, we examined the linear relationship between I_r and cosθ/R². As discussed before, the relationships between I_r and 1/R² showed three different linear functions when 4 m < R < 10 m, 2 m < R < 4 m, and R < 2 m. As a result, the relationship between I_r and cosθ/R² can be written as three separate linear functions for different R values:

\[
\begin{cases}
I_r = \rho K_1 \dfrac{\cos\theta}{R^2} + b_1, & 4\,\text{m} < R < 10\,\text{m} \\[4pt]
I_r = \rho K_2 \dfrac{\cos\theta}{R^2} + b_2, & 2\,\text{m} < R < 4\,\text{m} \\[4pt]
I_r = \rho K_3 \dfrac{\cos\theta}{R^2} + b_3, & R < 2\,\text{m}
\end{cases} \tag{9}
\]

where K₁, K₂, and K₃ are three different values of the coefficient K, and b₁, b₂, and b₃ are three different constant terms.

We again chose white metal plate and white latex painting as examples. Random samples of laser scan data of the two materials with different I_r and cosθ/R² values were selected and plotted in Fig. 3. All the sampled data had R values within 10 m. As shown in Fig. 3, the relationship between I_r and cosθ/R² for the two materials showed three different linear relations when 4 m < R < 10 m, 2 m < R < 4 m, and R < 2 m. Fitting the linear function I_r = ρK(cosθ/R²) + b to the six sets of scattered points, the obtained r² values of all the fitted lines were larger than 0.9, showing high goodness of fit. We also tested laser scan data of other materials, and similar conclusions were drawn. In conclusion, the experimental results showed that Equation (9) was valid: the I_r value had three different linear relationships with cosθ/R² when 4 m < R < 10 m, 2 m < R < 4 m, and R < 2 m. Moreover, Fig. 3 also demonstrates that the I_r values of different materials may be equal at certain R and θ values. However, the slopes ρK of the fitted lines were always different because the ρ values of the two materials were different. Therefore, we can obtain the ρ value of a material by linear regression fitting of the scan data of this building material, and then use the ρ value as a feature for material classification. Considering that scanning scenes with 2 m < R < 4 m or 4 m < R < 10 m are much more common than R < 2 m for buildings, this study limits the research scope to R values between 2 m and 10 m.

Fig. 3 Relation between I_r and cosθ/R², where LP_i and MP_i are the respective data points of white latex painting and white metal plate, and l_i is the fitted line of each set of data points.
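The piecewise fitting behind Equation (9) can be reproduced with a short MATLAB sketch like the one below. It assumes column vectors Ir, R, and x (the cosθ/R² values, e.g. computed as in the earlier sketch) for one material; the variable names are illustrative.

```matlab
% Minimal sketch: fit I_r = rho*K*(cos(theta)/R^2) + b separately in each
% range band of Equation (9) and report the goodness of fit.
bands = [4 10; 2 4; 0 2];                       % the three R intervals of Eq. (9)
for k = 1:size(bands, 1)
    in  = R > bands(k,1) & R < bands(k,2);      % points falling in this band
    p   = polyfit(x(in), Ir(in), 1);            % p(1) = rho*K_k (slope), p(2) = b_k
    res = Ir(in) - polyval(p, x(in));           % fit residuals
    r2  = 1 - sum(res.^2) / sum((Ir(in) - mean(Ir(in))).^2);  % R-squared of the fit
    fprintf('%g m < R < %g m: slope %.4g, intercept %.4g, r^2 %.3f\n', ...
            bands(k,1), bands(k,2), p(1), p(2), r2);
end
```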

3.1.3 Calculation of material reflectance from laser scan data

For each laser scan point, the ρ value is calculated for material classification in the following three steps. First, the neighbouring points of a scan point are obtained by finding all points within an s₁ × s₁ × s₁ bounding box centred at this scan point. As shown in Fig. 4, all the points within the blue bounding box become the neighbouring points of the blue point P_i. Second, the cosθ/R² values of all the neighbouring points are calculated based on Equation (7), and the I_r values of the neighbouring points are extracted from the laser scan data. Third, a linear function is fitted to the cosθ/R² and I_r values of the neighbouring points according to Equation (9), and the coefficient ρK₁ or ρK₂ is obtained. Because K₁ or K₂ is fixed when using the same TLS, and it is difficult to estimate their specific values, ρK₁ or ρK₂ is used in this study to represent ρ as the material reflectance.

Fig. 4 Calculation of ρ based on neighbouring points of a scan point
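A minimal MATLAB sketch of the three steps is given below; it is an illustration under the stated assumptions, not the authors' code. P is the n-by-3 matrix of xyz coordinates and Ir the n-by-1 intensity vector; in practice the fit should additionally be restricted to neighbours within the same R band of Equation (9).

```matlab
% Estimate the reflectance feature rho*K at scan point i (Section 3.1.3).
function rhoK = estimateRho(P, Ir, i, s1)       % s1: bounding box size, 0.5 m here
    nb = all(abs(P - P(i,:)) <= s1/2, 2);       % step 1: neighbours within the box
    Rn = sqrt(sum(P(nb,:).^2, 2));              % range of each neighbour
    xn = sqrt(sum(P(nb,1:2).^2, 2)) ./ Rn.^3;   % step 2: cos(theta)/R^2, Eq. (7)
    p  = polyfit(xn, Ir(nb), 1);                % step 3: linear fit, Eq. (9)
    rhoK = p(1);                                % slope rho*K serves as the reflectance feature
end
```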

3.2 Colour

A TLS with a built-in camera can capture the colour information of each scan point and record it in the RGB colour space (RGB values from 0 to 255). The RGB values can potentially help in building material classification. Object colours have been extensively used not only for object and material classification using 2D images in the computer vision community, but also for material classification using laser scan data [27]. According to the literature, the Hue-Saturation-Value (HSV) colour space is preferred over the RGB colour space because of its better robustness under variable illumination conditions. Therefore, this study also adopts the HSV colour space as the colour features for automatic building material classification. The translation from RGB to HSV is described in [28], as follows:

\[
r = \frac{R}{255}, \quad g = \frac{G}{255}, \quad b = \frac{B}{255}
\]
\[
Mx = \max(r, g, b), \quad Mn = \min(r, g, b), \quad \Delta = Mx - Mn
\]
\[
H = \begin{cases}
0^{\circ}, & \text{if } Mx = Mn \\
60^{\circ} \times \dfrac{g - b}{\Delta} + 0^{\circ}, & \text{if } Mx = r \text{ and } g \geq b \\
60^{\circ} \times \dfrac{g - b}{\Delta} + 360^{\circ}, & \text{if } Mx = r \text{ and } g < b \\
60^{\circ} \times \dfrac{b - r}{\Delta} + 120^{\circ}, & \text{if } Mx = g \\
60^{\circ} \times \dfrac{r - g}{\Delta} + 240^{\circ}, & \text{if } Mx = b
\end{cases} \tag{10}
\]
\[
S = \begin{cases} 0, & \text{if } Mx = 0 \\ \dfrac{\Delta}{Mx}, & \text{otherwise} \end{cases}, \qquad V = Mx
\]
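In practice the conversion need not be hand-coded: MATLAB's built-in rgb2hsv implements Equation (10), although it returns H, S, and V scaled to [0, 1] rather than H in degrees. A minimal sketch, assuming rgbRaw is an n-by-3 matrix of the 0-255 RGB values extracted from the scan data (an illustrative name):

```matlab
rgb = double(rgbRaw) / 255;    % normalise R, G, B to [0, 1] as in Equation (10)
hsv = rgb2hsv(rgb);            % columns: H, S, V, each in [0, 1]
H = 360 * hsv(:,1);            % hue in degrees, matching Equation (10)
S = hsv(:,2);
V = hsv(:,3);
```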

3.3 Surface roughness

With a millimetre-level laser beam diameter, TLS can capture geometric characteristics of building materials, e.g. surface roughness (R_a), which measures the irregularity of an area. The feasibility of estimating surface roughness from laser scan data has been proven by previous research efforts [18]. In general, each category of building material presents a different surface roughness, and previous studies show that surface roughness estimated from laser scan data can potentially be utilized for material classification. Therefore, this research also uses surface roughness as a feature for automatic building material classification.

In this study, the surface roughness R_a of a scan point is calculated in the following four steps. First, for each scan point P_i, its neighbouring points are obtained as the points within the s₂ × s₂ × s₂ bounding box centred at this point, as shown in Fig. 5. Second, a plane is fitted to the neighbouring points using the M-estimator Sample Consensus (MSAC) algorithm [29]. The fitted plane M can be expressed as Ax + By + Cz + D = 0. Third, the orthogonal distance d_i from each neighbouring point (x_i, y_i, z_i) to the fitted plane is calculated using Equation (11):

\[ d_i = \frac{|Ax_i + By_i + Cz_i + D|}{\sqrt{A^2 + B^2 + C^2}} \tag{11} \]

Lastly, the surface roughness R_a at point P_i is calculated as the standard deviation of the distances d_i of the n neighbouring points to the fitted plane [18]:

\[ R_a = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(d_i - \mu)^2}, \quad \text{with } \mu = \frac{1}{n}\sum_{i=1}^{n} d_i \tag{12} \]

Fig. 5 Calculation of surface roughness R_a based on neighbouring points and fitted plane
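A minimal MATLAB sketch of the four steps is shown below. The paper fits the plane with MSAC [29]; as a simplified stand-in, this sketch uses a total least-squares plane obtained via SVD, which passes through the centroid of the neighbours, so the orthogonal distances of Equation (11) reduce to projections onto the plane normal.

```matlab
% Estimate the surface roughness R_a at scan point i (Section 3.3).
function Ra = estimateRoughness(P, i, s2)         % s2: bounding box size, 0.2 m here
    nb = P(all(abs(P - P(i,:)) <= s2/2, 2), :);   % step 1: neighbours within the box
    c  = mean(nb, 1);                             % centroid of the neighbours
    [~, ~, V] = svd(nb - c, 0);                   % step 2: plane normal = V(:,3)
    d  = abs((nb - c) * V(:,3));                  % step 3: orthogonal distances, Eq. (11)
    Ra = std(d, 1);                               % step 4: std of the distances, Eq. (12)
end
```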

4. Experiments

4.1 Selection of building materials

It is necessary to determine a set of common building materials as the target materials in this study. Some construction material libraries have been created for image-based material classification [3,7]. Therefore, taking the existing construction material libraries as a reference, we created a common building material set.

As shown in Fig. 6, this study considered ten different categories of materials: concrete, mortar, stone, metal, painting, wood, plaster, plastic, pottery, and ceramic. Although extensively used in buildings, glass was not chosen in this study because TLS has difficulty capturing transparent objects. For each of the ten categories, one specific commonly used building material was selected as a sample.

Fig. 6 Ten categories of common building materials used in this study

4.2 Data collection and processing

A FARO FocusS70 TLS was used to collect the laser scan data of the ten materials. As shown in Table 1, this TLS has a measurement range of 0.6 to 70 m and a field of view of 300° vertically and 360° horizontally. The beam diameter at exit is 2.12 mm and the beam divergence is 0.3 mrad.

Table 1 Parameters of the TLS used in this study [30].

Model           Range      Field of view       Beam diameter at exit   Beam divergence
FARO FocusS70   0.6-70 m   Vertical: 300°,     2.12 mm                 0.3 mrad
                           Horizontal: 360°

The laser scan data of the ten building materials were collected from buildings at the National University of Singapore (NUS). The collected scan data included both building interiors (i.e. ceilings, walls, and floors) and exterior facades. Furthermore, for each material, scan data were collected from different sites under different illumination conditions to make sure that the collected data of each material were representative.

The data processing was executed in MATLAB2019a [31] after the laser scan data were extracted from the TLS's software FARO SCENE [32]. To calculate the ρ value of a scan point, the bounding box size s₁ for finding neighbouring points was set as 0.5 m, to ensure that the cosθ/R² and I_r values of the neighbouring points were distributed over a wide enough range that the linear fitting provided accurate estimates of ρ. Because the coefficient K differs between 2 m < R < 4 m and 4 m < R < 10 m, linear fitting was carried out for two separate datasets with different R values. To calculate the surface roughness R_a, the bounding box size s₂ for finding neighbouring points was set as 0.2 m so that the number of neighbours would be appropriate for calculating the local surface roughness.

4.3 Model training

After calculating the ρ, HSV, and R_a values, two datasets with 2 m < R < 4 m and 4 m < R < 10 m were created for separate training. The dataset with 2 m < R < 4 m contained 41,000 data points (approximately 4,100 data points for each building material), and the dataset with 4 m < R < 10 m contained 53,000 data points (approximately 5,300 data points for each building material). Each data point represents a sample comprising a building material category label, the ρ value, the H, S, and V values in the HSV colour space, and the R_a value.

This paper mainly treated building material classification as a multi-class classification problem. The overall workflow of model training and validation is presented in Fig. 7. The two datasets were trained and validated separately. We used 80% of each dataset to train the classification model and the remaining 20% to test the trained model. To find the best combination of features, different combinations of the features (ρ, HSV values, and R_a) were tested. In addition, the I_r value and RGB values were also considered in the comparisons. To find the best classification model, different types of supervised learning classifiers were explored, including Decision Trees (DTs), Discriminant Analysis (DAs), Naive Bayes (NBs), Support Vector Machines (SVMs), K-Nearest Neighbours (KNNs), and Ensembles. Each type of classifier includes multiple algorithms; for example, KNNs include the Fine KNN, Medium KNN, Coarse KNN, Cosine KNN, Cubic KNN, and Weighted KNN algorithms. Hence, the different algorithms were also compared, and the one producing the highest accuracy was selected.
Fig. 7 The overall workflow of training and validating models
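As a concrete illustration of this workflow, the sketch below (MATLAB, Statistics and Machine Learning Toolbox) performs an 80/20 holdout and trains a bagged-trees ensemble, which is the kind of model Classification Learner builds for the Ensemble type; the synthetic X and y merely stand in for the real feature matrix [ρ H S V R_a] and material labels.

```matlab
X = rand(500, 5);                               % placeholder features [rho H S V Ra]
y = categorical(randi(10, 500, 1));             % placeholder labels for ten materials
cv    = cvpartition(y, 'HoldOut', 0.2);         % stratified 80/20 train/test split
model = fitcensemble(X(training(cv),:), y(training(cv)), 'Method', 'Bag');
yHat  = predict(model, X(test(cv),:));          % classify the held-out 20%
acc   = mean(yHat == y(test(cv)));              % overall classification accuracy
```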

4.4 Accuracy of the classification model

Classification accuracy was used in this research to measure the performance of different models. The accuracy of a classification model can be quantified by Equation (13):

\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{13} \]

where TP, TN, FP, and FN are the numbers of True Positives, True Negatives, False Positives, and False Negatives, respectively. For instance, if a pottery material is correctly classified as pottery, it is a TP. If a concrete material is incorrectly classified as pottery, it is an FP for the pottery class. When a pottery material is not classified into the pottery class, the instance is an FN for the pottery class. When a material is not pottery and is correctly classified as a non-pottery material, the instance is a TN.

We also used a confusion matrix to show the detailed classification accuracy for each category of materials. The confusion matrix reports the average accuracy per material class. In a confusion matrix, each column represents the predicted material class, and each row represents the true material class.
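Continuing the training sketch above, the per-class rates can be obtained with MATLAB's confusionmat; yTest and yHat are illustrative names for the held-out labels and predictions.

```matlab
[C, order] = confusionmat(yTest, yHat);         % counts; rows = true, columns = predicted
rates = 100 * C ./ sum(C, 2);                   % row-normalised: diagonal = TP rate per class
disp(order');                                   % class order of the rows/columns
disp(round(rates, 1));                          % confusion matrix in percent
```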

5. Experimental results

5.1 Comparisons of features and classifiers

We trained different classifiers with different feature combinations in the Classification Learner of MATLAB2019a. Two accuracy performance matrices for the different ranges of R were obtained, as shown in Tables 2 and 3.

According to Table 2, when 2 m < R < 4 m, using ρ, HSV, and R_a as features and an Ensemble as the classifier produced the highest classification accuracy of 96.2%. According to Table 3, when 4 m < R < 10 m, using ρ, HSV, and R_a as features and an Ensemble as the classifier produced the highest classification accuracy of 97.1%. In conclusion, using ρ, HSV, and R_a as features and an Ensemble as the classifier always yielded the highest classification accuracy, with an average classification accuracy of 96.7% when 2 m < R < 10 m.

As mentioned above, this study used ρ instead of I_r, and HSV instead of RGB, as the features for material classification. The experiments also tested the classification accuracy when using ρ, I_r, HSV, or RGB as the only feature. According to Tables 2-3, using ρ as the only feature achieved accuracies of 61.8% and 77.5%, respectively, which were much higher than the accuracies of 40.4% and 29.7% when using I_r. The experimental results showed that ρ was a much better feature for material classification than I_r. For the comparison between HSV and RGB, using HSV colours as the only features achieved accuracies of 88.6% and 77.7%, respectively; the accuracies became 88.6% and 76.3% when using RGB colours. Although the accuracies were very similar, HSV colours still had a better overall performance than RGB colours. The comparisons confirmed that selecting ρ and HSV colours as features was preferable.

We further compared the performances when using any two of ρ, HSV, and R_a as the features (i.e. ρ + R_a, ρ + HSV, and R_a + HSV). As shown in Tables 2-3, using a combination of any two features yielded better classification accuracy than using any single feature. The results indicated that all the features were useful for improving classification accuracy.

Regarding the comparison of classifiers, the experimental results showed that the Ensemble algorithms were the best classifiers. This result is consistent with the conclusion of a previous study [6]. According to the experimental results presented in both Tables 2 and 3, the algorithm with the highest classification accuracy was the Bootstrap-aggregated Decision Trees (Bagged Trees) algorithm, which is one of the Ensemble algorithms.

Table 2 The classification accuracy of different feature combinations and classifiers when 2 m < R < 4 m

Features        DTs (%)  DAs (%)  NBs (%)  SVMs (%)  KNNs (%)  Ensembles (%)
ρ               61.0     54.7     59.0     60.3      61.8      58.6
R_a             28.0     26.5     28.8     30.1      28.1      31.4
HSV             85.7     73.4     78.5     82.7      88.1      88.6
RGB             71.9     78.1     53.5     80.6      87.5      88.6
I_r             40.3     38.8     40.4     38.9      35.4      40.2
ρ + R_a         69.3     53.8     65.9     63.8      72.7      70.4
ρ + HSV         89.0     86.2     89.1     88.1      94.2      94.8
R_a + HSV       86.2     73.5     82.5     89.1      90.0      92.6
ρ + R_a + HSV   91.0     88.7     91.2     93.1      95.2      96.2

Table 3 The classification accuracy of different feature combinations and classifiers when 4 m < R < 10 m

Features        DTs (%)  DAs (%)  NBs (%)  SVMs (%)  KNNs (%)  Ensembles (%)
ρ               77.0     67.6     74.9     72.0      77.5      75.7
R_a             30.9     26.6     26.0     26.0      24.7      29.5
HSV             71.5     67.1     71.9     72.2      75.0      77.7
RGB             65.8     68.2     48.8     60.8      75.0      76.3
I_r             29.7     24.6     24.6     24.5      28.5      29.4
ρ + R_a         85.3     71.5     78.6     79.1      87.2      86.8
ρ + HSV         93.5     88.9     91.2     90.8      95.6      95.7
R_a + HSV       77.9     69.5     77.0     80.0      82.7      84.7
ρ + R_a + HSV   94.1     89.6     95.2     90.5      96.6      97.1
5.2 Accuracy analysis of each building material

To further understand the material classification results, we used the confusion matrix to analyse the classification performance for the case with the highest classification accuracy (i.e. using ρ + R_a + HSV as features and the Bagged Trees algorithm as the classifier). The classification results for both 2 m < R < 4 m and 4 m < R < 10 m were averaged, and the confusion matrix is shown in Fig. 8. Each row of the confusion matrix shows the TP and FN rates for the true class of the building material sample. All ten categories of materials produced a TP rate of at least 92%. The painting material showed the highest TP rate of 100%, indicating that all the data of the painting material were correctly classified. The mortar material presented the lowest TP rate of 92%, with 7% of mortar data wrongly classified as stone. This indicated that the features of stone material were similar to those of mortar material.

Fig. 8 Confusion matrix of the case with the highest classification accuracy (i.e. using ρ + R_a + HSV as features and the Bagged Trees algorithm as the classifier)
458

459 To further analyse the reasons for the low TP rate of mortar material, a parallel coordinates

460 plot was drawn to visualize the multivariate data. As shown in Fig. 9, we compared the

461 distributions of features (𝜌𝜌, 𝐻𝐻, 𝑆𝑆, 𝑉𝑉, or 𝑅𝑅𝑎𝑎 ) for stone, mortar, and painting. For a certain feature, if

462 the data of one material have a higher dispersion degree or have large overlap with another

463 material, this feature will be less useful on improving the classification accuracy of this material.

464 Compared to painting material, it was found that the mortar material had a higher dispersion

465 degree on multiple features (e.g. 𝑆𝑆, 𝑉𝑉, and 𝑅𝑅𝑎𝑎 ), which resulted in large overlaps with other

466 materials and caused the low TP rate of mortar. Meanwhile, it is also found that the 𝐻𝐻 and 𝑆𝑆

467 values of mortar were heavily overlapping with those of stone material, which explained why 7%

468 of mortar data were wrongly classified as stone.

469

470

471 Fig. 9 Parallel coordinates plot of mortar, stone, and painting materials with the features as

472 horizontal axis and the dispersion degree of data as vertical axis.

473

5.3 Discussion

Building material classification is typically a multi-class classification problem, aiming to classify an unknown object into one of several pre-defined categories. However, it can also be treated as a one-class classification problem that distinguishes one class from all other classes, if one particular material needs to be extracted. In this study, to test how accurately a single building material can be recognized, a one-class classification method was also implemented.
479 can be recognized, one-class classification method was also implemented.

480 The training of a one-class classifier is to define a classification boundary for a specific class

481 so that the classification model can distinguish this specific class from other classes. Currently,

482 main one-class classification methods can be categorised into three types: 1) density-based

483 methods, 2) reconstruction-based methods, and 3) boundary-based methods [33]. Some

484 researchers [33,34] compared the accuracy performance of different one-class classification

485 methods and found that One-class Support Vector Machine (OC-SVM) method and Support

486 Vector Data Description (SVDD) method have better accuracy than other methods. The OC-SVM

487 method aims to find a maximum margin hyperplane in the feature space to best separate the

488 mapped data from the origin [35], and the SVDD method learns a closed boundary around the

489 data in the form of a hypersphere characterized by a centre and radius [36].

490 We divided the scan data set of each building material into a training dataset (80%) and a

491 testing dataset (20%). For each of the ten materials, OC-SVM and SVDD classifiers were trained

492 separately using a training dataset containing only data of this material. After obtaining the trained

493 classifier, the testing dataset of this material was used to test the accuracy performance of the

494 trained classifier. Similar to multi-class classification, 𝜌𝜌, 𝑅𝑅𝑎𝑎 , and 𝐻𝐻𝐻𝐻𝐻𝐻 values were used as

495 features in one-class classification. For each classifier, experiments were conducted to find the

496 optimal model parameters, such as the kernel function, 𝛾𝛾 value of the kernel function and cost

497 coefficient. All these steps were completed in MATLAB2019a with the LIBSVM toolbox [31].
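For reference, a minimal sketch of the OC-SVM training with LIBSVM's MATLAB interface is shown below; the '-s 2' option selects the one-class SVM and '-t 2' an RBF kernel, while the ν ('-n') and γ ('-g') values are placeholders to be tuned as described above. SVDD is not part of the stock LIBSVM distribution and would require an extended build, so only OC-SVM is sketched here. Xtrain and Xtest are assumed to hold the [ρ R_a H S V] features of the target material.

```matlab
% Train on samples of one material only; labels are nominal for one-class SVM.
model = svmtrain(ones(size(Xtrain,1),1), Xtrain, '-s 2 -t 2 -n 0.1 -g 0.5');
% Predict on held-out samples of the same material:
% +1 = accepted as the target material, -1 = rejected.
pred = svmpredict(ones(size(Xtest,1),1), Xtest, model);
acc  = mean(pred == 1);                         % fraction of target samples accepted
```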

Table 4 presents the accuracy for each building material in one-class classification when 2 m < R < 4 m and 4 m < R < 10 m, respectively. For the OC-SVM classifier, the classification accuracy for the different materials was between 81.0% and 89.0%. For the SVDD classifier, the classification accuracy for the different materials was between 83.4% and 90.4%. The average accuracy for OC-SVM and SVDD was 85.5% and 86.5%, respectively. Although the overall accuracy of treating material classification as a one-class classification problem is lower than that of treating it as a multi-class classification problem, the accuracy is still acceptable. This result further demonstrated the feasibility of the proposed method in practical applications.

Table 4 The classification accuracy of each building material in one-class classification.

            OC-SVM                              SVDD
Material    2 m < R < 4 m   4 m < R < 10 m      2 m < R < 4 m   4 m < R < 10 m
            (%)             (%)                 (%)             (%)
Ceramic     87.9            89.0                88.0            89.4
Concrete    85.0            85.9                86.3            87.7
Stone       84.8            85.5                85.9            87.5
Metal       85.5            85.9                84.0            84.7
Mortar      81.0            82.7                87.7            88.7
Painting    88.0            88.8                89.0            90.4
Plaster     83.2            85.0                83.6            85.8
Plastic     83.6            85.9                85.8            86.7
Pottery     87.0            88.2                84.8            85.9
Wood        83.4            84.6                83.4            84.0

6. Limitations and future works

Although this study shows promising results, future works are needed to address a few limitations.

First, this study only considers cases where the target surface is planar and consists of a single building material. For the calculation of material reflectance and surface roughness, the estimated values would differ if the neighbouring points contained other materials. In future works, one possible direction is to check the consistency of material using colours, because the colour of a point is determined by that point only and is not affected by neighbouring points. In addition, when the target surface is not planar, the calculation of surface roughness would be inaccurate. To handle this problem, future research is needed to first examine whether the surface is planar or curved before estimating the surface roughness.

Second, material humidity has a negative impact on the accuracy of calculating the material reflectance ρ. Previous studies have found that the presence of moisture can change the reflected intensity I_r from a surface [16], which makes the estimation of ρ inaccurate. Further research is needed to remove the effects of material humidity when calculating the material reflectance ρ.

Third, this study considered only ten common building materials, so the classification is limited to these ten materials. Future work is needed to extend the database with data of other building materials. In addition, this study did not consider glass as a building material because TLS cannot capture laser scan data well for transparent materials. However, some studies have suggested a variety of effective ways to extract glass windows from point clouds collected by TLS [37,38]. Future research is needed to include glass in the material classification method.

Fourth, some studies have demonstrated that combining multiple features derived from 2D images can effectively improve classification accuracy (e.g. combining RGB colour with the H of HSV colour and projection, or combining HSV colour with texture). A TLS with a built-in camera can capture not only the RGB colours themselves but also the material texture, projection, etc. Hence, the TLS data-based building material classification method may reach a higher accuracy by thoroughly adopting multiple 2D image features derived from scan data instead of using only HSV colour, which will be future work.

7. Conclusion

This study proposes a TLS data-based method for classifying common building materials. In the proposed method, the material reflectance ρ, HSV colours, and surface roughness R_a are used as classification features. The ρ value is an intrinsic property of a certain material, and it can be inferred from the reflected laser intensity I_r. It is found that I_r has linear relationships with cosθ/R², and ρ is a part of the slope coefficient of the linear function. Therefore, the ρ value of each scan point is obtained by fitting a linear function to the I_r and cosθ/R² values of the neighbouring points of the scan point. The HSV colours are used in this study instead of the RGB colours because the HSV colours show better robustness to varying lighting conditions; the HSV colours are calculated from the RGB colours obtained from the raw laser scan data. The R_a value is calculated as the standard deviation of the orthogonal distances from the neighbouring points of a scan point to the plane fitted to those neighbouring points.

To validate the proposed technique, this study selected ten common building materials, including concrete, mortar, stone, metal, painting, wood, plaster, plastic, pottery, and ceramic. A FARO FocusS70 TLS was used to collect laser scan data of the ten different materials, and the laser scan data were processed in MATLAB2019a to calculate the above-mentioned features.

To find the best combination of features, different combinations of the features were tested. In addition, different supervised learning classifiers were explored, including Decision Trees (DTs), Discriminant Analysis (DAs), Naive Bayes (NBs), Support Vector Machines (SVMs), K-Nearest Neighbours (KNNs), and Ensembles. The experimental results showed that using ρ, HSV, and R_a as features and an Ensemble as the classifier achieved the highest classification accuracy of 96.7%. The experimental results also validated that ρ was a much better feature for material classification than I_r, and that HSV colour outperformed RGB colour. Further analyses showed that all ten categories of materials produced a TP rate of at least 92% when using ρ + R_a + HSV as features and the Bagged Trees algorithm as the classifier. The painting material showed the highest TP rate of 100%. Meanwhile, the mortar material presented the lowest TP rate of 92%, with 7% of mortar data wrongly classified as stone because the two materials share similar feature values.

To further test how accurately a single building material can be recognized, we also adopted a one-class classification method to train and validate the classification models. OC-SVM and SVDD, which were verified to be the best classifiers for one-class classification problems in previous studies, were adopted in this paper. The experiments showed that the average accuracy for OC-SVM and SVDD was 85.5% and 86.5%, respectively, which further demonstrated the feasibility of the proposed method.

References

[1] J. DeGol, M. Golparvar-Fard, D. Hoiem, Geometry-informed material recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016, pp. 1554-1562, https://doi.org/10.1109/cvpr.2016.172.
[2] J.E. Meroño, A.J. Perea, M.J. Aguilera, A.M. Laguna, Recognition of materials and damage on historical buildings using digital image classification, South African Journal of Science 111 (1-2) (2015) 1-9, https://doi.org/10.17159/sajs.2015/20140001.
[3] K.K. Han, M. Golparvar-Fard, Appearance-based material classification for monitoring of operation-level construction progress using 4D BIM and site photologs, Automation in Construction 53 (2015) 44-57, https://doi.org/10.1016/j.autcon.2015.02.007.
[4] Q. Wang, M. Kim, Applications of 3D point cloud data in the construction industry: A fifteen-year review from 2004 to 2018, Advanced Engineering Informatics 39 (2019) 306-319, https://doi.org/10.1016/j.aei.2019.02.007.
[5] Q. Lu, S. Lee, Image-based technologies for constructing as-is building information models for existing buildings, Journal of Computing in Civil Engineering 31 (4) (2017) 4017005, https://doi.org/10.1061/(asce)cp.1943-5487.0000652.
[6] H. Son, C. Kim, N. Hwang, C. Kim, Y. Kang, Classification of major construction materials in construction environments using ensemble classifiers, Advanced Engineering Informatics 28 (1) (2014) 1-10, https://doi.org/10.1016/j.aei.2013.10.001.
[7] A. Dimitrov, M. Golparvar-Fard, Vision-based material recognition for automated monitoring of construction progress and generating building information modeling from unordered site image collections, Advanced Engineering Informatics 28 (1) (2014) 37-49, https://doi.org/10.1016/j.aei.2013.11.002.
[8] Q. Lu, S. Lee, L. Chen, Image-driven fuzzy-based system to construct as-is IFC BIM objects, Automation in Construction 92 (2018) 68-87, https://doi.org/10.1016/j.autcon.2018.03.034.
[9] S.Y. Chen, Y.F. Li, W. Wang, J. Zhang, Active Sensor Planning for Multiview Vision Tasks, Vol. 1, Springer, 2008, ISBN 978-3-540-77071-8, p. 25.
[10] E.B. Anil, P. Tang, B. Akinci, D. Huber, Deviation analysis method for the assessment of the quality of the as-is Building Information Models generated from point cloud data, Automation in Construction 35 (2013) 507-516, https://doi.org/10.1016/j.autcon.2013.06.003.
[11] Z. Zhu, I. Brilakis, Parameter optimization for automated concrete detection in image data, Automation in Construction 19 (7) (2010) 944-953, https://doi.org/10.1016/j.autcon.2010.06.008.
[12] M. Franceschi, G. Teza, N. Preto, A. Pesci, A. Galgaro, S. Girardi, Discrimination between marls and limestones using intensity data from terrestrial laser scanner, ISPRS Journal of Photogrammetry and Remote Sensing 64 (6) (2009) 522-528, https://doi.org/10.1016/j.isprsjprs.2009.03.003.
[13] J. Armesto-González, B. Riveiro-Rodríguez, D. González-Aguilera, M.T. Rivas-Brea, Terrestrial laser scanning intensity data applied to damage detection for historical buildings, Journal of Archaeological Science 37 (12) (2010) 3037-3047, https://doi.org/10.1016/j.jas.2010.06.031.
[14] B. Riveiro, P.B. Lourenço, D.V. Oliveira, H. González-Jorge, P. Arias, Automatic morphologic analysis of quasi-periodic masonry walls from LiDAR, Computer-Aided Civil and Infrastructure Engineering 31 (4) (2016) 305-319, https://doi.org/10.1111/mice.12145.
[15] L.J. Sánchez-Aparicio, S. Del Pozo, L.F. Ramos, A. Arce, F.M. Fernandes, Heritage site preservation with combined radiometric and geometric analysis of TLS data, Automation in Construction 85 (2018) 24-39, https://doi.org/10.1016/j.autcon.2017.09.023.
[16] M.U. Hassan, A. Akcamete-Gungor, C. Meral, Investigation of terrestrial laser scanning reflectance intensity and RGB distributions to assist construction material identification, in: Lean and Computing in Construction Congress - Volume 1: Proceedings of the Joint Conference on Computing in Construction, 2017.
[17] E. Valero, F. Bosché, A. Forster, Automatic segmentation of 3D point clouds of rubble masonry walls, and its application to building surveying, repair and maintenance, Automation in Construction 96 (2018) 29-39, https://doi.org/10.1016/j.autcon.2018.08.018.
[18] E. Valero, A. Forster, F. Bosché, E. Hyslop, L. Wilson, A. Turmel, Automated defect detection and classification in ashlar masonry walls using machine learning, Automation in Construction 106 (2019) 102846, https://doi.org/10.1016/j.autcon.2019.102846.
[19] K. Tan, X. Cheng, Surface reflectance retrieval from the intensity data of a terrestrial laser scanner, Journal of the Optical Society of America A 33 (4) (2016) 771-778, https://doi.org/10.1364/josaa.33.000771.
[20] O. Svelto, D.C. Hanna, Principles of Lasers, Vol. 4, Springer, 1998, ISBN 9781441913012.
[21] B. Höfle, N. Pfeifer, Correction of laser scanning intensity data: Data and model-driven approaches, ISPRS Journal of Photogrammetry and Remote Sensing 62 (6) (2007) 415-433, https://doi.org/10.1016/j.isprsjprs.2007.05.008.
[22] M.E. Becker, Evaluation and characterization of display reflectance, Displays 19 (1) (1998) 35-54, https://doi.org/10.1016/s0141-9382(98)00029-8.
[23] S. Kaasalainen, A. Jaakkola, M. Kaasalainen, A. Krooks, A. Kukko, Analysis of incidence angle and distance effects on terrestrial laser scanner intensity: Search for correction methods, Remote Sensing 3 (10) (2011) 2207-2221, https://doi.org/10.3390/rs3102207.
[24] W. Fang, X. Huang, F. Zhang, D. Li, Intensity correction of terrestrial laser scanning data by estimating laser transmission function, IEEE Transactions on Geoscience and Remote Sensing 53 (2) (2015) 942-951, https://doi.org/10.1109/tgrs.2014.2330852.
[25] S. Kaasalainen, A. Krooks, A. Kukko, H. Kaartinen, Radiometric calibration of terrestrial laser scanners with external reference targets, Remote Sensing 1 (3) (2009) 144-158, https://doi.org/10.3390/rs1030144.
[26] S. Kaasalainen, A. Kukko, T. Lindroos, P. Litkey, H. Kaartinen, J. Hyyppa, E. Ahokas, Brightness measurements and calibration with airborne and terrestrial laser scanners, IEEE Transactions on Geoscience and Remote Sensing 46 (2) (2008) 528-534, https://doi.org/10.1109/tgrs.2007.911366.
[27] Q. Wang, J.C. Cheng, H. Sohn, Automated estimation of reinforced precast concrete rebar positions using colored laser scan data, Computer-Aided Civil and Infrastructure Engineering 32 (9) (2017) 787-802, https://doi.org/10.1111/mice.12293.
[28] A.R. Smith, Color gamut transform pairs, ACM SIGGRAPH Computer Graphics 12 (3) (1978) 12-19, https://doi.org/10.1145/800248.807361.
[29] P.H. Torr, A. Zisserman, MLESAC: A new robust estimator with application to estimating image geometry, Computer Vision and Image Understanding 78 (1) (2000) 138-156, https://doi.org/10.1006/cviu.1999.0832.
[30] FARO, FARO Laser Scanner Focus. Available from: <https://www.faro.com/en-sg/products/construction-bim/faro-laser-scanner-focus/>. Retrieved June 18, 2019.
[31] MathWorks, MATLAB R2019a. Available from: <https://www.mathworks.com/products/new_products/latest_features.html?s_tid=hp_release_2019a>. Retrieved June 18, 2019.
[32] FARO, FARO SCENE. Available from: <https://www.faro.com/en-gb/products/construction-bim-cim/faro-scene/software/>. Retrieved June 18, 2019.
[33] B. Krawczyk, M. Galar, M. Woźniak, H. Bustince, F. Herrera, Dynamic ensemble selection for multi-class classification with one-class classifiers, Pattern Recognition 83 (2018) 34-51, https://doi.org/10.1016/j.patcog.2018.05.015.
[34] L. Swersky, H.O. Marques, J. Sander, R.J. Campello, A. Zimek, On the evaluation of outlier detection and one-class classification methods, IEEE, 2016, pp. 1-10, https://doi.org/10.1109/dsaa.2016.8.
[35] B. Schölkopf, J.C. Platt, J. Shawe-Taylor, A.J. Smola, R.C. Williamson, Estimating the support of a high-dimensional distribution, Neural Computation 13 (7) (2001) 1443-1471, https://doi.org/10.1162/089976601750264965.
[36] D.M. Tax, R.P. Duin, Support vector domain description, Pattern Recognition Letters 20 (11-13) (1999) 1191-1199, https://doi.org/10.1016/s0167-8655(99)00087-2.
[37] S. Pu, G. Vosselman, Knowledge based reconstruction of building models from terrestrial laser scanning data, ISPRS Journal of Photogrammetry and Remote Sensing 64 (6) (2009) 575-584, https://doi.org/10.1016/j.isprsjprs.2009.04.001.
[38] S. Pu, Automatic building modeling from terrestrial laser scanning, Springer, 2008, ISBN 978-3-540-72135-2, pp. 147-160, https://doi.org/10.1007/978-3-540-72135-2_9.