ORIGINAL ARTICLE
FIGURE 1 Manual reality capture planning may result in (a) incomplete coverage, (b) inaccurate 3D reconstruction, and (c) low visibility measured in SSD
and are inaccurate due to drift errors and low surface sampling distance (SSD), that is, x units of measure on the structure's surface per image pixel. Consequently, project teams return to construction sites to perform additional captures. This practice is costly, and a complementary capture might be too late to document critical changes. Of course, project teams can overcollect reality data for better completeness and accuracy; however, this practice exponentially increases the data's postprocessing time and, in turn, takes away from productivity gains.

According to Ham et al. (2016), automatic collection of reality capture data using autonomous platforms and assessment of the plans' quality before collection offer an opportunity to address the latter challenges. In practice, drone operators use commercially available flight planning applications to create user-defined flight plans. By fine-tuning parameters such as flight height, image overlap, and image resolution, lawn-mowing (grid-based) 2D flight missions are created to visually cover a specific region. Then, dedicated mobile applications are used to operate UAVs and collect the data automatically. Such planning methods can only provide users with feedback on the expected ground sampling distance (GSD), which measures the distance between two pixels' centers on the ground. While GSD feedback is useful for surveying tasks, specifically to evaluate the accuracy of measurements conducted on the collected images, it is not suitable for construction monitoring tasks because the GSD value assumes a fixed distance between the UAV and the structure's topology. Accordingly, GSD does not assure accurate measurements for buildings or infrastructure assets.

Moreover, applications such as automated construction progress monitoring and condition assessment of existing assets have several additional requirements: for example, (1) visual coverage of the monitored assets, that is, setting the viewpoints of a capture device to visually cover a region of interest on the construction site; (2) clear visibility of each construction asset in the collected frames, measured in SSD; and (3) canonical trajectories of data frames relative to the structure's topology. UAVs and ground rovers are also operated near existing buildings, trees, power lines, and on-site personnel; thus, there are inherent risks of damages and injuries.

To address the limitations mentioned above, this work introduces methods to plan image-based reality capture, evaluate the quality of capture plans against a comprehensive set of metrics, and improve the plans to support construction monitoring applications. Additionally, a cloud-based solution with an intuitive user interface is presented for reality capture planning and for visualizing each capture plan's performance against the introduced criteria. The next sections provide an overview of related works, followed by a discussion of the visual quality metrics, evaluation methods, conducted experiments, and findings.

2 RELATED WORKS

Image-based 3D reconstruction has been widely used for automated progress monitoring, for instance, in the work by Hamledari et al. (2017). Moreover, Xu et al. (2020) used 3D reconstructed models for quality control during construction, and W. Y. Lin (2020) utilized such models for condition assessment of assets. Yang et al. (2015) and Kopsida et al. (2015) offer extensive literature reviews of recent techniques for automated evaluation of progress deviations using 3D reality models. The following sections review prior work on the quality of 3D reconstructed models and captured data for progress monitoring, quality control, and condition assessment.

2.1 Quality of 3D reconstruction

Prior research has focused on methods to compare point clouds (with or without associated images) against the building information model (BIM) and has examined the quality of laser scanner point clouds from capture planning and quality control perspectives.
IBRAHIM et al. 3
Anil et al. (2013) proposed a deviation analysis method to assess 3D as-built point clouds' quality by measuring distances between laser scanner points and a reference BIM. Zhang et al. (2016) evaluated the level of detail (LOD) of point clouds by measuring the density of laser scanner points projected on structure surfaces and used a minimum LOD requirement for optimizing laser scanning plans. Kalyan et al. (2016) used dimensional analysis to measure the accuracy of reality models collected using a depth sensor by comparing the dimensions of scanned point clouds to a ground truth (GT) model. Rebolj et al. (2017) investigated different techniques for evaluating the quality of laser scanner point clouds. In such work, the 3D model's quality is indicated by the points' density (number of points per m²) and accuracy, which measures the difference in depth between scanned points and reference points. To date, however, similar metrics and methods to assess the accuracy and completeness of image-based reality capture plans, which are necessary for the success of automated progress monitoring methods, have not been the subject of research.

In the computer vision community, Seitz et al. (2006) established quality evaluation methods for calculating the accuracy and completeness of 3D reality models generated via different image-based reconstruction pipelines. In this work, the accuracy of a model is measured by the 90th percentile of the distances between 3D reconstructed points and a reference model (GT). The completeness is calculated as the percentage of GT points having a distance below a fixed threshold to the nearest reconstructed point. These evaluation metrics and methods are useful in comparing 3D reconstruction algorithms. Accordingly, they offer an opportunity to compare the impact of reality plans on the quality of reconstructed reality models.

2.2 Metrics to evaluate reality capture plans

To date, prior research works have introduced some metrics to evaluate and improve the quality of reality capture plans before data collection, including the following:

Visual coverage: to evaluate the best sensor configurations (camera positions and trajectories) that completely observe all the primitives of a 3D structure. A survey by Galceran and Carreras (2013) presented various robotic path planning methods for providing complete visual coverage of 3D structures. The latter work recommended simplifying the structures to basic geometries to solve coverage path planning problems. More recent work by Phung et al. (2017) optimized UAV configurations for complete visual coverage of an infrastructure asset. Here, the 3D geometry of the infrastructure asset is decomposed into surface patches. Then, the UAV path is optimized at a fixed distance from the structure to cover all patches with a maximum GSD. Moreover, Baik and Valenzuela (2019) used a visual coverage metric to create UAV plans for complete visual inspection of electric transmission towers using simple geometries. However, for construction monitoring tasks, visual coverage of all elements cannot be assured using a simplified geometry of the structure, where some elements, especially those with small dimensions, will be excluded from the evaluated model. Moreover, clear visibility of each construction asset in the collected frames is also needed to support automatic vision-based monitoring methods (Han & Golparvar-Fard, 2015).

Sensor parameters: to select the best camera functional parameters, including the sensor's type, resolution, field of view (FOV), image type (e.g., perspective or equirectangular), and frame rate for time-lapse or standard videos. Tuttas et al. (2016) compared different methods and setups for image acquisition on construction sites using hand-held cameras, UAVs, and crane cameras. Zhang et al. (2016) optimized plans for laser scanning by defining the best sensor configurations to improve the density and accuracy of scanned point clouds. Rodríguez-Gonzálvez et al. (2017) utilized visual sensors for weld inspection and compared macro photography and laser scanners in terms of the resulting GSD and operational costs. Accordingly, planning for visual reality capture has to consider the variability in sensor parameters.

GSD: to indicate the resolution of measurements in the collected images. Daftry et al. (2015) used GSD as an indicator of the quality of image-based reconstructed point clouds collected using aerial platforms. Baik and Valenzuela (2019) evaluated and optimized the resolution of UAV images obtained for inspection missions by calculating the expected GSD. Kim et al. (2019) set a fixed altitude for the UAV missions to satisfy a maximum GSD value suitable for generating an initial 3D map of the construction environment. However, using GSD as feedback does not assure accurate measurements for buildings or infrastructure assets with complex geometries, particularly when the camera takes close-up images. In such a case, the GSD metric can be extended to measure SSD, which has better performance in evaluating the expected accuracy of a 3D reconstruction.
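The GSD/SSD distinction above can be made concrete with a short sketch. Both relations below are standard photogrammetric approximations rather than formulas taken from this article; the function names, the diagonal-FoV convention, and the example parameters are illustrative assumptions.

```python
import math

def gsd(altitude_m, focal_mm, pixel_pitch_um):
    """Ground sampling distance (m/pixel) for a nadir view at a fixed
    altitude over flat ground: the standard photogrammetric relation."""
    return altitude_m * (pixel_pitch_um * 1e-6) / (focal_mm * 1e-3)

def ssd(depth_m, fov_deg, frame_diag_px):
    """Surface sampling distance (m/pixel): the same idea, but using the
    actual camera-to-surface depth instead of a fixed flying height."""
    return 2.0 * depth_m * math.tan(math.radians(fov_deg) / 2.0) / frame_diag_px
```

For a facade observed 10 m away by a camera with a 90° diagonal FoV and a 2000-pixel frame diagonal, `ssd(10, 90, 2000)` yields 0.01 m/pixel; the value grows linearly with depth, which is why a fixed-altitude GSD misstates the sampling distance on tall or set-back surfaces.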
Visual features: where image-based 3D reconstruction relies on detecting and matching visual features. These features are mathematically represented as feature descriptors that can be automatically detected, matched across images, formed into feature tracks, and transformed into 3D reconstructed points via bundle adjustment optimization. Degol, Lee, et al. (2018) evaluated the success of 3D reconstruction through the probability of detecting and matching simulated features and forming successful feature tracks (scene graph) between image pairs. Furthermore, Javadnejad et al. (2021) sampled features from sparse point clouds and attributed the quality of reconstructed dense point clouds to the distance between reconstructed points and reference features. While detecting and matching visual features are essential for successful 3D reconstruction, to date, this metric has not been used to assess and optimize reality plans.

Sensor orientation: which measures the angle of incidence between the sensor's look-at direction and the structure's surfaces. As shown in Ham et al. (2016) and Javadnejad et al. (2021), canonical views, where camera orientations are orthogonal to the structure, can improve image resolution and enhance material detection algorithms' performance. Likewise, Degol, Golparvar-Fard, et al. (2016) have shown that canonical camera trajectories improve construction materials' detection using on-site images. Consequently, assuring canonical views in a reality plan is vital to enhance the accuracy of appearance-informed material recognition methods used for automated progress monitoring.

Although the previously researched metrics have been validated in different studies and under different assumptions, to date, no research has investigated the collective usage of these metrics for evaluating construction reality capture plans. Besides, the relative significance of these metrics in evaluating reality plans' quality is still missing in the literature.

Beyond the monitoring metrics listed above, several additional operational metrics should be accounted for during the automatic collection of the data. Operating autonomous platforms is associated with inherent risks of damages and injuries. Metrics for measuring (1) safe proximity to the structure, (2) availability of enough battery capacity to collect the data, and (3) preservation of a continuous line-of-sight during UAV operation, required by the Federal Aviation Administration (FAA), are equally important to assess the feasibility of executing reality capture plans.

2.3 Methods to evaluate reality capture plans

Previous research endeavors have investigated the usage of a priori models, such as BIMs, to create, communicate, and evaluate the quality and safety of reality capture plans (Ibrahim, Roberts, et al., 2017; Y. H. Lin et al., 2013; Taneja et al., 2016). However, the latest site conditions, which are modeled using 4D BIM and temporal 3D reality models, are rarely accounted for during data collection planning. Also, the applicability of prior works was validated in the context of UAV images only; it did not consider 360° images/videos and interior spaces, where automated data collection and progress monitoring are more challenging.

An earlier work by the authors (Ibrahim, Golparvar-Fard, et al., 2017) discussed the importance of using the most up-to-date state of an asset, represented either in the form of a 4D BIM or a 4D reality model, for quality assessment of reality capture plans. However, the latter research focused only on providing a method to calculate the visibility of elements in a manually created reality plan and using the redundant visibility of elements to indicate the completeness of reconstructed point clouds. Other critical criteria, such as SSD, viewpoint orientation, reconstruction stability, and operational requirements, were not addressed. Moreover, an objective validation utilizing real-world construction data is missing, which is required to confirm the metrics and methods for evaluating reality capture plans.

Furthermore, visualizing the quality metrics for reality capture plans, particularly for users in the field, is as important as the methods used to calculate them. For instance, Daftry et al. (2015) used color heat maps to visualize GSD values and indicate the number of overlapping images observing the same mesh primitives. Ibrahim, Golparvar-Fard, et al. (2017) colored BIM elements with a traffic-light color metaphor to visualize the values of visual coverage and redundant observations of BIM elements in the data. Visualizing other metrics, such as completeness of the capture and the operational metrics introduced before, has not been thoroughly investigated.

2.4 Optimizing reality plans

Optimizing a reality plan before execution is essential for assuring complete and accurate results and for reducing data collection time and costs. Several research studies have focused on solving coverage path planning problems, such as Chen et al. (2018) and Lindner et al. (2019). These works provide various algorithms, such as greedy next-best-view and set-cover optimization, to generate
FIGURE 2 The new method and cloud-based system, built with a client-server architecture, enable fast and memory-efficient evaluation and improvement of the reality capture plans against six evaluation criteria
data collection paths with optimal visual coverage. While optimizing visual coverage using 3D models' primitives is sufficient for some applications, Ibrahim, Golparvar-Fard, et al. (2017) showed the importance of associating visual coverage with the number of back-projected pixels per BIM element. By doing so, a sufficient surface area of each element would be visible in the data. Such a metric is designed to satisfy the requirements of appearance-based recognition methods used for progress monitoring (Han & Golparvar-Fard, 2017).

3 METHOD

In this section, a comprehensive set of objective metrics is presented to benchmark, compare, and support optimizing reality capture plans before their execution. A new method is introduced to enable fast and memory-efficient simulation of capture plans in the context of 4D BIM and existing point clouds. A prototype of these methods and metrics is also presented that runs efficiently in a web browser. The prototype offers visual and numeric feedback on data collection plans' performance against all evaluation criteria.

3.1 Metrics

As shown in Figure 2, a reality capture plan is created considering six main criteria: (1) the visual coverage of data frames over the structure's elements; (2) the resolution of each monitored element in SSD units; (3) the orientation of the camera viewpoints to the model's topology; (4) the expected stability of the 3D reconstruction pipeline; (5) satisfaction of the FAA regulation for maintaining line-of-sight during drone operation; and (6) battery operation time during data collection. These criteria are transformed into visual quality metrics to assess visual coverage, redundant visibility, back-projection resolution, viewpoint orientation, and stability of 3D reconstruction. Simultaneously, the operational criteria are accounted for during reality capture planning to enforce safe and successful execution. Furthermore, each reality plan is improved through iterative manual modification and evaluation before execution.

In this paper, a reality capture plan ℝ has n frames (perspective or equirectangular) given by (f | f = 1:n; f ∈ ℝ). ℝ is evaluated at time D for a structure Γ that has z elements. An evaluated element, that is, an element in a BIM or a face of a polygon mesh in a reality model, is defined by (i | i = 1:z; i ∈ Γ, ∂(i) ≤ D), where ∂(i) returns the planned construction date for element i. Because reality models are dominantly reconstructed using structure-from-motion techniques (Szeliski, 2020), a pinhole camera model is considered for each frame in ℝ. As such, each element i is back-projected into frame f using Equation (1):

p_{i,f} = M_f P_i = Q_f [R_f | T_f] P_i    (1)

where P_i is the 3D coordinate of a point that belongs to element i, and M_f is an 11-degrees-of-freedom (DoF) camera projection matrix composed of the intrinsic matrix Q_f and the extrinsic rotation R_f and translation T_f.
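The back-projection of Equation (1) can be sketched numerically as follows. This is a generic pinhole-projection sketch, not the paper's implementation; the intrinsic values (focal length 1000 px, principal point at 960, 540) and the identity pose are illustrative assumptions.

```python
def project(P, Q, R, T):
    """p = Q [R | T] P, the pinhole model of Equation (1).
    P: world point (3-vector); Q: 3x3 intrinsics; R: 3x3 rotation;
    T: 3-vector translation. Returns pixel (u, v) and the depth along
    the optical axis."""
    # camera-frame coordinates: X_c = R P + T
    Xc = [sum(R[r][c] * P[c] for c in range(3)) + T[r] for r in range(3)]
    # homogeneous pixel coordinates: x = Q X_c
    x = [sum(Q[r][c] * Xc[c] for c in range(3)) for r in range(3)]
    return (x[0] / x[2], x[1] / x[2]), x[2]

# illustrative intrinsics and an identity pose
Q = [[1000, 0, 960], [0, 1000, 540], [0, 0, 1]]
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
T = [0, 0, 0]
(u, v), depth = project([1.0, 0.5, 10.0], Q, R, T)  # a point 10 m ahead
```

The 11 DoFs of M_f correspond to 5 intrinsic parameters in Q_f plus 6 extrinsic parameters in [R_f | T_f].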
V̄_C = (V_C × z) / ( Σ_{i=1}^{z} δ( Σ_{f=1}^{n} δ(γ_{i,f} ≥ 1) > 0 ) )    (3)

FIGURE 3 Back-projecting point P_i of element i from WCS to FCS of frame f using Equation (1). γ_{i,f} shows all back-projected points of element i in frame f

3.1.2 Redundant visibility

3.1.3 Back-projected resolution

where φ_f is the camera's FoV, h_f is the frame's diagonal size measured in pixels, and d̄_{i,f} is the mean depth of all back-projected pixels. The depth d_{i,f} of each pixel p_{i,f} is calculated by measuring the distance between the 3D point P_i and the camera frame f (see Figure 3).

An element i can be visible in several frames with different resolutions. However, typically the best k resolutions across all visual frames contribute to reconstructing a structure's element in 3D. In addition, image-based measurements of a structural element in a 3D viewer are most accurate when conducted using canonical views or the best frames observing the element (Hoiem, 2018). Thus, the resolution of an element i in plan ℝ is set to be the top-k mean resolution of the back-projected element across all frames. The redundant visibility of an element, V_{R,i} = Σ_{f=1}^{n} δ(γ_{i,f} ≥ Ω), can end up lower than the desired k value. Thus, k_i is defined per element i, where

k_i = { k    if k ≤ V_{R,i}
      { V_{R,i}    otherwise    (6)

Finally, the back-projected resolution metric R for plan ℝ is calculated using Equation (7):

R = ( Σ_{i=1}^{z} μ_f^{k_i}(ρ_{i,f}) ) / (V_C × z)    (7)

where μ_f^{k_i} is the top-k_i mean resolution of an element i across reality capture frames and ρ_{i,f} is the average SSD of element i in frame f.

3.1.4 Viewpoint orientation

The camera's relative orientation to an element's surfaces affects the back-projected resolution and the overall quality of a 3D reconstruction. Hence, the viewpoint orientation metric is designed to calculate the mean orientation of a reality plan's camera trajectory against all visible elements. Equation (8) calculates the orientation angle between a frame f and the surface normal vector at a point P_i that belongs to element i:

Θ_{i,f} = cos⁻¹( (N⃗_i · N⃗_f) / (||N⃗_i|| ||N⃗_f||) )    (8)

where Θ_{i,f} is the orientation angle between the surface normal at point P_i and frame f, N⃗_i is the surface normal at the point P_i, and N⃗_f is the view direction of frame f. A canonical orientation occurs when the view direction N⃗_f satisfies N⃗_i · N⃗_f = 0. For consistency with the other evaluation metrics, which are all calculated in the FCS, the relative orientation is measured for all back-projected pixels γ_{i,f}. The average relative orientation θ_{i,f} between element i and camera f is used similarly to averaging an element's resolution per frame. Also, the mean of the top k_i (Equation (6)) orientation values of an element i is used to indicate the relative orientation of the element across all the frames in ℝ. The viewpoint orientation metric O is calculated as the average orientation of all visible elements (Equation (9)):

O = ( Σ_{i=1}^{z} μ_f^{k_i}(θ_{i,f}) ) / (V_C × z)    (9)

3.1.5 Stability of reconstruction

Three-dimensional reconstruction algorithms require the detection and matching of visual features across the frames of ℝ so that visual tracks are formed and points are triangulated in 3D. To ensure success, implementations of these algorithms recommend a minimum number of features Λ to be detected per visual frame, for example, Λ = 20 in Golparvar-Fard et al. (2009). In the presented method, these visual features are extracted from the a priori model Γ. Because these features are simulated, it is difficult, if not impossible, to precisely predict the actual position (P_u) of each visual feature u in the reality data before data collection. In the worst-case scenario, visual features exist only at locations with a high probability of generating robust feature descriptors. These locations are typically around corners of elements and along highly textured surfaces. Thus, the features are sampled from the simulated model at the corners and at highly textured meshes with a sampling rate of 1 feature per m² to measure the 3D reconstruction's stability in the most conservative state.

Once the features are detected, they need to be matched across image pairs. Mathematically, epipolar geometry is used to model the transformations between image pairs, utilizing the corresponding inlier features to fit a fundamental matrix using a RANSAC loop (Szeliski, 2020). The fundamental matrix estimation per frame requires matching at least eight corresponding point pairs. Since the simulated features may not all be captured in the collected reality data, the necessary number of simulated feature pairs Λ is practically set higher than eight.

Finally, global optimization using bundle adjustment is applied to estimate the 3D position of visual features accurately. A robust global optimization process requires feature tracks between consecutive images in ℝ. Accordingly, an additional constraint is utilized to ensure each feature is visible in at least Y (e.g., 5) consecutive frames, where the frames in ℝ are spatially ordered along the data
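The per-element aggregation behind Equations (3), (6), and (7) can be sketched as follows. The data layout (a per-element, per-frame pixel count γ and mean SSD ρ), the threshold Ω = 50 pixels, the interpretation of V̄_C as coverage normalized by the elements visible at least once, and the treatment of the smallest SSD values as the "best" resolutions are assumptions for illustration, not the paper's implementation.

```python
def coverage_metrics(gamma, omega=50):
    """gamma[i][f]: back-projected pixel count of element i in frame f.
    Returns (V_C, V̄_C): coverage over all z elements, and coverage over
    only the elements visible in at least one frame (Equation (3))."""
    z = len(gamma)
    covered = sum(1 for row in gamma if any(g >= omega for g in row))
    visible = sum(1 for row in gamma if any(g >= 1 for g in row))
    return covered / z, (covered / visible if visible else 0.0)

def back_projected_resolution(gamma, rho, k=3, omega=50):
    """Equations (6) and (7): average, over covered elements, of each
    element's top-k_i mean SSD; rho[i][f] is the mean SSD of element i
    in frame f."""
    means = []
    for g_row, r_row in zip(gamma, rho):
        frames = [f for f, g in enumerate(g_row) if g >= omega]
        if not frames:
            continue  # element not covered: excluded from R
        k_i = min(k, len(frames))                      # Equation (6)
        best = sorted(r_row[f] for f in frames)[:k_i]  # smallest SSD = finest
        means.append(sum(best) / k_i)
    return sum(means) / len(means) if means else float("inf")
```

The viewpoint orientation metric O of Equation (9) follows the same top-k_i averaging pattern, with the per-frame orientation θ in place of the SSD ρ.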
ALGORITHM 1 Simulating stability of a 3D reconstruction

camera-based metrics are used to color-code BIM or point cloud/mesh models to provide visual feedback for users to improve and optimize their capture plans.
FIGURE 4 An example of a stable 3D reality capture plan, which has resulted in a complete reconstructed point cloud
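Algorithm 1 itself is not reproduced in this excerpt, but the two stability constraints described in Section 3.1.5 (at least Λ features detectable per frame, and each feature visible in at least Y consecutive frames) can be checked with a sketch like the following. The boolean visibility-matrix layout and the function name are assumptions, not the paper's code.

```python
def stability_ok(feature_visibility, lam=20, y=5):
    """feature_visibility[u][f] is True when simulated feature u is
    unoccluded in frame f; frames are assumed spatially ordered."""
    n_frames = len(feature_visibility[0])
    # constraint 1: at least lam detectable features per frame
    for f in range(n_frames):
        if sum(vis[f] for vis in feature_visibility) < lam:
            return False
    # constraint 2: each feature visible in >= y consecutive frames
    for vis in feature_visibility:
        run = best = 0
        for v in vis:
            run = run + 1 if v else 0
            best = max(best, run)
        if best < y:
            return False
    return True
```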
FIGURE 6 Visualizing 4D BIM and reality model in the developed web-based application
on its depth, encoded using RGB colors; (3) orientation pipeline ⊳_O that renders the relative orientation of each back-projected pixel p_{i,f} with respect to frame f, encoded using RGB colors; and (4) feature pipeline ⊳_F that renders each simulated feature u with a unique color as well.

The color coding of the visibility and feature rendering pipelines follows the equation I_i or I_u = 256²R + 256G + B, where I_i is the index of the element i ∈ Γ, I_u is similarly the index of the feature u ∈ Γ, and R, G, and B are the red, green, and blue color channels, respectively. This color-coding strategy encodes indices with over 16.7 million values, sufficient for the simulated structure. For simulating visual features, occlusion of the features by the model's elements is accounted for using the feature rendering pipeline ⊳_F, which renders the structure Γ using the background color to hide occluded features. Additionally, the features are translated

The overall simulation and evaluation are executed according to Algorithm 2. Applying each rendering pipeline at frame f results in a new frame; thus, the four rendering pipelines result in f_V, f_R, f_O, and f_F. The rendering clear color is set to black, which has a decoded index equal to zero, to remove the background during evaluation.

3.4 Visual quality analysis and feedback

While visual quality metrics are useful to compare data collection plans, they do not provide spatial feedback on locations linked with low visual quality, which is vital for optimizing reality plans. Thus, each metric is calculated per element i ∈ Γ, and visual feedback is provided using
TABLE 1 Experimental setup

ID    Project description                        Simulated systems  Progress state  Capture modality  Device  # of frames
P1    Five-storey institutional building         S,A                Completed       Outdoors          UAV     1924
P2    Four-storey commercial building            S,A                Completed       Outdoors          UAV     366
P3    Five-storey residential building           S,A                Completed       Integrated        360°    183
P4    Two mechanical rooms and a facade          S,A                In progress     Outdoors          360°    282
P5    30-storey high-rise commercial building    S,A                In progress     Outdoors          UAV     153
P6    One floor of a commercial building         S,A,M,P            In progress     Indoors           360°    268
P7    A warehouse and a connected one-storey office building captured at different dates
P7:1                                             S                  In progress     Outdoors          UAV     289
P7:2                                             S,A                In progress     Outdoors          UAV     349
P7:3                                             S,A                In progress     Outdoors          UAV     369
P7:4                                             S,A                In progress     Indoors           360°    508
P7:5                                             S,A,M              In progress     Indoors           360°    354
P7:6                                             S,A,M,P            In progress     Integrated        360°    1251
P7:7                                             S,A,M,P            In progress     Outdoors          UAV     503

Note: S: structural, A: architectural, M: mechanical, P: plumbing.
FIGURE 8 Sample reality capture plans for (a) UAV and (b) ground rover in outdoor and indoor environments
TABLE 2 Evaluation results

Data set ID  V_C (%)  V̄_C (%)  V_R (#)  R (m)  O (°)  T (#)  A (m)  C_e (%)  C_v (%)  C_t (%)  Time (min)
P1 36.42 50.86 20.42 0.011 17.11 479.37 0.057 26.11 92.51 38.05 57.0
P2 48.84 96.34 35.77 0.007 24.61 499.83 0.076 88.94 98.03 49.11 4.8
P3 13.95 25.35 13.93 0.031 39.62 3326.11 0.067 30.3 94.27 29.29 13.8
P4 55.08 60.75 34.28 0.03 24.67 1765.8 0.143 40.31 94.37 56.78 9.6
P5 1.56 2.81 14.87 0.023 53.76 805.62 0.302 8.44 87.5 8.34 2.9
P6 3.64 11.38 23.34 0.013 39.42 1093.11 0.03 28.1 79.21 8.67 20.7
P7:1 91.86 92.94 24.99 0.007 29.75 209.71 0.072 18.96 93.85 70.93 3.8
P7:2 25.32 30.29 14.39 0.007 29.49 575.38 0.04 7.19 82.77 12.64 4.6
P7:3 7.81 31.29 12.41 0.012 42.77 116.42 0.03 1.98 87.5 2.86 4.1
P7:4 2.76 5.06 16.1 0.044 35.76 2400.47 0.1 53.46 94.36 30.44 27.5
P7:5 3.25 7.43 20.78 0.044 27.48 4597.41 0.106 45.55 91.66 31.70 36.7
P7:6 7.61 10.62 20.41 0.015 30.17 1012.56 0.04 27.35 85.58 36.68 49.7
P7:7 4.29 39.24 15.31 0.009 43.4 84.11 0.064 37.37 84.36 5.40 9.8
values between BIM and reality meshes (see Equation (14)). The relative weight ω_{i,f} for each pixel p_{i,f} is set to the redundant observation value for the BIM element detected at the pixel, which reflects the probability of correct reconstruction of an element:

times are reported. Figures 9 and 10 show examples of the visual feedback for outdoor data set P1 and indoor data set P7:5. For more results, a demonstration video can be accessed via https://vimeo.com/477370145.
FIGURE 9 Visual quality feedback on outdoor capture plan P1: (a) visual coverage and redundant visibility, (b) elements' resolution, (c) viewpoint orientation, and (d) stability of reconstruction
FIGURE 10 Visual quality feedback on indoor reality plan P7:5: (a) visual coverage and redundant visibility, (b) elements' resolution, (c) viewpoint orientation, and (d) stability of reconstruction
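Equation (14) is not reproduced in this excerpt, but the weighting idea it describes (averaging per-pixel depth differences between BIM and reality meshes, with each pixel weighted by its redundant-observation value ω) can be sketched as follows; the flat list layout and the function name are assumptions:

```python
def weighted_depth_difference(depth_diffs, weights):
    """Weighted average of per-pixel depth differences, where each
    weight is the redundant-observation value of the BIM element
    detected at that pixel."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(w * d for d, w in zip(depth_diffs, weights)) / total
```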
evaluating reality plans is vital for the proposed system's feasibility, where benchmarking and optimizing reality capture plans before their execution require generating visual quality feedback promptly (within a few minutes).

6 CONCLUSION AND FUTURE WORK

This work demonstrated the importance of utilizing five visual quality metrics simultaneously to assess construction reality capture plans. The proposed quality metrics and their calculation methods were effective in providing feedback, within a few minutes, on the reconstructed models' quality during reality capture planning. Moreover, the work showed that using a 4D a priori model is essential for evaluating reality plans for construction monitoring and asset inspection tasks.

Results from 13 reality plans, created for seven construction projects, showed a significant correlation between the proposed metrics and the completeness and accuracy of reconstructed reality models. It was found that the expected visual coverage and redundant visibility of elements in the reality plan are indicative of the overall completeness of reconstructed models, while the reconstruction's accuracy depends on the resolution of elements in the data frames, measured in terms of SSD, and on the viewpoints' orientation to the structure's topology. Additionally, the visual coverage and redundant visibility metrics reflect the expected completeness of reconstruction better than the reconstruction's stability metric. More interestingly, the experiments showed that setting camera trajectories to canonical views, which was believed to improve reconstruction quality, actually leads to lower completeness. Thus, a combination of canonical and noncanonical views is recommended. It is important to note that more complex structures and building systems require comprehensive 3D reality plans; for example, the 2D lawn-mowing pattern used in P5 results in low visual coverage (below 2%) and an overall completeness value of ∼8%. Since 360° cameras have low pixel resolution, it is suggested to place 360° waypoints close to the structure's elements to improve SSD and visual coverage.

Finally, the feasibility of a client-server architecture for deploying the developed web-based system relies on the performance of data visualization and storage using efficient data structures. It was shown that leveraging the GPU's power for visualization, simulation, and processing tasks can promptly provide feedback (∼2.3 s per frame) and support interactive optimization of reality capture plans.

While the presented methods provide feedback on the quality of reality plans, creating an optimal reality plan still relies on user-defined parameters and manual modifications. Besides, the iterative evaluation and modification processes are tedious. Such processes have to be repeated per data collection date, where changes in the structure, represented through the 4D a priori model, require altering the reality plan. Future work will focus on automating the creation and optimization of reality capture plans offline using the five developed metrics and the operational requirements. Moreover, this work does not consider localization errors during plans' execution, which present a challenge to collecting the data accurately.

ACKNOWLEDGMENTS
The authors would like to acknowledge the financial support of National Science Foundation (NSF) Grants 1446765 and 1544999. The authors also appreciate the support of Reconstruct Inc. and all other construction companies who offered the real-world project data. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the authors. They do not necessarily reflect the view of the NSF, industry partners, or professionals mentioned above.

REFERENCES
Anil, E. B., Tang, P., Akinci, B., & Huber, D. (2013). Deviation analysis method for the assessment of the quality of the as-is Building Information Models generated from point cloud data. Automation in Construction, 35, 507–516. https://doi.org/10.1016/j.autcon.2013.06.003
Asadi, K., Kalkunte Suresh, A., Ender, A., Gotad, S., Maniyar, S., Anand, S., Noghabaei, M., Han, K., Lobaton, E., & Wu, T. (2020). An integrated UGV-UAV system for construction site data collection. Automation in Construction, 112, 103068. https://doi.org/10.1016/j.autcon.2019.103068
Autodesk Forge. (2020). https://forge.autodesk.com/api/model-derivative-cover-page/
Baik, H., & Valenzuela, J. (2019). Unmanned aircraft system path planning for visually inspecting electric transmission towers. Journal of Intelligent and Robotic Systems: Theory and Applications, 95(3-4), 1097–1111. https://doi.org/10.1007/s10846-018-0947-9
Chen, M., Koc, E., Shi, Z., & Soibelman, L. (2018). Proactive 2D model-based scan planning for existing buildings. Automation in Construction, 93, 165–177. https://doi.org/10.1016/j.autcon.2018.05.010
Daftry, S., Hoppe, C., & Bischof, H. (2015). Building with drones: Accurate 3D facade reconstruction using MAVs. IEEE International Conference on Robotics and Automation (ICRA), Seattle, WA, USA (pp. 3487–3494). https://doi.org/10.1109/ICRA.2015.7139681
Degol, J., Golparvar-Fard, M., & Hoiem, D. (2016). Geometry-informed material recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA (pp. 1554–1562). https://doi.org/10.1109/CVPR.2016.172
Degol, J., Lee, J. Y., Kataria, R., Yuan, D., Bretl, T., & Hoiem, D. (2018). FEATS: Synthetic feature tracks for structure from motion evaluation. International Conference on 3D Vision, 3DV 2018, Verona, Italy (pp. 352–361). IEEE. https://doi.org/10.1109/3DV.2018.00048
Galceran, E., & Carreras, M. (2013). A survey on coverage path planning for robotics. Robotics and Autonomous Systems, 61(12), 1258–1276. https://doi.org/10.1016/j.robot.2013.09.004
Golparvar-Fard, M., Peña-Mora, F., Arboleda, C. A., & Lee, S. (2009).
Construction, 106, 102918. https://doi.org/10.1016/j.autcon.2019.102918
Kopsida, M., Brilakis, I., & Vela, P. A. (2015). A review of automated construction progress monitoring and inspection methods. In 32nd CIB W78 Conference 2015 (pp. 421–431). Eindhoven, The Netherlands. http://itc.scix.net/data/works/att/
Visualization of construction progress monitoring with 4D sim- w78-2015-paper-044.pdf
ulation model overlaid on time-lapsed photographs. Journal of Lin, W. Y. (2020). Automatic generation of high-accuracy stair
Computing in Civil Engineering, 23(6), 391–404. https://doi.org/10. paths for straight, spiral, and winder stairs using IFC-based mod-
1061/(ASCE)0887-3801(2009)23:6(391) els. ISPRS International Journal of Geo-Information, 9(4), 22–26.
Ham, Y., Han, K. K., Lin, J. J., & Golparvar-Fard, M. (2016). https://doi.org/10.3390/ijgi9040215
Visual monitoring of civil infrastructure systems via camera- Lin, J. J., & Golparvar-fard, M. (2016). Web-based 4D visual produc-
equipped unmanned aerial vehicles (UAVs): A review of related tion models for decentralized work tracking and information com-
works. Visualization in Engineering, 4(1), 1. https://doi.org/10. munication on construction sites. Construction Research Congress
1186/s40327-015-0029-z 2016, San Juan, Puerto Rico (pp. 1731–1741). https://doi.org/10.1061/
Hamledari, H., McCabe, B., & Davari, S. (2017). Automated com- 9780784479827.203
puter vision-based detection of components of under-construction Lin, Y. H., Liu, Y. S., Gao, G., Han, X. G., Lai, C. Y., & Gu, M. (2013).
indoor partitions. Automation in Construction, 74, 78–94. https: The IFC-based path planning for 3D indoor spaces. Advanced
//doi.org/10.1016/j.autcon.2016.11.009 Engineering Informatics, 27(2), 189–205. https://doi.org/10.1016/j.
Han, K. K., & Golparvar-Fard, M. (2015). Appearance-based mate- aei.2012.10.001
rial classification for monitoring of operation-level construction Lindner, S., Garbe, C., & Mombaur, K. (2019). Optimization based
progress using 4D BIM and site photologs. Automation in Con- multi-view coverage path planning for autonomous structure from
struction, 53, 44–57. https://doi.org/10.1016/j.autcon.2015.02.007 motion recordings. IEEE Robotics and Automation Letters, 4(4),
Han, K. K., & Golparvar-Fard, M. (2017). Potential of big visual data 3278–3285. https://doi.org/10.1109/LRA.2019.2926216
and building information modeling for construction performance Park, H. S., Lee, H. M., Adeli, H., & Lee, I. (2007). A new approach
analytics: An exploratory study. Automation in Construction, 73, for health monitoring of structures: Terrestrial laser scanning.
184–198. https://doi.org/10.1016/j.autcon.2016.11.004 Computer-Aided Civil and Infrastructure Engineering, 22(1), 19–30.
Hoiem, D. (2018). Maximize measurement accuracy with images over- https://doi.org/10.1111/j.1467-8667.2006.00466.x
laid on point clouds. 1–4. https://medium.com/reconstruct-inc/ Phung, M. D., Quach, C. H., Dinh, T. H., & Ha, Q. (2017). Enhanced
maximize-measurement-accuracy-with-images-overlaid- discrete particle swarm optimization path planning for UAV
on-point-clouds-dca828f4a539. vision-based surface inspection. Automation in Construction, 81,
Ibrahim, A., & Golparvar-Fard, M. (2019). 4D BIM based opti- 25–33. https://doi.org/10.1016/j.autcon.2017.04.013
mal flight planning for construction monitoring applications Rebolj, D., Pucko, Z., Babic, N. C., Bizjak, M., & Mongus, D. (2017).
using camera-equipped UAVs. In Computing in Civil Engineer- Point cloud quality requirements for Scan-vs-BIM based auto-
ing 2019, Atlanta, Georgia (pp. 217–224). https://doi.org/10.1061/ mated construction progress monitoring. Automation in Construc-
9780784482438.028 tion, 84, 323–334. https://doi.org/10.1016/j.autcon.2017.09.021
Ibrahim, A., Golparvar-Fard, M., Bretl, T., & El-Rayes, K. (2017). Reconstruct. (2020). https://www.reconstructinc.com/
Model-driven visual data capture on construction sites: Method Rodríguez-Gonzálvez, P., Rodríguez-Martín, M., Ramos, L. F., &
and metrics of success. In International Workshop for Computing González-Aguilera, D. (2017). 3D reconstruction methods and
in Civil Engineering (IWCCE 2017), Seattle, Washington (pp. 109– quality assessment for visual inspection of welds. Automation in
116). https://doi.org/10.1061/9780784480847.014 Construction, 79, 49–58. https://doi.org/10.1016/j.autcon.2017.03.
Ibrahim, A., Roberts, D., Golparvar-Fard, M., & Bretl, T. (2017). An 002
interactive model-driven path planning and data capture system Seitz, S. S. M., Curless, B., Diebel, J., Scharstein, D., & Szeliski, R.
for camera-equipped aerial robots on construction sites. Inter- (2006). A comparison and evaluation of multi-view stereo recon-
national Workshop for Computing in Civil Engineering (IWCCE struction algorithms. In IEEE Computer Society Conference on
2017), Seattle, Washington (pp. 117–124). https://doi.org/10.1061/ Computer Vision and Pattern Recognition (CVPR’06) (vol. 1, pp.
9780784480847.015 519–528). https://doi.org/10.1109/CVPR.2006.19
Javadnejad, F., Slocum, R. K., Gillins, D. T., Olsen, M. J., & Parrish, Szeliski, R. (2020). Computer vision: Algorithms and applications.
C. E. (2021). Dense point cloud quality factor as proxy for accuracy Springer. https://doi.org/10.1007/978-1-84882-935-0
assessment of image-based 3D reconstruction. Journal of Survey- Taneja, S., Akinci, B., Garrett, J. H., & Soibelman, L. (2016). Algo-
ing Engineering, 147(1), 04020021. https://doi.org/10.1061/(asce)su. rithms for automated generation of navigation models from
1943-5428.0000333 building information models to support indoor map-matching.
Kalyan, T. S., Zadeh, P. A., Staub-French, S., & Froese, T. M. (2016). Automation in Construction, 61, 24–41. https://doi.org/10.1016/j.
Construction quality assessment using 3D as-built models gen- autcon.2015.09.010
erated with project tango. Procedia Engineering, 145, 1416–1423. Tuttas, S., Braun, A., Borrmann, A., & Stilla, U. (2016). Evaluation of
https://doi.org/10.1016/j.proeng.2016.04.178 acquisition strategies for image-based construction site monitor-
Kim, P., Park, J., Cho, Y. K., & Kang, J. (2019). UAV-assisted ing. ISPRS - International Archives of the Photogrammetry, Remote
autonomous mobile robot navigation for as-is 3D data collec- Sensing and Spatial Information Sciences, 41, 733–740. https://doi.
tion and registration in cluttered environments. Automation in org/10.5194/isprsarchives-XLI-B5-733-2016
18 IBRAHIM et al.
Xu, Z., Kang, R., & Lu, R. (2020). 3D reconstruction and measure- Zhang, C., Kalasapudi, V. S., & Tang, P. (2016). Rapid data qual-
ment of surface defects in prefabricated elements using point ity oriented laser scan planning for dynamic construction envi-
clouds. Journal of Computing in Civil Engineering, 34(5), 04020033. ronments. Advanced Engineering Informatics, 30(2), 218–232.
https://doi.org/10.1061/(ASCE)CP.1943-5487.0000920 https://doi.org/10.1016/j.aei.2016.03.004
Yang, J., Park, M. W., Vela, P. A., & Golparvar-Fard, M. (2015). Con-
struction performance monitoring via still images, time-n=lapse
photos, and video streams: Now, tomorrow, and the future. How to cite this article: Ibrahim A,
Advanced Engineering Informatics, 29(2), 211–224. https://doi.org/ Golparvar-Fard M, El-Rayes K. Metrics and
10.1016/j.aei.2015.01.011 methods for evaluating model-driven reality
Zhang, C., & Arditi, D. (2013). Automated progress control using
capture plans. Comput Aided Civ Inf. 2021;1–18.
laser scanning technology. Automation in Construction, 36, 108–
116. https://doi.org/10.1016/j.autcon.2013.08.012
https://doi.org/10.1111/mice.12693