Automation in Construction
journal homepage: www.elsevier.com/locate/autcon
Review
Keywords: Computer vision; Construction; Science mapping; Review

Computer vision is transforming processes associated with the engineering and management of construction projects. It can enable the acquisition, processing, and analysis of digital images, and the extraction of high-dimensional data from the real world to produce information to improve managerial decision-making. To acquire an understanding of the developments and applications of computer vision research within the field of construction, we performed a detailed bibliometric and scientometric analysis of the normative literature from 2000 to 2018. We identified the primary areas where computer vision has been applied, including defect inspection, safety monitoring, and performance analysis. By performing a mapping exercise, a detailed analysis of the computer vision literature enables the identification of gaps in knowledge, which provides a platform to support future research in this fertile area for construction.
1. Introduction

Computer vision is an interdisciplinary scientific field that focuses on how computers can acquire a high-level understanding from digital images or videos [1]. It can be used to transform the tasks of engineering and management in construction by enabling the acquisition, processing, and analysis of digital images, and the extraction of high-dimensional data from the real world to produce information to improve decision-making. Furthermore, computer vision can provide practitioners with rich digital images and videos (e.g., the location and behavior of objects/entities, and site conditions) about a project's prevailing environment and therefore enable them to better manage the construction process [2].

Computer vision has been used to examine specific issues in construction such as tracking people's movement [3,4], progress monitoring [5], productivity analysis [6], health and safety monitoring [7,8], and postural ergonomic assessment [9]. In bringing together the developments and applications of computer vision research, we reviewed the normative literature to identify emerging trends and provide a roadmap for future lines of inquiry. We acknowledge that reviews of computer vision have been undertaken for specific problem domains such as defect detection and condition assessment [10] and safety [2], but they are subjective and therefore prone to bias [11].

To address such limitations, we conducted a science mapping analysis of the computer vision literature from 2000 to 2018 [12–14]. Our aim is to garner an understanding of the intellectual core of the computer vision domains in construction instead of focusing on individual and specific issues. In pursuing this line of inquiry, we unearthed the primary areas and topics of research that computer vision has focused upon, the gaps that restrict its application, the individuals and institutions that are at the forefront of its developments, the collaborations that take place, and the primary publication outlets. We believe the line of inquiry presented in this paper will not only advance our understanding of computer vision research in construction and engineering but also enhance the field's ability to explore how technological innovation can be effectively applied to address problems confronting everyday practice.

The paper commences by introducing and describing the research method we adopted to conduct our review (Section 2). We then discuss the results of our bibliometric and scientometric analysis (Section 3). Next, we categorize the areas of computer vision research and identify knowledge gaps that exist in the literature (Section 4). In the final section of the paper, we present our conclusions and limitations.

2. Research method

Our bibliometric review and scientometric analysis of the computer
⁎ Corresponding author.
E-mail address: haitao_w@hust.edu.cn (H. Wu).
https://doi.org/10.1016/j.autcon.2019.102919
Received 4 April 2019; Received in revised form 21 July 2019; Accepted 22 July 2019
Available online 30 July 2019
0926-5805/ © 2019 Elsevier B.V. All rights reserved.
B. Zhong, et al. Automation in Construction 107 (2019) 102919
[Fig. 1. Overview of the research process. Phase 1: data acquisition (paper retrieval from the WoS core collection, using the search string given in Section 2.1) and data processing (books and conference papers excluded). Phase 2: science mapping, comprising bibliometric analysis (author collaboration and co-citation, influential journals, keyword co-occurrence and keyword evolution) and scientometric analysis (document co-citation and cluster analysis). Phase 3: further discussion of research gaps of computer vision in the construction industry.]
vision literature utilized the Web of Science (WoS) database. We were drawn to the WoS as it contains the most comprehensive and influential journals within the field of construction and engineering management [13,15]. In this paper, the academic relationships and keywords within the domain of computer vision in construction are mapped, and the salient research themes are identified using cluster analysis and then further explored by utilizing a qualitative line of inquiry. Additionally, we discuss the gaps in knowledge that exist within the computer vision literature. We present an overview of the research process in Fig. 1.

2.1. Data acquisition

We applied the following search string within the WoS database to identify papers for our review, which commences in 2000 and finishes in 2018: (((“computer vision”) OR (((image*) OR (video*)) AND ((identify*) OR (detect*) OR (recognize*) OR (caption*)))) AND (“construction worker*” OR “construction project*” OR “civil engineering” OR “construction site*” OR “construction * management” OR “project management”)), where “*” denotes a fuzzy search. We identified a paper when the defined terms appeared in its title, keywords, or abstract. We also restricted our search to articles in peer-reviewed scholarly journals that had been published in English, as they are reputable and reliable sources [16]. A manual review of the search results was conducted to remove unrelated papers. We hasten to note the exclusion of articles from books, editorials, and conference papers. As a result of conducting our search, we identified a total of 216 journal articles that were suitable for analysis.

2.2. Bibliometric and scientometric analysis

After the acquisition and processing of the journal articles, the next step was to perform a quantitative analysis using bibliometric and scientometric techniques. The bibliometric analysis aimed to scientifically map and visualize the dataset [17] by examining themes such as authors, journals, and keywords. Contrastingly, scientometric techniques were used to quantitatively analyze and assess the content of publications [13]. There are two general scientometric approaches: (1) normative; and (2) descriptive [18].

The normative approach aims to create norms, rules, and heuristics to ensure progress within a particular field. Equally, the purpose of the descriptive approach is to observe and report on the actual activities within a field and, in particular, refer to its leading researchers. We
lead to the development of normative recommendations” [19]. CiteSpace, however, can enable knowledge domains to be systematically created using an array of graphs to visualize and analyze the literature [20]. Moreover, the developments and research progress, and their corresponding time nodes and emergent trends, were able to be identified [12].

[Figure: bar chart, y-axis labeled 'Number'; category labels not recoverable.]
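The boolean search string described in Section 2.1 can be approximated programmatically as a filter over bibliographic records. The sketch below is illustrative only: the sample records and helper names are invented, and the regular expressions merely imitate the WoS wildcard behavior, not its exact syntax.

```python
import re

def _any_prefix(text, prefixes):
    """True if any wildcard term (prefix*) appears as a word prefix in text."""
    return any(re.search(r"\b" + re.escape(p) + r"\w*", text) for p in prefixes)

def matches_query(title, abstract, keywords=()):
    """Sketch of the Section 2.1 WoS query applied to title/abstract/keywords."""
    text = " ".join([title, abstract, *keywords]).lower()
    # ("computer vision") OR ((image* OR video*) AND (identify* OR detect*
    # OR recognize* OR caption*))
    vision = "computer vision" in text or (
        _any_prefix(text, ["image", "video"])
        and _any_prefix(text, ["identify", "detect", "recognize", "caption"])
    )
    # AND ("construction worker*" OR "construction project*" OR ... )
    construction = (
        _any_prefix(text, ["construction worker", "construction project",
                           "construction site"])
        or "civil engineering" in text
        or "project management" in text
        or re.search(r"\bconstruction \w+ management", text)  # "construction * management"
    )
    return bool(vision and construction)
```

A record such as a paper titled "Automated detection of construction workers" would pass the filter, whereas a generic pedestrian-detection paper with no construction term would not.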
Table 1
Top ten collaboration nodes.

Author^a | Institution | Country | Frequency
[table body not recoverable]

^a Authors' surnames are presented first throughout our paper.
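The co-citation counting that underpins the author analysis below can be sketched by pairing the cited authors within each citing paper's reference list. The reference lists here are illustrative, not the study's data.

```python
from collections import Counter
from itertools import combinations

def cocitation_counts(reference_lists):
    """Count how often each pair of cited authors appears together in the
    same publication's reference list (author co-citation frequency)."""
    counts = Counter()
    for refs in reference_lists:
        # each pair of distinct cited authors in one paper is one co-citation
        for a, b in combinations(sorted(set(refs)), 2):
            counts[(a, b)] += 1
    return counts

# Illustrative reference lists (cited first authors per citing paper).
papers = [
    ["Golparvar-Fard", "Brilakis", "Teizer"],
    ["Golparvar-Fard", "Brilakis"],
    ["Brilakis", "Teizer"],
]
counts = cocitation_counts(papers)
```

In a tool such as CiteSpace, these pairwise counts become the weighted links of the co-citation network on which the node sizes and centralities are computed.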
those authors that are cited in the same publication. Setting the minimum number of co-citations at 20 in CiteSpace, a total of 20 out of the 334 nodes met the threshold. In Fig. 5, the node size expresses the number of co-citations for each author. The links between authors reflect the established indirect cooperative relationships based on citation frequency.

Table 2
Top 10 cited authors.

Author | Frequency | Country | Centrality
Golparvar-Fard Mani | 60 | USA | 0.09
Brilakis Ioannis K | 57 | UK | 0.22
Gong Jie | 52 | USA | 0.05
Teizer Jochen | 51 | Germany | 0.07
Park Manwoo | 48 | South Korea | 0.12
Lowe David | 39 | Canada | 0.04
Yang Jun | 38 | China | 0.15
Zhu Zhenhua | 33 | Canada | 0.24
Chi Seok-ho | 32 | South Korea | 0.08
Bosché Frederic | 31 | UK | 0.19

Table 2 identifies the top ten cited authors. The top ten most highly cited authors were from the USA (3), UK (2), Canada (2), South Korea (2), China (1), and Germany (1). Of these authors, Golparvar-Fard Mani, the director of the Real-Time and Automated Monitoring and Control laboratory at the University of Illinois at Urbana-Champaign, has focused on developing scientific solutions for automated and real-time performance monitoring. Conversely, Brilakis Ioannis K. has specialized in the field of automation and has used computer vision to examine areas such as
multimedia data analysis, classification, retrieval, and processing, and three-dimensional (3D) reconstruction.

As shown in Table 2, a highly co-cited author may not have a high betweenness centrality. However, when a node has both a high citation frequency and a high centrality, it indicates that the author has had a fundamental influence on the development of computer vision research. These authors were: (1) Brilakis Ioannis K. (frequency = 61, centrality = 0.22); and (2) Zhu Zhenhua (frequency = 46, centrality = 0.24). In the co-citation network, several computer scientists were also identified, such as Lowe David, who presented an algorithm that applies a priority search on hierarchical k-means trees to solve nearest neighbor matching in high-dimensional spaces [30].

3.3. Journals analysis

The identification of journals that have published leading scholarly works is an important source for researchers aiming to acquire insights into the latest developments and emerging trends within a field [11]. With this in mind, we analyzed the 216 articles to determine the main research outlets for computer vision research. Table 3 indicates that Automation in Construction is the leading scholarly journal for computer-vision based research with 67 articles (31%), followed by the ASCE Journal of Computing in Civil Engineering (35 articles), Advanced Engineering Informatics (33 articles), Computer-aided Civil and Infrastructure Engineering (7 articles), and Structural Control & Health Monitoring (7 articles).

Table 3
Main journal outlets.

Journal | Country | Count | Percentage
Automation in Construction | Netherlands | 67 | 31%
ASCE Journal of Computing in Civil Engineering | USA | 35 | 16%
Advanced Engineering Informatics | UK | 33 | 15%
Computer-aided Civil and Infrastructure Engineering | USA | 7 | 3%
Structural Control & Health Monitoring | USA | 7 | 3%

Additionally, the minimum threshold for citation was set to 40, and thus 12 nodes out of 233 met the requirement (Fig. 6). The node size refers to the co-citation frequency of each source journal. With respect to co-citation frequency, the top five most influential journals were Automation in Construction (frequency = 147), ASCE Journal of Computing in Civil Engineering (frequency = 134), Transactions on Pattern Analysis and Machine Intelligence (frequency = 129), Advanced Engineering Informatics (frequency = 98), and Computer-aided Civil and Infrastructure Engineering (frequency = 84). Markedly, four of these five journals were also among the most productive journals, which have made significant contributions to computer vision studies in the construction industry.

The nodes denoted by a purple ring indicate that these journals possess a high betweenness centrality, and therefore act as a medium whereby key academics share their intellectual outputs [31]. Within a journal's co-citation network, a high betweenness centrality indicates its interdisciplinary nature [32]. In Fig. 6, the nodes highlighted by purple rings possess high levels of betweenness centrality, for example, the ASCE Journal of Construction Engineering and Management (centrality = 0.25), Lecture Notes in Computer Science (centrality = 0.16), and Computer-aided Civil and Infrastructure Engineering (centrality = 0.15).

3.4. Keywords analysis

Keywords are representative and concise descriptions of a research article's content. They can also describe the existing research topics of a specific domain. In this paper, the keywords analysis contained two parts: (1) co-occurrence; and (2) evolution. Within the WoS database, there are two types of keywords used in the co-occurrence analysis: (1) 'author keywords'; and (2) 'keywords plus', which are identified by the journal. Both types of keywords from the 216 bibliographic records were used to generate the keyword co-occurrence network using CiteSpace, which aimed to identify the inter-closeness of research topics. Using word clouds, the keyword evolution analysis can reflect changes within a research topic over a period of time. Word clouds are graphical representations of keyword frequency that provide greater
prominence to words that appear more frequently [33]. In this paper, the author keywords were used to generate the word clouds.

3.4.1. Keyword co-occurrence

Using CiteSpace, we present the science mapping result of the co-occurring keywords in Fig. 7. When creating this map, we set the criterion to include only the keywords that co-occurred a minimum of three times. Some general keywords, however, were removed, such as 'computer vision', 'model', and 'construction'. The top ten keywords are displayed in Table 4. In the keyword co-occurrence network, we determined the node size by the frequency of words in the bibliometric record, and the links indicate the interrelatedness between a pair of keywords.

Table 4
The top 10 frequent keywords.

Keyword | Frequency | Average published year | Centrality
Image processing | 50 | 2010 | 0.24
Tracking | 38 | 2011 | 0.15
Action recognition | 29 | 2010 | 0.22
Construction worker | 20 | 2016 | 0.02
Equipment | 16 | 2015 | 0.08
Identification | 13 | 2011 | 0.01
Machine learning | 13 | 2014 | 0.03
Construction safety | 12 | 2012 | 0.08
Segmentation | 12 | 2014 | 0.05
Classification | 12 | 2014 | 0.09

Determining the co-occurrence relations can provide scholars with a means to identify research topics in a specific domain. For example, in Fig. 7, the keyword 'equipment' tended to occur with several others, such as 'tracking', 'construction worker', and 'action recognition'. The analysis revealed that equipment was mainly associated with tracking the location of equipment to improve safety or recognizing its presence. In highlighting this association, we refer to the work of Memarzadeh, et al. [34], who proposed a Histograms of Oriented Gradients (HOG) and Colors based algorithm to detect the presence of equipment and people on-site from video streams. The nodes with high centrality included 'image processing (centrality = 0.24)', 'action recognition (centrality = 0.22)', and 'tracking (centrality = 0.15)'. Thus, the three keywords with both high centrality and high frequency were: (1) 'image processing'; (2) 'action recognition'; and (3) 'tracking'. These keywords emphasize the focus of computer vision in the field of construction.

The co-occurrence network highlights a strong relationship between 'image processing' and 'video', 'crack detection', 'inspection', and 'tracking'. With advances in digital imaging enabled by cameras and videos, there has been an increase in their use to monitor and manage the process of construction. Furthermore, images captured from digital cameras have been used to inspect infrastructure for defects [35], track excavation activities [36], and analyze performance [6]. A pertinent example is the work of Yamaguchi et al. [37], who developed a crack model based on the concept of percolation. Similarly, Yu et al. [38] developed a crack detection method that was integrated with a mobile robot system to automate the inspection of concrete cracks in tunnels. In these studies, image-processing techniques, such as template matching, histogram transforms, background subtraction, filtering, and edge and boundary detection, are a prerequisite for computer vision-based defect detection in civil infrastructure [10]. Image-processing techniques are mathematical or statistical algorithms that change the visual appearance or geometric properties of an image. The algorithms enable several pictures to be automatically processed to identify and classify material clusters [39]. Color histograms, for example, can help determine similarity and dissimilarity between images. Other algorithms, such as Gaussian or wavelet filtering (for noise removal), Fourier analysis (low-pass filtering, phase reconstruction), wavelet decomposition, Laplacian and oriented pyramids, and Gabor filters, can assist in obtaining explicit texture from images. Additionally, image processing is prevalent in crack detection and maintenance since it can filter the noise of images and therefore provide additional detail that is invisible to humans. In sum, image processing can be used to recognize cracks in concrete and determine a structure's displacement.

'Tracking' referred to the use of digital cameras to track the location of resources on-site, such as workers, equipment, and materials.
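The frame-to-frame association at the heart of such vision-based tracking can be sketched with a toy nearest-neighbour tracker. This is a minimal stand-in for the cited trackers, with invented coordinates and threshold, not an implementation of any particular study's method.

```python
import math

class CentroidTracker:
    """Toy tracker: match each new detection to the nearest existing track
    centroid; unmatched detections start new track IDs."""
    def __init__(self, max_dist=50.0):
        self.max_dist = max_dist
        self.tracks = {}      # track id -> last known (x, y) centroid
        self.next_id = 0

    def update(self, detections):
        assigned = {}
        free = dict(self.tracks)          # tracks not yet claimed this frame
        for (x, y) in detections:
            # nearest unclaimed track, if it is close enough
            best = min(free, key=lambda t: math.dist(free[t], (x, y)),
                       default=None)
            if best is not None and math.dist(free[best], (x, y)) <= self.max_dist:
                assigned[best] = (x, y)
                del free[best]
            else:                         # otherwise start a new track
                assigned[self.next_id] = (x, y)
                self.next_id += 1
        self.tracks = assigned            # unmatched old tracks are dropped
        return assigned

tracker = CentroidTracker()
tracker.update([(0, 0), (100, 100)])      # two entities appear: IDs 0 and 1
ids = tracker.update([(5, 2), (98, 104)]) # both move slightly; IDs persist
```

Occlusions and crossing trajectories, noted below as open challenges, are exactly where such greedy nearest-neighbour association breaks down.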
Compared with sensor-based technologies (e.g., Radio Frequency Identification, Geographical Positioning Systems, and Zigbee), the use of computer vision has several advantages, as it can provide additional information (e.g., locations, geometrical information, and behaviors) and cover large areas. Furthermore, there is no need to attach sensors and receivers to project entities [40]. Research worthy of mention here is that of Park, et al. [41], who compared various two-dimensional (2D) vision trackers to determine the most appropriate one to follow resources on congested sites. In a similar vein, Brilakis, et al. [40] developed a vision-based tracking framework that aimed to determine the spatial location of entities on a site. Tracking multiple objects, however, remains a challenge due to their interacting trajectories, which can cause occlusions [27]. Moreover, determining the optimal camera location is also an issue, but when it is identified tracking accuracy can be significantly improved. For example, Zhang, et al. [42] proposed a creative approach to optimize camera locations whereby there is 100% site coverage; however, this approach is not generalizable and is restricted to a specific site context.

'Action recognition' usually co-occurred with 'safety', 'video', and 'tracking'. Action recognition techniques have been applied to construction as a means to extract the motion information that is necessary for automated safety monitoring. For example, a person is not in danger if the excavator is not operating, even if the distance between them and the excavator is small. Golparvar-Fard, et al. [3] proposed a method to recognize single actions of earthmoving equipment from site video streams based on a multiclass Support Vector Machine (SVM) classifier. The videos were represented as spatiotemporal visual features, which were described with a HOG and classified using an SVM.

3.4.2. Keyword evolution

Considering the limited number of articles published between 2000 and 2010, we divided the sample of publications for analysis into three distinct periods: (1) 2000 to 2010; (2) 2011 to 2014; and (3) 2015 to 2018. The upshot was that we could see how the vernacular of computer vision research has evolved with time (Fig. 8).

The word clouds provide a quantitative and visualized method to determine the evolution of research topics (Fig. 8). What is more, the term frequency-inverse document frequency (TF-IDF) algorithm was used to identify essential keywords in the corresponding stage. It provides each keyword with a weight based on the following two criteria: (1) the frequency of its usage in the specified document (TF); and (2) the rarity of its appearance in the other documents in the corpus (IDF) [43]. The keywords with high TF-IDF for the different stages are listed in Table 5. Some common keywords were removed, such as 'computer vision' and 'construction'.

We can see from Table 5 that 'image processing' was a contemporary keyword. This was, however, expected, as it is a fundamental component of computer vision. It also appeared from the findings displayed in Table 5 that as computer vision technologies have become more mature, there has been a subtle but distinct shift in research contents and methods.

Computer vision research has paid particular attention to safety monitoring and defect detection. Seo et al. [2], for example, has been particularly influential in the area of safety monitoring, as they classified computer vision-based approaches into three categories: (1) scene-based; (2) location-based; and (3) action-based risk identification. This work has set the scene for a series of studies that have aimed to improve safety management on-site [44,45]. Similarly, an influential computer vision study focusing on defect detection is Yousaf, et al. [46], where potholes in a road were detected using an SVM with an accuracy of 95.7%.

Early computer vision research tended to rely on the use of shallow machine learning, which used handcrafted features (Fig. 8b). Shallow machine learning methods contain two steps: (1) feature extraction and representation; and (2) classification. Feature extractors were used to extract features from images using descriptors such as HOG [47] and Histogram of Optical Flow (HOF) [96]. These features were inputted into a shallow machine learning classifier such as an SVM or k-Nearest Neighbor. For example, Park and Brilakis [47] used HOG and Haar-like features to detect individuals and equipment from images. Memarzadeh, et al. [34] developed a HOG and Color based detector to automatically detect individuals in videos. However, the use of hand-crafted features is a time-consuming and costly process.

Current research focuses on the use of deep learning methods, as they have the ability to obtain robust feature representations using end-to-end learning methods, such as convolutional neural networks (CNNs) [49], Fast R-CNN [50], and Mask R-CNN [51]. The emergence of CNNs has led to rapid developments in the areas of object detection, tracking, and recognition [52]. For example, Fang, et al. [49] applied a Fast R-CNN algorithm to detect entities from images and achieved an outstanding result with a high level of accuracy on-site (i.e., 91% and 95% for individuals and equipment, respectively).

3.5. Document analysis

We have used document co-citation and cluster analysis to objectively identify influential articles and research themes. We defined document co-citation as the frequency with which two documents are cited together within others [53]. It was used to reveal the authority of references cited by the selected papers. Cluster analysis is a knowledge discovery technique, which aims to determine the semantic themes hidden in textual data [54]. A large corpus of data was classified into different clusters according to their relative degree of correlation. Based on the document co-citation network, cluster analysis was used to underpin the intellectual structures of a scientific research domain and detect research themes for future research directions [12].
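The TF-IDF weighting described in Section 3.4.2 can be sketched as follows. The keyword lists below are illustrative, not the study's data; the weight of keyword w in document d is its term frequency multiplied by the log of inverse document frequency.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Weight each keyword in each document by term frequency times
    inverse document frequency: (count/len) * log(N / df)."""
    n = len(docs)
    df = Counter()                      # document frequency per keyword
    for doc in docs:
        df.update(set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({w: tf[w] / len(doc) * math.log(n / df[w]) for w in tf})
    return weights

# Illustrative keyword lists, one per publication period.
periods = [
    ["image processing", "tracking", "image processing"],
    ["deep learning", "tracking"],
    ["deep learning", "crack detection"],
]
w = tf_idf(periods)
```

A keyword frequent in one period but rare in the others (e.g., "image processing" in the first list) receives a high weight there, which is what makes TF-IDF suitable for picking out the distinctive keywords of each stage.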
Table 5
Top five TF-IDF keywords.

Period | 2000 to 2010 | 2011 to 2014 | 2015 to 2018
[table body not recoverable]
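The betweenness centrality values reported throughout Sections 3.2 to 3.4 can be computed with Brandes' algorithm. The sketch below runs it on a toy keyword graph (the graph and its nodes are invented for illustration); in the toy path graph, the two middle nodes bridge all shortest paths and therefore get the non-zero scores.

```python
from collections import deque

def betweenness(graph):
    """Brandes' algorithm for betweenness centrality on an unweighted,
    undirected graph given as {node: [neighbors]}."""
    bc = {v: 0.0 for v in graph}
    for s in graph:
        stack, preds = [], {v: [] for v in graph}
        sigma = {v: 0 for v in graph}; sigma[s] = 1   # shortest-path counts
        dist = {v: -1 for v in graph}; dist[s] = 0
        queue = deque([s])
        while queue:                                   # BFS from source s
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        while stack:                                   # back-propagate dependencies
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    for v in bc:
        bc[v] /= 2.0    # undirected: each path counted from both endpoints
    return bc

# Toy keyword graph: a path crack detection - image processing - tracking - equipment.
g = {
    "image processing": ["crack detection", "tracking"],
    "crack detection": ["image processing"],
    "tracking": ["image processing", "equipment"],
    "equipment": ["tracking"],
}
bc = betweenness(g)
```

Nodes with high betweenness sit on many shortest paths between other nodes, which is why CiteSpace flags them (the purple rings of Section 3.3) as bridges between research areas.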
3.5.1. Document co-citation

The document co-citation network is presented in Fig. 9. Setting the minimum number of document co-citations at 20 in CiteSpace, 9 out of a total of 288 nodes met the threshold. Each node represents a document and is labeled with the first author's name and the publication year. Each link provides the co-citation relationship between two documents, while the size of a node represents its co-citation frequency. We present the top-cited documents in Table 6.

Table 6
Highly cited articles.

Title | Frequency | Centrality | Reference
Construction worker detection in video frames for initializing vision trackers | 27 | 0.03 | [47]
Automated vision tracking of project related entities | 26 | 0.04 | [40]
Automated Object Identification Using Optical Video Cameras on Construction Sites | 24 | 0.10 | [55]
Automated 2D detection of construction equipment and workers from site video streams using histograms of oriented gradients and colors | 23 | 0.03 | [34]
Computer Vision-Based Video Interpretation Model for Automated Productivity Analysis of Construction Operations | 22 | 0.10 | [6]
An object recognition, tracking, and contextual reasoning-based video interpretation method for rapid productivity analysis of construction operations | 22 | 0.06 | [56]
Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers | 21 | 0.02 | [3]
Computer vision techniques for construction safety and health monitoring | 21 | 0.02 | [2]
Automated Visual Recognition of Dump Trucks in Construction Videos | 19 | 0.06 | [57]

After a manual review of these highly cited articles, we observed three common usages for computer vision: (1) entity detection [47,55,34]; (2) tracking [40,56]; and (3) action recognition [3,57]. For example, Chi and Caldas [55] developed an approach to detect on-site objects in real-time on heavy-equipment-intensive construction sites. However, it lacks information regarding how to distinguish a worker from a pedestrian, which is a serious problem when a construction site is located in a residential area. This problem was solved by Park and Brilakis [47], who used background subtraction, the HOG, and the HSV color histogram to detect construction workers by extracting three features from images: motion, shape, and color.

Some computer vision-supported applications can be applied in practice, such as safety monitoring [2] and productivity analysis [6]. For example, Gong and Caldas [56] developed a prototype system to automatically measure productivity from videos. Seo, et al. [2] presented a systematic review of computer vision-based safety monitoring, in which object detection, tracking, and recognition techniques were discussed in detail. Additionally, the works of Chi and Caldas [55] and Gong and Caldas [6] have a high centrality of 0.10, which suggests they have been a source of reference for a vast number of studies.

3.5.2. Cluster analysis

In Fig. 10, a total of 13 clusters were identified through the log-likelihood ratio (LLR) algorithm, which is embedded in CiteSpace [31]. A label for each identified cluster was generated by the LLR algorithm, using keywords derived from the cited documents, to reveal its main research content. The quality of cluster labeling depends on the variety,
breadth, and depth of the set of terms derived from keywords within articles [58]. Notably, a critical review was needed to interpret the clustering labels and summarize research themes based on the articles contained within each cluster. For instance, cluster 5 ('Automated defect detection') and cluster 12 ('Detecting sewer pipe defect') revealed the same research theme.

Table 7
Documents co-citation clusters.

Cluster ID | Size | Silhouette | Cluster label (LLR) | Alternative label | Mean year | Reference
[table body not recoverable]

In Table 7, we present detailed information about the clusters, which includes the silhouette value, cluster labels, alternative labels, and size (i.e., the number of members). The silhouette value, ranging from −1 to 1, indicates the uncertainty that we need to take into account when interpreting the nature of a cluster [58]. It reflects the average homogeneity of a cluster [59]. The value of 1 means a perfect
separation from other clusters. The clustering result has high reliability when the silhouette value of a cluster exceeds 0.7 [58]. In CiteSpace, the representative papers were identified based on the metric of 'coverage', which was calculated using Eq. (2):

Coverage = N / M, (0 < Coverage ≤ 1)    (2)

4. Qualitative interpretation

Computer vision research published up until 2018 has focused on object recognition and the tracking of people and plant on-site (e.g., clusters #2 and #6) and on the status and quality of infrastructure (e.g., clusters #5 and #12). Based on the results of the cluster analysis, we examined these themes in further detail with regard to people's behavior and site conditions (Table 8) and in the context of performance tracking at the project and operational monitoring levels (Table 9).

Table 8
Prior works on computer vision-based safety monitoring.
[Rotated table; body not fully recoverable. Recoverable fragments: safety monitoring of spatial conflicts, falling when climbing and dismounting from a ladder, cross structural support (e.g., concrete and …), and mobile crane lifting operations; descriptions include a Mask Region Based CNN used to recognize the relationship between people …, a computer vision technique applied to extract spatial information for further fuzzy …, and a framework for real-time pro-active safety assistance for mobile crane lifting operations; types of data include video and laser scanner/sensor data; references [44], [69], [51], [70], [71], [72].]
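The handcrafted-feature pipeline described in Section 3.4.2 (a descriptor such as HOG, followed by a classifier such as an SVM) can be sketched with a toy orientation-histogram descriptor and a nearest-centroid classifier standing in for the SVM. The patterns below are tiny synthetic arrays, not real site imagery, and the descriptor is a whole-image simplification of HOG's cell-and-block scheme.

```python
import math

def orientation_histogram(img, bins=9):
    """Toy HOG-like descriptor: histogram of gradient orientations,
    weighted by gradient magnitude, over the whole image."""
    h = [0.0] * bins
    rows, cols = len(img), len(img[0])
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = img[y][x + 1] - img[y][x - 1]      # central differences
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            ang = math.atan2(gy, gx) % math.pi      # unsigned orientation
            h[min(int(ang / math.pi * bins), bins - 1)] += mag
    total = sum(h) or 1.0
    return [v / total for v in h]                   # L1-normalized

def nearest_centroid(feat, centroids):
    """Classify by smallest Euclidean distance to a labeled class centroid."""
    return min(centroids, key=lambda c: math.dist(feat, centroids[c]))

# Synthetic 8x8 patterns: vertical stripes (strong horizontal gradients)
# vs horizontal stripes (strong vertical gradients).
vert = [[(x // 2 % 2) * 10 for x in range(8)] for _ in range(8)]
horz = [[(y // 2 % 2) * 10 for _ in range(8)] for y in range(8)]
centroids = {"vertical": orientation_histogram(vert),
             "horizontal": orientation_histogram(horz)}
label = nearest_centroid(orientation_histogram(vert), centroids)
```

In the cited studies this role is played by HOG/HOF descriptors feeding an SVM or k-Nearest Neighbor classifier; the sketch only illustrates why edge orientation statistics separate visually distinct classes.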
[Table 9 (rotated): computer vision-based performance monitoring; body not fully recoverable. Recoverable fragments: performance monitoring at the project and operational levels, covering appearance-based progress assessment, occupancy-based progress detection, operation analysis of construction workers, and tracking progress on construction sites; types of data include 2D images, time-lapse images, and video streams; references [74], [75], [76], [56].]

[Text fragments; continuity lost:] … the productivity from videos by analyzing the pose of people while they were tying rebar, with results of 85% to 89% accuracy. … have been several studies that have focused on the automatic generation … measure their effect (e.g., depth, width, and length) [10]. It has therefore been vital that journal editors require all papers that are accepted for publication to provide a copy of the training sets and even the datasets used for a particular study. Though, privacy laws can prevent this from occurring. In the meantime, researchers reliant on using small databases will … countries, and prevailing privacy laws may prevent the sharing of data.
[82]
[83]
[84]
… data collection (e.g., 2D images, time-lapse images, and videos); and (2) analysis. In the case of data collection, the positioning and orientation …

… method for image classification and object detection due to their ability to automatically extract features [44,89]. While deep learning is being widely used, there are several technical challenges that confront its use in practice. First, deep learning can only learn from the correlation between input and output and is not able to determine causality. In the case of safety monitoring, for example, not only is there a need to …

… from the images by computer vision algorithms and the high-level semantic meaning that people recognize from an image [90]. As a result of this semantic gap, developments in automated computer vision may be stymied. In the case of hazard identification, for example, not only …

… knowledge with explicit and rich semantics, which can enable knowledge query and reasoning to be performed [93]. A framework combining ontology and computer vision techniques …:

• … model where different rules can be encoded for the specific application, such as hazard reasoning and defect identification;
• spatial relationships between objects, which can be automatically detected using computer vision algorithms [51]; and
• …

Table 10
[Defect detection of infrastructures (bridges, tunnels, underground pipes, and asphalt pavements). Methods include image processing algorithms (e.g., histogram-based classification algorithms and support vector machines (SVM)), image capture, and pattern recognition approaches (segmentation, template …); the type of data is 2D images [82–84].]
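The framework sketched in this section — encoding application-specific rules over objects and spatial relationships that vision algorithms detect — can be illustrated with a few lines of code. The detection format, the hazard rules, and the pixel thresholds below are hypothetical examples chosen for illustration; a real system would draw its rules from an ontology such as that of Xing et al. [92] and its detections from a trained model.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    """One object a vision pipeline might report: a class label and a box."""
    label: str
    box: tuple  # (x, y, width, height), axis-aligned, in pixels

def horizontal_distance(a: Detection, b: Detection) -> float:
    """Gap between two boxes along x (0 if they overlap horizontally)."""
    ax, _, aw, _ = a.box
    bx, _, bw, _ = b.box
    return max(ax - (bx + bw), bx - (ax + aw), 0)

# Each rule pairs two labels with a spatial predicate and a hazard message.
RULES = [
    ("worker", "excavator", lambda a, b: horizontal_distance(a, b) < 50,
     "worker within 50 px of excavator"),
    ("worker", "edge", lambda a, b: horizontal_distance(a, b) < 20,
     "worker near unprotected edge"),
]

def reason(detections: list) -> list:
    """Apply every rule to every matching pair of detections."""
    hazards = []
    for label_a, label_b, predicate, message in RULES:
        for a in (d for d in detections if d.label == label_a):
            for b in (d for d in detections if d.label == label_b):
                if predicate(a, b):
                    hazards.append(message)
    return hazards

scene = [Detection("worker", (100, 50, 30, 60)),
         Detection("excavator", (160, 40, 120, 90))]
print(reason(scene))  # the 30 px gap triggers the first rule
```

Separating the rules from the detector is the point of the framework: the vision layer only reports objects and geometry, while the domain knowledge — what counts as a hazard or a defect — lives in an editable, queryable rule base.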
… can be further improved.

5. Conclusion

Computer vision has attracted an increasing amount of attention from researchers and practitioners. We have undertaken a detailed review of the extant literature using a science mapping approach to: (1) determine the collaboration of authors and institutions, and indicate the influential journals; (2) identify the keywords used within the domain of computer vision; (3) identify highly cited articles and create different clusters with labels to objectively reflect the emerging research themes; and (4) categorize the main research themes so that gaps in knowledge can be identified.

We identified three research communities and leading figureheads in the field of computer vision in construction: (1) Brilakis Ioannis K., (2) Li Heng, and (3) Ding Lieyun. The most productive authors were Brilakis Ioannis K., Li Heng, and Park Man-Woo. Our analysis revealed that most computer vision based articles emanated from the Georgia Institute of Technology and Hong Kong Polytechnic University.

While considerable headway has been made to apply computer vision to construction in areas such as safety monitoring, productivity analysis, and determining structural defects, several technical challenges have hindered its development. We have observed that there is a paucity of adequately sized databases for training, and issues associated with data privacy can prevent them from being shared. Juxtaposed with the technical issues that we have identified (e.g., an inability to identify causality and the need for new algorithms that can simultaneously identify several tasks), the field of computer vision in construction remains a fertile line of inquiry.

Despite the contribution of this paper, the findings still have limitations. First, two metrics provided by CiteSpace were used to find the key nodes in each network: betweenness centrality and burst strength. More metrics should be considered in future studies, such as the node degree, eigenvector centrality, clustering coefficient, and average neighbor degree [95]. Second, the literature sample was limited to English journal articles, which might exclude studies that have been published in other languages.

Acknowledgments

This research is partly supported by the National Natural Science Foundation of China (No. 51878311, No. 71732001, No. 71821001).

References

[1] M. Sonka, V. Hlavac, R. Boyle, Image Processing, Analysis and Machine Vision, 3rd ed., Springer, Boston, 2008, https://doi.org/10.1007/978-1-4899-3216-7.
[2] J. Seo, S. Han, S. Lee, H. Kim, Computer vision techniques for construction safety and health monitoring, Adv. Eng. Inform. 29 (2) (2015) 239–251, https://doi.org/10.1016/j.aei.2015.02.001.
[3] M. Golparvar-Fard, A. Heydarian, J.C. Niebles, Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers, Adv. Eng. Inform. 27 (4) (2013) 652–663, https://doi.org/10.1016/j.aei.2013.09.001.
[4] M.W. Park, I. Brilakis, Continuous localization of construction workers via integration of detection and tracking, Autom. Constr. 72 (2016) 129–142, https://doi.org/10.1016/j.autcon.2016.08.039.
[5] K.K. Han, M. Golparvar-Fard, Appearance-based material classification for monitoring of operation-level construction progress using 4D BIM and site photologs, Autom. Constr. 53 (2015) 44–57, https://doi.org/10.1016/j.autcon.2015.02.007.
[6] J. Gong, C.H. Caldas, Computer vision-based video interpretation model for automated productivity analysis of construction operations, J. Comput. Civ. Eng. 24 (3) (2010) 252–263, https://doi.org/10.1061/(asce)cp.1943-5487.0000027.
[7] Q. Fang, H. Li, X.C. Luo, L.Y. Ding, H.B. Luo, C.Q. Li, Computer vision aided inspection on falling prevention measures for steeplejacks in an aerial environment, Autom. Constr. 93 (2018) 148–164, https://doi.org/10.1016/j.autcon.2018.05.022.
[8] B.E. Mneymneh, M. Abbas, H. Khoury, Evaluation of computer vision techniques for automated hardhat detection in indoor construction safety applications, Frontiers of Engineering Management 5 (2) (2018) 227–239, https://doi.org/10.15302/j-fem-2018071.
[9] J. Seo, K. Yin, S. Lee, Automated postural ergonomic assessment using a computer vision-based posture classification, Construction Research Congress (2016), https://doi.org/10.1061/9780784479827.082.
[10] C. Koch, K. Georgieva, V. Kasireddy, B. Akinci, P. Fieguth, A review on computer vision based defect detection and condition assessment of concrete and asphalt civil infrastructure, Adv. Eng. Inform. 29 (2) (2015) 196–210, https://doi.org/10.1016/j.aei.2015.01.008.
[11] M.R. Hosseini, I. Martek, E.K. Zavadskas, A.A. Aibinu, M. Arashpour, N. Chileshe, Critical evaluation of off-site construction research: a scientometric analysis, Autom. Constr. 87 (2018) 235–247, https://doi.org/10.1016/j.autcon.2017.12.002.
[12] J. Song, H. Zhang, W. Dong, A review of emerging trends in global PPP research: analysis and visualization, Scientometrics 107 (3) (2016) 1111–1147, https://doi.org/10.1007/s11192-016-1918-1.
[13] X. Zhao, A scientometric review of global BIM research: analysis and visualization, Autom. Constr. 80 (2017) 37–47, https://doi.org/10.1016/j.autcon.2017.04.002.
[14] R. Jin, S. Gao, A. Cheshmehzangi, E. Aboagye-Nimo, A holistic review of off-site construction literature published between 2008 and 2018, J. Clean. Prod. 202 (2018) 1202–1219, https://doi.org/10.1016/j.jclepro.2018.08.195.
[15] T.O. Olawumi, D.W.M. Chan, J.K.W. Wong, Evolution in the intellectual structure of BIM research: a bibliometric analysis, J. Civ. Eng. Manag. 23 (8) (2017) 1060–1081, https://doi.org/10.3846/13923730.2017.1374301.
[16] A.R. Ramos-Rodriguez, J. Ruiz-Navarro, Changes in the intellectual structure of strategic management research: a bibliometric study of the Strategic Management Journal, 1980–2000, Strateg. Manag. J. 25 (10) (2004) 981–1004, https://doi.org/10.1002/smj.397.
[17] L.B. De Rezende, P. Blackwell, M.D. Pessanha Gonçalves, Research focuses, trends, and major findings on project complexity: a bibliometric network analysis of 50 years of project complexity research, Proj. Manag. J. 49 (1) (2018) 42–56, https://doi.org/10.1177/875697281804900104.
[18] D. Neufeld, Y. Fang, S.L. Huff, The IS identity crisis, Commun. Assoc. Inf. Syst. 19 (2007), https://doi.org/10.17705/1cais.01919.
[19] A. Serenko, N. Bontis, L. Booker, K. Sadeddin, T. Hardie, A scientometric analysis of knowledge management and intellectual capital academic literature (1994–2008), J. Knowl. Manag. 14 (1) (2010) 3–23, https://doi.org/10.1108/13673271011015534.
[20] C. Chen, CiteSpace: A Practical Guide for Mapping Scientific Literature, Nova Science, Hauppauge, N.Y., 2016, https://doi.org/10.22201/iibi.24488321xe.2017.nesp1.57894.
[21] C. Chen, Science mapping: a systematic review of the literature, Journal of Data and Information Science 2 (2) (2017) 1–40, https://doi.org/10.1515/jdis-2017-0006.
[22] C. Chen, Z. Hu, S. Liu, H. Tseng, Emerging trends in regenerative medicine: a scientometric analysis in CiteSpace, Expert. Opin. Biol. Ther. 12 (5) (2012) 593–608, https://doi.org/10.1517/14712598.2012.674507.
[23] L.C. Freeman, A set of measures of centrality based on betweenness, Sociometry 40 (1) (1977) 35–41, https://doi.org/10.2307/3033543.
[24] J. Kleinberg, Bursty and hierarchical structure in streams, Data Min. Knowl. Disc. 7 (4) (2003) 373–397, https://doi.org/10.1145/775060.775061.
[25] H. Son, C. Kim, 3D structural component recognition and modeling method using color and 3D data for construction progress monitoring, Autom. Constr. 19 (7) (2010) 844–854, https://doi.org/10.1016/j.autcon.2010.03.003.
[26] Y.H. Wu, H. Kim, C. Kim, S.H. Han, Object recognition in construction-site images using 3D CAD-based filtering, J. Comput. Civ. Eng. 24 (1) (2010) 56–64, https://doi.org/10.1061/(asce)0887-3801(2010)24:1(56).
[27] J. Yang, O. Arif, P.A. Vela, J. Teizer, Z.K. Shi, Tracking multiple workers on construction sites using video cameras, Adv. Eng. Inform. 24 (4) (2010) 428–434, https://doi.org/10.1016/j.aei.2010.06.008.
[28] J.S. Katz, B.R. Martin, What is research collaboration? Res. Policy 26 (1) (1997) 1–18, https://doi.org/10.1016/S0048-7333(96)00917-1.
[29] A.E. Bayer, J.C. Smart, G.W. McLaughlin, Mapping intellectual structure of a scientific subfield through author cocitations, Journal of the American Society for Information Science 41 (6) (1990) 444, https://doi.org/10.1002/(sici)1097-4571(199009)41:6<444::aid-asi12>3.0.co;2-j.
[30] M. Muja, D. Lowe, Fast approximate nearest neighbors with automatic algorithm configuration, VISAPP International Conference on Computer Vision Theory and Applications 2 (2009) 331–340, https://doi.org/10.5220/0001787803310340.
[31] T.O. Olawumi, D. Chan, A scientometric review of global research on sustainability and sustainable development, J. Clean. Prod. 183 (2018) 231–250, https://doi.org/10.1016/j.jclepro.2018.02.162.
[32] L. Leydesdorff, Betweenness centrality as an indicator of the interdisciplinarity of scientific journals, J. Am. Soc. Inf. Sci. Technol. 58 (9) (2007) 1303–1319, https://doi.org/10.1002/asi.20614.
[33] R. Riggs, S. Hu, Disassembly liaison graphs inspired by word clouds, Procedia CIRP 7 (2013) 521–526, https://doi.org/10.1016/j.procir.2013.06.026.
[34] M. Memarzadeh, M. Golparvar-Fard, J. Niebles, Automated 2D detection of construction equipment and workers from site video streams using histograms of oriented gradients and colors, Autom. Constr. 32 (2013) 24–37, https://doi.org/10.1016/j.autcon.2012.12.002.
[35] N. Metni, T. Hamel, A UAV for bridge inspection: visual servoing control law with orientation limits, Autom. Constr. 17 (1) (2007) 3–10, https://doi.org/10.1016/j.autcon.2006.12.010.
[36] C.A. Quinones-Rozo, Y.M.A. Hashash, L.Y. Liu, Digital image reasoning for tracking of excavation activities, Autom. Constr. 17 (5) (2008) 608–622, https://doi.org/10.1016/j.autcon.2007.10.008.
[37] T. Yamaguchi, S. Hashimoto, Fast crack detection method for large-size concrete surface images using percolation-based image processing, Mach. Vis. Appl. 21 (5)
[92] X. Xing, B. Zhong, H. Luo, H. Li, H. Wu, Ontology for safety risk identification in metro construction, Comput. Ind. 109 (2019) 14–30, https://doi.org/10.1016/j.compind.2019.04.001.
[93] C.J. Anumba, R.R. Issa, J. Pan, I. Mutis, Ontology-based information and knowledge management in construction, Constr. Innov. 8 (3) (2008) 218–239, https://doi.org/10.1108/14714170810888976.
[94] K. Schwabe, J. Teizer, M. König, Applying rule-based model-checking to construction site layout planning tasks, Autom. Constr. 97 (2019) 205–219, https://doi.org/10.1016/j.autcon.2018.10.012.
[95] J. Tang, L. Khoja, H. Heinimann, Characterisation of survivability resilience with dynamic stock interdependence in financial networks, Applied Network Science 3 (1) (2018) 23, https://doi.org/10.1007/s41109-018-0086-z.
[96] H. Wang, C. Schmid, Action recognition with improved trajectories, IEEE International Conference on Computer Vision (2013) 3551–3558.