
Safety Science 126 (2020) 104658


A critical review of vision-based occupational health and safety monitoring of construction site workers
Mingyuan Zhang⁎, Rui Shi, Zhen Yang
Department of Construction Management, Dalian University of Technology, Dalian 116000, China

ABSTRACT

Globally, the occupational health and safety (OHS) of construction workers has long been a serious concern. To address this issue, there is an urgent need for an efficient means to continuously monitor the construction site to eliminate potential hazards in a timely manner. As robust and automated video and image information extraction and processing tools for construction sites, computer vision techniques are considered effective solutions and have been applied to the occupational health and safety monitoring of construction site workers. This paper uses bibliometric and content-based analysis methods to review previous attempts in related fields and present the current research status in this field. The results clarify the major limitations and challenges of the current research from both technical and practical perspectives, in turn suggesting the direction of future research.

Keywords: Computer vision; Occupational health and safety; Monitoring; Construction site

1. Introduction

Due to hazardous working environments at construction sites, workers frequently face potential safety and health risks throughout the construction process. According to statistics released by the Ministry of Housing and Urban-Rural Development of the People's Republic of China (MOHURD), from 2011 to 2017, there were 4766 deaths in the construction industry in mainland China, with an average of 1.87 deaths per day. In 2018, as of November 30, the statistics show that there were 712 accidents and 798 deaths (Accident List, 2018). The mortality rate remained high. OHS problems in the construction industry are a global issue and are not unique to any single country. Fatal construction injuries in the United States accounted for approximately 18% of all occupational deaths in 2014 (CFOI, 2017a). Private construction had the highest count of fatal injuries in 2016, with 991 deaths, which was the third highest fatal work injury rate, i.e., 10.1% (CFOI, 2017a). The International Labor Organization estimates that in some countries, 30% of construction workers suffer from back pain or other musculoskeletal diseases (Lingard, 2013). In addition, the incident rate for nonfatal occupational injuries and illness in construction is 30% higher than the industry average (2012).

In view of these worrying data, many studies are dedicated to improving this situation and ensuring the occupational health and safety of workers on the construction site. Some studies have emphasized the importance of organization and management systems in OHS (Teo et al., 2005; Thomas Ng et al., 2005; Lingard, 2013). It is crucial for the construction industry to develop and implement effective OHS management programs to reduce fatalities and injuries at sites (Teo et al., 2005; Lingard, 2013), and the project manager is particularly important throughout the organizational management process (Lingard, 2013; HS, 2018). Other studies have suggested that the key to controlling occupational health and safety programs and safety processes for workers is to change the assessment methods (Pawłowska, 2015; Sinelnikov et al., 2015; Sheehan et al., 2016). Occupational health and safety (OHS) measurement relies heavily on “failure-focused” (Sinelnikov et al., 2015) lagging indicators. Recent research emphasizes a more proactive evaluation of OHS activity based on leading indicators, or inputs, that allow organizations to predict safety concerns, possibly reducing the likelihood of an OHS incident occurring (Grabowski et al., 2007; Lingard et al., 2011; Reiman and Pietikainen, 2012; Pawłowska, 2015; Sinelnikov et al., 2015; Sheehan et al., 2016). The content of OHS issues is quite complex, and there are also many influencing factors and solutions. However, technological interventions perhaps provide the most significant potential for eliminating OHS hazards or reducing risks at the source (Lingard, 2013). In the past 20 years, research on construction site occupational health and safety management and tool development based on innovative technology have grown greatly. Virtual reality (VR) has been used for on-site worker training and hazard identification (Hadikusumo and Rowlinson, 2002; Sacks et al., 2009; Park and Kim, 2013; Albert et al., 2014; Perlman et al., 2014; Le et al.,


Corresponding author.
E-mail addresses: myzhang@dlut.edu.cn (M. Zhang), srlg@mail.dlut.edu.cn (R. Shi), yangz@mail.dlut.edu.cn (Z. Yang).

https://doi.org/10.1016/j.ssci.2020.104658
Received 2 July 2019; Received in revised form 31 December 2019; Accepted 1 February 2020
0925-7535/ © 2020 Elsevier Ltd. All rights reserved.

Fig. 1. The research logic framework.

2015). Building Information Modeling (BIM) technology has been used to pre-identify hidden dangers (Zhang et al., 2015). The development of sensor technology has greatly improved information collection, data transmission and processing, and has played a powerful role in construction safety management (Mingyuan et al., 2017). Among these technologies, computer vision quickly became a focus of researchers because it is nonintrusive, reasonably priced and efficient at recognition.

Computer vision technology has been investigated and tentatively implemented in the OHS monitoring of construction site workers. OHS deals with all aspects of health and safety in the workplace and has a strong focus on the primary prevention of hazards (wpro, 2015; WHO Defination of Health, 2016), and, in the literature, computer vision techniques have been widely used in the monitoring of workers' occupational health and safety, because they can help to efficiently and automatically understand and continuously monitor the construction site, labor, materials, machinery, and interactions. However, it is worth noting that the research on computer vision in this field is mainly focused on the safety of workers on construction sites (in addition to, of course, other aspects); this safety is also the focus of this review.

Due to the popularity of computer vision technology in the construction field, several articles have conducted manual reviews of computer vision research in construction, covering topics such as defect detection and condition assessment (Koch et al., 2015), tracking of temporary resources (Teizer, 2015), tracking of workers (Konstantinou et al., 2019), and safety (Seo et al., 2015). However, Markoulli et al. (Markoulli et al., 2017) argued that traditional manual reviews focus on some trees but fail to provide an overview of the whole forest, pointing to a lack of holism. As a result, such reviews are more subjective and run a risk of introducing bias (Zhong et al., 2019). To improve upon such limitations, Botao Zhong (Zhong et al., 2019) reviewed computer vision research in construction through science mapping tools. However, science mapping has certain inherent limitations such as lack of critical synthesis, less supportive databases, possibility of technical inaccuracy and expert opinions. Therefore, this study incorporates a comprehensive overview using science mapping as well as a critical synthesis of previous literature in an attempt to achieve the best of both worlds. This is justified since previous studies loosely explain the relationship of application domains of computer vision in construction OHS, but a thorough review of them using science mapping tools is missing in the literature. In addition, since 2015 there has been no comprehensive review of the latest research on construction OHS as a whole, as opposed to reviews targeting specific issues in this field (Seo et al., 2015). Such a synthesis will not only help provide the much needed automation to construction OHS monitoring but also pave a way for positive adoption of vision-based techniques due to their undiscovered strength in safety and health implementation.

This paper aims to clarify, in the form of a review, what computer vision can do for the OHS monitoring of construction site workers. It is believed that the applications of computer vision techniques in this field can be divided into the following two aspects according to the object of monitoring: 1) the workers themselves and 2) the interaction between the worker and the surroundings. This paper also aims to address the gaps and limitations of the current state-of-the-art research, to drive the prospective research in the direction that is most valuable and to identify the critical areas that the industrial and academic communities are adhering to.

To accomplish these goals, the following text is organized as follows: first, the research background will be introduced. Second, the review methods and materials used are described. The third section of the article overviews the development of the field using scientometric tools. The next section presents a detailed overview of the research works on applying computer vision techniques to the OHS monitoring of construction site workers. The current research challenges are discussed in the following section, and future research directions are given. The results and significance of the study are described at the end


Table 1
Generic accident causation models.

Model | Proposer(s) | Year proposed | OHS accident causes | Reference
Goals Freedom Alertness Theory | Kerr | 1957 | (1) psychological work environment | (Kerr, 1957)
Domino theory | Heinrich | 1969 | (1) environment; (2) fault of person: unsafe act | (Heinrich, 1969)
Multiple causation | Petersen | 1971 | (1) behavioral factors (human factors); (2) environmental factors (physical working conditions) | (Petersen, 1971)
Updated domino sequence | Bird and Loftus | 1974 | (1) management; (2) personal factors; (3) job factors; (4) substandard acts and conditions | (Bird, 1974)
Ferrel Theory | Heinrich | 1980 | (1) workload; (2) human capability | (Heinrich, 1980)
The “Swiss Cheese” model of human error | Reason | 1990 | (1) management; (2) conditions; (3) acts; (4) hard defenses*; (5) soft defenses* | (Reason, 1990)
Human Factor Analysis and Classification System (HFACS) | Shappell, S. A., and Wiegmann, D. A. | 1997 | (1) organization; (2) supervision; (3) acts; (4) conditions | (Shappell and Wiegmann, 1997)

*Hard defenses include automated safety facilities, early warning systems, and more. Soft defenses refer to training and education, etc.

of the article. The research logic framework of the article is shown in Fig. 1.

2. Background

2.1. The main causes of occupational accidents and illness in the construction industry

Many researchers in the field have tried to understand occupational accidents and illness in the construction industry by introducing and building accident causation models and analyzing the elements contributing to occupational accidents and illnesses. Some theoretical models are dedicated to studying the mechanisms behind occupational accidents and diseases. The researchers tried to establish some generic theoretical models to identify the root causes of accidents (Kerr, 1957; Heinrich, 1969; Petersen, 1971; Bird, 1974; Heinrich, 1980; Reason, 1990; Shappell and Wiegmann, 1997), as shown in Table 1, which provides valuable learning opportunities for the development of preventative measures. Models focusing on construction accidents have also been developed or adapted from the generic causation models (e.g., Abdelhamid and Everett, 2000). The models (as shown in Table 2, Abdelhamid and Everett, 2000; Garrett and Teizer, 2009; Suraji et al., 2001; Chua and Goh, 2004; Mitropoulos et al., 2005) may assist safety researchers and practitioners to conceptually understand the mechanisms behind construction accidents. More recently, research has begun to explore in depth the causes of specific construction accidents (Chi et al., 2005; Chan et al., 2008; Wong et al., 2016). Wong et al. (Wong et al., 2016) found that fall accidents were mainly caused by the following four factors: planning, supervision, management and human factors. The above theoretical research reveals that the causes of accidents and illness in construction workers are complex and diverse. There are connections and interactions between factors, as shown in Fig. 2. Complex factors, which include environmental and human elements, are mentioned in almost every accident causation model and related research as the main cause of occupational accidents and illness. Environmental factors mainly refer to the condition of the site and the climate and economic environment. The human factors mainly include the physical and psychological state of the person, behavior,

Table 2
Causation models specifically developed for construction accidents.

Model | Proposer(s) | Year proposed | OHS accident causes | Reference
Accident Root Causes Tracing Model (ARCTM) | Abdelhamid, T., and Everett, J. | 2000 | (1) unsafe conditions; (2) workers’ improper response to unsafe conditions; (3) workers’ unsafe acts | (Abdelhamid and Everett, 2000)
Constraint-response Model | Suraji et al. | 2001 | (1) proximal factors; (2) distal factors | (Suraji et al., 2001)
Modified Loss Causation Model | Chua and Goh | 2004 | (1) situation; (2) management system; (3) underlying factors | (Chua and Goh, 2004)
Descriptive Accident Causation Model | Mitropoulos et al. | 2005 | (1) characteristics of the activity and context*; (2) safety efforts to control conditions; (3) task unpredictability | (Mitropoulos et al., 2005)
Human Error Awareness Training (HEAT) | Garrett, J. W., and Teizer, J. | 2009 | (1) organizational culture; (2) supervisory influences; (3) preconditions; (4) unsafe acts | (Garrett and Teizer, 2009)

*(1) Characteristics of the activity and context: (a) the work technology, (b) the physical conditions, and (c) the surrounding activities.


Fig. 2. Modern accident cause theory model diagram (Bird, 1974).

personality characteristics and work ability. Monitoring these two types of factors and their interactions by using computer vision techniques is the main way to prevent worker occupational health diseases and safety accidents.

2.2. Overview of computer vision

Research in the field of computer vision has developed rapidly in recent years, especially in the areas of object detection, object tracking, activity recognition, and scene understanding. The current mainstream object detection algorithms are mainly based on deep learning (Yann et al., 2015) and can be divided into the following two categories: two-stage algorithms that predominate in accuracy and one-stage algorithms that predominate in speed. A typical representative of the former is the R-CNN series of algorithms based on region proposals (e.g., R-CNN, Girshick et al., 2013; Fast R-CNN, Ren et al., 2015; Faster R-CNN, Shaoqing et al., 2017), and typical algorithms of the latter are YOLO (Redmon et al., 2016) and SSD (Liu et al., 2016c). Similar to object detection, object tracking is one of the hottest areas of computer vision. In the past few decades, considerable progress has been made in the field of object tracking. Object tracking algorithms have developed through the following stages: 1) the classic tracking methods stage (e.g., Meanshift, Vojir et al., 2014; Particle Filter, Zheng and Bhandarkar, 2006); 2) the track-by-detection (Kalal et al., 2012) and correlation filter (Bolme et al., 2010) stage; and 3) the deep learning based tracking stage (Nam and Han, 2016; Held et al., 2016). Over this development, the number of objects tracked, the speed of tracking and the accuracy of tracking have been greatly improved. The purpose of activity recognition in the field of computer vision is mainly to identify human activities through cameras. For example, Microsoft Kinect uses RGBD cameras to construct human skeleton features and recognize human poses (Piyathilaka and Kodagoda, 2013). The task of visual relationship recognition/detection is not only to identify the objects in the images and their positions but also to identify the relationships between the objects. The Language Prior model (Lu et al., 2016) and the Visual Translation Embedding network (Zhang et al., 2017) were established to understand the richer scene information in the images.

The development of computer vision techniques provides strong technical support for quickly extracting construction site information, identifying construction site conditions and worker behavior, and achieving safety and health monitoring of construction site workers.

3. Material and methods

This research leverages the content analysis-based review method. This method is a well-recognized method for reviewing and synthesizing the literature and rationalizing outcomes and has been widely applied in the research field of engineering/construction management (Yi and Chan, 2014; Mok et al., 2015; Liang et al., 2016; Li et al., 2018). Web of Science was used to select a number of articles that are related to the targeted areas. Screening by theme, research field, and journal ensures the fit and quality of the literature obtained. Specifically, there are three stages, as follows:

1) An exhaustive search was carried out using the Web of Science search engine. The search period was fixed between the years 2000–2019. To ensure that all relevant literature was captured through the literature retrieval process, several distinct search keywords were required. Boolean operators AND and OR were used to formalize the keyword search. The search strings used were “((construction site) AND ((safety) OR (health)) AND ((computer vision) OR (vision-based)))”, “((construction site) AND ((worker) OR (safety) OR (health)) AND ((computer vision) OR (vision-based)))” and “((construction site) AND ((worker) OR (accidents) OR (illness)) AND ((computer vision) OR (vision-based)))”. The search explored the title, abstract and keywords, which were sufficient to identify and extract the relevant articles. To limit the scope of the search results, a filtering process was employed by evaluating the search results against two selection criteria: (1) The document should be written in English. (2) Only studies indexed in the Web of Science Core Collection (WOSCC) database are in scope. A total of 198 documents, including journal and conference papers, were extracted as a result of this exercise up to March 2019.
2) The publications that did not include the abovementioned keywords in their titles or abstracts were manually screened out in stage 2. This stage of screening was conducted mainly by a thorough review of the abstracts to identify the articles relevant only to construction workers’ safety and health, resulting in 141 articles for further analysis.
3) In stage 3, the less relevant and irrelevant papers were screened out after a brief visual examination of the content of each article, leaving a total of 117 publications for further analysis. The two criteria of the content examination in the second literature filtering process were: (1) the article focuses on health and safety monitoring of construction site workers; (2) the article focuses on computer vision techniques or techniques integrated with computer vision.

To provide a broad-ranging review of the target areas, and considering that conference papers may propose the latest interesting methods, we did not exclude published conference papers, which account for a certain proportion (21.37%), from the below-listed conferences with the year


Fig. 3. Numbers of papers published in the selected journals.

and the sponsor provided:

• Construction Research Congress, 2012 (American Society of Civil Engineers, ASCE)
• Construction Research Congress, 2014 (ASCE)
• Construction Research Congress, 2016 (ASCE)
• Construction Research Congress, 2018 (ASCE)
• International Workshop on Computing in Civil Engineering, 2012 (ASCE)
• International Workshop on Computing in Civil Engineering, 2015 (ASCE)
• International Workshop on Computing in Civil Engineering, 2017 (ASCE)
• International Conference on Computing in Civil and Building Engineering, 2014 (ASCE)

The basic data analysis of our literature materials was conducted as shown in Fig. 3. The sample articles are from excellent journals in the field, such as Automation in Construction, Journal of Computing in Civil Engineering (JCCE), Journal of Construction Engineering and Management (JCEM), validating the quality and importance of the literature we collected.

The search method may not guarantee comprehensive coverage of the papers that are worth reviewing. However, such an approach suffices for providing a considerable number of significant state-of-the-art works, from which the study could generalize findings and recommend future works.

4. Overview of the literature based on statistical and bibliometric tools

As shown in Fig. 4, the emergence of the first publication in 2003 marks the point when researchers began to apply computer vision technology to on-site worker occupational health and safety monitoring. After 2010, research in this field caught the attention of researchers and a large number of results began to emerge, and gradually, it became a popular research issue. 28 papers were published in both 2017 and 2018. Such research trends are inextricably linked to the

Fig. 4. Numbers of papers published in different years.


Fig. 5. Research hotspot heat map.
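A keyword heat map such as Fig. 5 is built from counts of how often keywords occur and co-occur across the reviewed papers. The following is a minimal Python sketch of that counting step only. The keyword lists are made-up placeholders, the function name `keyword_cooccurrence` is hypothetical, and the sketch does not reproduce VOSviewer's own normalization or layout.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(papers):
    """Count keyword occurrences and pairwise co-occurrences.

    `papers` is a list of keyword lists, one list per paper.
    """
    occurrences = Counter()
    cooccurrences = Counter()
    for keywords in papers:
        # Normalize case and drop duplicates within a single paper.
        unique = sorted({k.strip().lower() for k in keywords})
        occurrences.update(unique)
        # Each unordered keyword pair within one paper counts once.
        cooccurrences.update(combinations(unique, 2))
    return occurrences, cooccurrences


# Placeholder keyword lists (assumed, not data from the reviewed literature).
sample_papers = [
    ["Computer vision", "Safety", "Detection"],
    ["Computer vision", "Tracking", "Safety"],
    ["Deep learning", "Detection", "Computer vision"],
]
occ, cooc = keyword_cooccurrence(sample_papers)
print(occ.most_common(3))
print(cooc.most_common(3))
```

Frequently occurring keywords would appear as the "hot" regions of such a map, and frequently co-occurring pairs would be placed close together.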

development of the computer vision technology itself. At the beginning of this century, computer vision research based on learning became popular, and the success of computer vision technology based on deep learning in 2015 promoted research in this field.

VOSviewer is a software tool for constructing and visualizing bibliometric networks. To find the research hotspots in the field, we used VOSviewer to analyze the keywords of the literature, and a heat map was obtained, as shown in Fig. 5. The core domain of the research lies in “computer vision” and “security”. The hotspots include detection, tracking, recognition, understanding, deep learning, etc.

Similarly, to reflect the research performance of research institutions in this field, we conducted statistics on the authors of the literature, and the results of the visual analysis are shown in Fig. 6. The size of each node in the figure shows that Georgia Tech, the University of Michigan, Huazhong University of Science and Technology, etc., currently appear to be performing well. It can be observed from the connections that the research institutions represented by nodes of the same color tend to be related to the same research field.

The differences in the publication quantities of various countries may imply the extent to which research commitment and value are observed in these countries. From the statistical results, 13 countries have carried out research in this field, and the United States, China, Australia, Canada and South Korea are leading the research in this field, as shown in Fig. 7. There were three reasons for this result. First, the sheer size of the construction worker community and the seriousness of occupational health and safety issues have prompted researchers in the

Fig. 6. Analysis of research institutions.


Fig. 7. Analysis of research countries.

country to develop workable solutions. Second, society's high attention to the occupational safety and health issues of construction workers has promoted the development of research. Third, government agencies and related companies have provided sufficient research support in this area. Additionally, another reasonable justification would be that the five countries have been leading the research and technology advancements in visualization and informatics for years.

5. Preparation

Computer vision techniques are popular areas of research in the field of computer science, and their applications are extensive. Applications of computer vision in civil engineering or construction began late compared to other applications (e.g., electrical and electronic engineering). The question of how to make computer vision techniques more suitable for the complex and dynamic environment of the construction site, and how to transfer the techniques so that they can be applied to the occupational health and safety monitoring of workers on the construction site, has become the focus of relevant researchers. Throughout the literature, the research has made attempts regarding the following aspects, which this paper defines as preparation: (1) improvement and innovation of computer vision algorithms and (2) the establishment and updating of the construction site visual (images and videos) database.

5.1. Improvement and innovation of computer vision algorithms

Computer vision techniques need to cope with complex, chaotic and changeable construction sites to complete unique construction site monitoring tasks. Therefore, some studies improved on and extended the achievements in the field of computer vision and established a unique technical framework. The improvements and innovations in this area focused on the current best-performing target detection/recognition algorithms (Golparvar-Fard et al., 2013; Rubaiyat et al., 2016; Ding et al., 2018; Fang et al., 2018b; Fang et al., 2018c). For example, to more quickly and accurately detect the wearing of personal protection equipment by construction site workers, Fang et al. (Fang et al., 2018b) developed a Faster-R-CNN and deep CNN model to identify workers and their harnesses. At the same time, IFaster-R-CNN was developed to identify workers and heavy equipment at the construction site (Fang et al., 2018c).

5.2. Establishment and updating of the construction site visual database

The training and testing of the related models of computer vision technology often require the aid of a data set prepared in advance. Currently, a number of open source databases (e.g., ImageNet, MNIST) have been established in the field of computer vision, thus promoting the development of domain-modifying algorithms. However, these image databases do not contain specific images in the construction field. Therefore, the studies in the construction field need to build specific image databases to train the proposed model or method.

An image database containing a wealth of construction site images is of great significance in improving the accuracy of applications, such as target detection and behavior recognition. M. Golparvar-Fard et al. (Golparvar-Fard et al., 2013; Soltani et al., 2016) highlighted the need to establish a comprehensive data set in the construction sector and introduced a new operational identification data set for common earthmoving equipment (dump trucks and excavators). Humaira Tajeen (Tajeen and Zhu, 2014) established a standard dataset for site images containing five types of construction equipment. Yang et al. (Yang et al., 2016) established 1176 video clips covering 11 types of trades. Sowmya (Sowmya, 2017) established a new construction worker database with 389 images, containing five types of activities done at construction sites, namely ladder climbing, brick laying, carpentry work, painting and plastering work. Luo established a database containing photographs of workers installing reinforcement (Luo et al., 2018a). Computer vision methods relying on machine learning all need to establish an image data set to complete the training of the model and ensure the accuracy of its application.

6. Applications

This section will review the application of computer vision technology in the occupational health and safety monitoring of construction workers; this monitoring is supported by the research on occupational accidents and disease occurrence mechanisms. It is believed that the role of computer vision is to prevent and control the occurrence of diseases and safety accidents through monitoring the workers themselves and the interactions of workers with the construction site. In the literature reviewed in this paper, the workers themselves mainly refer to the behaviors of the workers, and the site environment mainly includes the worksite condition, equipment and materials. Based on the literature research, according to the different objects of computer vision in the occupational health and safety


Table 3
PPE-use detection research details.

Reference | Object(s) | Method | Contributions | Limitations
(Park and Brilakis, 2012) | vest, hardhat | (1) HOG + HSV; (2) background subtraction | (1) provided research ideas for subsequent research | (1) depends on color stability; (2) affected by lighting conditions
(Shrestha et al., 2015) | hardhat (expandable) | (1) Haar (face detection); (2) edge detection; (3) color detection | (1) real-time cutting video and feedback | (1) depends on high resolution cameras; (2) could only capture it on the front
(Rubaiyat et al., 2016) | hardhat (expandable) | (1) semantic segmentation; (2) HOG; (3) color-based and Circle Hough Transform (CHT) feature extraction | (1) in the worker detection part, a machine learning method was adopted; (2) accuracy rate reaches 84% | (1) depended on high resolution cameras; (2) did not distinguish between hard hats and regular caps
(Fang et al., 2018a) | hardhat | (1) Faster-R-CNN | (1) precision and recall rate were still beyond 95% under nonsevere occlusions; (2) could be used in far-field surveillance videos; (3) precision and recall rate of 95.7% and 94.9% | (1) poor lighting conditions and severe occlusions could seriously affect the system performance
(Fang et al., 2018b) | harness | (1) Faster-R-CNN; (2) deep CNN | (1) precision of 99%; (2) overcame difficulties in detecting the use of harnesses | (1) affected by lighting conditions and occlusions
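Table 3 reports detector performance as precision and recall. As a reminder of how the two metrics are derived, the following Python sketch computes them from true-positive, false-positive and false-negative counts. The counts are invented for illustration and are not taken from any of the cited studies, and the function name `precision_recall` is hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from detection counts.

    tp: objects correctly detected (e.g., a hardhat that is present and found)
    fp: detections with no matching object
    fn: objects the detector missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only (assumed, not from the reviewed papers).
p, r = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={p:.2f}, recall={r:.2f}")  # precision=0.90, recall=0.75
```

High precision with lower recall, as in this toy case, would mean few false alarms but some missed PPE violations, which matters when the results feed an early warning system.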

monitoring of workers on the construction site, the practice is divided into the following two categories: (1) the workers themselves and (2) interactions between workers and the site environment.

6.1. Worker himself

Workers are dynamic and are the most difficult element to control on the construction site, and they are one of the most important factors causing safety accidents and illness. At present, computer vision is typically used to monitor workers in two applications: 1) detection of personal protective equipment (PPE) use and 2) worker behavior recognition.

6.1.1. Personal protective equipment use detection

The leading causes of construction site fatalities are falls, slips, being struck by objects, electrocution, and being caught in/between objects. The proper use of personal protective equipment is believed to reduce the risk of casualties (Jaafar et al., 2018). Relevant regulations also mandate the proper use of personal protective equipment at the construction site. In China, for example, people working above a height of two meters are required by law to use fall arrest equipment. The Occupational Safety and Health Administration (OSHA) stipulates that the employer is responsible for requiring the wearing of appropriate personal protective equipment in all operations where there is an exposure to hazardous conditions or where this part indicates the need for using such equipment to reduce the occurrence of injury to the employees (Reese and Edison, 2006). However, workers might not follow the requirements exactly due to, e.g., fatigue, distractions, and carelessness, even if they have been previously educated and trained (Green and Tominack, 2012). The previous approach was on-site supervision by safety officers, which had great disadvantages in terms of cost and efficiency.

With the development of computer vision technology, and to reduce the unnecessary casualties caused by improper PPE use, researchers have gradually applied computer vision, a wireless sensing technology, to the automatic and noninvasive monitoring of the use of personal protective equipment by field workers (Park and Brilakis, 2012; Shrestha et al., 2015; Fang et al., 2018a; Fang et al., 2018b). The main practice is to use object detection to determine whether workers are wearing the appropriate personal protective equipment. Current research mainly involves hardhats (Park and Brilakis, 2012; Shrestha et al., 2015; Park et al., 2015; Rubaiyat et al., 2016; Mneymneh et al., 2017; Fang et al., 2018c; Wu and Zhao, 2018), safety vests (Park and Brilakis, 2012), and safety harnesses (Fang et al., 2018b), as shown in Table 3.

These studies have proven that it is feasible to use computer vision technology and cameras installed on the construction site to automatically supervise the use of PPE and to provide early warning signals. This is of great significance to the occupational health and safety surveillance of workers. First, safe behavior can be monitored without disturbing work. Second, a wide range of work areas can be monitored simultaneously, reducing the costs and time associated with inspections. Studies on PPE-use detection have made great progress in detection accuracy and speed, especially with the support of deep learning methods, but technical restrictions remain. The systems become unstable under varying lighting conditions and severe occlusions. Research on the use of computer vision for protective footwear and protective gloves has not been carried out because these objects require higher recognition accuracy. In addition, the scalability of the research results in this area has not been proven.

6.1.2. Behavior recognition

Effective identification of labor activity on construction sites is the basis for safety and health monitoring. Studies have shown that 80% of construction accidents are caused by the unsafe behavior of construction workers (Li et al., 2015). In addition, incorrect working posture is an important cause of occupational diseases (Yi and Chan, 2016; Wang et al., 2016; Seo et al., 2016; Schaub, 2006; Karhu et al., 1977; Delleman et al., 2000; McAtamney and Nigel Corlett, 1993). Previous behavioral observations and analyses were based primarily on manual observations (Krause, 1997; McSween, 2003) and survey visits (Straker et al., 2010), which have obvious disadvantages in terms of cost and efficiency. Computer vision technology, implemented by means of cameras on the construction site, has become the subject of much research because it obtains a wide range of worker behavior information at relatively low cost and without intruding on the activities of interest.

Behavior recognition is a complex topic, and there are many ways to classify it. Turaga et al. (Turaga et al., 2008) divided human behavior recognition into three parts, namely, movement recognition, action recognition and activity recognition, which correspond to low-level, middle-level and high-level vision. Gavrila (Gavrila, 1999) studied human behavior with 2D and 3D methods separately. Depending on the level of complexity, J. K. Aggarwal and M. S. Ryoo (Aggarwal and Ryoo, 2011) classify behaviors as gestures, actions, interactions and group activities. In the field of construction, limited by the development of computer vision technology itself, current application research mainly remains at the first two levels, namely, the recognition of gestures and actions.


Early research focused on feature extraction, using temporal features (dynamic recognition, video-based) and spatial features (static recognition, image-based), for worker action classification (Peddi et al., 2009; Gong et al., 2011; Liu et al., 2016a). Examples include the 3D-Harris detector as the feature detector, local histograms as the feature representation, Bag-of-Words as the feature model, and Bayesian network models as the learning mechanism for action learning and classification, with background subtraction used to extract the worker's posture. Gong et al. (Gong et al., 2011) divided the template work-related actions into the following: traveling, transporting, bending, nailing with a hammer, and aligning formwork.

The behavior representation based on contour and color feature extraction is easy to obtain and stable, but it is relatively coarse. In view of this, researchers gradually began to use three-dimensional motion information from depth images and stereo cameras to build a fine-level behavior representation, which provides a wealth of information about human movement (e.g., angle information for ergonomic and biomechanical analysis) for ergonomic health and safety assessment (Ray and Teizer, 2012; Brilakis et al., 2013; Khosrowpour et al., 2014; Seo et al., 2015b; Seo et al., 2015a; Zhang et al., 2018). This type of research is based on still images and lacks a description of the motion between the images. Research therefore turned to video, and vision-based motion capture systems developed rapidly (Han and Lee, 2013; Han et al., 2013; Liu et al., 2016b; Konstantinou and Brilakis, 2016). This type of motion capture system, in theory, usually includes the two processes of pose estimation and tracking (Moeslund et al., 2006), adding temporal constraints on the position of the body joints between successive frames to improve the efficiency of the data extraction. For example, in Meiyin Liu's research (Liu et al., 2016b), a stereo vision system was proposed for 3D pose estimation, which tracks the position of human joints through 2D image frames and extracts a 3D human skeletal model from multiview image sequences. The acquisition of the 3D skeleton data includes the following three processes: (1) 2D human joint tracking; (2) 2D skeleton matching; and (3) 3D skeleton reconstruction. It can be observed that the use of depth image information and tracking technology makes the motion representation more precise and reasonable, enriching the motion data and behavior background, and greatly improving the efficiency of the automatic acquisition of motion data.

Today, deep learning has been used for behavior recognition, which greatly improves the accuracy and robustness of the recognition. Previous studies (Liu et al., 2017; Ding et al., 2018; Luo et al., 2018a) favored convolutional neural networks or CNN-based hybrid deep learning models. They share the same limitation: the method randomly selects one worker in the image or video for gesture or motion recognition, which is inconsistent with the situation at a crowded construction site. Convolutional networks have also been used to identify activities encoded in spatial and temporal streams (Luo et al., 2018b); they can simultaneously identify and label the different behaviors of multiple workers despite the severe camera motion and low resolution of site surveillance videos and the marginal interclass difference and significant intraclass variation of the workers' activities. The average recognition accuracy was 80.5%.

According to the literature, vision-based behavior recognition has developed rapidly over the past decade, as shown in Table 4. The main developments are as follows: 1) obtaining dynamic behavioral information from single static behavioral information by adding time constraints; 2) obtaining 3D behavior information from 2D behavior information by adding depth information; 3) moving from single-person behavior information to the simultaneous acquisition of multi-person motion information; and 4) representing motion through the automatic acquisition of skeleton information and fine posture estimation rather than through the coarse motion classification used at the beginning. In each of these directions, accuracy has improved and cost has been reduced. However, as mentioned at the beginning, today's research still remains at the two levels of posture and action. There is still a large gap in recognizing the interactions between people and the environment and in acquiring group behavior. Some studies have begun to try to improve the accuracy and fineness of recognition by means of the tools and equipment used in behavior recognition, but the recognition of smaller targets is still unsatisfactory. It is important to note that the behavioral information is obtained for subsequent safety assessment and health analysis, so rich and detailed information is needed. This richness should include not only independent behavioral information but also the behavioral background and environment.

6.2. Worker interactions with the construction site

Construction sites are characterized by a variety of interactions between workers and construction equipment, materials, and other site conditions. This dynamic and complex environment poses hidden dangers to the workers' occupational safety and health. Incorrect or inappropriate interactions can cause accidents. For example, collisions between different pieces of construction equipment, especially heavy

Table 4
Behavior recognition research details.
Reference | Object level | Method(s) | 2D/3D | S/C | Contribution(s) | Limitation(s)
(Peddi et al., 2009) | gesture | (1) background subtraction; (2) neural network | 2D | S | (1) real time; (2) classifying worker gesture into three classes | (1) a gesture can belong to multiple categories
(Gong et al., 2011) | action | (1) 3D-Harris detector; (2) Bag-of-Video-Feature-Words; (3) Bayesian network models | 2D | C | (1) the accuracy is 73.6%-79%; (2) action analysis based on image sequences | (1) constrained to a single person; (2) not real time
(Seo et al., 2015) | gesture | (1) depth images (KINECT) | 3D | S | (1) 93% accuracy; (2) helps to identify the risk of WMSDs | (1) limited location of devices; (2) lacks a description of the motion between the images
(Liu et al., 2016b) | gesture | (1) 2D body joints tracking; (2) 2D skeleton matching; (3) 3D skeleton reconstruction | 3D | C | (1) accuracy of positions is 3.8 cm; (2) free from the large amount of training data sets | (1) interference from occlusion, illumination and camera views
(Luo et al., 2018a) | activity | (1) an improved convolutional neural network (CNN) that integrates RGB, optical flow, and gray stream CNNs | 2D | C | (1) recall rates were all 100%, and the average accuracy was 85%; (2) significantly improved robustness of model | (1) interference from occlusion; (2) lack of time series; (3) a single worker
(Luo et al., 2018b) | group | (1) two-stream convolutional networks: spatial and temporal streams | 2D | S | (1) simultaneously identifying and marking different activities of multiple workers | (1) cold start

*Object level is based on the behavior classification (Aggarwal and Ryoo, 2011) and S/C means using still or consecutive images.
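
The angle information that 3D skeleton data provides for ergonomic analysis can be illustrated with a minimal sketch. It assumes each joint is available as an (x, y, z) tuple in a common coordinate frame; the function name `joint_angle` and the example joints are hypothetical and do not reproduce any reviewed system.

```python
import math


def joint_angle(a, b, c):
    """Angle in degrees at joint b, formed by the segments b->a and b->c.

    Each argument is an (x, y, z) tuple, e.g. shoulder, elbow, wrist.
    """
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(p * q for p, q in zip(v1, v2))
    n1 = math.sqrt(sum(p * p for p in v1))
    n2 = math.sqrt(sum(q * q for q in v2))
    if n1 == 0 or n2 == 0:
        raise ValueError("coincident joints: angle is undefined")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_theta))
```

Applied frame by frame to a reconstructed skeleton, such angles give the kind of posture description that ergonomic assessment methods take as input.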


construction equipment, between construction equipment and construction materials, and between construction equipment and workers are important types of safety incidents at construction sites. In 2011, the US Department of Labor Statistics’ Chief Secretary for Administration reported that 122 construction workers were killed in collisions with equipment or objects, accounting for 17% of the deaths in the construction industry, and the figures had not improved in the data published in 2017. Monitoring the various building resources at the construction site and understanding the interactions among them and workers are important ways to use computer vision technology to prevent workers' occupational diseases and safety accidents. The main aspects currently involved are resource tracking, resource location, proximity analysis, proximity warning system design, etc., and these applications are connected, as shown in Fig. 8.

6.2.1. Resource tracking and positioning

Tracking and positioning resources are important tasks in construction safety and health management. With the development of automation technology, manual monitoring has been gradually replaced. The available tracking solutions are based primarily on RF technologies, including Global Positioning Systems (GPS), Radio Frequency Identification (RFID) and Ultra-Wide Band (UWB) technologies. These technologies all work on the same principle, i.e., installing a sensor on each entity. However, on large-scale crowded construction sites, the installation time and economic cost of the sensors increase greatly. Computer vision technology is considered an effective alternative due to its cost and suitability advantages.

At present, using computer vision technology to track and locate building resources mainly involves two kinds of targets, i.e., workers and construction equipment. Two main tracking methods are mentioned in the existing research (Kim and Chi, 2017), i.e., detection-based and sequential analysis-based tracking. The former independently detects the object in each frame of the video, while the latter analyzes the similarities and differences between successive sequences. The core of detection-based tracking lies in image features and machine learning. Current research mainly uses the histogram of oriented gradients (HOG) and color information to detect personnel and equipment through feature cascade classifiers to facilitate object tracking (Park et al., 2011; Azar and McCabe, 2012; Azar, 2016). Sequential analysis-based tracking methods have also been evaluated in the research. This type of tracking method relies mainly on Kalman filters, particle filters, and mean-shift. Earlier research focused on the tracking of individual workers (Teizer and Vela, 2009; Park et al., 2011). Subsequent work gradually explored the possibility of tracking multiple workers at the same time (Yang et al., 2010; Park and Brilakis, 2016; Konstantinou and Brilakis, 2018; Lee and Park, 2019). Yang et al. (Yang et al., 2010) first proposed a tracking scheme that uses cameras to track multiple workers at the construction site. The program uses online color model learning and Kernel

Fig. 8. Application diagram.


Table 5
Typical research in resource tracking and positioning.
Reference | Object(s) | Methods | Contribution(s) | Limitation(s)
(Teizer and Vela, 2009) | single worker | (1) density mean-shift; (2) Bayesian segmentation; (3) active contours; (4) graph-cuts | (1) assessed ability of 4 algorithms to track under typical construction site conditions; (2) two error metrics were determined; (3) discussed potential benefits and disadvantages of the tracking algorithms | (1) focused on a single target for each video sequence
(Park et al., 2011) | (multiple categories) single object | (1) kernel-based tracking approaches; (2) point-based tracking approaches | (1) tested and evaluated the performance of the two categories of approaches; (2) five metrics were set | (1) focused on a single target
(Yang et al., 2010) | multiple workers | (1) Kernel covariance tracking; (2) online color model learning | (1) tracked multiple personnel in a given video sequence; (2) could handle the scale change, trajectory crossing and occlusion well | (1) algorithm might fail when appearance changed too much and severe occlusion lasted too long
(Yang et al., 2011) | crane | (1) Gaussian background modeling algorithm | (1) estimated velocity from position measurements; (2) estimated jib angle | (1) a crude crane model was determined in advance
(Brilakis et al., 2011) | (project related entities) single target | (1) SURF feature detectors; (2) STFs; (3) triangulation | (1) 3D tracking of each entity | (1) the error resulting from the calibration and 2D vision-based tracking processes cannot be ignored; (2) tracked a single object per entity type
(Lee and Park, 2019) | multiple workers | (1) entity matching; (2) triangulation | (1) track multiple workers in 3D; (2) 96% of workers' movements are retrieved with the mean error of 0.293 m; (3) estimated trajectories are quantified in three ways (completeness, continuity, and localization accuracy) | (1) processing time was over the limitation; (2) lacked long-term testing under complex environmental conditions
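
Detection-based tracking, in which objects are detected independently in each frame and then linked over time, can be sketched as follows. This is a simplified illustration that assumes per-frame bounding boxes are already available from some detector and links them by greedy IoU overlap; the names `update_tracks` and `iou_threshold` are hypothetical, and the sketch does not reproduce any tracker listed in Table 5.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def update_tracks(tracks, detections, next_id, iou_threshold=0.3):
    """Link this frame's detections to existing tracks.

    tracks maps track id -> last known box. Returns (new_tracks, next_id).
    A track with no matching detection is dropped, so an occluded worker
    gets a new id on reappearing (an assumed simplification).
    """
    updated = {}
    free = dict(tracks)  # tracks not yet claimed in this frame
    for det in detections:
        best_id = max(free, key=lambda tid: iou(free[tid], det), default=None)
        if best_id is not None and iou(free[best_id], det) >= iou_threshold:
            updated[best_id] = det      # continue an existing track
            del free[best_id]
        else:
            updated[next_id] = det      # start a new track
            next_id += 1
    return updated, next_id
```

Calling `update_tracks` once per frame keeps identities stable while boxes overlap between frames; the dropped-track behavior mirrors the failure under severe occlusion that the reviewed studies report.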

covariance tracking to demonstrate reliable performance in the scenarios given by the experiment. However, in some difficult situations, such as changes in the appearance of workers, changes in the environment, and severe occlusion, the algorithm may fail.

As researchers have become aware of the potential of computer vision applications, the scope of resource tracking and location research has gradually extended to other temporary resources. Examples include construction equipment, such as cranes (Yang et al., 2011; Chen et al., 2017; Kim et al., 2018), excavators and dump trucks (Maximilian et al., 2014; Bao et al., 2016; Kim et al., 2018), concrete mixer trucks (Kim, 2018), and rollers (Ren et al., 2016); building materials, such as earthwork or concrete (Park et al., 2011; Teizer et al., 2013); and temporary structures, including scaffolding, support plates, shingle walls (Wang et al., 2015) and safety guardrails (Kolar et al., 2018). Taking safety factors into account, in studies on tracking and positioning heavy construction equipment, researchers have focused on the attitude (pose) estimation of the equipment for local positioning (Golparvar-Fard et al., 2013; Wang et al., 2014; Yuan et al., 2017; Soltani et al., 2017; Soltani et al., 2018). With the introduction of stereo vision, traditional 2D tracking can no longer meet the needs of safety performance assessment at the construction site, and resource tracking and positioning are gradually shifting toward high-precision three-dimensional positioning (Brilakis et al., 2011; Zhu et al., 2016). Building on the previous research, these studies use the principle of triangulation to obtain depth information for personnel and equipment positioning. In the latest research, Yong-Ju Lee et al. (Lee and Park, 2019) completed the tracking and positioning of construction site workers by means of functional integration and online learning modules for detection and tracking. Under experimental verification, the method stayed within a positioning error range of 0.821 m, recovering 96% of the actual motion at a 99.7% confidence level.

In the dynamic construction site environment, research on resource tracking and positioning, which is closely connected to resource detection, has overcome great challenges, as shown in Table 5: (1) tracking and locating individual workers, multiple workers and more resource types (workers, construction equipment, temporary materials); (2) extending traditional 2D tracking and positioning to 3D tracking and positioning; and (3) tracking and locating both the target as a whole and its parts (for example, the local nodes of heavy construction equipment or the position of the driver's head). These results are closely tied to the development of computer vision techniques themselves, on which construction site tracking and positioning applications rely. As mentioned earlier, the applications mainly adapt and improve existing trackers.

In summary, the breakthroughs mentioned above are intertwined. Achieving long-term stable tracking and accurate three-dimensional positioning of multiple resource types and multiple targets, and grasping construction site resource information more comprehensively, paves the way for later safety performance evaluation and control.

6.2.2. Blindspots

The interaction between construction workers and construction equipment can be dangerous (CFOI, 2017a; CPWR, 2017b). The limited visibility due to construction equipment blindspots is responsible for more than half of the visibility-related fatalities in the construction industry (Teizer, 2015; Ray and Teizer, 2016). Equipment blindspots are spaces that are not visible to the equipment operator. The existence of such a space poses a life-threatening danger to those working around the equipment. Therefore, monitoring equipment blindspots and identifying potential hazards can effectively reduce fatal cases (Teizer, 2015).

The role of computer vision technology here is mainly the determination of the dynamic range of blindspots at the later stage. The dynamic model obtained from the stereo camera system data is superimposed on the static blindspot map of the equipment (Teizer et al., 2010), which is built from scanning or other spatial data, to complete the dynamic blindspot map (Ray and Teizer, 2011; Ray and Teizer, 2013; Ray et al., 2013; Tao and Teizer, 2014), as shown


Fig. 9. Blindspot research framework (Ray and Teizer, 2016).

in Fig. 9. Obviously, the monitoring of dynamic equipment blindspots requires other technologies to complete the information fusion. The main contribution of computer vision technology is to use the depth image information from the stereo camera to estimate and track the driver's head or face posture (Ray and Teizer, 2011; Ray and Teizer, 2016). Most of these studies train machine learning models on depth data for classification and regression; the main methods include principal component analysis, support vector regression, and the Random Forests algorithm, which are similar in spirit to the attitude estimation algorithms mentioned above.

However, even with the latest research results, model tests in the outdoor environment were unsatisfactory due to excessive interference data, and the methods thus lack practical applicability. Researchers are currently trying to improve applicability with more sophisticated techniques, such as face tracking. In addition, they have realized that monocular vision should be replaced by binocular vision to better fit human vision, and that differences among drivers should be considered as well (Ray and Teizer, 2016).

It should be noted that the dynamic blindspot monitoring results need to be fed back to the equipment operators in real time to improve their safety awareness. A more effective application of these monitoring results is to assess in real time whether the interactions between the workers and equipment are appropriate, by monitoring and tracking the people around the equipment.

6.2.3. Distance measurement and proximity analysis

The incorrect interaction of construction workers with the built environment is an important factor in occupational health and safety issues, such as the collision of workers and construction equipment mentioned above. J. Hinze (Hinze et al., 2013) believes that safety-related leading information can be used to proactively prevent accidents, and this information is defined as information generated in real time at specific work sites. In other words, leading information includes the dangers currently surrounding workers and the automatic monitoring of proximity relationships; active intervention is an important way to avoid such safety incidents. Proximity refers to the minimum distance between pieces of equipment or between equipment and workers (Kim et al., 2016). The main method of using computer vision is to use object tracking and recognition for data acquisition, use the positioning results (distance, motion speed, positional relationship, etc.) as the data input, analyze the proximity relationship between the targets, and give the final safety assessment results, which are fed back to the relevant workers. Target location and distance measurement are thus the basis for proximity monitoring.

At present, vision-based distance measurement mainly involves two methods, as shown in Table 6. One uses depth information to determine three-dimensional coordinates (Seo et al., 2015b; Seokho and Caldas, 2012), and the other measures directly on two-dimensional images (Kim et al., 2016; Kim et al., 2017; Kim et al., 2019). The former either uses a depth sensor (stereo camera, RGB-D sensor) to directly obtain the image depth information and construct the three-dimensional coordinates of the target, or uses a binocular camera system to reconstruct the image depth information by means of the triangulation principle and feature matching (Yang et al., 2013; Brilakis et al., 2011). The latter mainly uses the pixel information of the two-dimensional image and deep learning methods (convolutional neural networks, etc.) to estimate the Euclidean distance between the targets.

Proximity analysis and safety assessment are at the heart of proximity monitoring. The obtained proximity information is compared with preset rules (specifications, management experience, industry rules) to form a safety assessment result. Due to the lack of a high-precision or mandatory range of distances, the currently used assessment method is fuzzy evaluation (Seokho and Caldas, 2012; Kim et al., 2017). However, although the literature considers speed information (direction and magnitude) for moving objects, it does not describe how the speed is measured, so evaluation systems that comprehensively consider the speed of moving objects are not yet mature (Wang and Razavi, 2016). Timely feedback of the assessment results is an important part of the system and the key to its performance. Once it is confirmed that there

Table 6
Vision-based distance measurement research details.
Reference | Method | 2D/3D | Precision | Device | Limitation
Seokho and Caldas, 2012 | triangulation | 3D | (1) ±1 m at 35 m | Bumblebee XB3 | High computational cost
Kim et al., 2016 | Pixel distance conversion | 2D | \ | stationary camera | Loss of depth information, projection distortion
Kim et al., 2017 | Pixel distance conversion | 2D | (1) 0.93 m at 2.8 m | stationary camera/CCTV | Limit device location and need reference object
Kim et al., 2019 | (1) image rectification; (2) Pixel distance conversion | 2D | (1) 97.43% at 26.3 m (lab-scale); (2) ±0.9 m at 20 m (field study) | camera-mounted UAVs | Need reference object

*2D/3D means whether depth information is included.
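
The two families of distance measurement in Table 6 can be illustrated with a minimal sketch. It assumes a rectified stereo pair for the triangulation case (depth Z = f * B / d) and, for pixel distance conversion, a reference object of known length lying in the same plane as the measured points. The function names, parameter values and the crisp warning thresholds are hypothetical; in particular, the thresholds are a simplified stand-in for the fuzzy evaluation used in the reviewed studies.

```python
import math


def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth in meters of a matched point in a rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px  # Z = f * B / d


def pixel_to_metric(p1, p2, ref_length_m, ref_length_px):
    """Distance in meters between two image points (u, v).

    Scales the pixel distance by a reference object of known size; this
    ignores depth and projection distortion (assumed planar scene).
    """
    scale = ref_length_m / ref_length_px  # meters per pixel
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1]) * scale


def proximity_level(distance_m, warn_m=10.0, danger_m=5.0):
    """Map a worker-equipment distance to a warning level.

    warn_m and danger_m are hypothetical values, not from any standard.
    """
    if distance_m < danger_m:
        return "danger"
    if distance_m < warn_m:
        return "warning"
    return "safe"
```

Because disparity appears in the denominator, a fixed matching error produces a larger depth error at long range, which is consistent with precision being reported at a stated distance.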


may be a safety risk (such as the risk of being hit), it is transmitted to the relevant personnel in various forms. At present, feedback mainly relies on audible alarms, wearable sensor vibration or visual information.

This section reviews the use of computer vision technology to intervene in the interaction between workers and their environment. Such interventions include timely feedback on the static situation of the worker's environment, that is, the content of the scene within a certain range, and, more importantly, prediction of the dynamic interactions between workers and the environment. This intervention is essentially an automated, timely response, both static and dynamic, to a safety risk assessment. As far as the current research is concerned, the degree of automation (that is, the timeliness of analysis and feedback), the accuracy, and the robustness in harsh and variable environments are not yet sufficient for practice on the site. However, the potential of computer vision in this field is huge. Compared with other automation technologies, it has a strong cost appeal, a wide application range and an efficient application return.

7. Open research challenges and future research directions

The following thoughts highlight a few challenges and future directions that are important to researchers:

7.1. Technical restrictions

The success of the monitoring system is subject to the accuracy and reliability of the visual information obtained. The dynamic and complex construction environment presents challenges to computer vision technology for worker health and safety monitoring at the construction site (Seo et al., 2015b), for example, sites involving multiple workers, different types of equipment and materials, and changing working environments. Computer vision technology faces a number of technical limitations in such environments, as follows: lighting conditions (Park and Brilakis, 2012; Fang et al., 2018a), occlusion (Yang et al., 2011a), visual range (Ding et al., 2018; Luo et al., 2018a), and camera placement (Seo et al., 2015b). The literature reviewed in this paper tries to overcome these technical limitations through improvements to algorithms, models, and methods. One of the researchers' proposed solutions is to use deep learning algorithms to extend the training data set to address the limitations imposed by partial occlusion (Fang et al., 2018a; Fang et al., 2018c). Other studies have proposed the use of drones to replace cameras that are fixed at the construction site to address the spatial limitations of camera placement (Irizary et al., 2012; Gheisari et al., 2014; Seo et al., 2015b; Kim et al., 2017). However, there is currently no research on its feasibility. Inspired by some of the research, this paper proposes several feasible solutions to the technical limitations for future researchers.

“prejudice” of algorithms based on data training and improving the accuracy and robustness of the application in complex and variable construction environments, to the extent that the technical limitations caused by the environment are solved. The establishment of such a data set may require building a data-sharing platform where global stakeholders upload data and receive rewards that encourage active sharing. One potential challenge is how to manage and maintain this sharing system, such as working out the reward method. Another is to develop processes to standardize data for efficient data sharing.

7.1.2. Integration of multiple technologies

Compared with other methods, computer vision technology has the advantages of low interference, a simple layout and low cost. However, in some respects there are still problems that are difficult to solve at present, including the limited visual range and the difficulty of avoiding external interference. This paper argues that the complementation of multiple information technologies is probably the most promising solution. Some studies have supported this view (Yang et al., 2011b; Ray and Teizer, 2011; Ray and Teizer, 2013; Ray et al., 2013; Tao and Teizer, 2014; Seo et al., 2015b; Soltani et al., 2018). However, the scope of current research attempts is narrow and the actual application capabilities have not been verified. Therefore, it is still a research challenge to combine computer vision technology with other information technologies, especially other sensor technologies, and to make full use of its advantages to achieve practical solutions for the health and safety supervision of workers on the construction site. One of the difficulties in this integration is data conversion and comprehensive reasoning, which may require the creation of complex algorithmic models. Another difficulty is to ensure the consistency of the data (object, time, space, etc.). For example, in the proximity analysis of workers and heavy equipment at construction sites, the consistency between dynamic data obtained from other speed sensors and visually monitored data is the basis for accurate comprehensive reasoning and safety assessment. Potential solutions are developing processing hardware that supports multiple complex fusion algorithms and introducing unique data identifiers such as timestamps. In addition, future research needs to select the most economical and most suitable technology integration solution to enhance the application potential of computer vision technology.

7.2. Method validation and evaluation

It is important in the designing and testing of algorithms to validate their performance (Teizer, 2015). Establishing reasonable evaluation criteria not only provides project managers with a reference for algorithm selection, but also guides the research in the field. It should be emphasized that the value of the research in this field is based on its
practical application capabilities. This is the core criterion for judging
7.1.1. Establishment of a large-scale comprehensive construction site data the significance of the research. In the complex and variable con-
set struction environment, it is more important to verify the actual per-
The current research has recognized the importance of establishing formance of the proposed method, model and framework. From the
publicly available data sets for complex construction sites; establishing perspectives of evaluation purposes, the correctness of the algorithm
these data sets is important to overcome the technical constraints im- could be regarded as most concerned validation criterion and evalua-
posed by the environment. As mentioned earlier, many studies have tion effect, within the scope of the review. In addition, some studies
now established data sets for specific computer vision tasks, including also mentioned the robustness and generalization ability of the algo-
the classification of workers' gestures and construction equipment ac- rithm. There are some potential problems with these validation and
tivities. However, these public data sets still have a huge gap between evaluation. First, the correctness of the algorithm needs to be evaluated
the amount of data and the type of data and the large datasets (image by comparing with the benchmarks—the “truth”, which raises another
net, etc.) in the existing computer vision field. Therefore, more research problem, how to provide the benchmarks. The definition of benchmarks
work is needed in the future to create a comprehensive data set in the in some previous studies is reasonable. Some studies in the two-di-
building field. It should include not only image data for specific visual mensional image ranging field used high-precision distance measuring
tasks (pose recognition, target detection, activity classification, etc.) but instrument measurements to reflect the truth (Kim et al., 2017; Kim
also visual data reflecting the construction site’s complex features, in- et al., 2019). Another example is in some PPE-use detection (Park et al.,
cluding a variety of construction site elements and different changes in 2015; Rubaiyat et al., 2016; Mneymneh et al., 2017; Fang et al., 2018c,
viewing angles, lighting, and occlusion conditions. This comprehensive Wu and Zhao, 2018) and behavior recognition researches (Peddi et al.,
and complex integrated data set is of great significance in avoiding the 2009; Gong et al., 2011; Liu et al., 2016a), the truth was manually


determined in advance. However, in the majority of previous studies, definitions or detailed descriptions of validation benchmarks were missing, for instance, in resource tracking (Teizer and Vela, 2009). It has been argued that the definition of validation benchmarks is necessary to improve the credibility of a proposed method (Oberkampf and Trucano, 2008). Therefore, from this perspective, the first challenge is to set validation and evaluation benchmarks. In this paper, it is believed that a detailed definition of validation benchmarks means defining appropriate benchmarks for the evaluation of solution accuracy and precisely stating the conceptual description and uncertainty quantification of the benchmark (Oberkampf and Trucano, 2008). The second evaluation criterion is the robustness of the algorithm. A small number of studies gave a detailed description of the robustness of the algorithm (Fang et al., 2018a; Fang et al., 2018b). Across the literature, the verification of robustness is based on simulating the actual environment (lighting, occlusion, etc.). The limitation of this verification mode is the absence of long-term algorithmic testing. In other words, it is necessary to set a time standard for validation and evaluation experiments. Because the algorithms are deployed at the construction site, they will inevitably face a harsh environment and must withstand long-term testing. The survivability of the system determines its application in practice. In addition, attention should be paid to the validation of method generalization. For example, in the study of worker pose estimation, the research often gives a classification result of the model for a small number of posture categories and emphasizes that the model has strong generalization ability. However, this claim has not yet been verified and is therefore not convincing enough.

In addition to the existing method validation and evaluation criteria discussed above, in the specific area of the construction industry, the social environment should also be taken into consideration. It is essential for the application and promotion of the methods. This factor, for example the legal environment, has not been considered in the existing evaluation of methods (Seo et al., 2015). Because of the psychological pressure computer vision techniques may bring to workers, the loss of trust in managers, and the infringement of human privacy (Tabak and Smith, 2005), many regions and countries have clear regulations on the usage of cameras at construction sites. For example, South Korea restricts the use of high-precision cameras, and a target recognition algorithm based on high-precision pixels (Shrestha et al., 2015) therefore loses its application capability. Besides, the economics of the method have not been reflected in the existing algorithm evaluation. In the area of occupational health and safety monitoring of workers, only the most efficient and economical method will be selected and applied.

In the evaluation and verification of a method, besides clarifying the actual value of the method, its deficiencies and limitations should be explained to promote research in this field.

In summary, the following challenges are faced in the verification and evaluation of methods:

1) How to set verification benchmarks and evaluation criteria, design verification experiments, and evaluate the practical application capabilities and application values of the methods.
2) The verification and evaluation of the method needs to be comprehensive, fully considering the practical ability, depending on the actual situation of the application, including but not limited to the following: effectiveness, robustness, generalization ability, practicability and economy.
3) Identify the limitations and deficiencies of the proposed method, identify the source of such restrictions, and provide potential improvements.

7.3. Complex scene understanding

Scene understanding means integrating information about the various components of the construction site scene to explain the following: what is in the scene; what is their relationship; what happened; and what will happen (Gregory, 2013). In the field of management, although computer vision technology has made remarkable progress in target detection and resource tracking, the few studies involving the overall grasp of the scene are often limited to a specific low-level computer vision task. They ignore spatial relationships and logical connections and lack the mining, extraction, interpretation and reasoning of the semantic information of the scene, which makes it difficult to form high-level visual information. However, such information is especially important for the assessment of occupational health and safety risks and the prevention of occupational safety accidents. Overcoming this challenge can effectively improve the level of automation in the use of computer vision technology for the occupational health and safety supervision of workers. Dynamic scene understanding led by visual sensing devices, in the field of worker health and safety monitoring at construction sites, aims not only to extract information about worker health and safety (for instance, explaining the spatial relationships of elements in the monitored scene) but also to grasp their motion intentions and infer how the scene will evolve, in order to analyze health and safety risks and achieve self-monitoring. This type of construction site scene understanding, i.e., using the overall information of the scene, can also improve the accuracy and reliability of multiple underlying visual tasks. For example, combining the spatial semantic relationships between workers and their scene (materials, construction equipment, etc.) can significantly improve the accuracy of the algorithm for worker pose estimation (Jun, 2018). To achieve scene understanding, one potential solution is to use existing advanced visual algorithms such as visual relationship detection models (VTransE, Language Prior) to extract visual semantics. The challenge of this solution is that it is difficult to calibrate and train the visual relationships. Another solution is to consider advanced information technologies such as ontology technology. Ontology technology can formalize concepts and their relationships in specific domains (Dhingra and Bhatia, 2015). Using ontology can narrow the gap between low-level and high-level information (Kazi Tani et al., 2017; Hernandez-Leal et al., 2017).

7.4. Integral monitoring system design

The goal of researchers should be to propose a safety monitoring system with application capabilities that can ultimately be put into practice. The integrity of this system includes data entry, data processing, result output and feedback. The current research focuses on the development of effective data collection and processing methods and often overlooks health and safety assessments and output requirements. First, within the scope of the review, only a few articles have considered the health and safety assessment of workers (Seokho and Caldas, 2012; Kim et al., 2017; Han and Lee, 2013; Guo et al., 2018). The existing assessment of construction workers' health and safety risks is based on preset rules (specifications, experience, laws, etc.), which are often subjective (e.g., based on management experience) and fuzzy. Seokho Chi et al. interpreted worker safety by considering proximity and crowdedness together, but the two indicators are subjective regarding the definition of safety risks (Seokho and Caldas, 2012; Kim et al., 2017). Research on reliable evaluation rules and quantitative indicators should be done before data processing. The lack of reliable and objective quantitative indicators corresponding to specific tasks limits the accuracy and reliability of the automatic monitoring of workers' health and safety at the construction site by computer vision techniques. It is not conducive to the practice of the technology and, therefore, research on the establishment of quantitative and reliable evaluation rules and indicators should be strengthened.

Second, it is essential to design a reasonable output port and feedback method to put the proposed system into practice. Some studies have considered feedback methods, such as using mobile apps or alerts to warn workers (Seokho and Caldas, 2012; Kim et al., 2017). But the


effectiveness of this feedback has not been tested. The feedback method has a single feedback object: workers. As far as the construction site is concerned, a complete project-level multi-user friendly interface should be designed to deal with the following issues:

1) users with differences, i.e., the personnel involved in the construction site are diverse, with different skill levels, different cultural levels, different age groups, etc.;
2) application at the project level; and
3) simple, fast and timely feedback requirements.

Achieving complete system functions requires not only the functional design of each subsystem, but also the integration of all subsystems, which has not been considered in the current studies. The challenge here is the data transfer between the various subsystems. For the health and safety monitoring system of construction site workers, timely feedback is often required, which puts high demands on data transmission between subsystems. Edge computing may be a potential solution for real-time system performance. Edge computing has been applied to self-driving cars and intelligent transportation. A geographically distributed architecture of public clouds and edges that extend down to the cameras is a feasible approach to meet the strict real-time requirements of large-scale live video analytics (Ananthanarayanan et al., 2017). Therefore, edge computing can be considered a suitable method to solve the latency, bandwidth, and provisioning challenges in the monitoring system. Another possible solution is to use advanced fifth-generation (5G) mobile wireless networks (Li et al., 2017). The 5G network has been considered a new communication network that can provide high-capacity and high-rate data transmissions (Liu et al., 2018). The 5G network can flexibly support a variety of devices and services, which makes it possible to combine all the subsystems.

8. Conclusions

In the past two decades, computer vision techniques have received extensive attention in the field of worker safety and health monitoring and the prevention of safety accidents and occupational illness, and great achievements have been made in this field. Although great efforts have been made, given the significant progress in research in this area, we recognize that this review is not exhaustive. The article first used statistical and bibliometric analysis tools to conduct a macroscopic interpretation of the field development of the literature within the scope of the review.

In the implementation of the content-based review, based on the research on the mechanism of occupational diseases and safety accidents, the applications of computer vision technology in the health and safety supervision of workers on the construction site were divided into the following aspects: (1) workers themselves and (2) workers' interaction with the site. Additionally, the research background, significance, technical characteristics and research progress of each subarea in the two aspects were elaborated. The research progress in the field fully demonstrated the application ability and potential of computer vision technology. Computer vision technology has strong cost and technical advantages in collecting visual information about the occupational health and safety of workers at the construction site. The monitoring includes the worker's own unsafe behavior (personal wearing, worker's posture, etc.) and interactions between people and the construction site (the proximity of workers and equipment) and can process and evaluate the rich information obtained to prevent occupational diseases and safety accidents.

Although breakthroughs have been made in the research of vision-based health and safety monitoring of construction site workers, many problems have still not been fully solved, such as the following: (1) technical limitations; (2) verification and evaluation of methods; (3) complex scene understanding; and (4) integral monitoring system design. Through literature research, the article proposes to solve the current technical limitations by establishing open and comprehensive data sets and putting forward computer vision technology-led multi-technology fusion solutions. It is necessary to design reasonable experiments and standards to verify and evaluate the proposed method and, further, to assess the practical application ability of the research results. In addition, from the perspective of the development of computer vision technology, the research should focus on improving the ability of computer vision technology to automate the understanding of complex scenes at the construction site and decision-making and increasing the level of automation of worker health and safety monitoring. It is also suggested that more research should be done on the evaluation and output of the monitoring system based on computer vision technology. In summary, computer vision technology needs to be improved as a whole to cope with the dynamic and changeable construction site application environment and complex worker health and safety monitoring issues.

Considering the technical advantages and application potential of computer vision technology in the automatic supervision of worker health and safety at construction sites, we believe that these challenges and limitations can be well solved in future research, and the application potential of computer vision technology can be brought into full play to improve the automatic management level of worker health and safety on construction sites.

Acknowledgement

This work was supported by the 2019 Natural Science Foundation of Liaoning Province, China: Research on Identification and Early Warning of Hazard Scenes on Construction Site Based on Machine Vision and Crowd-Sensing (Project No. 2019-MS-052).

References

Abdelhamid, T.S., Everett, J.G., 2000. Identifying root causes of construction accidents. (Statistical data included). J. Construct. Eng. Manage. 126, 52. https://doi.org/10.1061/(ASCE)0733-9364(2000)126:1(52).
Aggarwal, J.K., Ryoo, M.S., 2011. Human activity analysis: A review. ACM Comput. Surv. (CSUR) 43, 1–43. https://doi.org/10.1145/1922649.1922653.
Albert, A., Hallowell, M.R., Kleiner, B., Ao, C., 2014. Enhancing construction hazard recognition with high-fidelity augmented virtuality. (Report) (Author Abstract). J. Construct. Eng. Manage. 140. https://doi.org/10.1061/(ASCE)CO.1943-7862.0000860.
Ananthanarayanan, G., Bahl, P., Bodík, P., Chintalapudi, K., Philipose, M., Ravindranath, L., Sinha, S., 2017. Real-time video analytics: The killer app for edge computing. Computer 50, 58–67.
Azar, E.R., 2016. Construction equipment identification using marker-based recognition and an active zoom camera. J. Comput. Civil Eng. 30. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000507.
Azar, E.R., McCabe, B., 2012. Part based model and spatial-temporal reasoning to recognize hydraulic excavators in construction images and videos. Autom. Constr. 24, 194. https://doi.org/10.1016/j.autcon.2012.03.003.
Bao, R.X., Sadeghi, M.A., Golparvar-Fard, M., 2016. Characterizing construction equipment activities in long video sequences of earthmoving operations via kinematic features. In: Construction Research Congress 2016: Old and New Construction Technologies Converge in Historic San Juan, 849–858. https://doi.org/10.1061/9780784479827.086.
Bird, F., 1974. Management Guide to Loss Control. Institute Press, Atlanta, GA.
Bolme, D.S., Beveridge, J.R., Draper, B.A., Lui, Y.M., 2010. Visual object tracking using adaptive correlation filters. IEEE Comput. Soc, Los Alamitos 2544–2550. https://doi.org/10.1109/CVPR.2010.5539960.
Brilakis, I., Lee, S., Becerik-gerber, B., 2013. Motion data-driven unsafe pose identification through biomechanical analysis. In: ASCE International Workshop on Computing in Civil Engineering. American Society of Civil Engineers, Reston, VA, pp. 693–700.
Brilakis, I., Park, M., Jog, G., 2011. Automated vision tracking of project related entities. Adv. Eng. Inf. 25, 713–724. https://doi.org/10.1016/j.aei.2011.01.003.
Chan, A.P.C., Wong, F.K.W., Chan, D.W.M., Yam, M.C.H., Kwok, A.W.K., Lam, E.W.M., Cheung, E., 2008. Work at height fatalities in the repair, maintenance, alteration, and addition works. J. Constr. Eng. Manage. 134 (7), 527–535. https://doi.org/10.1061/(ASCE)0733-9364(2008)134:7(527).
Chen, J.D., Fang, Y.H., Cho, Y.K., 2017. Mobile asset tracking for dynamic 3D crane workspace generation in real time. In: Computing in Civil Engineering 2017: Sensing, Simulation, and visualization, pp. 122–129.
Chi, C., Chang, T., Ting, H., 2005. Accident patterns and prevention measures for fatal


occupational falls in the construction industry. Appl. Ergon. 36, 391–400. https://doi.org/10.1016/j.apergo.2004.09.011.
Chua, D.K.H., Goh, Y.M., 2004. Incident causation model for improving feedback of safety knowledge. J. Constr. Eng. Manage. 130 (4), 542–551. https://doi.org/10.1061/(ASCE)0733-9364(2004)130:4(542).
Delleman, N., Boocock, M., Kapitaniak, B., Schaefer, P., Schaub, K., 2000. ISO/FDIS 11226: evaluation of static working postures. In: Proceedings of the Human Factors and Ergonomics Society ... Annual Meeting 6, vol. 442.
Dhingra, V., Bhatia, K.K., 2015. Development of ontology in laptop domain for knowledge representation. Procedia Comput. Sci. 46, 249–256.
Ding, L., Fang, W., Luo, H., Love, P.E.D., Zhong, B., Ouyang, X., 2018. A deep hybrid learning model to detect unsafe behavior: integrating convolution neural networks and long short-term memory. Autom. Constr. 86, 118–124. https://doi.org/10.1016/j.autcon.2017.11.002.
Fang, Q., Li, H., Luo, X., Ding, L., Luo, H., Rose, T.M., An, W., 2018a. Detecting non-hardhat-use by a deep learning method from far-field surveillance videos. Autom. Constr. 85, 1–9. https://doi.org/10.1016/j.autcon.2017.09.018.
Fang, W., Ding, L., Luo, H., Love, P., 2018b. Falls from heights: A computer vision-based approach for safety harness detection. Autom. Constr. 91, 53. https://doi.org/10.1016/j.autcon.2018.02.018.
Fang, W., Ding, L., Zhong, B., Love, P.E.D., Luo, H., 2018c. Automated detection of workers and heavy equipment on construction sites: A convolutional neural network approach. Adv. Eng. Inf. 37, 139–149. https://doi.org/10.1016/j.aei.2018.05.003.
Garrett, J.W., Teizer, J., 2009. Human factors analysis classification system relating to human error awareness taxonomy in construction safety. J. Construct. Eng. Manage. 135, 754–763. https://doi.org/10.1061/(ASCE)CO.1943-7862.0000034.
Gavrila, D.M., 1999. The visual analysis of human movement: A survey. Comput. Vis. Image Underst. 73, 82–98. https://doi.org/10.1006/cviu.1998.0716.
Gheisari, M., Irizarry, J., Walker, A.B.N., 2014. UAS4SAFETY: The potential of unmanned aerial systems for construction safety applications. In: Construction Research Congress 2014: Construction in a Global Network. 2014, pp. 1801–1810. https://ascelibrary.org/doi/abs/10.1061/9780784413517.184.
Girshick, R., Donahue, J., Darrell, T., Malik, J., 2013. Rich feature hierarchies for accurate object detection and semantic segmentation. https://doi.org/10.1109/CVPR.2014.81.
Golparvar-Fard, M., Heydarian, A., Niebles, J.C., 2013. Vision-based action recognition of earthmoving equipment using spatio-temporal features and support vector machine classifiers. Adv. Eng. Inf. 27, 652–663. https://doi.org/10.1016/j.aei.2013.09.001.
Gong, J., Caldas, C.H., Gordon, C., 2011. Learning and classifying actions of construction workers and equipment using bag-of-video-feature-words and bayesian network models. Adv. Eng. Inf. 25, 771–782. https://doi.org/10.1016/j.aei.2011.06.002.
Grabowski, M., Ayyalasomayajula, P., Merrick, J., Harrald, J.R., Roberts, K., 2007. Leading indicators of safety in virtual organizations. Saf. Sci. 45, 1013–1043. https://doi.org/10.1016/j.ssci.2006.09.007.
Gregory, E., 2013. Understanding scene understanding. Front. Psychol. 4. https://doi.org/10.3389/fpsyg.2013.00954.
Guo, H., Yu, Y., Ding, Q., Skitmore, M., 2018. Image-and-skeleton-based parameterized approach to real-time identification of construction workers’ unsafe behaviors. J. Constr. Eng. Manage. - ASCE 144 (6), 1–10. https://doi.org/10.1061/(ASCE)CO.1943-7862.0001497. Article number: 04018042.
Hadikusumo, B.H.W., Rowlinson, S., 2002. Integration of virtually real construction model and design-for-safety-process database. Autom. Constr. 11, 501–509. https://doi.org/10.1016/S0926-5805(01)00061-9.
Han, S., Lee, S., Pena-Mora, F., 2013. Vision-based detection of unsafe actions of a construction worker: case study of ladder climbing. J. Comput. Civil Eng. 27, 635–644. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000279.
Han, S., Lee, S., 2013. A vision-based motion capture and recognition framework for behavior-based safety management. Autom. Constr. 35, 131–141. https://doi.org/10.1016/j.autcon.2013.05.001.
Heinrich, H.W., 1969. Industrial Accident Prevention, 4th ed. McGraw-Hill, New York.
Heinrich, H.W., 1980. Industrial Accident Prevention. McGraw-Hill, New York.
Held, D., Thrun, S., Savarese, S., 2016. Learning to track at 100 FPS with deep regression networks. In: Lecture Notes in Computer Science, GEWERBESTRASSE 11, CHAM, CH-6330, SWITZERLAND, pp. 749–765.
Hernandez-Leal, P., Escalante, H.J., Sucar, L.E., 2017. Towards a Generic Ontology for Video Surveillance. Springer, pp. 3–7.
Hinze, J., Thurman, S., Wehle, A., 2013. Leading indicators of construction safety performance. Saf. Sci. 51, 23–28. https://doi.org/10.1016/j.ssci.2012.05.016.
HS, 2018. BS ISO 45001:2018 - Occupational Health and Safety Management Systems. Requirements with Guidance for Use.
Irizary, J., Gheisari, M., Walker, B.N., 2012. Usability assessment of drone technology as safety inspection tools. Electronic J. Informat. Technol. Construct. 17, 194–212.
Jaafar, M.H., Arifin, K., Aiyub, K., Razman, M.R., Ishak, M.I.S., Samsurijan, M.S., 2018. Occupational safety and health management in the construction industry: A review. Int. J. Occupat. Saf. Ergon. 24, 493–506. https://doi.org/10.1080/10803548.2017.1366129.
Jun, Y., 2018. Enhancing action recognition of construction workers using data-driven scene parsing. J. Civil Eng. Manage. 24. https://doi.org/10.3846/jcem.2018.6133.
Kalal, Z., Mikolajczyk, K., Matas, J., 2012. Tracking-learning-detection. IEEE Trans. Pattern Anal. Mach. Intell. 34, 1409–1422. https://doi.org/10.1109/TPAMI.2011.239.
Karhu, O., Kansi, P., Kuorinka, I., 1977. Correcting working postures in industry: A practical method for analysis. Appl. Ergon. 8, 199–201. https://doi.org/10.1016/0003-6870(77)90164-8.
Kazi Tani, M., Ghomari, A., Lablack, A., Bilasco, I., 2017. OVIS: ontology video surveillance indexing and retrieval system. Int. J. Multimedia Informat. Retrieval 6, 295–316.
Kerr, W., 1957. Complementary theories of safety psychology. J. Social Psychol. 45, 3–9. https://doi.org/10.1080/00224545.1957.9714280.
Khosrowpour, A., Niebles, J.C., Golparvar-Fard, M., 2014. Vision-based workface assessment using depth images for activity analysis of interior construction operations. Autom. Constr. 48, 74–87. https://doi.org/10.1016/j.autcon.2014.08.003.
Kim, D., Yin, K., Liu, M., Lee, S., Kamat, V.R., 2017. Feasibility of a drone-based on-site proximity detection in an outdoor construction site. In: Computing in Civil Engineering 2017: Smart Safety, Sustainability, and Resilience, pp. 392–400.
Kim, D., Liu, M., Lee, S., Kamat, V.R., 2019. Remote proximity monitoring between mobile construction resources using camera-mounted UAVs. Autom. Constr. 99, 168–182. https://doi.org/10.1016/j.autcon.2018.12.014.
Kim, H., 2018. 3D reconstruction of a concrete mixer truck for training object detectors. Autom. Constr. 88, 23–30. https://doi.org/10.1016/j.autcon.2017.12.034.
Kim, H., Kim, H., Hong, Y.W., Byun, H., 2018a. Detecting construction equipment using a region-based fully convolutional network and transfer learning. J. Comput. Civil Eng. 32. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000731.
Kim, H., Kim, K., Kim, H., 2016. Vision-based object-centric safety assessment using fuzzy inference: monitoring struck-by accidents with moving objects. J. Comput. Civil Eng. 30. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000562.
Kim, J., Chi, S., Seo, J., 2018b. Interaction analysis for vision-based activity identification of earthmoving excavators and dump trucks. Autom. Constr. 87, 297–308. https://doi.org/10.1016/j.autcon.2017.12.016.
Kim, J., Chi, S., 2017. Adaptive detector and tracker on construction sites using functional integration and online learning. J. Comput. Civil Eng. 31, 4017026. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000677.
Kim, K., Kim, H., Kim, H., 2017b. Image-based construction hazard avoidance system using augmented reality in wearable device. Autom. Constr. 83, 390. https://doi.org/10.1016/j.autcon.2017.06.014.
Koch, C., Georgieva, K., Kasireddy, V., Akinci, B., Fieguth, P., 2015. A review on computer vision based defect detection and condition assessment of concrete and asphalt civil infrastructure. Adv. Eng. Informat. 29, 196–210.
Kolar, Z., Chen, H.N., Luo, X.W., 2018. Transfer learning and deep convolutional neural networks for safety guardrail detection in 2D images. Autom. Constr. 89, 58–70. https://doi.org/10.1016/j.autcon.2018.01.003.
Konstantinou, E., Brilakis, I., 2016. 3D matching of resource vision tracking trajectories. In: Construction Research Congress 2016: Old and New Construction Technologies Converge in Historic San Juan, pp. 1742–1752.
Konstantinou, E., Brilakis, I., 2018. Matching construction workers across views for automated 3D vision tracking on-site. J. Construct. Eng. Manage. 144. https://doi.org/10.1061/(ASCE)CO.1943-7862.0001508.
Konstantinou, E., Lasenby, J., Brilakis, I., 2019. Adaptive computer vision-based 2D tracking of workers in complex environments. Automat. Construct. 103, 168–184.
Krause, T.R., 1997. The Behavior-Based Safety Process: Managing Involvement for an Injury-Free Culture. Van Nostrand Reinhold, New York, New York.
Le, Q., Pedro, A., Park, C., 2015. A social virtual reality based construction safety education system for experiential learning. J. Intell. Rob. Syst. 79, 487–506. https://doi.org/10.1007/s10846-014-0112-z.
Lee, Y., Park, M., 2019. 3D tracking of multiple onsite workers based on stereo vision. Autom. Constr. 98, 146–159. https://doi.org/10.1016/j.autcon.2018.11.017.
Li, H., Lu, M., Hsu, S., Gray, M., Huang, T., 2015. Proactive behavior-based safety management for construction safety improvement. Saf. Sci. 75, 107–117. https://doi.org/10.1016/j.ssci.2015.01.013.
Li, X., Yi, W., Chi, H., Wang, X., Chan, A.P.C., 2018. A critical review of virtual and augmented reality (VR/AR) applications in construction safety. Autom. Constr. 86, 150–162. https://doi.org/10.1016/j.autcon.2017.11.003.
Li, X., Samaka, M., Chan, H.A., Bhamare, D., Gupta, L., Guo, C., Jain, R., 2017. Network slicing for 5G: challenges and opportunities. IEEE Internet Comput. 21, 20–27.
Liang, X., Shen, G.Q., Bu, S.S., 2016. Multiagent systems in construction: A ten-year review. J. Comput. Civil Eng. 30. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000574.
Lingard, H., 2013. Occupational health and safety in the construction industry. Construct. Manage. Econ. 31, 505–514. https://doi.org/10.1080/01446193.2013.816435.
Lingard, H., Wakefield, R., Cashin, P., 2011. The development and testing of a hierarchical measure of project OHS performance. Eng., Construct. Architect. Manage. 18, 30–49. https://doi.org/10.1108/09699981111098676.
Liu, M.Y., Hong, D.P., Han, S., Lee, S., 2016a. Silhouette-based on-site human action recognition in single-view video. In: Construction Research Congress 2016: Old and New Construction Technologies Converge in Historic San Juan, 951–959. https://doi.org/10.1061/9780784479827.096.
Liu, M., Han, S., Lee, A.S., 2017. Potential of Convolutional Neural Network-Based 2D Human Pose Estimation for On-Site Activity Analysis of Construction Workers. pp. 141–149. https://ascelibrary.org/doi/abs/10.1061/9780784480847.018.
Liu, M., Han, S., Lee, S., 2016b. Tracking-based 3D human skeleton extraction from stereo video camera toward an on-site safety and ergonomic analysis. Construct. Innovat. 16, 348–367. https://doi.org/10.1108/CI-10-2015-0054.
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C., 2016c. SSD: Single Shot MultiBox Detector. Springer International Publishing AG, CHAM, pp. 21–37. https://doi.org/10.1007/978-3-319-46448-0_2.
Liu, X., Jia, M., Zhang, X., Lu, W., 2018. A novel multi-channel Internet of Things based on dynamic spectrum sharing in 5G communication. IEEE Internet Things J.
Lu, C., Krishna, R., Bernstein, M., Fei-Fei, L., 2016. Visual relationship detection with language priors. In: Lecture Notes in Computer Science, GEWERBESTRASSE 11, CHAM, CH-6330, SWITZERLAND, pp. 852–869. https://doi.org/10.1109/IPAS.2018.8708855.
Luo, H., Xiong, C., Fang, W., Love, P., Zhang, B., Ouyang, X., 2018a. Convolutional neural


networks: computer vision-based workforce activity assessment in construction. 1016/j.ssci.2011.07.015.


Autom. Constr. 94, 282. https://doi.org/10.1016/j.autcon.2018.06.007. Ren, S.Q., He, K.M., Girshick, R., Sun, J., 2015. Faster R-CNN: Towards Real-Time Object
Luo, X., Li, H., Cao, D., Yu, Y., Yang, X., Huang, T., 2018b. Towards efficient and objective Detection with Region Proposal Networks. Neural Information Processing Systems
work sampling: recognizing workers' activities in site surveillance videos with two- (NIPS), La Jolla. https://doi.org/10.1109/TPAMI.2016.2577031.
stream convolutional networks. Autom. Constr. 94, 360–370. https://doi.org/10. Ren, X.N., Zhu, Z.H., Chen, Z., Dai, F., 2016. Project related entities tracking on con-
1016/j.autcon.2018.07.011. struction sites by particle filtering. In: Construction Research Congress 2016: Old and
Markoulli, M.P., Lee, C.I.S.G., Byington, E., Felps, W.A., 2017. Mapping human resource New Construction Technologies Converge in Historic San Juan, 909–918. https://
management: reviewing the field and charting future directions. Human Resource doi.org/10.1061/9780784479827.092.
Manage. Rev. 27, 367–396. Rubaiyat, A.M., Toma, T.T., Kalantari-Khandani, M., Rahman, S.A., Chen, L.W., Ye, Y.F.,
A comprehensive methodology for vision-based progress and activity estimation of ex- Pan, C.S., 2016. Automatic detection of helmet uses for construction safety. In: 2016
cavation processes for productivity assessment. vol. Unpublished. https://doi.org/10. IEEE/WIC/ACM International Conference On Web Intelligence Workshops (WIW
13140/RG.2.1.4630.2561. 2016), 135–142. https://doi.org/10.1109/WIW.2016.10.
Mcatamney, L., Nigel Corlett, E., 1993. RULA: A survey method for the investigation of Sacks, R., Rozenfeld, O., Rosenfeld, Y., 2009. Spatial and temporal exposure to safety
work-related upper limb disorders. Appl. Ergon. 24, 91–99. https://doi.org/10.1016/ hazards in construction. (Author abstract) (Report). J. Construct. Eng. Manage. 135,
0003-6870(93)90080-S. 726. https://doi.org/10.1061/(ASCE)0733-9364(2009)135:8(726).
Mcsween, T.E., 2003. Value-Based Safety Process: Improving Your Safety Culture with Schaub, K., 2006. Ergonomics of manual handling - part 1: lifting and carrying. In:
Behavior-Based Safety. Wiley-Interscience, Hoboken, N.J. Karwowski, W. (Ed.), Handbook on Standards and Guidelines in Ergonomics and
Mingyuan, Z., Tianzhuo, C., Xuefeng, Z., 2017. Applying sensor-based technology to Human Factors. Lawrence Erlbaum, London, pp. 254–269.
improve construction safety management. Sensors 17, 1841. https://doi.org/10. Seo, J., Starbuck, R., Han, S., Lee, S., Armstrong, T.J., 2015a. Motion Data-driven bio-
3390/s17081841. mechanical analysis during construction tasks on sites. J. Comput. Civil Eng. 29.
Mitropoulos, P., Abdelhamid, T.S., Howell, G.A., 2005. Systems model of construction https://doi.org/10.1061/(ASCE)CP.1943-5487.0000400.
accident causation. J. Constr. Eng. Manage. 131, 816. https://doi.org/10.1061/ Seo, J., Yin, K.Q., Lee, S., 2016. Automated postural ergonomic assessment using a
(ASCE)0733-9364(2005)131:7(816). computer vision-based posture classification. In: Construction Research Congress
Mneymneh, B.E., Abbas, M., Khoury, H., 2017. Automated Hardhat Detection for 2016: Old and New Construction Technologies Converge in Historic San Juan,
Construction Safety Applications. Elsevier Science BV, Amsterdam, pp. 895–902. 809–818. https://doi.org/10.1061/9780784479827.082.
https://doi.org/10.1016/j.proeng.2017.08.022. Seo, J., Han, S., Lee, S., Kim, H., 2015b. Computer vision techniques for construction
Moeslund, T.B., Hilton, A., Krüger, V., 2006. A survey of advances in vision-based human safety and health monitoring. Adv. Eng. Inf. 29, 239–251. https://doi.org/10.1016/j.
motion capture and analysis. Comput. Vis. Image Underst. 104, 90–126. https://doi. aei.2015.02.001.
org/10.1016/j.cviu.2006.08.002. Seokho, C., Caldas, C.H., 2012. Image-based safety assessment: automated spatial safety
Mok, K.Y., Shen, G.Q., Yang, J., 2015. Stakeholder management studies in mega con- risk identification of earthmoving and surface mining activities. (Author Abstract). J.
struction projects: A review and future directions. Int. J. Project Manage. 33, Construct. Eng. Manage. 138, 341. https://doi.org/10.1061/(ASCE)CO.1943-7862.
446–457. https://doi.org/10.1016/j.ijproman.2014.08.007. 0000438.
Nam, H., Han, B., 2016. Learning Multi-Domain Convolutional Neural Networks for Shaoqing, R., Kaiming, H., Girshick, R., Jian, S., 2017. Faster R-CNN: towards real-time
Visual Tracking. IEEE, New York, pp. 4293–4302. https://doi.org/10.1109/CVPR. object detection with region proposal networks. IEEE Trans. Pattern Anal. Mach.
2016.465. Intell. 39, 1137–1149. https://doi.org/10.1109/TPAMI.2016.2577031.
Oberkampf, W.L., Trucano, T.G., 2008. Verification and validation benchmarks. Nucl. Shappell, S.A., Wiegmann, D.A., 1997. A human error approach to accident investigation:
Eng. Design 238, 716–743. the taxonomy of unsafe operations. Int. J. Aviation Psychol. 7, 269–291. https://doi.
Park, C., Kim, H., 2013. A framework for construction safety management and visuali- org/10.1207/s15327108ijap0704_2.
zation system. Autom. Constr. 33, 95. https://doi.org/10.1016/j.autcon.2012.09. Sheehan, C., Donohue, R., Shea, T., Cooper, B., Cieri, H.D., 2016. Leading and lagging
012. indicators of occupational health and safety: the moderating role of safety leadership.
Park, M.W., Brilakis, I., 2016. Continuous localization of construction workers via in- Accid. Anal. Prev. 92, 130–138. https://doi.org/10.1016/j.aap.2016.03.018.
tegration of detection and tracking. Autom. Constr. 72, 129–142. https://doi.org/10. Shrestha, K., Shrestha, P.P., Bajracharya, D., Yfantis, E.A., 2015. Hard-hat detection for
1016/j.autcon.2016.08.039. construction safety visualization. J. Construct. Eng. 2015. https://doi.org/10.1155/
Park, M.W., Elsafty, N., Zhu, Z.H., 2015. Hardhat-wearing detection for enhancing on-site 2015/721380.
safety of construction workers. J. Construct. Eng. Manage. 141. https://doi.org/10. Sinelnikov, S., Inouye, J., Kerper, S., 2015. Using leading indicators to measure occu-
1061/(ASCE)CO.1943-7862.0000974. pational health and safety performance. Saf. Sci. 72, 240–248. https://doi.org/10.
Park, M., Brilakis, I., 2012. Construction worker detection in video frames for initializing 1016/j.ssci.2014.09.010.
vision trackers. Autom. Constr. 28, 15–25. https://doi.org/10.1016/j.autcon.2012. Soltani, M.M., Zhu, Z.H., Hammad, A., 2016. Automated annotation for visual recognition
06.001. of construction resources using synthetic images. Autom. Constr. 62, 14–23. https://
Park, M., Makhmalbaf, A., Brilakis, I., 2011. Comparative study of vision tracking doi.org/10.1016/j.autcon.2015.10.002.
methods for tracking of construction site resources. Autom. Constr. 20, 905–915. Soltani, M.M., Zhu, Z.H., Hammad, A., 2018. Framework for location data fusion and pose
https://doi.org/10.1016/j.autcon.2011.03.007. estimation of excavators using stereo vision. J. Comput. Civil Eng. 32. https://doi.
Pawłowska, Z., 2015. Using lagging and leading indicators for the evaluation of occu- org/10.1061/(ASCE)CP.1943-5487.0000783.
pational safety and health performance in industry. Int. J. Occupat. Saf. Ergon. 21, Soltani, M.M., Zhu, Z.H., Hammad, A., 2017. Skeleton estimation of excavator by de-
284–290. https://doi.org/10.1080/10803548.2015.1081769. tecting its parts. Autom. Constr. 82, 1–15. https://doi.org/10.1016/j.autcon.2017.
Peddi, A., Huan, L., Bai, Y., Kim, S., 2009. Development of human pose analyzing algo- 06.023.
rithms for the determination of construction productivity in real-time. In: Sowmya, K.S., 2017. Construction workers activity detection using BOF. In: 2017
Ariaratnam, S.T., Rojas, E.M. (Eds.), Construction Research Congress 2009. vol. International Conference on Recent Advances in Electronics and Communication
Reston, VA: American Society of Civil Engineers, Reston, VA, pp. 11–20. https://doi. Technology (ICRAECT), 159–163. https://doi.org/10.1109/ICRAECT.2017.54.
org/10.1061/41020(339)2. Straker, L., Campbell, A., Coleman, J., Ciccarelli, M., Dankaerts, W., 2010. In vivo la-
Perlman, A., Sacks, R., Barak, R., 2014. Hazard recognition and risk perception in con- boratory validation of the physiometer: A measurement system for long-term re-
struction. Saf. Sci. 64, 22–31. https://doi.org/10.1016/j.ssci.2013.11.019. cording of posture and movements in the workplace. Ergonomics: An Int. J. Res.
Petersen, D., 1971. Techniques of Safety Management. McGraw-Hill, New York. Practice Human Factors Ergon. https://doi.org/10.1080/00140131003671975.
Piyathilaka, L., Kodagoda, S., 2013. Gaussian Mixture Based HMM for Human Daily Suraji, A., Duff, A.R., Peckitt, S.J., 2001. Development of causal model of construction
Activity Recognition Using 3D Skeleton Features, pp. 567–572. accident causation. J. Construct. Eng. Manage. 127, 337. https://doi.org/10.1061/
Ray, S.J., Teizer, J., 2011. Coarse head pose estimation of construction equipment op- (ASCE)0733-9364(2001)127:4(337).
erators to formulate dynamic blind spots. Adv. Eng. Inf. 26. https://doi.org/10.1016/ Tabak, F., Smith, W., 2005. Privacy and electronic monitoring in the workplace: A model
j.aei.2011.09.005. of managerial cognition and relational trust development. Employee Responsib.
Ray, S.J., Teizer, J., 2013. Computing 3D blind spots of construction equipment: im- Rights J. 17, 173–189. https://doi.org/10.1007/s10672-005-6940-z.
plementation and evaluation of an automated measurement and visualization method Tajeen, H., Zhu, Z., 2014. Image Dataset development for measuring construction
utilizing range point cloud data. Autom. Constr. 36, 95–107. https://doi.org/10. equipment recognition performance. Autom. Constr. 48, 1–10. https://doi.org/10.
1016/j.autcon.2013.08.007. 1016/j.autcon.2014.07.006.
Ray, S.J., Teizer, J., 2016. Dynamic blindspots measurement for construction equipment Tao, C., Teizer, J., 2014. Modeling tower crane operator visibility to minimize the risk of
operators. Saf. Sci. 85, 139–151. https://doi.org/10.1016/j.ssci.2016.01.011. limited situational awareness. (Report). J. Comput. Civil Eng. 28, 4014004. https://
Ray, S.J., Teizer, J., 2012. Real-time construction worker posture analysis for ergonomics doi.org/10.1061/(ASCE)CP.1943-5487.0000282.
training. Adv. Eng. Inf. 26, 439–455. https://doi.org/10.1016/j.aei.2012.02.011. Teizer, J., Vela, P.A., 2009. Personnel tracking on construction sites using video cameras.
Ray, S., Teizer, J., Bostelman, R., Agronin, M., Albanese, D., 2013. Improved Methods for Adv. Eng. Inf. 23, 452–462. https://doi.org/10.1016/j.aei.2009.06.011.
Evaluation of Visibility for Industrial Vehicles Towards Safety Standards. IAARC Teizer, J., 2015. Status Quo and open challenges in vision-based sensing and tracking of
Publications, Waterloo, pp. 1–8. temporary resources on infrastructure construction sites. Adv. Eng. Inf. 29, 225–238.
Reason, J., 1990. Human error. West J. Med. 12, 393–396. https://doi.org/10.1016/j.aei.2015.03.006.
Redmon, J., Divvala, S., Girshick, R., Farhadi, A., 2016. You Only Look Once: Unified, Teizer, J., Allread, B.S., Mantripragada, U., 2010. automating the blind spot measurement
Real-Time Object Detection. IEEE, New York, pp. 779–788. https://doi.org/10. Teizer, J., Allread, B.S., Mantripragada, U., 2010. Automating the blind spot measurement
1109/CVPR.2016.91. autcon.2009.12.012.
Reese, C.D., Eidson, J.V., 2006. Handbook of OSHA Construction Safety and Health. Teizer, J., Cheng, T., Fang, Y., 2013. Location tracking and data visualization technology
Reiman, T., Pietikainen, E., 2012. Leading indicators of system safety - monitoring and to advance construction ironworkers' education and training in safety and pro-
driving the organizational safety potential. Saf. Sci. 50, 1993. https://doi.org/10. ductivity. Autom. Constr. 35, 53–68. https://doi.org/10.1016/j.autcon.2013.03.004.


Teo, E.A.L., Ling, F.Y.Y., Chong, A.F.W., 2005. Framework for project managers to 10.1038/nature14539.
manage construction safety. Int. J. Project Manage. 23, 329–341. https://doi.org/10. Yi, W., Chan, A.P.C., 2014. Critical review of labor productivity research in construction
1016/j.ijproman.2004.09.001. journals. (Author Abstract). J. Manage. Eng. 30, 214. https://doi.org/10.1061/
Thomas Ng, S., Pong Cheng, K., Martin Skitmore, R., 2005. A framework for evaluating (ASCE)ME.1943-5479.0000194.
the safety performance of construction contractors. Build. Environ. 40, 1347–1355. Yi, W., Chan, A., 2016. Health profile of construction workers in Hong Kong. Int. J.
https://doi.org/10.1016/j.buildenv.2004.11.025. Environ. Res. Publ. Health 13, 1232. https://doi.org/10.3390/ijerph13121232.
Turaga, P., Chellappa, R., Subrahmanian, V.S., Udrea, O., 2008. Machine recognition of Yuan, C.X., Li, S., Cai, H.B., 2017. Vision-based excavator detection and tracking using
human activities: A survey. IEEE Trans. Circuits Syst. Video Technol. 18, 1473–1488. hybrid kinematic shapes and key nodes. J. Comput. Civil Eng. 31. https://doi.org/10.
https://doi.org/10.1109/TCSVT.2008.2005594. 1061/(ASCE)CP.1943-5487.0000602.
Vojir, T., Noskova, J., Matas, J., 2014. Robust scale-adaptive mean-shift for tracking. Zhang, H.W., Kyaw, Z., Chang, S.F., Chua, T.S., 2017. Visual Translation Embedding
Pattern Recogn. Lett. 49, 250–258. https://doi.org/10.1016/j.patrec.2014.03.025. Network for Visual Relation Detection. IEEE, New York, pp. 3107–3115. https://doi.
Wang, H.B., Zou, H.L., Zhang, R.Z., 2014. A Novel Approach to Detect the Posture of org/10.1109/CVPR.2017.331.
Excavator's Manipulator. Trans Tech Publications Ltd, Stafa-Zurich, pp. 891–898. Zhang, H., Yan, X.Z., Li, H., 2018. Ergonomic posture recognition using 3D view-invariant
Wang, J., Razavi, S.N., 2016. Low false alarm rate model for unsafe-proximity detection features from single ordinary camera. Autom. Constr. 94, 1–10. https://doi.org/10.
in construction. J. Comput. Civil Eng. 30. https://doi.org/10.1061/(ASCE)CP.1943- 1016/j.autcon.2018.05.033.
5487.0000470. Zhang, S., Sulankivi, K., Kiviniemi, M., Romo, I., Eastman, C.M., Teizer, J., 2015. BIM-
Wang, J., Zhang, S., Teizer, J., 2015. Geotechnical and safety protective equipment based fall hazard identification and prevention in construction safety planning. Saf.
planning using range point cloud data and rule checking in building information Sci. 72, 31–45. https://doi.org/10.1016/j.ssci.2014.08.001.
modeling. Autom. Constr. 49, 250–261. https://doi.org/10.1016/j.autcon.2014.09. Zheng, W.L., Bhandarkar, S.M., 2006. A Boosted Adaptive Particle Filter for Face
002. Detection and Tracking. IEEE, New York, p. 2821. https://doi.org/10.1109/ICIP.
Wang, X., Dong, X.S., Choi, S.D., Dement, J., 2016. Work-related musculoskeletal dis- 2006.312995.
orders among construction workers in the United States from 1992 to 2014. Occup. Zhong, B., Wu, H., Ding, L., Love, P.E.D., Li, H., Luo, H., Jiao, L., 2019. Mapping computer
Environ. Med. 74, 374. https://doi.org/10.1136/oemed-2016-103943. vision research in construction: Developments, knowledge gaps and implications for
Wong, L., Wang, Y., Law, T., Lo, C.T., 2016. Association of root causes in fatal fall-from- research. Automat. Construct. 107.
height construction accidents in Hong Kong. J. Construct. Eng. Manage. 142. https:// Zhu, Z.H., Park, M.W., Koch, C., Soltani, M., Hammad, A., Davari, K., 2016. Predicting
doi.org/10.1061/(ASCE)CO.1943-7862.0001098. movements of onsite workers and mobile equipment for enhancing construction site
Wu, H., Zhao, J.S., 2018. An intelligent vision-based approach for helmet identification safety. Autom. Constr. 68, 95–101. https://doi.org/10.1016/j.autcon.2016.04.009.
for work safety. Comput. Ind. 100, 267–277. https://doi.org/10.1016/j.compind.
2018.03.037.
Yang, J., Arif, O., Vela, P.A., Teizer, J., Shi, Z., 2010. Tracking multiple workers on Web References
construction sites using video cameras. Adv. Eng. Inf. 24, 428–434. https://doi.org/
10.1016/j.aei.2010.06.008. 2012. Nonfatal Occupational Injuries and Illnesses Requiring Days Away from Work, 2.
Yang, J., Cheng, T., Teizer, J., Vela, P.A., Shi, Z.K., 2011a. A performance evaluation of Bureau of Labor Statistics. http://www.bls.gov/news.release/osh2.nr0.htm (Nov.
vision and radio frequency tracking methods for interacting workforce. Adv. Eng. Inf. 12, 8).
25, 736–747. https://doi.org/10.1016/j.aei.2011.04.001. 2015. http://www.wpro.who.int (Nov. 26, 8).
Yang, J., Shi, Z.K., Wu, Z.Y., 2016. Vision-based action recognition of construction 2016. WHO Definition of Health. World Health Organization. http://www.pitt.edu/
workers using dense trajectories. Adv. Eng. Inf. 30, 327–336. https://doi.org/10. ~super1/globalhealth/What%20is%20Health.htm (Nov. 8, 8).
1016/j.aei.2016.04.009. 2017a. Census of Fatal Occupational Injuries (CFOI). Bureau of Labor Statistics. https://
Yang, J., Vela, P., Teizer, J., Shi, Z., 2011. Vision-Based crane tracking for understanding www.bls.gov/iif/oshcfoi1.htm (Nov. 02, 8).
construction activity. In: Zhu, Y., Issa, R.R. (Eds.), International Workshop on 2017b. CPWR Quarterly Data Report: Struck-By Injuries in the Construction Industry. The
Computing in Civil Engineering 2011. vol. Reston, VA: American Society of Civil Center for Construction Research and Training. https://www.cpwr.com/
Engineers, Reston, VA, pp. 258–265. https://doi.org/10.1061/(ASCE)CP.1943-5487. publications/cpwr-updates/cpwr-quarterly-data-report-struck-injuries-construction-
0000242. industry (Feb. 16, 9).
Yang, M., Chao, C., Huang, K., Lu, L., Chen, Y., 2013. Image-based 3D scene re- 2018. Accident List. http://sgxxxt.mohurd.gov.cn/Public/AccidentList.aspx (Nov. 30, 8).
construction and exploration in augmented reality. Autom. Constr. 33, 48–60. Green, L.A.T.G. 2012. “Real-time Proactive Safety in Construction.” http://www.
https://doi.org/10.1016/j.autcon.2012.09.017. powermag.com/real-time-proactive-safety-in-construction/ (Aug. 3, 2018).
Yann, L., Yoshua, B., Geoffrey, H., 2015. Deep learning. Nature 521, 436. https://doi.org/
