Cai et al. 2022. Prediction-Based Path Planning for Safe and Efficient Human–Robot Collaboration in Construction via Deep Reinforcement Learning
Abstract: Robotics has attracted broad attention as an emerging technology in construction to help workers with repetitive, physically
demanding, and dangerous tasks, thus improving productivity and safety. Under the new era of human–robot coexistence and collaboration
in dynamic and complex workspaces, it is critical for robots to navigate to the targets efficiently without colliding with moving workers.
This study proposes a new deep reinforcement learning (DRL)–based robot path planning method that integrates the predicted movements of
construction workers to achieve safe and efficient human–robot collaboration in construction. First, an uncertainty-aware long short-term
memory network is developed to predict the movements of construction workers and associated uncertainties. Second, a DRL framework is
formulated, where predicted movements of construction workers are innovatively integrated into the state space and the computation of the
reward function. By incorporating predicted trajectories in addition to current locations, the proposed method enables proactive planning such
that the robot could better adapt to human movements, thus ensuring both safety and efficiency. The proposed method was demonstrated and
evaluated using simulations generated based on real construction scenarios. The results show that prediction-based DRL path planning
achieved a 100% success rate (over a total of 10,000 episodes) for robots to reach the destination along the near-shortest path. Furthermore,
it reduced the collision rate with moving workers by 23% compared with the conventional DRL method, which does not consider predicted
information. DOI: 10.1061/(ASCE)CP.1943-5487.0001056. © 2022 American Society of Civil Engineers.
space to provide additional information for robot path planning (detailed in the next section), and have the same scale that is determined by the environment. Thus, it can be directly used as input for deep Q-networks (DQNs), as detailed in the "DQN Model" section.
State Space

The state space reflects the observations of the robot about the environment, which serve as the input of its decision process. In conventional studies on robot navigation and control (e.g., Gao et al. 2020; Yan et al. 2021), raw sensory data [e.g., light detection and ranging (LiDAR) measurements and images from cameras] are used to model the state space. In contrast, this study modeled the state space using high-level information, including the current location and destination of the robot, as well as the current and predicted locations of construction workers, which can be processed from raw sensory data using various object localization and tracking algorithms (Roberts and Golparvar-Fard 2019; Cai and Cai 2020). Through such an approach, the proposed method can be naturally extended to other systems with different sensors. As a result, at any time step t, the state space is a 16-dimensional vector, denoted

S_t = [x_r^t, y_r^t, x_d^t, y_d^t, x_o^{t,0}, y_o^{t,0}, x_o^{t,1}, y_o^{t,1}, ..., x_o^{t,5}, y_o^{t,5}]    (1)

where [x_r^t, y_r^t] = location of the robot in terms of 2D grid coordinates, varying at each time step after the robot takes an action; in practice, this information could be obtained via multiple sources, such as an onboard global positioning system (GPS), cameras, and/or LiDAR; [x_d^t, y_d^t] = destination of the robot in terms of 2D grid coordinates, remaining unchanged during the decision horizon, which is reasonable because a robot is typically given a specific task with a known goal (i.e., destination) in practice; and [x_o^{t,0}, y_o^{t,0}, x_o^{t,1}, y_o^{t,1}, ..., x_o^{t,5}, y_o^{t,5}] = current location ([x_o^{t,0}, y_o^{t,0}]) and predicted locations ([x_o^{t,1}, y_o^{t,1}, ..., x_o^{t,5}, y_o^{t,5}]) of the robot's nearest worker. The predicted locations and associated uncertainties are obtained from the uncertainty-aware LSTM model.

Action Space

Given the current and target locations in the 2D grid environment, a robot takes actions to move from the current location to the target location along the grid. Existing studies typically defined the action space of a mobile robot to include four discrete actions, i.e., up, down, left, and right (Xu et al. 2019; Zhang et al. 2019). In addition to the four actions, this study also introduces a fifth action, i.e., stay, to incorporate the scenario where the robot can choose to wait for workers to move away based on the predicted movement of workers. It was assumed that the robot moves at a fixed speed (one grid per time step); the updated location of the robot after it takes one specific action, illustrated in Fig. 3, determines the transition rule of the robot location.

Reward Function

The reward function is critical to the success of DRL in solving complex tasks. In mobile robot path planning on construction sites, a robot is expected to move efficiently from its start point to its destination without colliding with workers moving on the jobsite. Therefore, the design of the reward function considers both efficiency and safety. For efficiency, three factors are incorporated:
• At any time step, the robot should be motivated to move closer to the target position instead of moving away from the target. Therefore, after taking an action, if the distance between the robot's location and the desired position is smaller than the previous distance, a positive reward (r_dist) is applied; otherwise, a negative reward is applied.
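The state and action definitions above can be sketched in code. The following is a minimal illustration only (grid size taken from the experiments described later; all function and variable names are hypothetical, not from the authors' implementation):

```python
# Sketch of the 16-D state vector and the 5-action transition rule.
# GRID_W x GRID_H corresponds to the 32 x 22 grid map used in the experiments.

GRID_W, GRID_H = 32, 22

# Five discrete actions: up, down, left, right, and stay.
ACTION_DELTAS = {
    "up": (0, 1), "down": (0, -1),
    "left": (-1, 0), "right": (1, 0),
    "stay": (0, 0),
}

def build_state(robot, dest, worker_track):
    """Assemble S_t = [robot, destination, current + 5 predicted
    positions of the nearest worker] as a flat 16-D vector."""
    assert len(worker_track) == 6  # (x_o^{t,0}, y_o^{t,0}) ... (x_o^{t,5}, y_o^{t,5})
    state = [*robot, *dest]
    for x, y in worker_track:
        state += [x, y]
    return state  # length 16

def step(pos, action):
    """Transition rule: the robot moves one grid per time step."""
    dx, dy = ACTION_DELTAS[action]
    return (pos[0] + dx, pos[1] + dy)

def out_of_bounds(pos):
    """Exceeding the site boundary terminates the episode with a penalty."""
    return not (0 <= pos[0] < GRID_W and 0 <= pos[1] < GRID_H)
```

Note that, consistent with the text, leaving the grid is not clipped here but detected and penalized, since the boundary is treated as a hard constraint that terminates the episode.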
For safety, potential collisions with the current and predicted positions of the nearest worker over the horizon of the predicted trajectory (i.e., five time steps) are considered, and a deduction factor is applied for potential collisions with the predicted positions of workers. Let r_obs denote the penalty for colliding with a worker (i.e., the robot and worker are in the same grid), and γ the deduction factor; then, the penalty (negative reward) for colliding with a worker n steps later is denoted by γ^n r_obs (no deduction for colliding at the current location). As a result, a negative reward is formulated for potential collisions with future positions of workers. To avoid applying penalties repeatedly, only the earliest time when a collision may occur is considered. Thus, r_collision = min{r_obs, γ^1 r_obs, ..., γ^n r_obs}, where n = 5 in this case. Specifically, if the robot does not collide with the worker at t, r_obs for that step is zero. In this way, the penalty is only applied to the earliest time step of a potential collision, and the later the collision, the smaller the penalty.
• Once the robot exceeds the boundary of the site, a large penalty, denoted as r_boundary, is applied to ensure the robot moves within its workspace.
As a result, the reward function of the proposed method is r = r_dist + r_aimless + r_dest + r_collision + r_boundary.

DQN Model

The MDP problem is solved via DRL, a modern approach that overcomes the limitations of traditional Q-learning by improving computational efficiency and convergence, leveraging advanced function approximators (e.g., neural networks) instead of the Q-table (Du and Ghavidel 2022). Various DRL algorithms have been developed, including DQN (Mnih et al. 2013), deep deterministic policy gradients (DDPG) (Lillicrap et al. 2015), and actor-critic methods (Mnih et al. 2016). In this study, the proposed DRL-based robot path planning method was implemented based on a typical DRL algorithm, DQN, which employs neural networks to approximate the Q values. The Bellman equation [Eq. (2)] was adopted for value updating, with a separate target network that is not updated in every training step, and such stability could also improve the convergence.

Specifically, a fully connected (FC) neural network with two hidden layers (24 nodes for each layer) was used as the approximator of both the action-value function and the target function of the DQN (Du and Ghavidel 2022). The 16-dimensional state vector was considered as input to the neural network. The rectified linear unit (ReLU) activation function was implemented for the two hidden layers, and a linear activation function was implemented for the output layer, as illustrated in Fig. 4.

Experiments and Results

Environment Setup

The proposed prediction-based DRL framework for robot path planning was demonstrated and evaluated using simulations generated based on real construction scenarios, detailed as follows.

First, construction videos from three real construction projects [two recorded by the authors, and one collected from YouTube (2019)] were used to train the uncertainty-aware LSTM model for worker trajectory prediction, which was then used to generate simulation environments to validate the proposed method for robot path planning. The videos consisted of a total of 84 workers conducting various construction activities, including site preparation, material handling, formwork, and so on. The videos were taken from various angles at height, similar to the perspective of surveillance cameras. Fig. 5 illustrates some sample images of the collected videos.

All videos were downsampled to 2 frames per second (fps), compatible with other studies (Alahi et al. 2016) on video-based human trajectory prediction. The videos were preprocessed, and bounding boxes of workers were manually labeled to extract entity
positions, serving as inputs of the prediction model. A total of 2,544 frames consisting of 241 trajectories with various lengths were extracted (Cai et al. 2020), where a trajectory was terminated if the worker was severely occluded by other objects. In this study, the observation duration of worker movements was 3 s (i.e., six frames) and the prediction duration was 5 s (i.e., 10 frames), following relevant studies (Alahi et al. 2016) on human trajectory prediction. Therefore, the 241 trajectories were further split into 16-frame tracks using a sliding window with a stride of two frames, resulting in 3,640 tracks for training and testing the prediction model. Consistent with previous studies (Cai et al. 2020, 2021), the data set was randomly split into a training set (80%), validation set (10%), and testing set (10%). The network was trained with the Adam optimizer (Kingma and Ba 2014), with a learning rate of 0.001, batch size of 20, and dropout rate of 0.5. The proposed method was implemented on a desktop computer with an Intel i7-9700 CPU, 32 GB of RAM, and an NVIDIA GeForce GTX 2060 GPU using the Keras library within the TensorFlow platform.

Fig. 6 visualizes an example of predicted trajectories with associated uncertainties, where lines represent the actual trajectories of construction workers. In Fig. 6, the predicted trajectories closest to the actual one (lighter color) have the highest probability, whereas those farther away (darker color) exhibit relatively low probability. This establishes a reliable basis to incorporate predicted trajectories and their associated uncertainties for robot path planning.

Second, the aforementioned construction scenario (Fig. 6) was selected to simulate the environment to train the DRL model for robot path planning. To establish a realistic environment, the construction site and worker locations were converted from the image plane to the ground plane via projective transformation (Hartley and Zisserman 2003), where the transformation matrix was estimated from known dimensions of construction equipment. Such approximation may not influence the simulation of robot path planning because the locations of different workers and the boundary of the jobsite were estimated from the same set of parameters. The construction site, after being converted to the ground plane, is a 16 × 11 m area, and was further divided into a 32 × 22 grid map with a grid size of 0.5 m.

Third, to train a generalized DRL model, a large amount of training data is needed. To achieve this, extensive simulations were conducted where construction environments with different configurations were generated based on the aforementioned construction scenario (Fig. 6). Specifically, in each training episode of the DRL model, (1) the initial and target locations of the robot were randomly generated within the environment (i.e., the 32 × 22 grid map obtained in the previous step); (2) the number of workers was randomly selected, ranging from one to five; and (3) the initial movements of workers were randomly extracted from all trajectories in the video to reflect real construction scenarios, and their following trajectories during the training process were sampled from the predicted trajectory distribution obtained from the proposed LSTM model in the first step.

Model Training

In the proposed path planning method, the parameter values in the reward function, especially the relative magnitudes of different components, have a significant influence on the model performance and convergence and should be determined considering the planning context and objective. For instance, the relative magnitude between the reward of moving toward the destination (i.e., r_dist) and the penalty of collision (i.e., r_collision) indicates the relative importance of efficiency and safety.
The collision penalty should be compatible with the reward (or penalty) of the robot moving close to (or far away from) the destination; otherwise, the robot tends to move back and forth instead of moving toward the goal. The reward for reaching the destination should be the largest so that the robot has the motivation to complete the task within the planning horizon. In addition, the boundary of the workspace was considered a hard constraint, and the episode was terminated if the robot exceeded the boundary. Therefore, the penalty for exceeding the boundary (r_boundary) should be the largest so that the robot can rapidly learn the boundary states and avoid exceeding the boundary.

Given the aforementioned principles and after experiments with various combinations of values, the parameters in the reward function were determined as follows: r_dest = 100, indicating the robot is given a +100 reward for reaching the destination; r_dist = 1, representing that the robot receives a +1 reward or a −1 penalty each step for moving closer to or farther away from the destination; if the distance between the robot and the destination remains unchanged in two steps, r_dist = 0; and r_aimless = −1, indicating the robot receives a −1 penalty for aimless movement. The neural network was trained with an initial learning rate of 0.01 and exponential learning rate decay. Here, 50,000 training episodes were considered for the DRL model training. For each training episode, the process was terminated if any of the three criteria was satisfied: (1) the step count reached 100; (2) the robot reached its destination; or (3) the robot exceeded the environment boundaries. The convergence of the training curve is shown in Fig. 7.

Performance of the DRL Model

The trained DRL agent was further tested using 10,000 random episodes (100 steps per episode) to evaluate the learning performance. In addition, a conventional DRL agent that does not consider the predicted locations of workers was trained and tested for comparison. Specifically, in the conventional DRL method, the observation space only contained the current and target locations of the robot, and the current location of the nearest worker. The proposed reward function and corresponding parameters were also used in the conventional DRL agent for consistency, except for the exclusion of the penalty for potential collision at future time steps.
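The reward design described in the preceding sections (distance shaping plus a discounted penalty at the earliest predicted collision) can be sketched as follows. The magnitudes of r_obs, r_boundary, and γ are not fully specified in this excerpt, so the values below are assumed placeholders, and the r_aimless condition is an interpretation of "aimless movement":

```python
# Sketch of r = r_dist + r_aimless + r_dest + r_collision + r_boundary.
# The paper specifies r_dest = 100, r_dist = +1/-1 (0 if distance unchanged),
# and r_aimless = -1; R_OBS, R_BOUNDARY, and GAMMA are assumed values.

R_DEST, R_OBS, R_BOUNDARY, GAMMA = 100.0, -10.0, -100.0, 0.9

def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def collision_penalty(robot, worker_track, gamma=GAMMA, r_obs=R_OBS):
    """Penalize only the earliest step n (0..5) at which the robot's new
    location coincides with the worker's current/predicted location:
    r_collision = gamma**n * r_obs, so later collisions cost less."""
    for n, w in enumerate(worker_track):
        if robot == w:
            return (gamma ** n) * r_obs
    return 0.0

def reward(prev_pos, new_pos, dest, worker_track, exceeded_boundary):
    if exceeded_boundary:
        return R_BOUNDARY  # hard constraint; the episode also terminates
    if new_pos == dest:
        return R_DEST
    d_prev, d_new = manhattan(prev_pos, dest), manhattan(new_pos, dest)
    r_dist = 1.0 if d_new < d_prev else (-1.0 if d_new > d_prev else 0.0)
    r_aimless = -1.0 if d_new == d_prev else 0.0  # assumed meaning of "aimless"
    return r_dist + r_aimless + collision_penalty(new_pos, worker_track)
```

The earliest-collision rule mirrors r_collision = min{r_obs, γ^1 r_obs, ..., γ^5 r_obs} with zero r_obs at non-colliding steps, as defined in the "Reward Function" section.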
Quantitative metrics were used for performance evaluation in terms of both efficiency and safety. For efficiency, two metrics were used: (1) the success rate, computed as the ratio between the number of successful cases and the total number of testing episodes, where a successful case is one in which the robot reaches the desired destination before the end of the episode; and (2) the average ratio between the actual path and the shortest path, where for each episode, the actual path is generated by the DRL agent, and the shortest-path length is computed as |x_r^0 − x_d| + |y_r^0 − y_d|.

For safety, the collision rate was used for evaluation, where one collision case is counted if the robot collides with a worker (at the current time step) at least once during an episode. To examine the efficacy of the proposed model in proactive planning, collision rates at anticipatory locations of workers were also examined; a smaller rate offers a more reliable and safer path and can accommodate potential sensing uncertainties in practice. Table 1 presents the results of these metrics for both models.

Fig. 7. Training curve of the proposed DRL model.
Table 1. Performance comparison between proposed and conventional DRL models (total of 10,000 episodes)

Model | Success rate (%) | Average ratio between actual and shortest paths (ideally 1) | Collision rate (%): at current location | at 1-step predicted location | at 2-step | at 3-step | at 4-step | at 5-step
Conventional DRL (without prediction) | 97.36 | 1.08 (1.005) | 3.34 | 3.46 | 3.43 | 3.22 | 3.27 | 3.30
Prediction-based DRL (proposed in this study) | 100 | 1.0003 | 2.56 | 2.47 | 2.54 | 2.54 | 2.50 | 2.58

Note: The value in brackets is the average ratio between actual and shortest paths for successful episodes using the conventional DRL model. Scenarios in which workers occupy the same cell as the destination were excluded when determining the number of collisions.
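The efficiency and safety metrics defined above can be computed as in the following sketch (the episode record format is hypothetical; the shortest path is the Manhattan distance between the robot's initial location and the destination):

```python
# Sketch of the evaluation metrics: success rate, average ratio between
# actual path length and the Manhattan shortest path |x_r^0 - x_d| + |y_r^0 - y_d|,
# and collision rate. Episode tuples are illustrative, not the authors' format.

def shortest_path_len(start, dest):
    return abs(start[0] - dest[0]) + abs(start[1] - dest[1])

def evaluate(episodes):
    """episodes: list of (start, dest, actual_path_len, reached, collided)."""
    n = len(episodes)
    success_rate = sum(1 for e in episodes if e[3]) / n
    avg_ratio = sum(e[2] / shortest_path_len(e[0], e[1]) for e in episodes) / n
    collision_rate = sum(1 for e in episodes if e[4]) / n
    return success_rate, avg_ratio, collision_rate
```

A ratio close to 1, as reported for the proposed model (1.0003), indicates near-shortest paths under this metric.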
Table 1 indicates that the proposed prediction-based DRL model outperformed the conventional model, which does not consider the predicted locations of workers, in both efficiency and safety. The proposed method achieved a 100% success rate with an average ratio between actual and shortest paths close to 1, which means that the robot could successfully reach the destination in all 10,000 episodes along the near-shortest path. This demonstrates the high efficiency of the planned paths. Fig. 8 visualizes the robot's planned path in one example, where the view is zoomed to the central area of the site with the moving workers and the robot.

In contrast, the robot agent learned using the conventional DRL model failed to reach the destination in 2.64% of total cases (i.e., 264 cases). In almost all unsuccessful cases, the robot ended up moving back and forth without heading to the destination, which never occurred with the robot agent learned using the proposed model. One possible reason is that by introducing predicted locations of workers and associated penalties for future collisions, the robot agent has knowledge not only of the current situation but also of the situation in the near future. This provides the robot with more information and incentive to make decisions that move it toward the destination. The lower chance of the robot colliding with workers (both at the current time and in the near future), as indicated in Table 1, also demonstrates the advantage of the proposed method in terms of safety.

Conclusions

This study proposed a new prediction-enabled path planning method for construction robots considering the predicted trajectories of onsite workers. Specifically, an LSTM-based model was developed to predict the trajectory distribution of workers instead of absolute trajectories to incorporate the uncertainty. A DRL framework was used for path planning, where the predicted trajectory was innovatively introduced into the observation space and integrated into the computation of the reward function to ensure both safety and efficiency. Extensive simulations generated from real construction scenarios were conducted to validate the proposed framework, which was also compared with conventional DRL-based path planning that did not consider prediction information. The results showed that the proposed method generates safer and more efficient paths: (1) it achieved a 100% success rate for the robot to move to the destination along the near-shortest path; and (2) it reduced the collision rate with moving workers by 23% compared with the conventional DRL method. By leveraging high-level information (e.g., preprocessed locations) instead of raw sensing data, this method is adaptive to various robots and application scenarios with different sensor settings. In addition, by incorporating location information of the nearest neighbor instead of all obstacles in the environment, the proposed method is robust to scenarios with a varying number of moving obstacles and can be extended to other environments (e.g., search and rescue or manufacturing).

The proposed framework is critical for human–robot collaboration on unstructured and dynamic construction sites because it allows proactive adjustment of the robot's movement to avoid collisions without interrupting normal operation. It endows the mobile robot with the capability to automatically generate a safe and efficient path considering site dynamics, which is an essential prerequisite of multiple robotic construction operations, such as material handling, site inspection, and so on. With intelligent path planning that takes into account anticipatory movements of site entities, it is safer and more feasible to introduce construction robots to onsite operations and to share workspaces with workers who do not interact with them directly on a single task. Furthermore, in human–robot collaborative tasks, human operators are freed to provide only high-level commands related to the intended tasks (e.g., assigning target areas to be inspected or materials to be handled), without having to specify a detailed path, which will significantly reduce the mental load of workers and increase productivity.

There remain some limitations to be addressed in future studies. First, despite the high success rate, worker trajectories from only one real-world scenario (captured in construction video) were extracted and augmented by adding randomness in the simulated environments. In future studies, more diverse scenarios will be generated to train the agent, which will be further tested in both robotic simulation and real robot implementation. Second, because this study focused on moving workers with uncertain predicted trajectories, static obstacles were not modeled in the framework. However, the proposed framework can be easily extended to include static obstacles by integrating their locations into the observation space and the computation of the reward function.
Data Availability Statement

Some or all data, models, or code generated or used during the study are available from the corresponding author by request, including (1) data and codes for generating the DRL environment and training and testing the DRL path planning model; and (2) codes for training and testing the LSTM-based trajectory prediction model.

Acknowledgments

This research was partially funded by the University of Texas at San Antonio (UTSA), Office of the Vice President for Research, Economic Development, and Knowledge Enterprise, and the US National Science Foundation (NSF) via Grant Nos. 2138514 and 2129003. The authors gratefully acknowledge UTSA's and NSF's support.

References

Ajeil, F. H., I. K. Ibraheem, A. T. Azar, and A. J. Humaidi. 2020. "Grid-based mobile robot path planning using aging-based ant colony optimization algorithm in static and dynamic environments." Sensors (Switzerland) 20 (7): 1880. https://doi.org/10.3390/s20071880.

Alahi, A., K. Goel, V. Ramanathan, A. Robicquet, L. Fei-Fei, and S. Savarese. 2016. "Social LSTM: Human trajectory prediction in crowded spaces." In Proc., IEEE Computer Society Conf. on Computer Vision and Pattern Recognition, 961–971. Manhattan, NY: IEEE. https://doi.org/10.1109/CVPR.2016.110.

Asadi, E., B. Li, and I. M. Chen. 2018. "Pictobot: A cooperative painting robot for interior finishing of industrial developments." IEEE Rob. Autom. Mag. 25 (2): 82–94. https://doi.org/10.1109/MRA.2018.2816972.

Asadi, K., V. R. Haritsa, K. Han, and J.-P. Ore. 2021. "Automated object manipulation using vision-based mobile robotic system for construction applications." J. Comput. Civ. Eng. 35 (1): 04020058. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000946.

Asadi, K., A. Kalkunte Suresh, A. Ender, S. Gotad, S. Maniyar, S. Anand, M. Noghabaei, K. Han, E. Lobaton, and T. Wu. 2020. "An integrated UGV-UAV system for construction site data collection." Autom. Constr. 112 (Apr): 103068. https://doi.org/10.1016/j.autcon.2019.103068.

ASI (Autonomous Solutions, Inc.). 2019. "Robotic excavators." Accessed October 12, 2019. https://www.asirobots.com/mining/excavator/.

Associated Builders and Contractors. 2021. "Construction spending and employment: History and forecast terms and sources." Accessed June 2, 2021. https://www.abc.org/Portals/1/NewsReleases/explainer-2021.pdf?ver=2021-03-23-162431-727.

Botteghi, N., B. Sirmacek, K. A. A. Mustafa, M. Poel, and S. Stramigioli. 2020. "On reward shaping for mobile robot navigation: A reinforcement learning and SLAM based approach." Preprint, submitted February 10, 2020. http://arxiv.org/abs/2002.04109.

Cai, J., and H. Cai. 2020. "Robust hybrid approach of vision-based tracking and radio-based identification and localization for 3D tracking of multiple construction workers." J. Comput. Civ. Eng. 34 (4): 04020021. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000901.

Cai, J., A. Du, and S. Li. 2021. "Prediction-enabled collision risk estimation for safe human-robot collaboration on unstructured and dynamic …

… https://www.mckinsey.com/business-functions/operations/our-insights/the-construction-productivity-imperative.

Chen, Z., K. Chen, C. Song, X. Zhang, J. C. P. Cheng, and D. Li. 2022. "Global path planning based on BIM and physics engine for UGVs in indoor environments." Autom. Constr. 139 (Jul): 104263. https://doi.org/10.1016/j.autcon.2022.104263.

Dijkstra, E. W. 1959. "A note on two problems in connexion with graphs." Numer. Math. 1 (1): 269–271. https://doi.org/10.1007/BF01386390.

Dong, C., H. Li, X. Luo, L. Ding, J. Siebert, and H. Luo. 2018. "Proactive struck-by risk detection with movement patterns and randomness." Autom. Constr. 91: 246–255. https://doi.org/10.1016/j.autcon.2018.03.021.

Dong, X. S., E. Betit, A. M. Dale, G. Barlet, and Q. Wei. 2019. "Trends of musculoskeletal disorders and interventions in the construction industry." Accessed June 2, 2022. https://www.cpwr.com/wp-content/uploads/2020/06/Quarter3-QDR-2019.pdf.

Du, A., and A. Ghavidel. 2022. "Parameterized deep reinforcement learning-enabled maintenance decision-support and life-cycle risk assessment for highway bridge portfolios." Struct. Saf. 97 (Jul): 102221. https://doi.org/10.1016/j.strusafe.2022.102221.

Dusty Robotics. 2022. "Build better with BIM-Driven layout." Accessed June 2, 2022. https://www.dustyrobotics.com/.

Freimuth, H., and M. König. 2018. "Planning and executing construction inspections with unmanned aerial vehicles." Autom. Constr. 96 (Dec): 540–553. https://doi.org/10.1016/j.autcon.2018.10.016.

Fridovich-Keil, D., A. Bajcsy, J. F. Fisac, S. L. Herbert, S. Wang, A. D. Dragan, and C. J. Tomlin. 2020. "Confidence-aware motion prediction for real-time collision avoidance." Int. J. Rob. Res. 39 (2–3): 250–265. https://doi.org/10.1177/0278364919859436.

Gao, J., W. Ye, J. Guo, and Z. Li. 2020. "Deep reinforcement learning for indoor mobile robot path planning." Sensors (Switzerland) 20 (19): 1–15. https://doi.org/10.3390/s20195493.

Ge, S. S., and Y. J. Cui. 2002. "Dynamic motion planning for mobile robots using potential field method." Auton. Rob. 13 (3): 207–222. https://doi.org/10.1023/A:1020564024509.

Gong, D., Y. Wang, J. Yu, and G. Zuo. 2019. "Motion mapping from a human arm to a heterogeneous excavator-like robotic arm for intuitive teleoperation." In Proc., 2019 IEEE Int. Conf. on Real-Time Computing and Robotics, RCAR 2019, 493–498. Manhattan, NY: IEEE. https://doi.org/10.1109/RCAR47638.2019.9044131.

Hart, P. E., N. J. Nilsson, and B. Raphael. 1968. "A formal basis for the heuristic determination of minimum cost paths." IEEE Trans. Syst. Sci. Cybern. 4 (2): 100–107. https://doi.org/10.1109/TSSC.1968.300136.

Hartley, R., and A. Zisserman. 2003. Multiple view geometry in computer vision. Cambridge, UK: Cambridge University Press.

Hospital Construction. 2019. "Hospital construction." Accessed April 7, 2019. https://www.youtube.com/channel/UCEKwrM78pRv8WRcKvZNtE1w.

Hu, D., S. Li, J. Cai, and Y. Hu. 2020. "Toward intelligent workplace: Prediction-enabled proactive planning for human-robot coexistence on unstructured construction sites." In Proc., Winter Simulation Conf. 2020, 2412–2423. New York: IEEE. https://doi.org/10.1109/WSC48552.2020.9384077.

Jacob-Loyola, N., F. Muñoz-La Rivera, R. F. Herrera, and E. Atencio. 2021. "Unmanned aerial vehicles (UAVs) for physical progress monitoring of construction." Sensors 21 (12): 4227. https://doi.org/10.3390/s21124227.

… mobile robot navigation for as-is 3D data collection and registration in cluttered environments." Autom. Constr. 106 (Oct): 102918. https://doi.org/10.1016/j.autcon.2019.102918.

Kim, S., M. Peavy, P. C. Huang, and K. Kim. 2021. "Development of BIM-integrated construction robot task planning and simulation system." Autom. Constr. 127 (Jul): 103720. https://doi.org/10.1016/j.autcon.2021.103720.

Kim, S. K., J. S. Russell, and K. J. Koo. 2003. "Construction robot path-planning for earthwork operations." J. Comput. Civ. Eng. 17 (2): 97–104. https://doi.org/10.1061/(ASCE)0887-3801(2003)17:2(97).

Kingma, D. P., and J. L. Ba. 2014. "Adam: A method for stochastic optimization." Preprint, submitted December 22, 2014. http://arxiv.org/abs/1412.6980.

LaValle, S. M. 1998. "Rapidly-exploring random trees: A new tool for path planning." Accessed June 2, 2022. https://www.cs.csustan.edu/~xliang/Courses/CS4710-21S/Papers/06RRT.pdf.

Lillicrap, T. P., J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra. 2015. "Continuous control with deep reinforcement learning." Preprint, submitted September 9, 2015. http://arxiv.org/abs/1509.02971.

Liu, Y., M. Habibnezhad, and H. Jebelli. 2021a. "Brain-computer interface for hands-free teleoperation of construction robots." Autom. Constr. 123 (Mar): 103523. https://doi.org/10.1016/j.autcon.2020.103523.

Liu, Y., M. Habibnezhad, and H. Jebelli. 2021b. "Brainwave-driven human-robot collaboration in construction." Autom. Constr. 124 (Apr): 103556. https://doi.org/10.1016/j.autcon.2021.103556.

Lu, T., S. Tervola, X. Lü, C. J. Kibert, Q. Zhang, T. Li, and Z. Yao. 2021. "A novel methodology for the path alignment of visual SLAM in indoor construction inspection." Autom. Constr. 127 (Jul): 103723. https://doi.org/10.1016/j.autcon.2021.103723.

Madsen, A. J. 2019. "The SAM100: Analyzing labor productivity." Accessed July 5, 2022. https://digitalcommons.calpoly.edu/cmsp/243/.

Min, H. Q., J. H. Zhu, and X. J. Zheng. 2005. "Obstacle avoidance with multi-objective optimization by PSO in dynamic environment." In Proc., 2005 Int. Conf. on Machine Learning and Cybernetics, ICMLC 2005, 2950–2956. Manhattan, NY: IEEE. https://doi.org/10.1109/icmlc.2005.1527447.

Mnih, V., A. P. Badia, M. Mirza, A. Graves, T. Harley, T. P. Lillicrap, D. Silver, and K. Kavukcuoglu. 2016. "Asynchronous methods for deep reinforcement learning." In Proc., 33rd Int. Conf. on Machine Learning, ICML 2016, 2850–2869. Norfolk, MA: Journal of Machine Learning Research.

Mnih, V., K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller. 2013. "Playing Atari with deep reinforcement learning." Preprint, submitted December 19, 2013. http://arxiv.org/abs/1312.5602.

Narazaki, Y., V. Hoskere, G. Chowdhary, and B. F. Spencer. 2022. "Vision-based navigation planning for autonomous post-earthquake inspection of reinforced concrete railway viaducts using unmanned aerial vehicles." …

… Automation, 3310–3317. New York: IEEE. https://doi.org/10.1007/978-1-4615-6325-9_11.

Tsuruta, T., K. Miura, and M. Miyaguchi. 2019. "Mobile robot for marking free access floors at construction sites." Autom. Constr. 107 (Nov): 102912. https://doi.org/10.1016/j.autcon.2019.102912.

Wang, X., C.-J. Liang, C. C. Menassa, and V. R. Kamat. 2021. "Interactive and immersive process-level digital twin for collaborative human–robot construction work." J. Comput. Civ. Eng. 35 (6): 04021023. https://doi.org/10.1061/(ASCE)CP.1943-5487.0000988.

Wang, Z., H. Li, and X. Yang. 2020. "Vision-based robotic system for on-site construction and demolition waste sorting and recycling." J. Build. Eng. 32 (Nov): 101769. https://doi.org/10.1016/j.jobe.2020.101769.

Wen, S., Y. Zhao, X. Yuan, Z. Wang, D. Zhang, and L. Manfredi. 2020. "Path planning for active SLAM based on deep reinforcement learning under unknown environments." Intell. Serv. Rob. 13 (2): 263–272. https://doi.org/10.1007/s11370-019-00310-w.

Xiao, X., B. Liu, G. Warnell, and P. Stone. 2022. "Motion planning and control for mobile robot navigation using machine learning: A survey." Auton. Rob. 46 (6): 569–597. https://doi.org/10.1007/s10514-022-10039-8.

Xie, L., S. Wang, A. Markham, and N. Trigoni. 2017. "Towards monocular vision based obstacle avoidance through deep reinforcement learning." Preprint, submitted June 29, 2017. http://arxiv.org/abs/1706.09829.

Xu, H., N. Wang, H. Zhao, and Z. Zheng. 2019. "Deep reinforcement learning-based path planning of underactuated surface vessels." Cyber-Phys. Syst. 5 (1): 1–17. https://doi.org/10.1080/23335777.2018.1540018.

Yan, N., S. Huang, and C. Kong. 2021. "Reinforcement learning-based autonomous navigation and obstacle avoidance for USVs under partially observable conditions." Math. Probl. Eng. 2021: 1–13. https://doi.org/10.1155/2021/5519033.

Zhang, H. Y., W. M. Lin, and A. X. Chen. 2018. "Path planning for the mobile robot: A review." Symmetry 10 (10): 450. https://doi.org/10.3390/sym10100450.

Zhang, S., J. Teizer, N. Pradhananga, and C. M. Eastman. 2015. "Workforce location tracking to model, visualize and analyze workspace requirements in building information models for construction safety planning." Autom. Constr. 60: 74–86. https://doi.org/10.1016/j.autcon.2015.09.009.

Zhang, X., C. Wang, Y. Liu, and X. Chen. 2019. "Decision-making for the autonomous navigation of maritime autonomous surface ships based on scene division and deep reinforcement learning." Sensors (Switzerland) 19 (18): 4055. https://doi.org/10.3390/s19184055.

Zhou, T., Q. Zhu, Y. Shi, and J. Du. 2022. "Construction robot teleoperation safeguard based on real-time human hand motion prediction." J. Constr. Eng. Manage. 148 (7): 04022040. https://doi.org/10.1061/(ASCE)CO.1943-7862.0002289.